Keras: flow_from_directory seems to find no images

Created on 3 Oct 2016 · 26 Comments · Source: keras-team/keras

Hello

I'm running into this issue using the latest version of Keras (1.1.0). I also tried to use version 1.0.0 and 1.0.1 and it's the same.

Whenever I try to use the data augmentation ImageDataGenerator, it seems that the method flow_from_directory can't find any image in my folders.

Here's my model code:

img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
nb_epoch = 5


model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height,)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))

model.add(Convolution2D(32, 3, 3, dim_ordering="th"))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(1))
model.add(Dense(1))
model.add(Activation('sigmoid'))

Here's my data augmentation code: (error is located here)

train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        'data/train',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        'data/validation',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

model.fit_generator(
        train_generator,
        samples_per_epoch=2000,
        nb_epoch=50,
        validation_data=validation_generator,
        nb_val_samples=800)

I followed this tutorial (even corrected some mistakes with the dimension ordering)
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

After solving a lot of errors, I'm stuck at the point where the method flow_from_directory finds no images in my folders, so the generator output is None:

$ python ml.py
Using TensorFlow backend.
Found 0 images belonging to 0 classes.
Found 0 images belonging to 0 classes.
Epoch 1/5
Traceback (most recent call last):
  File "ml.py", line 69, in <module>
    nb_val_samples=nb_validation_samples)
  File "/home/clement/ESIEA/5A/Machine_Learning/lib/python3.5/site-packages/Keras-1.1.0-py3.5.egg/keras/models.py", line 874, in fit_generator
    pickle_safe=pickle_safe)
  File "/home/clement/ESIEA/5A/Machine_Learning/lib/python3.5/site-packages/Keras-1.1.0-py3.5.egg/keras/engine/training.py", line 1417, in fit_generator
    'or (x, y). Found: ' + str(generator_output))
Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None

I even tried to modify the number of images in my folder but there seems to be nothing I could do.

cdalbergue

All 26 comments

My entire bad. Completely forgot to create subdirs. Found my error by exploring the source code of the DirectoryIterator class.

cdalbergue on 3 Oct 2016

👍12 😄4 🚀2 🎉1

I'm having this problem now... How did you fix it? What was the problem with the subdirs?

UPDATE: Also fixed it through subdirs. The path needs to be to a folder (that contains the images) as opposed to directly to the images. Just in case someone else googles this.

gurbraj on 4 Nov 2016

👍31

How did you solve it? My path is already a folder; what do you mean?

zahrasorour on 11 Nov 2016

The path needs to be to a folder, that contains folders, that contains the images.

gurbraj on 12 Nov 2016

👍48 🎉11
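
The layout rule above can be checked before calling flow_from_directory. The following is a minimal standard-library sketch, not Keras's own scanning code: the helper name `count_images_per_class` is hypothetical, and the extension set is copied from the `white_list_formats` value quoted elsewhere in this thread, so it may differ in other Keras versions.

```python
import os

# Extensions copied from the white_list_formats value quoted in this thread;
# the exact set depends on the installed Keras version (assumption).
IMAGE_EXTENSIONS = {"png", "jpg", "jpeg", "bmp", "ppm"}


def count_images_per_class(directory):
    """Return {class_subfolder: image_count} for the immediate subfolders.

    Hypothetical helper: image files sitting directly in `directory` are
    ignored, which is why a flat folder reports zero classes.
    """
    counts = {}
    for name in sorted(os.listdir(directory)):
        class_dir = os.path.join(directory, name)
        if not os.path.isdir(class_dir):
            continue  # files at the top level do not form a class
        total = 0
        for _root, _dirs, files in os.walk(class_dir):
            for fname in files:
                ext = os.path.splitext(fname)[1].lower().lstrip(".")
                if ext in IMAGE_EXTENSIONS:
                    total += 1
        counts[name] = total
    return counts
```

If `count_images_per_class('data/train')` returns an empty dict, the folder has no class subfolders, which matches the "Found 0 images belonging to 0 classes" message from the original post.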

I have it in a folder that contains folders of pictures. Still, it doesn't read my image. Why is that?

zahrasorour on 13 Nov 2016

look carefully at the file structure in the beginning of https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html. good luck!

gurbraj on 13 Nov 2016

👍7 👎1

Ok, thank you. I fixed it.

Now, I get this error:

"model.fit_generator(
... train_generator,
... samples_per_epoch=12,
... nb_epoch=1,
... validation_data=validation_generator,
... nb_val_samples=4)
Traceback (most recent call last):
File "", line 6, in
File "/Users/Zahra/anaconda/envs/tensorflowTwo/lib/python2.7/site-packages/keras/models.py", line 823, in fit_generator
raise Exception('The model needs to be compiled before being used.')
Exception: The model needs to be compiled before being used.

"

zahrasorour on 13 Nov 2016

Follow the instructions in the link I sent. As the error suggests, you need to compile the model before you can train it (i.e., give the net the settings under which to train).

gurbraj on 13 Nov 2016

Yes, I followed all the instructions on the link but still I get this error. Here is my code:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator

import numpy as np

model = Sequential()

model.add(Convolution2D(32, 3, 3, border_mode='valid', dim_ordering='tf', input_shape=(150, 200, 4)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25)) #Cannot take float values

model.add(Convolution2D(64, 3, 3, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
'Data/Train', # this is the target directory
target_size=(150, 200), # all images will be resized to 150x200
batch_size=32,
class_mode='binary') # since we use binary_crossentropy loss, we need binary labels

validation_generator = test_datagen.flow_from_directory(
'Data/Validation',
target_size=(150, 200),
batch_size=32,
class_mode='binary')

model.fit_generator(train_generator, 12, 1, validation_generator, 4)

model.save_weights('thesis.h5') # always save your weights after training or during training

_Where's the problem? How can I give the net the settings beforehand?_

zahrasorour on 13 Nov 2016

@zahrasorour , you need to compile the model using model.compile before model.fit. model.compile is where you initialize your loss function, optimizer, metrics etc.

arundasan91 on 24 Feb 2017

Getting the same issue (using Keras and Tensorflow), any help would be greatly appreciated.

My directory is set to be folder/folder/images - for both training and testing data.

I made a loop to test different depths (nb_layers) in a ResNet, as well as some hyperparameters like learning rate, batch size, etc. The test went through 4, 6, 8, 10, all the way to 20, then gave me "output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None".

I don't understand why it can work for a handful of the iterations, then fail.

I read here to update keras to 2.0, but was told not to change the version of keras by my boss.

I read here to convert all labels to a numpy array, but keras documentation states this already happens to labels while using the 'categorical' attribute in flow_from_directory

Then I read here to put my train_generator in a function, then create an infinite while loop and yield the results, but this results in the data to be loaded over and over at the start of the program. "Found 350 images belonging to 7 classes" (repeated 10 times), which then results in an error "output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: "

Here's my stack trace for the original error:

Traceback (most recent call last):

File "", line 1, in
runfile('K:/Manufacturing Operations/Yield/Tools_Yield/PythonScripts/AI/ISL_DI/Resnet/resISL_Depth.py', wdir='K:/Manufacturing Operations/Yield/Tools_Yield/PythonScripts/AI/ISL_DI/Resnet')

File "C:\Users\paul\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
execfile(filename, namespace)

File "C:\Users\paul.\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 89, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "K:/Manufacturing Operations/Yield/Tools_Yield/PythonScripts/AI/ISL_DI/Resnet/resISL_Depth.py", line 233, in
callbacks=callbacks_list)

File "C:\Users\paul\AppData\Roaming\Python\Python35\site-packages\keras\engine\training.py", line 1481, in fit_generator
str(generator_output))

ValueError: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None

Here's the code other than vars

`rep=0

# new model loop

for i in range(retrainings + 1):
#lr_init = [5, 1, .1, .01]
while rep != len(layers) - 1:
lr_init = [5, 1]
for lr_val in lr_init:

        decay_init = .1
        epochs_drop = 20
        patience=60
        # learning rate schedule
        def step_decay(epoch):
            initial_lrate = lr_val
            drop = 0.1
            epochs_drop = 60
            lrate = initial_lrate * math.pow(drop, math.floor((1+epoch)/epochs_drop))
            #print('\nLR: {:.6f}\n'.format(lrate))
            return lrate

        momentum_init=0.9
        sgd = SGD(lr=lr_val, decay=decay_init, momentum=momentum_init, nesterov=False)


        ##reduce learning rate when loss has stopped improving
        #lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=np.sqrt(0.1), cooldown=0, patience=5, min_lr=0.5e-6)
        ##stop training when accuracy has stopped improving
        early_stopper = EarlyStopping(monitor='val_acc', min_delta=0.001, patience=50)
        #csv_logger = CSVLogger('resnet18_cifar10.csv')

        repititions = 3
        #epochs=[105]
        epochs=[200]
        drop_out=[0]
        #batchsize=[2, 4, 8, 10]
        batchsize=[2, 5, 10]
        zoom=[0]
        shear=[0] 
        channelshift=[0]
        featurewise=[False]
        samplewise=[False]
        rotation=[0]

        nb_train_samples = 350
        nb_validation_samples = 140

        colormode='rgb'

         # input image dimensions
        img_width, img_height = 224, 224
        nb_classes=7                                    
        img_channels = 3

        for epoch_val in epochs:
            for dropout_val in drop_out:
                for batchsize_val in batchsize:
                    for zoom_val in zoom:
                        for shear_val in shear:
                            for channelshift_val in channelshift:
                                for featurewise_val in featurewise:
                                    for samplewise_val in samplewise:
                                        for rotation_val in rotation:
                                            for r in range(repititions):

            #                                    np.random.seed(7)
            #                                    tf.set_random_seed(7)    

                                                train_data_dir = basepath + pathlist[0] 
                                                validation_data_dir = basepath + pathlist[1] 

                                                #############################################
                                                #############################################

                                                params={}    
                                                params['epochs']=epoch_val
                                                params['drop_out']=dropout_val
                                                params['batchsize']=batchsize_val
                                                params['zoom']=zoom_val
                                                params['shear']=shear_val
                                                params['channelshift']=channelshift_val
                                                params['featurewise']=featurewise_val
                                                params['samplewise']=samplewise_val
                                                params['rotation']=rotation_val
                                                params['lr_init']=lr_val
                                                params['momentum_init']=momentum_init
                                                params['decay_init']=decay_init
                                                params['epochs_drop']=epochs_drop
                                                params['img_size']=list([img_width,img_height])
                                                params['patience']=patience                                         

                                                total = 0
                                                currentlayer = [i * 2 for i in layers[rep]]
                                                total = sum(currentlayer) + 2
                                                savefilename='resnet_' + str(total) + '_BKM_lr_' + str(lr_val) + '_batchSize_' + str(batchsize_val) + '_repition' + str((r+1)) + '_Study' 
                                                total = 0
                                                with tf.device('/gpu:0'):

                                                    model = resnet_iter.ResnetBuilder.build_resnet_34((img_channels, img_width, img_height), nb_classes, layers[rep])
                                                    model.compile(loss='categorical_crossentropy',
                                                                  optimizer=sgd,
                                                                  metrics=['accuracy'])

                                                    train_datagen = ImageDataGenerator(
                                                        featurewise_center=False,  # set input mean to 0 over the dataset
                                                        samplewise_center=False,  # set each sample mean to 0
                                                        featurewise_std_normalization=featurewise_val,  # divide inputs by std of the dataset
                                                        samplewise_std_normalization=samplewise_val,  # divide each input by its std
                                                        zca_whitening=False,  # apply ZCA whitening
                                                        channel_shift_range=channelshift_val, #VGG set to 0
                                                        fill_mode="reflect", #VGG set to reflect
                                                        rotation_range=rotation_val,  # randomly rotate images in the range (degrees, 0 to 180)
                                                        rescale=1./255, #VGG set to 1./255
                                                        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width) - VGG set to 0
                                                        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height) - VGG set to 0
                                                        shear_range=shear_val, #VGG set to 0
                                                        zoom_range=zoom_val, #VGG set to 0.1
                                                        horizontal_flip=True,  # randomly flip images
                                                        vertical_flip=True)  # randomly flip images VGG set to True

                                                    test_datagen = ImageDataGenerator(rescale=1./255)

                                                    train_generator = train_datagen.flow_from_directory(
                                                        train_data_dir,
                                                        target_size=(img_width, img_height),
                                                        batch_size=batchsize_val,
                                                        shuffle=True,
                                                        color_mode=colormode,
                                                        class_mode='categorical')

                                                    validation_generator = test_datagen.flow_from_directory(
                                                        validation_data_dir,
                                                        target_size=(img_width, img_height),
                                                        batch_size=batchsize_val,
                                                        shuffle=True,
                                                        color_mode=colormode,
                                                        class_mode='categorical')

                                                    lrate = LearningRateScheduler(step_decay)
                                                    callbacks_list = [lrate, early_stopper]

                                                    try:
                                                        A=model.fit_generator(
                                                            train_generator,
                                                            samples_per_epoch=nb_train_samples,
                                                            nb_epoch=epoch_val,
                                                            validation_data=validation_generator,
                                                            nb_val_samples=nb_validation_samples,
                                                            callbacks=callbacks_list)
                                                    except:
                                                        print("train_generator: " + str(train_generator))
                                                        print("train_data_dir: " + train_data_dir)
                                                        files=os.listdir(train_data_dir)
                                                        print(len(files))`

ohernpaul on 11 Apr 2017

Maybe someone needs this, it took me quite a lot to find out:
If you are trying to read .gif files with flow_from_directory, there is a variable in preprocessing/image.py that you need to modify:

white_list_formats = {'png', 'jpg', 'jpeg', 'bmp', 'ppm'}

Add 'gif' and it will work smoothly.
Do you think a pull request is needed? I think they should at least warn you in the documentation.

HectorAnadon on 3 Sep 2017

👍16

Thank you @HectorAnadon .
I was having the same problem with .tif file type.
Fixed the problem by adding it to white_list_formats.

hassaan4584 on 7 Mar 2018
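
Before editing the library, it can help to see which files would be skipped because of their extension. This is a rough standard-library sketch under the assumption that the whitelist is exactly the set quoted above; `skipped_extensions` is a hypothetical helper, not part of Keras.

```python
import os
from collections import Counter

# Value quoted in the thread; treat it as an assumption for other Keras versions.
WHITE_LIST_FORMATS = {"png", "jpg", "jpeg", "bmp", "ppm"}


def skipped_extensions(directory, allowed=WHITE_LIST_FORMATS):
    """Count files under `directory` whose extension is not in `allowed`."""
    skipped = Counter()
    for _root, _dirs, files in os.walk(directory):
        for fname in files:
            # Normalise ".GIF" and ".gif" to the same key, without the dot.
            ext = os.path.splitext(fname)[1].lower().lstrip(".")
            if ext not in allowed:
                skipped[ext] += 1
    return dict(skipped)
```

A non-empty result such as a count for 'gif' or 'tif' suggests that the "Found 0 images" message comes from the file type rather than from the folder layout.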

import keras
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.utils import plot_model
from keras.utils.np_utils import to_categorical
from keras.datasets import mnist
(x_train,y_train),(x_test,y_test)=mnist.load_data()
num_classes=10  # MNIST has 10 digit classes; y_test.shape[0] would be the number of test samples
x_train=x_train.reshape(x_train.shape[0],28,28,1)
x_test=x_test.reshape(x_test.shape[0],28,28,1)
x_train=x_train.astype('float32')
x_test=x_test.astype('float32')
x_train=x_train/255
x_test=x_test/255
y_train = keras.utils.to_categorical(y_train,num_classes)
y_test = keras.utils.to_categorical(y_test,num_classes)
classifier=Sequential()
classifier.add(Conv2D(28,(3,3),input_shape=(28,28,1),activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))
classifier.add(Conv2D(64,(3,3),activation='relu'))
classifier.add(Flatten())
classifier.add(Dropout(0.25))
classifier.add(Dense(units=128,activation='relu'))
classifier.add(Dense(num_classes,activation='softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(x_train,y_train)
from keras.preprocessing.image import ImageDataGenerator
test_datagen = ImageDataGenerator(rescale=1./255)

validation_generator = test_datagen.flow_from_directory(
r'C:\Users\user\Downloads\Research\New folder',
target_size=(28,28),
batch_size=32,
class_mode='binary')
This shows the error "Found 0 images belonging to 0 classes" when I run the last 3 lines. Please help.

Hs1000 on 4 Apr 2018

👍1

Hi @Hs1000 !!
I guess your path ("C:\Users\user\Downloads\Research\New folder") contains the images directly.
You need to move those images into a new folder inside 'New folder', so that the layout looks like C:\Users\user\Downloads\Research\New folder\Sub_Dir_Containing_Images\. Then pass the same path as before. Remember: don't change the path!

validation_generator = test_datagen.flow_from_directory(
r'C:\Users\user\Downloads\Research\New folder', ## No change in the path
target_size=(28,28),
batch_size=32,
class_mode='binary')

For a deeper dive: ImageDataGenerator.flow_from_directory was designed with the good ol' image classification problem in mind. So flow_from_directory() receives the name of a folder that holds one subfolder per class, each containing that class's images (nested subdirectories are traversed). What most people tend to do is pass the folder that contains the images directly.

Happy Coding!!

Inferno-P on 27 Jun 2018

To anyone coming here from Google because you're using class_mode=None, you still have to create a subdirectory:

If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict_generator(), model.evaluate_generator(), etc.). Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly.

BovineEnthusiast on 12 Aug 2018

👍6
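
For the class_mode=None case, one way to satisfy the subdirectory requirement is to move a flat folder of images into a single subfolder. The sketch below uses only the standard library; the function name `wrap_in_subfolder` and the default subfolder name "unlabeled" are hypothetical choices for illustration.

```python
import os
import shutil


def wrap_in_subfolder(directory, subfolder="unlabeled"):
    """Move every top-level file in `directory` into directory/<subfolder>/.

    Returns the number of files moved. Existing subfolders are left alone.
    """
    target = os.path.join(directory, subfolder)
    os.makedirs(target, exist_ok=True)
    moved = 0
    for fname in os.listdir(directory):
        src = os.path.join(directory, fname)
        if os.path.isfile(src):
            shutil.move(src, os.path.join(target, fname))
            moved += 1
    return moved
```

After this, the path passed to flow_from_directory stays the same (the parent folder); only the files have moved one level down.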

Thank you for replying. I need to know how I can display or predict the image whose path I provided.

Hs1000 on 12 Aug 2018

Maybe someone needs this, it took me quite a lot to find out:
If you are trying to read .gif files with flow_from_directory, there is a variable in preprocessing/image.py that you need to modify:

white_list_formats = {'png', 'jpg', 'jpeg', 'bmp', 'ppm'}

Add 'gif' and it will work smooth.
Do you think a pull request is needed? I think they should warn you in the documentation at least.

How come they don't give the option to modify this list, and why is gif not there by default?

yoavlevy on 20 Dec 2018

Maybe someone needs this, it took me quite a lot to find out:
If you are trying to read .gif files with flow_from_directory, there is a variable in preprocessing/image.py that you need to modify:

white_list_formats = {'png', 'jpg', 'jpeg', 'bmp', 'ppm'}

Add 'gif' and it will work smooth.
Do you think a pull request is needed? I think they should warn you in the documentation at least.

For Anaconda users: please make this change in Keras' own folder, not in the tensorflow/keras folder. It is easy to edit the wrong copy if you just find the file with Windows search instead of checking which one is used.

Finally, this won't solve the problem completely: gif is a 1-channel image, but Keras still reads it as a 3-channel image, which creates problems when the targets are binary masks.

ShubhamDebnath on 25 Dec 2018

My entire bad. Completely forgot to create subdirs. Found my error exploring the source code of Directory Iterator class.
Thanks @cdalbergue! I had the same issue, and your question helped me recognize it.

!mkdir /content/train/dog
!mkdir /content/train/cat
!bash -c 'mv /content/train/cat.{0..12499}.jpg /content/train/cat'
!bash -c 'mv /content/train/dog.{0..12499}.jpg /content/train/dog'

chiendo1010 on 26 Feb 2019

👍1
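
The same reorganisation can be sketched in Python for environments without bash. This assumes file names start with the class name followed by a dot (for example cat.0.jpg), as in the commands above; `sort_into_class_folders` is a hypothetical helper name.

```python
import os
import shutil


def sort_into_class_folders(directory):
    """Move files named '<class>.<anything>' into directory/<class>/.

    Returns {class_name: number_of_files_moved}.
    """
    moved = {}
    for fname in sorted(os.listdir(directory)):
        src = os.path.join(directory, fname)
        if not os.path.isfile(src):
            continue  # leave existing subfolders alone
        prefix = fname.split(".", 1)[0]
        if prefix == fname or not prefix:
            continue  # no "<class>." prefix to sort by
        class_dir = os.path.join(directory, prefix)
        os.makedirs(class_dir, exist_ok=True)
        shutil.move(src, os.path.join(class_dir, fname))
        moved[prefix] = moved.get(prefix, 0) + 1
    return moved
```

Note that any file with a dot in its name is treated as belonging to a class, so a stray file like image.jpg would create an "image" folder; filter the listing first if the folder contains such files.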

I almost pulled all my hairs out for this error and it turns out it needed a new folder inside the given directory.
Are you kidding me?
Anyways thanks guys ....Real lifesaver

doomSDey on 29 Mar 2019

How did you solve it? My path is already a folder. I have it in a folder that contains folders of pictures. Still, it doesn't read my images. Why is that?

train_path = 'D:data\Train_Data\egg'
train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(224,224), classes=classes_required, batch_size=batch_size_train)
type(train_batches)

Found 0 images belonging to 106 classes.

harichandu13 on 6 Sep 2019

The path needs to be to a folder, that contains folders, that contains the images.

It doesn't work in Google Colab. It reports 0 images even though the directory with the images exists.

GRTO on 16 Feb 2020

👍2

The message shows that the images were found, and I set save_to_dir, but I found nothing in the save_to_dir folder that I set! How could that happen?

JunoWang on 6 Jun 2020

The path needs to be to a folder, that contains folders, that contains the images.

Thanks for the clear explanation

MariaFrancis713 on 24 Jul 2020

Does this work with any file extension? I tried reading .pgm images but got the error
Found 0 images belonging to 40 classes.

This works fine with .jpg images

sainivedh on 3 Nov 2020
