I am trying to set up a multiclass CNN with Keras that relies on ImageDataGenerator and flow_from_directory. Unfortunately, I receive the following error:
Traceback (most recent call last):
File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 578, in get
inputs = self.queue.get(block=True).get()
File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 401, in get_index
return _SHARED_SEQUENCES[uid][i]
File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1296, in __getitem__
return self._get_batches_of_transformed_samples(index_array)
File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1773, in _get_batches_of_transformed_samples
x = img_to_array(img, data_format=self.data_format)
File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 423, in img_to_array
x = np.asarray(img, dtype=backend.floatx())
File "/home/torben/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
return array(a, dtype, copy=False, order=order)
TypeError: float() argument must be a string or a number, not 'JpegImageFile'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "cnn.py", line 145, in <module>
Cnn()
File "cnn.py", line 23, in __init__
self.train_model()
File "cnn.py", line 69, in train_model
validation_steps=50)
File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1415, in fit_generator
initial_epoch=initial_epoch)
File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 230, in fit_generator
workers=0)
File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1469, in evaluate_generator
verbose=verbose)
File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 327, in evaluate_generator
generator_output = next(output_generator)
File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 584, in get
six.raise_from(StopIteration(e), e)
File "<string>", line 3, in raise_from
StopIteration: float() argument must be a string or a number, not 'JpegImageFile'
This is my Keras code:
train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory('../image-classifier-files/train',
    target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
    batch_size=32)
validation_generator = test_datagen.flow_from_directory('../image-classifier-files/val',
    target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
    batch_size=32)
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(constants.IMG_SIZE, constants.IMG_SIZE, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))
model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(self.simterm_count, activation='sigmoid'))
model.summary()
model.compile(loss='binary_crossentropy',
    optimizer=optimizers.Adam(),
    metrics=['acc'])
history = model.fit_generator(train_generator,
    steps_per_epoch=100,
    epochs=30,
    verbose=2,
    validation_data=validation_generator,
    validation_steps=50)
I have already done some research and found a closed issue on this GitHub repo. The recommendation was to search for broken or invalid images in the training and validation data. I therefore wrote a cleaner script that uses Image.open(…) and deletes an image if…
1.) im.verify() is not None
2.) im.format != 'JPEG'
3.) im.mode != 'RGB'
4.) im.size[0] != 256
5.) filesize.st_size < 1000
Some images were deleted, for example with im.mode == "L". But the error message remains.
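The five checks above could be sketched roughly as follows. This is not the original cleaner script, which is not shown in the thread; the function name `should_delete` and its parameters are hypothetical, and the sketch assumes Pillow is installed. Note that, according to the Pillow documentation, `verify()` does not decode the pixel data, so a file that is truncated partway through may still pass all of these checks.

```python
import os

from PIL import Image


def should_delete(path, expected_width=256, min_bytes=1000):
    """Return True if the file fails any of the checks listed above.

    Hypothetical helper: the name and the default thresholds are
    illustrative, taken from the values mentioned in the post.
    """
    # Check 5: very small files are treated as broken.
    if os.path.getsize(path) < min_bytes:
        return True
    try:
        with Image.open(path) as im:
            # Read the attributes before verify(); the image object
            # should not be reused after verify() has been called.
            fmt, mode, width = im.format, im.mode, im.size[0]
            # Check 1: verify() raises on some kinds of corruption and
            # returns None otherwise. It does not decode pixel data.
            im.verify()
    except Exception:
        return True
    # Checks 2-4: format, colour mode and width.
    return fmt != "JPEG" or mode != "RGB" or width != expected_width
```

A file for which `should_delete(path)` returns True would then be removed with `os.remove(path)`.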
Can you find out which image causes this issue? Maybe by removing half of the dataset and seeing whether the error remains (similar to a binary search)? It's going to be difficult for us to fix it if we can't reproduce this bug. Thank you for your help.
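The halving idea can be sketched as a small helper. Both `find_failing_item` and the `fails` callback are hypothetical names: `fails(subset)` stands for whatever check reproduces the error on a subset of files (for example, copying the subset into a temporary directory and running one epoch). The sketch assumes the full list fails and that a subset fails whenever it contains a bad file; if several files are bad, it returns only one of them.

```python
def find_failing_item(items, fails):
    """Narrow a failing collection down to one item by repeated halving.

    Assumes fails(items) is True for the full list, and that a subset
    fails whenever it contains a bad item.
    """
    items = list(items)
    while len(items) > 1:
        half = items[: len(items) // 2]
        # Keep whichever half still reproduces the error.
        items = half if fails(half) else items[len(half):]
    return items[0]
```

With n files this needs about log2(n) runs of `fails`, so even a dataset of a million images is narrowed down in roughly 20 runs.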
You should find the broken JPEG files. This is not a Keras error. You may try this:
import cv2 as cv
import glob
imagepath = 'xxx/xxx/images'
imgs_names = glob.glob(imagepath+'/*.jpg')
for imgname in imgs_names:
    img = cv.imread(imgname)
    if img is None:
        print(imgname)
Then, just delete them!
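Since the traceback fails inside `img_to_array`, at `np.asarray(img, dtype=backend.floatx())`, a check that mirrors that conversion may catch files that OpenCV still reads. The following is a minimal sketch, assuming Pillow and NumPy are installed; `find_undecodable_images` is a hypothetical name, and the use of `float32` assumes the default Keras float type. It walks subdirectories because flow_from_directory expects one subfolder per class, and it will also flag any non-image files it finds there.

```python
import os

import numpy as np
from PIL import Image


def find_undecodable_images(root):
    """Return paths under root that cannot be fully decoded into a float array."""
    bad = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with Image.open(path) as img:
                    # Force a full decode of the pixel data; Image.open
                    # alone only reads the header.
                    img.load()
                    # Same kind of conversion that fails in the traceback.
                    np.asarray(img, dtype="float32")
            except Exception:
                bad.append(path)
    return bad
```

Calling `find_undecodable_images('../image-classifier-files/train')` would return the list of suspicious files, which can be inspected before anything is deleted.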
My Images are all OK, I am still getting this error please help!
In my opinion, this is due to the characteristics of the Ubuntu 16.04 system: with too many images, you get spurious image errors. You can add a "try/except" to your program, catch the exceptions and then skip the affected images.
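One way to read the try/except suggestion is a thin wrapper around the batch iterator. This is only a sketch: `skip_failed_batches` is a hypothetical name, not a Keras API. It assumes the wrapped object is an iterator class that stays usable after `next()` raises; a plain Python generator function is finished once it raises, so nothing would be skipped there. Skipping also hides the underlying cause rather than fixing it.

```python
def skip_failed_batches(batch_iterator):
    """Yield batches from batch_iterator, skipping any batch whose loading raises."""
    it = iter(batch_iterator)
    while True:
        try:
            batch = next(it)
        except StopIteration:
            # The underlying iterator is exhausted.
            return
        except Exception as exc:
            # Report the failure and move on to the next batch.
            print("Skipping batch:", exc)
            continue
        yield batch
```

The wrapped object is an ordinary Python generator, so it would be used in place of the original iterator wherever a generator of batches is accepted.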
@soufianesabiri I had the same issue. Although it's not a real solution, try replacing Flatten() with Reshape((-1,)).
I had the same issue in cases where the target_size parameter of the flow_from_directory method had the same value as the size of an image in the dataset (e.g. my image was 256x256 and I set target_size to (256, 256)).
Changing the target_size value (or the size of the image) helped.
Hi @gabrieldemarmiesse ,
I recently ran into this problem as well when using the 'model.fit' API to train a model. I set the batch size to 128, and I have 117586 steps in total for each epoch. At first I set only 50000 steps per epoch and training went well. Then I increased it to 117586 and the same error occurred.
I have also checked my data, and all files are readable and OK. I am using Ubuntu 16.04.6 LTS. Since this issue was raised 2 years ago, I wonder if there is any solution for it.
Thank you.
Hi @soufianesabiri ,
I wonder if you have solved this problem, since I ran into the same situation and all my images have been checked over and over again and are readable.
Hi, I have the same problem.
TypeError: float() argument must be a string or a number, not 'TiffImageFile'
The error is intermittent: sometimes it works fine, and sometimes it doesn't. The image is the same each time, and the code has not changed either.
I recently checked and tried again on the image datasets I used. I found that even a picture that can be read successfully by "cv2.imread" might have lost some information, which causes this problem. I use the following code to resolve it:
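import cv2
from skimage import io
# img_path is the path of the image to check
cv_img = cv2.imread(img_path)
try:
    print(cv_img.shape)
except AttributeError:
    # cv2.imread returns None when it cannot read the file
    print("Error image path: " + img_path)
try:
    sk_img = io.imread(img_path)
    print(sk_img.shape)
except Exception:
    # skimage could not read the file: re-save the OpenCV-decoded image over it
    cv2.imwrite(img_path, cv_img)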