Keras: Keras CNN TypeError: float() argument must be a string or a number, not 'JpegImageFile'

Created on 6 Sep 2018  ·  10 Comments  ·  Source: keras-team/keras

I am trying to set up a multiclass CNN with Keras that relies on ImageDataGenerator and flow_from_directory. Unfortunately, I receive the following error:

Traceback (most recent call last):
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 578, in get
    inputs = self.queue.get(block=True).get()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 401, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1296, in __getitem__
    return self._get_batches_of_transformed_samples(index_array)
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1773, in _get_batches_of_transformed_samples
    x = img_to_array(img, data_format=self.data_format)
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 423, in img_to_array
    x = np.asarray(img, dtype=backend.floatx())
  File "/home/torben/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: float() argument must be a string or a number, not 'JpegImageFile'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "cnn.py", line 145, in <module>
    Cnn()
  File "cnn.py", line 23, in __init__
    self.train_model()
  File "cnn.py", line 69, in train_model
    validation_steps=50)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1415, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 230, in fit_generator
    workers=0)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1469, in evaluate_generator
    verbose=verbose)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 327, in evaluate_generator
    generator_output = next(output_generator)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 584, in get
    six.raise_from(StopIteration(e), e)
  File "<string>", line 3, in raise_from
StopIteration: float() argument must be a string or a number, not 'JpegImageFile'

This is my Keras code:

train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory('../image-classifier-files/train',
                                                    target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
                                                    batch_size=32)
validation_generator = test_datagen.flow_from_directory('../image-classifier-files/val',
                                                        target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
                                                        batch_size=32)

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(constants.IMG_SIZE, constants.IMG_SIZE, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dropout(0.5))

model.add(layers.Dense(self.simterm_count, activation='sigmoid'))

model.summary()

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(),
              metrics=['acc'])

history = model.fit_generator(train_generator,
                              steps_per_epoch=100,
                              epochs=30,
                              verbose=2,
                              validation_data=validation_generator,
                              validation_steps=50)

I have already done some research and found a closed issue in this GitHub repo. The recommendation was to search for broken or invalid images in the training and validation data. I therefore wrote a cleaner script that uses Image.open(…) and deletes an image if…
1.) im.verify() is not None
2.) im.format != 'JPEG'
3.) im.mode != 'RGB'
4.) im.size[0] != 256
5.) filesize.st_size < 1000

Some images were deleted, for example those with im.mode == "L". However, the error persists.
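
A note on the checks listed above: according to the Pillow documentation, im.verify() returns None on success and signals problems by raising, without decoding the pixel data. A condition such as `im.verify() is not None` therefore never fires, and a truncated file can pass. The following is a minimal sketch of a stricter check, assuming Pillow and NumPy are installed. The function name find_unreadable_images is hypothetical, and the sketch reports files instead of deleting them.

```python
import os

import numpy as np
from PIL import Image


def find_unreadable_images(root_dir):
    """Return (path, error) pairs for files that cannot be fully decoded.

    Hypothetical helper for illustration; it walks root_dir recursively.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # verify() checks the file structure only and returns None;
                # the image object must not be reused afterwards.
                with Image.open(path) as img:
                    img.verify()
                # Reopen, force a full decode, then repeat the conversion
                # that fails in the traceback (np.asarray with a float dtype).
                with Image.open(path) as img:
                    img.load()
                    np.asarray(img, dtype="float32")
            except Exception as exc:  # broad on purpose: plugins raise various types
                bad.append((path, repr(exc)))
    return bad
```

Calling find_unreadable_images('../image-classifier-files/train') and printing the returned pairs would list candidates to inspect by hand before anything is removed.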

To investigate

Most helpful comment

You should find those broken JPEG files. It is not Keras's error. You may try this:


import glob

import cv2 as cv

imagepath = 'xxx/xxx/images'
imgs_names = glob.glob(imagepath + '/*.jpg')
for imgname in imgs_names:
    # cv.imread returns None when it cannot decode the file
    img = cv.imread(imgname)
    if img is None:
        print(imgname)


Then, just delete them!

All 10 comments

Can you find out which image causes this issue? Maybe by removing half the dataset and see if the error remains (similar to binary search)? It's going to be difficult for us to fix it if we can't reproduce this bug. Thank you for your help.
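
The halving strategy suggested above can be sketched as a small bisection helper. This is only an illustration: find_failing_item and the fails callback are hypothetical names, and the sketch assumes the failure is deterministic and caused by an individual file rather than by the combination of files.

```python
def find_failing_item(items, fails):
    """Bisect items down to one element that reproduces the failure.

    fails is a hypothetical callback: it takes a list of items (for example
    image paths) and returns True if processing that list raises the error.
    Returns None if the full list does not fail at all.
    """
    if not fails(items):
        return None
    while len(items) > 1:
        mid = len(items) // 2
        first_half = items[:mid]
        # Keep whichever half still reproduces the error.
        items = first_half if fails(first_half) else items[mid:]
    return items[0]
```

With N files this needs about log2(N) runs of the check instead of N. If several files are broken, it returns one of them, so it may have to be repeated after that file is set aside.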

My Images are all OK, I am still getting this error please help!

In my opinion, due to the characteristics of the Ubuntu 16.04 system, too many images will cause spurious image errors. You can add a try/except to your program, catch the exception, and skip the image.
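
One way to read the try/except suggestion is a thin wrapper around the batch iterator that skips batches whose loading raises. This is a sketch under assumptions, not a fix for the underlying files: skip_failed_batches and max_consecutive_failures are hypothetical names, and the approach only works for iterator objects that can be advanced again after a failed next() call (a generator function terminates once an exception escapes it).

```python
def skip_failed_batches(batch_iterator, max_consecutive_failures=10):
    """Yield batches from batch_iterator, skipping the ones that raise.

    Hypothetical helper for illustration. It gives up and re-raises after
    max_consecutive_failures failures in a row, so that a systematic
    problem is not silently hidden.
    """
    failures = 0
    while True:
        try:
            batch = next(batch_iterator)
        except StopIteration:
            return
        except (TypeError, OSError) as exc:
            failures += 1
            print("Skipping batch:", exc)
            if failures >= max_consecutive_failures:
                raise
            continue
        failures = 0
        yield batch
```

The wrapped object is a plain Python generator, so it would be passed to fit_generator in place of train_generator. Skipped batches are lost for that step, which is why locating and repairing the offending files remains preferable.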

@soufianesabiri I had the same issue. Although it is not a real solution, try replacing Flatten() with Reshape((-1,)).

I had the same issue when the target_size parameter of the flow_from_directory method had the same value as the size of the images in the dataset (e.g. my images are 256x256 and I set target_size to (256, 256)).
Changing the target_size value (or the size of the images) helped.

Hi @gabrieldemarmiesse ,

Recently I also ran into this problem when using the 'model.fit' API to train a model. I set the batch size to 128, and I have 117586 steps in total per epoch. At first I set only 50000 steps per epoch and training went well. Then I increased it to 117586 and the same error occurred.

I have also checked that all my data are readable and OK. I am using Ubuntu 16.04.6 LTS. Since this issue was raised 2 years ago, I wonder whether there is a solution for it.

Thank you.

Hi @soufianesabiri ,

I wonder if you have solved this problem, since I ran into the same situation and all my images have been checked over and over again and are readable.

Hi, I have the same problem.
TypeError: float() argument must be a string or a number, not 'TiffImageFile'
The error is intermittent: sometimes it works fine and sometimes it does not, with the same image and unchanged code.

I recently checked the image datasets I used and tried again. I found that even a picture that "cv2.imread" reads successfully might have lost some information, which causes this problem. I use the following code to resolve it:

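import cv2
from skimage import io

img_path = 'xxx/xxx/images/example.jpg'  # placeholder: path of the image to check

# cv2.imread returns None instead of raising when it cannot decode a file.
cv_img = cv2.imread(img_path)
if cv_img is None:
    print("Error image path: " + img_path)
else:
    print(cv_img.shape)
    try:
        # skimage may raise on a file that OpenCV still managed to decode.
        sk_img = io.imread(img_path)
        print(sk_img.shape)
    except Exception:
        # Re-encode the file from the pixels OpenCV was able to read.
        cv2.imwrite(img_path, cv_img)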