Keras: Keras CNN TypeError: float() argument must be a string or a number, not 'JpegImageFile'

Created on 6 Sep 2018 · 10 Comments · Source: keras-team/keras

I am trying to set up a multiclass CNN with Keras that relies on ImageDataGenerator and flow_from_directory. Unfortunately, I receive the following error:

Traceback (most recent call last):
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 578, in get
    inputs = self.queue.get(block=True).get()
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 401, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1296, in __getitem__
    return self._get_batches_of_transformed_samples(index_array)
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 1773, in _get_batches_of_transformed_samples
    x = img_to_array(img, data_format=self.data_format)
  File "/home/torben/.local/lib/python3.6/site-packages/keras_preprocessing/image.py", line 423, in img_to_array
    x = np.asarray(img, dtype=backend.floatx())
  File "/home/torben/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: float() argument must be a string or a number, not 'JpegImageFile'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "cnn.py", line 145, in <module>
    Cnn()
  File "cnn.py", line 23, in __init__
    self.train_model()
  File "cnn.py", line 69, in train_model
    validation_steps=50)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1415, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 230, in fit_generator
    workers=0)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1469, in evaluate_generator
    verbose=verbose)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/engine/training_generator.py", line 327, in evaluate_generator
    generator_output = next(output_generator)
  File "/home/torben/.local/lib/python3.6/site-packages/keras/utils/data_utils.py", line 584, in get
    six.raise_from(StopIteration(e), e)
  File "<string>", line 3, in raise_from
StopIteration: float() argument must be a string or a number, not 'JpegImageFile'

This is my Keras code:

train_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory('../image-classifier-files/train',
                                                    target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
                                                    batch_size=32)
validation_generator = test_datagen.flow_from_directory('../image-classifier-files/val',
                                                        target_size=(constants.IMG_SIZE, constants.IMG_SIZE),
                                                        batch_size=32)

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(constants.IMG_SIZE, constants.IMG_SIZE, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))

model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dropout(0.5))

model.add(layers.Dense(self.simterm_count, activation='sigmoid'))

model.summary()

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(),
              metrics=['acc'])

history = model.fit_generator(train_generator,
                              steps_per_epoch=100,
                              epochs=30,
                              verbose=2,
                              validation_data=validation_generator,
                              validation_steps=50)

I have already done some research and found a closed issue in this GitHub repository. The recommendation was to search for broken or invalid images in the training and validation data. I therefore wrote a cleaner script that uses Image.open(…) and deletes an image if…
1.) im.verify() is not None
2.) im.format != 'JPEG'
3.) im.mode != 'RGB'
4.) im.size[0] != 256
5.) filesize.st_size < 1000

Some images were deleted, for example with im.mode == "L". But the error message remains.
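The cleaning rules listed above could be sketched roughly as follows. This is not the original script: `violated_rules` and `inspect_image` are hypothetical names, Pillow is assumed to be installed, and the thresholds (256 pixels, 1000 bytes) are copied from the list. One detail may be worth double-checking in the real script: Pillow's `verify()` returns None on success and raises an exception on a damaged file, so a test such as `im.verify() is not None` never fires. The sketch catches the exception instead.

```python
import os


def violated_rules(verify_ok, fmt, mode, width, file_size):
    """Return the rules from the list above that an image violates."""
    problems = []
    if not verify_ok:
        problems.append("verify() failed")
    if fmt != "JPEG":
        problems.append("format is not JPEG")
    if mode != "RGB":
        problems.append("mode is not RGB")
    if width != 256:
        problems.append("width is not 256")
    if file_size < 1000:
        problems.append("file smaller than 1000 bytes")
    return problems


def inspect_image(path):
    """Open one file with Pillow and return the rules it violates."""
    # Imported here so that violated_rules() can be used without Pillow.
    from PIL import Image

    try:
        # verify() returns None on success and raises on a damaged file.
        with Image.open(path) as im:
            im.verify()
        # verify() invalidates the image object, so reopen it for metadata.
        with Image.open(path) as im:
            fmt, mode, width = im.format, im.mode, im.size[0]
    except Exception:
        # Unreadable files fail rule 1; the other attributes are unknown.
        return ["verify() failed"]
    return violated_rules(True, fmt, mode, width, os.path.getsize(path))
```

Note that `verify()` does not decode the pixel data, so a file can pass all five rules and still fail when it is actually loaded.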

To investigate

TorbenL

All 10 comments

Can you find out which image causes this issue? Maybe by removing half the dataset and see if the error remains (similar to binary search)? It's going to be difficult for us to fix it if we can't reproduce this bug. Thank you for your help.
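The halving idea suggested here could be sketched as below. It assumes the error is deterministic and caused by a single file that fails on its own, which later comments in this thread call into question. `find_failing_item` and the `batch_fails` callback are hypothetical names; in practice the callback might copy a subset of files to a temporary folder and run the generator over it.

```python
def find_failing_item(items, batch_fails):
    """Narrow a failing collection down to one offending item by halving.

    batch_fails(subset) must return True when processing `subset` raises
    the error. Assumes a single item that fails on its own.
    """
    if not batch_fails(items):
        return None
    candidates = list(items)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still reproduces the error.
        candidates = first if batch_fails(first) else second
    return candidates[0]
```

With n files this needs about log2(n) runs of the callback instead of n, which matters when each run means starting a training job.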

gabrieldemarmiesse on 9 Sep 2018

You should track down the broken JPEG files. This is not a Keras error. You could try this:

import cv2 as cv
import glob

imagepath = 'xxx/xxx/images'
imgs_names = glob.glob(imagepath + '/*.jpg')
for imgname in imgs_names:
    # cv.imread returns None when the file cannot be decoded
    img = cv.imread(imgname)
    if img is None:
        print(imgname)

Then, just delete them!
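One thing to watch with the snippet above: the glob pattern only matches `.jpg` files directly inside one folder, while flow_from_directory reads one subfolder per class and may also pick up other extensions. A minimal sketch for collecting every candidate file first is shown below. `list_image_files` is a hypothetical helper, and the extension tuple is an assumption that should be checked against the Keras version in use.

```python
import os

# Assumed extension list; adjust to match what your Keras version accepts.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".ppm", ".tif", ".tiff")


def list_image_files(root, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of all image files below root, including subfolders."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively so "photo.JPG" is included too.
            if name.lower().endswith(extensions):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

The returned paths can then be fed to whichever per-file check is being used, for example `for imgname in list_image_files('../image-classifier-files/train'):`.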

HornGate on 2 Nov 2018

👍6

My images are all OK, but I am still getting this error. Please help!

soufianesabiri on 28 Mar 2019

👍2

"My Images are all OK, I am still getting this error please help!"

In my opinion, due to the characteristics of the Ubuntu 16.04 system, too many images can cause spurious image errors. You can wrap the loading code in try/except, catch the exception, and skip the affected image.
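The catch-and-skip approach might look like the sketch below. It assumes the data source supports `len()` and integer indexing, which the `__getitem__` call in the traceback suggests is the case for the directory iterator. `iter_valid_batches` is a hypothetical helper and the set of caught exception types is a guess based on the errors reported in this thread.

```python
def iter_valid_batches(sequence, errors=(TypeError, OSError)):
    """Yield (index, batch) pairs from an indexable sequence, skipping failures."""
    for index in range(len(sequence)):
        try:
            batch = sequence[index]
        except errors as exc:
            # Report and skip the batch instead of aborting the whole run.
            print("skipping batch %d: %s" % (index, exc))
            continue
        yield index, batch
```

This makes a single pass, so it is not a drop-in replacement for what fit_generator expects, which is a generator that loops indefinitely. It is probably more useful as a diagnostic: the printed indices show which batches fail, although skipping them hides the underlying cause.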

HornGate on 29 Mar 2019

@soufianesabiri I had the same issue. Although it is not a real solution, try replacing Flatten() with Reshape((-1,)).

eram1205 on 13 May 2019

I had the same issue when the target_size parameter of the flow_from_directory method had the same value as the size of an image in the dataset (e.g. my image was 256x256 and I set target_size to (256, 256)).
Changing the target_size value (or the size of the image) helped.

kakareko on 27 Jun 2019

👍1

"Can you find out which image causes this issue? Maybe by removing half the dataset and see if the error remains (similar to binary search)? It's going to be difficult for us to fix it if we can't reproduce this bug. Thank you for your help."

Hi @gabrieldemarmiesse ,

Recently I also ran into this problem when using the 'model.fit' API to train a model. I set the batch size to 128, and I have 117586 steps in total for each epoch. At first I set only 50000 steps per epoch and training went well. Then I increased it to 117586 and the same error occurred.

I have also checked that my data are all readable and OK, and I am using Ubuntu 16.04.6 LTS. Since this issue was raised 2 years ago, I wonder whether there is a solution for it.

Thank you.

TMaysGGS on 17 Apr 2020

Hi @soufianesabiri ,

I wonder whether you have solved this problem, since I ran into the same situation and have checked over and over again that all my images are readable.

TMaysGGS on 17 Apr 2020

Hi, I have the same problem:
TypeError: float() argument must be a string or a number, not 'TiffImageFile'
The error occurs only sometimes: sometimes it works fine and sometimes it doesn't, even though the image is the same and the code has not changed.

kaixin-bai on 28 Apr 2020

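I recently checked the image datasets I used and tried again. I found that even a picture that "cv2.imread" reads successfully might have lost some information, which causes this problem. I used the following code to resolve it:

import cv2
from skimage import io

img_path = 'xxx/xxx/images/example.jpg'  # placeholder: path of the image to check

# cv2.imread returns None instead of raising when it cannot decode a file.
cv_img = cv2.imread(img_path)
if cv_img is None:
    print("Error image path: " + img_path)
else:
    print(cv_img.shape)
    try:
        # skimage may fail on a file that OpenCV can still decode.
        sk_img = io.imread(img_path)
        print(sk_img.shape)
    except Exception:
        # Re-save the image through OpenCV so that other readers can load it.
        cv2.imwrite(img_path, cv_img)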

TMaysGGS on 28 Apr 2020

❤1 👍1
