Keras: ImageDataGenerator's load_img() forcing conversion to PIL mode "L" loses bit depth on 16bit grayscale .png

Created on 23 Nov 2016 · 7 comments · Source: keras-team/keras

Please make sure that the boxes below are checked before you submit your issue. Thank you!

  • [x] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • [x] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • [x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

Summary:

I believe it would be helpful to allow toggling off the forced conversion to 8-bit:

def load_img(path, grayscale=False, target_size=None):
    from PIL import Image
    img = Image.open(path)
    if grayscale:
        img = img.convert('L')
    else:  # Ensure 3 channel even when loaded image is grayscale
        img = img.convert('RGB')
    return img

In my case, this is causing issues with higher-depth 16-bit grayscale images, and I can only imagine images getting more complex as processing power and application demands grow.
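A minimal sketch of what such a toggle might look like (this is not the actual Keras API; the convert_mode argument is hypothetical):

def load_img(path, grayscale=False, target_size=None, convert_mode='default'):
    from PIL import Image
    img = Image.open(path)
    if convert_mode is None:
        # Hypothetical escape hatch: keep PIL's native mode (e.g. "I" for
        # 16-bit grayscale PNGs) so the bit depth is preserved.
        return img
    if grayscale:
        img = img.convert('L')
    else:  # Ensure 3 channels even when the loaded image is grayscale
        img = img.convert('RGB')
    return img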

Expected behavior:

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from PIL import Image

img = Image.open('./toy/foo.png')  # this image is 16-bit grayscale; PIL autoloads it as mode "I"
#img = load_img('./toy/foo.png')  # keras hardcodes PIL.Image.convert to mode "L", clipping all values above 255
x = img_to_array(img)  # this is a Numpy array with shape (101, 101, 1)
x

Clearly, the values are quite diverse.

array([[[ 26947.],
        [ 26367.],
        [ 26429.],
        ..., 
        [ 38390.],
        [ 40277.],
        [ 39516.]],

       [[ 27135.],
        [ 27470.],
        [ 26532.],
        ..., 
        [ 39014.],
        [ 39567.],
        [ 39516.]],

       [[ 27723.],
        [ 27323.],
        [ 26781.],
        ..., 
        [ 39972.],
        [ 39491.],
        [ 39063.]],

       ..., 
       [[ 27533.],
        [ 28660.],
        [ 28660.],
        ..., 
        [ 42340.],
        [ 41147.],
        [ 41948.]],

       [[ 27893.],
        [ 27744.],
        [ 29005.],
        ..., 
        [ 42521.],
        [ 41457.],
        [ 41250.]],

       [[ 27914.],
        [ 26532.],
        [ 27366.],
        ..., 
        [ 43681.],
        [ 41897.],
        [ 40684.]]], dtype=float32)

BUG:

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

#img = Image.open('./toy/foo.png')  # this image is 16-bit grayscale; PIL autoloads it as mode "I"
img = load_img('./toy/foo.png')  # keras hardcodes PIL.Image.convert to mode "L", clipping all values above 255
x = img_to_array(img)  # this is a Numpy array with shape (101, 101, 3)
x

Now the values are all "whitened" (clipped to 255):

array([[[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]],

       [[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]],

       [[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]],

       ..., 
       [[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]],

       [[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]],

       [[ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        ..., 
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.],
        [ 255.,  255.,  255.]]], dtype=float32)
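A quick standalone way to see why everything saturates at 255 (the pixel values below are made up, but in the 16-bit range): PIL clips mode "I" data to 0-255 when converting to 8-bit "L".

import numpy as np
from PIL import Image

arr = np.array([[100, 300, 40000]], dtype=np.int32)  # values in a 16-bit-like range
img_i = Image.fromarray(arr, mode='I')                # 32-bit integer grayscale
print(list(img_i.convert('L').getdata()))             # [100, 255, 255] -- everything above 255 is clipped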

Sample images

These illustrate the difference between the de facto load_img() behavior and manually calling PIL.Image.open(), letting it keep mode "I".

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')
#img = Image.open('./toy/foo.png')  # this image is 16-bit grayscale; PIL autoloads it as mode "I"
img = load_img('./toy/foo.png')  # keras hardcodes PIL.Image.convert to mode "L" (8-bit), clipping all values above 255
x = img_to_array(img)  # this is a Numpy array with shape (101, 101, 3)
x = x.reshape((1,) + x.shape)  # flow() expects a rank-4 batch, i.e. (1, 101, 101, 3)
i = 0
for batch in datagen.flow(x, batch_size=1, save_to_dir='./toy', save_prefix='bug', save_format='png'):
    i += 1
    if i > 20:
        break

Original image: foo
Expected ImageDataGenerator sample: foo_0_277
Actual ImageDataGenerator sample: bug_0_439

All 7 comments

Update: I found one problem with the example above: I forgot to include grayscale=True when calling load_img(). The reason I forgot is that my original issue was actually with ImageDataGenerator.flow_from_directory(), where I had already passed color_mode='grayscale'.

So even after fixing this toy example, it was not enough to fix my actual issue.

For now, I've changed load_img() in the source to use mode 'I' as a temporary fix until I figure out a better way. Suggestions welcome!

if grayscale:
    img = img.convert('I')

and correspondingly, I'm passing rescale=1.0/65535 to the data generator to account for the 16-bit range.

This is now giving me augmented images as expected in the save dir.
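For reference, a minimal sketch of that workaround on the generator side (assuming load_img() has been patched to convert grayscale images to mode 'I' as above):

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        rescale=1.0/65535,   # scale the 16-bit range into [0, 1] instead of the usual 1.0/255
        rotation_range=40,
        horizontal_flip=True,
        fill_mode='nearest')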

Was the output 16-bit, or was it just scaled to 8-bit? It seems the array_to_img function is hard-coded to create 8-bit imagery.
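One quick way to check whether the saved output is 8-bit (the filename is a placeholder taken from the augmented samples above):

import numpy as np
from PIL import Image

out = Image.open('./toy/bug_0_439.png')   # one of the samples written by save_to_dir
arr = np.asarray(out)
print(out.mode, arr.dtype, arr.max())     # 'L'/'RGB' with dtype uint8 would mean the output is 8-bit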

@jfx319, you said it was a temporary fix; did you come up with a more stable solution?
I wonder if it could simply leave img.mode as it is...

The same thing happens when img.mode is 'F' (float32): with grayscale enabled, load_img truncates the values, so I have to avoid the conversion by changing the source code in image.py, for example:

def load_img(path, grayscale=False, target_size=None):
    ...
    if grayscale and img.mode != 'F':  # skip the 'L' conversion for float32 images
        img = img.convert('L')
    ...
    return img

(Quoting the workaround above: patching load_img() to convert grayscale images to mode 'I' and passing rescale=1.0/65535 to the data generator.)

This solution worked for me!

@jfx319 @alexattia Could you please elaborate on how you changed the source load_img() so that it does not convert 16-bit grayscale images to 'L' mode? I seem to have the same issue of the ImageDataGenerator producing white images, but I am not able to fix it. Also, would changing the source load_img() be enough for the ImageDataGenerator to work correctly as well (i.e., after passing rescale=1./65535)?

I am stuck at this for many days, so any help would be greatly appreciated. Thanks!

My fix was to edit the file ~/venv/lib64/python3.6/site-packages/keras_preprocessing/image.py (around line 500 for me) and, instead of converting to 'L', convert to 'I' as follows:

def load_img(path, grayscale=False, color_mode='rgb', target_size=None,
             interpolation='nearest'):
    # ... rest of the function unchanged ...
    if color_mode == 'grayscale':
        img = img.convert('I')
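To connect this back to the generator question above, a hedged sketch of how the patched load_img() would be exercised through flow_from_directory (the directory path and target size here are placeholders):

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0/65535)   # account for the 16-bit range
gen = datagen.flow_from_directory(
        './data/train',              # hypothetical folder of 16-bit grayscale PNGs
        target_size=(101, 101),
        color_mode='grayscale',      # routes through the patched 'I' branch
        class_mode=None,
        batch_size=32)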