I wrote code for training an autoencoder with Keras. I want to use this autoencoder for feature extraction from images. In my experiment the inputs are images, all of which belong to one class. I have train and validation sets. The following is my code:
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
# dimensions of our images.
img_width, img_height = 256, 256
train_data_dir = '/home/osman/keras_test/train'
validation_data_dir = '/home/osman/keras_test/validation'
nb_train_samples = 200
nb_validation_samples = 50
nb_epoch = 50
input_img = Input(shape=(3, img_width, img_height))
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(input_img)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
encoded = MaxPooling2D((2, 2), border_mode='same')(x)
# at this point the representation is (8, 32, 32) i.e. 8192-dimensional
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, 3, 3, activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=32,
class_mode=None)
print type(train_generator)
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=32,
class_mode=None)
autoencoder.fit_generator(
train_generator,
samples_per_epoch=20,
nb_epoch=nb_epoch,
# batch_size=128,
#shuffle=True,
validation_data=(validation_generator,validation_generator),
nb_val_samples=5
)
I got this exception:
Found 200 images belonging to 1 classes.
<class 'keras.preprocessing.image.DirectoryIterator'>
Found 50 images belonging to 1 classes.
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-26-ad3295fdbaae> in <module>()
68 #shuffle=True,
69 validation_data=(validation_generator,validation_generator),
---> 70 nb_val_samples=5
71 )
/home/osman/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, nb_worker, pickle_safe)
1388 '(val_x, val_y, val_sample_weight) '
1389 'or (val_x, val_y). Found: ' + str(validation_data))
-> 1390 val_x, val_y, val_sample_weights = self._standardize_user_data(val_x, val_y, val_sample_weight)
1391 self.validation_data = val_x + [val_y, val_sample_weights]
1392 else:
/home/osman/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/training.pyc in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_dim, batch_size)
959 self.internal_input_shapes,
960 check_batch_dim=False,
--> 961 exception_prefix='model input')
962 y = standardize_input_data(y, self.output_names,
963 output_shapes,
/home/osman/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/training.pyc in standardize_input_data(data, names, shapes, check_batch_dim, exception_prefix)
68 ': data should be a Numpy array, '
69 'or list/dict of Numpy arrays. '
---> 70 'Found: ' + str(data)[:200] + '...')
71 if len(names) != 1:
72 # case: model expects multiple inputs but only received
Exception: Error when checking model input: data should be a Numpy array, or list/dict of Numpy arrays. Found: <keras.preprocessing.image.DirectoryIterator object at 0x7f0020899110>...
I would imagine the problem is on this line:
validation_data=(validation_generator,validation_generator).
The fit_generator method expects either a generator or a tuple of numpy arrays for the validation data and labels. Using a tuple of generators is wrong in this case; you should just pass validation_data=validation_generator: the generator will yield tuples of data and labels automatically.
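For illustration, the two accepted forms would look like this with the Keras 1.x fit_generator signature (val_x and val_y here stand in for hypothetical numpy arrays):
# Option 1: pass the validation generator directly
autoencoder.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    nb_epoch=nb_epoch,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)

# Option 2: pass a (val_x, val_y) tuple of numpy arrays
autoencoder.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    nb_epoch=nb_epoch,
    validation_data=(val_x, val_y))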
@robertomest I got a different error with your solution; could you help me fix it?
Exception Traceback (most recent call last)
67 #shuffle=True,
68 validation_data= validation_generator,
---> 69 nb_val_samples=5
70 )
/home/osman/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, nb_worker, pickle_safe)
1425 raise Exception('output of generator should be a tuple '
1426 '(x, y, sample_weight) '
-> 1427 'or (x, y). Found: ' + str(generator_output))
1428 # build batch logs
1429 batch_logs = {}
Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[ 1.  1.  1. ...,  1.  1.  1. ]
[ 1.  1.  1. ...,  1.  1.  1. ]
[ 1.  1.  1. ...,  1.  1.  1. ]
..., (long numpy printout of a raw image batch, truncated)]]]]
It seems that the generator is not yielding tuples as expected. Could you post the updated code snippet with the modified call so I can check it out? Could you also call next(validation_generator) so we can check what the generator is yielding?
@robertomest Thank you for the quick reply.
The fixed part is:
autoencoder.fit_generator(
train_generator,
samples_per_epoch=20,
nb_epoch=nb_epoch,
# batch_size=128,
#shuffle=True,
validation_data= validation_generator, # fixed
nb_val_samples=5
)
I called next(); the following code returns:
print type(next(validation_generator))
print next(validation_generator).shape
<type 'numpy.ndarray'>
(18, 3, 256, 256)
It seems the iterator is actually returning only the data, instead of a tuple (data, target) as the fit method expects. Can you check how train_generator is behaving (by calling next(train_generator))?
I think that is because of class_mode=None, but I'd like to see how train_generator behaves since it uses the same parameter.
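If that is the case, the DirectoryIterator yields bare image arrays with no labels attached; a quick check would be (a sketch reusing the generators defined above):
batch = next(train_generator)
print type(batch)   # expected: <type 'numpy.ndarray'>, not a tuple
print batch.shape   # expected: (32, 3, 256, 256)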
Could you test a quick fix for now? Maybe try something like this
def fixed_generator(generator):
    for batch in generator:
        yield (batch, batch)
and change the fit call to
autoencoder.fit_generator(
train_generator,
samples_per_epoch=20,
nb_epoch=nb_epoch,
# batch_size=128,
#shuffle=True,
validation_data= fixed_generator(validation_generator), # fixed
nb_val_samples=5
)
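To see what the wrapper yields, you could also check one batch (reusing validation_generator from above):
x, y = next(fixed_generator(validation_generator))
print x.shape, y.shape   # both (batch_size, 3, 256, 256): the inputs double as the targets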
Another odd thing is that you set samples_per_epoch=20. This should match the size of your dataset, so (from the output you posted, you have 200 training images) that would be samples_per_epoch=200. The same goes for nb_val_samples (I think the correct value is 50?).
@robertomest the train generator returns the same kind of output. After running your modification, I got a similar error:
Found 200 images belonging to 1 classes.
<type 'numpy.ndarray'>
(32, 3, 256, 256)
Found 50 images belonging to 1 classes.
<type 'numpy.ndarray'>
(18, 3, 256, 256)
Epoch 1/50
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-24-fba06bbb2e63> in <module>()
76 #shuffle=True,
77 validation_data= fixed_generator(validation_generator), # fixed
---> 78 nb_val_samples=5
79 )
/home/osman/.virtualenvs/keras/local/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, nb_worker, pickle_safe)
1425 raise Exception('output of generator should be a tuple '
1426 '(x, y, sample_weight) '
-> 1427 'or (x, y). Found: ' + str(generator_output))
1428 # build batch logs
1429 batch_logs = {}
Exception: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[ 1.  1.  1. ...,  1.  1.  1. ]
[ 1.  1.  1. ...,  1.  1.  1. ]
[ 1.  1.  1. ...,  1.  1.  1. ]
..., (long numpy printout of a raw image batch, truncated)]]]]
I got it working. We needed to apply the fix to the training generator as well. I did some other small fixes on the code, such as:
- Added the missing border_mode='same' (and made the decoder output 3 channels to match the RGB input).
- Changed samples_per_epoch to what I assume is the correct value.
- Changed nb_val_samples accordingly.
Let me know if the code below works for you.
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
def fixed_generator(generator):
    for batch in generator:
        yield (batch, batch)
# dimensions of our images.
img_width, img_height = 256, 256
train_data_dir = '/home/osman/keras_test/train'
validation_data_dir = '/home/osman/keras_test/validation'
nb_train_samples = 200
nb_validation_samples = 50
nb_epoch = 50
batch_size = 32
input_img = Input(shape=(3, img_width, img_height))
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(input_img)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
encoded = MaxPooling2D((2, 2), border_mode='same')(x)
# at this point the representation is (8, 32, 32) i.e. 8192-dimensional
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(3, 3, 3, activation='sigmoid', border_mode='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None)
print type(train_generator)
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None)
autoencoder.fit_generator(
fixed_generator(train_generator),
samples_per_epoch=nb_train_samples,
nb_epoch=nb_epoch,
validation_data=fixed_generator(validation_generator),
nb_val_samples=nb_validation_samples
)
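Since your goal is feature extraction, once training finishes you can build an encoder model from the layers up to the bottleneck; a minimal sketch, assuming the input_img and encoded tensors defined above are still in scope:
encoder = Model(input_img, encoded)
features = encoder.predict(next(validation_generator))
print features.shape   # (batch_size, 8, 32, 32) for 256x256 inputs
features = features.reshape(features.shape[0], -1)   # one flat vector per image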
@robertomest your modification runs. I trained it on a small dataset and tested it with an image, like this:
# test autoencoder
import matplotlib.pyplot as plt
import numpy as np
import cv2
filename = '/home/osman/logo_for_test/2.jpg'
im = cv2.resize(cv2.imread(filename), (256, 256)).astype(np.float32)  # note: cv2.imread returns BGR channel order
im = im * 1./255
im = im.transpose((2,0,1))
datas = np.zeros((1, 3, 256, 256))
datas[0, :, :, :] = im
decoded_imgs = autoencoder.predict(datas)
finim = (decoded_imgs[0]*255).astype(int)
finim = finim.transpose(1, 2, 0)
plt.imshow(finim)
plt.show()
The output is a gray image, although the input is an RGB image. Could you point out the problem? Thank you.
Sorry, I'm not sure I understand. Do you want the image in the output to be gray? If so, you should have targets that are gray images.
@robertomest I want to compare the input and output to evaluate the trained autoencoder. My output image is gray, but it should be in color.
Ok, I understand it now. It is quite interesting that your decoded image has no color. I just trained the autoencoder on a single image; this is the result:
[image: original and reconstructed image, side by side]
The autoencoder was able to reproduce the colors just fine. Maybe try fitting the autoencoder on a few images to see how it goes? Or upload some reconstructed pictures. To get an image to reproduce, I simply got a batch from the iterator and selected the first image:
img = next(validation_generator)[:1] # Get one image
dec = autoencoder.predict(img) # Decoded image
img = img[0]
dec = dec[0]
img = (img.transpose((1, 2, 0))*255).astype('uint8')
dec = (dec.transpose((1, 2, 0))*255).astype('uint8')
plt.imshow(np.hstack((img, dec)))
plt.title('Original and reconstructed images')
plt.show()
@robertomest Thank you, I think I have not trained it enough on a big dataset. I am running it again with more epochs.
This is a compact tutorial for learning autoencoders, especially on colour images. Thanks to all.
As @biswajitcsecu said, this serves as a great tutorial; many thanks @robertomest!
@robertomest @biswajitcsecu @neouyghur I trained this model on 224x224 images from the Scene15 dataset, which contains two folders: train and val. Each folder contains 15 sub-folders (one per category) of images. Do I need to put all the images from the categories into a single train and val folder? The reconstruction error is 56% if I don't do so.
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.optimizers import SGD
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
WEIGHTS_CLASSIFIER = 'cae.h5'
def fixed_generator(generator):
    for batch in generator:
        yield (batch, batch)
img_width, img_height = 224, 224
train_data_dir = 'C:/Experiment/vggimagenet-finetune(scene15)/data/train'
validation_data_dir = 'C:/Experiment/vggimagenet-finetune(scene15)/data/val'
nb_train_samples = 1320
nb_validation_samples = 180
nb_epoch = 50
batch_size = 32
input_img = Input(shape=(img_width, img_height, 3))
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(input_img)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
encoded = MaxPooling2D((2, 2), border_mode='same')(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, 3, 3, activation='relu', border_mode='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(3, 3, 3, activation='sigmoid', border_mode='same')(x)
autoencoder = Model(input_img, decoded)
print(autoencoder.summary())
opt = SGD(lr=0.1)
autoencoder.compile(optimizer=opt, loss='binary_crossentropy')
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None)
print(type(train_generator))
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None)
lr_decay = ReduceLROnPlateau(factor=0.9, patience=1, verbose=1)
checkpointer = ModelCheckpoint(filepath=WEIGHTS_CLASSIFIER, save_best_only=True, verbose=1)
autoencoder.fit_generator(
fixed_generator(train_generator),
samples_per_epoch=nb_train_samples,
nb_epoch=nb_epoch,
validation_data=fixed_generator(validation_generator),
nb_val_samples=nb_validation_samples,
callbacks=[lr_decay, checkpointer]
)
Hi @csitaula, I think you do not need to put them in a single folder. When instantiating the generator we use the argument class_mode=None which should mean no labels are returned regardless. You could check the images returned by the generator just to make sure they are coming out as expected.
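For example, a quick sanity check could look like this (a sketch using the generators from your code, which use channels-last image ordering):
import matplotlib.pyplot as plt

batch = next(train_generator)   # with class_mode=None this is a bare numpy array
print(batch.shape)              # e.g. (32, 224, 224, 3)
plt.imshow(batch[0])            # pixel values are already rescaled to [0, 1]
plt.show()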
On another note, this was just a simple example on how to train an autoencoder. I just fixed the code so that it would actually run and then tested it on a single image. I am not sure the architecture and/or hyperparameters are adequate for a larger dataset.
@robertomest can you help me debug the code? I can provide you with the code and datasets. I changed the architecture a bit and trained; the loss stays around 0.56, for both training and validation. No overfitting, I think. I ran it for 50 epochs.
I am trying to get this working in R. How do I convert
def fixed_generator(generator):
    for batch in generator:
        yield (batch, batch)
into R?
No one talks about the labels, which are the generated images themselves. I don't see the important part of the training, i.e. model.fit_generator(...): what should be put in the parentheses so that the training process knows the labels are just the images produced by the generator itself?
@inspirepassion agree
@robertomest your reply helped me a lot. Do you have any suggestions for building a data generator directly from .npy files? Looking forward to your reply.
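One simple option is an infinite generator over a preloaded array; a rough sketch, with features.npy as a hypothetical file of already-rescaled images:
import numpy as np

def npy_autoencoder_generator(path, batch_size=32):
    data = np.load(path)                 # e.g. shape (N, 3, 256, 256)
    n = data.shape[0]
    while True:                          # Keras generators must loop forever
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch = data[idx[start:start + batch_size]]
            yield (batch, batch)         # inputs double as targets for the autoencoder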