- [x] Check that you are up-to-date with the master branch of Keras. You can update with:
  `pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps`
- [x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
- [x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
`ImageDataGenerator.flow()` expects numpy arrays and does not accept TF tensors containing the same image data. Should it be updated to accept TF tensors of images and labels, and are there any plans to do so?
Proposed example code for loading such data from a TFRecord for semantic segmentation:
```python
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

# tfrecord_filenames_queue: a string queue of TFRecord filenames,
# e.g. created with tf.train.string_input_producer()
reader = tf.TFRecordReader()
_, serialized_example = reader.read(tfrecord_filenames_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
        'mask_raw': tf.FixedLenFeature([], tf.string)
    })

image = tf.decode_raw(features['image_raw'], tf.uint8)
annotation = tf.decode_raw(features['mask_raw'], tf.uint8)
height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)

# tf.pack was renamed tf.stack in TensorFlow 1.0
image_shape = tf.stack([height, width, 3])
annotation_shape = tf.stack([height, width, 1])
image = tf.reshape(image, image_shape)
annotation = tf.reshape(annotation, annotation_shape)

# This will do preprocessing and realtime data augmentation:
datagen = ImageDataGenerator(
    featurewise_center=False,             # set input mean to 0 over the dataset
    samplewise_center=False,              # set each sample mean to 0
    featurewise_std_normalization=False,  # divide inputs by std of the dataset
    samplewise_std_normalization=False,   # divide each input by its std
    zca_whitening=False,                  # apply ZCA whitening
    rotation_range=0,                     # randomly rotate images in the range (degrees, 0 to 180)
    width_shift_range=0.1,                # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.1,               # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,                 # randomly flip images horizontally
    vertical_flip=False)                  # randomly flip images vertically

# Desired usage: pass the TF tensors directly (this currently raises
# an error, since flow() only accepts numpy arrays):
datagen.flow(image, annotation, batch_size=32)
```
This actually seems to apply to many of the Model API's methods as well. I did search for a relevant issue; hopefully this isn't a duplicate.
https://github.com/fchollet/keras/issues/5368 is a related, but slightly different, issue about feeding tensors directly into a model using a TensorFlow-based implementation of flow() to handle TFRecords.
The Keras preprocessing is entirely outside of the symbolic graph - it operates on image data, not tensors.

Personally I don't think there should be any support for this. Preprocessing, IMO, should not be part of a model: no need to worry about train/test-time manipulations of the graph to turn augmentation on and off; no forcing the model itself to read from disk and be limited by what and how it reads; no concerns about the GPU doing work the CPU should be handling in parallel; and so on. I've not seen any discussion about adding a new form of preprocessing within the graph itself, and most of the discussion around it has been confusion about what that would actually mean in terms of symbolic computing.
> The Keras preprocessing is entirely outside of the symbolic graph - it operates on image data, not tensors.
@patyork So am I correct that what I'd like to do is simply not possible with the current code as-is, and that I must convert my images to numpy arrays before training?
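For reference, a minimal sketch of that conversion, assuming the `image` and `annotation` tensors from the example above, a dataset small enough to fit in memory, and a hypothetical `num_examples` count:

```python
import numpy as np

# Materialize the symbolic tensors as numpy arrays, then hand those to Keras.
with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    images, annotations = [], []
    for _ in range(num_examples):  # num_examples: hypothetical dataset size
        img, ann = sess.run([image, annotation])
        images.append(img)
        annotations.append(ann)

    coord.request_stop()
    coord.join(threads)

# np.stack assumes every example has the same height and width.
x = np.stack(images)       # (num_examples, height, width, 3)
y = np.stack(annotations)  # (num_examples, height, width, 1)

# The standard numpy-based API then works as usual:
for x_batch, y_batch in datagen.flow(x, y, batch_size=32):
    ...
```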
> Personally I don't think there should be any support for this. Preprocessing, IMO, should not be part of a model
I'm not sure if you're specifically talking about training, but in general this is a heavily application-dependent decision that should be left up to the user. Consider a soft-realtime, deploy-time processing pipeline that first uses a spatial transformer network to transform an image into a space in which another network then makes decisions: you wouldn't want to drop back to Python to actually apply the transformation, as it would add significant latency.
> no forcing the model itself to read from disk and be limited by what and how it reads
I find that Python code can become the limiting factor in many use cases; isn't this the purpose of the queues built into TensorFlow, for example?
That's correct. You could, alternatively, use whatever preprocessing code TF provides (https://www.tensorflow.org/api_docs/python/image/) and write any missing functionality you need using TF functions.
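For example, a minimal sketch of in-graph augmentation using the TF 1.x image ops, continuing from the `image` and `annotation` tensors above; the 224×224 crop size and the queue capacities are illustrative assumptions:

```python
# Concatenate image and mask along the channel axis so that random
# spatial augmentations are applied identically to both.
combined = tf.concat([image, annotation], axis=2)        # (h, w, 4)
combined = tf.image.random_flip_left_right(combined)     # random horizontal flip
combined = tf.random_crop(combined, size=[224, 224, 4])  # illustrative crop size
image_aug = combined[:, :, :3]
annotation_aug = combined[:, :, 3:]

# Batch with a TF queue instead of a Python generator:
image_batch, annotation_batch = tf.train.shuffle_batch(
    [image_aug, annotation_aug],
    batch_size=32,
    capacity=1000,
    min_after_dequeue=500,
    num_threads=2)
```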
Thanks. Yes, I could also either use TensorFlow-only code or a library like tensorpack/tensorlayer.
Should this issue then become a feature request for additional discussion, or is it best closed immediately?
I created some hypothetical example code that solves this in https://github.com/fchollet/keras/pull/6891#issuecomment-307652155. Also related are https://github.com/fchollet/keras/issues/6538 and https://github.com/fchollet/keras/pull/6928. I'm closing this in favor of those.
> `datagen.flow(image, annotation, batch_size=32)`

The generator doesn't always yield batches of the expected `batch_size`. How can I skip the batches whose size is not 32?
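One possible workaround, as a minimal sketch: wrap the generator and drop every batch smaller than the requested size (with `flow()` this is typically the final batch of a pass when the sample count is not a multiple of `batch_size`); `model`, `x`, and `y` below are hypothetical placeholders:

```python
def full_batches_only(generator, batch_size=32):
    """Yield only batches that contain exactly `batch_size` samples."""
    for x_batch, y_batch in generator:
        if len(x_batch) == batch_size:
            yield x_batch, y_batch

# Hypothetical usage: steps_per_epoch must count only the full batches.
model.fit_generator(
    full_batches_only(datagen.flow(x, y, batch_size=32), batch_size=32),
    steps_per_epoch=len(x) // 32,
    epochs=10)
```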