Keras: Confused about ImageDataGenerator

Created on 7 Apr 2016  ·  18 Comments  ·  Source: keras-team/keras

Data augmentation can help when there isn't enough training data. I see that ImageDataGenerator can do this, but the documentation is not very detailed. I ran the example cifar10_cnn.py, and I'm still not clear about how to use it. Besides, I don't see any difference between using ImageDataGenerator and not using it. Can anyone explain how to use it? Thank you very much!

All 18 comments

You generate new training samples by randomly distorting your original images, e.g. by applying translation, rotation, etc. You set the generator's parameters according to the kinds of variation you want. The benefit is that you effectively enlarge your training dataset and cover more of the variance in the data distribution, which should help the model generalize better.
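To make "randomly distorting your original images" concrete, here is a minimal pure-Python sketch of two such distortions (a random horizontal flip and a random horizontal shift). It is not the Keras implementation, which works on numpy arrays; the function name random_flip_and_shift and the list-of-rows image format are made up for illustration.

```python
import random

def random_flip_and_shift(image, max_shift, rng):
    """Return a randomly distorted copy of a 2D image (a list of rows).

    Hypothetical helper for illustration only; not part of Keras.
    """
    # Work on a copy so the original image is left untouched.
    out = [row[:] for row in image]

    # Flip left-to-right with probability 0.5.
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]

    # Shift horizontally by a random number of pixels, padding with zeros.
    shift = rng.randint(-max_shift, max_shift)
    width = len(out[0])
    shifted = []
    for row in out:
        if shift >= 0:
            new_row = [0] * shift + row[:width - shift]
        else:
            new_row = row[-shift:] + [0] * (-shift)
        shifted.append(new_row)
    return shifted

image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
rng = random.Random(0)  # seeded so runs are repeatable

# Each call yields a (possibly) different variant of the same image.
augmented = [random_flip_and_shift(image, max_shift=1, rng=rng) for _ in range(5)]
```

Every variant keeps the original shape, so the model sees more variation without any new images having to be collected.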

Keras has two fit options: model.fit, which takes conventional numpy arrays as input, and model.fit_generator, which takes Python generators as input. Read up on what generators are if you are not familiar with them. After that, go back to ImageDataGenerator and see how it is just a wrapper class around the _flow_generator generator.
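For anyone unfamiliar with generators, a small sketch in plain Python may help. It shows that a generator builds each batch only when asked for it. The names lazy_batches and built are hypothetical, and nothing here touches Keras.

```python
built = []  # records the start index of every batch that was actually built

def lazy_batches(data, batch_size):
    """Yield consecutive slices of data, one batch per request."""
    for start in range(0, len(data), batch_size):
        built.append(start)  # runs only when this batch is requested
        yield data[start:start + batch_size]

data = list(range(10))
gen = lazy_batches(data, batch_size=4)

# Creating the generator has not built anything yet.
batches_before_first_request = list(built)

# Asking for one batch builds exactly one batch.
first = next(gen)
batches_after_first_request = list(built)
```

A consumer such as fit_generator repeatedly asks for the next batch in the same way, so only the batch being processed needs to exist at any one time.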

Thanks for your answer! But I get a MemoryError when I try to use ImageDataGenerator. My 8 GB of CPU RAM seems to be too small, or I am not using it efficiently. So what should I do?

I don't know how much data you are trying to load at once. If your batch size is small enough, it should work.
This is what generators are all about: they don't create all the data beforehand, only when you ask for it. So I'd assume that generating a single batch with your configuration consumes way too much memory.

So, what is your batch size, and how many workers are you using?

I used 13000 pictures of size 227*227, and my batch size was 128. I then set the batch size to 32 and even smaller, but the error is still there. Besides, I used the default nb_workers value of 1.
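A rough back-of-the-envelope calculation suggests one possible reason the batch size makes no difference: if the full image array is held in memory (the thread passes X_train to datagen.fit), the array alone may already approach 8 GB. The sketch below assumes 3 color channels and float32 storage; neither is stated in the thread.

```python
# Figures from the thread: 13000 images of 227x227 pixels.
num_images = 13000
height = width = 227
channels = 3            # assumption: RGB images; the thread does not say
bytes_per_float32 = 4   # assumption: pixels stored as float32

elements = num_images * height * width * channels
bytes_as_uint8 = elements                       # 1 byte per value
bytes_as_float32 = elements * bytes_per_float32

gib = 1024 ** 3
print("uint8:   %.2f GiB" % (bytes_as_uint8 / gib))
print("float32: %.2f GiB" % (bytes_as_float32 / gib))
```

Under these assumptions the float32 array comes to roughly 7.5 GiB before any batch is generated, which would leave very little headroom on an 8 GB machine.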

@EderSantana Thanks a lot for your help!

Sure! Let me know if you need anything else. If not, please close the issue!

I want to know how I can get the fixed generator method, since I saw it was on the keras_1 branch.

If those fixes work for you, I think you will have to modify your code. Here are the diffs:
https://github.com/fchollet/keras/pull/2152/commits/30989dc997afcfe7097692e75ac5ff9e7ab06e55

Thanks again!

I tried to modify my code, but I found that the code in the keras_1 and master branches is quite different. So I only modified fit_generator, and it didn't help. Another problem: when I changed datagen.fit(X_train) to datagen.fit(X_train, augment=True), I got the following error:
Traceback (most recent call last):
  File "cifar10_cnn.py", line 108, in <module>
    datagen.fit(X_train, augment=True)
  File "/usr/local/lib/python2.7/dist-packages/keras/preprocessing/image.py", line 288, in fit
    img = self.random_transform(img)
  File "/usr/local/lib/python2.7/dist-packages/keras/preprocessing/image.py", line 253, in random_transform
    x = random_shift(x, self.width_shift_range, self.height_shift_range)
  File "/usr/local/lib/python2.7/dist-packages/keras/preprocessing/image.py", line 33, in random_shift
    shift_x = np.random.uniform(-wrg, wrg) * x.shape[2]
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 528, in __getattr__
    raise AttributeError(name)
AttributeError: shape

@EderSantana

That is weird... maybe the change made x stop being a numpy array? The traceback says x has no shape attribute, and the error is raised from PIL, so x looks like a PIL image at that point. A regular numpy array would have a shape.

Was the original code working with CIFAR-10? Does the problem occur with both the Theano and TensorFlow backends?

If nothing works, try creating a new environment and installing keras-0.3.2. If even that doesn't work, maybe something weird is happening somewhere outside Keras. Unless the datagen has had a bug all along.

@fchollet this generator is working on 0.3.3 right?

Yeah, the original code worked well with CIFAR-10, and I only used Theano. Then I tried modifying fit_generator in models.py and fitting on my own data (227*227 images), but the memory error remained.

We just rewrote the ImageDataGenerator in #2446.
Can you check if your issue is solved with this?

To help debug the problem can you please show your ImageDataGenerator initialisation and fit snippet?

Yeah, I have solved the problem, but not with the rewritten ImageDataGenerator. I also raised an issue a few days ago: #2318.

Does ImageDataGenerator change class labels as well? If not, is there a way to change class labels?
Or in other words, which line in the ImageDataGenerator class makes sure that its output is a tuple (inputs, targets), as required by fit_generator?
Thanks in advance.

@flyingpoops this line ensures that when used as a generator, the output is a tuple of (inputs, targets).

https://github.com/fchollet/keras/blob/master/keras/preprocessing/image.py#L318
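On the labels question, the usual pattern can be sketched as follows: the random transform is applied to the inputs only, and the targets for the same indices are passed through unchanged. This is a plain-Python illustration rather than the Keras source; flow, mirror, and the toy data are hypothetical names.

```python
def flow(inputs, targets, batch_size, transform):
    """Endlessly yield (inputs, targets) batches, transforming inputs only."""
    n = len(inputs)
    while True:
        for start in range(0, n, batch_size):
            stop = start + batch_size
            # Distort each input sample in the batch...
            batch_x = [transform(x) for x in inputs[start:stop]]
            # ...but leave the matching labels exactly as they were.
            batch_y = targets[start:stop]
            yield batch_x, batch_y

def mirror(row):
    """Stand-in for an image transform: reverse a row of pixels."""
    return row[::-1]

inputs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
targets = ["cat", "dog", "cat"]

batch_x, batch_y = next(flow(inputs, targets, batch_size=2, transform=mirror))
```

If the labels themselves need to change, one option under this pattern is to wrap the generator in another generator that rewrites batch_y before yielding.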
