All,
I have about 200,000 images (each of size 115x115x3) that I'm trying to classify into 10 classes. None of the 10 classes are part of the ImageNet dataset, and I think 200k images is a fairly large dataset, so I've been trying to use ResNet or Inception v4 and train the models from scratch (I did make a few modifications to account for the fact that my images are only 115x115).
Using the suggestions in issues #68 and #2708, I set up my image generator to accept 20,000 images at a time. I made sure that each set of 20,000 images contained all 10 classes and was a representative sample of my entire dataset. To make my life easier I saved each set of 20,000 images as a .npy file. Below is pseudo-code:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=45,       # randomly rotate images in the range 0 to 45 degrees
    width_shift_range=0.1,   # randomly shift images horizontally
    height_shift_range=0.1,  # randomly shift images vertically
    horizontal_flip=True,    # randomly flip images horizontally
    vertical_flip=True)      # randomly flip images vertically
# read the first 20,000 images
X_firstSet = np.load("0_images.npy")
datagen.fit(X_firstSet)
for e in range(200):  # total of 200 passes over the full dataset
    for ithsample in range(10):
        X = np.load(str(ithsample) + '_images.npy')  # images stored as 0_images.npy, 1_images.npy, etc.
        y = np.load(str(ithsample) + '_labels.npy')  # labels stored as 0_labels.npy, 1_labels.npy, etc.
        # split into train and validation
        X_train, X_validation, y_train, y_validation = train_test_split(
            X, y, test_size=0.2, random_state=42, stratify=y)
        # model is the ResNet/Inception network built earlier in the script
        model.fit_generator(datagen.flow(X_train, y_train, batch_size=32),
                            steps_per_epoch=len(X_train) // 32,
                            validation_data=(X_validation, y_validation),
                            epochs=1, verbose=2)
I tried running the above on a Quadro P6000 (24 GB of GPU memory), but I get a MemoryError and the code stops running, usually after the first epoch. Below is the detailed error message:
Traceback (most recent call last):
  File "/home/paperspace/myScripts/12partdiff/train_using_resnet.py", line 118, in <module>
    model.fit_generator(train_datagen.flow(X_train, y_train, batch_size=batch_size),
  File "/home/paperspace/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py", line 493, in flow
    save_format=save_format)
  File "/home/paperspace/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py", line 842, in __init__
    self.x = np.asarray(x, dtype=K.floatx())
  File "/home/paperspace/anaconda3/lib/python3.5/site-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
MemoryError
I installed Keras from master, and just to be sure I also followed the suggestion in #8249 and set up my code to select the GPU before importing anything from Keras.
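In practice that looks roughly like the snippet below (a sketch; the device id "0" is just an example, see #8249 for the original suggestion):

import os
# pin the process to one GPU before Keras/TensorFlow are imported
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import keras  # import Keras only after the environment variable is set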
Do I have to upgrade my machine or is something wrong in my code? Odds are that I will end up with a much larger dataset (around 500k images). Any suggestions will be appreciated.
Thanks
I figured it out - I was running out of memory on my GPU because of real-time augmentation. Real-time augmentation creates float arrays, and that was eating up my available memory. So, if anyone stumbles upon this post: (i) use fewer than 20k images at a time, even if you have 24 GB of RAM on your GPU, and (ii) do not use data augmentation.
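For a rough sense of scale, here is the arithmetic on the float conversion, assuming one 20,000-image chunk of 115x115x3 like the .npy files above (the np.asarray(x, dtype=K.floatx()) call in the traceback makes exactly this kind of float32 copy of the whole chunk):

n_images = 20000
bytes_uint8 = n_images * 115 * 115 * 3   # ~0.79 GB as raw uint8 pixels
bytes_float32 = bytes_uint8 * 4          # ~3.2 GB once cast to float32 (K.floatx())
print(bytes_uint8 / 1e9, bytes_float32 / 1e9)

And that copy sits on top of the original array, before any augmented batches are generated.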
Just a heads up, I was having the same error when I tried to normalize my image data (it was read in as RGB int values from 0-255, so I was applying X_train/255.). I was using the Stanford Cars dataset, which meant around 8,000 training images and 8,000 test images.
I didn't really want to give up data augmentation, but I found a way to get it to run on my laptop without any MemoryErrors (Intel® Core™ i9-8950HK CPU @ 2.90GHz × 12).
1) The main solution was that, instead of normalizing the input data directly, I applied a Scale step right after every batch normalization. This normalizes the data only when it is needed, rather than all 8,000 images at once. This is the code I used: https://github.com/flyyufelix/DenseNet-Keras/blob/master/custom_layers.py, and a section of my conv_block code looked like this:
X = Conv2D(F1, (1, 1), strides = (s,s), padding = 'valid', name = conv_name_base + '2a', use_bias=False, kernel_initializer = glorot_uniform(seed=0))(X)
X = BatchNormalization(epsilon = eps, axis = 3, name = bn_name_base + '2a')(X)
X = Scale(axis=3, name = scale_name_base + '2a')(X)
X = Activation('relu')(X)
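For reference, here is a minimal sketch of what such a learnable per-channel scale/shift layer can look like in Keras 2. It is a simplified stand-in for the Scale layer in the repository linked above, not the exact implementation:

from keras import backend as K
from keras.layers import Layer  # older Keras: from keras.engine.topology import Layer

class SimpleScale(Layer):
    # learns a per-channel gamma and beta and applies out = gamma * x + beta
    def __init__(self, axis=-1, **kwargs):
        self.axis = axis
        super(SimpleScale, self).__init__(**kwargs)

    def build(self, input_shape):
        shape = (int(input_shape[self.axis]),)
        self.gamma = self.add_weight(name='gamma', shape=shape,
                                     initializer='ones', trainable=True)
        self.beta = self.add_weight(name='beta', shape=shape,
                                    initializer='zeros', trainable=True)
        super(SimpleScale, self).build(input_shape)

    def call(self, x):
        # reshape gamma/beta so they broadcast along the chosen channel axis
        broadcast_shape = [1] * K.ndim(x)
        broadcast_shape[self.axis] = K.int_shape(x)[self.axis]
        return x * K.reshape(self.gamma, broadcast_shape) + K.reshape(self.beta, broadcast_shape)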
2) I also altered the main code so that it only read in the test images after training was finished (and I dumped the training image data once I was done with it). This further cut the memory requirement roughly in half.
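In code, the dump-and-reload step can be as simple as something like this (the filenames are hypothetical):

import gc
import numpy as np

del X_train, y_train     # drop references to the large training arrays
gc.collect()             # encourage Python to release that memory immediately

X_test = np.load('test_images.npy')   # hypothetical filenames for the held-out set
y_test = np.load('test_labels.npy')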
I hope this helps!