Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.
Thank you!
[x] Check that you are up-to-date with the master branch of Keras. You can update with:
pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
[x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
[ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
[x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
Hi,
I'm trying to train a small model to classify MNIST on a TitanX GPU.
Here's the gist of the script:
from keras.layers import (Input, Conv2D, BatchNormalization, MaxPooling2D,
                          Dropout, Flatten, Dense)
from keras.models import Model

# input_shape, num_classes, batch_size, epochs, and the MNIST arrays
# (x_train, y_train, x_test, y_test) are defined earlier in the script.
x = Input(shape=input_shape)
layer = Conv2D(32, kernel_size=(3, 3), activation='relu')(x)
layer = BatchNormalization()(layer)
layer = Conv2D(64, kernel_size=(3, 3), activation='relu')(layer)
layer = BatchNormalization()(layer)
layer = MaxPooling2D(pool_size=(2, 2))(layer)
layer = Dropout(0.25)(layer)
layer = Flatten()(layer)
layer = Dense(128, activation='relu')(layer)
layer = Dropout(0.5)(layer)
predictions = Dense(num_classes, activation='softmax')(layer)

model = Model(inputs=x, outputs=predictions)
model.compile(optimizer='Adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
I get an error message about running out of memory (on a TitanX?! for such a small model?).
After some experiments, am I missing something? (Yes, I know, BatchNorm should allegedly come before the activation, but that's arguable. I'd like to see it work without crashing first.)
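For reference, the BN-before-activation ordering would look roughly like this (just a sketch, not what I'm running; Activation is imported from keras.layers):

layer = Conv2D(32, kernel_size=(3, 3))(x)  # no activation on the conv itself
layer = BatchNormalization()(layer)
layer = Activation('relu')(layer)          # nonlinearity applied after BN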
Thanks,
Guy
I had a performance issue, too, with BN on convolutional layers.
I'm using Theano as backend.
BatchNorm is pretty memory intensive.
Let's say you have a 300x300x512 image. BN initializes 2 vectors of shape (512,) (gamma and beta) and keeps a running mean and variance, so 4 vectors of 512. Then it creates another normalized 300x300x512 tensor. So basically, you more than double the memory usage. Expected.
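A quick back-of-the-envelope check of that claim (just a sketch; float32 and the 300x300x512 example shape are assumed):

h, w, c = 300, 300, 512
bytes_per_float = 4  # float32

feature_map = h * w * c * bytes_per_float  # one full-size activation tensor
bn_params = 4 * c * bytes_per_float        # gamma, beta, running mean, running var

print('feature map : %.1f MB' % (feature_map / 1e6))  # ~184.3 MB per sample
print('BN params   : %.1f KB' % (bn_params / 1e3))    # ~8.2 KB, negligible

The parameters are tiny; it's the second full-size (normalized) tensor that doubles the activation memory, and that's per sample, so multiply by the batch size.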
The memory you're describing is GPU memory.
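If you're on the TensorFlow backend, one common mitigation is to stop TF from pre-allocating the whole GPU up front (a minimal sketch, assuming the TF 1.x-era API that Keras used at the time):

import tensorflow as tf
from keras import backend as K

# Let TensorFlow grow its GPU allocation as needed instead of grabbing it all.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))

This doesn't reduce what the model actually needs, but it makes the real usage visible in nvidia-smi instead of showing the GPU as fully allocated.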