Keras: model.fit(): batch_size vs. steps_per_epoch

Created on 19 Feb 2018 · 3 Comments · Source: keras-team/keras

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • [x] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • [x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • [ ] If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • [x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

All 3 comments

In the method model.fit(), if steps_per_epoch is specified, batch_size cannot also be specified and defaults to None. With array inputs, this has the effect of setting the batch size to the total number of samples: if you have 25,000 samples and specify steps_per_epoch=1000, each epoch will consist of 1000 steps, where each step is a batch of all 25,000 samples. Ouch.

Below is a test case that reproduces the problem. When use_batch_size is True, training works correctly with a batch size of 32. When use_batch_size is False, Keras tries to process all 25,000 samples in a single batch and runs out of memory.

# batchSizeVsSteps.py - compare model.fit() using batch_size vs. steps_per_epoch
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM
from keras.datasets import imdb

max_features = 20000  # vocabulary size
maxlen = 80           # pad/truncate each review to 80 tokens

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

sample_count = len(x_train)  # 25,000 training samples
batch_size = 32
steps_per_epoch = sample_count // batch_size  # 781 steps of 32 samples each
use_batch_size = False

if use_batch_size:
    # Works: Keras slices the arrays into batches of 32.
    model.fit(x_train, y_train, batch_size=batch_size, epochs=1)
else:
    # Fails: batch_size stays None, so every step feeds all 25,000 samples.
    model.fit(x_train, y_train, steps_per_epoch=steps_per_epoch, epochs=1)
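For what it's worth, steps_per_epoch makes more sense when the batching is done by the data source itself, e.g. a Python generator passed to fit_generator(); there the generator fixes the batch size and steps_per_epoch only controls how many batches count as one epoch. A minimal sketch of that workaround, reusing x_train/y_train and batch_size from the script above (the batch_generator helper is my own illustration, not part of Keras):

import numpy as np

def batch_generator(x, y, batch_size):
    # Yield shuffled batches indefinitely; fit_generator ends each
    # epoch after steps_per_epoch batches have been drawn.
    n = len(x)
    while True:
        order = np.random.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            batch = order[start:start + batch_size]
            yield x[batch], y[batch]

steps = len(x_train) // batch_size  # step count and batch size now agree
model.fit_generator(batch_generator(x_train, y_train, batch_size),
                    steps_per_epoch=steps, epochs=1)

With the generator in charge of batching, steps_per_epoch no longer silently changes the batch size.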

鎴戞鏈轰簡

Any resolution to this?
