Unless you are using the ModelCheckpoint callback with the save_best_only parameter, model.fit() returns not the best model encountered during training (i.e. the one with the lowest validation loss or the highest validation accuracy), but rather whatever model happens to exist at the last epoch (which cannot be counted on to have the lowest loss or the highest accuracy).
Since many users, especially beginners, are unaware of this callback, their results are by default not the best possible with the given parameters, and the accuracy of their network is unnecessarily degraded, even though a better model has likely already been encountered during the training pass that was just performed.
Proposed change: model.fit() should return the best model encountered during training by default, or, if that would negatively impact performance, at least provide a parameter to do so.
The current workaround is the following:
from keras.callbacks import ModelCheckpoint

model = Sequential()
...
model.compile(loss='mse', optimizer=opt)

# Save the best model (lowest validation loss) seen during training.
checkpointer = ModelCheckpoint(filepath='weights.hdf5', verbose=1, save_best_only=True)
hist = model.fit(..., callbacks=[checkpointer])

# Reload the best weights before predicting.
model.load_weights('weights.hdf5')
predicted = model.predict(X_test_mat)
I think changing the behavior of fit is not appropriate, and Callback is powerful enough to achieve this. Maybe we should add some tutorials to the FAQ to help users new to Keras.
What would be the downside of such a change? Why would one not want model.fit() to return the best you can get during the current run?
Why would one not want model.fit() to return the best you can get during the current run?

Because the word "best" is relative. Sometimes users want the lowest loss on the training data, but most of the time it is on the validation data.

And yes, +1 for an FAQ entry.
I see. But with the current behavior it's neither the lowest training loss nor the lowest validation loss; it is whatever happens to be there at the last epoch. I think selecting an intelligent default (e.g. lowest validation loss) would be the right thing to do, in particular to make it easier for beginners to start and get better results. There are no backwards-compatibility issues, and experts will still have full control.
Case in point: I started with Keras about three months ago, trying to use an LSTM for time-series forecasting. With the "default" behavior I was getting forecast accuracy around or barely above 50%. Only recently did I realize that the model returned after fit(), which I had been using to predict(), is not the best model; after adding the ModelCheckpoint callback to save the best model, my accuracy with otherwise the same parameters went up to 55-60%. I am just thinking of others trying machine learning who, for this reason, get the unnecessarily wrong impression that "it doesn't work yet" before they reach the depths of the documentation.
And of course I support covering this in FAQ, if the decision is not to make the change.
What would be the downside of such a change?

The behavior of fit in other ML frameworks is the same as in Keras. Modifying it might produce unexpected results for most users.
Could this maybe be an optional parameter to fit? Something along these lines:

fit(..., return_best_model=False)

Doing so, we would keep the current behaviour and also ease retrieving the best model (with return_best_model=True) without storing files to disk and with less code.
Sure, I think this would be a nice solution.
Is the only currently existing solution to save the weights to an HDF5 file using ModelCheckpoint, then load the file again and apply it to a model, which is then returned?

So, something similar to this:
from keras import callbacks as kcallbacks

def getBestModel(...):
    model.compile(...)
    best_weights_filepath = './best_weights.hdf5'
    earlyStopping = kcallbacks.EarlyStopping(monitor='val_loss', patience=10,
                                             verbose=1, mode='auto')
    saveBestModel = kcallbacks.ModelCheckpoint(best_weights_filepath, monitor='val_loss',
                                               verbose=1, save_best_only=True, mode='auto')

    # train model
    history = model.fit(x_tr, y_tr, batch_size=batch_size, nb_epoch=n_epochs,
                        verbose=1, validation_data=(x_va, y_va),
                        callbacks=[earlyStopping, saveBestModel])

    # reload best weights
    model.load_weights(best_weights_filepath)
    return model
Could we get an update on this? The current behavior limits the use of the scikit-learn wrappers in grid search.
EDIT: This is the workaround I use for now. The model name needs to be set in the model-building function that passes the model to GridSearchCV, to avoid duplicate model names and to allow multiple threads in CV.
import keras

class CustomCheckpoint(keras.callbacks.ModelCheckpoint):
    """ModelCheckpoint variant that derives its filepath from the model name."""
    def __init__(self, output_dir, monitor='val_loss', verbose=0,
                 save_best_only=False, save_weights_only=False, mode='auto'):
        self.output_dir = output_dir
        super(CustomCheckpoint, self).__init__('', monitor, verbose,
                                               save_best_only, save_weights_only, mode)

    def on_train_begin(self, logs=None):
        # Build a per-model filepath so parallel grid-search workers don't collide.
        self.filepath = self.output_dir + self.model.name + '_weights.hdf5'

class BestModel(keras.callbacks.Callback):
    """Reload the best checkpointed weights at the end of training."""
    def __init__(self, output_dir, verbose=0):
        self.output_dir = output_dir
        self.verbose = verbose

    def on_train_end(self, logs=None):
        weights_file = self.output_dir + self.model.name + '_weights.hdf5'
        self.model.load_weights(weights_file)
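For context, here is a hypothetical way to wire these callbacks into a grid search; build_model, param_grid, X, and y are placeholder names, not from this thread:

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# Both callbacks share an output directory; the per-model filename
# is derived from model.name, which build_model must set uniquely.
cbs = [CustomCheckpoint('./checkpoints/', monitor='val_loss', save_best_only=True),
       BestModel('./checkpoints/')]

clf = KerasClassifier(build_fn=build_model, epochs=50, verbose=0)
grid = GridSearchCV(clf, param_grid=param_grid)
grid.fit(X, y, callbacks=cbs)  # fit kwargs are forwarded to model.fit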
Has this issue been resolved or addressed? What is the optimal way to return the best weights from any given epoch based on metrics obtained?
@jerpint Currently, fit does not have a return_best_model parameter. See the fit docs.
At the moment, the best way to save the best model is to use the ModelCheckpoint callback:
from keras.callbacks import ModelCheckpoint
...
mcp = ModelCheckpoint(model_chk_path, monitor="val_acc",
                      save_best_only=True, save_weights_only=False)
model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=nb_epoch,
          validation_data=(X_test, Y_test),
          shuffle=True,
          callbacks=[mcp])
@MartinThoma I think you meant the best way to save the best model, am I correct? My understanding is that after fitting for n epochs, if I need to predict using the best model, I still need to explicitly load it before predicting.
@roebius Right, I wanted to write "save". Thank you, I've edited it. Yes, I also think that you would need to load it from the checkpoint.
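To make that explicit, a minimal sketch (reusing model_chk_path from the snippet above):

# Reload the checkpointed best weights before predicting;
# otherwise predict() uses the last-epoch weights.
model.load_weights(model_chk_path)
predictions = model.predict(X_test)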
I think changing the default to return the best model is a smart idea... it is definitely confusing. At the very least, the Keras examples should show it very prominently.
As a new user, you shouldn't have to go into a Github issue in order to return the best model.
@Hudler do you have a similar class for EarlyStopping?
I think you might be describing the solution to my problem, but I'm not sure.
When I try to put a KerasClassifier into GridSearch (on a subset of data), I get the following output:
````
(...)
Epoch 00021: val_acc did not improve
173/173 [==============================] - 0s - loss: 0.5392 - acc: 0.7225 - val_loss: 0.3749 - val_acc: 1.0000
109/109 [==============================] - 0s
192/217 [=========================>....] - ETA: 0s
217/217 [==============================] - 0s
Train on 173 samples, validate on 44 samples
Epoch 1/200
64/173 [==========>...................] - ETA: 0s - loss: 0.7012 - acc: 0.5156
96/173 [===============>..............] - ETA: 0s - loss: 0.7198 - acc: 0.4896
128/173 [=====================>........] - ETA: 0s - loss: 0.7217 - acc: 0.5156
160/173 [==========================>...] - ETA: 0s - loss: 0.7209 - acc: 0.5188Epoch 00000: early stopping
Epoch 00000: val_acc did not improve
173/173 [==============================] - 0s - loss: 0.7216 - acc: 0.5145 - val_loss: 0.6673 - val_acc: 0.7273
32/109 [=======>......................] - ETA: 0s
109/109 [==============================] - 0s
217/217 [==============================] - 0s
Train on 174 samples, validate on 44 samples
Epoch 1/200
32/174 [====>.........................] - ETA: 0s - loss: 0.7479 - acc: 0.5000
64/174 [==========>...................] - ETA: 0s - loss: 0.7409 - acc: 0.5156
96/174 [===============>..............] - ETA: 0s - loss: 0.7245 - acc: 0.5938
128/174 [=====================>........] - ETA: 0s - loss: 0.7170 - acc: 0.6172
160/174 [==========================>...] - ETA: 0s - loss: 0.7098 - acc: 0.6312Epoch 00000: early stopping
Epoch 00000: val_acc did not improve
174/174 [==============================] - 0s - loss: 0.7063 - acc: 0.6322 - val_loss: 0.7228 - val_acc: 0.1591
108/108 [==============================] - 0s
64/218 [=======>......................] - ETA: 0s
128/218 [================>.............] - ETA: 0s
192/218 [=========================>....] - ETA: 0s
218/218 [==============================] - 0s
Train on 173 samples, validate on 44 samples
Epoch 1/200
48/173 [=======>......................] - ETA: 0s - loss: 0.6862 - acc: 0.5417
96/173 [===============>..............] - ETA: 0s - loss: 0.6757 - acc: 0.6250
144/173 [=======================>......] - ETA: 0s - loss: 0.6713 - acc: 0.5972Epoch 00000: early stopping
Epoch 00000: val_acc did not improve
````
Apparently, since the same instance of EarlyStopping is reused from the previous iteration of the grid search, if the model does not immediately improve at the first epoch on what was previously saved as "best", training stops right away. This effectively renders GridSearch useless.
This post gives a very detailed description of how to use ModelCheckpoint in Keras. Hope it helps.
@FrugoFruit90 Actually, I ran multiple instances of my Python script to perform the grid search.
However, you can write your own custom EarlyStopping, for example by changing the on_train_end method of the default class (https://github.com/fchollet/keras/blob/master/keras/callbacks.py#L505). You need to set self.best = np.Inf (or -Inf) and self.wait = 0.
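A minimal sketch of that idea (the class name is mine; it relies on the EarlyStopping internals linked above, namely self.monitor_op, self.best, and self.wait):

import numpy as np
from keras.callbacks import EarlyStopping

class ResettingEarlyStopping(EarlyStopping):
    """EarlyStopping that clears its state when training ends, so the
    same instance can safely be reused across grid-search fits."""

    def on_train_end(self, logs=None):
        super(ResettingEarlyStopping, self).on_train_end(logs)
        # Reset the incumbent and the patience counter for the next fit().
        self.best = np.Inf if self.monitor_op == np.less else -np.Inf
        self.wait = 0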
I tried @jerpint's recommendation above (copy/paste below), yet Keras is still not giving the best model's results. I managed to get val_acc = 1.00, see the output below. However, when I ran predict_proba and evaluate, I got much worse results. Can someone please help me understand what is going on?
using Keras==2.1.2
from keras.callbacks import ModelCheckpoint
...
mcp = ModelCheckpoint(model_chk_path, monitor="val_acc",
                      save_best_only=True, save_weights_only=False)
model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=nb_epoch,
          validation_data=(X_test, Y_test),
          shuffle=True,
          callbacks=[mcp])
28/28 [==============================] - 0s 393us/step - loss: 1.2081 - acc: 0.7500 - val_loss: 16.0314 - val_acc: 0.0000e+00
Epoch 24/30
Epoch 00024: val_acc did not improve
28/28 [==============================] - 0s 357us/step - loss: 1.1232 - acc: 0.8214 - val_loss: 16.0096 - val_acc: 0.0000e+00
Epoch 25/30
Epoch 00025: val_acc did not improve
28/28 [==============================] - 0s 507us/step - loss: 1.1784 - acc: 0.7500 - val_loss: 16.4731 - val_acc: 0.0000e+00
Epoch 26/30
Epoch 00026: val_acc did not improve
28/28 [==============================] - 0s 357us/step - loss: 1.0180 - acc: 0.8929 - val_loss: 14.6493 - val_acc: 0.0000e+00
Epoch 27/30
Epoch 00027: val_acc did not improve
28/28 [==============================] - 0s 357us/step - loss: 1.0103 - acc: 0.8571 - val_loss: 7.5434 - val_acc: 0.0000e+00
Epoch 28/30
Epoch 00028: val_acc did not improve
28/28 [==============================] - 0s 322us/step - loss: 0.8813 - acc: 0.8929 - val_loss: 0.6771 - val_acc: 1.0000
Epoch 29/30
Epoch 00029: val_acc did not improve
28/28 [==============================] - 0s 447us/step - loss: 0.9386 - acc: 0.8571 - val_loss: 0.6273 - val_acc: 1.0000
Epoch 30/30
Epoch 00030: val_acc did not improve
28/28 [==============================] - 0s 357us/step - loss: 0.8414 - acc: 0.9286 - val_loss: 0.6074 - val_acc: 1.0000
Is there any way to still get the best model without saving/loading the model to disk?
I modified the ModelCheckpoint callback so that it stores the best weights and restores them at the end of training. The weights are kept only in memory; there is no need to write them to disk.
import warnings

import numpy as np
from keras.callbacks import Callback

class GetBest(Callback):
    """Get the best model at the end of training.

    # Arguments
        monitor: quantity to monitor.
        verbose: verbosity mode, 0 or 1.
        mode: one of {auto, min, max}. The decision to overwrite the
            currently stored weights is made based on either the
            maximization or the minimization of the monitored quantity.
            For `val_acc` this should be `max`, for `val_loss` this
            should be `min`, etc. In `auto` mode, the direction is
            automatically inferred from the name of the monitored
            quantity.
        period: interval (number of epochs) between checks.

    # Example
        callbacks = [GetBest(monitor='val_acc', verbose=1, mode='max')]
        model.fit(X, y, validation_data=(X_eval, Y_eval),
                  callbacks=callbacks)
    """

    def __init__(self, monitor='val_loss', verbose=0,
                 mode='auto', period=1):
        super(GetBest, self).__init__()
        self.monitor = monitor
        self.verbose = verbose
        self.period = period
        self.best_epochs = 0
        self.epochs_since_last_save = 0
        if mode not in ['auto', 'min', 'max']:
            warnings.warn('GetBest mode %s is unknown, '
                          'fallback to auto mode.' % (mode),
                          RuntimeWarning)
            mode = 'auto'
        if mode == 'min':
            self.monitor_op = np.less
            self.best = np.Inf
        elif mode == 'max':
            self.monitor_op = np.greater
            self.best = -np.Inf
        else:
            if 'acc' in self.monitor or self.monitor.startswith('fmeasure'):
                self.monitor_op = np.greater
                self.best = -np.Inf
            else:
                self.monitor_op = np.less
                self.best = np.Inf

    def on_train_begin(self, logs=None):
        # Start from the initial weights so there is always something to restore.
        self.best_weights = self.model.get_weights()

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epochs_since_last_save += 1
        if self.epochs_since_last_save >= self.period:
            self.epochs_since_last_save = 0
            current = logs.get(self.monitor)
            if current is None:
                warnings.warn('Can pick best model only with %s available, '
                              'skipping.' % (self.monitor), RuntimeWarning)
            else:
                if self.monitor_op(current, self.best):
                    if self.verbose > 0:
                        print('\nEpoch %05d: %s improved from %0.5f to %0.5f,'
                              ' storing weights.'
                              % (epoch + 1, self.monitor, self.best, current))
                    self.best = current
                    self.best_epochs = epoch + 1
                    self.best_weights = self.model.get_weights()
                else:
                    if self.verbose > 0:
                        print('\nEpoch %05d: %s did not improve' %
                              (epoch + 1, self.monitor))

    def on_train_end(self, logs=None):
        # Restore the best weights found during training.
        if self.verbose > 0:
            print('Using epoch %05d with %s: %0.5f' %
                  (self.best_epochs, self.monitor, self.best))
        self.model.set_weights(self.best_weights)
Then you can use it like this:

callbacks = [GetBest(monitor='val_acc', verbose=1, mode='max')]
model.fit(X, y, validation_data=(X_eval, Y_eval), callbacks=callbacks)

The model will now hold the best weights when fit() returns. Perhaps Keras could include something like this in the library.
Thanks @louis925 for this nice solution.
I agree with most users here: after working with Keras and early stopping for a couple of months, I only now realised, by accident, that EarlyStopping does NOT restore the best found weights. I expected it to do so, simply because that behaviour is what is desired in most cases.
So at the very least it would be great if Keras stated this prominently in its documentation and provided a handy solution; an additional parameter to EarlyStopping would be best. Such a parameter and its explanation in the docs would make the issue clear to everyone and provide an easy solution at the same time. I don't see any drawback to this.
@louis925 I had a situation where I initialised your GetBest() callback once, but called model.fit() multiple times (for cross validation). I wanted the model to re-initialise the incumbent (self.best) for each training run, i.e. find the best weights for each fit(), not across all folds. Therefore I added an additional "reset" parameter to your code:
def __init__(self, monitor='val_loss', verbose=0,
             mode='auto', period=1, reset=True):
    super(GetBest, self).__init__()
    self.monitor = monitor
    self.verbose = verbose
    self.period = period
    self.reset = reset
    self.best_epochs = 0
    self.epochs_since_last_save = 0
    .......

def on_train_begin(self, logs=None):
    if self.reset:  # useful if fit() is called multiple times, e.g. during cross validation
        self.best = np.Inf if self.monitor_op == np.less else -np.Inf
    self.best_weights = self.model.get_weights()
@js1285 I don't think any information is retained in the callback between successive calls to fit (i.e., a new instance is created), hence self.best will not retain its past value. You might have to create a class/static variable to do this.
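A tiny sketch of that suggestion (hypothetical names; a class-level attribute, unlike an instance attribute, survives re-instantiation):

class BestTracker:
    # Class-level ("static") variable: shared by all instances, so it
    # persists even when a new callback instance is created per fit().
    best_overall = float('inf')

    def update(self, current):
        if current < BestTracker.best_overall:
            BestTracker.best_overall = current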
Any updates regarding this issue?
I shortened the answer in https://github.com/keras-team/keras/issues/2768#issuecomment-361070688 and submitted a pull request, as shown above.
There's also the option of the EarlyStopping callback with the restore_best_weights parameter. This runs at each epoch end.
There's also the option of the EarlyStopping callback with the restore_best_weights parameter. This runs at each epoch end.

Specifically, it only runs when patience is exceeded, meaning that if the model achieves its best performance fewer than patience epochs before the epoch limit, patience is never exceeded and the best weights are not restored.
e.g.
epochs = 500
patience = 10
best performance at epoch = 495
patience not exceeded and best weights not restored
relevant code from tf 2.2.0 release:
https://github.com/tensorflow/tensorflow/blob/2b96f3662bd776e277f86997659e61046b56c315/tensorflow/python/keras/callbacks.py#L1479-L1485
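For reference, a minimal usage sketch (assumes an already compiled model and validation data; note the caveat above about patience never being exceeded):

from tensorflow.keras.callbacks import EarlyStopping

# restore_best_weights only takes effect if training actually stops
# early; if patience is never exceeded, the last-epoch weights remain.
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True, verbose=1)
model.fit(x_train, y_train,
          epochs=500,
          validation_data=(x_val, y_val),
          callbacks=[early_stop])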