Keras: Issue with Callbacks logs & ModelCheckpoint

Created on 15 Oct 2015 · 16 comments · Source: keras-team/keras

Hello,
I'm trying to get the loss, accuracy, and weights automatically output to two files after each epoch, but I'm having some trouble.

  1. For the weights:
    I tried the checkpoint example described here:
    http://keras.io/callbacks/#example-model-checkpoints

But I got the following issue:
Can save best model only with val_loss available, skipping.
warnings.warn("Can save best model only with %s available, skipping." % (self.monitor), RuntimeWarning)

  2. For the loss & accuracy:
    I created a new Callback with an on_batch_end(self, batch, logs={}) method,
    but that function only received an empty dict for logs.

Right now I'm considering changing generic_utils.Progbar.update to output the data from there, but if you can think of a cleaner way, that would be great.
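(For reference, a minimal sketch of such a callback, not the original poster's code: epoch-level metrics arrive in on_epoch_end, and the 'acc' key assumes accuracy was requested as a metric when compiling.)

import keras

class FileLogger(keras.callbacks.Callback):
    def __init__(self, path):
        super(FileLogger, self).__init__()
        self.path = path

    def on_epoch_end(self, epoch, logs=None):
        # Append the per-epoch loss and accuracy to a plain text file.
        logs = logs or {}
        with open(self.path, "a") as f:
            f.write("epoch %d loss=%s acc=%s\n"
                    % (epoch, logs.get("loss"), logs.get("acc")))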

Thank you very much

stale


All 16 comments

  1. Are you providing model.fit with the validation_data field?
  2. model.fit returns a History object that contains information about the training history. If you want the losses, you can do something like:
my_history = model.fit( ... )
losses = my_history.history['loss']

I had the same problem – Keras seems to want validation_data (or validation_split) even if the quantity monitored is loss or acc.
I took a peek at the tests for this and they all provide validation data.
I think this is a Keras problem.

@fchollet can you help us with this? It seems that Keras wants validation_data even when you use

ModelCheckpoint(filepath="/path/to/file.hdf5", monitor='loss')

It doesn't make logical sense to save the model based on the loss on the
training data. The effective result will be to save the model after every
epoch, which can be achieved with the existing ModelCheckpoint (just set
save_best_only to False, or something similar, check the docs).
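(A minimal sketch of that suggestion, with hypothetical X_train/Y_train/model names: with save_best_only=False and an {epoch} placeholder in the filename, a file is written after every epoch.)

from keras.callbacks import ModelCheckpoint

# Save after every epoch; the {epoch:02d} placeholder keeps one file per epoch.
checkpointer = ModelCheckpoint(filepath="weights.{epoch:02d}.hdf5",
                               save_best_only=False, verbose=1)
model.fit(X_train, Y_train, callbacks=[checkpointer])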


OK, my bad. Thanks!

I tried the following to save the weights, but instead of an hdf5 file it creates a folder called .ipynb_checkpoints containing a log file called blabla-checkpoint.ipynb! Where is the hdf5 file with the weights?

filepath = "/bla bla /weights.hdf5"
checkpointer = ModelCheckpoint(filepath, verbose=1, save_best_only=True)

Blabla is also the name of your notebook; if you saved it, this folder is created. Try specifying a full path for your weights.

It was a problem with the hdf5 lib! It's solved now!
bla bla was the path to the folder where I wanted to save the file.

@fchollet Dear Prof. Chollet, I'm a new user of Keras. I have a small question.

In the 1st training process, I set nb_epoch = 5 and use ModelCheckpoint to save my best model to 'best_model.hdf5'.

In the 2nd training process, I want to load 'best_model.hdf5' and continue training with nb_epoch = 5. The best model is saved to the same file as before, i.e., model.save('best_model.hdf5').

I don't know: if the 1st 'best_model.hdf5' has val_loss = 0.5, and the 1st epoch of the 2nd training process gives val_loss = 0.6, will the new model weights overwrite the previous 'best_model.hdf5' or not?

By the way, how can I view the parameters (e.g., val_loss, loss_acc) of a model saved in .hdf5 format?

Many thanks to you.
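(Not an answer from this thread, just a sketch of one way to peek inside a saved .hdf5, assuming h5py is installed and a hypothetical path; the checkpoint stores weights and configuration rather than the epoch metric values themselves.)

import h5py

# Inspect a Keras .hdf5 checkpoint: list the weight groups and config attributes.
with h5py.File("best_model.hdf5", "r") as f:
    print(list(f.keys()))   # e.g. layer or 'model_weights' groups
    print(dict(f.attrs))    # e.g. 'keras_version', 'model_config'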

Hi, I have a similar issue. I have:
checkpointer = ModelCheckpoint(filepath="./file", monitor=f1, save_best_only=True, mode='max', verbose=1)
where f1 is just a function to measure the F1 score (it has no problem measuring predictions from trained models), defined as:

def f1(target, prediction):  # not tensors but result values
    target = np.reshape(target, (-1, MAX_DOCUMENT_LENGTH, num_classes))
    prediction = np.reshape(prediction, (-1, MAX_DOCUMENT_LENGTH, num_classes))
    tp=np.asarray([0]*(num_classes+2))
    fp=np.asarray([0]*(num_classes+2))
    fn=np.asarray([0]*(num_classes+2))
    target = np.argmax(target, 2)
    prediction = np.argmax(prediction, 2)
    for i in range(len(target)):
        for j in range(MAX_DOCUMENT_LENGTH):
            if target[i][j] == prediction[i][j]:
                tp[target[i][j]] += 1
            else:
                fp[target[i][j]] += 1
                fn[prediction[i][j]] += 1
    NON_NAMED_ENTITY = 0
    for i in range(num_classes):
        if i != NON_NAMED_ENTITY:
            tp[5] += tp[i]
            fp[5] += fp[i]
            fn[5] += fn[i]
        else:
            tp[6] += tp[i]
            fp[6] += fp[i]
            fn[6] += fn[i]
    precision = []
    recall = []
    fscore = []
    for i in range(num_classes+2):
        precision.append(tp[i]*1.0/(tp[i]+fp[i]))
        recall.append(tp[i]*1.0/(tp[i]+ fn[i]))
        fscore.append(2.0*precision[i]*recall[i]/(precision[i]+recall[i]))
    efs = fscore[5]
    return efs

and the error says:
/usr/local/lib64/python2.7/site-packages/keras/callbacks.py:390: RuntimeWarning: Can save best model only with <function f1 at 0x7fa22c209320> available, skipping. 'skipping.' % (self.monitor), RuntimeWarning)
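(ModelCheckpoint's monitor is expected to be the string name of a key in logs, such as 'val_loss'; passing the f1 function itself means logs.get(monitor) finds nothing, which is why the warning prints the function's repr. A rough sketch, using the f1 function above and hypothetical x_val/y_val arrays, of checkpointing manually on that score instead:)

import numpy as np
import keras

class F1Checkpoint(keras.callbacks.Callback):
    def __init__(self, filepath, x_val, y_val):
        super(F1Checkpoint, self).__init__()
        self.filepath = filepath
        self.x_val = x_val
        self.y_val = y_val
        self.best = -np.inf

    def on_epoch_end(self, epoch, logs=None):
        # f1 is the plain NumPy function defined above, not a tensor metric.
        score = f1(self.y_val, self.model.predict(self.x_val))
        if score > self.best:
            self.best = score
            self.model.save(self.filepath, overwrite=True)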

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

Hi everyone,

I have a similar issue, but I haven't found any solution yet. Maybe one of you could help me out?
I'm trying to make my own callback. Here is the code:

class EarlyStoppingByLossVal(keras.callbacks.Callback):
    def __init__(self, monitor='val_loss', close_enough=1, too_far_away=50, verbose=0):
        super(EarlyStoppingByLossVal, self).__init__()
        self.monitor = monitor
        self.close_enough = close_enough
        self.too_far_away = too_far_away
        self.verbose = verbose

    def on_epoch_begin(self, epoch, logs={}):
        # Callback has no predict() of its own; go through self.model.
        # Note that predict() returns an array, so reduce it (e.g. with a mean)
        # before comparing against a scalar threshold.
        current = self.model.predict(X_Test)

        if current is not None:
            if current > self.too_far_away:
                print("Epoch %05d: Starting too far" % epoch)
                self.model.stop_training = True

    def on_epoch_end(self, epoch, logs={}):
        current = logs.get(self.monitor)

        if current is not None:
            if current < self.close_enough:
                print("Epoch %05d: early stopping THR" % epoch)
                self.model.stop_training = True

model = Sequential()
model.add(Dense(units=10, activation='relu', input_shape=(3,)))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=1, activation='relu'))

model.compile(loss='mse', optimizer='adam', metrics=['mae'])

callbacks = [
    EarlyStoppingByLossVal(monitor='val_loss', close_enough=1.5, too_far_away=50, verbose=1)
]

model.fit(X_Train, Y_Train, epochs=5000, batch_size=4,
          validation_data=(X_Test, Y_Test),
          callbacks=callbacks)

No matter what is in the "monitor" label, my logs are always empty.

Do you have any explanation for this?

Thanks a lot !

Hey, I was able to solve this issue.
The keys in the logs variable depend entirely on the metrics that you use.
So the key might not be 'val_acc'; you need to change the key used in the ModelCheckpoint class
by creating your own class.

For me, I was using a categorical loss and accuracy, so the logs were:
'val_categorical_accuracy': 0.23076923191547394, 'loss': 12.421591474298845, 'categorical_accuracy': 0.21052631709659309, 'val_loss': 10.333990097045898
Check your log key and change the key passed to logs.get() accordingly.
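(A quick sketch for finding out which keys logs actually contains, assuming the same x_train/y_train/x_test/y_test names as in the fit call below:)

import keras

def show_log_keys(epoch, logs):
    # Print the available log keys (e.g. 'loss', 'val_categorical_accuracy').
    print(sorted(logs.keys()))

print_keys = keras.callbacks.LambdaCallback(on_epoch_end=show_log_keys)

model.fit(x_train, y_train, validation_data=(x_test, y_test),
          callbacks=[print_keys])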

This is a separate ModelCheckpoint-style class which I made for my own use:

import keras
from sklearn.metrics import roc_auc_score
import numpy as np
import warnings
class Histories(keras.callbacks.Callback):
    def __init__(self, filepath, monitor='val_loss', verbose=0,
                 save_best_only=False, save_weights_only=False,
                 mode='auto', period=1):
        super(Histories, self).__init__()
        self.monitor = monitor
        self.verbose = verbose
        self.filepath = filepath
        self.save_best_only = save_best_only
        self.save_weights_only = save_weights_only
        self.period = period
        self.epochs_since_last_save = 0

        if mode not in ['auto', 'min', 'max']:
            warnings.warn('ModelCheckpoint mode %s is unknown, '
                          'fallback to auto mode.' % (mode),
                          RuntimeWarning)
            mode = 'auto'

        if mode == 'min':
            self.monitor_op = np.less
            self.best = np.Inf
        elif mode == 'max':
            self.monitor_op = np.greater
            self.best = -np.Inf
        else:
            if 'acc' in self.monitor or self.monitor.startswith('fmeasure'):
                self.monitor_op = np.greater
                self.best = -np.Inf
            else:
                self.monitor_op = np.less
                self.best = np.Inf



    def on_train_begin(self, logs={}):
        self.aucs = []
        self.losses = []

    def on_train_end(self, logs={}):
        return

    def on_epoch_begin(self, epoch, logs={}):
        return

    def on_epoch_end(self, epoch, logs={}):
        logs = logs or {}
        self.epochs_since_last_save += 1
        if self.epochs_since_last_save >= self.period:
            self.epochs_since_last_save = 0
            filepath = self.filepath.format(epoch=epoch + 1, **logs)
            if self.save_best_only:
                current = logs.get('val_categorical_accuracy')
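                # NOTE: hard-coded to this author's metric key; use logs.get(self.monitor)
                # instead if you monitor anything other than val_categorical_accuracy.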
                # print('this is current',current,logs)
                if current is None:
                    warnings.warn('Can save best model only with %s available, '
                                  'skipping.' % (self.monitor), RuntimeWarning)
                else:
                    if self.monitor_op(current, self.best):
                        if self.verbose > 0:
                            print('\nEpoch %05d: %s improved from %0.5f to %0.5f,'
                                  ' saving model to %s'
                                  % (epoch + 1, self.monitor, self.best,
                                     current, filepath))
                        self.best = current
                        if self.save_weights_only:
                            self.model.save_weights(filepath, overwrite=True)
                        else:
                            self.model.save(filepath, overwrite=True)
                    else:
                        if self.verbose > 0:
                            print('\nEpoch %05d: %s did not improve from %0.5f' %
                                  (epoch + 1, self.monitor, self.best))
            else:
                if self.verbose > 0:
                    print('\nEpoch %05d: saving model to %s' % (epoch + 1, filepath))
                if self.save_weights_only:
                    self.model.save_weights(filepath, overwrite=True)
                else:
                    self.model.save(filepath, overwrite=True)

    def on_batch_begin(self, batch, logs={}):
        return

    def on_batch_end(self, batch, logs={}):
        return

You can call it using the following code

histories = modelcheck.Histories("/data/Mihir/SpeakAI_data/models/ff-bestmodel.hdf5",  mode='max', monitor='val_loss', save_best_only=True,save_weights_only=True)

callbacks_list = [histories]

model.fit(x_train,y_train,epochs=1,verbose=1, callbacks=callbacks_list,validation_data=(x_test, y_test))

Thank you for your answer!

Sorry for my delayed reply. It works perfectly!

Thanks

Hi, is there any built-in solution to this issue in current stable Keras versions? I can't see why this issue is closed.

Thank you in advance

(quoting the Histories checkpoint class and usage from the comment above)

Oh, this looked so promising for me, but it sadly failed (I am probably being stupid), throwing the "Can save best model only with %s available, skipping." RuntimeWarning, even just for 'val_loss'...
