Keras: Saving state of callbacks and restoring them

Created on 22 Feb 2018 · 7 Comments · Source: keras-team/keras

I tried to save and restore the state of my training process using model.save() and keras.models.load_model(). Although the weights, optimizer state and learning rate are restored correctly, the state of callbacks such as keras.callbacks.ReduceLROnPlateau() is not restored. Neither is the epoch number.

So I am essentially looking for a checkpoint mechanism that restores the run state exactly.

This is necessary because many deep learning models train for a very long time, and for various reasons one has to be able to stop and resume training often, sometimes every few epochs. In that case restoring the LR scheduler state is necessary, as otherwise correct stepwise reduction is not possible.
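A minimal sketch of what such a checkpoint could look like, assuming the missing pieces are the epoch counter and a few plain counters on the callback. PlateauState, save_run_state and load_run_state are hypothetical names used only for illustration; the attribute names best, wait and cooldown_counter are modelled on what ReduceLROnPlateau appears to track and should be checked against your Keras version.

```python
import json
import os
import tempfile


class PlateauState:
    """Hypothetical stand-in for the counters of a plateau-based LR callback."""

    def __init__(self, best=float("inf"), wait=0, cooldown_counter=0):
        self.best = best
        self.wait = wait
        self.cooldown_counter = cooldown_counter


def save_run_state(path, epoch, callback):
    """Write the last finished epoch and the callback counters to a JSON file."""
    state = {
        "epoch": epoch,
        "callback": {
            "best": callback.best,
            "wait": callback.wait,
            "cooldown_counter": callback.cooldown_counter,
        },
    }
    with open(path, "w") as f:
        json.dump(state, f)


def load_run_state(path, callback):
    """Restore the counters in place and return the epoch to resume from."""
    with open(path) as f:
        state = json.load(f)
    for name, value in state["callback"].items():
        setattr(callback, name, value)
    return state["epoch"]


# Save next to the model file, then restore into a fresh object on resume.
path = os.path.join(tempfile.mkdtemp(), "run_state.json")
save_run_state(path, epoch=12, callback=PlateauState(best=0.31, wait=3))

restored = PlateauState()
initial_epoch = load_run_state(path, restored)
```

The returned epoch could then be passed to fit() through its initial_epoch argument. Some callbacks may reset their counters when training begins, so the restored values might have to be reapplied after that point.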

ParthaEth

👍2

All 7 comments

You can pickle your callback and reload it afterwards.

Dref360 on 22 Feb 2018

I am getting an error like this: pickle.PicklingError: Can't pickle <function <lambda> at 0x7f90e88fae60>: it's not found as keras.callbacks.<lambda>

ParthaEth on 23 Feb 2018

Even with import dill, I get RuntimeError: maximum recursion depth exceeded.

ParthaEth on 23 Feb 2018

It's because you provide the schedule as a lambda (as stated in your error message). A named function works:

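import pickle

from keras import callbacks

# A named module-level function can be pickled; a lambda cannot.
def schedule(epoch):
    return epoch

lr = callbacks.LearningRateScheduler(schedule)
# Round-trip the callback through pickle to confirm it serializes.
restored = pickle.loads(pickle.dumps(lr))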

Dref360 on 23 Feb 2018
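If the lambda cannot easily be replaced, another option is to pickle only the plain-data attributes and rebuild the callback object in code on resume. The sketch below assumes that approach; FakeScheduler and picklable_state are hypothetical names, not Keras classes or functions. A real callback that has been attached to a model may also hold a reference to that model, which would likely need to be skipped in the same way.

```python
import pickle


class FakeScheduler:
    """Hypothetical stand-in for a callback that stores a lambda."""

    def __init__(self):
        self.schedule = lambda epoch: 0.1 * 0.5 ** epoch  # not picklable
        self.best = 0.42
        self.wait = 2


def picklable_state(obj):
    """Return only the plain-data attributes, skipping anything callable."""
    return {k: v for k, v in vars(obj).items() if not callable(v)}


original = FakeScheduler()
blob = pickle.dumps(picklable_state(original))

# On resume: construct the object as usual, then copy the saved counters back.
restored = FakeScheduler()
restored.best, restored.wait = 0.0, 0  # pretend these start from scratch
vars(restored).update(pickle.loads(blob))
```

The lambda itself is never serialized; it is recreated by the constructor, and only the counters travel through pickle.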

@Dref360 Any update here? It would be much easier if there were an option to save and load callbacks, together with their state, through model.save and load_model respectively. However, we should also handle the scenario where someone wishes to add new callbacks or remove existing ones.

manrajgrover on 17 Mar 2020

👍3

Even if it only covered a fixed set of callbacks, with no scope for adding or removing any, it would be a nice feature.

SpikingNeuron on 2 Apr 2020

Callbacks are not serializable in TensorFlow, but it would definitely be helpful if they were.

All that is needed is to implement get_config and from_config methods for the callbacks. There is an open PR that aims to solve this problem in a more generic manner: https://github.com/tensorflow/tensorflow/pull/36635
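For illustration, the get_config / from_config pattern could look roughly like the sketch below. PlateauTracker is a hypothetical class written for this example; it is not part of Keras or TensorFlow and does not reflect what the linked PR actually implements.

```python
import json


class PlateauTracker:
    """Hypothetical callback-like class; not part of Keras or TensorFlow."""

    def __init__(self, factor=0.1, patience=10, best=None, wait=0):
        self.factor = factor
        self.patience = patience
        self.best = best
        self.wait = wait

    def get_config(self):
        """Return everything needed to rebuild this object as plain data."""
        return {
            "factor": self.factor,
            "patience": self.patience,
            "best": self.best,
            "wait": self.wait,
        }

    @classmethod
    def from_config(cls, config):
        """Rebuild an instance from a dict produced by get_config."""
        return cls(**config)


tracker = PlateauTracker(factor=0.5, patience=3, best=0.27, wait=2)
serialized = json.dumps(tracker.get_config())
clone = PlateauTracker.from_config(json.loads(serialized))
```

Because the config is plain data, it can be stored as JSON alongside the model file, and no functions or model references have to be pickled.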