Instead of using early stopping to avoid overfitting, I would like to save the model at the point where it had the best (lowest) validation loss. In other words, I would like to keep only the best checkpoint, as measured on the validation/dev set.
I tried to use a wrapper for tf.train.Saver (this one: https://github.com/vonclites/checkmate), but couldn't make it work with DeepSpeech. Is there an easy way to do that (maybe using the MonitoredTrainingSession, as you are already doing)?
I looked into this briefly but couldn't find a clean way to implement it with MonitoredTrainingSession (which is IMO a terrible API). I ended up just writing a hack that works, but isn't really code we can land. I'm attaching the patch.
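For readers who only want the general idea, here is a minimal, framework-agnostic sketch of "keep only the best checkpoint". It is not the contents of the attached patch and does not use any DeepSpeech code. The class name BestCheckpointKeeper and the save_fn callback are hypothetical names made up for this illustration; save_fn stands in for whatever the training code uses to write a checkpoint.

```python
import math
from typing import Callable, Optional


class BestCheckpointKeeper:
    """Hypothetical helper: calls `save_fn` only when validation loss improves."""

    def __init__(self, save_fn: Callable[[int], None]) -> None:
        # save_fn is an assumed callback that writes a checkpoint for an epoch.
        self.save_fn = save_fn
        self.best_loss = math.inf
        self.best_epoch: Optional[int] = None

    def update(self, epoch: int, val_loss: float) -> bool:
        """Record one validation result; return True if a checkpoint was saved."""
        # Ignore NaN losses and anything that is not strictly better.
        if math.isnan(val_loss) or val_loss >= self.best_loss:
            return False
        self.best_loss = val_loss
        self.best_epoch = epoch
        self.save_fn(epoch)
        return True


if __name__ == "__main__":
    saved_epochs = []
    keeper = BestCheckpointKeeper(save_fn=saved_epochs.append)
    # Made-up validation losses, one per epoch, purely for demonstration.
    for epoch, loss in enumerate([0.9, 0.7, 0.8, 0.6, 0.65]):
        keeper.update(epoch, loss)
    print(saved_epochs, keeper.best_epoch, keeper.best_loss)
```

In a TensorFlow 1.x setup, save_fn could plausibly wrap a tf.train.Saver created with max_to_keep=1, so that each improvement overwrites the previous best checkpoint. How to get at the underlying session from inside a MonitoredTrainingSession is the awkward part discussed in this thread and is not addressed by this sketch.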
Thanks @reuben, I had to change the name of the MonitoredTrainingSession to train_session for this to work. Now it is working perfectly fine.
@bernardohenz What's the status here? Is the issue fixed, or do you have a workaround? Should we close this?
@lissyx Yes, the patch from @reuben worked just fine.
We should have a proper solution for this in-tree. This would be too much work with the current training setup, but would probably be very simple if we used TF Eager, for example. Reopening so we don't forget.
Updated version of the patch is here: https://gist.github.com/reuben/dcc2deaf85568591e34ce363bc3bac2a
load command-line parameter.