Caffe: Any examples of using .solverstate with python interface?

Created on 9 Feb 2016 · 4Comments · Source: BVLC/caffe

I only saw how to initialize a network using .caffemodel, but what about .solverstate? Thank you.

5argon

Most helpful comment

If you use copy_from, it copies only the model parameters for layers whose
names match between your prototxt and the caffemodel. All other training
state restarts from the settings in your solver.prototxt. This is mostly
used for fine-tuning an existing model on a new task.

If you restore the solverstate, training resumes exactly where that
snapshot left off, including the iteration count, current learning rate,
etc. It also locates the .caffemodel file internally, so if the caffemodel
is missing, resuming will not work. I am not sure whether the solverstate
file also holds a complete copy of the weights (its size suggests it
might).
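
The difference between the two calls can be sketched in pycaffe roughly as follows. This is a minimal sketch, not code from the thread: `init_solver` is a hypothetical helper, the file names in the commented usage are placeholders, and the solver is passed in as an argument so the logic can be read without Caffe installed. `solver.net.copy_from()` and `solver.restore()` are the existing pycaffe calls.

```python
def init_solver(solver, solverstate=None, caffemodel=None):
    """Initialise a pycaffe solver in one of three ways (hypothetical helper).

    solver      -- a solver object such as caffe.SGDSolver('solver.prototxt')
    solverstate -- path to a .solverstate snapshot, to resume training
    caffemodel  -- path to a .caffemodel file, to fine-tune from its weights
    """
    if solverstate is not None:
        # Resume: restores the iteration count and solver history; Caffe
        # locates the matching weights file on its own.
        solver.restore(solverstate)
        return "resume"
    if caffemodel is not None:
        # Fine-tune: copies weights for layers whose names match; everything
        # else starts over from the solver.prototxt settings.
        solver.net.copy_from(caffemodel)
        return "finetune"
    # Neither file given: train from the net's own initialisation.
    return "scratch"


# Usage with Caffe installed (paths are placeholders):
#   import caffe
#   solver = caffe.SGDSolver('solver.prototxt')
#   init_solver(solver, solverstate='snapshots/model_iter_10000.solverstate')
#   solver.step(100)  # continue training for 100 iterations
```

Passing both paths resumes from the solverstate and ignores the caffemodel argument, since a restore already brings the weights back.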

In addition, please ask questions like this in the Caffe users Google group instead of here.
Best Regards,
Zizhao

zizhaozhang on 9 Feb 2016

👍14

All 4 comments

Use solver.restore('*.solverstate').
No need to use solver.net.copy_from().
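
The '*.solverstate' above stands for the path of an actual snapshot file. As a hedged sketch of picking the most recent one, the helper below assumes Caffe's default snapshot naming, `<snapshot_prefix>_iter_<N>.solverstate`; `latest_solverstate` and the paths in the commented usage are hypothetical and not part of pycaffe.

```python
import glob
import re


def latest_solverstate(snapshot_prefix):
    """Return the snapshot path with the highest iteration number, or None.

    Assumes files are named '<snapshot_prefix>_iter_<N>.solverstate'.
    """
    pattern = re.compile(r"_iter_(\d+)\.solverstate$")
    best_path, best_iter = None, -1
    for path in glob.glob(glob.escape(snapshot_prefix) + "_iter_*.solverstate"):
        match = pattern.search(path)
        # Compare iteration numbers as integers, not as strings.
        if match and int(match.group(1)) > best_iter:
            best_path, best_iter = path, int(match.group(1))
    return best_path


# Usage with Caffe installed (paths are placeholders):
#   import caffe
#   solver = caffe.SGDSolver('solver.prototxt')
#   state = latest_solverstate('snapshots/model')
#   if state is not None:
#       solver.restore(state)  # resume from the newest snapshot
#   solver.solve()             # train until max_iter
```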

Best Regards,
Zizhao

zizhaozhang on 9 Feb 2016

👍5

Thank you for the quick reply! May I ask a follow-up? It seems like .caffemodel is a subset of .solverstate. What will be missing if I initialize the network using only the .caffemodel?

5argon on 9 Feb 2016

.caffemodel contains the weights. .solverstate contains the momentum vector. Both are needed to restart training. If you restart training without momentum, the loss will spike up and it will take ~50k iterations to recover. At test time you only need .caffemodel.
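
Since both files are needed to resume, it can help to check for them before calling restore. The sketch below is only a heuristic under a stated assumption: that the snapshot was written in the default format, where the weights file shares the snapshot's prefix (for example `model_iter_10000.solverstate` next to `model_iter_10000.caffemodel`). The solverstate locates its weights file internally, so a moved or renamed file can still fail even when this check passes. `companion_caffemodel` and `can_resume` are hypothetical names.

```python
import os


def companion_caffemodel(solverstate_path):
    """Guess the weights file saved alongside a .solverstate snapshot.

    Assumption: both files share a prefix and differ only in extension.
    """
    root, ext = os.path.splitext(solverstate_path)
    if ext != ".solverstate":
        raise ValueError("expected a .solverstate file: %r" % solverstate_path)
    return root + ".caffemodel"


def can_resume(solverstate_path):
    """True only if the snapshot and its companion weights file both exist."""
    return (os.path.isfile(solverstate_path)
            and os.path.isfile(companion_caffemodel(solverstate_path)))
```

At test time only the `.caffemodel` half of the pair matters, so a check like this is relevant only when restarting training.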

I am closing this as this belongs on the user group, thanks!