Caffe: Any examples of using .solverstate with python interface?

Created on 9 Feb 2016  ·  4 Comments  ·  Source: BVLC/caffe

I only saw how to initialize a network using .caffemodel, but what about .solverstate? Thank you.

Most helpful comment

If you use copy_from, it copies only the model parameters whose layer
names match between your prototxt and the caffemodel. All other training
parameters restart from the settings in your solver.prototxt. It is mostly
used to fine-tune an existing model for a new task.

If you restore the solverstate, it fully recovers the training process as
it stood at that snapshot, including the iteration, the current learning
rate, etc. It also locates the .caffemodel file internally, so if the
caffemodel is missing, resuming training will not work. I am not sure
whether the solverstate file holds a complete copy of the weights (its
size suggests that it might).
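
A minimal sketch of the weights-only path described above, assuming pycaffe's `solver.net.copy_from`. The helper `init_for_finetuning`, the function `finetune_example`, and the file names are hypothetical and only for illustration. The helper accepts any object that exposes `net.copy_from`, so it can be tried without Caffe installed.

```python
def init_for_finetuning(solver, caffemodel_path):
    """Load weights only; iteration count and momentum start fresh."""
    # copy_from matches layers by name and copies the weights it finds.
    solver.net.copy_from(caffemodel_path)
    return solver


def finetune_example():
    """Illustrative use with real pycaffe; not called automatically."""
    import caffe  # needs a working pycaffe installation

    # Hypothetical file names.
    solver = caffe.SGDSolver("solver.prototxt")
    init_for_finetuning(solver, "pretrained.caffemodel")
    solver.step(100)  # training counts from iteration 0 again
```

Nothing about the previous run's solver (iteration, learning-rate schedule position, momentum) is carried over here; only the weights are.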

In addition, please ask questions like this in the Caffe user group on Google Groups instead of here.
Reply to this email directly or view it on GitHub
https://github.com/BVLC/caffe/issues/3651#issuecomment-181875930.

Best Regards,
Zizhao

All 4 comments

Use solver.restore('*.solverstate').
No need to use solver.net.copy_from().
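
A minimal sketch of resuming from a snapshot, assuming pycaffe's `solver.restore`, `solver.step`, and `solver.iter`. The helper `resume_training`, the function `resume_example`, and the file names are hypothetical. The helper accepts any object with those three members, so it can be tried without Caffe installed.

```python
def resume_training(solver, solverstate_path, extra_iters):
    """Restore a snapshot, train further, and report the iteration range."""
    # restore() reloads the solver state; no separate copy_from() call is made.
    solver.restore(solverstate_path)
    start_iter = solver.iter
    solver.step(extra_iters)
    return start_iter, solver.iter


def resume_example():
    """Illustrative use with real pycaffe; not called automatically."""
    import caffe  # needs a working pycaffe installation

    # Hypothetical file names.
    solver = caffe.SGDSolver("solver.prototxt")
    first, last = resume_training(solver, "snapshot_iter_5000.solverstate", 100)
    print("trained from iteration", first, "to", last)
```

Printing the iteration range is a quick way to confirm that training picked up where the snapshot left off rather than at zero.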

Thank you for the quick reply! May I ask a bit further? It seems like .caffemodel is a subset of .solverstate. What would be missing if I initialized the network using only the .caffemodel?

.caffemodel contains the weights. .solverstate contains the momentum vector. Both are needed to restart training. If you restart training without momentum, the loss will spike up and it will take ~50k iterations to recover. At test time you only need .caffemodel.
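
To illustrate why the momentum matters, here is a toy sketch in plain Python (not Caffe code). It assumes the standard SGD-with-momentum update and a made-up one-parameter loss; the function names and hyperparameters are hypothetical. It only shows that dropping the momentum changes the trajectory and says nothing about how large the effect is on a real network.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


def train(weight, velocity, steps):
    """Minimise the toy loss 0.5 * w**2, whose gradient is simply w."""
    for _ in range(steps):
        weight, velocity = sgd_momentum_step(weight, velocity, grad=weight)
    return weight, velocity


# "Snapshot" after 10 steps: the weight plays the role of the .caffemodel,
# the velocity the role of the momentum stored in the .solverstate.
w_snap, v_snap = train(weight=1.0, velocity=0.0, steps=10)

w_resumed, _ = train(w_snap, v_snap, steps=5)  # weights and momentum restored
w_reset, _ = train(w_snap, 0.0, steps=5)       # weights only, momentum lost
w_straight, _ = train(1.0, 0.0, steps=15)      # uninterrupted reference run

print("resumed matches uninterrupted run:", w_resumed == w_straight)
print("reset matches uninterrupted run:  ", w_reset == w_straight)
```

Restoring both values reproduces the uninterrupted run, while restoring the weight alone sends the optimizer down a different path.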

I am closing this as this belongs on the user group, thanks!
