HI all
I'm training the model to 500k steps and I got the nn file.
But I want to output nn file when the model is 400k steps stage.
Is it possible to output 400k steps stage nn file?
Currently, there is no experimental configuration that supports outputting the nn file every k steps. However, a policy checkpoint is saved every k steps using the value from the --save-freq command line argument.
You may be able to hack it if you want to get an nn every k steps using the export_graph function in the trainer_controller. I think
calling self._export_graph() after self._save_model in the trainer controller (line 226) and modifying the file path in export_model in tf_policy.py so that on each export, files are not overwritten would work.
Alternatively, you can kill the training run after 400k steps which will save to an nn and then rerun with the --load flag to restart from the stopped position. Rename the file so that it does not get overwritten after the 500k steps. This is a bad suggestion but may be simplest.
I agree that this is a feature we should have and I have logged it internally as the ticket MLA-538.
.nn files are now saved at the same time as checkpoints (see https://github.com/Unity-Technologies/ml-agents/pull/4127). This will be in the next release (sometime in August).
Most helpful comment
Currently, there is no experimental configuration that supports outputting the nn file every k steps. However, a policy checkpoint is saved every k steps using the value from the --save-freq command line argument.
You may be able to hack it if you want to get an nn every k steps using the
export_graphfunction in the trainer_controller. I thinkcalling
self._export_graph()afterself._save_modelin the trainer controller (line 226) and modifying the file path inexport_modelin tf_policy.py so that on each export, files are not overwritten would work.Alternatively, you can kill the training run after 400k steps which will save to an nn and then rerun with the --load flag to restart from the stopped position. Rename the file so that it does not get overwritten after the 500k steps. This is a bad suggestion but may be simplest.
I agree that this is a feature we should have and I have logged it internally as the ticket MLA-538.