Ml-agents: How do I manually end training and save the model?

Created on 16 Dec 2018 · 5 Comments · Source: Unity-Technologies/ml-agents

All 5 comments

If you are using OSX or Linux, you can quit training by pressing Ctrl + C. On Windows you cannot quit training this way; instead, you have to set max_step in the config file to a lower number so that the model gets exported early. This Windows bug has been raised in #980.

Yes, but this way, it doesn't save the model. Sometimes I don't want to wait until max_step is reached, because for example I can already see something is wrong, or the agent has already learned to achieve its goal.

Not being able to interrupt the simulation and still get the trained model is a known issue on Windows. You can work around it by saving checkpoints more often (pass a lower value to the --save-freq flag of mlagents-learn). After interrupting the simulation, relaunch it with the --load flag and a low max_step: the model is loaded at start and training returns immediately, saving the model in the process.
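The two runs described above can be sketched as command lines. This is only an illustration: the config path and run id are hypothetical, and the --train and --run-id flags are assumed from ML-Agents releases of that period (check `mlagents-learn --help` for your version). Only --save-freq and --load come from this thread.

```python
def build_commands(config_path, run_id, save_freq):
    """Return the two mlagents-learn invocations used by the workaround."""
    # --run-id and --train are assumptions about the CLI of that era.
    base = ["mlagents-learn", config_path, f"--run-id={run_id}", "--train"]
    # First run: checkpoint often, so an interruption loses little progress.
    first_run = base + [f"--save-freq={save_freq}"]
    # Second run: reload the last checkpoint. With a low max_step in the
    # config file, training ends almost at once and the model is exported.
    second_run = base + ["--load"]
    return first_run, second_run


# Hypothetical config path and run id, for illustration only.
first, second = build_commands("config/trainer_config.yaml", "my_run", 5000)
print(" ".join(first))
print(" ".join(second))
```

Remember to lower max_step in the config file before the second run; otherwise training simply resumes from the checkpoint.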

This bug has been fixed with https://github.com/Unity-Technologies/ml-agents/pull/1558. The error message

forrtl: error (200): program aborting due to control-C event
Image            PC                Routine  Line     Source
libifcoremd.dll  00007FFDAFF994C4  Unknown  Unknown  Unknown
KERNELBASE.dll   00007FFDEF127EDD  Unknown  Unknown  Unknown
KERNEL32.DLL     00007FFDF2131FE4  Unknown  Unknown  Unknown
ntdll.dll        00007FFDF226EFC1  Unknown  Unknown  Unknown

still appears; it is related to this Stack Overflow question: https://stackoverflow.com/questions/42653389/ctrl-c-causes-forrtl-error-200-rather-than-python-keyboardinterrupt-exception. Our fix gets around the error by adding another Ctrl+C event handler that saves the model before the error occurs.
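The general idea of saving on Ctrl+C can be sketched with Python's standard signal module. This is not the code from the pull request: the Trainer class and its methods are hypothetical stand-ins, and on Windows the Fortran runtime's own handler may still fire first, so a real fix may need a platform-specific console handler.

```python
import signal


class Trainer:
    """Hypothetical stand-in for a training loop; not the ML-Agents API."""

    def __init__(self):
        self.step = 0
        self.stop_requested = False
        self.saved_at = None

    def request_stop(self, signum, frame):
        # Only record the request; the loop saves at a safe point.
        self.stop_requested = True

    def save_model(self):
        # A real trainer would write a checkpoint to disk here.
        self.saved_at = self.step

    def train(self, max_steps, on_step=None):
        while self.step < max_steps and not self.stop_requested:
            self.step += 1
            if on_step is not None:
                on_step(self.step)
        # Runs whether training finished or was interrupted.
        self.save_model()


def simulate_ctrl_c(step):
    # Stand-in for the user pressing Ctrl+C at step 3 (Python 3.8+).
    if step == 3:
        signal.raise_signal(signal.SIGINT)


trainer = Trainer()
previous = signal.signal(signal.SIGINT, trainer.request_stop)
try:
    trainer.train(max_steps=1000, on_step=simulate_ctrl_c)
finally:
    signal.signal(signal.SIGINT, previous)  # restore the old handler
print("model saved at step", trainer.saved_at)
```

The handler only sets a flag, so the save happens from the training loop itself and never in the middle of an update.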

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
