How do you guys currently manage the case of "I want to load up a saved checkpoint of that model from last week and play an episode with it?"
What is the easiest way to find the directory of the latest checkpoint for a given trainable class? There are two different directories where things get stored, and I can never remember where anything is.
Right now I have to open TensorBoard, look up a good experiment, e.g. CartPoleTrainable/CartPoleTrainable_86edf4bf_2020-02-02_13-41-50a8gnqyok, and then use that in restore(dir).
The workflow I'm thinking of is something like this (pseudocode):
t = MyTrainable()
t.load_latest_best_checkpoint_for_this_trainable()
t.play() # my custom function that will play/render an episode
Is there something equivalent to tf.train.latest_checkpoint from TensorFlow?
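For reference, the TensorFlow call I have in mind (checkpoint_dir is just a placeholder here):

import tensorflow as tf

checkpoint_dir = "path/to/checkpoints"  # placeholder
# Returns the full path of the most recent checkpoint file in the
# directory, or None if no checkpoint is found.
ckpt_path = tf.train.latest_checkpoint(checkpoint_dir)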
P.S.: Awesome library, the Trainable interface is great.
Hey, this should help you. It's the simplest solution I found for loading a checkpoint and running evaluations:
from ray.tune import Analysis
import ray.rllib.agents.ppo as ppo

analysis = Analysis(path_to_results)
# Note: get_best_config() only returns the config dict; get_best_logdir()
# returns the trial directory that scored best on the given metric.
best_logdir = analysis.get_best_logdir(metric=self._config["metric"])
checkpoint_path = <your code to extract latest checkpoint file from the best logdir>
model = ppo.PPOTrainer(env="NameOfYourEnv", config=test_config)
model.restore(checkpoint_path)
env = <create your env>
obs = env.reset()
done = False
while not done:
    action = model.compute_action(obs, prev_action=0, prev_reward=0)
    obs, reward, done, info = env.step(action)
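For the "extract latest checkpoint" step, here's a minimal sketch, assuming Tune's default layout where each checkpoint lives in a checkpoint_<iteration> subdirectory of the trial logdir and the file inside is named checkpoint-<iteration> (latest_checkpoint_in is just a hypothetical helper name):

import os

def latest_checkpoint_in(logdir):
    # Assumes Tune's default layout: <logdir>/checkpoint_<iter>/checkpoint-<iter>
    ckpt_dirs = [d for d in os.listdir(logdir) if d.startswith("checkpoint_")]
    latest = max(ckpt_dirs, key=lambda d: int(d.split("_")[-1]))
    iteration = int(latest.split("_")[-1])
    return os.path.join(logdir, latest, "checkpoint-{}".format(iteration))

checkpoint_path = latest_checkpoint_in(best_logdir)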
Alternatively, you can create a PolicyServer, though I could not get this to work with a different number of workers than what my model was originally trained with, and the server wouldn't support multiple workers because it listens on a single address. Here's more info on that; they used CartPole as well. Not sure if it's even supported as of right now.
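For completeness, the client side of that setup looks roughly like this. This is a sketch modeled on RLlib's CartPole serving example; the PolicyClient import path has moved between Ray versions, so treat the exact module as an assumption:

from ray.rllib.utils.policy_client import PolicyClient  # path varies by Ray version

client = PolicyClient("http://localhost:9900")  # address the PolicyServer listens on
episode_id = client.start_episode(training_enabled=False)
obs = env.reset()
done = False
while not done:
    # The server's policy picks the action; the client just steps the env
    # and reports rewards back.
    action = client.get_action(episode_id, obs)
    obs, reward, done, info = env.step(action)
    client.log_returns(episode_id, reward)
client.end_episode(episode_id, obs)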
Edit: Also, if you'd prefer to use the Tune wrapper, just pass checkpoint_path to the restore argument of tune.run(..., restore=checkpoint_path) and set your stopping criteria to 1 episode.
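A minimal sketch of that, assuming an RLlib trainable so that episodes_total is available as a stopping criterion:

from ray import tune

tune.run(
    "PPO",
    config=test_config,          # same config the model was trained with
    restore=checkpoint_path,     # start the trainable from this checkpoint
    stop={"episodes_total": 1},  # stop after a single episode
)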
Interesting! How would I access the model?
But this is nondeterministic, right? How do you get it in a deterministic way?
Closing until I try this out.