I'm not sure whether to file this as a bug or a question, but when I try to restore my saved model I get the AssertionError mentioned in the title, and I don't know what it means.
Trace:
Traceback (most recent call last):
  File "bitcoin_collusion.py", line 1314, in <module>
    run_saved(BLOCKS = args.blocks, ALPHA = args.alphas, GAMMA = args.gammas, SPY = args.spy, use_lstm = args.use_lstm, trainer = args.algo, episodes = args.episodes, ep_length=args.ep_length)
  File "bitcoin_collusion.py", line 871, in run_saved
    trainer.restore('/afs/ece.cmu.edu/usr/charlieh/ray_results/PPO/PPO_BitcoinEnv_0_2020-01-29_00-56-415e4ywyg1/checkpoint_11223/checkpoint-11223')
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/tune/trainable.py", line 353, in restore
    self._restore(checkpoint_path)
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 528, in _restore
    self.__setstate__(extra_data)
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 161, in __setstate__
    Trainer.__setstate__(self, state)
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 824, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 708, in restore
    self.sync_filters(objs["filters"])
  File "/afs/ece.cmu.edu/usr/charlieh/.local/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 675, in sync_filters
    assert all(k in new_filters for k in self.filters)
AssertionError
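If I'm reading rollout_worker.py right, the assert that trips compares the filter keys of the freshly built local worker against the filters stored in the checkpoint, and those filters seem to be keyed by policy ID. A minimal sketch of the mismatch (the policy names here are made up):

# Hypothetical sketch of what sync_filters checks:
checkpoint_filters = {"policy_0": None, "policy_1": None}   # saved by the multiagent training run
fresh_filters = {"default_policy": None}                    # trainer rebuilt without a multiagent config
print(all(k in checkpoint_filters for k in fresh_filters))  # False -> AssertionError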
The relevant pieces of code:
tune.run(
    trainer,
    loggers=[CustomLogger],
    stop={"episodes_total": episodes},
    config={
        "env": BitcoinEnv,
        # "vf_share_layers": True,
        "gamma": 0.99,
        "num_workers": 7,
        "num_envs_per_worker": 1,
        "batch_mode": "complete_episodes",
        "train_batch_size": args.workers * args.ep_length,
        "entropy_coeff": 0.5,
        "entropy_coeff_schedule": args.ep_length * args.episodes,
        "multiagent": {
            "policies_to_train": policies_to_train,
            "policies": policies,
            "policy_mapping_fn": select_policy,
        },
        "env_config": {
            "max_hidden_block": BLOCKS,
            "alphas": ALPHA,
            "gammas": GAMMA,
            "ep_length": ep_length,
            "print": False,
        },
        "callbacks": {
            "on_episode_start": on_episode_start,
            "on_episode_step": on_episode_step,
            "on_episode_end": on_episode_end,
        },
    },
    checkpoint_score_attr="episode_reward_mean",
    keep_checkpoints_num=1,
    checkpoint_freq=3)
My attempt to restore the trained model to find the policy:
ray.init()
config = ppo.DEFAULT_CONFIG.copy()
trainer = PPOTrainer(env=BitcoinEnv, config={
    "num_workers": 7,
    "env_config": {
        "max_hidden_block": BLOCKS,
        "alphas": ALPHA,
        "gammas": GAMMA,
        "ep_length": ep_length,
        "print": False,
    },
})
trainer.restore('/afs/ece.cmu.edu/usr/charlieh/ray_results/PPO/PPO_BitcoinEnv_0_2020-01-29_00-56-415e4ywyg1/checkpoint_11223/checkpoint-11223')
policy = trainer.get_policy()
print(policy)
Ray version and other system information (Python version, TensorFlow version, OS):
Ray == 0.7.6
Red Hat Enterprise Linux 7.7 (Maipo) server
Ray from pip
Python version 3.6.8
My bad; you have to put the multiagent config into the PPOTrainer when you load.
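For anyone who hits this: rebuilding the trainer with the same multiagent block used for training (the same policies, policies_to_train, and select_policy objects from the training script) lets restore succeed. A sketch along those lines; the ID passed to get_policy is a placeholder for one of your own policy names:

trainer = PPOTrainer(env=BitcoinEnv, config={
    "num_workers": 7,
    # Must match the multiagent config the checkpoint was trained with,
    # so the restored per-policy filter keys line up.
    "multiagent": {
        "policies_to_train": policies_to_train,
        "policies": policies,
        "policy_mapping_fn": select_policy,
    },
    "env_config": {
        "max_hidden_block": BLOCKS,
        "alphas": ALPHA,
        "gammas": GAMMA,
        "ep_length": ep_length,
        "print": False,
    },
})
trainer.restore('/afs/ece.cmu.edu/usr/charlieh/ray_results/PPO/PPO_BitcoinEnv_0_2020-01-29_00-56-415e4ywyg1/checkpoint_11223/checkpoint-11223')
# With multiple policies, ask for a specific one by its ID:
policy = trainer.get_policy("policy_0")  # placeholder ID; use one of your policy names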