Does anyone have settings for the tennis sample, and an idea of how long it takes to train? I tried with the defaults, but I get a mean reward of about:
Mean reward -0.0185785606631
That doesn't seem to change much after an hour.
Thanks!
Hi @raeldor,
Arthur mentioned that the tennis example takes 3-5 million steps for training.
That's a lot of steps, but even after 750,000 steps it seems there's still no progress in the reward, which strikes me as a bit odd.
Hi @raeldor,
As @MarcoMeter mentioned, it does take a few million steps to train. Because the problem is adversarial, it is also one of the most difficult tasks to train, since such learning problems are less stable. I would recommend training it with a larger beta value than the default as well.
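For context on the beta suggestion, here is a minimal sketch that assumes beta is the entropy regularization coefficient, as in the usual PPO formulation. The function names, the toy discrete action distributions, and the value 1e-2 are illustrative assumptions and not taken from the trainer's code. It only shows that a larger beta gives a larger reward for keeping the policy spread out, which encourages exploration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_bonus(probs, beta):
    """Entropy term added to the objective, scaled by beta (assumed meaning of beta)."""
    return beta * entropy(probs)

uniform = [0.25, 0.25, 0.25, 0.25]   # exploratory policy
peaked = [0.97, 0.01, 0.01, 0.01]    # nearly deterministic policy

# 2.5e-3 is the value posted in this thread; 1e-2 is a hypothetical larger value.
for beta in (2.5e-3, 1e-2):
    print(beta, entropy_bonus(uniform, beta), entropy_bonus(peaked, beta))
```

With either beta the uniform policy earns the larger bonus. Raising beta widens that gap, so the optimizer is pushed harder toward exploring.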
How can we increase the number of steps?
In the hyperparameters, maxsteps = 5e5. What does that mean? 5*10^5? Like 5000000? That would be 5 million steps, so will the default be enough?
Would be great if the documentation could contain the settings for training each of the samples and how many steps it took. I think this would be very, very useful. Thanks.
Hi @raeldor, this is something we are working on adding soon. I completely understand how useful it would be.
@Fangh, 5e5 = 500000, which is five hundred thousand. 5e6 would be 5 million.
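A quick way to check values written in scientific notation is to parse them in Python. This is only an illustrative check and the variable names are made up for the example.

```python
# "AeB" means A * 10**B, so 5e5 is 5 * 10**5.
steps_5e5 = int(float("5e5"))
steps_5e6 = int(float("5e6"))

print(f"{steps_5e5:,}")  # 500,000
print(f"{steps_5e6:,}")  # 5,000,000
```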
In the meantime it would be great if you could post the settings used to train the tennis sample please. :)
Hey @raeldor - this is what I used. As mentioned, it takes millions of steps. But I was able to get a model to work. Hope that helps.
{'--batch-size': '64',
'--beta': '2.5e-3',
'--buffer-size': '2048',
'--curriculum': 'None',
'--epsilon': '0.2',
'--gamma': '0.99',
'--help': False,
'--hidden-units': '64',
'--keep-checkpoints': '5',
'--lambd': '0.95',
'--learning-rate': '3e-4',
'--load': False,
'--max-steps': '5e6',
'--normalize': False,
'--num-epoch': '5',
'--num-layers': '2',
'--run-path': 'ppo',
'--save-freq': '50000',
'--summary-freq': '10000',
'--time-horizon': '2048',
'--train': True,
'--worker-id': '0',
}
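The dump above looks like a parsed command-line options dictionary (an assumption based on its shape). As a rough sketch, a dictionary like that can be turned back into command-line flags. The helper `to_cli_args` is hypothetical and not part of the training script, and only a few of the posted options are repeated here.

```python
def to_cli_args(options):
    """Convert a parsed-options dict into a list of command-line flags."""
    args = []
    for flag, value in options.items():
        if value is False or value is None or value == 'None':
            continue  # skip switches that are off and options left unset
        if value is True:
            args.append(flag)  # boolean switch, e.g. --train
        else:
            args.append(f"{flag}={value}")
    return args

# A subset of the settings posted above.
options = {
    '--beta': '2.5e-3',
    '--max-steps': '5e6',
    '--load': False,
    '--train': True,
    '--curriculum': 'None',
}

print(" ".join(to_cli_args(options)))
```

The resulting flags would be appended to whatever training command you normally run. The one value that directly affects how long training lasts is `--max-steps`, set here to `5e6`.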
Appreciate these settings, thank you. I'll give it a shot! :-)
Hey @raeldor, let us know if you have any issues. I will close this issue out.