The PPO paper describes running N actors in parallel when training a model.
However, the PPO2 implementation does not have a parameter for specifying the number of actors. I am wondering whether this is set by default somewhere (although I couldn't find it in the source code), or whether you need to use the SubprocVecEnv vectorized environment to get actors running in parallel.
What is the default number of actors (notation N in the paper) used when training a PPO2 model? Can the number of actors be modified?
Yes, the number of actors is defined by the number of environments (one actor per environment), so you use SubprocVecEnv to define how many actors/environments you want to run in parallel.
How many environments to use depends on the environment. Generally you want this as high as possible according to some papers (can't recall the name now, but the number of envs can go into the hundreds). A safe bet is to start with 8 or 16 environments if your machine can handle it. Even if the total FPS is lower with more environments, I recommend going with more environments, as it gives a better estimate for the update (more environments means more independent samples).
To complete @miffyli's answer:
Yes, the number of actors is the number of environments, so you need to use a VecEnv for that (DummyVecEnv or the subprocess version, SubprocVecEnv). The default number is therefore one.
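As a rough illustration of the point above, here is a minimal sketch of building a vectorized environment with several actors and passing it to PPO2. It assumes stable-baselines 2.x and a gym version that still provides env.seed(); the helper names make_env and train, the CartPole-v1 environment id, and the step counts are illustrative choices, not something taken from this thread.

```python
def make_env(env_id, rank, seed=0):
    """Return a factory that builds one environment (one actor)."""
    def _init():
        import gym  # imported lazily, inside the worker process
        env = gym.make(env_id)
        env.seed(seed + rank)  # give each actor a different seed
        return env
    return _init


def train(env_id="CartPole-v1", n_envs=8, use_subproc=True, total_timesteps=25000):
    """Train PPO2 with n_envs actors (hypothetical helper)."""
    from stable_baselines import PPO2
    from stable_baselines.common.vec_env import DummyVecEnv, SubprocVecEnv

    env_fns = [make_env(env_id, rank) for rank in range(n_envs)]
    # SubprocVecEnv runs each environment in its own process;
    # DummyVecEnv steps them one after another in the current process.
    vec_env = SubprocVecEnv(env_fns) if use_subproc else DummyVecEnv(env_fns)

    model = PPO2("MlpPolicy", vec_env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    return model
```

When using SubprocVecEnv, call train() from under an `if __name__ == "__main__":` guard so that worker processes can be started safely. For very cheap environments, DummyVecEnv can end up faster because it avoids inter-process communication.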
Regarding the influence of the number of actors: adding more actors will usually improve exploration and wall-clock time but decrease sample efficiency (you can check out the RL Zoo for working hyperparameters on different environments).
Edit: you usually also need to change the n_steps parameter, which is defined per environment.
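To make the n_steps remark concrete: since n_steps counts steps per environment, the number of samples collected before each update is n_envs * n_steps. The sketch below only does that arithmetic; the function names are made up for illustration, and the default of 128 is what I believe PPO2 uses for n_steps, so treat it as an assumption.

```python
def rollout_batch_size(n_envs, n_steps=128):
    """Samples collected per update: every env contributes n_steps transitions."""
    return n_envs * n_steps


def n_steps_for_budget(target_batch, n_envs):
    """Per-environment n_steps that keeps the total rollout size fixed."""
    if target_batch % n_envs != 0:
        raise ValueError("target_batch must be divisible by n_envs")
    return target_batch // n_envs


print(rollout_batch_size(n_envs=1))   # 128
print(rollout_batch_size(n_envs=8))   # 1024
print(n_steps_for_budget(target_batch=2048, n_envs=8))  # 256
```

So going from 1 to 8 environments without touching n_steps makes each update use eight times as many samples, which is why the two parameters are usually tuned together.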
Thank you for clarifying