When creating a DQN, there is an option to specify exploration_fraction and exploration_final_eps, but not the initial value.
As a result, the value is set in DQN.run() with a linear schedule that begins at 1:
# Create the schedule for exploration starting from 1.
self.exploration = LinearSchedule(schedule_timesteps=int(self.exploration_fraction * total_timesteps), initial_p=1.0, final_p=self.exploration_final_eps)
It would be helpful to be able to set this initial value to something other than 1, so that users can specify it ahead of time. Alternatively, it would be helpful to have the option to pass in the schedule itself (or a custom function that takes the step count as input and returns epsilon). A simpler approach would be to move the line that defines self.exploration into __init__ rather than run, so that a savvy user could overwrite self.exploration.
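As a rough illustration of the "custom function" option, here is a minimal sketch of a callable that maps the step count to epsilon. The name make_linear_epsilon and its arguments are hypothetical and are not part of the library; the sketch only shows the shape such a function could take.

```python
from typing import Callable


def make_linear_epsilon(initial_eps: float, final_eps: float,
                        decay_steps: int) -> Callable[[int], float]:
    """Hypothetical helper: build a function step -> epsilon.

    Epsilon moves linearly from initial_eps to final_eps over
    decay_steps steps and stays at final_eps afterwards.
    """
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")

    def epsilon(step: int) -> float:
        # Clamp the progress to [0, 1] so that negative or very large
        # step counts still give a value between the two endpoints.
        fraction = min(max(step, 0) / decay_steps, 1.0)
        return initial_eps + fraction * (final_eps - initial_eps)

    return epsilon


# Example: start exploring at 0.5 instead of 1.0.
eps_fn = make_linear_epsilon(initial_eps=0.5, final_eps=0.1, decay_steps=1000)
```

A callable like this would let a user plug in any decay shape (linear, exponential, piecewise), at the cost of a wider API surface than a single extra parameter.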
Hey. That's a good point. This would also be a useful parameter if, e.g., one loads existing parameters and wants a small but nonzero amount of exploration from the beginning. For simplicity, I would provide this as a parameter to DQN.__init__ (e.g. exploration_start_eps, defaulting to 1.0). If we move the creation of the schedule to __init__, we need to check whether the scheduler should be reset at the beginning of each learn() (in case the scheduler has parameters to reset), and it could change some existing behavior.
On the one hand, I agree with the "support for savvy users", but on the other hand, things like these could start complicating the code. Personally, I have copied the corresponding files (e.g. dqn.py) and made the necessary "savvier" modifications there, which works out well most of the time.
We would appreciate a PR for exploration_start_eps :)
Cool, it sounds like a good approach here is to pass exploration_start_eps to DQN.__init__ with a default of 1.0, then save that value in the same way that exploration_final_eps is saved as a parameter.
I am still fairly new to proper PR etiquette, so I may ask for help, but this is a pretty small fix that I think I can manage :)
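The plan above could look roughly like the following sketch. SketchDQN is a hypothetical skeleton, not the real DQN class, and the LinearSchedule here is a small stand-in written for this example, assuming the same constructor arguments as the call quoted at the top of the thread. Only the handling of the exploration parameters is shown.

```python
class LinearSchedule:
    """Stand-in for the library's schedule (assumed behaviour):
    linear interpolation from initial_p to final_p over
    schedule_timesteps steps, then constant at final_p."""

    def __init__(self, schedule_timesteps, final_p, initial_p=1.0):
        self.schedule_timesteps = schedule_timesteps
        self.final_p = final_p
        self.initial_p = initial_p

    def value(self, step):
        fraction = min(float(step) / self.schedule_timesteps, 1.0)
        return self.initial_p + fraction * (self.final_p - self.initial_p)


class SketchDQN:
    """Hypothetical skeleton showing only the exploration parameters."""

    def __init__(self, exploration_fraction=0.1, exploration_final_eps=0.02,
                 exploration_start_eps=1.0):
        # Store the start value next to the existing final value.
        self.exploration_fraction = exploration_fraction
        self.exploration_final_eps = exploration_final_eps
        self.exploration_start_eps = exploration_start_eps
        self.exploration = None

    def learn(self, total_timesteps):
        # The schedule is still created per training call, so it starts
        # fresh each time, but no longer hard-codes initial_p=1.0.
        self.exploration = LinearSchedule(
            schedule_timesteps=int(self.exploration_fraction * total_timesteps),
            initial_p=self.exploration_start_eps,
            final_p=self.exploration_final_eps,
        )
        return self


model = SketchDQN(exploration_start_eps=0.3).learn(total_timesteps=1000)
```

Keeping the schedule creation in the training call means the default of 1.0 reproduces the existing behavior.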
@Miffyli it looks like testing on the DQN is fairly limited -- are there any tests I should add here?
Since this only changes one parameter passed to a scheduler, I do not think it needs a DQN test; we should rather test the scheduler. If you could add a test for LinearSchedule in test_schedules.py, that should do the trick :)
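A test along these lines might look like the sketch below. It is written in pytest style, and the test names are made up for illustration. To keep the example self-contained, it defines a stand-in LinearSchedule with the assumed behavior (linear interpolation, then constant); in the repository the class would be imported from the library instead.

```python
import math


class LinearSchedule:
    """Stand-in with the assumed behaviour of the library class."""

    def __init__(self, schedule_timesteps, final_p, initial_p=1.0):
        self.schedule_timesteps = schedule_timesteps
        self.final_p = final_p
        self.initial_p = initial_p

    def value(self, step):
        fraction = min(float(step) / self.schedule_timesteps, 1.0)
        return self.initial_p + fraction * (self.final_p - self.initial_p)


def test_linear_schedule_default_start():
    # Without initial_p, the schedule should start at 1.0 as before.
    schedule = LinearSchedule(schedule_timesteps=100, final_p=0.2)
    assert math.isclose(schedule.value(0), 1.0)
    assert math.isclose(schedule.value(50), 0.6)
    assert math.isclose(schedule.value(100), 0.2)


def test_linear_schedule_custom_start():
    # A custom start value should be used at step 0, and the value
    # should stay at final_p once the schedule has finished.
    schedule = LinearSchedule(schedule_timesteps=100, final_p=0.2, initial_p=0.8)
    assert math.isclose(schedule.value(0), 0.8)
    assert math.isclose(schedule.value(50), 0.5)
    assert math.isclose(schedule.value(500), 0.2)
```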