Ray: Confused about how to extract optimal schedule after PopulationBasedTraining

Created on 9 Aug 2020 · 2 comments · Source: ray-project/ray

I read https://github.com/ray-project/ray/issues/5489 and understand that pbt_global.txt contains the perturbations from some of the trials, but I am confused about how one would then establish an "optimal" schedule. Would you simply save the last value of each dictionary and then use that as your value for iteration N during the next run?

For example, if one were to run the toy example, perturbing the learning rate after every iteration, how would one extract a schedule so that you could say "at iteration N the learning rate should be X"? Or do I have some fundamental misunderstanding of what PBT provides?

For a more concrete example, I am running online k-NN (adding values to the index over time), so the optimal value of k can change over time. I want to find some sort of schedule for values of k.

question

All 2 comments

Hi, @alexisdrakopoulos, the schedules for each trial are saved as txt files in the results directory. @krfricke recently added functionality to replay a PBT run (https://docs.ray.io/en/master/tune/tutorials/tune-advanced-tutorial.html#replaying-a-pbt-run). So to get the optimal schedule you can pick the trial with the best accuracy and replay the corresponding schedule. Is this what you're looking for?
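For illustration, here is a minimal sketch of that replay pattern; the experiment path, trial id, metric name and the my_trainable function are placeholders, so substitute the policy file of whichever trial reached the best accuracy in your own results directory:

from ray import tune
from ray.tune.schedulers.pbt import PopulationBasedTrainingReplay

def my_trainable(config):
    # Placeholder trainable; in practice, reuse the same trainable as in the
    # original PBT run so the replayed config changes are meaningful.
    for _ in range(100):
        tune.report(mean_accuracy=config.get("lr", 0.01))

# Policy file of the best trial (hypothetical path and trial id).
policy_file = "~/ray_results/pbt_test/pbt_policy_3c28a_00001.txt"

# The replay object is passed to tune.run in place of the PBT scheduler and
# re-applies the recorded perturbations at the recorded steps to a single trial.
replay = PopulationBasedTrainingReplay(policy_file)

tune.run(my_trainable, scheduler=replay, stop={"training_iteration": 100})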

As a side note, you can explore the hyperparameter schedule for each trial in pbt_policy_{trial_id}.txt. If you would like to see the extracted schedule we use in the PopulationBasedTrainingReplay utility, maybe this helps:

from ray.tune.schedulers.pbt import PopulationBasedTrainingReplay

replay = PopulationBasedTrainingReplay(
    "~/ray_results/pbt_test_1/pbt_policy_3c28a_00001.txt")

print(replay.config)  # Initial config
print(replay._policy)  # Schedule, in the form of tuples (step, config)

Please note that _policy is a protected attribute and the representation might change in the future.
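If you just want an explicit "at iteration N the value should be X" table, as asked above, you can walk over those tuples yourself. A rough sketch, assuming the same (hypothetical) policy file and the (step, config) format shown above; it relies on the protected _policy attribute, so it may break in future versions:

from ray.tune.schedulers.pbt import PopulationBasedTrainingReplay

replay = PopulationBasedTrainingReplay(
    "~/ray_results/pbt_test_1/pbt_policy_3c28a_00001.txt")

# The initial config applies from step 0 until the first recorded perturbation;
# each (step, new_config) tuple switches to new_config at that step.
current_step, current_config = 0, replay.config
for step, new_config in replay._policy:
    print(f"iterations {current_step}-{step}: {current_config}")
    current_step, current_config = step, new_config
print(f"from iteration {current_step} onwards: {current_config}")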
