Something similar was asked before but this is different.
What pattern is imagined for run DAGs where one config instance leads to a number of (model, data_train, data_test) runs? You would typically collapse these into something like mean score minus std score.
I feel like I am fighting the framework, so I am probably not getting something right. The issue is that the Trainable class handles the serialization/persistence, but you really want to pass everything down to the remotes that get parallelized.
Or is there some pattern at the tune.run level that allows you to sample across the train/test pairs (not optimize over them), i.e. treat the data like config params?
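For concreteness, a minimal sketch of what "treat the data like config params" could look like, using tune.grid_search over split names so each sampled model config is crossed with every train/test pair (load_split and fit_and_score are hypothetical helpers, and the split names are made up):

from ray import tune

def trainable(config):
    # Each trial sees one (hyperparameter sample, data split) combination.
    # load_split is a hypothetical helper mapping a split name to (train, test).
    data_train, data_test = load_split(config["split"])
    score = fit_and_score(config, data_train, data_test)  # hypothetical
    tune.report(score=score)

analysis = tune.run(
    trainable,
    config={
        "lr": tune.loguniform(1e-4, 1e-1),
        "split": tune.grid_search(["fold_0", "fold_1", "fold_2"]),
    },
)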
Hey David, great question. Would you be open to a quick call on this subject? It's something we've been discussing internally (the Ray / Anyscale team), and it'd be good to understand your perspective to make sure we're thinking about it in the right way, given your well-formulated question :).
Can you ping me at bill @ anyscale? Or if you're on the Ray Slack, I'm on there too.
Hey @cottrell, would this work https://ray.readthedocs.io/en/latest/tune-searchalg.html#repeated-evaluations for you?
Documentation here: https://ray.readthedocs.io/en/latest/tune/api_docs/suggestion.html#ray.tune.suggest.Repeater
What you could do is have the Trainable execute something different depending on the trial_index.
from ray import tune
from ray.tune import Trainable
from ray.tune.suggest import Repeater
from ray.tune.suggest.hyperopt import HyperOptSearch
from ray.tune.suggest.repeater import TRIAL_INDEX

class TestMe(Trainable):
    def _setup(self, config):
        index = config[TRIAL_INDEX]  # injected by Repeater when set_index=True
        data_train, data_test = create_from_index(index, config)  # create_from_index is user-defined
    def _train(self):
        ...

tune.run(TestMe, search_alg=Repeater(HyperOptSearch(search_space), repeat=5, set_index=True))
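create_from_index above is just a placeholder; a minimal sketch of one possible implementation, assuming the repeats map onto k-fold splits of a dataset (load_dataset is a hypothetical loader returning numpy arrays):

from sklearn.model_selection import KFold

def create_from_index(index, config):
    # Map the repeat index (0..4 when repeat=5) to one fold of a 5-fold split.
    X, y = load_dataset(config)  # hypothetical loader
    folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X))
    train_idx, test_idx = folds[index]
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])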
Does this make sense? Feel free to follow up with any questions (or any suggestions for how we can improve the docs).
@richardliaw I think a modified Repeater would handle the case I'm thinking of ... but I'm kind of reluctant to fit a framework around it. You could get really fancy and treat all individual runs as independent, and just update the stats/scoring mechanism in the scheduler, I guess. Will try to jump on Slack.
Can do a call to chat. @anabranch, will msg you @ anyscale ... I've requested to join the Slack but it might take a few days.
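A simpler post-hoc alternative to changing the scheduler's scoring is to collapse repeats after the run from the results dataframe. A minimal sketch, assuming each trial reports a "score" metric and the hyperparameter of interest is "lr" (the column names are assumptions about the results dataframe):

analysis = tune.run(TestMe, search_alg=Repeater(HyperOptSearch(search_space), repeat=5, set_index=True))
df = analysis.dataframe(metric="score", mode="max")
# Collapse the repeats of each hyperparameter setting into mean - std
grouped = df.groupby("config/lr")["score"].agg(["mean", "std"])
grouped["mean_minus_std"] = grouped["mean"] - grouped["std"]
print(grouped.sort_values("mean_minus_std", ascending=False))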
Sent an invite; happy to chat online.
I think we resolved this offline (feel free to reopen if not resolved).
Curious what the specific resolution of this was. @cottrell, have you settled on the Repeater solution Richard showed above, or have you found a different way around it?
Ah, I think the resolution was to just not use this type of computation pattern.