Although Tune is (somewhat) easy to set up and get started with, it doesn't provide the best user experience (compared to tools like Bazel or Keras).
For one, the result logs are too verbose. There's a huge amount of text printed, especially in a distributed experiment running 20+ jobs at once.
Second, for the same reason, it's annoying and unpleasant to use Tune in a Jupyter notebook. This probably loses half the data science crowd.
Third, there are far too many logs, notifications, and info strings that don't really inform the user.
Finally, we should integrate with TensorBoard HParams #4528.
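For context on the HParams point, here is a minimal sketch of what HParams logging looks like outside Tune, using torch.utils.tensorboard; the log directory, hyperparameter names, and metric value are made-up placeholders, not Tune's integration:

```python
from torch.utils.tensorboard import SummaryWriter

# Writes an HParams entry that TensorBoard's HPARAMS tab can group and filter.
writer = SummaryWriter(log_dir="./hparams_demo")
writer.add_hparams(
    {"lr": 1e-3, "batch_size": 64},   # hyperparameters for one trial
    {"hparam/val_accuracy": 0.91},    # final metrics associated with those hparams
)
writer.close()
```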
Please share your thoughts. We will keep this RFC open for a while. I'd really appreciate your feedback!
If Ray code is running and is interrupted by the user, the notebook must be restarted before Ray can be used again. For a reproducible example, check out this code link: run the first cell, then the second cell, interrupt the second cell (which hangs), and then try to re-run the second cell; it throws an error that can only be fixed by restarting the notebook. For a lengthy project spread across many notebook cells, this is a bad user experience.
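For what it's worth, a minimal sketch of the two-cell pattern described above (this is not the linked code; the sleep is just a stand-in for a long-running task):

```python
# --- Cell 1 ---
import time
import ray

ray.init()

@ray.remote
def long_task():
    time.sleep(3600)  # stand-in for a long-running workload
    return "done"

# --- Cell 2 ---
# This call blocks; interrupt the kernel here, then re-run the cell.
# In the behavior described above, the re-run errors out until the
# notebook (and Ray) are restarted.
result = ray.get(long_task.remote())
```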
Hey @kiddyboots216, would you mind sharing the code link with me? [email protected]
@angelotc done
@richardliaw @ericl
Great topic! Background: I've been using gym, tune, and rllib full time for the past year, via the Python API on a single local machine.
TL;DR:
Overall, I believe that Ray has a great and scalable architecture, so it shouldn't be painful to improve the UX. The UX for setting up and running an experiment is great. The UX for inspecting a running experiment (TensorBoard) is good, but could be improved with HParams, a computation graph tab, and custom user tabs. The UX for restoring tuned agents is frankly a painful experience that requires low-level work on the user's side even for simple things.
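To make the restore pain point concrete, here is a hedged sketch of the kind of manual work currently involved; the agent class, config, and checkpoint path are placeholders, and the exact API depends on the Ray/RLlib version:

```python
import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer  # example agent; any Trainer works

ray.init()
tune.run(PPOTrainer, config={"env": "CartPole-v0"}, checkpoint_at_end=True)

# Today the user has to locate the checkpoint on disk by hand before restoring:
checkpoint_path = "~/ray_results/PPO/.../checkpoint_1/checkpoint-1"  # found manually
agent = PPOTrainer(config={"env": "CartPole-v0"})
agent.restore(checkpoint_path)
```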
An ideal workflow could be:
Hey @FedericoFontana, this is really good feedback, and I'll try to get to all of it in the next month. I appreciate you taking the time to put this together!
RE: ExperimentAnalysis, I'll probably implement something that keeps everything in memory. That should scale much better than the current version.
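For reference, a rough sketch of how the current (disk-backed) ExperimentAnalysis gets used today; the experiment path and metric name are placeholders, and the import path and exact method signatures vary across Ray versions:

```python
from ray.tune import ExperimentAnalysis

# Points at an experiment directory under ~/ray_results (placeholder path).
analysis = ExperimentAnalysis("~/ray_results/my_experiment")

df = analysis.dataframe()  # one row per trial, built by reading results from disk
best_config = analysis.get_best_config(metric="episode_reward_mean")
print(best_config)
```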
cc @hershg
I think when it comes to visualizing and comparing different experiments, Weights & Biases is one of the best choices. It can
I suggest we either build a wandb-like experiment management/comparison/visualization tool ourselves, or integrate deeply with wandb.
@richardyy1188 Can you try the Weights & Biases Tune logger? https://github.com/wandb/client/tree/master/wandb/ray
Yes, I tried it. But it seems there's no way to use wandb.watch(my_model_instance) with Tune. This call tracks gradients and saves the model graph, to show something like this:

The problem I encountered is that watch should be called after init, but if we put watch in the logger as below, we can't get hold of the model instance it needs.
```python
class MyTrainable(Trainable):  # placeholder name; the original sketch used "..."
    def _setup(self, config):
        m = Model().cuda()
        self.config["my_model"] = m


class WandbLogger(tune.logger.Logger):
    def __init__(self):
        wandb.init(...)
        # pyarrow error here: the CUDA model can't be passed/serialized into CPU memory
        wandb.watch(self.config["my_model"])
```
And we can't just call init and watch in the Trainable or outside of it, because it seems the Trainable and the logger run in different processes.
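For contrast, the ordinary single-process wandb pattern, where init, watch, and the training loop all run in the same process (the project name and model here are placeholders):

```python
import wandb
import torch.nn as nn

wandb.init(project="demo")           # placeholder project name
model = nn.Linear(10, 2)             # stand-in for the real model
wandb.watch(model, log="gradients")  # hooks the model so gradients get logged

# ... training loop calling wandb.log({"loss": loss_value}) each step ...
```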
Got it; we should integrate more with them then.