The original choice of Binder was because it allowed us to provision the machines with drake via a Docker instance, where Colab does not. But the drake docker instance is large enough that the Binder setup takes just as long as setting up Colab in the first cell of a notebook. We also thought there were a number of things that Colab would not support, such as interactive visualization, but I've worked around many of those (#12645). Colab has also proven to be a much more stable/reliable resource.
But most of all, Colab's connection to Google Drive changes the game. Now that it works well, I find myself routinely starting my work in a Colab notebook, running it from any device, and (as the name suggests) very easily sharing it with collaborators.
This requires some modifications to the preamble of our notebooks. Here is an example:
https://colab.research.google.com/drive/1rhqV8WMo6pNyzOV3TjAth4R9hSlZx64t
@EricCousineau-TRI , @hongkai-dai , or others, any thoughts? (especially on the preamble)?
Note: I am ok supporting both Binder and Colab, but think Colab should be the first priority (sometimes one visualization approach might be a little better in Colab vs. Binder; let's always favor Colab). I am also ok if we simply remove Binder from the READMEs and do not advertise/actively support it.
I'm not a huge fan of embedding the preamble in each tutorial, but yeah, I can understand the case for it: our usage of public Binder is suboptimal compared to what Google Colab can offer (I think for free, by default?), the Colab setup time costs little relative to what Binder actually gives us, and the connection with Google Drive is most definitely uber pervasive.
As an alternative to the preamble, is it possible for us to easily specify a Kernel image for Colab to use?
If we were to stick with the preamble, it would be nice to somehow indicate, in each of the tutorials perhaps, when it's necessary for the runtime to be upgraded?
Also, mayhaps there's a way to more easily express / robustify the current logic?
e.g. we can go from:
%%capture
try:
    import pydrake
except:
    !curl my_url > my_script.py
    from my_script import my_setup
    my_setup(...)
to instead be something like:
%%capture
from urllib.request import urlopen
with urlopen("https://raw.githubusercontent.com/RussTedrake/underactuated/master/scripts/setup/jupyter_setup.py") as f:
    exec(f.read())
setup_drake_if_needed()
In this case, the setup_drake_if_needed could be an entirely encapsulated function (to avoid polluting globals).
It could look like this:
def setup_drake_if_needed(min_date=None, max_date=None):
    try:
        import pydrake
        # Ensure we have the proper version of Drake based on `min_date, max_date`. Fail fast if it's too old.
        return
    except ImportError as e:
        # Ensure it's only `pydrake` that's the problem, not a dependency.
        ...
Thoughts?
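To make the "only `pydrake` is the problem, not a dependency" check concrete, here is a minimal sketch. It is not the actual `jupyter_setup.py` logic; the helper name `setup_if_needed` and the `install` callback are hypothetical, and the version/date check is left out. It relies on `ModuleNotFoundError.name`, which holds the name of the module that could not be found.

```python
import importlib


def setup_if_needed(module_name, install):
    """Import `module_name`, running `install()` first only if that package is missing.

    Hypothetical helper. Returns True if `install` was invoked, False if the
    module was already importable.
    """
    top_level = module_name.split(".")[0]
    try:
        importlib.import_module(module_name)
        return False
    except ModuleNotFoundError as e:
        # `e.name` is the module that was not found. If it is something other
        # than the package we asked for, a dependency is broken, so fail fast
        # instead of reinstalling.
        if e.name != top_level:
            raise
    install()
    # Let the import system notice files that appeared during installation.
    importlib.invalidate_caches()
    importlib.import_module(module_name)
    return True
```

In a notebook, `install` would wrap whatever install step we settle on; everything stays inside the function, so nothing leaks into the notebook's globals.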
Yes to the "for free" question.
In my mind, an ideal goal for the preamble should be
try:
    import pydrake
except ImportError:
    !apt install drake
(although I think we'll still need to set the PYTHONPATH)
I don't think we can provision colab, but I'd love to be wrong.
I've been also looking for ways that we can hide/minimize/autorun the first cell, or things like that.
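For the path part, a rough sketch of what the extra step might look like. The helper name `add_site_packages` and the `<prefix>/lib/pythonX.Y/site-packages` layout are assumptions for illustration; the real install prefix and layout would need to be confirmed against the package.

```python
import os
import sys


def add_site_packages(prefix, version_info=sys.version_info):
    """Append `<prefix>/lib/pythonX.Y/site-packages` to sys.path (once) and return it.

    Hypothetical helper; the directory layout is an assumption.
    """
    path = os.path.join(
        prefix,
        "lib",
        "python{}.{}".format(version_info[0], version_info[1]),
        "site-packages",
    )
    # Avoid adding the same entry again when the cell is re-run.
    if path not in sys.path:
        sys.path.append(path)
    return path
```

Appending at runtime like this only affects the current kernel, which is probably what we want in a notebook; setting the environment variable would only help processes started afterwards.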
> Yes to the "for free" question.
:+1:
> In my mind, an ideal goal for the preamble should be [....]
My only issue with the try: import X; except ... is that it exposes the symbol a bit too early... I would rather it be a fast check_and_setup() so that people are not encouraged to inject add'l module imports there, and instead separate their imports from that stanza. But it's a small nit, so I'm fine with it as-is for now.
> I don't think we can provision colab, but I'd love to be wrong.
> I've been also looking for ways that we can hide/minimize/autorun the first cell, or things like that.
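A sketch of what a fast `check_and_setup()` could look like without binding the module name in the notebook. The signature and the `install` callback are hypothetical. It uses `importlib.util.find_spec`, which locates a top-level module without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_and_setup(module_name, install):
    """Run `install()` only when `module_name` cannot be found.

    Hypothetical helper. Returns True if `install` ran, False otherwise.
    """
    if is_installed(module_name):
        return False
    install()
    return True
```

One trade-off: this only tells us the package is present, not that it imports cleanly, so a broken dependency would surface later in the user's own import cell.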
I'm wondering if there's a public forum where we could post your experience of the pros/cons of Binder and Colab w.r.t. provisioning, visualization, etc.
We can thank 'em for their awesome work, let 'em be aware of possible issues w.r.t. our workflow, see if anyone has comments, etc.
https://discourse.jupyter.org/c/binder/12 may be a good place for a Binder summary, if you or your TA's haven't already posted there or elsewhere?
Also, possibly posting our questions to a Colab-oriented forum? e.g.
https://stackoverflow.com/questions/tagged/google-colaboratory
If I sat down on Friday and stepped through each of the pain points you mentioned, would you be able to review my post(s)?
As asked by @RussTedrake, here are my thoughts:
I do think using Colab is a great fit for all of these use-cases:
(i) first-time users to Drake in general
(ii) students in particular
(iii) even for expert users, quickly-shareable projects or easy-to-start one-off projects
My personal view is that the current 3D visualization solution Russ already has working is awesome. (You just need to have a separate window with the visualizer in it, but it all streams from the Colab server and works. The only thing that doesn't currently work is embedding that inside a Colab cell, but I actually like it better separate.)
I also think the model that is getting developed here (everything you need in a Colab, including 3D visualization) would be useful/inspiration for a bunch of other projects out there as well.
@EricCousineau-TRI -- possibly relevant: https://github.com/RussTedrake/underactuated/issues/247
If @jamiesnape can land #1183, and we think it's "fail fast" enough, then the entire preamble could be just
!pip install drake
Is the reason you suggest pip here because it's just a one-liner, instead of
!add-apt-repository -y ppa:foo/bar
!apt-get update
!apt-get install python3-pydrake
?
The apt work is a short hop away from being complete. The pip work has a meaningful risk of extending past the summer, even if we think it's going to be easy.
Yes. But it's even more than that... https://github.com/RussTedrake/underactuated/blob/master/scripts/setup/jupyter_setup.py#L6
We would either need to handle mac/ubuntu, or otherwise make sure that someone on mac doesn't get a failure when they run locally. And we would need the lines to sys.path.append. All together, it goes from something that feels welcoming and consumable to something quite a bit more heavy.
!pip install drake
On this count, I think that's great for Colab hosted instances, etc.
But for me, running locally, I'd hate to have this installed in my ~/.local user site-packages just b/c of the dependency hell that would arise. apt install python3-drake sounds much better, since it would more purposefully (I think) encapsulate itself.
On the count of handling apt vs. brew dispatch, I think we'd still have to put some platform-specific logic in there...
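Roughly, the platform-specific part might boil down to something like the sketch below. The helper name `pick_package_manager` is hypothetical, and mapping Linux straight to apt assumes an Ubuntu/Debian host, which is an assumption that holds for Colab but not for Linux in general.

```python
import platform


def pick_package_manager(system=None):
    """Return "apt" or "brew" for the given (or current) platform.

    Hypothetical helper. `system` takes the strings that `platform.system()`
    returns, e.g. "Linux" or "Darwin".
    """
    system = platform.system() if system is None else system
    if system == "Linux":
        # Assumption: a Debian/Ubuntu-based distribution.
        return "apt"
    if system == "Darwin":
        return "brew"
    # Fail fast with a clear message rather than running the wrong installer.
    raise RuntimeError("Unsupported platform: {!r}".format(system))
```

The actual install commands would then hang off that return value, so the notebook cell itself stays a one-liner.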
Just as a separate note, I'm trying to see if it's at all possible to run a Google Colab-like UI locally to simplify editing local notebooks - e.g. for a PR where we want to rely on Google Colab metadata, like collapsing input cells:
https://stackoverflow.com/a/50842328/7829525
The decision during review of #13697 (and afterwards on Slack) was that we should move drake/tutorials out of drake and into a new repo, e.g. drake-tutorials. The motivation is that these tutorials should be runnable/sharable by anybody on Colab, without assuming that they are living in the drake source tree, and should have a durable reference to an associated drake binary release. Here are a few key ideas from that thread:
drake-tutorials should show exactly what we want.
Breadcrumb: We also need to test drake/tools/install/colab/setup_drake_colab.py in pre-merge CI, as discussed in https://reviewable.io/reviews/RobotLocomotion/drake/13697#-MCR4XhbEwraQHsbipoU
Can I suggest we push this forward (particularly the repo split)? It is going to have to be in tandem with #1183 to an extent, but the split repo part can happen beforehand without that blocking.
Also, we could then decide whether supporting Mac for the tutorials is especially worth the effort.
I think we could definitely go "colab only" (no mac) for these. It would presumably not be hard for someone to run them on mac, but we could skip any overhead in guaranteeing that with CI.
Should I start a repo? drake-tutorials?
Only question here is, will the Colab notebooks be put under CI for quickly checking for issues (e.g. deprecation)?
My read from htmlbook is that yes, it'll have CI: https://github.com/RussTedrake/htmlbook/blob/26cb5cd9184f6cba1ed42b750ad81ed6497a6769/tools/jupyter/defs.bzl#L36-L44
Just wanna confirm!
They will be under more CI than Drake, even. That is one of the main reasons for splitting them out.
https://github.com/RobotLocomotion/drake-tutorials. I guess we get some infrastructure in place first.