Originating from jupyterhub/binderhub#101
As I've watched the size of the docker-stacks images grow and grow, I've often wondered if we should provide some kind of web UI that lets users pick high-level features they want (languages, kernels, conda-forge packages, etc.) and get what we'd consider a well-formed Dockerfile (and/or environment.yaml, requirements.txt, ...) for their personal build environment, or now binderhub. It feels like this would be one way to trim the docker-stacks images back to the opinionated, starter pack concept and perhaps help people create good environment definitions for use in binderhub, for whatever definition of good we have at the moment.
I recognize this isn't a panacea: whatever the tool emits will certainly need updates over time as best practices shift.
https://phpdocker.io/generator is in the vein of what I'm suggesting.
Thoughts:
FROM jupyter/some-stack-with-just-the-scripts.
This is largely what repo2docker is doing (as a repo, not web UI): generating a Dockerfile from environment.yml, requirements.txt, etc. We could do something similar and use repo2docker to generate a Dockerfile from form input instead of files in a repo.
I was thinking about this from a ground-up perspective rather than from an existing environment.yaml, requirements.txt, etc. which I think is what you're suggesting. Please correct me if I'm wrong.
I think both have merit. Here's a messy brain dump of what I was thinking in a bit more detail:
Work-in-progress mock-up: https://parente.github.io/docker-stacker/
/cc @ian-r-rose
Could shorten the conda-forge links to ( https://conda-forge.org/feedstocks ). Though they are redirected anyways.
Would be good if we could select versions of a package. Though maybe that is possible and I just don't know how. 😄
This doesn't actually generate images now, right? Mainly a UI mock up, correct? If so, looks really nice.
> Could shorten the conda-forge links to ( https://conda-forge.org/feedstocks ). Though they are redirected anyways.
+1
> Would be good if we could select versions of a package. Though maybe that is possible and I just don't know how. 😄
No version selection at the moment. I'm scraping the list of packages only, not repodata for conda or the equivalent from the Julia packages site.
> This doesn't actually generate images now, right? Mainly a UI mock up, correct? If so, looks really nice.
Mock-up only (in desperate need of design love), implemented to make real the thought I was trying to convey a couple comments up. My next step is to get it to show a little dialog with the Dockerfile it produced, a Download button to get the Dockerfile, and instructions on how to build it. Keepin' it simple to see how much utility we can get out of a basic, client-side app.
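To make the "show the Dockerfile it produced" step concrete, here is a minimal sketch of turning a set of selections into Dockerfile text. It is written in Python for brevity rather than as the client-side code the mock-up would actually need, and the function name `render_dockerfile`, its parameters, and the choice of `jupyter/base-notebook` as the base image are assumptions for illustration only.

```python
from typing import Sequence


def render_dockerfile(
    base_image: str,
    conda_packages: Sequence[str] = (),
    pip_packages: Sequence[str] = (),
) -> str:
    """Return Dockerfile text for a base image plus the selected packages.

    Hypothetical helper: the real tool might emit different instructions.
    """
    lines = [f"FROM {base_image}"]
    if conda_packages:
        # Sort so the same selection always yields the same text.
        pkgs = " ".join(sorted(conda_packages))
        lines.append(f"RUN conda install --yes --channel conda-forge {pkgs} && \\")
        lines.append("    conda clean --all --yes")
    if pip_packages:
        pkgs = " ".join(sorted(pip_packages))
        lines.append(f"RUN pip install --no-cache-dir {pkgs}")
    return "\n".join(lines) + "\n"


dockerfile = render_dockerfile(
    "jupyter/base-notebook",
    conda_packages=["pandas", "matplotlib"],
    pip_packages=["requests"],
)
print(dockerfile)
```

The build instructions shown next to the Download button could then be as short as running `docker build -t my-stack .` in the directory that holds the downloaded Dockerfile (the tag `my-stack` is just a placeholder).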
Idea: you could emit a zip file. This way when the zip file is unzipped, the Dockerfile can be in its own directory as Docker build prefers. The extract directory will have a name you chose, identifying a specific build command. The build dir could also have subdirs with other config files in them used in Dockerfile COPY or RUN commands, which were downloaded into browser memory using Ajax (jQuery).
The key npm modules are jszip for the zipping and filesaver.js-npm (works, but has a horrible name for npm) to "download" a file made from data within the browser. browserify or jspm will let you glue together a js app for use browser-side from npm modules.
I've used these in the past to write a js app that emits a zip file of multiple nested directories of organized csv data from in-browser calculations. Sounds tedious, but actually, it was pretty easy with the right modules.
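A rough sketch of the archive layout described above, using Python's standard `zipfile` module in place of jszip so it can be run outside a browser. The build name `my-notebook`, the `config/` subdirectory, and the helper `build_context_zip` are all made up for the example.

```python
import io
import zipfile
from typing import Mapping


def build_context_zip(
    build_name: str, dockerfile: str, extra_files: Mapping[str, str]
) -> bytes:
    """Pack a Dockerfile and supporting files under one top-level directory.

    Hypothetical helper: everything lives under `build_name/` so that
    unzipping yields a self-contained Docker build directory.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(f"{build_name}/Dockerfile", dockerfile)
        for relative_path, content in extra_files.items():
            # Subdirectories are created implicitly from the path.
            archive.writestr(f"{build_name}/{relative_path}", content)
    return buffer.getvalue()


data = build_context_zip(
    "my-notebook",
    "FROM jupyter/base-notebook\n"
    "COPY config/jupyter_notebook_config.py /etc/jupyter/\n",
    {"config/jupyter_notebook_config.py": "c.NotebookApp.open_browser = False\n"},
)
names = zipfile.ZipFile(io.BytesIO(data)).namelist()
print(names)
```

In the browser the same structure would map onto jszip folders, with the resulting blob handed to the file-saving module for download.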
@parente I like your Docker-Stacker initiative. https://parente.github.io/docker-stacker/
Please give options to opt out libraries like Julia and Scala. Thanks
I stalled on working on the web UI idea and started to focus on better documentation for this repo and community contributed stacks. I'm going to close this out since I don't expect to continue working on the web approach anytime soon. If someone would like to pick up where I left off, that's fine by me. I think the next challenge with the approach is finding a sane way to generate the flat Dockerfile text in a way that's extensible when people want to contribute new plug-ins for different kernels, package repositories, etc.
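For anyone picking this up, one possible shape for the extensibility problem is a small registry in which each plug-in contributes its own Dockerfile lines. This is only a sketch under assumptions, not something the mock-up implements: the names `PLUGINS`, `register`, and `render`, and the two example plug-ins, are hypothetical.

```python
from typing import Callable, Dict, List, Mapping, Sequence

# A plug-in maps the user's selected items to Dockerfile lines.
FragmentPlugin = Callable[[Sequence[str]], List[str]]
PLUGINS: Dict[str, FragmentPlugin] = {}


def register(kind: str) -> Callable[[FragmentPlugin], FragmentPlugin]:
    """Decorator that adds a plug-in to the (hypothetical) registry."""
    def decorator(func: FragmentPlugin) -> FragmentPlugin:
        PLUGINS[kind] = func
        return func
    return decorator


@register("conda-forge")
def conda_forge_fragment(packages: Sequence[str]) -> List[str]:
    return ["RUN conda install --yes --channel conda-forge " + " ".join(packages)]


@register("pip")
def pip_fragment(packages: Sequence[str]) -> List[str]:
    return ["RUN pip install --no-cache-dir " + " ".join(packages)]


def render(base_image: str, selections: Mapping[str, Sequence[str]]) -> str:
    """Concatenate plug-in fragments into one flat Dockerfile."""
    lines = [f"FROM {base_image}"]
    for kind, packages in selections.items():
        if kind not in PLUGINS:
            raise KeyError(f"no plug-in registered for {kind!r}")
        if packages:  # skip plug-ins with nothing selected
            lines.extend(PLUGINS[kind](packages))
    return "\n".join(lines) + "\n"


text = render(
    "jupyter/base-notebook",
    {"conda-forge": ["numpy"], "pip": ["requests"]},
)
print(text)
```

A contributor adding, say, a new kernel would only need to register one more function. Ordering between fragments and shared setup steps are left open here and are likely where the real difficulty lies.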
I'm glad that you played with this idea and shared it. Sorry to hear there is not time to solve it.
Maybe we should reach out to other people who already build Docker stacks that use conda-forge and see if there is a way to plug them in here somehow. Fully acknowledging merging Docker images is a tricky (though much better) proposition today.
cc-ing @bgruening (in case this is of interest or you have suggestions)
@jakirkham there's some discussion in #517. Current thinking is better documentation about how to build your own stack (e.g., what to do on Docker Cloud or elsewhere) and a cookiecutter-jupyter-docker project of some sort to help people get started.
Thanks Peter. Have been reading along in there as well. Oddly enough that is what made me think about this solution for a bit. :)
Had cc'd Björn as he and others have been involved in a really massive container building effort. So this would be a useful thing to consider as we determine the future of the stacks.
Hi all.
Indeed it seems that we have some solutions for you.
Try out this one: http://biocontainers.pro/multi-package-containers/
Create containers with simple PRs like this one: https://github.com/BioContainers/multi-package-containers/pull/427
What happens underneath is documented here a little bit:
https://docs.galaxyproject.org/en/master/admin/mulled_containers.html#automatic-build-of-linux-containers
Docker/rkt containers are here: https://quay.io/organization/biocontainers
Singularity containers are here: https://depot.galaxyproject.org/singularity/
@bgruening Thanks for sharing biocontainers. That's very cool, definitely in line with what I was thinking, and you've already done it. 👏
How do you find maintenance of the build process for all the user submissions? Is it pretty much hands off or do you find yourselves needing to restart failed jobs, fix version conflicts, etc. as the set grows?