Docker-stacks: Spark, PySpark, tensorflow and tensorflowonspark image

Created on 20 Jan 2019 · 5 Comments · Source: jupyter/docker-stacks

Is there an all-in-one image?

I'm looking to use Jupyter with Spark, Python and TensorFlow. The images provided offer little description.

I'm trying to include it myself, here's what I've got so far:

FROM jupyter/pyspark-notebook:latest

USER root

# Refresh the package lists and upgrade in a single layer
RUN apt-get update -y && apt-get upgrade -y

# Install TensorFlow stuff (-y keeps conda from waiting on a prompt during the build)
RUN conda install -y -c conda-forge tensorflow
RUN pip install tensorflowonspark

# Install Spark Cassandra Connector

# sudo is not needed here because the build already runs as root
RUN echo "deb https://dl.bintray.com/sbt/debian /" | tee -a /etc/apt/sources.list.d/sbt.list
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
RUN apt-get install apt-transport-https -y
RUN apt-get update -y && apt-get install sbt -y

RUN git clone https://github.com/datastax/spark-cassandra-connector.git
# Each RUN starts a fresh shell, so a bare "RUN cd" has no effect; WORKDIR persists
WORKDIR spark-cassandra-connector
RUN git checkout v2.4.0
RUN sbt assembly -Dscala-2.11=true

# Copy every jar produced by the build into Spark's jars directory
RUN find . -iname "*.jar" -type f -exec /bin/cp {} /srv/spark/jars/ \;
Community Stack Recipe Question

All 5 comments

Hi @Extarys. The documentation about what images are available and their general contents can be found at https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#. To save you a click, I'll point out that there is not an image that contains both Spark and TensorFlow.

If you figure out a Dockerfile that starts from jupyter/all-spark-notebook and adds TensorFlow or starts from jupyter/tensorflow-notebook and adds Spark, it would make a good addition to the Recipes page in the docs.
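As a rough starting point for the first option, something like the following might work. It is only a sketch: it assumes the base image's default notebook user can write to the image's Python environment (so no USER change is needed), that the tensorflow and tensorflowonspark packages on PyPI are compatible with the Python and Spark versions shipped in the base image, and it pins nothing. It has not been validated as a recipe.

```dockerfile
# Sketch only: start from the Spark image named in this thread
FROM jupyter/all-spark-notebook:latest

# Add TensorFlow and TensorFlowOnSpark with pip.
# Assumption: the image's default user may install into its Python environment.
# Versions are left unpinned here; pin them for reproducible builds.
RUN pip install --no-cache-dir tensorflow tensorflowonspark
```

Building it with docker build and then importing both pyspark and tensorflow in a notebook would be a quick way to check whether the combination actually works.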

Alternatively, you could go the route of creating a new Community Stack and linking it in the docs. We have one so far, and another is on its way right now in #800.

Finally, you could use https://github.com/jupyter/repo2docker to build the Docker image containing what you want without dealing with nuances of writing a good Dockerfile.

Thanks @parente, this is well explained. I'll look into it!

Hey @parente, it does make sense to have this as a recipe at a very minimum. I have one of these working with pyspark and tensorflow. Would it not make sense, though, to have one official image that contains as close to everything as possible? It may make things more convenient for people to be able to pull one image that contains everything. I can help with this if it would be approved.

@mkirch https://github.com/jupyter/docker-stacks/issues/517 gives some of the reasoning behind why we're shying away from adding more images here and instead encouraging the creation of images in new repos which we can link to from here. https://github.com/jupyter/docker-stacks/issues/799 is a recent conversation along the same lines.

This issue has been idle for a while now so I'm going to close it out.

If someone would like to contribute a recipe for building an "everything but the kitchen sink" image to the recipes page or maintain such an image in a separate repo with a link to it from the community images documentation here, I'll happily review the PR.
