Pytorch_geometric: docker for CUDA 10

Created on 20 Jul 2019 · 4 Comments · Source: rusty1s/pytorch_geometric

🚀 Feature

Hi,
Apparently RTX 2080s have trouble with CUDA 9, which is what the current Dockerfile installs. Would it be possible to have a version of the Dockerfile for CUDA 10? I tried for several hours myself, starting FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime instead of ubuntu and commenting out the CUDA 9.0-specific steps, but haven't managed to make it work.

In case it helps, the following Dockerfile

FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime
# PyTorch (Geometric) installation
# RUN rm /etc/apt/sources.list.d/cuda.list && \
#     rm /etc/apt/sources.list.d/nvidia-ml.list 

RUN apt-get update &&  apt-get install -y \
    curl \
    ca-certificates \
    vim \
    sudo \
    git \
    bzip2 \
    libx11-6 \
 && rm -rf /var/lib/apt/lists/*

# Create a working directory.
RUN mkdir /app
WORKDIR /app

# Create a non-root user and switch to it.
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
 && chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user

# All users can use /home/user as their home directory.
ENV HOME=/home/user
RUN chmod 777 /home/user

# Install Miniconda.
RUN curl -so ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-4.5.12-Linux-x86_64.sh \
 && chmod +x ~/miniconda.sh \
 && ~/miniconda.sh -b -p ~/miniconda \
 && rm ~/miniconda.sh
ENV PATH=/home/user/miniconda/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

# Create a Python 3.6 environment.
RUN /home/user/miniconda/bin/conda install conda-build \
 && /home/user/miniconda/bin/conda create -y --name py36 python=3.6.5 \
 && /home/user/miniconda/bin/conda clean -ya
ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/home/user/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH

# CUDA 9.0-specific steps.
RUN conda install -c pytorch pytorch
RUN conda install -c fragcolor cuda10.0 && conda clean -ya
# RUN conda install -y -c pytorch \
#     cuda90=1.0 \
#     magma-cuda90=2.4.0 \
#     "pytorch=1.1.0=py3.6_cuda9.0.176_cudnn7.5.1_0" \
#     torchvision=0.2.1 \
#  && conda clean -ya

# Install HDF5 Python bindings.
RUN conda install -y h5py=2.8.0 \
 && conda clean -ya
RUN pip install h5py-cache==1.0

# Install TorchNet, a high-level framework for PyTorch.
# RUN pip install torchnet==0.0.4

# Install Requests, a Python library for making HTTP requests.
RUN conda install -y requests=2.19.1 \
 && conda clean -ya

# Install Graphviz.
# RUN conda install -y graphviz=2.38.0 \
#  && conda clean -ya
# RUN pip install graphviz==0.8.4

# Install OpenCV3 Python bindings.
RUN sudo apt-get update && sudo apt-get install -y --no-install-recommends \
    libgtk2.0-0 \
    libcanberra-gtk-module \
 && sudo rm -rf /var/lib/apt/lists/*
RUN conda install -y -c menpo opencv3=3.1.0 \
 && conda clean -ya

# Install PyTorch Geometric.
RUN CPATH=/usr/local/cuda/include:$CPATH \
 && LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH \
 && DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH

RUN pip install --verbose --no-cache-dir torch-scatter \
 && pip install --verbose --no-cache-dir torch-sparse \
 && pip install --verbose --no-cache-dir torch-cluster \
 && pip install --verbose --no-cache-dir torch-spline-conv \
 && pip install torch-geometric

# Set the default command to python3.
CMD ["python3"]
#######
# Mine #
#######
RUN pip --no-cache-dir install -U tensorboardX \
h5py \
matplotlib \
ipdb \
scipy \
tqdm

COPY *.py /code/

builds successfully and import torch works, but

import torch_geometric

leads to

ModuleNotFoundError: No module named 'torch_scatter.scatter_cuda'
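
One plausible explanation, offered as an assumption rather than something confirmed in this thread: a -runtime base image typically ships the CUDA runtime libraries but not the nvcc compiler, so pip may build torch-scatter without its CUDA extension. In addition, the CPATH and LD_LIBRARY_PATH assignments in the Dockerfile above sit inside a RUN step, so they do not persist into later steps. The following sketch, meant to be run inside the container, gathers the relevant facts; the helper names module_available and diagnose are hypothetical.

```python
import importlib.util
import os
import shutil


def module_available(name):
    """Return True if `name` can be located, without raising on failure."""
    try:
        return importlib.util.find_spec(name) is not None
    except Exception:
        # find_spec imports parent packages, which can fail in several ways
        # (missing package, missing shared library, and so on).
        return False


def diagnose(env=None):
    """Collect facts relevant to building CUDA extensions at install time."""
    env = os.environ if env is None else env
    return {
        "torch_scatter": module_available("torch_scatter"),
        "torch_scatter.scatter_cuda": module_available("torch_scatter.scatter_cuda"),
        # Path to nvcc if it is on PATH, otherwise None.
        "nvcc_on_path": shutil.which("nvcc", path=env.get("PATH")),
        "CUDA_HOME": env.get("CUDA_HOME"),
        "CPATH": env.get("CPATH"),
    }


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(f"{key}: {value}")
```

If torch_scatter is found but torch_scatter.scatter_cuda is not, and nvcc_on_path is None, that would be consistent with a CPU-only build of the extension.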

Most helpful comment

# CUDA 10

ARG CUDA="10.0"
ARG CUDNN="7"

FROM nvidia/cuda:${CUDA}-cudnn${CUDNN}-devel-ubuntu16.04

RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

# Install basics

RUN apt-get update -y \
&& apt-get install -y apt-utils git curl ca-certificates bzip2 cmake tree htop bmon iotop g++ \
&& apt-get install -y libglib2.0-0 libsm6 libxext6 libxrender-dev

# Install Miniconda

RUN curl -so /miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& chmod +x /miniconda.sh \
&& /miniconda.sh -b -p /miniconda \
&& rm /miniconda.sh

ENV PATH=/miniconda/bin:$PATH

# Create a Python 3.6 environment

RUN /miniconda/bin/conda install -y conda-build \
&& /miniconda/bin/conda create -y --name py36 python=3.6.8 \
&& /miniconda/bin/conda clean -ya

ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

RUN conda install -y ipython
RUN pip install ninja yacs cython numpy matplotlib opencv-python tqdm pyyaml tensorboardX

# Install PyTorch

RUN conda install pytorch torchvision cudatoolkit=10.0 -c pytorch \
&& conda clean -ya

# Set CUDA paths

ENV PATH=/usr/local/cuda/bin:$PATH
ENV CPATH=/usr/local/cuda/include:$CPATH
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
ENV DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH

RUN pip install --verbose --no-cache-dir torch-scatter
RUN pip install --verbose --no-cache-dir torch-sparse
RUN pip install --verbose --no-cache-dir torch-cluster
RUN pip install --verbose --no-cache-dir torch-spline-conv
RUN pip install torch-geometric
WORKDIR /meshvertex

I use this Dockerfile and start the container with:

sudo docker run --rm -it --runtime=nvidia --device=/dev/video0 --shm-size 16G -e DISPLAY=unix$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /home/cery/workspace/MeshVertexNet/:/meshvertex  jsbluecat/pytorch:meshvertex bash

Maybe this helps.
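
This Dockerfile starts from a -devel CUDA image, which includes nvcc, and installs PyTorch with cudatoolkit=10.0. A reasonable sanity check inside the running container is that the nvcc release and the CUDA version PyTorch was built against agree. The sketch below assumes that a mismatch between the two is a likely source of extension build problems; the function names are hypothetical, and it avoids Python 3.7+ features so it runs in the py36 environment.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract 'major.minor' from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def nvcc_release():
    """Return the local nvcc release string, or None if nvcc is unavailable."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """Return torch.version.cuda, or None if torch is missing or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def versions_match(nvcc_version, torch_version):
    """True only when both versions are known and share major.minor."""
    if nvcc_version is None or torch_version is None:
        return False
    return nvcc_version.split(".")[:2] == torch_version.split(".")[:2]


if __name__ == "__main__":
    nvcc_v, torch_v = nvcc_release(), torch_cuda_version()
    print(f"nvcc: {nvcc_v}, torch CUDA: {torch_v}, match: {versions_match(nvcc_v, torch_v)}")
```

Comparing only major.minor keeps the check tolerant of version strings that carry a patch component.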

All 4 comments

This is a good request, we should definitely support this. Sadly, I am no docker expert. @shengwenLeong, could you look into this?

Edit: The current dockerfile is heavily inspired by the dockerfiles from NVIDIA and PyTorch (see here) for which CUDA 10 versions are provided. IMO, it shouldn't be too hard to convert our dockerfile to CUDA 10 based on those.

Took me a while, but I think I managed to do it. Will use the docker for a bit to check everything is ok and then do a pull request.
The only thing is that I had to comment out the Install Graphviz portion, but that also happened for the CUDA9 version and to a friend of mine trying to use the original Dockerfile. Unless we figure out there's something wrong with it, I'll keep it in the CUDA10 version for consistency.

Any update on this @FerranAlet ?
