Hi,
Apparently RTX 2080s have trouble with CUDA 9, which is what the current Dockerfile installs. Would it be possible to have a version of the Dockerfile for CUDA 10? I spent several hours trying this myself, starting FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime instead of ubuntu and commenting out the CUDA 9.0-specific steps, but I haven't managed to make it work.
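For anyone unsure whether their card is affected, here is a minimal sketch of a check. It assumes that Turing cards such as the 2080 report compute capability 7.5 and that CUDA 10.0 is the first toolkit to target that capability natively, while CUDA 9.x stops at Volta (7.0). The helper names needs_cuda10 and describe_local_gpu are made up for this example.

```python
# Assumption: compute capability 7.5 (Turing) is first targeted natively by
# CUDA 10.0; CUDA 9.x only goes up to Volta (7.0).
TURING = (7, 5)


def needs_cuda10(capability):
    """Return True if a (major, minor) capability needs at least CUDA 10.0."""
    return tuple(capability) >= TURING


def describe_local_gpu():
    """Describe GPU 0, or return None if torch or a CUDA device is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    cap = torch.cuda.get_device_capability(0)
    return {
        "name": torch.cuda.get_device_name(0),
        "capability": cap,
        "needs_cuda10": needs_cuda10(cap),
    }


if __name__ == "__main__":
    print(describe_local_gpu())
```

Running it on the host (not in the container) is enough, as long as a CUDA-enabled torch is installed there.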
In case it helps, the following Dockerfile
FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime

# PyTorch (Geometric) installation
# RUN rm /etc/apt/sources.list.d/cuda.list && \
#     rm /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates \
    vim \
    sudo \
    git \
    bzip2 \
    libx11-6 \
 && rm -rf /var/lib/apt/lists/*

# Create a working directory.
RUN mkdir /app
WORKDIR /app

# Create a non-root user and switch to it.
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
 && chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user

# All users can use /home/user as their home directory.
ENV HOME=/home/user
RUN chmod 777 /home/user

# Install Miniconda.
RUN curl -so ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-4.5.12-Linux-x86_64.sh \
 && chmod +x ~/miniconda.sh \
 && ~/miniconda.sh -b -p ~/miniconda \
 && rm ~/miniconda.sh
ENV PATH=/home/user/miniconda/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

# Create a Python 3.6 environment.
RUN /home/user/miniconda/bin/conda install conda-build \
 && /home/user/miniconda/bin/conda create -y --name py36 python=3.6.5 \
 && /home/user/miniconda/bin/conda clean -ya
ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/home/user/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH

# CUDA 9.0-specific steps.
RUN conda install -c pytorch pytorch
RUN conda install -c fragcolor cuda10.0 && conda clean -ya
# RUN conda install -y -c pytorch \
#     cuda90=1.0 \
#     magma-cuda90=2.4.0 \
#     "pytorch=1.1.0=py3.6_cuda9.0.176_cudnn7.5.1_0" \
#     torchvision=0.2.1 \
#  && conda clean -ya

# Install HDF5 Python bindings.
RUN conda install -y h5py=2.8.0 \
 && conda clean -ya
RUN pip install h5py-cache==1.0

# Install TorchNet, a high-level framework for PyTorch.
# RUN pip install torchnet==0.0.4

# Install Requests, a Python library for making HTTP requests.
RUN conda install -y requests=2.19.1 \
 && conda clean -ya

# Install Graphviz.
# RUN conda install -y graphviz=2.38.0 \
#  && conda clean -ya
# RUN pip install graphviz==0.8.4

# Install OpenCV3 Python bindings.
RUN sudo apt-get update && sudo apt-get install -y --no-install-recommends \
    libgtk2.0-0 \
    libcanberra-gtk-module \
 && sudo rm -rf /var/lib/apt/lists/*
RUN conda install -y -c menpo opencv3=3.1.0 \
 && conda clean -ya

# Install PyTorch Geometric.
RUN CPATH=/usr/local/cuda/include:$CPATH \
 && LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH \
 && DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
RUN pip install --verbose --no-cache-dir torch-scatter \
 && pip install --verbose --no-cache-dir torch-sparse \
 && pip install --verbose --no-cache-dir torch-cluster \
 && pip install --verbose --no-cache-dir torch-spline-conv \
 && pip install torch-geometric

# Set the default command to python3.
CMD ["python3"]

#######
# Mine #
#######
RUN pip --no-cache-dir install -U tensorboardX \
    h5py \
    matplotlib \
    ipdb \
    scipy \
    tqdm
COPY *.py /code/
lets the image build, and import torch works, but
import torch_geometric
leads to
ModuleNotFoundError: No module named 'torch_scatter.scatter_cuda'
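A possible explanation, offered as a guess rather than a diagnosis: the error suggests torch-scatter was built without its CUDA extension. Two things in the Dockerfile above could contribute. The RUN step that assigns CPATH and LD_LIBRARY_PATH only sets shell variables for that single build step, so they are gone by the time pip runs. And, as far as I know, the -runtime image tags usually ship without nvcc. The sketch below collects the relevant facts from inside the container. The function name cuda_build_report is made up for this example, and it only reports what it finds.

```python
import importlib.util
import os
import shutil


def cuda_build_report():
    """Collect facts that affect whether CUDA extensions can be compiled."""
    report = {
        # Path to the CUDA compiler, or None if it is not on PATH.
        "nvcc_on_path": shutil.which("nvcc"),
        # Environment variable commonly used to locate the toolkit.
        "cuda_home_env": os.environ.get("CUDA_HOME"),
        # Default toolkit location used in the Dockerfile above.
        "usr_local_cuda_exists": os.path.isdir("/usr/local/cuda"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_cuda_version": None,
        "torch_cuda_available": None,
    }
    if report["torch_installed"]:
        try:
            import torch
            report["torch_cuda_version"] = torch.version.cuda
            report["torch_cuda_available"] = torch.cuda.is_available()
        except Exception as exc:  # broken install: record it, do not crash
            report["torch_import_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_build_report().items():
        print(f"{key}: {value}")
```

If nvcc_on_path comes back as None, switching to an image that includes the CUDA compiler and setting the paths with ENV instead of RUN would be the first things to try.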
This is a good request, and we should definitely support this. Sadly, I am no Docker expert. @shengwenLeong, could you look into this?
Edit: The current Dockerfile is heavily inspired by the Dockerfiles from NVIDIA and PyTorch (see here), for which CUDA 10 versions are provided. IMO, it shouldn't be too hard to convert our Dockerfile to CUDA 10 based on those.
Took me a while, but I think I managed to do it. I'll use the image for a bit to check that everything is OK and then open a pull request.
The only thing is that I had to comment out the Install Graphviz portion, but that also happened with the CUDA 9 version, and to a friend of mine trying to use the original Dockerfile. Unless we figure out there's something wrong with it, I'll keep it in the CUDA 10 version for consistency.
ARG CUDA="10.0"
ARG CUDNN="7"
FROM nvidia/cuda:${CUDA}-cudnn${CUDNN}-devel-ubuntu16.04

RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

# install basics
RUN apt-get update -y \
 && apt-get install -y apt-utils git curl ca-certificates bzip2 cmake tree htop bmon iotop g++ \
 && apt-get install -y libglib2.0-0 libsm6 libxext6 libxrender-dev

# Install Miniconda
RUN curl -so /miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
 && chmod +x /miniconda.sh \
 && /miniconda.sh -b -p /miniconda \
 && rm /miniconda.sh
ENV PATH=/miniconda/bin:$PATH

# Create a Python 3.6 environment
RUN /miniconda/bin/conda install -y conda-build \
 && /miniconda/bin/conda create -y --name py36 python=3.6.8 \
 && /miniconda/bin/conda clean -ya
ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

RUN conda install -y ipython
RUN pip install ninja yacs cython numpy matplotlib opencv-python tqdm pyyaml tensorboardX

# install pytorch
RUN conda install pytorch torchvision cudatoolkit=10.0 -c pytorch \
 && conda clean -ya

# set cuda path
ENV PATH=/usr/local/cuda/bin:$PATH
ENV CPATH=/usr/local/cuda/include:$CPATH
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
ENV DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH

RUN pip install --verbose --no-cache-dir torch-scatter
RUN pip install --verbose --no-cache-dir torch-sparse
RUN pip install --verbose --no-cache-dir torch-cluster
RUN pip install --verbose --no-cache-dir torch-spline-conv
RUN pip install torch-geometric

WORKDIR /meshvertex
I'm using this Dockerfile, and I run the image with:
sudo docker run --rm -it --runtime=nvidia --device=/dev/video0 --shm-size 16G -e DISPLAY=unix$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /home/cery/workspace/MeshVertexNet/:/meshvertex jsbluecat/pytorch:meshvertex bash
Maybe this helps.
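To confirm that an image built this way avoids the ModuleNotFoundError reported above, a small smoke test can be run inside the container. This is only a sketch: the function name check_imports is made up for this example, and the module names are the usual import names of the packages installed by the pip steps in the Dockerfile.

```python
import importlib

# Import names corresponding to the pip packages installed in the Dockerfile.
EXTENSIONS = [
    "torch",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
    "torch_spline_conv",
    "torch_geometric",
]


def check_imports(names):
    """Try to import each module; map name -> None on success, else the error."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # a failed CUDA extension also surfaces here
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in check_imports(EXTENSIONS).items():
        print(f"{name:20s} {'OK' if error is None else error}")
```

A successful import only shows that the compiled extensions load. It does not exercise any GPU kernels, so a short training run is still worth doing before relying on the image.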
Any update on this @FerranAlet ?