Hi, I'm having trouble with this image.
The imgDecoder "Mixed" operation segfaults on this image.
(The "CPU" operation works fine.)
https://drive.google.com/file/d/1dLmS995EMnPo-ul5j7XEZr58SvGxHYGq/view?usp=sharing
For now, I have simply removed this sample from my data.
FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
WORKDIR /root
# install python3.8
RUN apt-get -qq update && \
apt-get -qq install liblzma-dev && \
apt-get -qq install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get -qq update && \
apt-get -qq install -y build-essential python3.8 python3.8-dev python3-pip && \
apt-get -qq install -y git && \
python3.8 -m pip install pip --upgrade && \
python3.8 -m pip install wheel
RUN ln -s `which python3.8` /usr/bin/python && \
rm /usr/bin/python3-config && \
ln -s `which python3.8-config` /usr/bin/python3-config
RUN apt-get -qq install wget curl vim
RUN apt-get -qq install autoconf automake libtool make g++ unzip
RUN apt-get -qq install libprotobuf* protobuf-compiler ninja-build
RUN apt-get -qq install -y libsm6 libxext6 libxrender-dev
############################
##### install packages #####
############################
# general
RUN pip install -U six wheel setuptools
RUN pip install ninja Polygon3 numpy cython fire h5py imagesize jupyter notebook jupyterlab lmdb pylint opencv-contrib-python shapely termcolor tqdm nose scipy requests lmdb munch pyyaml overrides
RUN pip install pillow scikit-learn scikit-image pandas imagesize
# restrict-version general
RUN pip install pyarrow==0.17.1 numba==0.49.1 nptyping==1.1.0
# pytorch [1.5.0]
RUN pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# detectron [v0.1.3]
RUN git clone -b v0.1.3 https://github.com/facebookresearch/detectron2.git
RUN pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
RUN python -m pip install -e detectron2
# apex [0.1]
RUN git clone https://github.com/NVIDIA/apex.git
WORKDIR /root/apex
RUN python setup.py install --cpp_ext --cuda_ext
WORKDIR /root
# nvidia-dali [0.21.0]
RUN wget https://developer.download.nvidia.com/compute/redist/nvidia-dali/ffmpeg-4.2.1.tar.bz2
RUN wget https://developer.download.nvidia.com/compute/redist/nvidia-dali/libsndfile-1.0.28.tar.gz
RUN tar -xzf libsndfile-1.0.28.tar.gz
RUN tar -xf ffmpeg-4.2.1.tar.bz2
WORKDIR /root/libsndfile-1.0.28
RUN ./configure && make && make install
WORKDIR /root/ffmpeg-4.2.1
RUN apt-get -qq install yasm
RUN ./configure \
--prefix=/usr/local \
--disable-static \
--disable-all \
--disable-autodetect \
--disable-iconv \
--enable-shared \
--enable-avformat \
--enable-avcodec \
--enable-avfilter \
--enable-protocol=file \
--enable-demuxer=mov,matroska,avi \
--enable-bsf=h264_mp4toannexb,hevc_mp4toannexb,mpeg4_unpack_bframes && \
make && make install
WORKDIR /root
RUN wget https://developer.download.nvidia.com/compute/redist/cuda/10.0/nvidia-dali/nvidia_dali-0.21.0-1239036-cp38-cp38-manylinux1_x86_64.whl
RUN pip install nvidia_dali-0.21.0-1239036-cp38-cp38-manylinux1_x86_64.whl
# lanms (EAST text detector - locality aware nms)
RUN pip install lanms
Regards,
Anthony
Hi,
I managed to reproduce this problem with the CUDA 10 build, but the good news is that it works fine with the CUDA 11 build. So it seems the bug you encountered has been fixed in CUDA 11.
My recommendation is to use DALI for CUDA 11 when possible (support started with the 0.22 release). You need to update the GPU driver to 450.x, but you can keep using your Dockerfile, since DALI has no runtime dependency on the CUDA version installed in the image.
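As a rough sketch of the driver requirement mentioned above, the helper below checks whether a driver version string has a major version of at least 450. The function name driver_ok_for_cuda11_dali is hypothetical and is not part of DALI or the NVIDIA tooling; it only encodes the "450.x" threshold from this comment.

```shell
# Hypothetical helper (not part of DALI): succeeds when the given driver
# version string has a major version of at least 450, the threshold quoted
# in this thread for the CUDA 11 build of DALI.
driver_ok_for_cuda11_dali() {
    major="${1%%.*}"              # "450.51.06" -> "450"
    case "$major" in
        ''|*[!0-9]*) return 2 ;;  # empty or non-numeric input
    esac
    [ "$major" -ge 450 ]
}
```

On the host, the version string could be obtained with `nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1` and passed to the function as its first argument.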
Thanks for the reply.
Unfortunately, I have to use CUDA 10.1. Some essential packages depend on it.
Hi,
As I said, you can still use CUDA 10.1 for everything except the DALI package. So what you need to do is:
Oh! I misunderstood your comment.
After updating the host driver and DALI, everything works fine.
Thank you very much.