When trying to create an NDArray with the GPU context, Python appears to hang.
To reproduce, just create an array:
import mxnet as mx
a = mx.nd.ones((2, 3), mx.gpu())
Then nothing happened, as if it had hung: Python kept running in the background but produced no output. Without the GPU context, the call returned immediately.
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
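A hang like this can be checked without blocking the terminal by running the snippet in a child interpreter with a timeout. The following is only a minimal sketch: the helper name run_with_timeout and the 60-second limit are assumptions for illustration, not part of MXNet.

```python
import subprocess
import sys

# Hypothetical helper (not part of MXNet): run a snippet in a child
# interpreter so that a hang cannot block the calling shell.
def run_with_timeout(snippet, timeout_s=60.0):
    """Return 'ok', 'failed', or 'timeout' for the given Python snippet."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after exceeding the time limit.
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"

# The two variants from the report: same array, different context.
CPU_SNIPPET = "import mxnet as mx; a = mx.nd.ones((2, 3), mx.cpu()); print(a.asnumpy())"
GPU_SNIPPET = "import mxnet as mx; a = mx.nd.ones((2, 3), mx.gpu()); print(a.asnumpy())"

if __name__ == "__main__":
    for name, snippet in (("cpu", CPU_SNIPPET), ("gpu", GPU_SNIPPET)):
        print(name, run_with_timeout(snippet))
```

A "timeout" result for the GPU snippet together with "ok" for the CPU snippet would match the behaviour described above.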
curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
----------Python Info----------
Version : 3.7.8
Compiler : GCC 5.4.0 20160609
Build : ('default', 'Jul 28 2020 15:58:46')
Arch : ('64bit', 'ELF')
------------Pip Info-----------
Version : 20.2.1
Directory : /home/bernard/opt/python37/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version : 1.7.0
Directory : /home/bernard/opt/python37/lib/python3.7/site-packages/mxnet
This is all I have, since the script apparently takes forever to run; it may be hanging too.
System: Ubuntu Linux
GPU: Nvidia GTX 1070
CUDA 10.0, libcudnn 7.6
cmake 3.17
MXNet built from source as follows:
source /opt/intel/bin/compilervars.sh intel64
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet && git checkout -b v1.7 origin/v1.7.x
git submodule sync
git submodule update --init --recursive
make -j8 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 CUDA_ARCH=-gencode=arch=compute_61,code=sm_61 USE_JEMALLOC=1
P.S. Disabling CUDNN didn't fix it.
Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.
@beew hi Bernard. Sorry to hear that you are facing this issue. Looks like you have the right CUDA_ARCH for your GPU card. Could you try a couple of things to rule out factors such as GPU, compiler, and dependency library? I'd like to know:
pip3 install --pre 'mxnet-cu100<2' -f https://dist.mxnet.io/python
source /opt/intel/bin/compilervars.sh intel64
For the latter two, try varying only that option to rule out these specific causes.
@szha
Hi,
1. Without source /opt/intel/bin/compilervars.sh intel64, building with OpenBLAS instead of MKL (since the MKL root and headers cannot be found):
the Python GPU examples work (from https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html).
2. With source /opt/intel/bin/compilervars.sh intel64 and building with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. With it turned off, I got an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages".
So it seems that MKL somehow doesn't play well with CUDA...
I haven't tried the nightly build, as it seems that the prebuilt MXNet Python wheels only support CUDA > 10.0 after version 1.5.1. Instead I tried pip install mxnet-cu100mkl==1.5.1 and was able to create the MXNet array in the GPU context with no problem.
P.S. Except for the prebuilt MXNet wheels, all other tests were done with MXNet 1.7 built from source ("git checkout -b v1.7 origin/v1.7.x").
@beew thanks for testing these settings, it's really helpful. The mxnet-cu100mkl package doesn't use MKL as its BLAS; it only includes MKL-DNN. The BLAS library used for that build is OpenBLAS.
cc @PatricZhao whose team likely has the knowledge to find out why MKL build doesn't work in GPU builds.
That sounds weird. We will investigate the issue.
Hi @beew,
I have tried to reproduce the issue locally on similar hardware (GeForce GTX 1060 6GB, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz) and software (Ubuntu 18.04, kernel 4.15.0-88, Nvidia driver 410.48, CUDA 10.0.130), but unfortunately I couldn't (the test works for me).
Could you paste the MXNet SHA and MKL version, or (better) attach the info.txt file produced by the following script (run from the MXNet home directory, without sourcing /opt/intel/bin/compilervars.sh beforehand)?
INFO_FILE=info.txt
git rev-parse --verify HEAD >${INFO_FILE}
env >>${INFO_FILE}
echo -------------------------------- >>${INFO_FILE}
source /opt/intel/bin/compilervars.sh intel64
env >>${INFO_FILE}
nvidia-smi -L >>${INFO_FILE}
nvidia-smi >>${INFO_FILE}
cat /usr/local/cuda/version.txt >>${INFO_FILE}
@anko-intel
Here it is
3143aabb60038b555db2960d712fb2806b16d581
XDG_VTNR=7
XDG_SESSION_ID=c2
CLUTTER_IM_MODULE=xim
XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
SHELL=/bin/bash
VTE_VERSION=4205
TERM=xterm-256color
MKL_THREADING_LAYER=TBB
QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
WINDOWID=123731978
GNOME_KEYRING_CONTROL=
UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
GTK_MODULES=gail:atk-bridge:unity-gtk-module
PYTHONUSERBASE=/home/bernard/opt/python37
USER=bernard
LD_LIBRARY_PATH=/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
QT_ACCESSIBILITY=1
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
CPATH=/usr/local/include/petsc::/home/bernard/opt/opencv/include
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
PATH=/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
DESKTOP_SESSION=ubuntu
QT_QPA_PLATFORMTHEME=appmenu-qt5
QT_IM_MODULE=ibus
JOB=gnome-session
PWD=/home/bernard/opt/incubator-mxnet
XDG_SESSION_TYPE=x11
XMODIFIERS=@im=ibus
LANG=en_CA.UTF-8
GNOME_KEYRING_PID=
MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
GDM_LANG=en_CA
IM_CONFIG_PHASE=1
COMPIZ_CONFIG_PROFILE=ubuntu
GDMSESSION=ubuntu
GTK2_MODULES=overlay-scrollbar
SESSIONTYPE=gnome-session
XDG_SEAT=seat0
HOME=/home/bernard/opt/python37
SHLVL=2
LANGUAGE=en_CA:en
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
UPSTART_INSTANCE=
SLEPC_DIR=/home/bernard/opt/slepc
LOGNAME=bernard
XDG_SESSION_DESKTOP=ubuntu
UPSTART_EVENTS=started starting
PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
PREFIX=/home/bernard/opt/python37
QT4_IM_MODULE=xim
XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
PKG_CONFIG_PATH=:/home/bernard/opt/opencv/lib/pkgconfig
LESSOPEN=| /usr/bin/lesspipe %s
UPSTART_JOB=unity-settings-daemon
INSTANCE=Unity
DISPLAY=:0
XDG_RUNTIME_DIR=/run/user/1000
GTK_IM_MODULE=ibus
XDG_CURRENT_DESKTOP=Unity
PETSC_DIR=/home/bernard/opt/petsc
LESSCLOSE=/usr/bin/lesspipe %s %s
XAUTHORITY=/home/bernard/.Xauthority
_=/usr/bin/env
--------------------------------
MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
XDG_VTNR=7
MANPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/man:/home/bernard/opt/python37/man:/opt/openmpi-cuda/share/man:/home/bernard/.local/share/man:/usr/local/man:/usr/local/share/man:/usr/share/man
XDG_SESSION_ID=c2
CLUTTER_IM_MODULE=xim
XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2018.2.199/linux/licenses:/opt/intel/licenses:/home/bernard/opt/python37/intel/licenses
IPPROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp
GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
SHELL=/bin/bash
VTE_VERSION=4205
TERM=xterm-256color
MKL_THREADING_LAYER=TBB
LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4
QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
WINDOWID=123731978
GNOME_KEYRING_CONTROL=
UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
GTK_MODULES=gail:atk-bridge:unity-gtk-module
PYTHONUSERBASE=/home/bernard/opt/python37
USER=bernard
LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
QT_ACCESSIBILITY=1
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
PSTLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl
XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
CPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/include:/usr/local/include/petsc::/home/bernard/opt/opencv/include
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
NLSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/bin/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin:/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
DESKTOP_SESSION=ubuntu
QT_QPA_PLATFORMTHEME=appmenu-qt5
QT_IM_MODULE=ibus
TBBROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb
JOB=gnome-session
PWD=/home/bernard/opt/incubator-mxnet
XDG_SESSION_TYPE=x11
XMODIFIERS=@im=ibus
LANG=en_CA.UTF-8
GNOME_KEYRING_PID=
MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
GDM_LANG=en_CA
IM_CONFIG_PHASE=1
COMPIZ_CONFIG_PROFILE=ubuntu
DAALROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/daal
GDMSESSION=ubuntu
GTK2_MODULES=overlay-scrollbar
SESSIONTYPE=gnome-session
XDG_SEAT=seat0
HOME=/home/bernard/opt/python37
SHLVL=2
LANGUAGE=en_CA:en
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
UPSTART_INSTANCE=
SLEPC_DIR=/home/bernard/opt/slepc
LOGNAME=bernard
XDG_SESSION_DESKTOP=ubuntu
UPSTART_EVENTS=started starting
PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
PREFIX=/home/bernard/opt/python37
CLASSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/daal.jar
QT4_IM_MODULE=xim
XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
PKG_CONFIG_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/bin/pkgconfig::/home/bernard/opt/opencv/lib/pkgconfig
LESSOPEN=| /usr/bin/lesspipe %s
UPSTART_JOB=unity-settings-daemon
INSTANCE=Unity
DISPLAY=:0
XDG_RUNTIME_DIR=/run/user/1000
GTK_IM_MODULE=ibus
XDG_CURRENT_DESKTOP=Unity
PETSC_DIR=/home/bernard/opt/petsc
LESSCLOSE=/usr/bin/lesspipe %s %s
I_MPI_ROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi
XAUTHORITY=/home/bernard/.Xauthority
_=/usr/bin/env
GPU 0: GeForce GTX 1070 (UUID: GPU-98bf2a5b-636b-5cbe-cc66-cdf024b8a920)
Sat Aug 15 03:11:15 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56 Driver Version: 418.56 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 Off | 00000000:01:00.0 On | N/A |
| N/A 42C P8 6W / N/A | 408MiB / 8111MiB | 3% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1142 G /usr/lib/xorg/Xorg 211MiB |
| 0 2056 G compiz 192MiB |
| 0 3329 G /usr/lib/firefox/firefox 2MiB |
+-----------------------------------------------------------------------------+
CUDA Version 10.0.130
CUDA Patch Version 10.0.130.1
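Since the dump records the environment both before and after sourcing compilervars.sh, the two halves can be compared mechanically instead of by eye. Below is a minimal sketch; the function names are hypothetical, it assumes the halves are separated by a line consisting only of dashes (as in the script above), and it does not handle multi-line variable values.

```python
import re

def parse_env(text):
    """Parse `env` output (one KEY=VALUE per line) into a dict.

    Lines whose key is not a valid identifier (the git SHA, nvidia-smi
    table borders, etc.) are skipped.
    """
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.isidentifier():
            result[key] = value
    return result

def diff_info(info_text):
    """Return (added, changed) between the two env dumps in info.txt."""
    parts = re.split(r"^-+$", info_text, maxsplit=1, flags=re.MULTILINE)
    if len(parts) != 2:
        raise ValueError("no dashed separator line found")
    before, after = parse_env(parts[0]), parse_env(parts[1])
    added = {k: v for k, v in after.items() if k not in before}
    changed = {
        k: (before[k], v)
        for k, v in after.items()
        if k in before and before[k] != v
    }
    return added, changed

if __name__ == "__main__":
    # Tiny excerpt in the same shape as the dump above.
    sample = (
        "MKL_THREADING_LAYER=TBB\n"
        "--------------------------------\n"
        "MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl\n"
        "MKL_THREADING_LAYER=TBB\n"
    )
    added, changed = diff_info(sample)
    print("added:", sorted(added))
    print("changed:", sorted(changed))
```

Run against the full info.txt, this lists the variables that compilervars.sh introduced (such as MKLROOT and TBBROOT) and those it extended (such as LD_LIBRARY_PATH and CPATH).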
Hi @beew,
Thanks for your answer. I checked out MXNet SHA 3143aabb6 and installed MKL version 2019.1.144, but unfortunately I still cannot reproduce the issue.
Initially I could not install MKL 2018.2.199, which you use, as it does not support Ubuntu 18.04. When I forced the installation, python3 hung during "import mxnet as mx", but that could be a different issue from yours.
I also noticed that the CUDA version reported by your driver, 10.1 (nvidia-smi), differs from that of the installed toolkit, 10.0.130 (cat /usr/local/cuda/version.txt).
So could I ask you to try :
Please let me know if it helps.
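The two CUDA version numbers mentioned above can be pulled out of the collected output with a couple of regular expressions. This is a rough sketch with hypothetical function names; it assumes the text formats shown in this thread. Note that the number printed by nvidia-smi is the highest CUDA version the driver supports, so a driver value above the toolkit value is generally expected to work.

```python
import re

def driver_cuda_version(nvidia_smi_text):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", nvidia_smi_text)
    return match.group(1) if match else None

def toolkit_cuda_version(version_txt):
    """Extract the version from /usr/local/cuda/version.txt content."""
    match = re.search(r"CUDA Version\s+(\d+(?:\.\d+)*)", version_txt)
    return match.group(1) if match else None

def major_minor(version):
    """Reduce '10.0.130' to the comparable tuple (10, 0)."""
    parts = version.split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (int(parts[0]), minor)

if __name__ == "__main__":
    smi = "| NVIDIA-SMI 418.56 Driver Version: 418.56 CUDA Version: 10.1 |"
    txt = "CUDA Version 10.0.130\nCUDA Patch Version 10.0.130.1"
    driver = driver_cuda_version(smi)
    toolkit = toolkit_cuda_version(txt)
    print("driver supports up to:", driver)
    print("toolkit installed:", toolkit)
    print("driver >= toolkit:", major_minor(driver) >= major_minor(toolkit))
```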
@anko-intel
Hi,
1. I am sticking to MKL 2018.2.199 because of this issue: https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090. I'm not sure if there is any solution at the moment.
Thanks for the help.
Hi @beew,
Thanks for your answer.
You probably already know this, but you can have several MKL versions installed and switch between them.
That way you can switch to MKL 2019 just before compiling MXNet:
ls -la /opt/intel/compilers_and_libraries
cd /opt/intel
sudo rm compilers_and_libraries
sudo ln -s compilers_and_libraries_2019.1.144 compilers_and_libraries
After compiling MXNet, you can restore the compilers_and_libraries link to the default MKL version used by your other software.
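The same link switch can be scripted so that the previous target is remembered for the later restore. This is only a sketch: switch_symlink is a hypothetical helper, the ".tmp" suffix is an arbitrary choice, and the demo runs in a temporary directory because changing the real link under /opt/intel needs root privileges.

```python
import os
import tempfile

def switch_symlink(link_path, new_target):
    """Repoint link_path at new_target and return the previous target.

    Refuses to touch link_path if it exists but is not a symlink.
    """
    previous = None
    if os.path.lexists(link_path):
        if not os.path.islink(link_path):
            raise ValueError(f"{link_path} exists and is not a symlink")
        previous = os.readlink(link_path)
    tmp_path = link_path + ".tmp"  # hypothetical temporary name
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    # Rename the new link over the old one instead of rm followed by ln.
    os.replace(tmp_path, link_path)
    return previous

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("compilers_and_libraries_2018.2.199",
                     "compilers_and_libraries_2019.1.144"):
            os.mkdir(os.path.join(root, name))
        link = os.path.join(root, "compilers_and_libraries")
        os.symlink("compilers_and_libraries_2018.2.199", link)

        old = switch_symlink(link, "compilers_and_libraries_2019.1.144")
        print("now:", os.readlink(link))
        # ... build MXNet here, then restore the default ...
        switch_symlink(link, old)
        print("restored:", os.readlink(link))
```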
@anko-intel
Yes, I can also set up update-alternatives, just as I do to switch between MKL and OpenBLAS. I will give it another shot in a few days and report back. Thanks.
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via Intel's deb repository) and Ubuntu 20.04, but no CUDA. I built MXNet with the options:
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
It failed with
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
But building with USE_BLAS=openblas (keeping all other options unchanged) was successful.
Hi @beew, @gzuchow will try to help you with the second issue.
Hi @beew,
Sorry to hear that you still have a problem with MXNet.
I have tried to reproduce the second issue you encountered, using Ubuntu 18.04 and 20.04 and the flags you used:
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
It ran smoothly, and the build also completed with the cmake method.
I would like to ask about the versions of MXNet and the other libraries you used.
pax-utils will be needed for this step:
apt install pax-utils
Then run this and post the resulting info.txt file:
INFO_FILE=info.txt
git rev-parse --verify HEAD >${INFO_FILE}
source /opt/intel/bin/compilervars.sh intel64
env >>${INFO_FILE}
scanelf -l -s cblas_ddot |grep dot >>${INFO_FILE}
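When pax-utils is not available, a rough alternative is to ask the dynamic loader directly whether a given library can resolve cblas_ddot. This is a sketch under stated assumptions: exports_symbol is a hypothetical helper, the MKL and OpenBLAS file names in the candidate list are assumed to be on the loader path, and ctypes.CDLL actually loads each library (running its initializers), unlike scanelf, which only reads the files.

```python
import ctypes

def exports_symbol(library, symbol):
    """Return True if `symbol` can be resolved through `library`.

    `library` is a path or soname passed to dlopen; None means the
    current process. The lookup also covers the library's own
    dependencies, so True does not prove which file defines the symbol.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # Library missing or not loadable on this machine.
        return False
    return hasattr(handle, symbol)

CANDIDATES = [
    "/lib/x86_64-linux-gnu/libblas.so.3",  # path taken from the linker error
    "libmkl_rt.so",      # assumption: MKL's single dynamic library is installed
    "libopenblas.so.0",  # assumption: OpenBLAS soname as packaged on Ubuntu
]

if __name__ == "__main__":
    for lib in CANDIDATES:
        print(lib, exports_symbol(lib, "cblas_ddot"))
```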
According to this tutorial you can set MXNet environment.
Actually, there is no /opt/intel/bin/compilervars.sh on this machine.
Intel MKL on this machine was installed from the deb repository:
https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html
So in this case the library paths should already be in standard locations (e.g. /usr/lib/x86_64-linux-gnu).
I appreciate the help, but it probably isn't worth the trouble. I cannot test very much on this laptop, since it took me ~7-8 hours to build MXNet from source here, and my main machine is currently not accessible. Thanks again, I will try when I get my main machine back in a few weeks.
Following this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html, the MKL packages are installed in /opt/intel/ by default.
No, they are not. It doesn't matter what the documentation says; it is probably outdated. I know where they are installed on my machine. ;) I don't know about intel-python, as I haven't installed that.