Incubator-mxnet: can not install mxnet on jetson nano, jetpack 4.3, mxnet 1.4

Created on 27 Mar 2020  ·  7 Comments  ·  Source: apache/incubator-mxnet

Description

I tried to install mxnet on a Jetson Nano following the mxnet jetson_setup guide. No error was raised during installation, but when I import mxnet in python3 I get "Segmentation fault (core dumped)".

Error Message

root@jetbot:/home/jetbot/Downloads# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Segmentation fault (core dumped)
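
A segmentation fault on import kills the interpreter without a Python traceback. As a rough diagnostic sketch (not part of the original report), the import can be run in a child interpreter with the standard `faulthandler` option enabled, so the parent process survives and can report how the child exited. The helper name `probe_import` is hypothetical.

```python
import subprocess
import sys


def probe_import(module_name):
    """Import module_name in a child interpreter and report how it exited.

    Returns (returncode, stderr). On POSIX, a negative returncode means the
    child was killed by that signal number, e.g. -11 for SIGSEGV.
    """
    result = subprocess.run(
        # -X faulthandler makes the child dump a traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; also available on Python 3.6
    )
    return result.returncode, result.stderr
```

On the affected board, `probe_import("mxnet")` would be expected to return a negative code, and the captured stderr should show which Python frame was active when the crash happened.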

To Reproduce

My installation procedure:

  1. install dependencies:
sudo apt update
sudo apt -y install \
                        build-essential \
                        git \
                        graphviz \
                        libatlas-base-dev \
                        libopencv-dev \
                        python3-pip

sudo pip3 install --upgrade \
                        pip \
                        setuptools

sudo pip3 install \
                        graphviz==0.8.4 \
                        jupyter \
                        numpy==1.18.2

This differs slightly from the official guide: I use pip3 because I need Python 3, and the numpy version is 1.18.2 rather than 1.15.2 because a later pip install suggested that version.

  2. clone the repo
git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
  3. environment
export PATH=/usr/local/cuda/bin:$PATH
export MXNET_HOME=$HOME/mxnet/
export PYTHONPATH=$MXNET_HOME/python:$PYTHONPATH
  4. check cuda version
nvcc --version

This prints:

Copyright (c) 2005-2019 NVIDIA Corporation
Built on Mon_Mar_11_22:13:24_CDT_2019
Cuda compilation tools, release 10.0, V10.0.326
  5. download the whl file and libmxnet.so
wget -c https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.0/mxnet-1.4.0-cp36-cp36m-linux_aarch64.whl  # I'm using python3.6

wget -c https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.1/libmxnet.so
mkdir mxnet/lib
mv libmxnet.so mxnet/lib  # put this file in lib
  6. install the whl
sudo pip install mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl
  7. try to import
root@jetbot:/home/jetbot/Downloads# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Segmentation fault (core dumped)
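
Note that the download step fetches a `cp36` wheel while the install command names a `cp27` wheel and uses `pip` rather than `pip3`. The thread does not say whether this is a transcription slip or what was actually run. As a small sketch, the compatibility tags can be read straight from a wheel filename, which follows the standard `{distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl` layout. The helper names below are hypothetical, and pip itself remains the authoritative check.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def matches_running_cpython(filename):
    """Rough check: does the wheel's python tag fit this interpreter?"""
    python_tags = wheel_tags(filename)[0].split(".")  # e.g. "py2.py3"
    major, minor = sys.version_info[:2]
    exact = "cp{}{}".format(major, minor)    # e.g. "cp36"
    generic = "py{}".format(major)           # e.g. "py3"
    return exact in python_tags or generic in python_tags
```

For example, `wheel_tags("mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl")` yields a `cp27` python tag, which a Python 3 interpreter should not accept.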

What have you tried to solve it?

  1. Tried to build from source; it took 2 hours and exited with an error (sorry, I forgot to log the error output).
  2. Reinstalled the OS, reflashed the image, and tried the jetson_setup guide again.

Environment

We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:

curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python

outputs here

----------Python Info----------
Version      : 3.6.9
Compiler     : GCC 8.3.0
Build        : ('default', 'Nov  7 2019 10:44:02')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 20.0.2
Directory    : /usr/local/lib/python3.6/dist-packages/pip
----------MXNet Info-----------
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Linux-4.9.140-tegra-aarch64-with-Ubuntu-18.04-bionic
system       : Linux
node         : jetbot
release      : 4.9.140-tegra
version      : #1 SMP PREEMPT Mon Dec 9 22:47:42 PST 2019
----------Hardware Info----------
machine      : aarch64
processor    : aarch64
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
Vendor ID:           ARM
Model:               1
Model name:          Cortex-A57
Stepping:            r1p1
CPU max MHz:         1479.0000
CPU min MHz:         102.0000
BogoMIPS:            38.40
L1d cache:           32K
L1i cache:           48K
L2 cache:            2048K
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0066 sec, LOAD: 1.6708 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0013 sec, LOAD: 1.3129 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0048 sec, LOAD: 0.5813 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0032 sec, LOAD: 0.3769 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0027 sec, LOAD: 0.2090 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0038 sec, LOAD: 0.3148 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0026 sec, LOAD: 4.6375 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0038 sec, LOAD: 3.1302 sec.

Additional info:

  • image version: jetpack 4.3
  • cuda: 10.0
  • python: 3.6
Bug

All 7 comments

2020.03.30 update:
I followed the Docker cross-compile method, but it failed. Error log:

aarch64-unknown-linux-gnueabi-g++: internal compiler error: Killed (program cc1plus)
0x55a661ee7e24 execute
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:2854
0x55a661ee8174 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:4658
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661ee870b do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5427
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
    /dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
make: *** [build/src/operator/numpy/np_percentile_op.o] Error 4
make: *** Waiting for unfinished jobs....
Makefile:569: recipe for target 'build/src/operator/numpy/np_percentile_op.o' failed
2020-03-30 11:15:50,474 - root - INFO - Waiting for status of container 7ee922ca33a2 for 600 s.
2020-03-30 11:15:50,659 - root - INFO - Container exit status: {'Error': None, 'StatusCode': 2}
2020-03-30 11:15:50,659 - root - ERROR - Container exited with an error 
2020-03-30 11:15:50,659 - root - INFO - Executed command for reproduction:

ci/build.py -p jetson

2020-03-30 11:15:50,659 - root - INFO - Stopping container: 7ee922ca33a2
2020-03-30 11:15:50,660 - root - INFO - Removing container: 7ee922ca33a2
2020-03-30 11:15:50,723 - root - INFO - Other running containers: ['e9ad8fc19642', '6acad0d82e1c', '5132086269ff', 'a6249b881f24', 'f0ad24aeb876', '6bb047a698b7', '9e6f0d904a4b', '241f1bc44984', '19ff0cea2fa5', '868864fd7dd7']
2020-03-30 11:15:50,723 - root - CRITICAL - Execution of ['/work/mxnet/ci/docker/runtime_functions.sh', 'build_jetson'] failed with status: 2

platform: x86 pc, i7-9700, rtx 2080ti
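
The message `internal compiler error: Killed (program cc1plus)` usually means the compiler process was killed from outside, most often because the machine ran out of memory while many compile jobs ran in parallel. The thread does not confirm that cause, so treat this as an assumption. A minimal sketch for capping the job count by available memory follows; the function names and the 2 GiB-per-job budget are hypothetical.

```python
def available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def suggested_jobs(meminfo_text, cpu_count, kib_per_job=2 * 1024 * 1024):
    """Cap parallel compile jobs by memory as well as by CPU count.

    kib_per_job (2 GiB here) is an assumed budget, not a measured figure.
    """
    by_memory = available_kib(meminfo_text) // kib_per_job
    return max(1, min(cpu_count, by_memory))
```

On the build host this could be called as `suggested_jobs(open("/proc/meminfo").read(), os.cpu_count())`. How a job count is passed to the build is not covered in this thread.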

Guys, I solved the problem by following advice from the NVIDIA Jetson forums:

  1. Download the mxnet-1.6 whl for jetson: link

  2. Install numpy and the wheel:
sudo pip3 install numpy==1.16
sudo pip3 install mxnet-1.6.0-py3-none-any.whl
  3. Switch to the root account! That was important for me:
sudo su
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mxnet
root@jetbot:/home/jetbot# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.__version__
'1.6.0'

If I run python3 under the jetbot account, it throws a segmentation fault; it seems I had forgotten to switch to root.
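
The `export` above only affects the shell it is run in, so a shell for the `jetbot` account would not see `/usr/local/mxnet` unless the variable is set there as well. Whether that explains the difference between the two accounts is not confirmed in the thread. A small sketch for checking the variable from Python follows; the helper name `on_library_path` is hypothetical.

```python
import os


def on_library_path(directory, env=None):
    """Return True if directory is listed in LD_LIBRARY_PATH of env."""
    env = os.environ if env is None else env
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    wanted = os.path.normpath(directory)
    # Skip empty entries and ignore trailing slashes when comparing.
    return any(e and os.path.normpath(e) == wanted for e in entries)
```

Running `on_library_path("/usr/local/mxnet")` under each account shows whether that account's environment includes the directory before mxnet is imported.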

@JustinhoCHN what was the CUDA version of your Jetson Nano image? It gives me the following error:

>>> import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/mxnet/__init__.py", line 24, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/usr/local/lib/python3.6/dist-packages/mxnet/context.py", line 24, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/usr/local/lib/python3.6/dist-packages/mxnet/base.py", line 214, in <module>
    _LIB = _load_lib()
  File "/usr/local/lib/python3.6/dist-packages/mxnet/base.py", line 205, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.10.0: cannot open shared object file: No such file or directory

My Jetson Nano image has CUDA 10.2
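
The traceback shows that the wheel was linked against `libcudart.so.10.0`, which a CUDA 10.2 image does not provide. As a quick sketch, the same loader call that mxnet makes can be tried in isolation with `ctypes`, without importing mxnet. The helper name `can_load` is hypothetical.

```python
import ctypes


def can_load(library_name):
    """Try to open a shared library with the dynamic loader.

    Returns (True, None) on success, otherwise (False, error message).
    """
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None
```

On a CUDA 10.2 image, `can_load("libcudart.so.10.0")` would be expected to fail with the same "cannot open shared object file" message as above.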

mine is cuda 10.0, jetpack 4.3
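
Since the working setup and the failing one differ in CUDA release, it can help to read the release from `nvcc --version` programmatically before choosing a wheel. A minimal sketch, assuming the output contains a `release X.Y` field as in the report above; the function name `cuda_release` is hypothetical.

```python
import re


def cuda_release(nvcc_output):
    """Extract the 'release X.Y' version from `nvcc --version` output.

    Returns the version as a string, or None if no release field is found.
    """
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None
```

Feeding it the line `Cuda compilation tools, release 10.0, V10.0.326` from the original report gives `"10.0"`.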

Do you still have the image which you used on your Jetson Nano? Nvidia keeps on updating their images regularly and I ended up with CUDA 10.2 :/

Edit: This works?
image

yes it is

I changed the OS to JetPack 4.4 DP and hit the same issue as you. I tried installing CUDA 10.0 on JetPack 4.4, but importing mxnet just quits with "Segmentation fault". I'm working on compiling a CUDA 10.2 Python whl for Jetson. I'll let you know if I succeed.
