Maskrcnn-benchmark: bug

Created on 25 Oct 2018 · 11 Comments · Source: facebookresearch/maskrcnn-benchmark

Hi, any idea what causes this?
Everything listed in the install guide is installed: https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/INSTALL.md

Traceback (most recent call last):
  File "./tools/train_net.py", line 18, in <module>
    from maskrcnn_benchmark.engine.inference import inference
  File "/home/lz/Workspace/maskrcnn-benchmark/maskrcnn_benchmark/engine/inference.py", line 20, in <module>
    from maskrcnn_benchmark.structures.boxlist_ops import boxlist_iou
  File "/home/lz/Workspace/maskrcnn-benchmark/maskrcnn_benchmark/structures/boxlist_ops.py", line 6, in <module>
    from maskrcnn_benchmark.layers import nms as _box_nms
  File "/home/lz/Workspace/maskrcnn-benchmark/maskrcnn_benchmark/layers/__init__.py", line 8, in <module>
    from .nms import nms
  File "/home/lz/Workspace/maskrcnn-benchmark/maskrcnn_benchmark/layers/nms.py", line 3, in <module>
    from maskrcnn_benchmark import _C
ImportError: /home/lz/Workspace/maskrcnn-benchmark/maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN2at4cuda20getCurrentCUDAStreamEl
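The undefined symbol in the last line is a mangled C++ name. In practice a tool such as c++filt decodes it; the following is only a minimal sketch that handles plain nested names with builtin parameter types, which happens to be enough for this symbol. The function name demangle_simple and the type table are illustrative, not part of any library.

```python
import re

# Subset of Itanium C++ ABI codes for builtin parameter types (illustrative only).
BUILTIN_TYPES = {"i": "int", "l": "long", "x": "long long",
                 "b": "bool", "f": "float", "d": "double"}


def demangle_simple(symbol):
    """Decode symbols of the form _ZN<len><name>...E<builtin params>.

    Anything more complex (templates, references, class types) raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...E) are handled")
    pos, parts = 3, []
    # Each name component is written as <decimal length><identifier>.
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError("unsupported construct at offset %d" % pos)
        length = int(match.group())
        pos += len(match.group())
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol):
        raise ValueError("missing terminating 'E'")
    codes = symbol[pos + 1:]
    if codes == "v":  # 'v' alone means an empty parameter list
        params = []
    else:
        unknown = [c for c in codes if c not in BUILTIN_TYPES]
        if unknown:
            raise ValueError("unsupported parameter codes: %r" % unknown)
        params = [BUILTIN_TYPES[c] for c in codes]
    return "::".join(parts) + "(" + ", ".join(params) + ")"


print(demangle_simple("_ZN2at4cuda20getCurrentCUDAStreamEl"))
# at::cuda::getCurrentCUDAStream(long)
```

Read this way, the error says that the compiled _C extension expects a function at::cuda::getCurrentCUDAStream(long) that the torch library loaded at runtime does not export, which is consistent with the extension having been built against a different torch than the one being imported.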

All 11 comments

You are compiling the extensions with gcc <= 4.8, but I believe PyTorch requires C++ extensions to be compiled with gcc >= 4.9.
If I'm not wrong, a large warning about this should have appeared on screen during compilation.

Also, make sure that the environment that you used to compile the extension is the same as the one you use to run it.

I get an error similar to the one @lzrobots reported (no problem during installation; gcc version = 5.4.0).
I created a new environment, pytorch-1.0, before following your INSTALL.md.

~/github/maskrcnn-benchmark/maskrcnn_benchmark/layers/__init__.py in <module>()
      6 from .misc import ConvTranspose2d
      7 from .misc import interpolate
----> 8 from .nms import nms
      9 from .roi_align import ROIAlign
     10 from .roi_align import roi_align

~/github/maskrcnn-benchmark/maskrcnn_benchmark/layers/nms.py in <module>()
      1 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
      2 # from ._utils import _C
----> 3 from maskrcnn_benchmark import _C
      4
      5 nms = _C.nms

ImportError: /home/fbaradel/github/maskrcnn-benchmark/maskrcnn_benchmark/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at4cuda20getCurrentCUDAStreamEl

@fmassa @soumith Any idea where this problem could come from?

There is no problem when installing and running demo.py on macOS (CPU).

I have the impression that at runtime the script might be loading the wrong pytorch binary. Could you check that torch.__file__ points to the right torch?

Indeed, I am pointing to the wrong PyTorch installation:

In [1]: import torch

In [2]: torch.__file__
Out[2]: '/home/fbaradel/anaconda3/envs/open-mmlab/lib/python3.6/site-packages/torch/__init__.py'
# it should be '/home/fbaradel/anaconda3/envs/pytorch-1.0/lib/python3.6/site-packages/torch/__init__.py'

That's weird, because I am running this script from the correct env (pytorch-1.0, not open-mmlab). Is there a way to force the correct version of PyTorch to be used?
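A minimal sketch of the torch.__file__ check, assuming that the active environment is the one reported by sys.prefix (true for conda envs and virtualenvs). The helper names module_origin and is_inside are made up for illustration; importlib.util.find_spec locates the module without importing it, so the check still works when the import itself would fail.

```python
import importlib.util
import sys
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_inside(path, prefix):
    """True if `path` lies under the directory `prefix`."""
    try:
        Path(path).resolve().relative_to(Path(prefix).resolve())
        return True
    except ValueError:
        return False


origin = module_origin("torch")
if origin is None:
    print("torch is not importable from this interpreter")
else:
    print("torch would be loaded from:", origin)
    print("inside the active env (%s): %s" % (sys.prefix, is_inside(origin, sys.prefix)))
```

With the paths quoted above, is_inside returns False for the open-mmlab torch when the prefix is the pytorch-1.0 env, which is the mismatch described in this comment.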

So, here is what I would do:

  • create a fresh new conda env and switch to it
  • install ipython via conda install ipython. This will install several dependencies, like pip, so that we are sure we don't pick up a copy from somewhere else
  • follow the installation instructions as before.

While installing each of the libs (with conda or pip), check that it is indeed going into the right installation path, and also check that which conda and which pip point to the right ones.
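The which conda and which pip checks can also be done from inside Python, as in this rough sketch. shutil.which performs the same PATH lookup as the shell; the function name tool_report is hypothetical. Note that conda itself often lives in the base installation rather than in the env, so "right" for conda means the installation you expect, not necessarily a path under the env.

```python
import shutil
import sys


def tool_report(tools=("python", "pip", "conda")):
    """Map each tool name to the executable found first on PATH (None if absent)."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in tool_report().items():
    print("%-7s -> %s" % (tool, path))

# The interpreter actually running this script, and the env it belongs to.
print("sys.executable:", sys.executable)
print("sys.prefix:    ", sys.prefix)
```

If the pip path printed here is not under sys.prefix, packages installed with a bare pip install may land in a different environment than the one the interpreter uses.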

Thanks all. Solved.

  1. Create a new conda env.

  2. Install ipython via conda install ipython before following INSTALL.md. This makes torch.__file__ point to the right torch in the conda env for this project.

  3. If you still encounter this error even though torch is correct, git clone this repo and compile again. This is how I solved it.

Thanks @fmassa and @lzrobots, it solved my problem as well.

Great!
I'll add a line in the README explaining that you need to first install ipython with conda.

It's caused by _GLIBCXX_USE_CXX11_ABI=1 when you compile PyTorch from source or install it with conda from the defaults channel. For more information, see https://discuss.pytorch.org/t/undefined-symbol-when-import-lltm-cpp-extension/32627
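To see which ABI setting the installed PyTorch was built with, something like the following sketch may help. torch.compiled_with_cxx11_abi() reports the flag of the torch binary; the wrapper torch_cxx11_abi is a hypothetical helper that returns None when torch cannot be imported.

```python
def torch_cxx11_abi():
    """Return the _GLIBCXX_USE_CXX11_ABI setting of the installed torch.

    True or False when torch is importable, None when it is not.
    """
    try:
        import torch
    except ImportError:
        return None
    return bool(torch.compiled_with_cxx11_abi())


print("torch built with the CXX11 ABI:", torch_cxx11_abi())
```

An extension has to be compiled with the same setting as the torch it is loaded into, so comparing this value across the build environment and the runtime environment is one way to test the explanation above.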

Maybe you are in the wrong directory.
In my case, I created a project directory and cloned the maskrcnn repository into it. Then I installed the maskrcnn environment following the commands in README.md, and I also created another maskrcnn directory. When I ran the train command from the first directory, I got the error 'cannot import _C'. By accident, I went back to the second directory, and it worked well. Amazing!
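One possible explanation is that Python puts the script's directory (or the current directory in an interactive session) at the front of sys.path, so a local maskrcnn_benchmark folder without a compiled _C file can shadow the copy that was actually built. The sketch below, with hypothetical helper names, shows which directory a package resolves to and whether a compiled extension sits next to it.

```python
from importlib.machinery import PathFinder
from pathlib import Path


def resolve_package_dir(name, search_path=None):
    """Directory a regular package would be imported from (None if not found).

    `search_path` is a list of directories; None means the current sys.path.
    """
    spec = PathFinder.find_spec(name, search_path)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def has_compiled_extension(package_dir, stem="_C"):
    """True if a file such as _C.<abi tag>.so or _C.<abi tag>.pyd is in package_dir."""
    return any(p.suffix in {".so", ".pyd"} for p in Path(package_dir).glob(stem + ".*"))


pkg_dir = resolve_package_dir("maskrcnn_benchmark")
if pkg_dir is None:
    print("maskrcnn_benchmark is not importable from here")
else:
    print("maskrcnn_benchmark resolves to:", pkg_dir)
    print("compiled _C present:", has_compiled_extension(pkg_dir))
```

Running this from each of the two directories should show whether they resolve to different copies of the package.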
