Apex: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE

Created on 22 Jun 2019 · 3 Comments · Source: NVIDIA/apex

With the latest PyTorch binaries and the latest apex code, I get an ImportError when importing the fused_layer_norm_cuda module. Specifically, the following fails:

In  [1]: import fused_layer_norm_cuda
ImportError: <path/to/install>/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE
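
For readers unfamiliar with mangled C++ names, the symbol in this error can be decoded by hand. The sketch below is a minimal decoder that assumes the simple Itanium-ABI nested-name form (`_ZN`, then length-prefixed identifiers, then `E`). The function name `demangle_nested_name` is made up for illustration, and it is not a full demangler; a tool such as `c++filt` handles the general case.

```python
def demangle_nested_name(symbol):
    """Decode a simple Itanium-ABI nested name such as _ZN2at3FooE.

    Only the `_ZN <len><identifier>...` prefix is decoded. Whatever follows
    the identifiers (the closing `E`, constructor markers, parameter types)
    is returned undecoded as the second element of the tuple.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested mangled name: " + symbol)
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        # Read the decimal length that precedes each identifier.
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    return "::".join(parts), symbol[pos:]


name, rest = demangle_nested_name("_ZN2at19UndefinedTensorImpl10_singletonE")
print(name)  # at::UndefinedTensorImpl::_singleton
```

The decoded name lives in the `at` namespace, so the missing symbol is one that the extension expects PyTorch's own shared libraries to provide.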

Following suggestions from #187, here's my system information:
lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:        16.04
Codename:       xenial  

Torch info:

In [1]:  print(torch.__version__, torch.version.cuda, torch.utils.cpp_extension.CUDA_HOME)
1.0.1 10.0.130 /usr/local/cuda

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

Compiling from source may fix this, but is it expected that building against the latest PyTorch conda binaries fails?

All 3 comments

No, that is not expected. First off, make sure you import torch before you import anything from apex (this is a common issue with extensions). If that doesn't work, we can try to repro.

I was able to reproduce this locally:

>>> import fused_layer_norm_cuda

by itself resulted in an ImportError with an undefined symbol.

To fix the error:

>>> import torch
>>> import fused_layer_norm_cuda

As I said, this is a known issue with extensions in general.
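
The import-order fix above can be wrapped in a small helper so that the extension is never loaded before torch. This is only a sketch: `import_after_torch` is a hypothetical name, not part of apex or PyTorch, and the likely reason it helps (the extension resolves its symbols against libraries that `import torch` loads into the process) is an assumption based on the behaviour described in this thread.

```python
import importlib


def import_after_torch(name):
    """Import a compiled PyTorch extension, making sure torch is loaded first."""
    try:
        # Loading torch first brings its shared libraries into the process.
        importlib.import_module("torch")
    except ImportError as exc:
        raise ImportError(
            "torch must be importable before loading %r" % name
        ) from exc
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Still failing with torch loaded suggests a build/version mismatch.
        raise ImportError(
            "could not load %r even with torch imported first: %s" % (name, exc)
        ) from exc
```

Calling `import_after_torch("fused_layer_norm_cuda")` would then stand in for the two-line import shown above. If it still raises, the problem is something other than import order.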

I also recommend using FusedLayerNorm via the wrapper module interface (apex.normalization.FusedLayerNorm). If you call the CUDA binding directly (fused_layer_norm_cuda), it will not route through an autograd function and therefore will not be differentiable.
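
A minimal sketch of that recommendation follows. `make_layer_norm` is a hypothetical helper, and the fallback to `torch.nn.LayerNorm` is an addition for illustration, not something apex does. It assumes `FusedLayerNorm` accepts the normalized shape as its first argument, as `torch.nn.LayerNorm` does.

```python
def make_layer_norm(hidden_size):
    """Return a layer-norm module, preferring apex's autograd-aware wrapper."""
    import torch  # imported first, as advised earlier in this thread

    try:
        from apex.normalization import FusedLayerNorm

        return FusedLayerNorm(hidden_size)
    except ImportError:
        # Assumption for this sketch: fall back to the stock PyTorch module
        # when apex or its compiled extension cannot be loaded.
        return torch.nn.LayerNorm(hidden_size)
```

Note that the fallback also swallows the undefined-symbol ImportError discussed here, so drop the `except` branch when debugging the extension itself.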

Just to follow up on this for others who encounter similar issues - I hadn't imported torch in an attempt to create a minimum breaking example, not realizing this causes a different error.

Switching to the correct imports:

In  [1]: import torch
In  [2]: import fused_layer_norm_cuda
ImportError: <path/to/install>/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

Some googling of this error turned up this issue. The suggestion was to create a clean conda environment, then install ipython followed by pytorch. After doing that, I was able to get things working.
