Pytorch_geometric: Please help me with OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

Created on 13 Apr 2020  ·  33 Comments  ·  Source: rusty1s/pytorch_geometric

❓ Questions & Help


This is the traceback:

`Traceback (most recent call last):
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 15, in
library, [osp.dirname(__file__)]).origin)
File "/home/yrwang/.local/lib/python3.6/site-packages/torch/_ops.py", line 106, in load_library
ctypes.CDLL(path)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/__init__.py", line 2, in
import torch_geometric.nn
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/__init__.py", line 2, in
from .data_parallel import DataParallel
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/data_parallel.py", line 5, in
from torch_geometric.data import Batch
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/__init__.py", line 1, in
from .data import Data
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/data.py", line 7, in
from torch_sparse import coalesce
File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 23, in
raise OSError(e)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory
`
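
The call that fails in the traceback is `ctypes.CDLL(path)`, so the dynamic loader can be probed directly without importing torch_sparse. A minimal sketch, assuming a Linux system; the helper name `can_load` is made up for illustration:

```python
import ctypes


def can_load(soname):
    """Try to dlopen a shared library by name and report the outcome.

    Returns (True, "") on success or (False, loader_message) on failure.
    """
    try:
        ctypes.CDLL(soname)
    except OSError as exc:
        return False, str(exc)
    return True, ""


# The library name comes from the traceback above.
ok, message = can_load("libcusparse.so.10")
print("loadable" if ok else f"not loadable: {message}")
```

If this prints the same "cannot open shared object file" message, the problem lies in the loader search path or the installed CUDA libraries rather than in PyG itself.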

My CUDA and cuDNN are installed correctly:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
My torch version:
>>> print(torch.__version__)
1.4.0
I used

`pip3 install torch-scatter==2.0.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html

pip3 install torch-sparse==0.6.1+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html

pip3 install torch-cluster==1.5.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html

pip3 install torch-spline-conv==1.2.0+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html

pip3 install torch-geometric`
to install torch-geometric, but the problem still occurs. Thanks for helping me.
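
The install commands above have to agree in two places: the `+cuXYZ` suffix and the `torch-X.Y.Z.html` wheel index. As a rough sketch, the commands can be generated from the PyTorch and CUDA versions so the two cannot drift apart. The function names are hypothetical, and the URL pattern is simply copied from the commands in the question:

```python
def cuda_tag(cuda_version):
    """Turn a CUDA version such as "10.1" into a wheel tag such as "cu101"."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def install_commands(torch_version, cuda_version, packages):
    """Build pip commands whose CUDA tag and wheel index agree with PyTorch.

    `packages` maps a package name to the version to pin.
    """
    tag = cuda_tag(cuda_version)
    index = f"https://pytorch-geometric.com/whl/torch-{torch_version}.html"
    return [
        f"pip3 install {name}=={version}+{tag} -f {index}"
        for name, version in packages.items()
    ]


# Values taken from the question: torch 1.4.0 built against CUDA 10.1.
pinned = {
    "torch-scatter": "2.0.4",
    "torch-sparse": "0.6.1",
    "torch-cluster": "1.5.4",
    "torch-spline-conv": "1.2.0",
}
for command in install_commands("1.4.0", "10.1", pinned):
    print(command)
```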

Most helpful comment

Ubuntu 18.04
This is my procedure to fix this bug.

  1. cd to /usr/local/cuda
  2. run find . -name "libcus*"

If you see "libcusparse.so.11", continue with the following steps:

Remove the current CUDA installation:

  1. sudo apt-get --purge remove "*cublas*" "cuda*" "nsight*"
  2. sudo apt-get --purge remove "*nvidia*"

Install CUDA 10.2 (package cuda-10-2):

  1. wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
  2. sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
  3. wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
  4. sudo dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
  5. sudo apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
  6. sudo apt-get update
  7. sudo apt-get -y install cuda-10-2

Add CUDA to PATH and the related environment variables:

$ export PATH=/usr/local/cuda/bin:$PATH
$ echo $PATH
>>> /usr/local/cuda/bin:...
$ export CPATH=/usr/local/cuda/include:$CPATH
$ echo $CPATH
>>> /usr/local/cuda/include:...
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
>>> /usr/local/cuda/lib64:...
$ export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
$ echo $DYLD_LIBRARY_PATH
>>> /usr/local/cuda/lib:...
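
The `find` step at the start of this procedure can also be done from Python, which is handy inside a notebook or a remote job. A small sketch; `find_libs` is a hypothetical helper, and `/usr/local/cuda` is the location assumed in the steps above:

```python
from pathlib import Path


def find_libs(root, pattern="libcus*"):
    """Recursively list files under `root` whose name matches `pattern`."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(str(path) for path in root_path.rglob(pattern))


for path in find_libs("/usr/local/cuda"):
    print(path)
```

An empty result means either the directory does not exist or no matching library is installed under it.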

All 33 comments

libcusparse.so and libcusparse.so.10 are already in /usr/local/cuda/lib64

Is this path added to LD_LIBRARY_PATH?

Thank you. Yes, I checked it; this is the result:
`echo $LD_LIBRARY_PATH

/usr/lcoal/cuda-10.1/lib64:
`
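
Note that the value printed above reads `/usr/lcoal/...` rather than `/usr/local/...`. If that is what the shell really holds, and not a slip made while typing the comment, the loader would never search the intended directory. A quick sketch for checking that every entry in `LD_LIBRARY_PATH` exists (the function name is hypothetical):

```python
import os


def check_library_path(value):
    """Split an LD_LIBRARY_PATH-style string and flag missing directories.

    Returns a list of (directory, exists) pairs; empty entries are skipped.
    """
    entries = [entry for entry in value.split(os.pathsep) if entry]
    return [(entry, os.path.isdir(entry)) for entry in entries]


current = os.environ.get("LD_LIBRARY_PATH", "")
for directory, exists in check_library_path(current):
    print(("ok      " if exists else "MISSING ") + directory)
```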

And what does torch.cuda.version say?

Do you mean torch.version.cuda?

The result of 'torch.version.cuda' is:
`>>> print(torch.version.cuda)
10.1
`
The result of 'torch.cuda.version' is:
`>>> torch.cuda.version
Traceback (most recent call last):
File "", line 1, in
AttributeError: module 'torch.cuda' has no attribute 'version'
`
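
The point of this check is to compare the CUDA version PyTorch was built against with the toolkit that `nvcc -V` reports. A sketch of that comparison, using the strings reported in this thread as input; both function names are hypothetical:

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract the "release X.Y" version from `nvcc -V` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def toolkit_matches_torch(nvcc_output, torch_cuda_version):
    """True when the toolkit release equals the CUDA version PyTorch reports."""
    return parse_nvcc_release(nvcc_output) == torch_cuda_version


# Output line and torch.version.cuda value as reported in this thread.
reported = "Cuda compilation tools, release 10.1, V10.1.243"
print(parse_nvcc_release(reported))
print(toolkit_matches_torch(reported, "10.1"))
```

Here both sides agree on 10.1, which is why the maintainer goes on to look for other causes.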

Can you do me a favor and see if you can install torch-scatter from source?

Yes, I am glad to do it. What should I do? How can I install torch-scatter from source?

Where can I find instructions for installing torch-scatter from source?

See here.

I followed your instructions to install torch-scatter from source. The process and result are shown below, but I still have the problem mentioned above. What should I do? PyG is really important to me. Thank you very much.

~$ python3 -c "import torch; print(torch.__version__)"
1.4.0
~$ echo $PATH
/usr/local/cuda-10.1/bin:/home/yrwang/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
~$ echo $CPATH
/usr/local/cuda/include:
~$ pip3 install torch-scatter
Collecting torch-scatter
Installing collected packages: torch-scatter
Successfully installed torch-scatter-2.0.4

Mh, this is super weird :( Do you have multiple CUDA versions installed on your system? There must be a reason why it tries to look in the wrong folder.

No, I only have one CUDA version installed on my system.

Dear author, I made it. Thank you for your help. I downgraded CUDA to version 10.0, PyTorch to version 1.4.0+cu100, and torchvision to 0.5.0+cu100, and installed torch-scatter, torch-sparse, torch-cluster, and torch-spline-conv from source. I tried to install them from the cu100 .whl files, but that didn't work. The commands I used are as follows:

pip3 install torch-scatter
pip3 install torch-sparse
pip3 install torch-cluster
pip3 install torch-spline-conv
pip3 install torch-geometric torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html

If anyone needs help, please contact me.

Glad that you made it, but why did downgrading help?

I have a question about this. In /usr/local there are both a CUDA directory and a CUDA-10.0 directory. Does that mean I have one CUDA installation or multiple?

Thank you very much. I used your first set of commands and, to my surprise, the setup completed. I don't know why.
My environment is a Conda env on Ubuntu with NVIDIA-SMI 418.67, Driver Version 418.67, the default CUDA and CUDA-10.0, PyTorch 1.4, and cuDNN 7.6.5.
Then I tried to install torch_geometric, and it worked with your commands.

Could you please show your versions of torch-scatter, torch-sparse, torch-cluster, and torch-spline-conv?

CUDA 10.0
torch 1.4.0+cu100
torch-cluster 1.5.4
torch-geometric 1.4.3
torch-scatter 2.0.4
torch-sparse 0.6.4
torch-spline-conv 1.2.0
torchvision 0.5.0+cu100
When I import torch_geometric, I get this error:
Traceback (most recent call last):
File "gcn.py", line 6, in
from torch_geometric.datasets import Planetoid
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_geometric/__init__.py", line 2, in
import torch_geometric.nn
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_geometric/nn/__init__.py", line 2, in
from .data_parallel import DataParallel
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_geometric/nn/data_parallel.py", line 5, in
from torch_geometric.data import Batch
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in
from .data import Data
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_geometric/data/data.py", line 7, in
from torch_sparse import coalesce
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_sparse/__init__.py", line 34, in
from .storage import SparseStorage # noqa
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_sparse/storage.py", line 21, in
class SparseStorage(object):
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch/jit/__init__.py", line 1274, in script
_compile_and_register_class(obj, _rcb, qualified_name)
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch/jit/__init__.py", line 1115, in _compile_and_register_class
_jit_script_class_compile(qualified_name, ast, rcb)
RuntimeError:

__init__(__torch__.torch_sparse.storage.SparseStorage self, Tensor? row, Tensor? rowptr, Tensor? col, Tensor? value, (int, int)? sparse_sizes, Tensor? rowcount, Tensor? colptr, Tensor? colcount, Tensor? csr2csc, Tensor? csc2csr, bool is_sorted) -> (None):
Expected a value of type 'Optional[Tensor]' for argument 'row' but instead found type 'int'.
:
File "/home/wanghui/anaconda3/envs/gnn/lib/python3.7/site-packages/torch_sparse/storage.py", line 283
col = idx % num_cols

    return SparseStorage(row=row, rowptr=None, col=col, value=self._value,
           ~~~~~~~~~~~~~ <--- HERE
                         sparse_sizes=(num_rows, num_cols), rowcount=None,
                         colptr=None, colcount=None, csr2csc=None,

Can you hack torch_sparse/storage.py by replacing this line with:

row = idx / num_cols
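
For context, the code around the failure splits a flattened index into a row and a column: `col = idx % num_cols` appears in the traceback, and the suggested replacement computes `row` by division. The same arithmetic on plain Python integers looks like the sketch below. The function name `unflatten` is hypothetical, and the sketch uses `//` because plain Python `/` is true division; as far as I know, `/` on integer tensors performed integer division in PyTorch 1.4, and later releases changed that behavior.

```python
def unflatten(idx, num_cols):
    """Split a flat row-major index into (row, col)."""
    row = idx // num_cols  # integer (floor) division gives the row
    col = idx % num_cols   # the remainder gives the column
    return row, col


# Index 7 in a matrix with 3 columns sits in row 2, column 1.
print(unflatten(7, 3))
```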

OK! It works! Thank you!

I've encountered a similar problem (OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory), and I think torch-sparse might be looking in a wrong place for the library: I have CUDA 10.2, but I'm using an older version of torch: torch==1.4.0+cu100 and somehow installation of dependencies with cu100 made it problematic. Updating torch to 1.4.0+cu101 and installing the dependencies as in README with cu101 made the issue disappear.

torch 1.5.0+cu101
torch-cluster 1.5.4
torch-geometric 1.5.0
torch-scatter 2.0.4
torch-sparse 0.6.4
torch-spline-conv 1.2.0
torchvision 0.6.0+cu101
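
The mismatch described here (a `cu100` PyTorch next to extensions built for another CUDA version) can be spotted by comparing the `+cuXYZ` suffixes of the version strings. A sketch with hypothetical helper names; it assumes the versions carry a local tag such as `+cu101`, which source builds may not:

```python
import re


def cuda_tag_of(version):
    """Return the "cuXYZ" suffix of a version like "1.4.0+cu100", or None."""
    match = re.search(r"\+(cu\d+)$", version)
    return match.group(1) if match else None


def tags_agree(torch_version, extension_version):
    """True when both versions carry the same CUDA tag."""
    torch_tag = cuda_tag_of(torch_version)
    return torch_tag is not None and torch_tag == cuda_tag_of(extension_version)


print(tags_agree("1.4.0+cu100", "0.6.1+cu101"))
print(tags_agree("1.4.0+cu101", "0.6.1+cu101"))
```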

I'm having a similar but slightly different issue:

File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_geometric/__init__.py", line 2, in
import torch_geometric.nn
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_geometric/nn/__init__.py", line 2, in
from .data_parallel import DataParallel
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_geometric/nn/data_parallel.py", line 5, in
from torch_geometric.data import Batch
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in
from .data import Data
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_geometric/data/data.py", line 7, in
from torch_sparse import coalesce
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch_sparse/__init__.py", line 13, in
library, [osp.dirname(__file__)]).origin)
File "/home/aqd215/pyenv/py3.7/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
ctypes.CDLL(path)
File "/share/apps/anaconda3/5.3.1/lib/python3.7/ctypes/__init__.py", line 356, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

Hey, I followed your installation, but the problem is still here.

torch 1.4.0+cu100
torch-cluster 1.5.4
torch-scatter 2.0.4
torch-sparse 0.6.1
torch-spline-conv 1.2.0
torchvision 0.5.0+cu100
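
Version lists like the one above are easier to compare when they are produced the same way on every machine. A sketch using `importlib.metadata`, which assumes Python 3.8 or newer (several posters in this thread are on 3.6 or 3.7, where `pip3 list` is the simpler route). The helper name is hypothetical:

```python
from importlib import metadata  # available in Python 3.8+


PACKAGES = (
    "torch",
    "torchvision",
    "torch-scatter",
    "torch-sparse",
    "torch-cluster",
    "torch-spline-conv",
    "torch-geometric",
)


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name:<18} {version or 'not installed'}")
```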

First check this:

  1. cd to /usr/local/cuda
  2. find . -name "libcus*"
    If you don't have libcusparse.so.10 (maybe you find libcusparse.so.10.0 or libcusparse.so.11, etc.),
    try installing the corresponding CUDA Toolkit from NVIDIA.

I installed CUDA Toolkit 10.1 and changed the environment variables. It works.
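
The distinction matters because the loader looks for a file with exactly the requested name, so `libcusparse.so.10.0` or `libcusparse.so.11` does not satisfy a request for `libcusparse.so.10`. A sketch that classifies a directory listing accordingly; the function name and the three status strings are made up for illustration:

```python
def soname_status(file_names, wanted="libcusparse.so.10"):
    """Classify a directory listing against the exact name the loader wants.

    Returns "exact" if `wanted` is present, "other-version" if only other
    versions of the same library exist, and "missing" if none are found.
    """
    if wanted in file_names:
        return "exact"
    base = wanted.split(".so")[0] + ".so"  # e.g. "libcusparse.so"
    if any(name.startswith(base) for name in file_names):
        return "other-version"
    return "missing"


print(soname_status(["libcusparse.so", "libcusparse.so.11"]))
```

An "other-version" result corresponds to the case above where a different CUDA Toolkit needs to be installed.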

This saved me: https://medium.com/@exesse/cuda-10-1-installation-on-ubuntu-18-04-lts-d04f89287130

I followed your steps, and now I am getting this error:
OSError: libtorch_cpu.so: cannot open shared object file: No such file or directory

Can you help me to fix it?

PyTorch 1.4.0 is no longer supported, and it is recommended to update your PyTorch version. I suggest using PyTorch 1.6.0 (since wheels are not yet ready for PyTorch 1.7.0).

You can then install PyG as described here:
https://github.com/rusty1s/pytorch_geometric#pytorch-160

I followed your steps as described in: https://github.com/rusty1s/pytorch_geometric#pytorch-160

My setting is as follows:
torch 1.6.0+cu102
torch-cluster 1.5.8
torch-scatter 2.0.5
torch-sparse 0.6.8
torch-spline-conv 1.2.0
torchvision 0.7.0+cu102

Error: OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

Trace:
Traceback (most recent call last):
File "/home/Documents/Graph-master/code/Test.py", line 15, in
from inputsdata import MyOwnDataset
File "/home/Documents/Graph-master/code/inputsdata.py", line 11, in
from torch_geometric.data import Data, DataLoader, Dataset
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_geometric/__init__.py", line 2, in
import torch_geometric.nn
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_geometric/nn/__init__.py", line 2, in
from .data_parallel import DataParallel
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_geometric/nn/data_parallel.py", line 5, in
from torch_geometric.data import Batch
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_geometric/data/__init__.py", line 1, in
from .data import Data
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_geometric/data/data.py", line 7, in
from torch_sparse import coalesce, SparseTensor
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch_sparse/__init__.py", line 12, in
torch.ops.load_library(importlib.machinery.PathFinder().find_spec(
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/site-packages/torch/_ops.py", line 105, in load_library
ctypes.CDLL(path)
File "/home/anaconda2/envs/torch_vi_Work/lib/python3.8/ctypes/__init__.py", line 381, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

I tried every other method that worked for others listed in this thread but still this issue persists for me. @rusty1s Could you please help me resolve this? This is key to proceed with my work.

Thanks in advance!

libcusparse.so.xxx should be either contained in $CUDA_HOME/lib or in .../miniconda3/lib. In case it is only included in the latter, please add that path to LD_LIBRARY_PATH.
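
To act on this advice, one option is to look through the likely directories and print the matching `export` line. A sketch only: the helper name is hypothetical, and the candidate directories are guesses based on the locations named in this thread (`$CUDA_HOME` and a miniconda `lib` folder), so adjust them to the local setup:

```python
import os


def export_line_for(lib_name, candidate_dirs):
    """Return an `export LD_LIBRARY_PATH=...` line for the first directory
    in `candidate_dirs` that contains `lib_name`, or None if none does."""
    for directory in candidate_dirs:
        if os.path.exists(os.path.join(directory, lib_name)):
            return f"export LD_LIBRARY_PATH={directory}:$LD_LIBRARY_PATH"
    return None


# Assumed candidate locations; these are not guaranteed to exist.
cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
candidates = [
    os.path.join(cuda_home, "lib"),
    os.path.join(cuda_home, "lib64"),
    os.path.expanduser("~/miniconda3/lib"),
]
print(export_line_for("libcusparse.so.10", candidates))
```

A result of `None` means the library is in none of the listed directories, which points to a missing or different CUDA installation.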

I wasn't able to figure that out but I managed to get it running with the +cpu version instead.
Thanks for the response and this library!

Just an update:
The CUDA installation on my system had been accidentally deleted, which caused the error. I reinstalled CUDA 11 and then installed the corresponding PyTorch 1.7.0 as described in step 3 of the installation via binaries here: https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html
I no longer have this issue.
Thanks again!

I had this problem when using conda. I had installed pytorch and torchvision with pip; uninstalling them with pip and then installing them through conda instead solved the issue.
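
To see whether an environment mixes pip and conda installs, the installer recorded in each package's metadata can be read back. A sketch assuming Python 3.8 or newer; the helper name is hypothetical, and not every installer writes this record, so `None` only means "unknown":

```python
from importlib import metadata  # Python 3.8+


def installer_of(name):
    """Return the recorded installer of a distribution, or None if unknown."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None


for package in ("torch", "torchvision"):
    print(package, installer_of(package))
```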
