Pytorch3d: RuntimeError: CUDA error: invalid device function

Created on 15 Jun 2020  ·  8 Comments  ·  Source: facebookresearch/pytorch3d

After configuring CUDA (driver 440.82, CUDA 10.2, Ubuntu 18.04) and PyTorch (PyTorch 1.5 + PyTorch3D), I tested pytorch3d with the deform_source_mesh_to_target_mesh tutorial.

First, I get a warning when loading dolphin.obj (the provided mesh). The same warning appears for other test meshes. This may be related to JMingKuo's answer in #165.

pytorch3d/io/obj_io.py:70: UserWarning: Faces have invalid indices
warnings.warn("Faces have invalid indices")
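
To see whether a mesh file really contains out-of-range face indices, a small standalone check can help. The sketch below is not pytorch3d's own validation; it assumes the warning means that some face refers to a vertex that does not exist, and the function name `invalid_face_indices` is hypothetical.

```python
def invalid_face_indices(obj_text):
    """Return (line_number, raw_index) pairs for face indices outside the vertex list.

    Assumption: OBJ conventions, where positive indices are 1-based and
    negative indices count back from the vertices read so far.
    """
    num_verts = 0
    candidates = []  # (line_number, raw_index, vertices_seen_so_far)
    for line_no, line in enumerate(obj_text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            num_verts += 1
        elif parts[0] == "f":
            for token in parts[1:]:
                # A face token looks like "7", "7/2" or "7/2/5"; the first field is the vertex.
                raw = int(token.split("/")[0])
                candidates.append((line_no, raw, num_verts))

    bad = []
    for line_no, raw, seen in candidates:
        if raw > 0:
            ok = raw <= num_verts   # absolute index, checked against the full vertex count
        elif raw < 0:
            ok = -raw <= seen       # relative index, checked against vertices seen so far
        else:
            ok = False              # index 0 is not valid in OBJ
        if not ok:
            bad.append((line_no, raw))
    return bad
```

An empty result for a file would suggest the indices themselves are fine and the warning comes from somewhere else.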

The main issue is the following CUDA error:

RuntimeError Traceback (most recent call last)
in
----> 7 plot_pointcloud(src_mesh, "Source mesh")
8 plot_pointcloud(trg_mesh, "Target mesh")
in plot_pointcloud(mesh, title)
1 def plot_pointcloud(mesh, title=""):
2 # Sample points uniformly from the surface of the mesh.
----> 3 points = sample_points_from_meshes(mesh, 5000)
4 x, y, z = points.clone().detach().cpu().squeeze().unbind(1)
5 fig = plt.figure(figsize=(5, 5))
/.../pytorch3d/ops/sample_points_from_meshes.py in sample_points_from_meshes(meshes, num_samples, return_normals)
54 # Only compute samples for non empty meshes
55 with torch.no_grad():
---> 56 areas, _ = mesh_face_areas_normals(verts, faces) # Face areas can be zero.
57 max_faces = meshes.num_faces_per_mesh().max().item()
58 areas_padded = packed_to_padded(
/.../pytorch3d/ops/mesh_face_areas_normals.py in forward(ctx, verts, faces)
44 print(torch.isnan(faces).any())
45 ctx.save_for_backward(verts, faces)
---> 46 areas, normals = _C.face_areas_normals_forward(verts, faces)
47 return areas, normals

RuntimeError: CUDA error: invalid device function

I have read some related issues; most of those errors were caused by NaN values in the input mesh. In my case, however, I am just using the provided dolphin.obj.

great-firewall question

All 8 comments

I think this is unrelated to the values in the inputs. This is hardware specific: I think there is a mismatch between the compute capability of your GPU device and the compute capabilities for which pytorch3d has been built.

Did you build pytorch3d yourself or are you using a conda package? What GPU are you using?
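
A rough way to reason about such a mismatch is to compare the GPU's compute capability with the architectures the extension was compiled for. The sketch below assumes architecture strings in the plain `sm_XY` / `compute_XY` form used by nvcc's `-gencode` option and applies the usual CUDA rule: an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version, while `compute_XY` PTX can be JIT-compiled for any equal or higher capability. The function names are hypothetical, and how to obtain the list of built architectures depends on the build.

```python
from typing import Iterable, Tuple


def arch_covers_device(arch: str, capability: Tuple[int, int]) -> bool:
    """Return True if code built for `arch` can run on a device with `capability`."""
    kind, _, digits = arch.partition("_")
    built = (int(digits[:-1]), int(digits[-1]))  # "75" -> (7, 5)
    if kind == "sm":
        # Binary code: same major version, device minor >= built minor.
        return built[0] == capability[0] and built[1] <= capability[1]
    if kind == "compute":
        # PTX: can be JIT-compiled for any equal or newer capability.
        return built <= tuple(capability)
    raise ValueError(f"unrecognised architecture string: {arch!r}")


def build_covers_device(archs: Iterable[str], capability: Tuple[int, int]) -> bool:
    """Return True if at least one built architecture can run on the device."""
    return any(arch_covers_device(arch, capability) for arch in archs)


# Example: a build containing only Pascal binaries cannot run on a (7, 5) device.
# The capability tuple could come from torch.cuda.get_device_capability(0).
print(build_covers_device(["sm_60", "sm_61"], (7, 5)))
```

If no built architecture covers the device, "invalid device function" is the kind of error one would expect at the first kernel launch.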

@bottler Thanks for your reply. My environment is Ubuntu 18.04, GPU driver 440.82, CUDA 10.2, and PyTorch 1.5, with several Titan RTXs. I installed pytorch3d following the official instructions.

conda create -n pytorch3d python=3.8
conda activate pytorch3d
conda install -c pytorch pytorch torchvision cudatoolkit=10.2
conda install -c conda-forge -c fvcore fvcore
...
git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d && pip install -e .

I built from source. Is everything correct?

I think you have installed correctly, so I don't know exactly what's wrong. Maybe a different set of cuda tools is being found. I think you have compute capability 7.5 so you could try the build with

NVCC_FLAGS="-gencode=arch=compute_75,code=sm_75" pip install -e .
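
The same flag can be derived from the capability tuple rather than typed by hand. This is a minimal sketch: the `NVCC_FLAGS` variable and the flag format come from the command above, while the helper names `gencode_flag` and `build_env` are hypothetical.

```python
import os


def gencode_flag(capability):
    """Format a (major, minor) compute capability as an nvcc -gencode flag."""
    major, minor = capability
    cc = f"{major}{minor}"
    return f"-gencode=arch=compute_{cc},code=sm_{cc}"


def build_env(capability, base_env=None):
    """Return a copy of the environment with NVCC_FLAGS set for the given capability."""
    env = dict(os.environ if base_env is None else base_env)
    env["NVCC_FLAGS"] = gencode_flag(capability)
    return env


print(gencode_flag((7, 5)))
```

The resulting environment could then be passed to the build, for example with `subprocess.run([sys.executable, "-m", "pip", "install", "-e", "."], env=build_env((7, 5)), check=True)` from the pytorch3d checkout.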

You could also try the prebuilt conda packages; it would be interesting to know if they work.

@bottler Sadly, I have tried hard to fix this error, without success.
rm -rf build/ **/*.so
pip uninstall pytorch3d # I found the previous line doesn't clean everything
NVCC_FLAGS="-gencode=arch=compute_75,code=sm_75" pip install -e .
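
One possible reason the `rm` line does not clean everything is that in bash `**` only recurses when the `globstar` option is enabled; otherwise it behaves like `*`. The sketch below lists what a clean rebuild would need to remove, without deleting anything. It assumes the leftovers are a top-level `build/` directory plus compiled `.so` files anywhere in the checkout; the function name is hypothetical.

```python
from pathlib import Path


def find_build_leftovers(root):
    """List the build/ directory and any compiled .so files under `root`."""
    root = Path(root)
    build_dir = root / "build"
    leftovers = [build_dir] if build_dir.is_dir() else []
    for so_file in sorted(root.rglob("*.so")):
        # Files inside build/ disappear with the directory, so skip them.
        if build_dir not in so_file.parents:
            leftovers.append(so_file)
    return leftovers


if __name__ == "__main__":
    for path in find_build_leftovers("."):
        print(path)
```

Reviewing the printed list before deleting is safer than a recursive glob, since a checkout may contain `.so` files that are not build products.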

The build itself succeeds:
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Obtaining file:///data/pytorch3d
Requirement already satisfied: torchvision>=0.4 in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from pytorch3d==0.2.0) (0.6.0a0+82fd1c8)
Requirement already satisfied: fvcore in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from pytorch3d==0.2.0) (0.1.1.post20200616)
Requirement already satisfied: numpy in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from torchvision>=0.4->pytorch3d==0.2.0) (1.18.1)
Requirement already satisfied: torch in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from torchvision>=0.4->pytorch3d==0.2.0) (1.5.0)
Requirement already satisfied: pillow>=4.1.1 in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from torchvision>=0.4->pytorch3d==0.2.0) (7.1.2)
Requirement already satisfied: tqdm in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (4.46.1)
Requirement already satisfied: termcolor>=1.1 in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (1.1.0)
Requirement already satisfied: pyyaml>=5.1 in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (5.3.1)
Requirement already satisfied: portalocker in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (1.7.0)
Requirement already satisfied: tabulate in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (0.8.7)
Requirement already satisfied: yacs>=0.1.6 in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from fvcore->pytorch3d==0.2.0) (0.1.7)
Requirement already satisfied: future in /data/Miniconda3/envs/pytorch3d/lib/python3.8/site-packages (from torch->torchvision>=0.4->pytorch3d==0.2.0) (0.18.2)
Installing collected packages: pytorch3d
Attempting uninstall: pytorch3d
Found existing installation: pytorch3d 0.2.0
Uninstalling pytorch3d-0.2.0:
Successfully uninstalled pytorch3d-0.2.0
Running setup.py develop for pytorch3d
Successfully installed pytorch3d

The import part of the tutorial works fine.
However, the error persists: the same warning and the same CUDA error for mesh operations. T_T
I also changed the CUDA device from 0 to 1 and still get the same result.

I cannot install with conda either, for example:
conda install pytorch3d -c pytorch3d

Collecting package metadata (current_repodata.json): failed
UnavailableInvalidChannel: The channel is not accessible or is invalid.
channel name: pytorch3d
channel url: https://mirrors.tuna.tsinghua.edu.cn/anaconda/pytorch3d
error code: 404
You will need to adjust your conda configuration to proceed.
Use conda config --show channels to view your configuration's current state,
and use conda config --show-sources to view config file locations.

It seems to be a network connection error. Do you think the conda install method could be a solution?

Your problem with conda install is a network problem. I don't know much about how to deal with such problems, but it might be worth looking at other issues to see if they help.

For debugging the install, can you print the output of this?

import torch
print(torch.cuda.get_device_properties(0))
print(torch.cuda.get_device_capability(0))

If you just want to experiment a little with pytorch3d, maybe it would be easiest to keep everything on the CPU? Hopefully everything will work easily, if a little slowly.
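
Keeping everything on the CPU mostly comes down to choosing the device string in one place. A minimal sketch, assuming the tutorial creates its device once and moves tensors and meshes to it; the helper name `choose_device` and its parameters are hypothetical.

```python
def choose_device(cuda_available: bool, force_cpu: bool = False, index: int = 0) -> str:
    """Return a device string suitable for torch.device()."""
    if force_cpu or not cuda_available:
        return "cpu"
    return f"cuda:{index}"


# Intended use (requires torch, so it is shown as comments only):
#   import torch
#   device = torch.device(choose_device(torch.cuda.is_available(), force_cpu=True))
#   verts = verts.to(device)
print(choose_device(cuda_available=True, force_cpu=True))
```

With `force_cpu=True` no CUDA kernels are launched, so the "invalid device function" error should not be reachable, at the cost of speed.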

@bottler I can handle the conda network error. And thanks for the advice to use the CPU only; it is a straightforward but effective workaround. I still need to fix the GPU build for future work, though.

The output from torch is below:
print(torch.cuda.get_device_properties(0))

_CudaDeviceProperties(name='TITAN RTX', major=7, minor=5, total_memory=24220MB, multi_processor_count=72)

print(torch.cuda.get_device_capability(0))

(7, 5)

I am fixing the network connection for conda install pytorch3d -c pytorch3d. Hope it works!

@bottler Hi, after changing the conda channel, I succeeded in running
conda install pytorch3d -c pytorch3d, and the CUDA error has finally disappeared.
Btw, the warning about invalid face indices still appears. I think it could be a bug in the warning itself.

pytorch3d/io/obj_io.py:70: UserWarning: Faces have invalid indices
warnings.warn("Faces have invalid indices")
Anyway, thanks for your attention and help! Great work!

I am reluctant to close this issue, because I still have not found a way to fix the CUDA error when building from source (pip install -e .). Maybe there is still some unknown configuration problem.
My own problem has been solved, though. You can choose to keep this open or close it. Thanks!

@ChenFengYe thanks for reporting back! The faces have invalid indices warning is unrelated to this issue and is being fixed separately!
