First of all, thank you for your great work!
When I tried to run the tutorial on deforming a sphere into a dolphin, I hit an unexpected error when moving the vertices of the meshes to the device, which was set to CUDA:0. Here is the error log:
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp line=50 error=30 : unknown error
Traceback (most recent call last):
File "dolphin.py", line 33, in
faces_idx = faces.verts_idx.to(device)
File "/home/jormungandr/anaconda3/envs/pytorch3d/lib/python3.6/site-packages/torch/cuda/__init__.py", line 197, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/THCGeneral.cpp:50
I tried to rmmod nvidia and nvidia-uvm, but each of these commands failed with an error, either
rmmod: ERROR: Module nvidia_uvm is not currently loaded
or
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset
I also rebooted once, but nothing changed.
My environment is as follows:
PyTorch : 1.4
Python : 3.6.10
CUDA : 10.0 by nvcc(runtime)
cuDNN : 7.0
OS : Ubuntu 18.04
@zzhat0706 I am assuming you built PyTorch3D from a local clone? Were you able to run the tutorial if you change the device to cpu? Are you able to run other PyTorch code on the GPU, i.e. without PyTorch3D? This looks like an issue with PyTorch, not PyTorch3D.
There's an issue on the PyTorch repo which references this problem. Did you check it? https://github.com/pytorch/pytorch/issues/17108
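To separate a PyTorch problem from a PyTorch3D problem, something like the following could be run in the same environment. This is only a rough sketch: `cuda_report` is a hypothetical helper name, and it assumes the GPU in question is `cuda:0`. It imports nothing from PyTorch3D, so any error it records comes from PyTorch or the CUDA setup itself.

```python
def cuda_report():
    """Collect basic CUDA diagnostics using PyTorch only (no PyTorch3D)."""
    report = {
        "torch": None,            # installed PyTorch version
        "built_with_cuda": None,  # CUDA version PyTorch was built against
        "cuda_available": False,  # result of torch.cuda.is_available()
        "error": None,            # message of the first failure, if any
    }
    try:
        import torch
    except ImportError as err:
        report["error"] = f"torch not importable: {err}"
        return report

    report["torch"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda
    try:
        report["cuda_available"] = torch.cuda.is_available()
        if report["cuda_available"]:
            # Moving a tensor triggers CUDA initialisation; the addition
            # launches a small kernel and .item() waits for the result.
            x = torch.zeros(1).to("cuda:0")
            (x + 1).sum().item()
    except (RuntimeError, AssertionError) as err:
        report["error"] = str(err)
    return report
```

Calling `print(cuda_report())` and posting the output would show whether the failure already happens without PyTorch3D.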
@nikhilaravi Thanks for your answers!
Before PyTorch3D, I ran other PyTorch code such as CycleGAN and W-GAN. I installed PyTorch3D from Anaconda Cloud, not from a local clone.
I haven't tried the CPU yet; I'll check it later.
Hello,
I did all the things mentioned above but I still get the error:
RuntimeError Traceback (most recent call last)
23
24 # Create a textures object
---> 25 tex = Textures(verts_uvs=verts_uvs, faces_uvs=faces_uvs, maps=texture_image)
26
27 # Create a meshes object with textures
~/Disk/Software/Anaconda3/envs/pytorch3d/lib/python3.7/site-packages/pytorch3d/structures/textures.py in __init__(self, maps, faces_uvs, verts_uvs, verts_rgb)
118
119 if self._faces_uvs_padded is not None:
--> 120 self._num_faces_per_mesh = faces_uvs.gt(-1).all(-1).sum(-1).tolist()
121
122 def clone(self):
RuntimeError: CUDA error: no kernel image is available for execution on the device
I downgraded the Nvidia driver but the error persists:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 00000000:04:00.0 On | N/A |
| 26% 32C P8 14W / 250W | 642MiB / 6082MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
Surprisingly, all the tests in the test folder passed, but the following error occurs in the notebook tutorials:
RuntimeError: CUDA error: no kernel image is available for execution on the device
I can't see the name of the GPU you are using due to truncation. ("GeForce GTX TIT...") Is it a TITAN X? If it is one of the other TITANs, then I think it will have compute capability 3.5 and so need a local build of pytorch.
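Since nvidia-smi truncates the name, PyTorch can be asked for the full device name and the compute capability directly. The snippet below is a hedged sketch: `describe_gpu` and `needs_source_build` are hypothetical helper names, and the minimum capability of 3.7 for the prebuilt binaries is an assumption, not something confirmed in this thread.

```python
# Assumption (not stated in the thread): prebuilt PyTorch 1.4 binaries target
# compute capability 3.7 and newer. Adjust this if your build differs.
ASSUMED_MIN_BINARY_CAPABILITY = (3, 7)


def needs_source_build(capability, minimum=ASSUMED_MIN_BINARY_CAPABILITY):
    """Return True if a (major, minor) capability is below the assumed minimum."""
    return tuple(capability) < tuple(minimum)


def describe_gpu(index=0):
    """Return the full device name and compute capability of one GPU."""
    import torch  # imported lazily; needs a working CUDA setup

    name = torch.cuda.get_device_name(index)
    capability = torch.cuda.get_device_capability(index)  # (major, minor)
    return name, capability
```

For example, `name, cap = describe_gpu()` followed by `print(name, cap, needs_source_build(cap))` would print the untruncated name. Under the assumed threshold, a card with compute capability 3.5 would be reported as needing a local build.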
Hello,
It is a GTX Titan Black card.
Ubuntu 18.04 now offers many Nvidia drivers (440, 435, 410). Which driver version is required to build PyTorch3D?
On Wed, Feb 19, 2020 at 3:27 AM Jeremy Reizenstein notifications@github.com wrote:
I can't see the name of the GPU you are using due to truncation. ("GeForce GTX TIT...") Is it a TITAN X? If it is one of the other TITANs, then I think it will have compute capability 3.5 and so need a local build of pytorch.
I don't think the driver matters, but you might as well use the latest driver which is working with your GPU. If nvidia-smi is working then you probably have a working driver (although I am not sure about this).
The problem is that the PyTorch builds available from conda (except old versions, which we don't support) will not work with your GPU. You will need to set up a new conda environment and follow the instructions at https://github.com/pytorch/pytorch#from-source for your GPU. I suggest you check out the branch v1.4. I think you will then need to install torchvision from source as well, and then install PyTorch3D from source.
(If all this sounds too hard, and you just want to get a feel for the tutorials, and you are not expecting that you will be using pytorch3d much on your computer, maybe you can run them on colab instead. Alternatively, if you install just pytorch3d from github, then you may be able to run the tutorial entirely on the CPU - just change device = torch.device("cuda:0") to device = torch.device("cpu") in the tutorial.)
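Instead of editing the device line by hand, the tutorial could try the GPU first and fall back to the CPU. This is a minimal sketch and not part of the tutorial: `cuda_probe` and `pick_device_name` are hypothetical names, and the probe assumes that running one tiny kernel on `cuda:0` is enough to surface errors such as "no kernel image is available".

```python
def cuda_probe():
    """Run one tiny CUDA kernel; raises if the GPU cannot be used."""
    import torch  # imported lazily so pick_device_name can be tested without it

    x = torch.zeros(2, device="cuda:0")
    (x + 1).sum().item()  # .item() waits for the kernel to finish


def pick_device_name(probe=cuda_probe):
    """Return "cuda:0" if the probe succeeds, otherwise "cpu"."""
    try:
        probe()
    except (RuntimeError, AssertionError, ImportError) as err:
        print(f"Falling back to CPU: {err}")
        return "cpu"
    return "cuda:0"
```

In the tutorial this would be used as `device = torch.device(pick_device_name())`. Note that the CPU fallback only avoids the crash; it does not fix the underlying GPU build problem.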
@zzhat0706, @shersoni610 were you able to resolve this installation issue? If so, please share what you did here for others to replicate!
@nikhilaravi Sorry for the late reply, I eventually chose to run on nvidia-docker, then everything just worked out perfectly!
Thx for your great work again!