https://colab.research.google.com/drive/19irKL4JTWyyPG70QGsihn9vx5966zQq8
It worked fine a while ago.
Same here. FYI, when I omit --global-option="--cpp_ext" --global-option="--cuda_ext", it worked.
The default PyTorch version in Google Colab has been updated to 1.3.1, which is compiled with CUDA 10.1.
However, the CUDA toolkit in the Colab runtime is 10.0.130 (you can check it with this command: !cat /usr/local/cuda/version.txt).
So, when you install apex with the extension options, you may see the following error message (run the pip command without -q to see it):
...
Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
from /usr/local/cuda/bin
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-_saohd3c/setup.py", line 100, in <module>
check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
File "/tmp/pip-req-build-_saohd3c/setup.py", line 77, in check_cuda_torch_binary_vs_bare_metal
"https://github.com/NVIDIA/apex/pull/323#discussion_r287021798. "
RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. Pytorch binaries were compiled with Cuda 10.1.243.
In some cases, a minor-version mismatch will not cause later errors: https://github.com/NVIDIA/apex/pull/323#discussion_r287021798. You can try commenting out this check (at your own risk).
...
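The comparison behind this error can be reproduced by hand before installing apex. The following is a minimal sketch, not apex's actual code: the helper names (nvcc_release, torch_major_minor, cuda_versions_match) are hypothetical, and the sample strings are copied from the log above.

```python
import re


def nvcc_release(nvcc_output: str) -> str:
    """Extract the 'major.minor' release from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return "{}.{}".format(match.group(1), match.group(2))


def torch_major_minor(torch_cuda: str) -> str:
    """Reduce a full version such as '10.1.243' to '10.1'."""
    return ".".join(torch_cuda.split(".")[:2])


def cuda_versions_match(torch_cuda: str, nvcc_output: str) -> bool:
    """True when PyTorch's CUDA version and the local nvcc agree on major.minor."""
    return torch_major_minor(torch_cuda) == nvcc_release(nvcc_output)


# Sample line taken from the error log above (local toolkit is 10.0).
NVCC_OUTPUT = "Cuda compilation tools, release 10.0, V10.0.130"

print(cuda_versions_match("10.1.243", NVCC_OUTPUT))  # False: the mismatch in the log
print(cuda_versions_match("10.0.130", NVCC_OUTPUT))  # True: a cu100 build would match
```

In a Colab cell, the first argument would come from torch.version.cuda and the second from the output of nvcc -V, instead of the hard-coded sample strings used here.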
To solve this problem, you can install another version of PyTorch that was compiled with CUDA 10.0.
In my case, I chose torch 1.2.0 + torchvision 0.4.0:
!pip install https://download.pytorch.org/whl/cu100/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl && pip install https://download.pytorch.org/whl/cu100/torchvision-0.4.0-cp36-cp36m-manylinux1_x86_64.whl
Or you can pick the version you want from the official archives (choose links with the prefix cu100 and the suffix cp36-cp36m-manylinux1_x86_64): https://download.pytorch.org/whl/torch_stable.html
After reinstalling PyTorch, you can install apex again (there is no need to omit the extension options --global-option="--cpp_ext" --global-option="--cuda_ext"), and it should work.
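The prefix/suffix rule for choosing a wheel from the archive page can be written as a small filter. This is only a sketch: pick_wheels is a hypothetical helper, the first two entries are the wheels used above, and the last two entries are illustrative names added to show what gets rejected.

```python
def pick_wheels(links, cuda_tag="cu100", py_tag="cp36-cp36m-manylinux1_x86_64"):
    """Keep links under the CUDA prefix that end with the Python/platform suffix."""
    prefix = cuda_tag + "/"
    suffix = py_tag + ".whl"
    return [link for link in links if link.startswith(prefix) and link.endswith(suffix)]


links = [
    "cu100/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl",
    "cu100/torchvision-0.4.0-cp36-cp36m-manylinux1_x86_64.whl",
    "cu92/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl",   # illustrative: wrong CUDA prefix
    "cu100/torch-1.2.0-cp37-cp37m-manylinux1_x86_64.whl",  # illustrative: wrong Python tag
]

selected = pick_wheels(links)
for link in selected:
    # Each match is relative to the download host used in the pip command above.
    print("https://download.pytorch.org/whl/" + link)
```

Only the two cu100 + cp36 entries are printed, which are the same wheels installed by the pip command above.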
downgrading and restarting the runtime works fine! thx
https://colab.research.google.com/drive/1drodd29aL2B8ufcb0gwrBBhGDvPBaDha
I guess you can close this issue
My colab has CUDA 10.1.243. What pytorch version should I install?
Hi @kushagra1198, apex should now work without any further configuration.
Currently, PyTorch on Colab is also compiled with CUDA 10.1.243, matching the runtime. You can check the runtime's nvcc version with the following code snippet:
# just some code taken from `apex/setup.py`
import subprocess, torch
from torch.utils.cpp_extension import CUDAExtension
cuda_dir = torch.utils.cpp_extension.CUDA_HOME
print(subprocess.check_output([cuda_dir + "/bin/nvcc", "-V"]))
# you should see something like this:
# b'nvcc: NVIDIA (R) Cuda compiler driver\nCopyright (c) 2005-2019 NVIDIA Corporation\nBuilt on Sun_Jul_28_19:07:16_PDT_2019\nCuda compilation tools, release 10.1, V10.1.243\n'
Feel free to let me know if it doesn't work.