Apex: Can't install on Google Colab

Created on 24 Nov 2019 · 5 Comments · Source: NVIDIA/apex

https://colab.research.google.com/drive/19irKL4JTWyyPG70QGsihn9vx5966zQq8

It worked fine a while ago.

hadaev8

Most helpful comment

The default PyTorch version in Google Colab has been updated to 1.3.1, which is compiled with CUDA 10.1.
However, the CUDA toolkit in the Colab runtime is 10.0.130 (you can check this with the command !cat /usr/local/cuda/version.txt).

So, when you install apex, you may see the following error message (run the pip command without -q):

...
Compiling cuda extensions with
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sat_Aug_25_21:08:01_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.130
    from /usr/local/cuda/bin

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-req-build-_saohd3c/setup.py", line 100, in <module>
        check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
      File "/tmp/pip-req-build-_saohd3c/setup.py", line 77, in check_cuda_torch_binary_vs_bare_metal
        "https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.  "
    RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 10.1.243.
    In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.  You can try commenting out this check (at your own risk).
...
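The error comes down to comparing two version strings: the CUDA version PyTorch was built with and the release reported by the local nvcc. Below is a minimal sketch of that comparison, not apex's actual code; the function names `parse_nvcc_release` and `cuda_versions_match` are hypothetical, and the input values are copied from the log above.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(torch_cuda_version, nvcc_output):
    """Compare major.minor of the PyTorch build against the local nvcc."""
    parts = torch_cuda_version.split(".")[:2]
    torch_major, torch_minor = (int(p) for p in parts)
    return (torch_major, torch_minor) == parse_nvcc_release(nvcc_output)


# Values taken from the error log above
nvcc_text = "Cuda compilation tools, release 10.0, V10.0.130"
print(cuda_versions_match("10.1.243", nvcc_text))  # False
```

On a live runtime you could pass `torch.version.cuda` and the captured output of `nvcc -V` instead of the hard-coded strings.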

To solve this problem, you can install a version of PyTorch that was compiled with CUDA 10.0, matching the runtime.
In my case, I chose torch 1.2.0 + torchvision 0.4.0:

!pip install https://download.pytorch.org/whl/cu100/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl && pip install https://download.pytorch.org/whl/cu100/torchvision-0.4.0-cp36-cp36m-manylinux1_x86_64.whl

Alternatively, pick the version you want from the official archive (choose links with the prefix cu100 and the suffix cp36-cp36m-manylinux1_x86_64): https://download.pytorch.org/whl/torch_stable.html
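The wheel links follow a regular pattern, so a URL can be assembled from its parts. This is only a sketch based on the two links in this comment; the helper `wheel_url` and its parameter names are hypothetical, and not every combination of version and tag exists in the archive, so confirm the link on the archive page before installing.

```python
BASE = "https://download.pytorch.org/whl"


def wheel_url(package, version, cuda_tag="cu100",
              py_tag="cp36-cp36m", platform="manylinux1_x86_64"):
    """Build a wheel URL following the pattern of the links in this comment."""
    return f"{BASE}/{cuda_tag}/{package}-{version}-{py_tag}-{platform}.whl"


print(wheel_url("torch", "1.2.0"))
# https://download.pytorch.org/whl/cu100/torch-1.2.0-cp36-cp36m-manylinux1_x86_64.whl
```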

After reinstalling PyTorch, you can install apex again, and it should work. You don't need to omit the extension options --global-option="--cpp_ext" --global-option="--cuda_ext".

NaleRaphael on 10 Dec 2019

👍5 🎉1

All 5 comments

Same here. FYI, it worked when I omitted --global-option="--cpp_ext" --global-option="--cuda_ext".

bamps53 on 25 Nov 2019

Downgrading and restarting the runtime works fine! Thanks.
https://colab.research.google.com/drive/1drodd29aL2B8ufcb0gwrBBhGDvPBaDha

I guess you can close this issue

henrique on 19 Jan 2020

My Colab runtime has CUDA 10.1.243. Which PyTorch version should I install?

kushagra1198 on 25 Jun 2020

Hi @kushagra1198, apex should now work without any further configuration.

PyTorch on Colab is currently also compiled with CUDA 10.1.243. You can check the runtime's nvcc version with the following code snippet:

# Adapted from `apex/setup.py`: print the version of the nvcc that PyTorch detects
import subprocess
from torch.utils.cpp_extension import CUDA_HOME

# CUDA_HOME is the CUDA install directory found by PyTorch
print(subprocess.check_output([CUDA_HOME + "/bin/nvcc", "-V"]))

# You should see something like this:
# b'nvcc: NVIDIA (R) Cuda compiler driver\nCopyright (c) 2005-2019 NVIDIA Corporation\nBuilt on Sun_Jul_28_19:07:16_PDT_2019\nCuda compilation tools, release 10.1, V10.1.243\n'

Feel free to let me know if it doesn't work.

NaleRaphael on 25 Jun 2020
