Ray: Error importing ray in colab due to pyarrow

Created on 25 Jun 2019 · 5 Comments · Source: ray-project/ray

System information

  • colab
  • pip install ray[rllib]
  • python 3.7

Describe the problem

The steps are simple: pip install Ray on Google Colab, then import it:

!pip install tensorflow  # or tensorflow-gpu
!pip install ray[rllib]  # also recommended: ray[debug]
import ray

The import fails with:

ImportError: Ray must be imported before pyarrow because Ray requires a specific version of pyarrow (which is packaged along with Ray).

Recreate

Take a look at this gist running on colab

Most helpful comment

Thanks for reporting this! This error happens because Colab preloads certain libraries (more on this below). To fix it, run the following:

!pip install tensorflow  # or tensorflow-gpu
!pip install ray[rllib]  # also recommended: ray[debug]
!pip uninstall -y pyarrow

Then click Runtime > Restart runtime. After that, the Ray cell will work.

As for why Colab's preloading is problematic: Ray depends on a library called pyarrow for serialization and a few other things. Because TensorFlow's binary package is not compliant with the manylinux1 standard [1], we need to ship a custom version of pyarrow that is compatible with TensorFlow. Until TensorFlow is fixed, we don't really have a choice here, so sorry about that.

[1] https://groups.google.com/a/tensorflow.org/d/msg/developers/TMqRaT-H2bI/J1Wiu8_XCQAJ
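
One way to see why the restart matters: uninstalling a package removes it from disk, but a module that has already been imported stays in memory until the process restarts. The following is a minimal diagnostic sketch, not part of Ray or Colab; the helper name `module_status` is hypothetical. It reports separately whether a module is loaded in the current process and whether it can still be found on `sys.path`.

```python
import importlib
import importlib.machinery
import sys


def module_status(name):
    """Report whether `name` is loaded in this process and findable on disk.

    Hypothetical helper for illustration only.
    """
    # A module that was imported earlier stays in sys.modules until restart.
    loaded = name in sys.modules
    # Drop cached directory listings so a recent uninstall is noticed.
    importlib.invalidate_caches()
    # PathFinder searches sys.path directly and ignores sys.modules.
    on_disk = importlib.machinery.PathFinder.find_spec(name) is not None
    return {"loaded": loaded, "on_disk": on_disk}


print(module_status("pyarrow"))
```

Assuming Colab really has imported pyarrow at startup, running this after the uninstall but before the restart should show the module as still loaded even though it is no longer on disk, which is the state in which `import ray` keeps failing.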

All 5 comments

Ah yes, you are right. I had tried the uninstall command, but I had not restarted the runtime after uninstalling. That is the trick. Thank you for your quick response.

Can I leave it to you to decide whether you want to close the issue? Because it is obviously an issue, but like you said it is out of your control (unless you can remove the pyarrow dependency maybe?).

Yeah, we can leave it open for now, since it actually is an issue and might make it easier for people to find. If this happens broadly on colab, we can add a pointer in the error message on what to do. Dropping pyarrow is not really an option unfortunately because our fast serialization depends on it.
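
As an illustration of what such a pointer could look like, here is a hedged sketch of an import-order check whose error message includes the Colab workaround. This is not Ray's actual implementation; the function name `check_pyarrow_not_imported`, the `modules` parameter, and the `COLAB_HINT` wording are all assumptions made for this example.

```python
import sys

# Hypothetical hint text; the steps come from the workaround in this thread.
COLAB_HINT = (
    " On Google Colab, run `pip uninstall -y pyarrow`, restart the runtime, "
    "and then import ray again."
)


def check_pyarrow_not_imported(modules=None):
    """Raise ImportError if pyarrow was imported before this check runs.

    `modules` defaults to sys.modules; passing a dict makes the check testable.
    """
    modules = sys.modules if modules is None else modules
    if "pyarrow" in modules:
        raise ImportError(
            "Ray must be imported before pyarrow because Ray requires a "
            "specific version of pyarrow (which is packaged along with Ray)."
            + COLAB_HINT
        )
```

A package would call a check like this at the top of its `__init__.py`, before importing its bundled copy of the library.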

Just a note - the way we get around this on our tutorial notebooks is by using this block of code before running anything:

print("Setting up colab environment")
!pip uninstall -y -q pyarrow
!pip install -q ray[debug]

# A hack to force the runtime to restart, needed to include the above dependencies.
print("Done installing! Restarting via forced crash (this is not an issue).")
import os
os._exit(0)

Pyarrow is no longer vendored with Ray.
