Running Ray in a Python 3.7 or 3.8 virtual environment on Windows sometimes triggers an error.
This does not appear to occur with Python 3.6, nor with standard or Anaconda installations.
It is unclear whether this issue is related to #9083, but it might be.
Try this on a machine with multiple CPUs (e.g. 8) with Python 3.7 or 3.8.
Note that passing a low value for num_cpus may avoid triggering this error.
> python -m venv myenv
> myenv\Scripts\activate
> python -m pip install ray
> python -c "import ray; ray.init()"
[(pid=7956) F0623 14:08:16.723009 7956 4168 core_worker.cc:294] Check failed: assigned_port != -1 Failed to allocate a port for the worker. Please specify a wider port range using the '--min-worker-port' and '--max-worker-port' arguments to 'ray start'.
...
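The num_cpus workaround mentioned above could look roughly like the sketch below. The helper names and the limit of 2 are assumptions chosen for illustration; the thread does not say which value is low enough to avoid the error.

```python
import os


def capped_cpu_count(limit=2):
    """Return a CPU count no larger than `limit` (hypothetical helper).

    The limit of 2 is an assumed value; it is not taken from the report.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(available, limit))


def init_ray_with_few_cpus(limit=2):
    """Start Ray with a reduced num_cpus instead of all detected CPUs."""
    import ray  # imported lazily so capped_cpu_count works without Ray

    return ray.init(num_cpus=capped_cpu_count(limit))
```

Calling init_ray_with_few_cpus() in place of a bare ray.init() starts fewer workers at once, which may be why a low value sidesteps the port allocation failure.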
I get this same behavior on Linux if I attempt to schedule tasks immediately after init. It's not a problem if I wait.
2020-06-25 20:00:01,072 WARNING worker.py:1047 -- The actor or task with ID 45b95b1c8bd3a9c4ffffffff0100 is pending and cannot currently be scheduled. It requires {CPU: 1.000000} for execution and {CPU: 1.000000} for placement, but this node only has remaining {node:10.44.81.203: 1.000000}, {CPU: 96.000000}, {memory: 38.134766 GiB}, {object_store_memory: 13.134766 GiB}. In total there are 1 pending tasks and 0 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increase the resources available to this Ray cluster. You can ignore this message if this Ray cluster is expected to auto-scale.
2020-06-25 20:00:03,272 INFO (unknown file):0 -- gc.collect() freed 16 refs in 1.603849448962137 seconds
(pid=19312) F0625 20:00:18.973336 19312 19312 core_worker.cc:294] Check failed: assigned_port != -1 Failed to allocate a port for the worker. Please specify a wider port range using the '--min-worker-port' and '--max-worker-port' arguments to 'ray start'.
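The error message itself points at the '--min-worker-port' and '--max-worker-port' arguments to 'ray start'. A minimal sketch of starting a head node with an explicit range from Python follows. The port values 10000 and 10999 and the helper names are assumptions, and nobody in this thread has confirmed that widening the range avoids the failure.

```python
import subprocess


def build_ray_start_command(min_port, max_port):
    """Build a 'ray start' command line with an explicit worker port range."""
    if min_port > max_port:
        raise ValueError("min_port must not exceed max_port")
    return [
        "ray",
        "start",
        "--head",
        f"--min-worker-port={min_port}",
        f"--max-worker-port={max_port}",
    ]


def start_ray_head(min_port=10000, max_port=10999):
    """Run 'ray start' with the given range (port values are assumed)."""
    return subprocess.run(build_ray_start_command(min_port, max_port), check=True)
```

With a head node started this way, the driver script would connect to the running cluster rather than launching its own with a bare ray.init().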
@laxatives Is it also happening in the virtualenv?
No, this is on a notebook server with a dedicated k8s image. My problem was resolved after waiting a few seconds before attempting to use the server.
Hmm, the wait doesn't seem to consistently prevent the issue. It resolves after manually retrying to run a task/actor, but I'm not sure what's going on. I'm trying to run in a notebook, so it's possible I'm not doing a clean teardown between attempts.
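The wait-then-retry approach described above can be sketched as a small generic helper. This assumes the failure surfaces as a Python exception in the driver, which the thread does not confirm (the logged error is a fatal check inside a worker process). The function name, attempt count, and delay are all hypothetical.

```python
import time


def retry_with_delay(fn, attempts=3, delay_s=5.0, exceptions=(Exception,)):
    """Call fn() up to `attempts` times, sleeping `delay_s` between failures.

    Returns the first successful result, or re-raises the last error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s)  # give the cluster time to settle
    raise last_error
```

In a notebook this would wrap the first task or actor call, for example retry_with_delay(lambda: ray.get(my_task.remote())), where my_task stands in for any remote function.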
I assume there's a race condition that is triggered only in certain environments. This should be resolved when @mehrdadn fixes the Windows issue. I will bump up the priority level.
@mehrdadn Can you prioritize the fix for this issue?
@rkooo567 I'm not sure. I can try to diagnose it, but from the looks of it, there's a chance I might not be able to find a fix soon. I can get back to you after doing some more diagnosis, but my guess is someone more familiar with the Ray core would be able to handle this much faster than I could.
I managed to run this locally with ray.init(local_mode=True)
@lelayf Thanks heaps - this resolved this issue for me running Py 3.8 on Win10 in a venv.
I have the same issue. With "local_mode" it runs, but I seem to have major performance issues.
Does "local_mode" impact performance? (RLlib on one computer with 8 CPUs and 1 GPU)
@RocketRider Ray uses a single CPU when in local_mode. See #8359