Ray: [tune][rllib] Windows: GPU not recognized

Created on 27 Jun 2020 · 6 Comments · Source: ray-project/ray

What is the problem?

I'm getting ray.tune.error.TuneError: Insufficient cluster resources to launch trial.
I specified a GPU in my config, but Ray does not recognize my GPU (RTX 2080) and throws an error.
For now I can get past this by setting num_gpus: 0 in my config.

https://gist.github.com/juliusfrost/fa7ebbb8d1dfc66eea0bbc4babcbe5aa

Reproduction (REQUIRED)

git clone https://github.com/juliusfrost/rllib-tune-atari.git
cd rllib-tune-atari
pip install -r requirements.txt
python train.py --algo a2c
  • [x] I have verified my script runs in a clean environment and reproduces the issue.
  • [x] I have verified the issue also occurs with the latest wheels.
P2 bug tune

All 6 comments

9114 @mehrdadn

@mehrdadn Did you get a chance to look at this? I updated the script to reproduce to remove external dependencies.

@juliusfrost I don't have the full CUDA setup available to test this right now, but perhaps try running this after import ray and let me know if it changes anything?

import ray.resource_spec
import subprocess
ray.resource_spec._autodetect_num_gpus = lambda: len([line.rstrip() for line in subprocess.check_output(["WMIC", "PATH", "Win32_VideoController", "GET", "AdapterCompatibility"]).splitlines()[1:] if line.startswith(b"NVIDIA")])
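
For readability, the one-liner above can be split into a parsing step and a query step. This is only a sketch of the same idea, not Ray's own detection code: the function names count_nvidia_adapters and autodetect_num_gpus_windows are hypothetical, and it assumes WMIC is available on the machine and prints one vendor name per row under an AdapterCompatibility header.

```python
import subprocess


def count_nvidia_adapters(wmic_output: bytes) -> int:
    """Count rows of WMIC output whose vendor name starts with NVIDIA.

    Hypothetical helper: it only parses bytes, so it can be checked
    on any platform without calling WMIC.
    """
    lines = wmic_output.splitlines()
    # The first line is the column header ("AdapterCompatibility"); skip it.
    # WMIC pads rows with spaces and may emit blank lines, so strip each row.
    return sum(1 for line in lines[1:] if line.strip().startswith(b"NVIDIA"))


def autodetect_num_gpus_windows() -> int:
    """Ask WMIC for the video controllers and count the NVIDIA ones.

    Assumes a Windows machine where the WMIC executable is on PATH.
    """
    output = subprocess.check_output(
        ["WMIC", "PATH", "Win32_VideoController", "GET", "AdapterCompatibility"]
    )
    return count_nvidia_adapters(output)
```

As in the snippet above, the result would be wired in with ray.resource_spec._autodetect_num_gpus = autodetect_num_gpus_windows after import ray. That attribute is a private name from the Ray version discussed here and may not exist in other versions. Note that this counts adapters by vendor name only, so it does not check whether CUDA can actually use them.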

@mehrdadn Awesome, it worked! Great job 👍

I'm still getting this issue. After applying the snippet posted by @mehrdadn, ray.resource_spec._autodetect_num_gpus() returns 1, but ray.get_gpu_ids() still returns an empty list [].

ray version: 0.9.0.dev0
OS: Windows 10

@wuisawesome Might this be due to _get_gpu_info_string not handling Windows?
