The title stands:
I'm a PhD student at the University of Maryland, College Park, and several of the GPU clusters in our department use the POWER architecture. POWER CPUs see enough use that Nvidia supports CUDA on them, and the new largest GPU-bearing supercomputer in the world uses POWER. Also note that POWER and x86-64 are the only CPU architectures CUDA supports, so this wouldn't mean going down a path of supporting a bunch of obscure architectures: POWER8 and POWER9 are the only other applicable architectures for this library. I'm aware that building from source is an option while pip support is lacking, but I'm having an issue there that I'm about to submit a separate bug report for. I'd of course be willing to test the library on our POWER systems.
I'm not too sure what's involved in supporting this platform, but if you run the following:
cd ray/python
python setup.py bdist_wheel
Does that work successfully? I think it should create a wheel under ray/python/dist/.
Does building from source work successfully on this platform?
Here's what happens with pip now, for reference:
pip install ray
Collecting ray
Could not find a version that satisfies the requirement ray (from versions: )
No matching distribution found for ray
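The empty "(from versions: )" list above is what pip prints when none of the published files fit the local interpreter and machine. As a rough sketch of why that can happen on POWER (this is not pip's real matching logic, and the wheel filenames are hypothetical examples rather than actual ray releases), the platform tag at the end of a wheel filename can be compared against the machine name reported by Python:

```python
import platform


def wheel_platform_tag(filename):
    """Return the platform tag, i.e. the last dash-separated field of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    stem = filename[:-len(".whl")]
    return stem.split("-")[-1]


def wheel_may_fit(filename, machine):
    """Rough check only: pure-Python wheels ('any') fit everywhere,
    otherwise the platform tag must end with the machine name."""
    tag = wheel_platform_tag(filename)
    return tag == "any" or tag.endswith("_" + machine)


# Hypothetical filename, used only for illustration.
example = "ray-0.0.0-cp27-cp27mu-manylinux1_x86_64.whl"

# platform.machine() typically reports "ppc64le" on little-endian POWER Linux
# and "x86_64" on most x86-64 Linux systems.
print("local machine:", platform.machine())
print("platform tag:", wheel_platform_tag(example))
print("fits ppc64le:", wheel_may_fit(example, "ppc64le"))
```

If only x86-64 wheels are published, a check like this fails for every file on a POWER machine, which would be consistent with the pip output shown here.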
I didn't try building the wheel, but I will next time I'm on that cluster if that would be helpful.
Here's the inexplicable error message I get when building from source (you replied before I could submit a separate bug):
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/subprocess.py", line 186, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['../build.sh', '-p', '/gpfs/pkgs/mhpcc/anaconda2-5.0.1/bin/python']' returned non-zero exit status 1
Cleaning up...
Command "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/bin/python -c "import setuptools, tokenize;__file__='/gpfs/home/tomg/versioned/ray/python/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps --user --prefix=" failed with error code 1 in /gpfs/home/tomg/versioned/ray/python/
Exception information:
Traceback (most recent call last):
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/basecommand.py", line 228, in main
status = self.run(options, args)
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/commands/install.py", line 335, in run
use_user_site=options.use_user_site,
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/req/req_install.py", line 738, in install
install_options, global_options, prefix=prefix,
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/req/req_install.py", line 900, in install_editable
show_stdout=False,
File "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/lib/python2.7/site-packages/pip/_internal/utils/misc.py", line 698, in call_subprocess
% (command_desc, proc.returncode, cwd))
InstallationError: Command "/gpfs/pkgs/mhpcc/anaconda2-5.0.1/bin/python -c "import setuptools, tokenize;__file__='/gpfs/home/tomg/versioned/ray/python/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" develop --no-deps --user --prefix=" failed with error code 1 in /gpfs/home/tomg/versioned/ray/python/
[<>@<> python]$ cat /gpfs/home/tomg/versioned/ray/python
cat: /gpfs/home/tomg/versioned/ray/python: Is a directory
[<>@<> python]$ cd /gpfs/home/tomg/versioned/ray/python
[<>@<> python]$ ls
asv.conf.json
benchmarks
build-wheel-macos.sh
build-wheel-manylinux1.sh
ray
ray.egg-info
README-benchmarks.rst
README-building-wheels.md
setup.py
I think the error is happening much earlier and isn't shown in the attached log. Can you share the full output of cd ray/python; pip install -e . --verbose?
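One way to capture the complete output, including whatever build.sh printed before the traceback, is to send stdout and stderr to a single log file. The sketch below is only an illustration: `run_and_log` and the log filename are hypothetical names, and the pip invocation in the comment simply mirrors the command requested above.

```python
import subprocess
import sys


def run_and_log(cmd, log_path, cwd=None):
    """Run cmd, writing stdout and stderr to log_path in order; return the exit code."""
    with open(log_path, "wb") as log:
        return subprocess.call(cmd, cwd=cwd, stdout=log, stderr=subprocess.STDOUT)


# Example invocation (not executed here), assuming the repository is cloned into ./ray:
#
#   code = run_and_log(
#       [sys.executable, "-m", "pip", "install", "-e", ".", "--verbose"],
#       "ray-build.log",
#       cwd="ray/python",
#   )
#   print("exit status:", code)
```

Since the failing step runs ../build.sh as a subprocess, the first error in the resulting log is likely to be more informative than the final traceback.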
Before building, could you rm -rf the ray directory and reclone it?
So it does officially work on POWER when building from source, after fighting with it for a while. Compiling .whl files for POWER8 and POWER9 is all that's needed now.
Nice! Can you share the steps you had to take to get it working?
Sorry, I forgot to reply. I just had to build it from source correctly for the machine and it worked.
As an update, Nvidia now supports ARM for CUDA as well, so support for that would be good too.