I was able to install PyTorch3D and use it on Oct 14. (My build pattern was slightly modified because I had to run it on a Linux PowerPC machine, but after sufficient configuration, pip install 'git+https://github.com/facebookresearch/pytorch3d.git@stable' would install the libraries correctly.)
However, it seems that the code pointed at by that instruction has since changed -- but I can't find what the old pointer would be.
I can't install nvidiacub through anaconda since it's not yet offered for linux-ppc64le, so it'd be nice to keep the old installation path working.
I'm happy to add the steps I used to get PyTorch building on ppc64le if those would be useful to have.
This is interesting.
The stable tag hasn't moved for a while. It points at the latest release, 0.2.5, and has done since that release on about 28 August. Nothing has changed in that code. Maybe a dependency has changed, though (e.g. pytorch and torchvision have had releases, so if you fetch them without a version number you'll get different versions). However, we do envisage a release fairly soon.
I have put the header files of nvidiacub in a linux-64 conda package in my personal channel: conda install -c bottler nvidiacub. There is no official conda package for cub as far as I could see. After all the commits from the last few hours, the idea is that a user should be able to install that package and then build pytorch3d, with setup.py picking up the headers. You would need a version of that package for your architecture. Maybe you can download the package https://anaconda.org/bottler/nvidiacub/1.10.0/download/linux-64/nvidiacub-1.10.0-0.tar.bz2, convert it with conda convert -p linux-ppc64le nvidiacub-1.10.0-0.tar.bz2, and then install the result. (It is just header files.)
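The download, convert, and install suggestion above could be sketched roughly as follows. This is an untested sketch, not a confirmed procedure: it assumes curl is available, that conda-build is installed (it provides conda convert), and that conda convert writes its result into a per-platform subdirectory of the output directory. The function names and the "converted" directory are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: nothing runs when this file is sourced; call convert_nvidiacub yourself.

# Hypothetical helper: where `conda convert -o DIR -p PLATFORM PKG` is expected
# to place the converted package (DIR/PLATFORM/PKG).
converted_pkg_path() {
    # $1 = output dir, $2 = target platform, $3 = package file
    printf '%s/%s/%s\n' "$1" "$2" "$(basename "$3")"
}

# Hypothetical wrapper around the three steps suggested in the thread.
convert_nvidiacub() {
    pkg=nvidiacub-1.10.0-0.tar.bz2
    url="https://anaconda.org/bottler/nvidiacub/1.10.0/download/linux-64/$pkg"

    # 1. Download the linux-64 package (header files only).
    curl -L -o "$pkg" "$url" || return 1

    # 2. Re-label it for ppc64le; "converted" is an arbitrary output directory.
    conda convert -p linux-ppc64le -o converted "$pkg" || return 1

    # 3. Install the converted tarball from the local path.
    conda install "$(converted_pkg_path converted linux-ppc64le "$pkg")"
}
```

Because the package contains only headers, relabeling the platform should be enough, but that is worth verifying on the target machine.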
I haven't thought about powerpc architecture, and yes, it would be interesting to know what kinds of changes you had to make.
Ah alright -- I immediately figured it was a problem with PyTorch3d as that's the step that threw the error but I'll look back at the other packages more closely! Thanks!
As far as the ppc64le - let me try to get it working again and if I manage to do so I'll come back with the steps. (From the first time, there were no changes to the PyTorch3d code, just to the environment and installations needed).
@bottler I managed to get it working -- not quite sure what the issue was. Maybe a weird config the first time I tried it.
Anyways, these were the steps I followed:
The IBM WMLCE environment has a lot of software for ppc64le -- I added it to my conda config to be able to install pytorch and other libraries:
conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/
I created a conda environment and installed PyTorch=1.5 in it (as it is the latest version supported by WMLCE Early Access).
export FORCE_CUDA=1
conda install -c conda-forge cxx-compiler (This might only be necessary for me - the server I was running on didn't have a recent version of a compiler).
conda install -c conda-forge -c fvcore fvcore
conda install -c anaconda cudatoolkit-dev=10.2
export CUDA_HOME=$CONDA_PREFIX
Up until here, nothing too special, but the last step before the pip install is important:
unset CC
I found that PyTorch3D and PyTorch both set the -ccbin flag in their setup scripts (at least this is what I gathered from reading through the code, though I'm of course less familiar with it than you are). Having $CC set results in a double definition of ccbin in the nvcc calls, which throws an "nvcc fatal : redefinition of compiler-bindir" error and a "which: no hipcc" error. Interestingly enough, hipcc doesn't suddenly appear when CC is unset (in fact, I don't think it could appear on ppc64le), but the error disappears, PyTorch3D is built, and I'm able to import it and use it as normal.
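For convenience, the steps above can be collected into one script. This is only a sketch of the recipe as described: it assumes a conda environment with PyTorch 1.5 is already created and activated, the -y flag is added here to skip conda's confirmation prompts, and the function names cc_is_clear and install_pytorch3d_ppc64le are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: nothing runs when this file is sourced; call
# install_pytorch3d_ppc64le from an activated conda environment.

# Hypothetical guard: succeeds only when CC is unset or empty.
cc_is_clear() {
    [ -z "${CC:-}" ]
}

install_pytorch3d_ppc64le() {
    # IBM WMLCE early-access channel, which carries ppc64le builds.
    conda config --prepend channels \
        https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda-early-access/ || return 1

    export FORCE_CUDA=1
    # May only be needed on machines without a recent compiler.
    conda install -y -c conda-forge cxx-compiler || return 1
    conda install -y -c conda-forge -c fvcore fvcore || return 1
    conda install -y -c anaconda cudatoolkit-dev=10.2 || return 1
    export CUDA_HOME="$CONDA_PREFIX"

    # The important step: a set CC reportedly leads to -ccbin being passed twice to nvcc.
    unset CC
    cc_is_clear || return 1

    pip install 'git+https://github.com/facebookresearch/pytorch3d.git@stable'
}
```

The unset comes after the conda installs on purpose: activating compiler packages can export CC again, so clearing it immediately before the pip step is the safer order.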
Thank you for sharing your experience!
This procedure works smoothly for me with the combination of CUDA 10.2, PyTorch 1.5, and PyTorch3D 0.3.0. Thanks a lot for these instructions. Just a minor note: CUDA 10.2 does not support GCC > 8, so I needed to install:
conda install -c conda-forge gcc_linux-ppc64le==8.2.0
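A quick way to catch a compiler that is too new before starting the build is to compare the GCC major version against the limit mentioned above. This is a minimal sketch: the helper names are hypothetical, and the threshold of 8 is taken from the note in this thread rather than checked against NVIDIA's documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the major component of a version string like "8.2.0".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical check: succeeds when the given GCC version is at most 8,
# the limit reported in this thread for CUDA 10.2.
gcc_ok_for_cuda_10_2() {
    [ "$(gcc_major "$1")" -le 8 ]
}

# Example use (not run here); substitute the compiler name your environment provides:
#   gcc_ok_for_cuda_10_2 "$(gcc -dumpversion)" || echo "GCC too new for CUDA 10.2" >&2
```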