Conda-forge.github.io: Documenting GPU builds

Created on 17 Oct 2019 · 12 Comments · Source: conda-forge/conda-forge.github.io

It would be good to document how users should enable GPU builds of packages.

Docs

All 12 comments

I navigated this for https://github.com/conda-forge/nvidia-apex-feedstock. Overall, the big challenge is that the outputs section sees limited use in current packages, so there is a general lack of experience in writing it and little understanding of what it executes and what needs to be included in it versus what can be left out. Documentation on the outputs section is available in general, but its use for this type of recipe specifically is unclear and could be clarified (e.g., the order of the build scripts, so that you skip the correct parts in each one). I am still a bit confused about the difference between run and run_constrained (and especially what run_constrained actually executes).

Also, the package I previously used for guidance was https://github.com/conda-forge/ucx-split-feedstock, which uses an install.sh script in the outputs section. Understanding when that is necessary and when it is not would be helpful, as it caused some confusion during the build.

In this package specifically, one challenge was handling the CUDA limitations for a package that can run on both Linux and Windows. Ultimately, the easy solution was to set the proc type to CPU for all Windows machines, but it took some troubleshooting (for example, simply skipping Windows for the CUDA compiler did not work).

Overall, the experience has been positive! I started the recipe when nvcc was only a PR, so I understand that my experience was a bit bumpier than needed. That said, along the way, I found some things confusing or surprising (it was also my first recipe beyond pure Python packages, so I learnt a lot!). The most relevant ones could be:

  • Why a metapackage is needed to choose the platform (CPU/GPU) and how this works
  • How can a user choose a specific platform or a CUDA version
  • The role of ocl-icd in OpenCL-enabled libraries
  • Explain how CUDA support is offered through Docker images
  • Using {{ compiler('cuda') }} will put cudatoolkit in run through run_constrained (and how this could be overridden if needed)

Also, I'd like to thank you for putting this together! I'll be happy to participate in reviewing the documentation if you need help!

I'm trying to use cuda_compiler_version to set build flags, but this variable is not defined on OSX and Windows. The idea is that when cuda_compiler_version is None, I pass a --disable-cuda flag. Is there some other variable that I can use for this purpose?

https://github.com/carterbox/tomopy-feedstock/blob/007f30d9b503a395ebbbff3ddd4515f3702565af/recipe/meta.yaml#L4-L12

There isn't. Currently selectors are used to handle the other platforms. Please see these lines as an example.
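Besides selectors in meta.yaml, the missing variable can also be given a default inside the build script. The following is a minimal sketch, not a conda-forge convention: it assumes cuda_compiler_version reaches the build script as an environment variable (and is simply unset on OSX and Windows), the helper name cuda_configure_flag is hypothetical, and --disable-cuda is the flag from the question above.

```shell
# Hypothetical helper: prints "--disable-cuda" when no CUDA compiler is
# configured, and prints nothing otherwise.
cuda_configure_flag() {
    # Assumption: the variable is unset or empty on platforms without CUDA
    # builds; treat that the same as the string "None".
    if [ "${cuda_compiler_version:-None}" = "None" ]; then
        echo "--disable-cuda"
    fi
}
```

In a build script, the result could then be passed on to the project's own configure step, for example as `./configure $(cuda_configure_flag)`, assuming the project accepts that flag.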

Are the headers for the libraries provided with cudatoolkit available from a package that is not cudatoolkit-dev? That package just downloads and tries to run the cudatoolkit installer.

They are in the Docker images. You may need to re-render after adding {{ compiler("cuda") }} to ensure they are used.

Are these files located at ${CUDA_HOME}/include? Alternatively, where can I find the docker config file that conda-forge uses to provide the build environment? I'd like to poke around and figure out where various resources are located because my CMakeLists aren't doing their job correctly.

Are these files located at ${CUDA_HOME}/include?

Yes.

Alternatively, where can I find the docker config file that conda-forge uses to provide the build environment?

Do you have a PR? Maybe we should move this conversation there. 🙂
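For poking around the build environment, a quick check of what actually sits under ${CUDA_HOME}/include can help when CMake fails to find the headers. This is a rough sketch that assumes CUDA_HOME is set in the build environment, as the exchange above suggests; the helper name list_cuda_headers is hypothetical.

```shell
# Hypothetical helper: prints the names of the .h files found directly
# under ${CUDA_HOME}/include, one per line.
list_cuda_headers() {
    if [ -z "${CUDA_HOME:-}" ]; then
        echo "CUDA_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "${CUDA_HOME}/include" ]; then
        echo "no include directory under ${CUDA_HOME}" >&2
        return 1
    fi
    for header in "${CUDA_HOME}/include"/*.h; do
        # Skip the unexpanded pattern when the directory has no headers.
        if [ -e "${header}" ]; then
            basename "${header}"
        fi
    done
}
```

Calling list_cuda_headers near the top of the build script prints the header names into the build log, which makes it easier to compare against the paths the CMakeLists expects.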

My questions regarding general GPU support (I don't know the answers):

  • How is cudatoolkit built? Do we use the one from anaconda or nvidia channel?
  • Why isn't nvcc available in cudatoolkit, and how can I get nvcc using conda install?
  • What's the relation between cudatoolkit and cudatoolkit-dev? Why isn't conda-forge providing cudatoolkit?
  • Is CI set up to build and test GPU packages? If not, where is this done?
  • How do the docker images work?
  • More generally, what is the workflow for building GPU packages for conda-forge?
  • What flags or options are exposed for enabling/disabling CUDA?
  • What is the standard practice for building packages that can optionally support CUDA GPUs?

Addressing these questions will provide a basis for putting together the document.

Here are some of the answers:

  • nvcc is available from the nvcc_linux-64 package on conda-forge. This package is automatically selected using the {{ compiler('cuda') }} Jinja template. The package contains only nvcc.
  • Enable CUDA-specific things in your recipe using conda-build selectors (# [linux64 and cuda_compiler_version != "None"])

My understanding is that the nvcc metapackage is only a wrapper around the system installation because nvcc is only distributed with the Nvidia drivers, provided in the Docker image.

I might tackle this over the Christmas break, together with other GPU stuff like https://github.com/conda-forge/cudatoolkit-feedstock/issues/38. How should we coordinate, @jakirkham?
