Cupy: Policy for general supports?

Created on 22 Jul 2020 · 3 Comments · Source: cupy/cupy

I think this is a companion of #3643, #3631, and #3301, and is in line with the spirit of NEP 29, but it covers more aspects. This is not urgent or mission-critical, but it would be nice to see how difficult it is to reach a consensus.

The question is: What would be the lowest supported version for each of the following items in a major release of CuPy?

  • CUDA Toolkit? (rel: #3301)
  • Compute capability?
  • Device architecture? (For example, CUDA's cuda_fp16.h is not really useful until sm53.)
  • Satellite libraries? This includes NCCL, cuDNN, cuTENSOR, CUB (pre-CUDA 11), etc., which do not ship with the CUDA Toolkit but are still part of the general CUDA ecosystem. (For example, when would be a good time to drop NCCL 1.x?)
  • Host compiler? (rel: #3530)

While the CUDA Toolkit Release Notes do specify a support matrix, using the specific functionalities outlined above often imposes a more stringent requirement.

(I think I missed at least one item above... Will add it when I can recall it 😅)

needs-discussion

All 3 comments

Let me try describing the current status:

CUDA Toolkit

When we discussed https://github.com/cupy/cupy/issues/3301, one strong reason to support this decision (drop 8.0 / 9.1 but keep others) was that cuDNN for 8.0 / 9.1 has not had a new release for a long time, which may indicate that the number of people using them is significantly decreasing.

Compute capability
Device architecture

In general, supporting old CC does not have an additional cost, so there is no strong motivation to aggressively drop old ones.
Exceptions may be:

  • CUDA Toolkit no longer supports that CC (e.g. compute_30 dropped in CUDA 11: https://github.com/cupy/cupy/pull/3578)

    • Note: As we still support CUDA 9.0, we don't drop compute_30 in v8.

  • For technical reasons (e.g., to improve performance, like we did in CuPy v4: https://github.com/cupy/cupy/pull/616)
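The first exception above (the toolkit itself dropping a CC) can be sketched as a small filter. This is only an illustration, not CuPy's build logic: the function name and the candidate list are hypothetical, and the only removal encoded is the one mentioned in this thread (compute_30 in CUDA 11).

```python
def supported_archs(candidates, cuda_version):
    """Return the candidate architectures the given CUDA Toolkit can still target.

    cuda_version uses the CUDA_VERSION integer encoding
    (major * 1000 + minor * 10), e.g. 9000 for CUDA 9.0, 11000 for CUDA 11.0.
    """
    # Assumption: only the removal named in this thread is listed here.
    removed_in = {"compute_30": 11000}
    return [
        arch for arch in candidates
        if cuda_version < removed_in.get(arch, float("inf"))
    ]


# Hypothetical candidate list, for illustration only.
candidates = ["compute_30", "compute_50", "compute_70"]
print(supported_archs(candidates, 9000))   # compute_30 is kept for CUDA 9.0
print(supported_archs(candidates, 11000))  # compute_30 is filtered out for CUDA 11.0
```

With a table like this, keeping compute_30 in v8 follows directly from CUDA 9.0 still being in the supported set.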

Satellite libraries

I think there's no policy here.
(I just recalled that NCCL 1.x only supports CUDA 8.0, so v8 will not support NCCL 1.x.)

Host compiler

Ideally, CuPy should be buildable by the default compiler on each OS supported by the CUDA Toolkit.
The exception is CUB.

Thanks, @kmaehashi!

When we discussed #3301, one strong reason to support this decision (drop 8.0 / 9.1 but keep others) was that cuDNN for 8.0 / 9.1 has not had a new release for a long time, which _may_ indicate that the number of people using them is significantly decreasing.

I see, this is an interesting and perhaps very practical indicator indeed. I was thinking of a NEP 29-like approach: "X months after the official release of CUDA Toolkit version Y.Z, we will drop support for it". Perhaps this is too aggressive given the product lifecycles of NVIDIA GPUs?
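To make the time-window idea concrete, here is a minimal sketch of how such a rule could be evaluated. The helper names are hypothetical, the window length is whatever "X" would be agreed on, and the dates used below are placeholders rather than real CUDA release dates.

```python
from datetime import date


def drop_date(release: date, support_months: int) -> date:
    """First day of the month that lies support_months after the release month.

    The result is rounded down to the start of that month to keep the
    arithmetic simple.
    """
    months = release.year * 12 + (release.month - 1) + support_months
    return date(months // 12, months % 12 + 1, 1)


def is_supported(release: date, support_months: int, today: date) -> bool:
    """True while today is still before the computed drop date."""
    return today < drop_date(release, support_months)


# Placeholder release date and a 24-month window, for illustration only.
release = date(2020, 1, 15)
print(drop_date(release, 24))
print(is_supported(release, 24, date(2021, 6, 1)))
```

A fixed window like this ignores the cuDNN-availability signal discussed above, so the two indicators could disagree for a given toolkit version.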

In general, supporting old CC does not have an additional cost, so there is no strong motivation to aggressively drop old ones.
Exceptions may be:

  • CUDA Toolkit no longer supports that CC (e.g. compute_30 dropped in CUDA 11: #3578)

    • Note: As we still support CUDA 9.0, we don't drop compute_30 in v8.
  • For technical reasons (e.g., to improve performance, like we did in CuPy v4: #616)

Another exception off the top of my head (as I mentioned earlier) is fp16 support. Before sm53 it is crap, not really usable: https://github.com/cupy/cupy/pull/3617#issuecomment-659111349. The current solution is to detect the device CC at runtime to decide whether fp16 can be used. (The fp16 kernels are still compiled, but not used for CC<53.)
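A rough sketch of that kind of runtime check is below. It is not the code from the linked PR: both helper names are hypothetical, and the only CuPy API assumed is `cupy.cuda.Device().compute_capability`, which returns the CC as a string such as `'70'`.

```python
def fp16_usable(compute_capability: str) -> bool:
    """Decide whether fp16 code paths should be used for a given CC string."""
    # Compare as integers: a plain string comparison would misorder
    # values of different lengths.
    return int(compute_capability) >= 53


def current_device_supports_fp16() -> bool:
    """Query the current device through CuPy and apply the check above."""
    # Imported lazily so the pure check above can run without CuPy or a GPU.
    import cupy
    return fp16_usable(cupy.cuda.Device().compute_capability)


print(fp16_usable("52"))  # below the sm53 threshold
print(fp16_usable("70"))  # above the sm53 threshold
```

The point of the runtime approach is that a single binary serves all devices, at the cost of compiling kernels that older devices never run.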

Satellite libraries

I think there's no policy here.
(I just recalled that NCCL 1.x only supports CUDA 8.0, so v8 will not support NCCL 1.x.)

Example: cuTENSOR only supports CUDA 10.1+.
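If a policy were written down, the two constraints mentioned so far could be recorded in a small table like the one sketched here. The table layout, keys, and function name are hypothetical; only the two version ranges come from this thread.

```python
# CUDA versions use the CUDA_VERSION integer encoding
# (major * 1000 + minor * 10); None means "no upper bound".
SATELLITE_CUDA_RANGE = {
    "nccl-1.x": (8000, 8000),   # NCCL 1.x only supports CUDA 8.0
    "cutensor": (10010, None),  # cuTENSOR only supports CUDA 10.1+
}


def satellite_supported(library: str, cuda_version: int) -> bool:
    """Check a CUDA version against the recorded range for a library.

    Raises KeyError for libraries without an entry rather than guessing.
    """
    lowest, highest = SATELLITE_CUDA_RANGE[library]
    if cuda_version < lowest:
        return False
    return highest is None or cuda_version <= highest


print(satellite_supported("nccl-1.x", 9000))   # CUDA 9.0 is outside NCCL 1.x's range
print(satellite_supported("cutensor", 10010))  # CUDA 10.1 meets cuTENSOR's minimum
```

Once CUDA 8.0 leaves the supported set, every entry whose upper bound is 8000 becomes unreachable, which is the NCCL 1.x situation described above.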

Host compiler

Ideally, CuPy should be buildable by the default compiler on each OS supported by the CUDA Toolkit.
The exception is CUB.

As I mentioned in https://github.com/cupy/cupy/issues/3530#issuecomment-657163686, CUDA Toolkit's documentation is bad at this. For example, CuPy's complex headers have a stringent C++11 requirement, and compilation fails if the host compiler is not new enough (https://github.com/cupy/cupy/issues/2530#issuecomment-539366174). Another exception is cuFFT. This issue is now "explicitly" acknowledged in the CUDA 11 Release Notes and Installation Guide for Linux (both in a footnote...)

Deprecation plans for v9: #4300

I will include a link to this issue in the release notes of the next release to gather comments and concerns from the community.
