Stl: tests/std: Harness CUDA/NVCC

Created on 26 Mar 2020 · 15 Comments · Source: microsoft/STL

We need to add nvcc to the set of configurations we test.

We need to determine which version we need to support.
- CUDA 10.1 unblocked VS 2019 users; however, 10.1 Update 2 added if constexpr support, which we want.
We need to actually add tests which use nvcc.
- We have some internally but they aren't fit for external consumption.

Microsoft-internal: Tracked by VSO-750842 / AB#750842. After we add CUDA 10.1 Update 2 coverage to GitHub, we should remove our completely-hacked-up CUDA 9.2 coverage from the MSVC-internal vcr test suite.

Labels: help wanted, test

cbezault

All 15 comments

I have code to put CUDA on the agents when we're ready to do this from vcpkg.

BillyONeal on 31 Mar 2020

We talked about this, and VS 2019 16.7 can require CUDA 10.1 Update 2, as long as we emit an informative error message for CUDA 10.1 RTW and Update 1.

We already use #ifdef __CUDACC__ to detect CUDA. We should be able to use __CUDACC_VER_MAJOR__ == 10 && __CUDACC_VER_MINOR__ == 1 && __CUDACC_VER_BUILD__ < VALUE_FOR_UPDATE_2.
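
A minimal sketch of how that check could look, not the actual STL implementation. The threshold 243 is an assumption taken from the Update 2 installer file name quoted elsewhere in this thread (cuda_10.1.243_...), and 105 from the RTW installer name; whether `__CUDACC_VER_BUILD__` reports exactly those values should be confirmed against nvcc. The helper function and constant names are hypothetical, and the constexpr mirror exists only so the logic can be checked without nvcc installed.

```cpp
// Hypothetical constexpr mirror of the proposed preprocessor condition:
// true for CUDA 10.1 builds older than Update 2.
constexpr bool is_cuda_10_1_before_update_2(int major, int minor, int build, int update_2_build) {
    return major == 10 && minor == 1 && build < update_2_build;
}

// ASSUMPTION: 243 comes from the installer name cuda_10.1.243_426.00_win10.exe.
constexpr int assumed_update_2_build = 243;

// 10.1 RTW (build 105, assumed from cuda_10.1.105_418.96_win10.exe) is rejected.
static_assert(is_cuda_10_1_before_update_2(10, 1, 105, assumed_update_2_build), "10.1 RTW should be rejected");
// 10.1 Update 2 itself and a later minor version are accepted.
static_assert(!is_cuda_10_1_before_update_2(10, 1, 243, assumed_update_2_build), "Update 2 should be accepted");
static_assert(!is_cuda_10_1_before_update_2(10, 2, 0, assumed_update_2_build), "10.2 should be accepted");

// The same condition in preprocessor form, active only when nvcc is the compiler.
#ifdef __CUDACC__
#if __CUDACC_VER_MAJOR__ == 10 && __CUDACC_VER_MINOR__ == 1 && __CUDACC_VER_BUILD__ < 243
#error CUDA 10.1 RTW and Update 1 are not supported. Please install CUDA 10.1 Update 2 or newer.
#endif
#endif
```

When compiled by a host compiler alone, `__CUDACC__` is not defined, so only the static_assert checks run.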

StephanTLavavej on 2 Apr 2020

Related to #189.

StephanTLavavej on 2 Apr 2020

@BillyONeal If you're ready to put CUDA 10.1 Update 2 on the agents, that would be the first step - then we'd need a modern test (with NVIDIA's help), and finally we can update the STL to use if constexpr everywhere.

StephanTLavavej on 3 Apr 2020

@StephanTLavavej The version I have comes from this URI:

$CudaUrl = 'https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.105_418.96_win10.exe'
$CudaFeatures = 'nvcc_10.1 cuobjdump_10.1 nvprune_10.1 cupti_10.1 gpu_library_advisor_10.1 memcheck_10.1 ' + `
  'nvdisasm_10.1 nvprof_10.1 visual_profiler_10.1 visual_studio_integration_10.1 cublas_10.1 cublas_dev_10.1 ' + `
  'cudart_10.1 cufft_10.1 cufft_dev_10.1 curand_10.1 curand_dev_10.1 cusolver_10.1 cusolver_dev_10.1 cusparse_10.1 ' + `
  'cusparse_dev_10.1 nvgraph_10.1 nvgraph_dev_10.1 npp_10.1 npp_dev_10.1 nvrtc_10.1 nvrtc_dev_10.1 nvml_dev_10.1 ' + `
  'occupancy_calculator_10.1 fortran_examples_10.1'

Function InstallCuda {
  Param(
    [String]$Url,
    [String]$Features
  )

  try {
    Write-Output 'Downloading CUDA...'
    [string]$installerPath = Get-TempFilePath -Extension 'exe'
    curl.exe -L -o $installerPath -s -S $Url
    Write-Output 'Installing CUDA...'
    $proc = Start-Process -FilePath $installerPath -ArgumentList @('-s ' + $Features) -Wait -PassThru
    $exitCode = $proc.ExitCode
    if ($exitCode -eq 0) {
      Write-Output 'Installation successful!'
    }
    else {
      Write-Output "Installation failed! Exited with $exitCode."
      exit $exitCode
    }
  }
  catch {
    Write-Output "Failed to install CUDA!"
    Write-Output $_.Exception.Message
    exit -1
  }
}

@brycelelbach Do you know what installer we should be using?

BillyONeal on 3 Apr 2020

After looking at https://developer.nvidia.com/cuda-toolkit-archive , I believe you've found 10.1 RTW. https://developer.nvidia.com/cuda-10.1-download-archive-update2?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal indicates that you can download 10.1 Update 2 from (adding HTTPS):
https://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_426.00_win10.exe

StephanTLavavej on 3 Apr 2020

I'm collecting this and other changes in my branch https://github.com/BillyONeal/STL/tree/cleanup_script -- after I have vcpkg switched over and there stops being churn I'll update here.

BillyONeal on 4 Apr 2020

https://github.com/microsoft/STL/pull/682 adds nvcc, since Curtis added my in-flight commits.

BillyONeal on 5 Apr 2020

10.1.243 sounds right. But why not use 10.2?

brycelelbach on 10 Apr 2020

I absolutely love increasing minimum required versions, but users sometimes have different preferences. CUDA 10.1 was the first version that worked with VS 2019, and there are some number of users still using that combination (I don't know numbers/percentages but it is certainly nonzero). By waiting almost a year and telling management that 10.1 Update 2 looked like a highly compatible update (admittedly not from personal experience), plus pointing to our support for side-by-side toolsets and the latency of our release schedule, and promising good error messages, I was able to get approval to increase the minimum required version from 10.1 RTW to 10.1 Update 2 (i.e. allowing us to use if constexpr unconditionally).

Requiring 10.2 would be significantly more aggressive, and while it would presumably fix various compiler bugs, I'm unable to point to specific improvements or features that would dramatically accelerate our development. (Note that I have a pending request for unconditional explicit(bool) support, not sure what the status is there.) I am clueless about how rapidly CUDA users upgrade, but it seems like they explicitly have to download the new version - i.e. there is no auto-upgrader like the VS installer that now offers new production versions as soon as they're released.

If we could somehow require 10.2 without making any of our mutual users hiss, I would love to find a way to do so. Or some kind of predictable policy (e.g. supporting CUDA versions one year old, or something). Currently, the system appears to be that a new VS version can require the corresponding CUDA version that accepts it, except in special circumstances like this 10.1 Update 2 thing. Anyways, please send me an email if you'd like to discuss policy changes, as that's outside the scope of the microsoft/STL repo (whereas getting CUDA 10.1 Update 2 coverage is in scope, and now possible because it's installed on the worker VMs - we just need the actual test, and enhancements to the lit Python machinery to invoke a different compiler driver).

StephanTLavavej on 10 Apr 2020

@brycelelbach , IIRC you said your team could contribute a test. Is there anything you need from us in order to submit a PR? I think all we need is a source file (or set of source files) and a command line, where we can compile it with CUDA 10.1 Update 2 and ensure that the STL's headers can be parsed (by including all of them). I believe we can do the test infrastructure work to execute that command line automatically.

StephanTLavavej on 30 Apr 2020

NVIDIA Announces CUDA Toolkit 11 on May 14, 2020:

Support for new host compilers and language standards including C++17

We'll presumably need test coverage for CUDA 11 with C++17, in addition to CUDA 10.1 Update 2 with C++14.

StephanTLavavej on 16 May 2020

Side by side installations of CUDA are a pain :/

cbezault on 16 May 2020

Hey Stephan,

We're a bit underwater so I don't know when I'd be able to commit someone
to do it, unfortunately.

Side by side installations of the CUDA toolkit shouldn't be too painful.
With the driver, of course, there's compatibility issues.

--
Bryce Adelstein Lelbach aka wash
US Programming Language Standards (PL22) Chair
ISO C++ Library Evolution Chair
CppCon and C++Now Program Chair

CUDA Core C++ Libraries (Thrust, CUB, libcu++) Lead @ NVIDIA

brycelelbach on 16 May 2020

I've pinned and marked this as help wanted in the hopes that someone can help us with this relatively small, but unusual and important, task.

Summary of the current status and the desired outcome:

- We're installing CUDA 10.1 Update 2 on our CI machines (which are Azure Virtual Machine Scale Sets) and the installer is reporting success. Here's the exact version and executable:
  https://github.com/microsoft/STL/blob/3cca47387bc286e94f6bce18f3adcd7d737f5a3c/azure-devops/provision-image.ps1#L102
- Nothing in our Python/lit-powered test infrastructure is invoking nvcc yet. (We have both cl and clang-cl test coverage at this time, so the infrastructure has some awareness of how to invoke different compilers.)
- In our MS-internal test infrastructure (powered by a mix of incomprehensible Perl and more comprehensible C#), we're running a hacked-up version of CUDA 9.2 (which ordinarily refuses to compile with VS 2019) and a hacked-up test derived from CUDA's examples. The purpose of this test is to verify that all STL headers can be included when compiling with nvcc, so we don't accidentally use intrinsics or other features that interfere with the CUDA compilation process.
- The goal is to have similar coverage, but with CUDA 10.1 Update 2, in our Python test harness, and with a test case written from scratch and not derived from CUDA's examples, due to our licensing requirements. Note that the test case can be the CUDA equivalent of Hello World, as long as we can include all STL headers in it (well, all of the C++14 STL headers). Also note that we're interested in compilation only, not actually running the program.
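
As a rough illustration of the kind of test case described above, here is a sketch of a from-scratch source file. It is not the eventual test: the header list is only a representative subset (the real test would list every C++14 STL header), and the macro, kernel, and function names are hypothetical. The `__global__` qualifier is guarded so the same file also compiles with a plain host compiler.

```cpp
// Hypothetical smoke test: include STL headers, then define a trivial kernel.
// Representative subset only; a real test would include all C++14 STL headers.
#include <algorithm>
#include <array>
#include <atomic>
#include <bitset>
#include <chrono>
#include <complex>
#include <deque>
#include <functional>
#include <iterator>
#include <limits>
#include <list>
#include <map>
#include <memory>
#include <numeric>
#include <set>
#include <string>
#include <tuple>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>

// Under nvcc, mark the function as a kernel; otherwise compile it as a plain function.
#ifdef __CUDACC__
#define SMOKE_KERNEL __global__
#else
#define SMOKE_KERNEL
#endif

// Empty kernel: the CUDA equivalent of Hello World, never launched here.
SMOKE_KERNEL void empty_kernel() {}

// Host-side compile-time use of an STL type, to show the headers were parsed.
constexpr std::size_t host_array_size() {
    return std::tuple_size<std::array<int, 3>>::value;
}

static_assert(host_array_size() == 3, "std::array should be usable in host code");
```

Since only compilation matters, such a file would be compiled to an object file (for example with nvcc's `-c` option) and never run; the exact command line for the test harness is left open here.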

The necessary changes should be fairly small: a new test in tests/std, and possibly a bit of test infrastructure work if our existing machinery to invoke clang-cl isn't totally applicable to invoking nvcc. However, this is currently hard for us to address. It requires some familiarity with the CUDA compilation process, with writing a program to be compiled for CUDA, and with our new test infrastructure, and our previous techniques are inapplicable (since we can't reuse the old hacked-up test case). The maintainer team is also busy with reviewing and implementing C++20 features.

The benefit to completing this test coverage, aside from ensuring that we don't break the important CUDA scenario, is that we will finally be able to unconditionally use if constexpr in the STL. If you can help, please let us know.
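
For readers wondering what unconditional if constexpr buys, the following sketch contrasts it with the older tag-dispatch technique it can replace. This is a generic illustration, not code from the STL; all function names are hypothetical, and it assumes a compiler that accepts if constexpr (C++17, or earlier modes where it is allowed as an extension).

```cpp
#include <type_traits>

// Tag dispatch: overload resolution on true_type/false_type selects the branch.
template <class T>
constexpr int kind_impl(std::true_type /* integral */) {
    return 1;
}

template <class T>
constexpr int kind_impl(std::false_type /* not integral */) {
    return 0;
}

template <class T>
constexpr int kind_tag_dispatch() {
    return kind_impl<T>(std::is_integral<T>{});
}

// if constexpr: the same decision in a single function body,
// with the untaken branch discarded at compile time.
template <class T>
constexpr int kind_if_constexpr() {
    if constexpr (std::is_integral<T>::value) {
        return 1;
    } else {
        return 0;
    }
}

// Both techniques agree for an integral and a non-integral type.
static_assert(kind_tag_dispatch<int>() == 1 && kind_if_constexpr<int>() == 1, "int is integral");
static_assert(kind_tag_dispatch<double>() == 0 && kind_if_constexpr<double>() == 0, "double is not integral");
```

The if constexpr form needs one function instead of three, which is the kind of simplification that depends on every supported compiler front end, nvcc included, accepting the syntax.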

StephanTLavavej on 19 Aug 2020
