Alpaka: CUDA 9.2 fails to compile

Created on 12 Jun 2018 · 6 Comments · Source: alpaka-group/alpaka

I want to test alpaka with CUDA 9, especially CUDA 9.2. For CUDA 9.0 and 9.1, see issue #426.
I am using the modules on the ZIH/Taurus cluster and trying to compile example/bufferCopy from the current develop branch. At the moment, only CUDA 8.0.61 works on Taurus (with GCC 5.3.0 and Boost 1.64.0). CUDA 9.x requires at least Boost 1.65.1. I still need to build Boost with gcc 5.4 and try it, since that combination is not available on Taurus.
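
The Boost constraint above can be checked before configuring a build. The following is a minimal sketch, not part of alpaka: the function names `version_ge` and `check_boost_for_cuda` are hypothetical, the version strings are passed in by hand, and the comparison assumes GNU `sort` with `-V` (version sort) is available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 >= version $2.
# Assumes GNU sort, whose -V option sorts version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check encoding the one constraint stated in this report:
# CUDA 9.x requires at least Boost 1.65.1.
check_boost_for_cuda() {
    local cuda_version="$1" boost_version="$2"
    case "$cuda_version" in
        9.*)
            if ! version_ge "$boost_version" "1.65.1"; then
                echo "CUDA $cuda_version needs Boost >= 1.65.1, found $boost_version"
                return 1
            fi
            ;;
    esac
    echo "Boost $boost_version is acceptable for CUDA $cuda_version"
}
```

For example, `check_boost_for_cuda 9.2.88 1.64.0` returns a non-zero status, while `check_boost_for_cuda 8.0.61 1.64.0` succeeds. Passing this check only means the Boost requirement is met; it says nothing about whether the build itself succeeds.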

|CUDA 9.2.88|Boost >=1.65.1|
|---|---|
|gcc 4.9.3|:x:|
|gcc 5.3|:x:|
|gcc 5.4|:question:|
|gcc 5.5|:x:|
|gcc 6.x|:x:|
|gcc 7.x|:x:|

GCC 7.1 + Boost 1.65.1 + CUDA 9.2.88

Currently Loaded Modules:
  1) modenv/classic (S)   2) cmake/3.10.1   3) git/2.15.1   4) gcc/7.1.0   5) bullxmpi/1.2.8.4   6) boost/1.65.1-gnu7.1   7) cuda/9.2.88

```bash
# in alpaka/example/bufferCopy/build_cuda
cmake .. && make
```

Fails with:

```bash
[ 50%] Building NVCC (Device) object CMakeFiles/bufferCopy.dir/src/bufferCopy_generated_bufferCopy.cpp.o
/sw/taurus/libraries/cuda/9.2.88/bin/nvcc /alpaka/example/bufferCopy/src//bufferCopy.cpp -x=cu -c -o /alpaka/example/bufferCopy/build_cuda/CMakeFiles/bufferCopy.dir/src/./bufferCopy_generated_bufferCopy.cpp.o -ccbin /sw/global/compilers/gcc/7.1.0/bin/g++ -m64 -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLED -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -Xcompiler ,\"-fopenmp\",\"-g\" --expt-extended-lambda --expt-relaxed-constexpr --generate-code arch=compute_30,code=sm_30 --generate-code arch=compute_30,code=compute_30 -std=c++11 --use_fast_math --ftz=false -DNVCC -I/sw/taurus/libraries/cuda/9.2.88/include -I/sw/taurus/libraries/boost/1.65.1-gnu7.1/include -I/alpaka/include
/alpaka/include/alpaka/core/Assert.hpp(102): error: this pragma must immediately precede a declaration
```

Same error occurs for GCC 5.5 + Boost 1.65.1 + CUDA 9.2.88 and GCC 7.1 + Boost 1.66.0 + CUDA 9.2.88.
The following setup works, but only with CUDA 8:

1) modenv/classic (S)   2) gcc/5.3.0   3) bullxmpi/1.2.8.4   4) boost/1.64.0-gnu5.3   5) git/2.15.1   6) cmake/3.10.1   7) cuda/8.0.61

Edit: gcc 5.3 and 4.9.3 added to table.

CUDA Bug

All 6 comments

Also, as a first try, can you just switch the order of ALPAKA_NO_HOST_ACC_WARNING and ALPAKA_FN_HOST_ACC in the erroring lines?

Is this a CUDA 9.2 or a NVCC 9.2 issue?

(Can you try if clang -x cuda as a device compiler with CUDA 9.2 is affected as well?)

@BenjaminW3 in the matrix in README.md, let us list the range we currently test instead of "CUDA 8.0+". This is useful when looking at already released versions: if they list CUDA 8.0 - 9.1, then one knows at least what was tested.

That does not mean we should assume things break on new releases, but we can document what we know to work.

At least nvcc 9.2 using clang 5.0.1 fails with the following errors:

/home/travis/llvm/lib/clang/5.0.1/include/smmintrin.h(674): error: argument of type "const __v2di *" is incompatible with parameter of type "__attribute((vector_size(16))) long long *"
/home/travis/llvm/lib/clang/5.0.1/include/avx2intrin.h(836): error: argument of type "const __v4di_aligned *" is incompatible with parameter of type "__attribute((vector_size(32))) long long *"
/home/travis/llvm/lib/clang/5.0.1/include/avx512fintrin.h(9041): error: argument of type "const __v8di_aligned *" is incompatible with parameter of type "__attribute((vector_size(64))) long long *"

I am currently doing a CI run testing nvcc 9.2 using clang 4.0.0, but I expect similar results.

  • tested with gcc 5.3 and Boost 1.65.1: same error,
  • I do not know whether it is a CUDA 9.2 or an NVCC 9.2 issue,
  • switched the order of ALPAKA_NO_HOST_ACC_WARNING and ALPAKA_FN_HOST_ACC: same error,
  • clang 4.0 as host compiler: same error

I mistakenly tested vectorAdd, but it should not behave differently from bufferCopy:

/sw/taurus/libraries/cuda/9.2.88/bin/nvcc /alpaka/example/vectorAdd/src//main.cpp -x=cu -c -o /alpaka/example/vectorAdd/build_cuda/CMakeFiles/vectorAdd.dir/src/./vectorAdd_generated_main.cpp.o -ccbin /sw/global/compilers/llvm/4.0.0/bin/clang++ -m64 -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -Xcompiler ,\"-fopenmp=libiomp5\",\"-g\" --expt-extended-lambda --expt-relaxed-constexpr --generate-code arch=compute_30,code=sm_30 --generate-code arch=compute_30,code=compute_30 -std=c++11 --use_fast_math --ftz=false -DNVCC -I/sw/taurus/libraries/cuda/9.2.88/include -I/alpaka/include -I/software/boost_1.65.1/taurus/include
/alpaka/include/alpaka/core/Assert.hpp(103): error: this pragma must immediately precede a declaration


  • With Clang 5.0 as device compiler (set in ccmake via ALPAKA_CUDA_COMPILER), I get the error shown in the log output that follows.

    • --cuda-path is set correctly; maybe clang needs something else?

[ 50%] Building CXX object CMakeFiles/bufferCopy.dir/src/bufferCopy.cpp.o
/sw/taurus/eb/Clang/5.0.0-GCC-6.4.0-2.28/bin/clang++  -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -isystem /sw/taurus/libraries/cuda/9.2.88/include -I/alpaka/include -isystem /software/boost_1.65.1/taurus/include  -fopenmp=libomp   -Werror -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-disabled-macro-expansion -Wno-global-constructors -Wno-padded --system-header-prefix=boost/ -fopenmp=libomp --cuda-path=/sw/taurus/libraries/cuda/9.2.88 --cuda-gpu-arch=sm_30 -Qunused-arguments -Wno-unused-local-typedef -ffast-math -ffp-contract=fast -std=c++11 -ftemplate-depth=512 -x cuda -o CMakeFiles/bufferCopy.dir/src/bufferCopy.cpp.o -c /alpaka/example/bufferCopy/src/bufferCopy.cpp
clang-5.0: error: cannot find libdevice for sm_30. Provide path to different CUDA installation via --cuda-path, or pass -nocudalib to build without linking with libdevice.
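
One thing worth inspecting for this error is which libdevice bitcode files the CUDA installation actually ships under `nvvm/libdevice`. This is only a diagnostic sketch and not a confirmed cause for this report: the helper name `list_libdevice` is hypothetical, and the remark that CUDA 9 toolkits ship a single `libdevice.10.bc` while older toolkits shipped per-architecture files (such as `libdevice.compute_30.10.bc`) is my assumption, not something established in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the libdevice bitcode files below a CUDA root.
# Returns non-zero if the directory is missing or contains no libdevice*.bc.
list_libdevice() {
    local cuda_path="$1"
    local dir="$cuda_path/nvvm/libdevice"
    local f found=1

    if [ ! -d "$dir" ]; then
        echo "no libdevice directory at $dir"
        return 1
    fi

    for f in "$dir"/libdevice*.bc; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        basename "$f"
        found=0
    done
    return "$found"
}
```

Usage with the path from the log above would be `list_libdevice /sw/taurus/libraries/cuda/9.2.88`. If the listing contains no file matching what the installed clang version looks for, that would be consistent with the `cannot find libdevice for sm_30` message even though `--cuda-path` is set.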

CUDA 9.2 is now supported with both gcc and clang. I will investigate clang 5.0 in a separate ticket.
