Alpaka: CUDA 9.0-9.1 using gcc 6 fails to compile

Created on 19 Oct 2017 · 19 Comments · Source: alpaka-group/alpaka

It is not possible to compile an alpaka example with CUDA9.

e.g. bufferCopy

[ 75%] Building NVCC (Device) object CMakeFiles/bufferCopy.dir/src/bufferCopy_generated_bufferCopy.cpp.o
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple: In instantiation of ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_MoveConstructibleTuple() [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’:
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:626:248:   required by substitution of ‘template<class ... _UElements, typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type <anonymous> > constexpr std::tuple< <template-parameter-1-1> >::tuple(_UElements&& ...) [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type <anonymous> = <missing>]’
/home/widera/src/alpaka/include/alpaka/exec/ExecCpuSerial.hpp:64:7:   required from ‘class alpaka::exec::ExecCpuSerial<std::integral_constant<long unsigned int, 3ul>, long unsigned int, InitBufferKernel, unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>’
/home/widera/src/alpaka/example/bufferCopy/src//bufferCopy.cpp:246:220:   required from here
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:483:67: error: mismatched argument pack lengths while expanding ‘std::is_constructible<_Elements, _UElements&&>’
       return __and_<is_constructible<_Elements, _UElements&&>...>::value;
                                                                   ^~~~~
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:484:1: error: body of constexpr function ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_MoveConstructibleTuple() [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’ not a return-statement
     }
 ^
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple: In instantiation of ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_ImplicitlyMoveConvertibleTuple() [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’:
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:626:362:   required by substitution of ‘template<class ... _UElements, typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type <anonymous> > constexpr std::tuple< <template-parameter-1-1> >::tuple(_UElements&& ...) [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type <anonymous> = <missing>]’
/home/widera/src/alpaka/include/alpaka/exec/ExecCpuSerial.hpp:64:7:   required from ‘class alpaka::exec::ExecCpuSerial<std::integral_constant<long unsigned int, 3ul>, long unsigned int, InitBufferKernel, unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>’
/home/widera/src/alpaka/example/bufferCopy/src//bufferCopy.cpp:246:220:   required from here
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:489:65: error: mismatched argument pack lengths while expanding ‘std::is_convertible<_UElements&&, _Elements>’
       return __and_<is_convertible<_UElements&&, _Elements>...>::value;
                                                                 ^~~~~
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:490:1: error: body of constexpr function ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_ImplicitlyMoveConvertibleTuple() [with _UElements = {const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&}; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’ not a return-statement
     }
 ^
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple: In instantiation of ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_NonNestedTuple() [with _SrcTuple = const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’:
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:662:419:   required by substitution of ‘template<class ... _UElements, class _Dummy, typename std::enable_if<((std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ConstructibleTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyConvertibleTuple<_UElements ...>()) && std::_TC<(std::is_same<_Dummy, void>::value && (1ul == 1)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NonNestedTuple<const tuple<_Elements ...>&>()), bool>::type <anonymous> > constexpr std::tuple< <template-parameter-1-1> >::tuple(const std::tuple<_Args1 ...>&) [with _UElements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}; _Dummy = void; typename std::enable_if<((std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ConstructibleTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_ImplicitlyConvertibleTuple<_UElements ...>()) && std::_TC<(std::is_same<_Dummy, void>::value && (1ul == 1)), unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>::_NonNestedTuple<const tuple<_Elements ...>&>()), bool>::type <anonymous> = <missing>]’
/home/widera/src/alpaka/include/alpaka/exec/ExecCpuSerial.hpp:64:7:   required from ‘class alpaka::exec::ExecCpuSerial<std::integral_constant<long unsigned int, 3ul>, long unsigned int, InitBufferKernel, unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>’
/home/widera/src/alpaka/example/bufferCopy/src//bufferCopy.cpp:246:220:   required from here
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:495:244: error: wrong number of template arguments (4, should be 2)
       return  __and_<__not_<is_same<tuple<_Elements...>,
                                                                                                                                                                                                                                                    ^    
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/type_traits:1558:8: note: provided for ‘template<class _From, class _To> struct std::is_convertible’
     struct is_convertible
        ^~~~~~~~~~~~~~
/sw/global/compilers/gcc/6.3.0/include/c++/6.3.0/tuple:502:1: error: body of constexpr function ‘static constexpr bool std::_TC<<anonymous>, _Elements>::_NonNestedTuple() [with _SrcTuple = const std::tuple<unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int>&; bool <anonymous> = true; _Elements = {unsigned int*, alpaka::vec::Vec<std::integral_constant<long unsigned int, 3ul>, long unsigned int>, unsigned int}]’ not a return-statement
     }
 ^

Software used:

  • gcc 6.3.0
  • boost 1.65.1
  • cmake 3.9.0
  • cuda 9.0.176

@BenjaminW3 It looks like it has something to do with the meta::apply command. Can you see anything that could have triggered the error?

CUDA Wontfix Bug

All 19 comments

gcc 6.x compiles C++14 by default. It looks like CUDA 9 has problems with C++14.

It looks like it helps to use c++11 by setting export CXX_FLAGS="-std=c++11".
I will check if this solves the issue fully.

[update] Compiling with C++11 does not solve the issue :-(

If I switch to gcc/5.3.0 the issue is gone. It looks like it is something with gcc/6.3.0 and CUDA 9.

This was already part of the original CUDA 9.0 support Pull Request description #384. The combination of CUDA and gcc 6 is not enabled in the supported compiler matrix.

The bug is still triggered when using CUDA 9.1

Not sure if it's silently patched in one of the three CUDA 9.1.85.X patch releases: https://gist.github.com/ax3l/9489132

I can confirm that V9.1.85 + gcc 6.X still has problems with std::tuple.

Thanks for the info Robert! Do you work-around it somehow in your projects?

Just ran into the problem and was able to roll back to gcc-5 to work around the problem. Haven't investigated any local workarounds.

Just as a note: NVCC 9.2 fixed the GCC 6 std::tuple issues: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

2.2. CUDA Tools
2.2.1. CUDA Compilers

    The following compilers are supported as host compilers in nvcc
        Clang 5.0
        GCC 7.x
        Microsoft Visual Studio 2017 (RTW and Update 6)
        PGI pgc++ 18.x
        XLC 13.1.6
    __device__ / __constant__ variables are now allowed to have an rvalue reference type.
    Functions in math_functions.hpp have been changed to use memcpy for type punning.

    Added support for std::tuple.
[...]

Also it could be that one can use the shipped thrust tuples as work-around: https://stackoverflow.com/questions/40742242/does-cuda-c-not-have-tuples-in-device-code

Awesome news, I will have to try out CUDA 9.2 and see if my host side std::tuple woes with GCC 7 are now gone.

Since we don't have a hard dependency on Thrust/CUDA we went with forking taotuple for VTK-m ( https://gitlab.kitware.com/third-party/taotuple / https://github.com/taocpp/tuple).

Cool, thanks for the heads up!

The reason why I thought about Thrust was: it's an nvcc-only issue, and nvcc is shipped with a matching Thrust in the Nvidia SDK anyway (if CUDA is used as a backend; CUDA is optional for us as well).

I hope I can report my current incompatibilities with CUDA9 here as well. At the moment I use the modules on ZIH/Taurus cluster. Trying to compile examples/bufferCopy from current develop. At the moment there is only CUDA 8.0.61 working with GCC 5.3.0 and Boost 1.64.0.
CUDA 9.x requires at least Boost 1.65.1. I still need to build & try Boost with gcc 5.4 as it is missing on Taurus.

|CUDA 9.2.88|Boost >=1.65.1|
|---|---|
|gcc 5.4|:question:|
|gcc 5.5|:x:|
|gcc 6.x|:x:|
|gcc 7.x|:x:|

GCC 7.1 + Boost 1.65.1 + CUDA 9.2.88

Currently Loaded Modules:
  1) modenv/classic (S)   2) cmake/3.10.1   3) git/2.15.1   4) gcc/7.1.0   5) bullxmpi/1.2.8.4   6) boost/1.65.1-gnu7.1   7) cuda/9.2.88

```bash
# in alpaka/example/bufferCopy/build_cuda
cmake .. && make
```

Fails with:

```bash
[ 50%] Building NVCC (Device) object CMakeFiles/bufferCopy.dir/src/bufferCopy_generated_bufferCopy.cpp.o
/sw/taurus/libraries/cuda/9.2.88/bin/nvcc /alpaka/example/bufferCopy/src//bufferCopy.cpp -x=cu -c -o /alpaka/example/bufferCopy/build_cuda/CMakeFiles/bufferCopy.dir/src/./bufferCopy_generated_bufferCopy.cpp.o -ccbin /sw/global/compilers/gcc/7.1.0/bin/g++ -m64 -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_FIBERS_ENABLED -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -Xcompiler ,\"-fopenmp\",\"-g\" --expt-extended-lambda --expt-relaxed-constexpr --generate-code arch=compute_30,code=sm_30 --generate-code arch=compute_30,code=compute_30 -std=c++11 --use_fast_math --ftz=false -DNVCC -I/sw/taurus/libraries/cuda/9.2.88/include -I/sw/taurus/libraries/boost/1.65.1-gnu7.1/include -I/alpaka/include
/alpaka/include/alpaka/core/Assert.hpp(102): error: this pragma must immediately precede a declaration
```

Same error occurs for GCC 5.5 + Boost 1.65.1 + CUDA 9.2.88 and GCC 7.1 + Boost 1.66.0 + CUDA 9.2.88.
The following setup works, but only with CUDA 8:

1) modenv/classic (S)   2) gcc/5.3.0   3) bullxmpi/1.2.8.4   4) boost/1.64.0-gnu5.3   5) git/2.15.1   6) cmake/3.10.1   7) cuda/8.0.61

Thank you @tdd11235813 for this report. I have tried out CUDA 9.2 in a branch in CI yesterday as well and triggered the same error.
I had not yet time to investigate this deeper and any help would be appreciated!
This seems to be a newly introduced CUDA 9.2 issue.

Jumping over the last two posts: for reference, we need to include this in alpaka as well: https://github.com/ComputationalRadiationPhysics/picongpu/pull/2628

(including install docs)

@ax3l I will try to add those supported combination checks today.

@tdd11235813 do you mind opening a new issue for the CUDA 9.2 report?
This here is quite CUDA 9.0-9.1 specific so far.

Also, as a first try, can you just switch the order of ALPAKA_NO_HOST_ACC_WARNING and ALPAKA_FN_HOST_ACC in the erroring lines?

I will close this ticket now because:

  • there is nothing we can do about it
  • there are already newer CUDA versions which support gcc 6 and even gcc 7
  • we added diagnostics in the CMake scripts and print a description why this is unsupported

Agreed, we also documented that CUDA 9.0-9.1 actually just works well with GCC <= 5.5.
Here and in all our install instructions, e.g. here, and build recipes.
