I get the following error message when trying to invoke a kernel created with WorkDivMembers (to set the launch parameters manually). I can't determine what this error means, but I suspect WorkDivMembers is used less often than getValidWorkDiv, so perhaps the former is less mature?
/inst/include/alpaka/dim/Traits.hpp(27): error #70: incomplete type is not allowed
detected during:
instantiation of type "alpaka::Dim<alpaka::DevUniformCudaHipRt>"
/inst/include/alpaka/kernel/Traits.hpp(275): here
instantiation of "auto alpaka::createTaskKernel<TAcc,TWorkDiv,TKernelFnObj,TArgs...>(const TWorkDiv &, const TKernelFnObj &, TArgs &&...) [with TAcc=alpaka::DevUniformCudaHipRt, TWorkDiv=alpaka::WorkDivMembers<Dim, Idx>, TKernelFnObj=VectorAddKernel, TArgs=<>]"
/proj/src/vecMain.cpp(52): here
instantiation of "int vecAdd(const Acc &, TQueue &, const TVec &) [with Acc=alpaka::DevUniformCudaHipRt, TQueue=alpaka::Queue<alpaka::ExampleDefaultAcc<Dim, Idx>, alpaka::property::Blocking>, TVec=Vec]"
/proj/src/vecMain.cpp(73): here
/inst/include/alpaka/kernel/Traits.hpp(275): error #276: name followed by "::" must be a class or namespace name
detected during:
instantiation of "auto alpaka::createTaskKernel<TAcc,TWorkDiv,TKernelFnObj,TArgs...>(const TWorkDiv &, const TKernelFnObj &, TArgs &&...) [with TAcc=alpaka::DevUniformCudaHipRt, TWorkDiv=alpaka::WorkDivMembers<Dim, Idx>, TKernelFnObj=VectorAddKernel, TArgs=<>]
...
The reproducing code:
#include <alpaka/alpaka.hpp>
#include <alpaka/example/ExampleDefaultAcc.hpp>
#include <iostream>
#include <random>
#include <typeinfo>
using Dim = alpaka::DimInt<1u>;
using Idx = std::uint32_t;
using Vec = alpaka::Vec<Dim, Idx>;
//#############################################################################
//! A vector addition kernel.
class VectorAddKernel
{
public:
ALPAKA_NO_HOST_ACC_WARNING
template<typename TAcc>
ALPAKA_FN_ACC auto operator()(TAcc const& acc) const -> void { }
};
/* Declare and run the VectorAddKernel */
template<typename Acc, typename TQueue, typename TVec>
int vecAdd(const Acc &devAcc, TQueue &queue, const TVec &numElements) {
// Launch with one warp per thread block:
Idx const warpExtent = alpaka::getWarpSize(devAcc);
TVec const gridBlockExtent = TVec::all(Idx(numElements[0] / warpExtent));
TVec blockThreadExtent = TVec::all(Idx(warpExtent));
TVec elemExtent = TVec::all(Idx(1));
alpaka::WorkDivMembers<Dim, Idx> workDiv{gridBlockExtent, blockThreadExtent, elemExtent};
// Create the kernel execution task.
VectorAddKernel K{};
auto const taskKernel = alpaka::createTaskKernel<Acc>(workDiv, K);
// Enqueue the kernel execution task
alpaka::enqueue(queue, taskKernel);
return 0;
}
int main() {
using Acc = alpaka::ExampleDefaultAcc<Dim, Idx>;
auto const devAcc = alpaka::getDevByIdx<Acc>(0u);
std::cout << "Using alpaka accelerator: " << alpaka::getAccName<Acc>() << std::endl;
using QueueAcc = alpaka::Queue<Acc, alpaka::Blocking>;
QueueAcc queue(devAcc);
// Define the work division
Idx const numElements(123456);
const Vec extents(numElements);
vecAdd(devAcc, queue, extents);
return 0;
}
It was compiled with stock gcc 8.4.0-1ubuntu1~18.04 and CUDA toolkit V10.2.89, using the flags -DALPAKA_ACC_GPU_CUDA_ENABLE=ON -DALPAKA_ACC_GPU_CUDA_ONLY_MODE=ON and the following CMake configuration:
cmake_minimum_required(VERSION 3.15)
set(_TARGET_NAME pairEn)
project(${_TARGET_NAME})
#-------------------------------------------------------------------------------
# Find alpaka.
find_package(alpaka REQUIRED)
#-------------------------------------------------------------------------------
# Add executable.
alpaka_add_executable(
${_TARGET_NAME}
src/vecMain.cpp)
target_compile_features(${_TARGET_NAME} PUBLIC cxx_std_14)
target_link_libraries(
${_TARGET_NAME}
PUBLIC alpaka::alpaka)
install(TARGETS ${_TARGET_NAME} DESTINATION bin)
I also tried with cxx_std_11 and cxx_std_17 and got the same error.
You are mixing up accelerators and devices.
vecAdd(devAcc, queue, extents);
template<typename Acc, typename TQueue, typename TVec>
int vecAdd(const Acc &devAcc, TQueue &queue, const TVec &numElements) {
...
You pass in devAcc, which is a device and not an accelerator, so Acc is deduced as the device type. When you call alpaka::createTaskKernel<Acc>(workDiv, K);, you therefore instantiate the function with a device type instead of an accelerator type, and the Dim trait does not seem to be specialized for devices.
You can fix this with a small change: pass Acc explicitly at the call site and add a separate template parameter for the device type:
vecAdd<Acc>(devAcc, queue, extents);
template<typename Acc, typename Dev, typename TQueue, typename TVec>
int vecAdd(const Dev &devAcc, TQueue &queue, const TVec &numElements) {
...
and it should work then (works for me).
Confirmed: this fixes the issue and solves related troubles I've been having.
I had assumed that devAcc would have the type Acc in auto const devAcc = alpaka::getDevByIdx<Acc>(0u), but I should have checked.