Alpaka: Simple Summary: Areas of Alpaka Abstraction

Created on 3 Aug 2018 · 3 Comments · Source: alpaka-group/alpaka

We should summarize in the docs which areas Alpaka abstracts and which it does not.
That makes it easier for users to know where to look for certain things when building on Alpaka.

Currently it abstracts (feel free to edit):

  • thread scheduling & indexing
  • math primitives (abs, cos/sin/tan, exp, pow, ...)
  • random number generators
  • memory allocation: device-global, block-shared, thread-local
  • memory atomics
  • timing
  • ...
  • future: BLAS primitives in/ex kernel, e.g. CUTLASS-like?
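
To make the first two bullets concrete, here is a minimal sketch of what "abstracting thread indexing and math primitives" could look like from a kernel's point of view. It deliberately does not use Alpaka's real API: `SerialAcc`, `globalThreadIdx`, `launchSerial`, `SqrtKernel` and `sqrtAll` are hypothetical names invented for this illustration, and only a serial stand-in back end is shown. The point is only that the kernel body asks the accelerator object for its index and for `sqrt` instead of calling a back-end-specific facility directly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an accelerator back end; NOT Alpaka's API.
// It bundles the two things the kernel below asks the abstraction layer for:
// the index of the current thread and a math primitive.
struct SerialAcc {
    std::size_t threadIdx;  // global index of the "thread" running the kernel

    std::size_t globalThreadIdx() const { return threadIdx; }
    double sqrt(double x) const { return std::sqrt(x); }
};

// A kernel is a functor templated on the accelerator type, so the same body
// can be compiled for any back end that offers the same interface.
struct SqrtKernel {
    template <typename Acc>
    void operator()(Acc const& acc, double const* in, double* out, std::size_t n) const {
        std::size_t const i = acc.globalThreadIdx();
        if (i < n) {  // guard against surplus threads
            out[i] = acc.sqrt(in[i]);
        }
    }
};

// Hypothetical serial "launch": one kernel invocation per thread index.
template <typename Kernel, typename... Args>
void launchSerial(std::size_t numThreads, Kernel const& kernel, Args... args) {
    for (std::size_t t = 0; t < numThreads; ++t) {
        kernel(SerialAcc{t}, args...);
    }
}

// Example driver: element-wise square root through the abstraction.
inline std::vector<double> sqrtAll(std::vector<double> const& in) {
    std::vector<double> out(in.size(), 0.0);
    launchSerial(in.size(), SqrtKernel{}, in.data(), out.data(), in.size());
    assert(out.size() == in.size());
    return out;
}
```

Calling `sqrtAll` with a vector of inputs returns their element-wise square roots. Under these assumptions, swapping the back end would mean providing another type with the same two member functions and a matching launch function, while `SqrtKernel` stays untouched.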

Batteries not included:

  • memory: layout, pre-fetch/caching strategy, ...
  • containers & algorithms
  • ...

Labels: Help Wanted, Documentation

All 3 comments

I think 'memory access' is probably too vague. E.g. alpaka abstracts shared memory for blocks, which could be counted as memory access. Did you mean that it does not abstract data layout?

"Task scheduling" indirectly covers device command submission (command queues, CUDA streams, ...), but it does not obviously cover device-host synchronization (events, waiting for a queue or device, etc.), which Alpaka must also abstract. You will want to discuss that as well.

I agree with @HadrienG2 that we should add device-host synchronization explicitly, as it does indeed not seem to be covered by the other items in the list.
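
For readers unfamiliar with the distinction drawn above, the following is a minimal sketch of the contract that queues and events provide: enqueueing returns without running the work, and the host may only rely on results after waiting on an event or on the whole queue. This is not Alpaka's API. `DeferredQueue`, `Event`, `record`, `wait`, `SyncTrace` and `traceSynchronization` are hypothetical names, and the queue runs its tasks lazily on the calling thread (an assumption made to keep the example deterministic) rather than on a device.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical in-order queue that models host-side synchronization semantics.
// Work is deferred until the host waits; a real back end would run it
// asynchronously on a device instead.
class DeferredQueue {
public:
    // An event marks a position in the queue: "all tasks enqueued so far".
    using Event = std::size_t;

    // Submission returns immediately; the task has NOT run yet.
    void enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Record an event covering everything enqueued up to this point.
    Event record() const { return completed_ + tasks_.size(); }

    // Block the host until every task covered by the event has finished.
    void wait(Event event) {
        while (completed_ < event && !tasks_.empty()) {
            auto task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
            ++completed_;
        }
    }

    // Block the host until the whole queue has drained.
    void wait() { wait(record()); }

private:
    std::deque<std::function<void()>> tasks_;
    std::size_t completed_ = 0;  // number of tasks already executed
};

// Number of finished tasks observed by the host at three points in time.
struct SyncTrace {
    std::size_t beforeWait;
    std::size_t afterEvent;
    std::size_t afterQueueWait;
};

inline SyncTrace traceSynchronization() {
    std::vector<int> log;
    DeferredQueue queue;
    queue.enqueue([&log] { log.push_back(1); });
    queue.enqueue([&log] { log.push_back(2); });
    DeferredQueue::Event const event = queue.record();  // covers tasks 1 and 2
    queue.enqueue([&log] { log.push_back(3); });

    SyncTrace trace{};
    trace.beforeWait = log.size();      // nothing is guaranteed to have run
    queue.wait(event);
    trace.afterEvent = log.size();      // tasks before the event are done
    queue.wait();
    trace.afterQueueWait = log.size();  // everything is done
    return trace;
}
```

In this sketch the host sees no results before waiting, the first two after waiting on the event, and all three after waiting on the queue. That ordering guarantee is the part a documentation entry on device-host synchronization would need to describe, separately from how work gets submitted.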
