Currently, the random number generation method for an accelerator is hard-coded within alpaka.
To provide different generators per accelerator depending on the user's needs, we should think about an interface change.
E.g. PIConGPU provided different methods up until my pull request to use the native alpaka generator, which removes the possibility for the user to control the quality of the RNG.
In alpaka the generators are already separated from the distributions.
So in theory it would be possible to use different generators. However, there is some work to do:
alpaka::rand::generator::createDefault) This generator should be made its own class.
There are some other generators provided by the C++ standard library which could be added to RandStl.hpp.
alpaka::rand::generator::createDefault simply uses an unspecified generator.
Points 1, 2 and 5 can be solved, but point 4 depends on point 3, which may be very hard.
Edit: point 5 has been solved.
Edit: parts of point 2 have been solved.
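To illustrate what the separation of generator and distribution buys us, here is a minimal sketch using only the C++ standard library (not alpaka). The helper names drawUniform and sampleFromTwoEngines are hypothetical, and the two std engines merely stand in for backend-specific generators:

```cpp
#include <cstdint>
#include <random>

// The distribution code is written once and accepts any engine type.
template<typename TEngine>
double drawUniform(TEngine& engine)
{
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    return dist(engine);
}

// The engines differ a lot in state size, which matters when every
// thread keeps its own state.
static_assert(
    sizeof(std::minstd_rand) < sizeof(std::mt19937),
    "the linear congruential engine has the smaller state");

// Hypothetical helper: the same distribution code driven by two engines.
inline bool sampleFromTwoEngines(std::uint32_t seed)
{
    std::mt19937 mersenne(seed);  // large state
    std::minstd_rand lcg(seed);   // tiny state
    double const a = drawUniform(mersenne);
    double const b = drawUniform(lcg);
    return a >= 0.0 && a < 1.0 && b >= 0.0 && b < 1.0;
}
```

The point is only that the distribution never needs to know which engine it is fed, so swapping generators does not touch the distribution code.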
Thanks for the summary. Point 3 is one reason why I opened this pull request. I think it is not possible, or too hard to maintain, to have all generators on all platforms.
I will also use this issue to think about a solution for how we can handle that each platform may ship different algorithms and nevertheless give the user the opportunity to write code without #if to support the differences between the platforms.
One idea is to create something like a factory where the user can set properties like quality, performance and memory usage and gets back the type of the best-fitting generator. If a platform has only implemented one algorithm, then the same generator will always be returned.
Such a factory might be the only viable option. It might be hard to find the correct properties to describe the generators.
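A rough sketch of how such a factory could look, assuming compile-time selection via a type trait. Everything here is hypothetical (PlatformA, PlatformB, Footprint, GeneratorFactory, firstValue); none of it is existing alpaka API, and the std engines only stand in for platform generators:

```cpp
#include <cstdint>
#include <random>
#include <type_traits>

// Hypothetical platform tags; real alpaka accelerators are not modeled.
struct PlatformA {}; // ships two engines
struct PlatformB {}; // ships only one engine

// Hypothetical property the user can request.
enum class Footprint { Small, Large };

// Primary template: a platform with a single engine ignores the request.
template<typename TPlatform, Footprint TFootprint>
struct GeneratorFactory
{
    using type = std::mt19937;
};

// PlatformA additionally offers a small-state engine.
template<>
struct GeneratorFactory<PlatformA, Footprint::Small>
{
    using type = std::minstd_rand;
};

template<typename TPlatform, Footprint TFootprint>
using Generator = typename GeneratorFactory<TPlatform, TFootprint>::type;

// A platform with one algorithm always returns the same generator.
static_assert(
    std::is_same<Generator<PlatformB, Footprint::Small>,
                 Generator<PlatformB, Footprint::Large>>::value,
    "single-engine platform ignores the requested property");

// User code: no #if, the engine type is resolved per platform.
template<typename TPlatform>
std::uint32_t firstValue(std::uint32_t seed)
{
    Generator<TPlatform, Footprint::Small> gen(seed);
    return static_cast<std::uint32_t>(gen());
}
```

A single enum is of course far too coarse; as said above, finding properties that actually describe the generators is the hard part.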
I will work on point 5 and write some unit tests for the existing generators/distributions because I am already adding some stream and event unit tests at the moment.
Admittedly, a typical PIConGPU 0.4.0-dev simulation on Tesla P100 currently uses (wastes) 18-25% of its main memory (3 out of 12/16 GByte) just for the RNG state. Can we do anything to again allow backend-specific RNGs like the one we had before https://github.com/ComputationalRadiationPhysics/picongpu/pull/2226 (~50% mem footprint)? It would be totally fine if that RNG is only usable on a specific backend (e.g. via a less-specific wrapper/factory as above) and another implementation (and API) is used on other backends.
Cross-linking https://github.com/ComputationalRadiationPhysics/picongpu/pull/2410, where @psychocoderHPC, as a workaround, is bringing XorMin (6 x int32 state per thread; 50% of the footprint of the current RNG) back into PIConGPU in parallel.
enum class Generator
{
    Default,
    MersenneTwister
    // , ...
};
// ...
auto genMersenneTwister = alpaka::rand::generator::create<
    alpaka::rand::Generator::MersenneTwister
>(
    acc,
    12345u,
    6789u
);
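For reference, a self-contained sketch of how the enum-based create call above could be dispatched. This is not alpaka code: the namespace sketch, the EngineFor trait and the use of std engines are assumptions for illustration, and interpreting the two integer arguments as seed and subsequence and mixing them via std::seed_seq is also just an assumption:

```cpp
#include <cstdint>
#include <random>

namespace sketch
{
    enum class Generator
    {
        Default,
        MersenneTwister
    };

    // Hypothetical trait mapping each enumerator to an engine type.
    template<Generator TGen>
    struct EngineFor;

    template<>
    struct EngineFor<Generator::Default>
    {
        using type = std::minstd_rand;
    };

    template<>
    struct EngineFor<Generator::MersenneTwister>
    {
        using type = std::mt19937;
    };

    // The accelerator argument is unused here; a real backend would
    // use it to pick the implementation.
    template<Generator TGen, typename TAcc>
    auto create(TAcc const& /* acc */, std::uint32_t seed, std::uint32_t subsequence)
        -> typename EngineFor<TGen>::type
    {
        using Engine = typename EngineFor<TGen>::type;
        // Assumption: seed and subsequence are mixed via std::seed_seq.
        std::seed_seq seq{seed, subsequence};
        Engine engine(seq);
        return engine;
    }
}
```

A backend that does not provide a requested enumerator would simply have no EngineFor specialization, so the request fails at compile time instead of silently falling back.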
We discussed this in today's meeting. @sliwowitz is currently working on a separate RNG library on top of alpaka that will address this issue. This is therefore WONTFIX and will be closed once the new RNG library is public.