PR #4536 introduced the best PNRG method Crystal has had to date. As we want to remove anything that isn't essential from the standard library, we will only keep PCG32 and create a new official shard to provide the rest of the PNRG methods.
PRs to remove the methods from the stdlib are welcome!
Also, create a shard with the extracted methods here: https://github.com/crystal-lang/crystal-random
It seems I can't create a PR to shard as it's empty.
PCG32 has not proven itself by being used in the standard library of any serious programming language, or otherwise. Making it the default was bad, removing alternatives is madness.
If it doesn't prove itself, it can certainly be replaced before 1.0. But there is surely no value in having more than one generator in the stdlib.
@oprypin do you know of any test suites to compare PRNGs? Everyone in the #4536 agreed that it was an appropriate default for the stdlib. The only way to settle this discussion is by running tests and comparing metrics: performance, coverage, etc.
comparing metrics: performance, coverage
That gets us nowhere. return 4 would easily win.
Without being deeply involved in the field, the only way to compare PRNGs is by the level of trust. There is no rigorous way to prove randomness.
PCG32 has not proven itself by being used in the standard library of any serious programming language
Yes, and I wonder why. Xorshift varions are used in several languares though (as I mentioned in #4493) - would you prefer it? Is it discussion of Xorshift vs PCG or MT19937+ISAAC vs PCG?
Few more links about PRNGS:
https://sunoru.github.io/RandomNumbers.jl/stable/man/benchmark/
https://cocoawithlove.com/blog/2016/05/19/random-numbers.html
https://www.reddit.com/r/programming/comments/4gtlfz/xoroshiro128_the_fastest_fullperiod_pseudorandom/
It seems that most people agree that xoroshiro is slightly faster than PCG. Same was results on my PC, but @ysbaddaden have results on both x32 and x64 machines where PCG was faster. The only thing everybody agree - Mersenne Twister is outdated. It is slower, uses more memory, easily predictable and fail some statistical tests. So IMHO replacing it with any modern PRNG is a good thing. Of those PRNGS the choice is (again, IMHO) between PCG and Xorshift. Personally i'm slightly inclined towards PCG, as I was told about it on local gamedev forum by some people with serious mathematic background. But yes, several languages prefer xorshift, and PCG is somewhy isn't used in any stdlib.
There is no rigorous way to prove randomness.
There is a TestU01. Both PCG and Xorshift passes it.
I don't care so much about merits of PRNGs and I am not directing the communication that way. All I'm saying is the standard library should have a trusted PRNG, not only one that performs fast or whatever.
@oprypin so is a PRNG that is used by Erlang, JavaScript V8, Nim and Rust (Rust is using ISAAC by default though) trusted enough? Would it better if we switch to it instead of PCG?
Another idea is to let ISAAC stay (as @ysbaddaden proposed IIRC) - it is cryptographically secure, time-proven and requires zero efforts as it is already here.
I'm not knowledgeable enough on the topic of randomness, so I asked @jedisct1 (libsodium author):
PCG鈥檚 good. I would use ChaCha20/12 if speed is not the primary concern, though.
Everything I read says the same: PCG32 and xorshift* are both good. They pass the TestU01 statistical test suite, and they're fast. They aren't cryptographically secure AFAIK, so documentation should state that, and redirect to Random::System, and ChaCha20 (maybe ISAAC), available in the https://github.com/crystal-lang/crystal-random Shard.
True entropy is really the provenance of the OS - has access to boot time entropy pools, mixes inputs, and has direct hwrng access if available. Software-only solutions in userspace are inherently at a disadvantage against attackers, so I'd lean towards a performant prng in stdlib, with instructions for users on getting heavy lifting done by the OS if possible, or using a proven slower one from a separate lib.
I think distributing a proven and trusted secure PRNG along the faster default and the system source is a good idea. We'd have a fast yet good default, a secure but system-specific solution, and a secure solution using the same PRNG, whatever the underlying system.
Maybe we could introduce ChaCha12 (or ChaCha20 if we're conservative) instead of ISAAC. The implementation is simpler, it's well trusted and used in many places. ChaCha20 performed poorly in non release mode (which may or may not a concern) but I suppose ChaCha12 would do better?
this stackexchange answer seems to agree that ChaCha12 is a fine choice. PCG32 seems to be fast and "challenging" to predict compared to MT19937, but still not fully secure. Therefore PCG32 appears to be a very good all-rounder, a good default. ChaCha12 seems to be a good choice for cryptographically secure random. In addition to providing /dev/urandom, I think this is the best solution.
ChaCha12 should take 60% time of ChaCha20 +- overhead. I've updated my benchmark at https://github.com/konovod/prngs with it, and added to table results in non-release mode.
Note that this is an implementation based on ChaCha20 RFC/docs, not an optimized SSE version mentioned at pcg-random or (possibly optimized?) version used in arc4random.
Speaking about performance, I have only basic experience in optimization, didn't tried to check generated assembly code (and don't know SSE anyway).
But there is a huge increase in ChaCha1220 speed in release mode (and it is still passing specs in it), so perhaps LLVM knows it's work.
Also, both PCG32 and ChaCha20 (and also XorShift) use rotl command (rotating shift). According to @funny-falcon comment, crystal fails to convert it to single instruction (i didn't checked it in my case). So perhaps all three algorithms could be made much faster if there would be the way to generate rot instruction (but it seems more like an LLVM problem).
About ISAAC - i don't know about attacks on it, there is even unclaimed bounty on it's page for discovering a seed from sequence, but it is based on RC4 that is known to be vulnerable. So ChaCha is a safer bet.
crystal fails to convert it to single instruction
I didn't said it always fails. But it fails in unpredictable situations, so it should be always checked.
Looks like, combining shift+or into rotl is late in optimisations chain, and previous optimisations sometimes break this conversion.
but it is based on RC4
Same way SHA2 is based on MD5. But SHA2 is still not broken.
There is improved version ISAAC+ from Jean-Phillipe Aumasson "On the pseudo-random generator ISAAC"(https://131002.net/data/papers/Aum06b.pdf)
Does LLVM provide intrinsics for rot or rotl that we could take advantage of?
AFAIK no.
There is no rot intrinsic.
LLVM sources have only macros like this:
#define rot( x, k ) (((x)<<(k)) | ((x)>>(32-(k))))
Usually it is optimized by LLVM into rot instruction as @funny-falcon already said.
There is improved version ISAAC+ from Jean-Phillipe Aumasson "On the pseudo-random generator ISAAC"(https://131002.net/data/papers/Aum06b.pdf)
This require very little changes to ISAAC - replace shifts to rotation, one addition to xor, add xor in one place. I did it in my prngs shard. Makes thing slower by 5-10%. The only thing that stops me from making PR - i haven't found any implementations of it. So no specs or test vectors - i can only generate some values and call it spec, hoping that i implemented it right.
Interesting, but no test vectors nor usages is a show stopper, yes.
Just created an empty shared at this repo: https://github.com/crystal-lang/crystal-random. PRs to move stuff from the stdlib welcome!
This issue can be closed. In stdlib there's only Random::PCG32, Random::ISAAC and Random::Secure. Everything else has been moved to https://github.com/crystal-lang/crystal-random
Most helpful comment
I'm not knowledgeable enough on the topic of randomness, so I asked @jedisct1 (libsodium author):
Everything I read says the same: PCG32 and xorshift* are both good. They pass the TestU01 statistical test suite, and they're fast. They aren't cryptographically secure AFAIK, so documentation should state that, and redirect to
Random::System, and ChaCha20 (maybe ISAAC), available in the https://github.com/crystal-lang/crystal-random Shard.