Glow: Glow should provide its own implementation of a random number generator to ensure deterministic testing and avoid platform-dependent differences

Created on 28 Aug 2018  路  6Comments  路  Source: pytorch/glow

Glow currently uses the Mersenne Twister engine, which is std::mt19937.

It turns out that this random number generator seeded with the same values generates different random numbers depending on the OS, e.g. it produces different results on MacOS and Linux.

Most likely, the implementation of std::mt19937 is somehow platform-dependent.

To avoid such problems in the future, Glow could use its own in-tree implementation of a random-number generation, which has no target dependencies. It can even be an implementation of mt19937.

enhancement good first issue

Most helpful comment

The function of std::mt19937 is completely specified by the C++ standard, so it should be portable. On the other hand, the operation of uniform_int_distribution is apparently not fully specified, so it is possible that different implementations can return different numbers.

The simplest fix is probably to avoid the *_distribution templates and use the MT output directly.

All 6 comments

@opti-mix A long time ago Glow had a target independent stable doom1-style random number generator:

https://github.com/pytorch/glow/blob/f6b610cc092c08ced8fa6c5ab34cb1ac113590c2/src/glow/Support/Random.cpp

The function of std::mt19937 is completely specified by the C++ standard, so it should be portable. On the other hand, the operation of uniform_int_distribution is apparently not fully specified, so it is possible that different implementations can return different numbers.

The simplest fix is probably to avoid the *_distribution templates and use the MT output directly.

Hi,
I am new to this repo, I found on reddit a similar issue https://www.reddit.com/r/cpp/comments/7i21sn/til_uniform_int_distribution_is_not_portable/ and that the boost library provides a portable implementation and further on stackoverflow there is an example implementation https://stackoverflow.com/questions/2254909/boost-random-number-generator, is it alright if I followed that implementation for this issue and make a PR. The other option is to write our own function like here https://stackoverflow.com/questions/34903356/c11-random-number-distributions-are-not-consistent-across-platforms-what-al Thanks!

Hi @weimin -- I believe we do already take Boost as a dependency in order to support folly. So that is a potential option. However, another option would be to follow @stoklund's suggestion, and IMO that would be preferable, since I believe should be pretty simple and limit our dependence which I personally would prefer.

The simplest fix is probably to avoid the *_distribution templates and use the MT output directly.

Hi,
Thanks for your reply, I tested the MT output:

int main(){
  typedef std::chrono::high_resolution_clock myclock;
  myclock::time_point beginning = myclock::now();
  myclock::duration d = myclock::now() - beginning;
  unsigned seed = d.count();
  std::mt19937 generator(seed);
  std::map<int, int> hist;
  int a = 1;
  int b = 6;
  double dScale = ((double)b+1 - (double)a) / ((double)(generator.max() - generator.min())+(double)1);
  std::cout << "scale: " << dScale << "\n";
  for(int n=0;n<10000;++n){
    int output = (generator() - generator.min()) * dScale  + a;
    ++hist[output];
  }
  for (auto p : hist) {
        std::cout << std::fixed << std::setprecision(1) << std::setw(2)
                  << p.first << ' ' << std::string(p.second/200, '*') << '\n';
  }
  return 0;
}
(base) Weis-MBP:prng weimin$ ./test
scale: 1.39698e-09
 1 ********
 2 ********
 3 ********
 4 ********
 5 ********
 6 ********

However I am having some problems building the code to do more tests, I am using os X Catalina (10.15.6) and I followed the instructions to build it using homebrew and I also updated XCode but its giving me the following errors:
CMakeError.log
CMakeOutput.log

I also tried to build on aws (Ubuntu 18.04) and got much further but it still didn't complete at ninja all:
[428/674] Linking CXX executable bin/model-profiler
FAILED: bin/model-profiler
CMakeError.log
CMakeOutput.log
output.txt

Do you have any suggestions on how to build the code, in the error log it looks like the
clang: error: unknown argument: '-wd654'
clang: error: unknown argument: '-wd654'
ninja: build stopped: subcommand failed.
issue from a previous post.

Thanks!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

dati91 picture dati91  路  3Comments

wayneshawn picture wayneshawn  路  3Comments

tlepley-cadence picture tlepley-cadence  路  4Comments

rdzhabarov picture rdzhabarov  路  4Comments

speryt picture speryt  路  3Comments