Julia: make `rand` unbiased for floats

Created on 11 Sep 2019 · 26 comments · Source: JuliaLang/julia

The current implementation of samplers for e.g. Float32 and Float64 effectively generates numbers as n*eps(T), where n is an integer between 0 and 1/eps(T)-1 (inclusive). E.g. for Float64 this range is 0:(2^52-1).

This makes the random draws (slightly) biased, by -eps(T)/2.

The proposal is to add this term back to the result in the implementation.
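For Float16 the model is small enough to enumerate exhaustively, which shows both the bias and the proposed fix directly (a sketch assuming the n*eps(T) model above; the means are taken in Float64 so they are exact):

```julia
# Exhaustive check of the n*eps(T) model for Float16: 1024 equally likely values.
using Statistics

T = Float16
N = Int(1 / eps(T))                                       # 1024 outcomes
m_current   = mean(Float64(n * eps(T))            for n in 0:N-1)
m_corrected = mean(Float64(n * eps(T) + eps(T)/2) for n in 0:N-1)
# m_current   == 0.49951171875 == (1 - eps(T))/2, i.e. biased by -eps(T)/2
# m_corrected == 0.5
```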

Cf. this discussion, especially the proposal of @perrutquist.

RNG



Just as a data point, to observe this bias, you would need ballpark

julia> using Statistics, Distributions

julia> Ts = [Float16, Float32, Float64];

julia> [Ts [(std(Uniform())/(eps(T)/2))^2 for T in Ts]]
3×2 Array{Any,2}:
 Float16  3.49525e5 
 Float32  2.34562e13
 Float64  6.7608e30 

number of trials.

Possibly, but since we can do it right with a very simple correction, why shouldn't we?

cc @rfourquet

Some more checks (right column after adding eps(T)/2):

julia> using Statistics

julia> T = Float16; [["P(rand() <= 2^(-$i))" for i in 0:16] [mean([n*eps(T) <= 2.0^(-i) for n in 0:(1/eps(T)-1)]) for i in 0:16] [mean([n*eps(T) + eps(T)/2 <= 2.0^(-i) for n in 0:(1/eps(T)-1)]) for i in 0:16]]
17×3 Array{Any,2}:
 "P(rand() <= 2^(-0))"   1.0          1.0
 "P(rand() <= 2^(-1))"   0.500977     0.5
 "P(rand() <= 2^(-2))"   0.250977     0.25
 "P(rand() <= 2^(-3))"   0.125977     0.125
 "P(rand() <= 2^(-4))"   0.0634766    0.0625
 "P(rand() <= 2^(-5))"   0.0322266    0.03125
 "P(rand() <= 2^(-6))"   0.0166016    0.015625
 "P(rand() <= 2^(-7))"   0.00878906   0.0078125
 "P(rand() <= 2^(-8))"   0.00488281   0.00390625
 "P(rand() <= 2^(-9))"   0.00292969   0.00195313
 "P(rand() <= 2^(-10))"  0.00195313   0.000976563
 "P(rand() <= 2^(-11))"  0.000976563  0.000976563
 "P(rand() <= 2^(-12))"  0.000976563  0.0
 "P(rand() <= 2^(-13))"  0.000976563  0.0
 "P(rand() <= 2^(-14))"  0.000976563  0.0
 "P(rand() <= 2^(-15))"  0.000976563  0.0
 "P(rand() <= 2^(-16))"  0.000976563  0.0


julia> T = Float16; [["P(rand() < 2^(-$i))" for i in 0:16] [mean([n*eps(T) < 2.0^(-i) for n in 0:(1/eps(T)-1)]) for i in 0:16] [mean([n*eps(T) + eps(T)/2 < 2.0^(-i) for n in 0:(1/eps(T)-1)]) for i in 0:16]]
17×3 Array{Any,2}:
 "P(rand() < 2^(-0))"   1.0          1.0
 "P(rand() < 2^(-1))"   0.5          0.5
 "P(rand() < 2^(-2))"   0.25         0.25
 "P(rand() < 2^(-3))"   0.125        0.125
 "P(rand() < 2^(-4))"   0.0625       0.0625
 "P(rand() < 2^(-5))"   0.03125      0.03125
 "P(rand() < 2^(-6))"   0.015625     0.015625
 "P(rand() < 2^(-7))"   0.0078125    0.0078125
 "P(rand() < 2^(-8))"   0.00390625   0.00390625
 "P(rand() < 2^(-9))"   0.00195313   0.00195313
 "P(rand() < 2^(-10))"  0.000976563  0.000976563
 "P(rand() < 2^(-11))"  0.000976563  0.0
 "P(rand() < 2^(-12))"  0.000976563  0.0
 "P(rand() < 2^(-13))"  0.000976563  0.0
 "P(rand() < 2^(-14))"  0.000976563  0.0
 "P(rand() < 2^(-15))"  0.000976563  0.0
 "P(rand() < 2^(-16))"  0.000976563  0.0

This doesn't look too bad at first sight.

@mschauer: I am sorry, I don't understand the purpose of your exercise.

The bias exists and can be described exactly. There is no need to simulate anything.

The bias is correctable, rather cheaply. Whether it "looks bad" in some statistical tests that may or may not have power for detecting it (it is a rather small bias, after all) is not the issue here.

The key question is whether there are any objections to correcting this bias on the table. Otherwise, I would just make a PR. @rfourquet, I would appreciate your opinion on this.

I am sorry, I don't understand the purpose of your exercise.

A good implementation of rand() would, for example, have the property that the probabilities of rand() <= p and rand() < p should both be equal to p. I checked*) this for our Random.rand(T) and for your Random.rand(T) + eps(T)/2 and concluded "it doesn't look too bad" for your idea because e.g. P(rand() <= 0.5) = 0.5 and not 0.500977. I am not sure why you are a bit defensive here. Why not discuss the consequences of this change?

*) Using the information "rand() effectively generates numbers as n*eps(T), where n is an integer between 0 and 1/eps(T)-1 (inclusive)" you gave above.

Sorry if I came across as defensive, that was not the intention.

The consequences of this change are that rand will be unbiased for floats, including the bias you demonstrated, at no measurable performance loss (seems to be negligible compared to noise):

julia> using Random, BenchmarkTools

julia> rand_corrected(T) = rand(T) + eps(T)/2
rand_corrected (generic function with 1 method)

julia> Random.seed!(1);

julia> @btime foreach(_ -> rand(Float64), 1:1024)
  4.382 μs (0 allocations: 0 bytes)

julia> Random.seed!(1);

julia> @btime foreach(_ -> rand_corrected(Float64), 1:1024)
  4.368 μs (0 allocations: 0 bytes)

The performance loss wouldn't just be negligible, it would be exactly zero. The current implementation is something like rand(r, CloseOpen12()) - 1.0 and this could be replaced by rand(r, CloseOpen12()) - prevfloat(1.0)

I.e. the only change is in a compile-time constant.

My understanding is that this rearrangement is valid because eps is 2^-52 in the [1, 2) range and 1.0-prevfloat(1.0) == 2^-53, but I don't know enough about floating point error to prove it formally. But I can of course make the PR do this.
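As a sanity check (not a proof), the identity can be verified exhaustively for Float16, where the range [1, 2) is small enough to enumerate:

```julia
# For every representable r in [1, 2):
#   (r - 1) + eps(T)/2 == r - prevfloat(one(T))
# Checked exhaustively for Float16.
T = Float16
ok = all(0:Int(1/eps(T))-1) do n
    r = one(T) + n * eps(T)          # r steps through [1, 2) in units of eps(T)
    (r - one(T)) + eps(T)/2 == r - prevfloat(one(T))
end
# ok == true
```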

I can verify the bias.

First the Float16 case: With

rnd(::Type{Float16}, i) = Float16(reinterpret(Float32,
                              (UInt32(i) << 13)  | 0x3f800000) - 1)
rndnew(::Type{Float16}, i) = Float16(reinterpret(Float32,
                              (UInt32(i) << 13)  | 0x3f800000) - 0.9995117f0)

being a deterministic model for rand(Float16) generation where i ranges in 0:0x000003ff, I get

julia> mean(rnd(Float16, i) for i in 0:0x000003ff)
Float16(0.4995)

julia> extrema(rnd.(Float16,  0:0x000003ff))
(Float16(0.0), Float16(0.999))

julia> mean(rndnew(Float16, i) for i in 0:0x000003ff)
Float16(0.5)

julia> extrema(rndnew.(Float16,  0:0x000003ff))
(Float16(0.0004883), Float16(0.9995))

Although floating-point addition/subtraction is not associative in general, it's quite easy to show that (r - 1.0) + eps()/2 == r - (1.0 - eps()/2) in this case (on systems that adhere to the IEEE standard). It follows directly from these two facts:

  1. Addition and subtraction produce correctly rounded results.
  2. The intermediate terms (r - 1.0), eps()/2 and (1.0 - eps()/2) can all be expressed exactly, i.e. are not affected by rounding.

We do not need to add eps(Float16)/2; we need to subtract the right number from r, where r is in [1,2), such that the result is unbiased in [0,1], which is 0.9995117f0 in my code above. This coincides with

julia> Float32(1.0 - eps(Float16)/2)
0.9995117f0

but we do not really need to worry about it.

The problem discussed here appears to be of very small significance in practice. See https://github.com/JuliaLang/julia/pull/8958#issuecomment-62491681 and https://github.com/JuliaLang/julia/pull/8958#issuecomment-66564921

Can you please elaborate on this? Which of the TestU01 tests do you think would pick this up?

My understanding is that to detect an eps()/2 difference empirically would take a lot of samples, too many for practical purposes, and very precise mean calculations at the same time. I don't think that any "random" test would pick this up practically, but that does not mean it is not worth fixing.

Not for Float64, but for Float16 this can be observed in a fraction of a second

julia> mean((Float64(rand(Float16)) for i in 1:10^7))
0.4994191243164062

julia> mean((Float64(rand(Float16) + eps(Float16)/2) for i in 1:10^7))
0.5000916360351563

Which of the TestU01 tests do you think would pick this up?

If an issue with a pseudo RNG isn't picked up by the best random number test suite around then I don't think there is an issue. To my understanding, it's the only meaningful way of evaluating pseudo RNGs for statistical applications.

I think that this is a pretty narrow view: in this case the bias is too small to be picked up using a black-box test, but can be demonstrated analytically, and fixed very easily.

TestU01 is a very nice suite, but I am not sure that even its authors would claim that it was designed to pick up everything, or, conversely, that whatever it does not detect is by construction unimportant.

Thinking about it, having P(rand() == 0) too large (see my table) can break Metropolis-Hastings:

using Distributions
using Makie
using Colors
using LinearAlgebra
using Random  # for randperm

logπ(x) = logpdf(Normal(), x)

function mh!(corr, xs, N = 10_000_000, samples = 1:100:N, σ = 2.)
    x = xs[end]
    xmax = maximum(x)
    xmin = minimum(x)
    acc = 0
    for i in 1:N
        xᵒ = x
        while true
            xᵒ = x + σ*randn(Point2)
            norm(xᵒ) > 3 && break
        end
        if rand(Float16) + corr <= exp(logπ(xᵒ[1]) - logπ(x[1]) + logπ(xᵒ[2]) - logπ(x[2]))
            x = xᵒ
            acc += 1
            xmax = max(x[1], x[2], xmax)
            xmin = min(x[1], x[2], xmin)
        end
        if i in samples
            push!(xs, x)
        end
    end
    @show (xmin, xmax)
    @show acc/N
end
xs = [Point2(3.1, 3.1)]
ys = [Point2(3.1, 3.1)]
mh!(0.0, xs)
mh!(eps(Float16)/2, ys) # with correction

N = length(ys)
ii = randperm(2N)
scatter([xs;ys][ii], markersize=0.2, color=[fill(:blue, N);fill(:red, N)][ii])

Output

(xmin, xmax) = (-11.804179996165205, 10.500512563920687)
acc / N = 0.1778087
(xmin, xmax) = (-5.7318743113302615, 5.766862517806166)
acc / N = 0.176106

[Screenshot, 13 Sep 2019: scatter plot of the sampled points, blue without the correction and red with it]

Short explanation: a bivariate Normal(0,1) conditioned on having distance larger than 3 from the origin, sampled with symmetric random-walk Metropolis-Hastings. It should "never" produce values as large as 11, because

julia> cdf(Normal(), -11)
1.9106595744986566e-28
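To connect this to the acceptance step: under the grid model discussed above, rand(Float16) returns exactly 0 with probability eps(Float16) = 1/1024, and a zero draw accepts any proposal, no matter how negative the log-density ratio (a sketch under that model assumption):

```julia
# A zero uniform draw forces acceptance in `rand() <= exp(ll)`.
p_zero = eps(Float16)                 # model probability of drawing exactly 0
ll = -64.0                            # log-ratio of an absurdly unlikely proposal
accepted = Float16(0) <= exp(ll)      # true: exp(ll) > 0, so a zero draw accepts
# With the eps(T)/2 correction the smallest possible draw is eps(Float16)/2,
# which rejects whenever exp(ll) < eps(Float16)/2 ≈ 4.9e-4.
```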

That's a nice demo for Float16. In practice I think one would use Float64 though, where the issue may not be this prominent. But it should be fixed; I will make a PR.

Or `-randexp() < ll` instead of `x < exp(ll)`, i.e. comparing in log space. Anyway: yes, this is just meant as an illustration.

If an issue with a pseudo RNG isn't picked up by the best random number test suite around then I don't think there is an issue.

There's nothing special about the tests in TestU01. Testing that the mean of values sampled from [0,1] is 1/2 is perhaps such a simple test that they didn't think to include it, but clearly it is a valid statistical test and one that we would currently fail given enough samples. So why wouldn't we want to fix it? Or put another way, if the current "best random number test suite" in existence doesn't include a bias test, then we could add a bias test (which we'd currently be failing) and then we would fail the best random number test suite (i.e. the new one that we just added the bias test to).
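A sketch of what such a bias test could look like (hypothetical code, not part of any existing suite): a two-sided z-test that the sample mean of draws from U[0,1] equals 1/2, using Var = 1/12.

```julia
using Statistics

# Returns false when the sample mean deviates from 1/2 by more than z_crit
# standard errors (2.576 ≈ the two-sided 1% critical value of the normal).
function bias_test(draws; z_crit = 2.576)
    n = length(draws)
    z = (mean(draws) - 0.5) / sqrt((1 / 12) / n)
    return abs(z) < z_crit
end
```

As the sample-size calculations later in the thread show, the current Float64 generator would pass this at any feasible n; the point is only that the test is valid in principle.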

Would we fail that test suite because of the bias test? It would take a huge number of draws for Float64. It may be that another test already in the suite would fail us with fewer draws.

However, I am a fan of this issue and associated PR.

No, detecting such a small bias using black-box statistical techniques is prohibitively expensive. From the CLT,

√n (S − 1/2) ∼ N(0, 1/12)

where S is the running mean and 1/12 is the variance of the uniform on [0,1]. The 99.5% quantile for N(0, 1/12) (assuming we are aiming for a two-tailed test at a 1% significance level) is around 0.75. So you would need around

julia> abs2(2*0.75/eps())
4.563542160821626e31

draws to detect this reliably.

Even detecting something much cruder is expensive. E.g. a bias 1000 times larger than the current one would require around 4e25 draws. At 1e9 Float64s/second (give or take factors of 10, depending on the hardware), which is a reasonable ballpark, this would still take on the order of 1e9 years. (Hope I calculated these things right, corrections welcome.)
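Redoing that back-of-the-envelope arithmetic (corrections were invited) gives roughly 1.4e9 years, the same order of magnitude (assuming the 1e9 draws/second throughput above):

```julia
# Bias 1000x the current one, i.e. b = 1000*eps()/2, same z ≈ 1.5 threshold
# as in the abs2(2*0.75/eps()) calculation above.
draws   = abs2(2 * 0.75 / (1000 * eps()))   # ≈ 4.6e25 draws
seconds = draws / 1e9                       # at 1e9 Float64 draws per second
years   = seconds / (365.25 * 24 * 3600)    # ≈ 1.4e9 years
```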

That's correct; I got numbers of the same order (see my first comment, using a slightly simpler approach: checking after how many samples the bias exceeds the standard error of the mean).
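That simpler approach can be written down directly: the bias b = eps(T)/2 becomes detectable roughly when it matches the standard error sqrt(1/12)/sqrt(n) of the sample mean, giving n ≈ (1/12)/b^2, which reproduces the numbers in the first comment:

```julia
# Sample size at which the bias eps(T)/2 equals the standard error of the
# mean of n draws from U[0,1] (variance 1/12).
needed(T) = (1 / 12) / (eps(T) / 2)^2

# needed(Float16) ≈ 3.49525e5
# needed(Float32) ≈ 2.34562e13
# needed(Float64) ≈ 6.7608e30
```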

