The newly added method `std::f64::mod_euc()` can produce values equal to the modulus when applied to a small negative float and a sufficiently big modulus. AFAIK this should never happen.
I tried this code on nightly rustc 1.27.0-nightly (ac3c2288f 2018-04-18) and the version on playground:
```rust
#![feature(euclidean_division)]

fn main() {
    let f = -std::f64::EPSILON;
    let m = 3.0f64;
    assert!(f.mod_euc(m) < m);
}
```
https://play.rust-lang.org/?gist=01a506080183f14067553856de6763ab&version=nightly
The operation should only produce values in the range `0.0 <= x < modulus`. But it produced a value equal to the modulus, in this case 3.0. This happens for all moduli that are sufficiently big relative to the machine epsilon. It is likely a bug in the implementation that overlooks some floating point arithmetic oddities. I haven't looked deeper yet.
It is currently implemented as
```rust
pub fn mod_euc(self, rhs: f64) -> f64 {
    let r = self % rhs;
    if r < 0.0 {
        r + rhs.abs()
    } else {
        r
    }
}
```
https://github.com/rust-lang/rust/blob/master/src/libstd/f64.rs#L236
What happens is that because -EPSILON is negative, the modulus is added to it. This still results in the modulus itself when it is big enough, because the exact sum is not representable in the given precision.
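To isolate the rounding step (a minimal sketch, assuming the usual truncated-remainder behaviour of `%` on floats):

```rust
fn main() {
    // -EPSILON % 3.0 is exactly -EPSILON (the truncated quotient is 0).
    let r = -std::f64::EPSILON % 3.0;
    // The exact sum 3.0 - EPSILON falls halfway between representable
    // values (the f64 spacing near 3.0 is 2 * EPSILON), and
    // round-to-nearest, ties-to-even lands on 3.0 itself:
    assert_eq!(r + 3.0, 3.0);
}
```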
A more stable implementation would, I believe, be something like
```rust
pub fn mod_euc(self, rhs: f64) -> f64 {
    ((self % rhs) + rhs.abs()) % rhs
}
```
Thanks @fkjogu for the investigation, and good catch!
I wonder whether:
```rust
pub fn mod_euc(self, rhs: f64) -> f64 {
    let mut r = self % rhs;
    if r < 0.0 {
        r += rhs.abs();
        if r == rhs {
            return 0.0;
        }
    }
    r
}
```
would be a slightly better implementation, as it avoids the extra unnecessary division from the second %.
Correspondingly, I imagine div_euc is probably incorrect. The floating-point implementations were added later in the RFC, so they probably didn't get quite as much examination as the integer implementations...
This is also how it behaves in Python:
```
>>> -1e-18 % 1
1.0
```
and in Haskell:
```
λ> import Data.Fixed
λ> (-1e-18) `mod'` 1.0
1.0
```
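And, at the time of writing, the nightly Rust implementation behaves the same way; a minimal reproduction mirroring the examples above:

```rust
#![feature(euclidean_division)]

fn main() {
    // -1e-18 % 1.0 == -1e-18; adding the modulus rounds back to 1.0.
    assert_eq!((-1e-18f64).mod_euc(1.0), 1.0);
}
```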
For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.
To play devil's advocate though, a colleague of mine has shared his story of the worst bug he ever had to debug, which was basically caused by having exactly this happen in his own handwritten div_floor function (causing some binning algorithm to try to write some data into the "negative first" bin).
But that was in C++, where UB reigns supreme.
> For better or worse, it seems that virtually all possible implementations of floored or Euclidean modulus of floating point numbers naturally have this problem unless you explicitly account for this nasty edge case.
That's very interesting; I wasn't aware of that.
Regardless, this bug should definitely be fixed. I'd feel more comfortable addressing it along with a fix to div_euc, so we don't end up replacing one bad behaviour with another.
> This is also how it behaves in Python and in Haskell
True, one could argue that this is just how floating point arithmetic works. Floats certainly break many axioms, for example the uniqueness of the neutral element in the additive group: EPSILON + 10.0 == 10.0. But in a numerical context the non-uniqueness example feels more like an imprecision, whereas the modulus is simply not an element of the residue class it creates. To me, this feels 'more' wrong and I suspect it would introduce more subtle and fundamental bugs.
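For the record, that axiom-breaking example is easy to demonstrate:

```rust
fn main() {
    // EPSILON (2^-52) is below half an ulp at 10.0 (the spacing there
    // is 2^-49), so the sum rounds straight back to 10.0:
    assert_eq!(std::f64::EPSILON + 10.0, 10.0);
}
```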
I'll admit that I do not give a strong argument. Maybe I just got used to only a subset of floating point oddities :wink:.
_Edit:_ Or to argue from another side: I'm okay with an operator like % behaving like that, but if there is a special method that 'promises' to deliver a Euclidean modulo operation, it should do the right thing, IMHO.
> a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing
My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.
I think that setting this precedent is something I can get behind; just be aware there is another closely-related edge case at -1e-18 % -1.0:
```diff
 pub fn mod_euc(self, rhs: f64) -> f64 {
     let mut r = self % rhs;
     if r < 0.0 {
         r += rhs.abs();
-        if r == rhs {
+        if r == rhs.abs() {
             return 0.0;
         }
     }
     r
 }
```
P.S. bear in mind the ultimate property of any div/mod function pair:
`a == a.div_euc(b) * b + a.mod_euc(b)`
I just tested with Python's (flooring) division and float, and see that it generally satisfies this property for pairs of numbers uniformly drawn at random from (-1, 1), but quickly finds a counterexample when asked to draw numbers of wildly different magnitude.
```python
import sys
import random

def uniform(amplitude):
    return amplitude * (2 * random.random() - 1)

amp_1 = float(sys.argv[1])
amp_2 = float(sys.argv[2])

while True:
    a = uniform(amp_1)
    b = uniform(amp_2)
    if a != (a // b) * b + (a % b):
        print(a, b)
        sys.exit(1)
```
```
$ python3 find-counterexample.py 1.0 1.0
(CPU spins up)
(time passes.........)
(okay, I'm bored)
^C
$ python3 find-counterexample.py 1.0 1e15
-0.7833003032084154 281330340541134.03
$ python3 find-counterexample.py 1e15 1.0
556555443188291.3 -0.047383938082182775
```
I wonder how well the corrected definitions fare at this? (my guess: probably still not well, and it might not even be possible to produce outputs satisfying this property for certain inputs. Not sure.)
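For what it's worth, here is a quick Rust check of one of the pairs found above, using a floor-style remainder built from `%` as a sketch of the Python semantics (`mod_floor` is a hypothetical helper, not the std implementation):

```rust
fn mod_floor(a: f64, b: f64) -> f64 {
    let r = a % b;
    // Fold the truncated remainder into the sign of the divisor,
    // mirroring Python's % on floats.
    if r != 0.0 && (r < 0.0) != (b < 0.0) { r + b } else { r }
}

fn main() {
    let (a, b) = (-0.7833003032084154_f64, 281330340541134.03_f64);
    let q = ((a - mod_floor(a, b)) / b).round(); // roughly Python's a // b
    // Prints `false` for this pair, matching the Python result above.
    println!("{}", a == q * b + mod_floor(a, b));
}
```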
I'd like to contribute a different perspective: I consider the current behavior to be reasonable and correct but think we should document that the return value of mod_euc may equal the divisor due to precision problems.
Let me explain why I think so: Normally, the way floating point operations work (at least the basic operations: addition, subtraction, multiplication and division) is that they produce the floating point number that is closest to the mathematically correct result. For example, 3.0 - std::f64::EPSILON == 3.0 because 3.0 is the closest representable value to the (unrepresentable) exact result.
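That example can be checked directly:

```rust
fn main() {
    // The exact difference sits exactly half an ulp below 3.0;
    // round-to-nearest, ties-to-even picks 3.0, whose last mantissa
    // bit is 0:
    assert_eq!(3.0 - std::f64::EPSILON, 3.0);
}
```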
Applying this to the example (-std::f64::EPSILON).mod_euc(3.0), we see that the mathematically exact result is 3.0 - std::f64::EPSILON which should be rounded – as before – to 3.0. If we used the alternate implementation, the result would be 0.0, which is about as far off from the mathematically correct result as it can be.
In addition, if we wanted to preserve the (desirable) property that a.div_euc(b) * b + a.mod_euc(b) is close to a, we would have to implement div_euc in such a way that (-std::f64::EPSILON).div_euc(3.0) == 0.0. But this is very confusing, since the Euclidean quotient should always be negative if the dividend is negative and the divisor is positive.
But my main point is that deviating from the principle "floating point results should be as close as possible to the mathematically exact result" requires a really good reason. I don't see why the requirement a.mod_euc(b) != b should be such a good reason; in my opinion, it is not that important, as long as we have 0.0 <= a.mod_euc(b) <= b.abs(). The way other languages handle this case seems to support my view.
> > a special method that 'promises' to deliver a Euclidean modulo operation should do the right thing
>
> My thoughts exactly. The whole point of the method is to be the "good" version of %, so it should meet the euclidean definition strictly, even if that's more expensive.
I agree. I just think that the current version is the "best" version of % given that floating point is imprecise.
> Applying this to the example `(-std::f64::EPSILON).mod_euc(3.0)`, we see that the mathematically exact result is `3.0 - std::f64::EPSILON` which should be rounded – as before – to `3.0`. If we used the alternate implementation, the result would be `0.0`, which is about as far off from the mathematically correct result as it can be.
I was about to argue against this, but after thinking about it, you're right.
Basically, I was going to say that the distance to the correct result should be measured in "periodic space;" in this sense, 0.0 is "just as correct" as 3.0, because in the quotient space Reals // 3.0 these two points are one and the same.
But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:
```rust
fn div_euc(_a: f64, _b: f64) -> f64 { 0.0 }
fn mod_euc(a: f64, _b: f64) -> f64 { a }
```
so our definition of error needs to be one that can differentiate between 0 and b.
I am just sharing this in case somebody else tries to make the same argument.
> If we used the alternate implementation, the result would be `0.0`, which is about as far off from the mathematically correct result as it can be.

> But then I realized the flaw in this argument. We can't measure error in periodic space, because otherwise the best implementations would clearly be the following:
I disagree: we _should_ measure the error in the periodic range [0, 3.0), because that's the codomain of the operation. And it's precisely for this reason that 3.0 is absolutely incorrect — it's the closest floating point number _in the entire domain of floating point numbers_, but in the domain [0, 3.0), which is what mod_euc is defined to return, it's not even a valid number. 3.0 is an invalid output. The closest valid output to 3.0 _is_ 0.0, because they are equivalent under the quotient relation, but only one of them is a valid output (that's literally the entire point of modulo).
> as long as we have `0.0 <= a.mod_euc(b) <= b.abs()`
We should have 0.0 <= a.mod_euc(b) < b.abs(). We can't just change the definition because floating point is frustrating.
I agree we also want a == a.div_euc(b) * b + a.mod_euc(b), but I don't see that being mutually exclusive.
@varkor If our highest priority is that a.mod_euc(b) < b.abs(), then I agree with you.
I just think it's okay to weaken that requirement for floating points because they are not precise by their very nature.
To give an analogy: Mathematically, we have -PI / 2 < x.atan() < PI / 2 (note the "strictly less than"). However, we have std::f64::MAX.atan() == std::f64::consts::FRAC_PI_2 (proof), although it is technically not in the codomain of atan. For this function, it was apparently okay to extend the codomain from (-PI/2, PI/2) to [-PI/2,PI/2]. I argue that it is also okay to extend the codomain of mod_euc from [0,b.abs()) to [0,b.abs()].
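The atan claim is easy to check directly:

```rust
fn main() {
    // atan of a huge argument is within far less than half an ulp of
    // PI/2, so the correctly rounded result is FRAC_PI_2 itself:
    assert_eq!(std::f64::MAX.atan(), std::f64::consts::FRAC_PI_2);
}
```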
> I agree we also want `a == a.div_euc(b) * b + a.mod_euc(b)`, but I don't see that being mutually exclusive.
So here we definitely have a problem, because in the example above, we have to set (-std::f64::EPSILON).div_euc(3.0) == 0.0. The mathematically exact result is -1.0, which is far away from 0.0. In this case, the discrepancy is not defensible by periodicity because the codomain is not periodic.
I find @fanzier rather convincing. It is true that we cannot fix all the edge cases floats introduce. But no matter the outcome, in my opinion the discussed property should at least be documented, because I find it surprising.
The reason it surprises me more than in the example -PI / 2 < x.atan() < PI / 2 is that PI / 2 is still asymptotically uniquely correct because atan is continuous. mod is not, so at the border/discontinuity it depends from which side you take the limit. In this sense, (-std::f64::EPSILON).mod_euc(3.0) == 3.0 is as asymptotically correct as == 0.0. I'd argue that it makes more sense to resolve the ambiguity in favor of a valid output. But it definitely should not break `a == a.div_euc(b) * b + a.mod_euc(b)`.
So my conclusion is, leave it as it is, but add some friendly documentation:
> Due to limitations of the floating point format, `a.mod_euc(b)` can evaluate to the mathematically invalid result `b` if `a` is much smaller in magnitude than `b` and `a < 0`.
If you agree, I can create a pull request.
I just realized the property a == a.div_euc(b) * b + a.mod_euc(b) is also broken in the current implementation. Try
```rust
#![feature(euclidean_division)]

fn main() {
    let m = 3.0f64;
    let f = m - std::f64::EPSILON;
    let q = f.div_euc(m);
    let r = f.mod_euc(m);
    assert_eq!(f, m * r + q);
}
```
https://play.rust-lang.org/?gist=2a2664c0c1d80746cc3c00518499cfbd&version=nightly
It's like playing Whac-A-Mole.
> in my opinion the discussed property should be at least documented, because I find it surprising.
Definitely, I also said that in my first comment :)
> Due to limitations of the floating point format `a.mod_euc(b)` can evaluate to the mathematically invalid result `b` if `a` is much smaller in magnitude than `b` and `a < 0`.
I think we should specify that we mean a.mod_euc(b) == b.abs() by "mathematically invalid". So I would prefer something along the lines of "While the mathematically exact result always satisfies a.mod_euc(b) < b.abs(), the rounded floating point result can be a.mod_euc(b) == b.abs() due to floating point precision problems if a is much smaller in magnitude than b and a < 0."
> I just realized the property `a == a.div_euc(b) * b + a.mod_euc(b)` is also broken in the current implementation. Try (...)
No, it is not. (At least not in your example.) Note that f is set to exactly 3.0 in your example (due to rounding), so f.mod_euc(m) == 0.0 is completely correct.
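Indeed, the rounding happens before mod_euc is even involved:

```rust
fn main() {
    let m = 3.0f64;
    // m - EPSILON ties to 3.0 under round-to-nearest, ties-to-even,
    // so the remainder of 0.0 is legitimately correct:
    let f = m - std::f64::EPSILON;
    assert_eq!(f, 3.0);
}
```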
Perhaps it would help to examine the use cases this method is meant to address. If it is supposed to be "as close as possible" (as one usually wants for mathematical functions), it seems to me that sometimes the correctly rounded result will be rounded up to the divisor in the natural implementation (although one could make an argument that 0 is an equally correct result, it's not more correct). If, on the other hand, this method is intended to bring an input into a half-open range (e.g. because you're computing a function that is not defined everywhere, or has a discontinuity you'd rather avoid) then the result should clearly be strictly less than the divisor.
> No, it is not. (At least not in your example.) Note that `f` is set to exactly `3.0` in your example (due to rounding), so `f.mod_euc(m) == 0.0` is completely correct.
I am very sorry, I had a bug in my code (or rather my brain).
```diff
 #![feature(euclidean_division)]

 fn main() {
     let m = 3.0f64;
     let f = m - std::f64::EPSILON;
     let q = f.div_euc(m);
     let r = f.mod_euc(m);
-    assert_eq!(f, m * r + q);
+    assert_eq!(f, m * q + r);
 }
```
Indeed it yields the correct result.
I have created a pull request. Feel free to shout out any improvements.
As far as I'm aware, mod_euc behaves as expected (and is documented as such) now.