Extract this issue from https://github.com/dotnet/coreclr/issues/10651#issuecomment-315667429
Generally speaking, double.ToString("R")
tries to:
This introduced a bug and also listed as a known issue on MSDN (See "Notes to Callers"):
https://stackoverflow.com/questions/24299692/why-is-a-round-trip-conversion-via-a-string-not-safe-for-a-double
https://msdn.microsoft.com/en-us/library/kfsatb94(v=vs.110).aspx
To reproduce this issue, simplely try following code:
using System;
namespace lab
{
class Program
{
static void Main(string[] args)
{
double d1 = 0.84551240822557006;
string s = d1.ToString("R");
double d2 = double.Parse(s);
Console.WriteLine(d1 == d2);
}
}
}
The output is False
which we expect True
. The reason is d1.ToString("R") wrongly chose the result in precision of 15, which is 0.84551240822557. If we choose result in precision of 17, which is 0.84551240822557006, we can convert it back to double accurately.
The proposal is just convert the double to string in 17 digits precision directly. We can give it a try but it may be a breaking change with compat issue as @jkotas said here: https://github.com/dotnet/coreclr/issues/10651#issuecomment-315839827
For the bug: 0.84551240822557
is 3FEB0E7009B61CDF
and 0.84551240822557006
is 3FEB0E7009B61CE0
, so the delta for their bit representations is 1.
So, it would seem that there is an off-by-one error somewhere in NumberToDouble
.
I would think we should find this off-by-one error before taking any other course of action.
It looks like we are taking the incorrect path here: https://github.com/dotnet/coreclr/blob/master/src/classlibnative/bcltype/number.cpp#L534.
Rounding to even in this scenario causes us to generate the incorrect value.
In any case, It's late and I'm not going to finish processing exactly what it is doing wrong tonight.
Okay, the rounding logic is correct, so the error must be earlier on...
val
is 0xD873804DB0E6FC00
, this is the mantissa (0xB0E7009B61CDF) + the guard bit.
The logic for 'round to even' is, do nothing if the guard bit is 0 or if the guard bit is 1 and the LSB of the mantissa is 0. In this case, both the guard bit and the LSB of the mantissa are 1, so it rounds up setting val
to 0xD873804DB0E70000
.
I'm going to keep digging to see if the guard bit is being set incorrectly at some point.
Ok. Implementation looks and behaves correctly, as best as I can tell.
After going through the IEEE spec again, it comes down to these two rules:
Conversions from a supported binary format bf to an external character sequence and back again results in a copy of the original number so long as there are at least Pmin (bf ) significant digits specified and the rounding-direction attributes in effect during the two conversions are round to nearest rounding-direction attributes.
Conversions from a supported decimal format df to an external character sequence and back again results in a canonical copy of the original number so long as the conversion to the external character sequence is one that preserves the quantum
For double-precision, the first rule is essentially: 'a double converted to a string with at least 17 digits and back must match'
The second rule is essentially: a string with 15 or less digits converted to a double and back to a string with the same number of digits must match
Here, we are only converting from a double to a string with 15 digits and back, so we are not meeting the requirements put forth in the IEEE spec.
IMO, it is worth doing one of the following:
I believe for net core we can go ahead and take the breaking change there without any quirks. for the desktop, if we need to apply the fix there we have to introduce it with a quirk.
The second option listed by @tannergooding is possible too but I prefer to fix "R" to do the right thing as introducing a new format specifier will make "R" useless anyway.
CC @AlexGhiondea
@tannergooding are you starting fixing this? If not, I can help working for this issue when the option decided.
@mazong1123, not yet. I was going to wait for a couple more people (those tagged already) to chime in first (with regards to back-compat concerns).
As I mentioned we can apply the fix in .Net Core without any quirks. if we want to apply this fix from now, we need to ensure there is no perf regression when using "R' options.
CC @jkotas @AlexGhiondea
@tannergooding are you working on this? If not, I'd like to send a PR to see if it would break anything.
@mazong1123 go ahead and submit the PR. please try to measure the perf with your changes to know always formating with 17 significant digits can change the perf.
Tagging all the people who were involved in https://github.com/dotnet/coreclr/pull/13140, plus a couple others. @eerhardt, @jkotas, @danmosemsft, @tarekgh, @AlexGhiondea, @gafter, @joperezr, @JamesNK, @terrajobst
As indicated by this issue, the current handling of "R" is incorrect and results in non-roundtrippable values for certain scenarios. The current handling is:
This also causes a fairly decent perf hit, as compared to just converting to G17 directly.
The people who are using "R" today, are likely doing so for serialization purposes (they want to be able to round-trip the result) and they would be broken for any of the values that don't currently round-trip successfully. There are also likely some repos (like Newtonsoft.JSON, but possibly also CoreFX, CoreCLR, and Roslyn) which are reliant on the current behavior. Some examples would be in tests where they are directly comparing the "expected" vs "actual" string produced (rather than parsing the strings back to floating-point values and validating that those are equal).
My opinion is that we should take this breaking change, as it results in more correct/compliant behavior; However, I believe this is a large enough breaking change that we should also provide a "quirk" switch and ensure that the change is properly documented/announced to the public (with the reasoning of why the change is being made).
I think we need to think in the scenario exchanging data between the .Net Core and the full framework. Serialization scenario can cross the framework versions. I think having the opt-out switch is reasonable but I heard before we use the switches in .Net Core only in scenarios prevents migration to .Net Core.
I think the general point is that serialization can occur anywhere, and incorrect serialization is bad.
We are already in a world where CoreFX is doing the wrong thing, which means we are broken for converting to/from anything that isn't also doing the same broken behavior. As an example, serializing from C# and then deserializing that value in C++ (or vice versa) can wind up with different results (and may not be round-trippable).
My preference is to ensure we are IEEE compliant here, and actually round-trippable with anything else that is IEEE compliant (when using the "round-trippable" specifier).
My preference is to ensure we are IEEE compliant here
Agree.
we use the switches in .Net Core only in scenarios prevents migration to .Net Core
We add quirk switches to .NET Core re-actively, only once we get feedback that they are really needed for something important. We do not add them proactively like in .NET Framework just because of they might be needed in theory.
For reference, the relevant bits from the IEEE spec are under "5.12 Details of conversion between floating-point data and external character sequences":
Some excerpts that are particularly indicative of the requirements are:
Implementations shall provide conversions between each supported binary format and external decimal character sequences such that, under
roundTiesToEven
, conversion from the supported format to external decimal character sequence and back recovers the original floating-point representation, except that a signaling NaN might be converted to a quiet NaN.
For the purposes of discussing the limits on correctly rounded conversion, define the following quantities:
― for binary16,Pmin (binary16)
= 5
― for binary32,Pmin (binary32)
= 9
― for binary64,Pmin (binary64)
= 17
Conversions to and from supported decimal formats shall be correctly rounded regardless of how many digits are requested or given.
Conversions from a supported binary format
bf
to an external character sequence and back again results in a copy of the original number so long as there are at leastPmin (bf)
significant digits specified and the rounding-direction attributes in effect during the two conversions are round to nearest rounding-direction attributes
There are others covered in the pages as well, but the above should cover the current issue.
@jkotas, @danmosemsft.
What (if anything) else do we need to do here to get such a breaking API change approved?
Additionally, once (and if) this is approved, what needs to be done to communicate the breaking change to things like Newtonsoft.JSON (where they are known to be impacted, but will need to take a fix)? -- CC @JamesNK
I don't think this would break Newtonsoft.Json, it would just change the value of serialized doubles. Of course that could break people who depend on the current serialized value.
Only action in Newtonsoft.Json would likely be fixing some unit tests where the serialized double value has changed.
Now that dotnet/coreclr#22040 is merged, for .NET Core 3.0, ToString(), ToString("G"), ToString("R"), double.ToString("G17"), and float.ToString("G9") should now always produce a string that will roundtrip to the original value..
I am closing this as https://github.com/dotnet/coreclr/issues/3313 is tracking the addition of an explicit testcase for 0.84551240822557006
.
@tannergooding sorry to be thick but isn't this issue effecting both .net framework and .net core? It looks like you closed it based on a commit for coreclr but would not the issue still stand for other CLR implementations?
Most helpful comment
Now that dotnet/coreclr#22040 is merged, for .NET Core 3.0, ToString(), ToString("G"), ToString("R"), double.ToString("G17"), and float.ToString("G9") should now always produce a string that will roundtrip to the original value..
I am closing this as https://github.com/dotnet/coreclr/issues/3313 is tracking the addition of an explicit testcase for
0.84551240822557006
.