Json: Loss of precision when serializing <double>

Created on 17 Nov 2016 · 38 Comments · Source: nlohmann/json

It seems that precision is lost when serializing a double. I cannot say why, since
std::numeric_limits::digits10 should provide enough digits!?
But if I change that to:
std::numeric_limits::digits10 then I'm not losing anything.
I'm NOT using any "long double" types. Only "double".

proposed fix

matspetter

All 38 comments

Do you have an example to check?

nlohmann on 17 Nov 2016

digits10
number of decimal digits that can be represented without change
that is, any number with this many decimal digits can be converted to a value of type T and back to decimal form, without change due to rounding or overflow.

max_digits10
number of decimal digits necessary to differentiate all values of this type
that is, the number of base-10 digits that are necessary to uniquely represent all distinct values of the type T, such as necessary for serialization/deserialization to text.

gregmarr on 17 Nov 2016
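As a minimal sketch of the two definitions quoted above, both limits can be read directly from std::numeric_limits. The helper names below are hypothetical and not part of nlohmann/json, and the concrete values assume that float and double are IEEE 754 binary32 and binary64:

```cpp
#include <limits>

// Significant decimal digits to request when a value of type T must survive
// value -> text -> value unchanged (hypothetical helper name).
template <typename T>
constexpr int round_trip_digits() {
    return std::numeric_limits<T>::max_digits10;
}

// Decimal digits that survive text -> value -> text unchanged
// (hypothetical helper name).
template <typename T>
constexpr int stable_text_digits() {
    return std::numeric_limits<T>::digits10;
}
```

Under that assumption, round_trip_digits<double>() is 17 and stable_text_digits<double>() is 15; for float the corresponding values are 9 and 6.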

OK, let's see. This code was run on a Mac.
g++ --version

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/c++/4.2.1
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.1.0

#include <string>
#include <sstream>
#include <iostream>
#include <limits>

int main(int argc,char** argv)
{
    double v = 100000000000.1236;

    int p1 = std::numeric_limits<double>::digits10;
    int p2 = std::numeric_limits<double>::max_digits10;

    // stream with precision == digits10
    std::stringstream ss;
    ss.precision(p1);
    ss << v;
    std::cout << "digits10     " << p1 << ": " << ss.str() << std::endl;

    // stream with precision == max_digits10
    std::stringstream ss2;
    ss2.precision(p2);
    ss2 << v;
    std::cout  << "max_digits10 " << p2 << ": " << ss2.str() << std::endl;

    // Read back and compare with original
    double v1,v2;
    ss >> v1;
    ss2 >> v2;

    std::cout << "v==v1 : " << ((v==v1)?"true":"false") << std::endl;
    std::cout << "v==v2 : " << ((v==v2)?"true":"false") << std::endl;
}

output:

digits10     15: 100000000000.124
max_digits10 17: 100000000000.1236
v==v1 : false
v==v2 : true

Floating-point precision is not easy, but this tells me that the streaming is more correct when using "max_digits10".

matspetter on 17 Nov 2016

It also gives the same result/output on an Ubuntu system:
g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

uname -a
Linux mbergg12-ubuntu15 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

matspetter on 18 Nov 2016

This is not a bug, but a reality of dealing with floating-point representation.
If one uses more than digits10 digits of precision, then string->value->string is not guaranteed to round-trip.

TurpentineDistillery on 19 Nov 2016

@TurpentineDistillery However using max_digits10 on output guarantees that value->string->value will not change. The difference being that not all strings of length max_digits10 can be generated by value->string with max_digits10 precision. So as long as string was generated by outputting a value, it's safe. The issue comes when the string was just an arbitrary string of digits.

gregmarr on 19 Nov 2016
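The value->string->value guarantee discussed above can be checked mechanically. The following is only a sketch: survives_round_trip is a hypothetical helper, it uses printf-style formatting rather than the library's own serializer, and it assumes a finite value and the default "C" locale:

```cpp
#include <cstdio>
#include <cstdlib>

// Formats `value` with `digits` significant decimal digits, parses the text
// back, and reports whether the parsed double equals the original.
// Hypothetical helper for illustration; not part of nlohmann/json.
bool survives_round_trip(double value, int digits) {
    char text[64];
    std::snprintf(text, sizeof text, "%.*g", digits, value);
    return std::strtod(text, nullptr) == value;
}
```

For the value from the original report, survives_round_trip(100000000000.1236, 15) is false while survives_round_trip(100000000000.1236, 17) is true, which matches the output shown earlier in the thread.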

This issue can be closed, right?

nlohmann on 24 Nov 2016

Yes I guess so.
/mb

matspetter on 24 Nov 2016

Thanks for the quick response!

nlohmann on 24 Nov 2016

I don't see how this is not a bug. The OP's example exactly illustrates the problem of value -> string -> value serialization/deserialization. We need to store double precision data in a json string form and read it back without loss of precision and have run into exactly the same problem.

timueller on 28 Jul 2017

What would you propose?

nlohmann on 28 Jul 2017

Is there a problem if you just switch to using max_digits10 for both dump() and operator<< ?

timueller on 31 Jul 2017

Then numbers like 2312.42 would be round-tripped to 2312.4200000000001.

nlohmann on 31 Jul 2017
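One way to reconcile the two concerns (keep 2312.42 short, yet never lose a bit) is to try increasing precisions and stop at the first one that parses back to the same double. This is a sketch of that idea only, not what the library does; compact_round_trip is a hypothetical name, and the code assumes a finite value and the "C" locale:

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Returns text for `value` using the fewest significant digits in the range
// digits10..max_digits10 that still parse back to exactly the same double.
// Hypothetical helper for illustration; not part of nlohmann/json.
std::string compact_round_trip(double value) {
    char text[64];
    for (int digits = std::numeric_limits<double>::digits10;
         digits < std::numeric_limits<double>::max_digits10; ++digits) {
        std::snprintf(text, sizeof text, "%.*g", digits, value);
        if (std::strtod(text, nullptr) == value) {
            return text;
        }
    }
    // max_digits10 digits are always enough to identify the value.
    std::snprintf(text, sizeof text, "%.*g",
                  std::numeric_limits<double>::max_digits10, value);
    return text;
}
```

With this approach compact_round_trip(2312.42) yields "2312.42", while a value such as 1.0 / 3.0 gets as many digits as it needs to parse back unchanged. The cost is up to three format/parse passes per number.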

I just ran into the same problem. In my opinion, the string->value->string round-trip isn't really relevant, the value->string->value round-trip is. And the (minimum) number of decimal digits to distinguish all floating-point values is max_digits10, not digits10.

Note that currently "2312.4200000000001" does not "round-trip" to "2312.4200000000001". Actually a string->value->string round-trip cannot be guaranteed unless the first string is generated by the same implementation as the second (and then there is an initial value->string->value round-trip required...).

So I think that value->string->value should be guaranteed (when using the same library for serialization/deserialization).

abolz on 11 Oct 2017

See https://github.com/nlohmann/json/issues/360#issuecomment-261743910.

nlohmann on 11 Oct 2017

@nlohmann My comment actually supports using max_digits10. It was a comment on this statement:

If one uses more than digits10 digits of precision, then string->value->string is not guaranteed to round-trip.

I was pointing out that this is true, but only for strings that were written by a value->string conversion that did not use the full precision, or that were written by hand. As such, I think it's fine for those values to not be preserved exactly.

gregmarr on 11 Oct 2017

I think @gregmarr is saying the same. My expectation is that if I have a json object, assign any double precision value to it, serialize it to disk and later deserialize it again, the values should be exactly equal. This requires using max_digits10.

abolz on 11 Oct 2017

I think one reason for the status quo was the roundtrip results of https://github.com/miloyip/nativejson-benchmark. I'm not sure whether there is one right solution, so we need to make a decision.

nlohmann on 11 Oct 2017

I have an implementation of the Grisu2 algorithm for printing floating-point numbers, based on the reference implementation by Florian Loitsch. It works for IEEE float/double (but not long double) and produces a short representation which is guaranteed to round-trip. I could submit a PR if you like, so we have something to start with.

abolz on 28 Oct 2017

I just hit this issue. I store unit test data in JSON and a new unit test is failing because of this loss of precision. Is there any reason why std::setprecision shouldn't work on an ostream I'm passing a json object into?

ojwoodford on 30 Oct 2017

It doesn't use the ostream formatting for floating point numbers.
https://github.com/nlohmann/json/blob/master/src/json.hpp#L8324
https://github.com/nlohmann/json/blob/develop/src/json.hpp#L6701

If you change these to max_digits10 instead of digits10 it should fix your failures.

gregmarr on 30 Oct 2017

👍2

I just ran into the same issue as @gregmarr described, and switching to max_digits10 seems to work:

json j1 = 1.0 / 3.0;
json j2 = json::parse( j1.dump() );
bool is_eq = *( j1.get_ptr<json::number_float_t*>() ) == *( j2.get_ptr<json::number_float_t*>() );
// is_eq is false with digits10, true with max_digits10

pvleuven on 17 Nov 2017

Hi all. I changed digits10 to max_digits10.

A lot of test cases fail. It seems that they focus on the string->number->string case:

      Start 36: test-inspection_all
36/70 Test #36: test-inspection_all .................***Failed    5.77 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-inspection is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
object inspection
  serialization
  dump and floating-point numbers
-------------------------------------------------------------------------------
../test/src/unit-inspection.cpp:235
...............................................................................

../test/src/unit-inspection.cpp:238: FAILED:
  CHECK( s.find("42.23") != std::string::npos )
with expansion:
  18446744073709551615 (0xffffffffffffffff)
  !=
  18446744073709551615 (0xffffffffffffffff)

-------------------------------------------------------------------------------
object inspection
  serialization
  dump and small floating-point numbers
-------------------------------------------------------------------------------
../test/src/unit-inspection.cpp:241
...............................................................................

../test/src/unit-inspection.cpp:244: FAILED:
  CHECK( s.find("1.23456e-78") != std::string::npos )
with expansion:
  18446744073709551615 (0xffffffffffffffff)
  !=
  18446744073709551615 (0xffffffffffffffff)

===============================================================================
test cases:   1 |   0 passed | 1 failed
assertions: 150 | 148 passed | 2 failed

      Start 62: test-regression_all
62/70 Test #62: test-regression_all .................***Failed    5.59 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-regression is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
regression tests
  issue #228 - double values are serialized with commas as decimal points
-------------------------------------------------------------------------------
../test/src/unit-regression.cpp:417
...............................................................................

../test/src/unit-regression.cpp:453: FAILED:
  CHECK( j1a.dump() == "2312.42" )
with expansion:
  "2312.4200000000001" == "2312.42"

../test/src/unit-regression.cpp:454: FAILED:
  CHECK( j1b.dump() == "2312.42" )
with expansion:
  "2312.4200000000001" == "2312.42"

../test/src/unit-regression.cpp:462: FAILED:
  CHECK( ss.str() == "4.712,112312.42" )
with expansion:
  "4.712,112312.4200000000001"
  ==
  "4.712,112312.42"

../test/src/unit-regression.cpp:464: FAILED:
  CHECK( ss.str() == "4.712,112312.4247,11" )
with expansion:
  "4.712,112312.420000000000147,11"
  ==
  "4.712,112312.4247,11"

../test/src/unit-regression.cpp:466: FAILED:
  CHECK( j2a.dump() == "23.42" )
with expansion:
  "23.420000000000002" == "23.42"

-------------------------------------------------------------------------------
regression tests
  issue #380 - bug in overflow detection when parsing integers
-------------------------------------------------------------------------------
../test/src/unit-regression.cpp:784
...............................................................................

../test/src/unit-regression.cpp:788: FAILED:
  CHECK( j.dump() == "1.66020696663386e+20" )
with expansion:
  "1.6602069666338596e+20"
  ==
  "1.66020696663386e+20"

===============================================================================
test cases:   1 |   0 passed | 1 failed
assertions: 408 | 402 passed | 6 failed

      Start 66: test-testsuites_all
66/70 Test #66: test-testsuites_all .................***Failed    0.07 sec

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-testsuites is a Catch v1.9.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
compliance tests from nativejson-benchmark
  roundtrip
-------------------------------------------------------------------------------
../test/src/unit-testsuites.cpp:281
...............................................................................

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[1.2344999999999999]" == "[1.2345]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip22.json"
  json_string := "[1.2345]"

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[-1.2344999999999999]" == "[-1.2345]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip23.json"
  json_string := "[-1.2345]"

../test/src/unit-testsuites.cpp:328: FAILED:
  CHECK( j.dump() == json_string )
with expansion:
  "[2.2250738585071999e-308]"
  ==
  "[2.2250738585072e-308]"
with messages:
  filename := "test/data/json_roundtrip/roundtrip29.json"
  json_string := "[2.2250738585072e-308]"

===============================================================================
test cases:   7 |   6 passed | 1 failed
assertions: 974 | 971 passed | 3 failed

It would be great if you could have a look at these tests and tell me why it's OK to change or ignore them.

nlohmann on 28 Nov 2017

Hi, after thinking about this again, I think both use-cases are valid.
If you are writing a json file to disk with a double with value 0.2 you will get:
digits10: 0.2
max_digits10: 0.20000000000000001
If the file is intended to be edited by a human, the first one definitely has the preference.
In the case where the round-trip conversion is essential, max_digits10 should be used (or store in binary form).
Would it be possible to let the user decide which type to use and keep the original digits10 as default?

pvleuven on 29 Nov 2017

👍1
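The suggestion to let the user choose, with digits10 staying the default, could look roughly like the sketch below. FloatStyle and format_double are hypothetical names invented for illustration; they are not an nlohmann/json API, and the code assumes a finite value and the "C" locale:

```cpp
#include <cstdio>
#include <limits>
#include <string>

// Hypothetical option; not an nlohmann/json API.
enum class FloatStyle {
    readable,    // digits10: friendlier for hand-edited files
    round_trip   // max_digits10: value -> text -> value is exact
};

// Formats `value` with the precision selected by `style`.
std::string format_double(double value,
                          FloatStyle style = FloatStyle::readable) {
    const int digits = (style == FloatStyle::round_trip)
        ? std::numeric_limits<double>::max_digits10
        : std::numeric_limits<double>::digits10;
    char text[64];
    std::snprintf(text, sizeof text, "%.*g", digits, value);
    return text;
}
```

For IEEE 754 doubles, format_double(0.2) gives "0.2" and format_double(0.2, FloatStyle::round_trip) gives "0.20000000000000001".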

A string is higher precision than a double (e.g. the former can represent 1.2345 exactly; the latter cannot), so converting from string -> double -> string can lead to a change in value, whereas double -> string -> double should not. For this reason, it's not clear to me why you would have exact tests on the former; they should have a tolerance.

ojwoodford on 30 Nov 2017
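A test in the spirit of the comment above would compare the numbers that two strings denote rather than the strings themselves. The following is a sketch under that assumption; same_number is a hypothetical helper and is not taken from the library's test suite:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>

// Compares two decimal strings by the doubles they parse to. With the
// default rel_tol of 0 the doubles must be identical; a positive rel_tol
// allows a relative difference of that size.
// Hypothetical helper for illustration only.
bool same_number(const char* lhs, const char* rhs, double rel_tol = 0.0) {
    const double a = std::strtod(lhs, nullptr);
    const double b = std::strtod(rhs, nullptr);
    if (a == b) {
        return true;
    }
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

Under this comparison "1.2344999999999999" and "1.2345" count as equal even with zero tolerance, because both strings parse to the same double; only the textual comparison in the failing tests tells them apart.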

I think several of these came from an external benchmark that valued "load and resave a JSON file with exact values" behavior. I agree that those are not necessarily something that we should care about.

gregmarr on 30 Nov 2017

The roundtrip tests (string -> JSON -> string) come from here: https://github.com/miloyip/nativejson-benchmark/tree/master/data/roundtrip

nlohmann on 30 Nov 2017

Another strange behavior happening with serialization. Here I serialized a double with the DBL_MAX value (1.79769e+308). The resulting string value becomes larger than DBL_MAX and cannot be parsed back. (I post this here as it seems to be related.)

#include <iostream>
#include <sstream>
#include <float.h>
#include <cstdio>
#include "json.hpp"

using nlohmann::json;
using namespace std;


int main() {
    double d = DBL_MAX;
    json js;
    js["max"] = d;

    stringstream ss;
    ss << js["max"].dump();
    json js2 = json::parse(ss.str());
    cout << js2.dump() << endl;
}

This results in:

terminate called after throwing an instance of 'nlohmann::detail::out_of_range'
  what():  [json.exception.out_of_range.406] number overflow parsing '1.79769313486232e+308'

lwinkler on 7 Dec 2017
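The overflow reported above can be reproduced without the library: rounding DBL_MAX to 15 significant digits rounds it up, and the resulting text denotes a number larger than any finite double. A minimal sketch, assuming IEEE 754 doubles and the "C" locale (rounds_past_max is a hypothetical helper name):

```cpp
#include <cfloat>
#include <cmath>
#include <cstdio>
#include <cstdlib>

// Returns true when formatting a finite `value` with `digits` significant
// digits yields text that no longer parses to a finite double.
// Hypothetical helper for illustration; not part of nlohmann/json.
bool rounds_past_max(double value, int digits) {
    char text[64];
    std::snprintf(text, sizeof text, "%.*g", digits, value);
    return std::isfinite(value) && std::isinf(std::strtod(text, nullptr));
}
```

Here rounds_past_max(DBL_MAX, 15) is true, because the text is 1.79769313486232e+308, while rounds_past_max(DBL_MAX, 17) is false, because 1.7976931348623157e+308 parses back to DBL_MAX.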

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] on 6 Jan 2018

Reopened to check whether #915 fixed this issue.

nlohmann on 21 Jan 2018

The example from https://github.com/nlohmann/json/issues/360#issuecomment-349906638 works now and outputs 1.7976931348623157e+308.

nlohmann on 27 Jan 2018

The roundtrips from https://github.com/nlohmann/json/issues/360#issuecomment-347649156 work.

nlohmann on 27 Jan 2018

Roundtripping 2312.42 (https://github.com/nlohmann/json/issues/360#issuecomment-319131301) works now.

nlohmann on 27 Jan 2018

Roundtripping 100000000000.1236 (https://github.com/nlohmann/json/issues/360#issuecomment-261324038) works now.

nlohmann on 27 Jan 2018

Thanks to #915 from @abolz, this issue is now fixed. Thanks everybody for the patience!

nlohmann on 27 Jan 2018

This issue still seems to be there?

Here is some code that reproduces the issue:

#include <nlohmann/json.hpp>
#include <string>

int main()
{
    using Json = ::nlohmann::json;
    std::string json_text{ "{\"spot\": 21898.99}" };
    Json json = Json::parse(json_text);
    auto j = *json.find("spot");
    double val = j.get<double>();
    // val is 21898.990000000002
    // expecting 21898.99
}

Git SHA da81e7be
Windows 10 version 10.0.14939.0
Visual Studio 2017 15.9.0, C++ 19.15.26732.1
Windows SDK 10.0.17134.0

It seems as if this call to std::strtod in lexer.hpp is the problem:

static void strtof(double& f, const char* str, char** endptr) noexcept
{
    f = std::strtod(str, endptr);
}

ghost on 14 Nov 2018

The library stores floating point numbers as double by default (you can change this in a template parameter).

The number 21898.99 will be stored as the double 21898.990000000002. This is the same number the parser comes up with after reading the string 21898.99. You can check in your debugger that both numbers are equal down to the bit.

https://www.exploringbinary.com/floating-point-converter/

nlohmann on 14 Nov 2018
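The explanation above can be verified outside a debugger. This sketch parses the text with std::strtod, the same function named in the report, and prints the stored double at two precisions; reprint is a hypothetical helper, and the expected strings assume IEEE 754 doubles and the "C" locale:

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Parses decimal text with strtod, then prints the stored double with
// `digits` significant digits.
// Hypothetical helper for illustration; not part of nlohmann/json.
std::string reprint(const char* decimal_text, int digits) {
    const double stored = std::strtod(decimal_text, nullptr);
    char text[64];
    std::snprintf(text, sizeof text, "%.*g", digits, stored);
    return text;
}
```

reprint("21898.99", 15) gives "21898.99" and reprint("21898.99", 17) gives "21898.990000000002". Both describe the same stored double, which also compares equal to the literal 21898.99 in source code.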

@nlohmann Thank you very much!
I found out this is not a problem in your library but a consequence of how decimal to floating-point conversion works and of the limits on how floating-point numbers are stored in memory. I learnt something today!

ghost on 15 Nov 2018

😄1
