Pandas: Support Infinity, -Infinity and NaN in read_json

Created on 2 Feb 2016 · 10 Comments · Source: pandas-dev/pandas

While these special values do not strictly conform to the JSON standard, most implementations allow or use them, for example Google's GSON library or the json module in Python's standard library.

Enhancement IO JSON

All 10 comments

nan's are already supported.

In [1]: df = DataFrame({'A' : [np.nan,1,np.inf,-np.inf]})

In [2]: df.to_json(None)
Out[2]: '{"A":{"0":null,"1":1.0,"2":null,"3":null}}'

But NaN is not supported when reading: pd.read_json('{"a": [NaN, Infinity, -Infinity]}')

that's not valid json

I know, that's what I wrote in the first post. But it is a commonly used extension.

See the description in Google's GSON package:

Section 2.4 of JSON specification disallows special double values (NaN, Infinity, -Infinity). However, Javascript specification (see section 4.3.20, 4.3.22, 4.3.23) allows these values as valid Javascript values. Moreover, most JavaScript engines will accept these special values in JSON without problem. So, at a practical level, it makes sense to accept these values as valid JSON even though JSON specification disallows them.

Or Python's json module:

It also understands NaN, Infinity, and -Infinity as their corresponding float values, which is outside the JSON spec.
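To illustrate the behavior quoted above, here is a minimal sketch using only the standard library. The variable name `values` is just for illustration.

```python
import json
import math

# The standard-library parser accepts the non-standard constants by default.
values = json.loads('{"a": [NaN, Infinity, -Infinity]}')["a"]

print(values)  # [nan, inf, -inf]
print(math.isnan(values[0]), values[1] == math.inf, values[2] == -math.inf)
```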

Also interesting: the JSON specification explicitly allows parsers to accept extensions of the standard:
https://tools.ietf.org/html/rfc7159.html#section-9

A JSON parser transforms a JSON text into another representation. A
JSON parser MUST accept all texts that conform to the JSON grammar.
A JSON parser MAY accept non-JSON forms or extensions.

:+1: This would indeed be very helpful for interoperability between a pandas backend and a JS frontend.

I took a crack at this and discovered that ujson, the JSON library underlying read_json, handles neither NaN nor infinity. In previous discussions the authors of ujson resisted adding support because it's out of spec, and ujson offers no way to specify a custom decoder or encoder.
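Until the parser itself supports these values, one possible workaround is to parse with the standard-library json module and hand the result to pandas. This is only a sketch: `load_with_special_floats` and `to_dataframe` are hypothetical helper names, not pandas API, and this route skips the options that read_json offers (such as orient handling and dtype conversion).

```python
import json


def load_with_special_floats(text):
    """Parse JSON text, keeping NaN/Infinity/-Infinity as Python floats."""
    # The standard-library parser accepts these tokens by default.
    return json.loads(text)


def to_dataframe(text):
    """Hypothetical helper: build a DataFrame from leniently parsed JSON."""
    # Imported lazily so the parsing step above needs only the stdlib.
    import pandas as pd

    return pd.DataFrame(load_with_special_floats(text))
```

For example, `to_dataframe('{"a": [1.5, Infinity, -Infinity]}')` should yield a single float column, assuming pandas is installed.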

The newest master throws a ValueError: Expected object or value exception when trying to parse JSON containing inf or nan values. While I understand that these are not valid JSON, it might be helpful to throw more informative exceptions to help the user debug the issue.

Script to reproduce (using Hypothesis):

import json

from hypothesis import example, given
import hypothesis.strategies as st

import pandas as pd

# Inputs known to fail: json.dumps serializes these floats as the
# non-standard tokens Infinity and NaN.
FAILING_EXAMPLES = (
    [{u'': float('inf')}],
    [{u'': float('nan')}],
)

@given(st.lists(st.dictionaries(st.text(),
                                st.floats() | st.booleans() | st.text() | st.none())))
@example(FAILING_EXAMPLES[0])
@example(FAILING_EXAMPLES[1])
def test_load_json(test_input):
    # Raises "ValueError: Expected object or value" for the examples above.
    df = pd.read_json(json.dumps(test_input))
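The script above fails because json.dumps writes the non-standard tokens by default. The sketch below shows that output and one way a more informative error could be produced before parsing. `check_json_constants` and `_reject_constant` are hypothetical names, not part of pandas; the approach relies on the `parse_constant` hook of `json.loads`.

```python
import json

payload = [{"": float("inf")}, {"": float("nan")}]

# By default, json.dumps writes the non-standard tokens Infinity and NaN,
# so the text handed to the parser is not valid JSON.
text = json.dumps(payload)
print(text)  # [{"": Infinity}, {"": NaN}]


def _reject_constant(token):
    # json.loads calls this hook with 'NaN', 'Infinity' or '-Infinity'.
    raise ValueError(
        f"non-standard JSON constant {token!r} is not supported; "
        "replace it with null or a finite number"
    )


def check_json_constants(text):
    """Hypothetical pre-check: raise a descriptive error on NaN/Infinity tokens."""
    json.loads(text, parse_constant=_reject_constant)


try:
    check_json_constants(text)
except ValueError as exc:
    print(f"cannot parse: {exc}")
```

If the goal is to test only valid JSON, the Hypothesis strategy can be narrowed with `st.floats(allow_nan=False, allow_infinity=False)`, and `json.dumps(..., allow_nan=False)` will raise a ValueError instead of emitting these tokens.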

Hi there,

Thanks for the thorough discussion. We would expect pandas to be friendly to data scientists, and we were a little surprised to find compatibility issues with the IEEE floating-point standard. Infinity/-Infinity is particularly useful because it can be clipped to the max/min value of any valid range. This is not possible with NaN/null.

As of today, it is possible to encode Infinity using default_handler, but not to decode it, correct?

Thanks.
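The clipping argument in the comment above can be sketched as follows. The helper `clip` and the range 0 to 100 are illustrative assumptions, not something taken from pandas.

```python
import math


def clip(value, low, high):
    """Clamp value into [low, high]; NaN is returned unchanged."""
    if math.isnan(value):
        # NaN has no ordering, so there is no bound it could be clamped to.
        return value
    return max(low, min(high, value))


print(clip(math.inf, 0.0, 100.0))   # 100.0
print(clip(-math.inf, 0.0, 100.0))  # 0.0
print(clip(math.nan, 0.0, 100.0))   # nan
```

Because Infinity and -Infinity keep their sign and ordering, they map to the nearest bound of whatever range is chosen later, while a value that has already been written as null has lost that information.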

Finally! Thanks to anyone involved in making this possible.
