Elasticsearch: Allow non numeric numbers in JsonParser

Created on 5 Apr 2013 · 8 Comments · Source: elastic/elasticsearch

I need to index numbers that can have NaN values.
Could you allow non-numeric numbers, as you did for comments (https://github.com/elasticsearch/elasticsearch/issues/1394)?

jsonFactory.configure(JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS, true);

Could these settings be made configurable in Elasticsearch?
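
Jackson is a Java library, so the following is only an analogy in Python, whose standard `json` module happens to be lenient about these tokens by default. It sketches the two behaviours under discussion: a lenient parse that accepts `NaN`, and a strict parse that rejects it, which is roughly what Jackson does while `ALLOW_NON_NUMERIC_NUMBERS` is off. The helper names `reject_constant` and `strict_loads` are made up for illustration.

```python
import json
import math


def reject_constant(token):
    """Hook called by json.loads for the tokens NaN, Infinity and -Infinity."""
    raise ValueError(f"Non-standard token {token!r}")


def strict_loads(text):
    """Parse JSON, refusing the non-standard numeric tokens."""
    return json.loads(text, parse_constant=reject_constant)


doc = '{"other": NaN}'

# Lenient (Python's default): the token becomes a float NaN.
lenient = json.loads(doc)
print(math.isnan(lenient["other"]))

# Strict: the same document is refused.
try:
    strict_loads(doc)
except ValueError as err:
    print("rejected:", err)
```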

Most helpful comment

If I understand correctly, ignore_malformed has no impact here. This error isn't thrown by the mapper parser; it's thrown by Jackson when parsing the request JSON. Let me demonstrate the difference.

curl -XPUT http://localhost:9200/bad_numbiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d '{
  "age":1.7976931348623157e+308}'

returns

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse field [age] of type [float]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse field [age] of type [float]",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "[float] supports only finite values, but got [Infinity]"
    }
  },
  "status" : 400
}
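
The `got [Infinity]` in this response is consistent with the value fitting in a 64-bit double but not in a 32-bit `float` field. A minimal Python sketch of that range check follows. `FLOAT32_MAX` and `fits_float32` are illustrative names, and the comparison ignores rounding at the very edge of the range, so treat it as an approximation of what the mapper does rather than its actual logic.

```python
import math

# Largest finite IEEE 754 single-precision value: (2 - 2**-23) * 2**127.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127


def fits_float32(value):
    """Return True if value is finite and within single-precision range."""
    return math.isfinite(value) and abs(value) <= FLOAT32_MAX


age = 1.7976931348623157e+308  # the value sent in the request above

print(math.isfinite(age))   # True: finite as a 64-bit double
print(fits_float32(age))    # False: out of range for a 32-bit float
```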

I can prevent that by enabling ignore_malformed on the index.

curl -XPUT http://localhost:9200/bad_numbiz?pretty=true -H 'Content-Type: application/json' -d '{
  "settings": {"index": {"mapping.ignore_malformed": true}}}'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "bad_numbiz"
}
curl -XPUT http://localhost:9200/bad_numbiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d '{
  "age":1.7976931348623157e+308}'
{
  "_index" : "bad_numbiz",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

However, if I provide a NaN field value:

curl -XPUT http://localhost:9200/bad_numbiz/_doc/3?pretty=true -H 'Content-Type: application/json' -d '{"other": NaN}'

the response is

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse",
    "caused_by" : {
      "type" : "json_parse_exception",
      "reason" : "Non-standard token 'NaN': enable JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS to allow\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@7c5c36af; line: 1, column: 14]"
    }
  },
  "status" : 400
}

That seems to be generated by the Jackson JSON parser.

I can't work out a way to pass that attribute to Jackson by command line or config file.

If possible, I'd like to reopen this issue.

(Possibly worth mentioning that the same behavior results from passing in an actual Infinity value, as opposed to passing in a value that exceeds the maximum for a field type, which the parser then reports as "Infinity".)
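
Until such a parser option exists, one client-side workaround is to strip non-finite values before the request is sent. The sketch below is an assumption about how a client might do that, not something proposed in this thread: the helper `sanitize` is hypothetical, and replacing the values with `null` is one choice among several (dropping the field is another).

```python
import json
import math


def sanitize(value):
    """Recursively replace NaN and +/-Infinity with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


doc = {"other": float("nan"), "age": float("inf"), "scores": [1.5, float("-inf")]}

# allow_nan=False makes json.dumps raise instead of emitting a non-standard
# token, so anything the sanitizer missed fails on the client, not the server.
body = json.dumps(sanitize(doc), allow_nan=False)
print(body)
```

The resulting `body` is strict JSON and can be passed to `curl -d` or an HTTP client as usual. Whether `null` is an acceptable stand-in depends on the data.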

All 8 comments

You can map a numeric field to be lenient, which would work.

Can you provide an example of how you can map a field to be lenient? I only see it in the docs for queries.

I'm getting the error with Python's infinity:

Caused by: org.elasticsearch.common.jackson.core.JsonParseException: Non-standard token 'Infinity': enable JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS to allow at [Source: [B@5f26bc90; line: 1, column: 2880]
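
This is reproducible without Elasticsearch: Python's standard `json` module writes non-finite floats as the bare tokens `Infinity`, `-Infinity` and `NaN` unless told otherwise. A short sketch, using a made-up document:

```python
import json

doc = {"value": float("inf")}

# Default behaviour: emits the non-standard token that Jackson rejects.
print(json.dumps(doc))

# With allow_nan=False the problem surfaces on the client instead.
try:
    json.dumps(doc, allow_nan=False)
except ValueError as err:
    print("refused:", err)
```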

@alecklandgraf apologies, the correct parameter is ignore_malformed. As you said, lenient is indeed for queries only.

I'm seeing the same error. I'm having trouble getting ignore_malformed to work.

Searching this repo for ALLOW_NON_NUMERIC_NUMBERS yields 0 results, so I don't think this is actually implemented?

For comparison, searching for ALLOW_COMMENTS yields results.


+1. NaN, -Infinity, and +Infinity are part of the IEEE 754 floating-point standard (though not of the JSON spec itself) and should be supported via a user-configurable option.

If anyone has figured this one out, I'd appreciate a hint. Thanks!

Me too
