Elasticsearch-net: ExtendedStats with one document -> UnexpectedElasticsearchClientException

Created on 10 Sep 2020 · 13 Comments · Source: elastic/elasticsearch-net

NEST/Elasticsearch.Net version:
7.9.0
Elasticsearch version:
7.9.1

Description of the problem including expected versus actual behavior:
Using ExtendedStats on a numeric field when only one document matches always throws an exception:

    "message": "An error has occurred.",
    "exceptionMessage": "expected:'Number Token', actual:'\"NaN\"', at offset:179002",
    "exceptionType": "Elasticsearch.Net.UnexpectedElasticsearchClientException",

PS: I really don't know whether the issue was present before. We usually work with a lot of documents, so this is the first time we have ended up with only one document after filtering our index, and the first time in years that we have seen this exception.

Expected behavior
No exception but an empty result.

bug

All 13 comments

For now I'm filtering my results based on a document count calculated in a pre-query.
My first question is about the issue itself: is it confirmed? Any news from the NEST team? @russcam, for example.

Secondly, how can I avoid a pre-query to filter the data? Is it possible to calculate ExtendedStats only when the document count is > 1, within the same query?

Hi @meriturva, this _looks_ like a bug.

NaN is a peculiarity in JSON, which might be supported in a number of ways. The server is sending it as the string "NaN"; the internal client serializer will not accept a string there, although it would accept it as NaN. I think we'll need to address this in the client.

> ps: I really don't know if the issue was present before,

I've not seen Elasticsearch return NaN before. This may have come in with the addition of standard deviation and variance sampling to extended stats in https://github.com/elastic/elasticsearch/pull/49782

In the meantime, is there any workaround to avoid the double query (to filter out aggregations without documents)?
I haven't found any way to enable an aggregation based on the document count of its parent aggregation, so for now I have to execute one more query just to check that documents are present.

Hi @russcam, I would just like to ask for news about this issue, or maybe a cool workaround to avoid double queries.
Thanks.

Just to know, @russcam: is it fixed in version 7.10?

Hi @meriturva

there's no change in NEST for this yet. There's been some discussion about whether Elasticsearch should be returning NaN at all, given the ambiguity of representing it in JSON. No conclusion has been reached yet though.

@Mpdreamz, @stevejgordon perhaps we can check for "NaN" in the offending fields? I _think_ it's the addition of the sampling fields in extended stats, but would need double checking.
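To make the idea of accepting "NaN" in those fields concrete, here is a minimal sketch of a tolerant read. It is only an illustration under stated assumptions: it uses System.Text.Json (.NET 5 or later) rather than the Utf8Json-based serializer that NEST 7.x uses internally, and `ExtendedStatsSample` and `NanTolerantReader` are hypothetical names, not NEST types.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO covering a few extended_stats fields; not a NEST type.
public sealed class ExtendedStatsSample
{
    [JsonPropertyName("count")]
    public long Count { get; set; }

    [JsonPropertyName("variance_population")]
    public double VariancePopulation { get; set; }

    [JsonPropertyName("variance_sampling")]
    public double VarianceSampling { get; set; }
}

public static class NanTolerantReader
{
    // Lets the strings "NaN", "Infinity" and "-Infinity" be read
    // where a floating-point number is expected.
    private static readonly JsonSerializerOptions Tolerant = new JsonSerializerOptions
    {
        NumberHandling = JsonNumberHandling.AllowNamedFloatingPointLiterals
    };

    public static ExtendedStatsSample ReadTolerant(string json) =>
        JsonSerializer.Deserialize<ExtendedStatsSample>(json, Tolerant);

    // With default options a string in a double field is rejected
    // with a JsonException, similar in spirit to the error in this issue.
    public static bool StrictReadFails(string json)
    {
        try
        {
            JsonSerializer.Deserialize<ExtendedStatsSample>(json);
            return false;
        }
        catch (JsonException)
        {
            return true;
        }
    }
}
```

Calling `NanTolerantReader.ReadTolerant` on a fragment such as `{"count":1,"variance_population":0.0,"variance_sampling":"NaN"}` should yield `double.NaN` for `VarianceSampling`, while `StrictReadFails` reports that the default options reject the same input.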

I just updated from 7.6 to the latest version, 7.10, and encountered a very similar problem
when making the following call, which worked before:

```csharp
var searchRequest = new SearchRequest(Indices.Parse(ElasticsearchConstants.GetIndexNameFromBatchId(batchId)))
{
    Size = 0, // We just need the aggregation data. Returned documents of the top level query are not required.
    Query = new BoolQuery
    {
        Must = new List<QueryContainer>
        {
            new TermQuery { Field = ElasticsearchConstants.DataPointFieldBatchId, Value = batchId },
            new TermQuery { Field = ElasticsearchConstants.DataPointFieldParameterId, Value = parameterId }
        },
        Filter = new List<QueryContainer>
        {
            new DateRangeQuery
            {
                Field = ElasticsearchConstants.DataPointFieldTimestamp,
                GreaterThanOrEqualTo = startDateTime.ToString("O", CultureInfo.InvariantCulture),
                LessThanOrEqualTo = endDateTime.ToString("O", CultureInfo.InvariantCulture)
            }
        }
    },
    Aggregations = new DateHistogramAggregation(ElasticsearchConstants.DataPointsHistogramAggregationKeyString)
    {
        Field = ElasticsearchConstants.DataPointFieldTimestamp,
        FixedInterval = histogramTimeInterval,
        Order = HistogramOrder.CountAscending,
        Aggregations = new ExtendedStatsAggregation(ElasticsearchConstants.DataPointsHistogramStatsKeyString, ElasticsearchConstants.DataPointFieldValue),
        MinimumDocumentCount = 1 // Only return buckets that contain documents.
    }
};
```

error message is pretty much the same:

    Elasticsearch.Net.UnexpectedElasticsearchClientException: expected:'Number Token', actual:'"NaN"', at offset:493 ---> Elasticsearch.Net.Utf8Json.JsonParsingException: expected:'Number Token', actual:'"NaN"', at offset:493
       at Elasticsearch.Net.Utf8Json.JsonReader.ReadDouble()
       at Nest.AggregateFormatter.GetExtendedStatsAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver, StatsAggregate statsMetric, IReadOnlyDictionary`2 meta)
       at Nest.AggregateFormatter.GetStatsAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver, IReadOnlyDictionary`2 meta)
       at Nest.AggregateFormatter.ReadAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver)
       at Nest.AggregateFormatter.GetSubAggregates(JsonReader& reader, String name, IJsonFormatterResolver formatterResolver)
       at Nest.AggregateFormatter.GetDateHistogramBucket(JsonReader& reader, IJsonFormatterResolver formatterResolver)
       at Nest.AggregateFormatter.ReadBucket(JsonReader& reader, IJsonFormatterResolver formatterResolver)
       at Nest.AggregateFormatter.GetMultiBucketAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver, ArraySegment`1& propertyName, IReadOnlyDictionary`2 meta)
       at Nest.AggregateFormatter.ReadAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver)
       at Nest.AggregateDictionaryFormatter.ReadAggregate(JsonReader& reader, IJsonFormatterResolver formatterResolver, String[] tokens, Dictionary`2 dictionary)
       at Nest.AggregateDictionaryFormatter.Deserialize(JsonReader& reader, IJsonFormatterResolver formatterResolver)
       at Deserialize(Object[] , JsonReader& , IJsonFormatterResolver )

> Hi @meriturva
>
> there's no change in NEST for this yet. There's been some discussion about whether Elasticsearch should be returning NaN at all, given the ambiguity of representing it in JSON. No conclusion has been reached yet though.

With ES 7.6, "NaN" was already being returned, at least when the index is set to ignore malformed values.
Not returning "NaN" values that are already stored is not acceptable.

Same issue on 7.11.1.
Any news here, @russcam?

I am experiencing the very same problem (in my case it's on 7.10.0), and can confirm that the NaN comes from the computed "sampling" fields, not from my documents.

    "extended_stats#my_extendedstats_agg" : {
      "count" : 1,
      "min" : -269.0,
      "max" : -269.0,
      "avg" : -269.0,
      "sum" : -269.0,
      "sum_of_squares" : 72361.0,
      "variance" : 0.0,
      "variance_population" : 0.0,
      "variance_sampling" : "NaN",
      "std_deviation" : 0.0,
      "std_deviation_population" : 0.0,
      "std_deviation_sampling" : "NaN",
      "std_deviation_bounds" : {
        "upper" : -269.0,
        "lower" : -269.0,
        "upper_population" : -269.0,
        "lower_population" : -269.0,
        "upper_sampling" : "NaN",
        "lower_sampling" : "NaN"
      }
    }

I really just need to compute the standard deviation; I don't even know what the "sampling" fields are used for.
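For context, the population variance divides by the document count n, while the sample variance divides by n - 1, which is the usual choice when the documents are treated as a sample from a larger population. With a single document that denominator is zero, so the result is 0/0. The sketch below recomputes both values from `count`, `sum` and `sum_of_squares` in the response above. It is only a sanity check using the textbook formulas; it is not claimed to be how Elasticsearch computes these fields, and `VarianceCheck` is a hypothetical helper name.

```csharp
using System;

public static class VarianceCheck
{
    // Population variance: mean of squares minus square of the mean.
    // Meaningful for count >= 1.
    public static double Population(long count, double sum, double sumOfSquares)
    {
        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    // Sample variance: centered sum of squares divided by (count - 1).
    // For count == 1 this is 0.0 / 0, which is NaN for doubles.
    public static double Sampling(long count, double sum, double sumOfSquares)
    {
        double centered = sumOfSquares - sum * sum / count;
        return centered / (count - 1);
    }
}
```

With the values from the response (`count` = 1, `sum` = -269.0, `sum_of_squares` = 72361.0), `Population` gives 0.0 and `Sampling` gives NaN, which lines up with `variance_population` and `variance_sampling` above.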

Awesome, thanks!!!

NEST/Elasticsearch.Net version:
7.12.1
Elasticsearch version:
7.10.1

Hello, FYI: the same type of issue still exists, but for me it comes from a different line of code, in both master and 7.13:
