Elasticsearch: Range query with date_time field does not parse "lte" correctly

Created on 10 Sep 2020 · 5 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version): 7.9.1

Plugins installed: []

JVM version (java -version): openjdk 14.0.1 2020-04-14

OS version (uname -a if on a Unix-like system): Linux elasticsearch-0 4.19.76-linuxkit #1 SMP Tue May 26 11:42:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
Executing a range query against an index with a field of type "date" and format "date_time" behaves inconsistently with certain date formats.

The format 2020-09-01T00:00:00.000+0200 works for "gte", but not for "lte". Changing the timezone to Z or +/-02:00 (including colon) works for both.
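Since the colon form of the offset is accepted for both bounds, one possible client-side workaround is to rewrite the offset before sending the query. The following is a minimal sketch using only java.time; the class name OffsetNormalizer and the method withColon are hypothetical and not part of Elasticsearch, and the patterns assume the input always carries milliseconds and a numeric offset without a colon.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Elasticsearch.
public class OffsetNormalizer {
    // Parses offsets written without a colon, e.g. "+0200".
    private static final DateTimeFormatter NO_COLON =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXX");
    // Formats offsets with a colon, e.g. "+02:00" ("Z" for a zero offset).
    private static final DateTimeFormatter WITH_COLON =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXXX");

    // Rewrites "...+0200" as "...+02:00" without changing the instant.
    static String withColon(String text) {
        return OffsetDateTime.parse(text, NO_COLON).format(WITH_COLON);
    }

    public static void main(String[] args) {
        System.out.println(withColon("2020-09-30T23:59:59.000+0200"));
    }
}
```

The normalized string can then be used as the "gte" or "lte" value in the range query.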

Steps to reproduce:

  1. Create an index:
PUT myindex
{
  "mappings": {
    "_source": {
      "enabled": true
    },
    "dynamic": "strict",
    "properties": {
      "@type": {
        "type": "keyword",
        "index": "false"
      },
      "@version": {
        "type": "integer",
        "index": "false"
      },
      "creationDate": {
        "type": "date",
        "format": "date_time"
      }
    }
  }
}
  2. Run the query:
{
  "query": {
    "range": {
      "creationDate": {
        "gte": "2020-09-01T00:00:00.000+0200",
        "lte": "2020-09-30T23:59:59.000+0200"
      }
    }
  }
}

Result:

{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "failed to parse date field [2020-09-30T23:59:59.000+0200] with format [date_time]: [failed to parse date field [2020-09-30T23:59:59.000+0200] with format [date_time]]"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "range_problem",
        "node": "45sllSW0SEmR7iErUZGpOg",
        "reason": {
          "type": "parse_exception",
          "reason": "failed to parse date field [2020-09-30T23:59:59.000+0200] with format [date_time]: [failed to parse date field [2020-09-30T23:59:59.000+0200] with format [date_time]]",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "failed to parse date field [2020-09-30T23:59:59.000+0200] with format [date_time]",
            "caused_by": {
              "type": "date_time_parse_exception",
              "reason": "Text '2020-09-30T23:59:59.000+0200' could not be parsed at index 23"
            }
          }
        }
      }
    ]
  },
  "status": 400
}
Labels: :Core/Features/Features, :Search/Search, >bug

All 5 comments

Pinging @elastic/es-core-features (:Core/Features/Features)

Pinging @elastic/es-search (:Search/Search)

This doesn't seem to be limited to lte. I did some digging to check whether we do something different in the query builders, but the error seems to be a special case of parsing with this particular format ("date_time") combined with an internal rounding setting.
The following query using gt fails in a similar way:

POST /myindex/_search
{
  "query": {
    "range": {
      "creationDate": {
        "gt": "2020-01-01T22:59:59.000+0200"
      }
    }
  }
}

The exception is only thrown when calling JavaDateMathParser#parse with roundUpProperty set to true, as this small unit-test experiment shows:

import java.time.Instant;
import java.time.ZoneId;
import org.elasticsearch.common.time.DateFormatter;

// Fragment from a unit test. The third argument to parse() is roundUpProperty.
DateFormatter formatter = DateFormatter.forPattern("date_time");
Instant parsed = formatter.toDateMathParser().parse("2020-01-01T22:59:59.000+02:00", () -> 0L, false, (ZoneId) null);
System.out.println(parsed.toEpochMilli());  // 1577908799000
parsed = formatter.toDateMathParser().parse("2020-01-01T22:59:59.000+02:00", () -> 0L, true, (ZoneId) null);
System.out.println(parsed.toEpochMilli());  // 1577908799000
parsed = formatter.toDateMathParser().parse("2020-01-01T22:59:59.000+0200", () -> 0L, false, (ZoneId) null);
System.out.println(parsed.toEpochMilli());  // 1577908799000
parsed = formatter.toDateMathParser().parse("2020-01-01T22:00:59.000+0200", () -> 0L, true, (ZoneId) null);  // throws

@pgomulka maybe you can see a difference between the two zone identifier variants ("+0200" vs. "+02:00") in terms of how parsing should be supported? Is there something that's already acting like this on the java.time level?

It is a bug. date_time consists of two parsers: one for a timezone offset with a colon and a second one for an offset without a colon. DateMathParser was incorrectly taking only the first one into account.
I will work on a fix.
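The mechanism described in this comment can be illustrated with plain java.time. This is only a sketch under stated assumptions, not the Elasticsearch implementation: the class TwoParserSketch, the method parseWithAny and both patterns are hypothetical stand-ins for the two parsers behind date_time. Consulting only the colon parser rejects "+0200", while trying both parsers accepts it.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not the Elasticsearch code.
public class TwoParserSketch {
    // Stand-in for the parser that expects an offset with a colon ("+02:00").
    static final DateTimeFormatter WITH_COLON =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXXX");
    // Stand-in for the parser that expects an offset without a colon ("+0200").
    static final DateTimeFormatter WITHOUT_COLON =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXX");

    // Tries each parser in order and returns the first successful result;
    // rethrows the last parse failure if none of them matches.
    static OffsetDateTime parseWithAny(String text, List<DateTimeFormatter> parsers) {
        DateTimeParseException failure = null;
        for (DateTimeFormatter parser : parsers) {
            try {
                return OffsetDateTime.parse(text, parser);
            } catch (DateTimeParseException e) {
                failure = e;
            }
        }
        if (failure == null) {
            throw new IllegalArgumentException("no parsers given");
        }
        throw failure;
    }

    public static void main(String[] args) {
        String text = "2020-09-30T23:59:59.000+0200";
        // With both parsers available, the offset without a colon is accepted.
        System.out.println(parseWithAny(text, Arrays.asList(WITH_COLON, WITHOUT_COLON)).toInstant());
        // With only the colon parser, the same text is rejected.
        try {
            parseWithAny(text, Collections.singletonList(WITH_COLON));
        } catch (DateTimeParseException e) {
            System.out.println("colon-only parser failed: " + e.getMessage());
        }
    }
}
```

In this sketch the fix corresponds to passing every alternative parser to the lookup instead of just the first one.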
