Elasticsearch: ingest date processor parsing

Created on 16 Jan 2020 · 5 Comments · Source: elastic/elasticsearch

Elasticsearch version: 7.4.2

JVM version : openjdk version "1.8.0_232"

OS version : 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
I am trying to parse a timestamp in ISO8601 format. I expect @timestamp to be 2020-01-15T14:28:09.452+01:00, but the result is 2020-01-15T15:28:09.452+01:00.

Steps to reproduce:

curl -XPOST 'http://localhost:9200/_ingest/pipeline/_simulate' \
-H 'Content-Type: application/json' -d'{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "timestamp",
          "timezone": "+0100",
          "formats": [ "ISO8601"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "timestamp": "2020-01-15T14:28:09,452"
      }
    }
  ]
}'

Result:

{
    "docs": [
        {
            "doc": {
                "_index": "_index",
                "_type": "_doc",
                "_id": "_id",
                "_source": {
                    "@timestamp": "2020-01-15T15:28:09.452+01:00",
                    "timestamp": "2020-01-15T14:28:09,452"
                },
                "_ingest": {
                    "timestamp": "2020-01-16T15:06:47.477281Z"
                }
            }
        }
    ]
}
:Core/Features/Ingest :Core/Infra/Core >bug

All 5 comments

Pinging @elastic/es-core-features (:Core/Features/Ingest)

Reproduced the behaviour in master.
To double-check, I indexed an actual document through the pipeline, and the result is:

"hits": [
  {
      "_index": "myindex",
      "_id": "1",
      "_score": 1.0,
      "_source": {
          "@timestamp": "2020-01-15T15:28:09.452+01:00",
          "timestamp": "2020-01-15T14:28:09,452"
      }
  }
]

Using stored fields I also get:

"hits": [
  {
      "_index": "myindex",
      "_id": "1",
      "_score": 1.0,
      "fields": {
          "timestamp": [
              "2020-01-15T14:28:09.452Z"
          ]
      }
  }
]

To me this also looks like incorrect behaviour: defining the timezone shouldn't convert the value in the _source field. The value should be saved as 2020-01-15T14:28:09.452+01:00, which in UTC becomes 2020-01-15T13:28:09.452Z.
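To make the one-hour discrepancy concrete, here is a minimal Python sketch of the arithmetic. It only uses the values quoted in this issue and the standard `datetime` module; it is not how the date processor is implemented, and the variable names are purely illustrative. The comma in the source value is swapped for a dot so that `datetime.fromisoformat` accepts it.

```python
from datetime import datetime, timedelta, timezone

# The "+0100" timezone configured in the pipeline
PLUS_ONE = timezone(timedelta(hours=1))

# Source value from the report, with the comma swapped for a dot
local = datetime.fromisoformat("2020-01-15T14:28:09.452")

# Expected: attach +01:00 without shifting the wall-clock time
expected = local.replace(tzinfo=PLUS_ONE)

# Observed: the @timestamp returned by the simulate call
observed = datetime.fromisoformat("2020-01-15T15:28:09.452+01:00")

print(expected.isoformat(timespec="milliseconds"))
print(expected.astimezone(timezone.utc).isoformat(timespec="milliseconds"))
print(observed.astimezone(timezone.utc).isoformat(timespec="milliseconds"))
print(observed - expected)
```

The observed @timestamp, converted to UTC, has the same wall-clock time as the 14:28:09.452Z seen in the stored field above, which sits one hour after the expected instant.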

Imho, as a user, if I have a date with a timezone, e.g. 2020-01-15T15:28:09.452+01:00, and a configured timezone of -05:00,
I would expect a transformation that gives: 2020-01-15T09:28:09.452-05:00.
Similarly, for a date 2020-01-15T15:28:09.452Z and a timezone of +03:00, I would expect:
2020-01-15T18:28:09.452+03:00.
But if the date is missing the timezone part, e.g. 2020-01-15T15:28:09.452, I would expect the tz to simply be added without any transformation: 2020-01-15T15:28:09.452+03:00.
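The expectation described above can be sketched in Python as follows. This is only an illustration of the proposed semantics, not the processor's code: `apply_timezone` is a hypothetical helper, and offsets are passed as whole hours for brevity.

```python
from datetime import datetime, timedelta, timezone


def apply_timezone(text: str, offset_hours: int) -> str:
    """Hypothetical helper illustrating the expected semantics."""
    target = timezone(timedelta(hours=offset_hours))
    # Normalise the input so datetime.fromisoformat accepts it:
    # comma decimal separator -> dot, trailing "Z" -> explicit UTC offset.
    parsed = datetime.fromisoformat(text.replace(",", ".").replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # No offset in the input: attach the timezone, keep the wall-clock time.
        result = parsed.replace(tzinfo=target)
    else:
        # Offset present: convert the same instant into the target timezone.
        result = parsed.astimezone(target)
    return result.isoformat(timespec="milliseconds")


print(apply_timezone("2020-01-15T15:28:09.452+01:00", -5))
print(apply_timezone("2020-01-15T15:28:09.452Z", 3))
print(apply_timezone("2020-01-15T15:28:09.452", 3))
print(apply_timezone("2020-01-15T14:28:09,452", 1))
```

The last call uses the value from the original report; under these semantics it yields 2020-01-15T14:28:09.452+01:00, which is what the reporter expected.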

From what I see, in 6.x it was working as you described, @matriv.
I will mark this as a bug and work on it.

Pinging @elastic/es-core-infra (:Core/Infra/Core)
