I'm upgrading from groovy to painless and it's not so painless so far. I wanted to post this here as I think this would be a pretty common scenario and I'm having trouble with dates (gotta love dates)
PUT /test-stacks-v1
{
"aliases": {
"test-stacks": {}
},
"mappings": {
"stacks": {
"_all": {
"enabled": false
},
"dynamic": false,
"properties": {
"id": {
"type": "keyword"
},
"first_occurrence": {
"type": "date"
},
"last_occurrence": {
"type": "date"
},
"total_occurrences": {
"type": "float"
}
}
}
}
}
PUT /test-stacks/stacks/1ecd0826e447a44e78877ab1
{
"id": "1ecd0826e447a44e78877ab1",
"total_occurrences": 0,
"first_occurrence": "0001-01-01T00:00:00",
"last_occurrence": "0001-01-01T00:00:00"
}
POST /test-stacks/stacks/1ecd0826e447a44e78877ab1/_update?retry_on_conflict=3
{
"script": {
"params": {
"minOccurrenceDateUtc": "2016-10-25T22:04:19.3272056Z",
"maxOccurrenceDateUtc": "2016-10-25T22:04:19.3272056Z",
"count": 1
},
"inline": "if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > params.minOccurrenceDateUtc) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;"
}
}
The last script update returns
json
{
"error": {
"root_cause": [
{
"type": "remote_transport_exception",
"reason": "[8mJiEA3][local[1]][indices:data/write/update[s]]"
}
],
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
"type": "script_exception",
"reason": "runtime error",
"caused_by": {
"type": "class_cast_exception",
"reason": "Cannot apply [<] operation to types [java.lang.String] and [java.lang.String]."
},
"script_stack": [
"if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ",
" ^---- HERE"
],
"script": "if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > params.minOccurrenceDateUtc) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;",
"lang": "painless"
}
},
"status": 400
}
The original groovy script worked flawlessly in 1.7.5 (including linebreaks)
``` cs
var request = new UpdateRequest<Stack, Stack>(GetIndexById(stackId), ElasticType.Type, stackId) {
RetryOnConflict = 3,
Script = @"if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > minOccurrenceDateUtc) {
ctx._source.first_occurrence = minOccurrenceDateUtc;
}
if (ctx._source.last_occurrence < maxOccurrenceDateUtc) {
ctx._source.last_occurrence = maxOccurrenceDateUtc;
}
ctx._source.total_occurrences += count;",
Params = new Dictionary<string, object>(3) {
{ "minOccurrenceDateUtc", minOccurrenceDateUtc },
{ "maxOccurrenceDateUtc", maxOccurrenceDateUtc },
{ "count", count }
}
};
Is there anything special I need to do to cast a field (which is mapped as a date) and stored as a string to a date reliably?
I'll try to update (https://www.elastic.co/guide/en/elasticsearch/reference/5.0/docs-update.html) with a date example once this is figured out :)
Elastic reserves GitHub for bug reports and feature requests; please post general questions on the Elastic Discourse forum.
Considering you don't have a single painless test for dates or docs (https://github.com/elastic/elasticsearch/blob/master/modules/lang-painless/src/test/java/org/elasticsearch/painless/CastTests.java) I think this is a bug/feature report.
I've created a post on https://discuss.elastic.co/t/comparing-dates-in-painless/63943 and renamed this title
Looks like it's almost impossible to parse an ISO-8601 date with painless (http://stackoverflow.com/questions/2201925/converting-iso-8601-compliant-string-to-java-util-date) am I missing something?
To anyone wondering I was able to get this working at least locally, I'm not sure how reliable it is at parsing different date formats as I can't specify Zulu
because min date time doesn't have it (pre epoch)
def sf = new SimpleDateFormat(\""yyyy-MM-dd'T'HH:mm:ss\"");
if (ctx._source.total_occurrences == 0 || sf.parse(ctx._source.first_occurrence).after(sf.parse(params.minOccurrenceDateUtc))) {
ctx._source.first_occurrence = params.minOccurrenceDateUtc;
}
if (sf.parse(ctx._source.last_occurrence).before(sf.parse(params.maxOccurrenceDateUtc))) {
ctx._source.last_occurrence = params.maxOccurrenceDateUtc;
}
ctx._source.total_occurrences += params.count;
I'm reopening this because I think it is interesting. I mean, you have a workaround but it kind of sucks for a few reasons:
first_occurrence
is a date and should be exposed to painless as a date.I don't know how many changes should stem from this issue but my instinct is "a couple".
I do want to be clear on point number 2. Painless's list of priorities are:
The work around you have has trouble with 2 and 3 and 4. I don't know if we'll ever get 4, but we should be able to get 2 and 3.
Yeah, I ended up having to do even more work as stated in here (https://discuss.elastic.co/t/comparing-dates-in-painless/63943). I had to create a function to parse the date into a Instant (in a try block) and return a default value. It definitely wasn't straight forward. I appreciate you reopening this.
@nik9000 Point 1 to me is certainly the most interesting here. On point 3, I thought @rjernst made the change to Date fields to expose these things as dates? Maybe I'm misremembering, though. I think for compatibility with Groovy, it's best effort, but if something doesn't make sense or we agree it's a poor feature we should definitely not be forced to implement it. Groovy has a lot of really insecure/overly lenient features.
The change I made was for boolean
s, and only applicable for doc values (which are not available in update scripts).
Some additional feedback. I seem to be hitting more issues around this (I'm working to get to the bottom of this.).. I have a date "date": "2017-01-03T14:12:36.2770782-06:00"
and I'm doing an explain on it .
_explain
{
"query": {
"script": {
"script": "Debug.explain(doc.date)"
}
}
}
which returns
"type": "script_exception",
"reason": "runtime error",
"to_string": "[1483474356277]",
So I tried to convert it by doing both Instant.parse(doc.date)
returns "org.elasticsearch.index.fielddata.ScriptDocValues$Longs cannot be cast to java.lang.CharSequence"
and new Date(doc.date)
returns "org.elasticsearch.index.fielddata.ScriptDocValues$Longs cannot be cast to java.lang.Long"
Seems at least on the second one that the conversion should be much easier in painless.
and new Date(doc.date) returns "org.elasticsearch.index.fielddata.ScriptDocValues$Longs cannot be cast to java.lang.Long"
ScriptDocValues$Longs
already provides a date accessor. Use doc['date'].date
.
@rjernst good to know, I just also figured I could do this as well (get the utc time) def x = OffsetDateTime.parse('2017-01-01T00:00:00.0000000+12:00'); Debug.explain(x.toInstant())
@rjernst is that (doc['date'.date) ]available on reindex scripts? Getting a null reference when I try that outside of regular explain.
That would be available in any script that has doc
available. I think that is just search scripts. But if you see a message about ScriptDocValues, it should be available.
In reindex ctx._source.date
should return a date. You should be able to have a look at it with Debug.explain(ctx._source.date)
(5.1.1+). That should return an error with the class of the date and you can go from there. This has some instructions that should be useful. I'd love to have a proper API reference to point you at and I've been working on and off on it for a bit. I've uncovered quite a few crabs along the way though....
@nik9000 thanks!, looks like it's a string, I can do OffsetDateTime.parse(ctx._source.date).toInstant();
Hello,
I also encounter a lot of painful problems when trying to work with datetime in painless... For instance, i wanted to apply a TZ to two dates and get the difference but cannot find a way to do it in painless...
In groovy, i could do something like:
(doc['t1'].date.setZone(org.joda.time.DateTimeZone.forID(tz)) - doc['t2'].date.setZone(org.joda.time.DateTimeZone.forID(tz))).days
Do you plan to improve this aspect of painless or do i need to keep using groovy if i want to do complex operations on date object ?
(doc['t1'].date.setZone(org.joda.time.DateTimeZone.forID(tz)) - doc['t2'].date.setZone(org.joda.time.DateTimeZone.forID(tz))).days
I don't think a lot of these are whitelisted. I'm not sure why, but right now most of java's time API is whitelisted and joda isn't.... I expect something like this should work:
ZoneId zone = ZoneId.of(params.tz); ChronoUnit.DAYS. between(Instant.ofEpochMilli(doc.t2.date.millis).atZone(zone), Instant.ofEpochMilli(doc.t1.date.millis).atZone(zone))
Thank you very much @nik9000, it works well !!
Thank you very much @nik9000, it works well !!
And now I'm torn! Painless supports java's time api well because we expected to port Elasticsearch to java's time api and drop joda time at some point. But we haven't yet. Either we should push the migration through in 6.0 or fix up joda's date support in painless or both....
or fix up joda's date support in painless
I don't think we should do that. The java 8 time api in painless works fine.
The java 8 time api in painless works fine.
Yeah. I'm fairly sure the right thing to do is to complete the port. For the time being things like date aggregations can work if you return millis-since-epoch from the script.
I'm also against adding support for Joda. Dates are already challenging to use, but having two libraries would really step up the difficulty level.
Removing Joda is an eventuality.
I agree with not adding support for joda to painless. We should (at some stage) start working on https://github.com/elastic/elasticsearch/issues/12829 to replace joda with java time instead
@rjernst I wouldn't say it works fine, but it works.. It's been a PITA working with dates in elastic painless...
@niemyjski thanks for the help.
Here is my experience with manipulating dates.
Maybe this will help someone, I eventually succeed using the .plusHours()
function.
it seems like this issue can be closed since we have a way forward with java time... closing
Most helpful comment
I agree with not adding support for joda to painless. We should (at some stage) start working on https://github.com/elastic/elasticsearch/issues/12829 to replace joda with java time instead