Elasticsearch: [Transform] scripted group_by fails for continuous transform

Created on 29 May 2020 · 3 comments · Source: elastic/elasticsearch

Affected versions: 7.7 - 7.9.1 (fixed >= 7.9.2)

Continuous transforms minimize the number of updates by querying only changed buckets. The change detection logic combines several query features to do so; for a terms group_by it relies on a terms query.
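As a rough illustration of the idea (not the actual transform implementation), the sketch below builds the kind of terms query that narrows a search to changed buckets. The function name `changed_buckets_query`, the field `user_id`, and the bucket keys are assumptions made up for this example; only the `terms` query shape comes from the Elasticsearch query DSL.

```python
def changed_buckets_query(field, changed_keys):
    """Build a terms query matching only documents in changed buckets.

    `field` is the field named in the terms group_by; `changed_keys` are
    the bucket keys seen in documents that arrived since the last checkpoint.
    """
    # Sorting only makes the generated query deterministic.
    return {"terms": {field: sorted(changed_keys)}}


# Hypothetical example: two users received new documents.
query = changed_buckets_query("user_id", {"u42", "u7"})
print(query)
```

The point of the sketch is that the query is keyed by a field name, which is exactly the piece of information a scripted group_by does not provide.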

If a script is used in group_by instead of a field, this change detection logic causes the transform to fail because it expects a field. The failure does not occur directly after start, but after checkpoint 1, once the transform has switched to continuous mode:

task encountered irrecoverable failure: field name cannot be null
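For illustration, a pivot configuration along the following lines would hit this problem. It is written as a Python dict so the missing field can be shown directly; the index names, field names, the Painless script, and the helper `group_by_field` are assumptions for this sketch, not taken from the reporter's setup.

```python
# Hypothetical continuous transform with one regular and one scripted group_by.
transform_config = {
    "source": {"index": "orders"},
    "dest": {"index": "orders-by-customer"},
    "sync": {"time": {"field": "timestamp", "delay": "60s"}},
    "pivot": {
        "group_by": {
            "region": {"terms": {"field": "region"}},
            "customer_key": {
                "terms": {
                    "script": {
                        "source": "doc['first_name'].value + '_' + doc['last_name'].value"
                    }
                }
            },
        },
        "aggregations": {"order_count": {"value_count": {"field": "order_id"}}},
    },
}


def group_by_field(group_by_source):
    """Return the field a group_by source reads from, or None if it is scripted."""
    # Each group_by source has a single key naming its type, e.g. "terms".
    (settings,) = group_by_source.values()
    return settings.get("field")


fields = {
    name: group_by_field(source)
    for name, source in transform_config["pivot"]["group_by"].items()
}
print(fields)
```

The scripted `customer_key` entry has no field name, which is what the change detection logic trips over when it tries to build its terms query.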

Because a script can build a bucket key freely, the key cannot be mapped back to a field for change detection. We could construct a script query instead; however, this would be very expensive.

Mitigation: don't use scripts in group_by together with continuous transforms. Scripts in queries are very expensive, so independent of change detection it's highly recommended to _not_ use scripts in production, but only in the development/data exploration phase.

Possible solutions

Update: We finally decided to go with option B, keeping the other ideas for documentation.

## A Disallow scripted group_by in continuous mode

This would be the easiest option. However, if only 1 of the group_by's is scripted and the other n are not, this would limit functionality unnecessarily.

## B Disable change detection for scripted group_by

This would let continuous mode do a full rerun if all group_by's use a script. At larger scale this leads to performance problems. If there are other non-scripted group_by's, or the amount of data is small, this might still be an acceptable solution.
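A minimal sketch of how option B could decide what to do per checkpoint, assuming the simplification that a group_by supports change detection exactly when it names a field. The function `plan_checkpoint`, its return shape, and the warning text are hypothetical and do not reflect the actual implementation.

```python
def plan_checkpoint(group_by):
    """Split group_by's into those usable for change detection and those skipped."""
    detectable, skipped = [], []
    for name, source in group_by.items():
        # Each group_by source has a single key naming its type, e.g. "terms".
        (settings,) = source.values()
        (detectable if "field" in settings else skipped).append(name)

    # Without any usable group_by, the checkpoint has to reprocess everything.
    full_rerun = not detectable
    warning = None
    if full_rerun:
        warning = "no group_by supports change detection; expect a full rerun per checkpoint"
    return {
        "filter_on": detectable,
        "skipped": skipped,
        "full_rerun": full_rerun,
        "warning": warning,
    }


mixed = plan_checkpoint({
    "region": {"terms": {"field": "region"}},
    "customer_key": {"terms": {"script": {"source": "doc['a'].value"}}},
})
scripted_only = plan_checkpoint({
    "customer_key": {"terms": {"script": {"source": "doc['a'].value"}}},
})
print(mixed["filter_on"], scripted_only["full_rerun"])
```

In the mixed case the non-scripted group_by still narrows the search, so only the all-scripted case falls back to a full rerun.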

B.2: For this solution we could also consider using _update instead of index.

## C Implement change detection based on scripted query

I am not 100% sure this is possible. This solution would use a script query instead of a terms query. This solution might not be better than solution B: disabling change detection.

General solution remarks

Because the amount of data might be small, A would limit functionality unnecessarily; I therefore tend towards B. It would be possible to disallow only certain combinations, for example only when no group_by implements change detection, but again this sounds like too much of a restriction. Instead we should warn the user about potential problems. Solution C might be good in the long run, but takes significantly more time to verify, implement and test, so B seems to be the best short-term solution.

The user who reported the issue built a workaround for https://github.com/elastic/elasticsearch/issues/48243; supporting missing_bucket should therefore be prioritized.

Update

If a transform cannot apply any change detection, a warning about performance is raised (job message + log).

:ml/Transform >bug

All 3 comments

Pinging @elastic/ml-core (:ml/Transform)

FWIW: I tested the possibility to backport this change to 7.9, but unfortunately this turned into a lot of merge problems as the PR depends on other changes.

Without leaking any information about release dates, the next patch release 7.9.2 and the next minor 7.10 are not far away and there is not much time in-between them.

I therefore dropped the idea of a backport.

After revisiting the backport again, I manually backported the important bits to 7.9.2.
