Elasticsearch: Add (Hard-) Limit to SLM Snapshot Frequency?

Created on 20 Apr 2020 · 5 Comments · Source: elastic/elasticsearch

Currently, users can get their clusters into trouble by accidentally configuring an SLM policy that takes snapshots every few seconds.
For example, a schedule that fires every 10s can still leave enough time for each snapshot to finish (because the cluster contents barely changed between runs), so snapshots keep piling up and the repository metadata can grow enormously.

Should we add a hard limit on the snapshot frequency in SLM (e.g. a minimum interval of 5 min between snapshots)?
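To put rough numbers on the concern, here is a back-of-the-envelope sketch. It assumes every scheduled run actually produces a snapshot and that no retention policy deletes any of them in the meantime; the helper name `snapshots_per_day` is made up for illustration.

```python
from datetime import timedelta


def snapshots_per_day(interval: timedelta) -> int:
    """Number of scheduled runs that fit in 24 hours at a fixed interval."""
    return timedelta(days=1) // interval


print(snapshots_per_day(timedelta(seconds=10)))  # 8640
print(snapshots_per_day(timedelta(minutes=5)))   # 288
```

Under those assumptions, a 10s schedule creates 30 times as many snapshots per day as the suggested 5 min floor.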

:Core/Features/ILM+SLM Core/Features

All 5 comments

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

cc @nachogiljaldo

We discussed this today, and it sounds like a good idea. Since the schedule is a cron expression, a limit is a bit more challenging to enforce, but still doable.
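One way such a check could work, sketched under assumptions: rather than analysing the cron expression itself, sample its upcoming fire times and reject the schedule if any two consecutive firings are closer together than the limit. Nothing here is Elasticsearch code. `MIN_INTERVAL` uses the 5 min figure floated in this issue, `next_fire` stands in for whatever a real cron library provides, and `every` is a toy schedule used only to make the sketch runnable.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

# Hypothetical limit, taken from the "e.g. 5 min" suggestion in this issue.
MIN_INTERVAL = timedelta(minutes=5)

# next_fire(t) returns the first fire time strictly after t, or None if the
# schedule never fires again. A real cron library would supply this function.
NextFire = Callable[[datetime], Optional[datetime]]


def min_gap(next_fire: NextFire, start: datetime, samples: int = 50) -> Optional[timedelta]:
    """Smallest gap between consecutive firings among the first sampled runs."""
    prev = next_fire(start)
    smallest = None
    for _ in range(samples):
        if prev is None:
            break
        cur = next_fire(prev)
        if cur is None:
            break
        gap = cur - prev
        if smallest is None or gap < smallest:
            smallest = gap
        prev = cur
    return smallest


def schedule_allowed(next_fire: NextFire, start: datetime,
                     limit: timedelta = MIN_INTERVAL) -> bool:
    gap = min_gap(next_fire, start)
    # No gap means the schedule fires at most once, so there is nothing to reject.
    return gap is None or gap >= limit


def every(step: timedelta) -> NextFire:
    """Toy stand-in for a cron schedule: fires on multiples of `step` since the epoch."""
    epoch = datetime(1970, 1, 1)

    def next_fire(after: datetime) -> datetime:
        n = (after - epoch) // step + 1
        return epoch + n * step

    return next_fire


start = datetime(2020, 4, 20)
print(schedule_allowed(every(timedelta(seconds=10)), start))  # False
print(schedule_allowed(every(timedelta(minutes=5)), start))   # True
print(schedule_allowed(lambda t: None, start))                # True (never fires)
```

Sampling is a heuristic: an irregular cron expression could have its shortest gap outside the sampled window, so a real implementation would need to decide how far ahead to look.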

At the opposite extreme, what about preventing SLM schedules like once every thousand years?

Edit: Thinking about this a little more, the consequence of a bad policy that executes only once a year is that the snapshots don't get taken, whereas at once a minute or once a second you can really hammer your cluster.

"what about preventing SLM schedules like once every thousand years?"

We shouldn't prevent this, and we shouldn't prevent schedules that never execute (such as * * * 31 FEB ? *), because SLM policies can still be manually executed on an as-needed basis.
