It's quite common that different people are responsible for configuring data ingestion and for actually analyzing the data. How can analysts tell whether they cannot find events older than X months because that is the retention period of the data, or simply because no events older than X months match the current filter?
Some cases make this even more of a trap. For instance, think of a user searching for sequences of events of category X then Y. If X and Y have different retention periods, it's easy to be misled into thinking that old data exists when there is actually only old data for one of the categories.
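To make the sequence case concrete, here is a minimal sketch. The category names, the retention values, and the `sequence_lookback` helper are all hypothetical and only illustrate the reasoning: a sequence query can only be trusted as far back as the shortest retention among the categories it involves.

```python
from datetime import timedelta

def sequence_lookback(retention_by_category, categories):
    """Return the longest window over which *all* categories still have data.

    A sequence "X then Y" needs both event types, so the usable window is
    bounded by the category with the shortest retention.
    """
    return min(retention_by_category[c] for c in categories)

# Illustrative values only (assumptions, not taken from a real cluster).
retention = {"X": timedelta(days=90), "Y": timedelta(days=30)}
lookback = sequence_lookback(retention, ["X", "Y"])
print(lookback.days)
```

With these values, events of category X older than 30 days are still searchable, but no sequence older than 30 days can match because the Y events are gone.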
Some questions to get the discussion started:
can_match phase? (with the caveat that the can_match phase may filter indices based on their @timestamp values)
Pinging @elastic/es-search (:Search/Search)
@jpountz how are you defining "retention period" here? ILM policy delete phase?
@dakrone Yes indeed.
Hmm.. what about a policy like this:
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_docs": 10000000,
            "max_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "1d",
        "actions": {
          "delete": { }
        }
      }
    }
  }
}
We would have to be careful not to warn users that they shouldn't look for data older than one day, because deletion is based on the rollover time, so the index could be a month old even though its delete retention is one day.
@dakrone I think you're raising a good question, but it's not obvious to me that we should not warn, as the fact that the data exists is a bit accidental. I'm thinking of the case of someone who experiments with a query with the goal of turning it into an alerting rule at some point. If there is data just because we're "lucky", wouldn't it be better to warn users so that they don't accidentally create rules that might not see all the data that they expect to see?
We had some discussions about this yesterday and the following questions were raised:
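A rough sketch of the arithmetic behind this concern, assuming (as stated above) that the delete phase's min_age is counted from the rollover time. The function name and the dates are made up for illustration:

```python
from datetime import datetime, timedelta

def oldest_data_age_at_deletion(index_created, rolled_over_at, delete_min_age):
    """Age of the oldest possible document when the index gets deleted.

    Assumes deletion happens delete_min_age after rollover, not after
    index creation.
    """
    deletion_time = rolled_over_at + delete_min_age
    return deletion_time - index_created

# Hypothetical low-volume index: it takes 30 days to reach the rollover
# conditions (max_docs / max_size), then is deleted one day later.
created = datetime(2020, 1, 1)
rolled_over = created + timedelta(days=30)
age = oldest_data_age_at_deletion(created, rolled_over, timedelta(days=1))
print(age.days)  # 31
```

So with a "1d" delete phase, documents up to 31 days old may still be searchable in this scenario, which is why the delete min_age alone is not a reliable upper bound on how old the data can be.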
@tomcallahan brought up the idea that maybe this shouldn't be about warning users, but that we should instead enable Elasticsearch to return information about the retention period for a given index pattern. This would allow Kibana to tailor its UI to this retention period and, e.g., signal that filtering on the "Last 90 days" isn't right if the data has a retention period of 30 days.
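As a sketch of what a client could do with such information, the snippet below reads the delete phase's min_age from a policy body shaped like the one earlier in this thread and compares it with a requested time range. The helper names are hypothetical, the parser only handles a few time units, and no Elasticsearch API is called. Since min_age is relative to rollover, the result is at best a lower bound on retention.

```python
import re
from datetime import timedelta

# Subset of time units handled by this sketch (an assumption for brevity).
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds", "ms": "milliseconds"}

def parse_min_age(value):
    """Parse strings such as "30d" or "0ms" into a timedelta."""
    match = re.fullmatch(r"(\d+)(ms|d|h|m|s)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})

def delete_min_age(policy):
    """Return the delete phase's min_age, or None if there is no delete phase."""
    delete_phase = policy["policy"]["phases"].get("delete")
    if delete_phase is None:
        return None
    return parse_min_age(delete_phase["min_age"])

def range_exceeds_retention(requested, retention):
    """True if the requested look-back window is longer than the retention."""
    return retention is not None and requested > retention

# Hypothetical policy with a 30-day delete phase.
policy = {"policy": {"phases": {"delete": {"min_age": "30d", "actions": {"delete": {}}}}}}
retention = delete_min_age(policy)
print(range_exceeds_retention(timedelta(days=90), retention))  # True
```

A UI could use a check like this to flag a "Last 90 days" filter rather than silently returning partial results.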
To move this forward we agreed to gather more feedback from Solutions to see whether this is something they already considered.
++ for exposing this and letting Kibana decide how to show it
@jpountz since this has two area labels, which team should take ownership of this, the search team or the core/features team?
Thanks for the ping. Since the idea of adding this information to _field_caps seems to be getting traction, I'm assigning the search team.