The Kibana Upgrade Assistant helps users prepare for the next major version of Elasticsearch. It works by introspecting various aspects of a cluster and its usage to surface deprecated functionality that is in use, and to prepare indices that require reindexing. The deprecations it surfaces are those that can be ascertained by introspecting the current state of the cluster. It cannot, however, catch ongoing uses of deprecated functionality (e.g., APIs) that are not visible in the current cluster state. The upgrade assistant would be even more useful if it could help users understand their use of such deprecated functionality.
Today we do surface such deprecated functionality via the deprecation logs, but the upgrade assistant has no easy way to get its hands on them.
The crux of this issue, then, is to make it possible for the upgrade assistant to collect the deprecation logs from each running node. One way to do this is to write the deprecation logs to an index that the upgrade assistant could then read, alongside the deprecations it can already obtain.
It is likely that we want to consider the deprecation indices as system indices, and also manage them via ILM if ILM is available (e.g., if the most recent use of a deprecated API was more than N months ago, it is probably no longer relevant; the user has likely already migrated away from that API).
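To make the retention idea concrete, an ILM policy along these lines could be shipped and loaded via `PUT _ilm/policy/deprecation-logs-policy` (a hypothetical sketch; the policy name and the 90-day retention are illustrative, not decided):

```json
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```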
Pinging @elastic/es-core-infra
Pinging @elastic/es-core-features
This is a really cool idea. We were looking into this together with @ycombinator and @nik9000 to possibly start a filebeat that would consume ES logs and upload them back to an index.
The solution was proposed a while back here https://github.com/elastic/dev/issues/731
We discussed this in a couple of team meetings. Two approaches were discussed: the first would rely on a bundled filebeat to ingest the deprecation logs, and the other would build this functionality into the DeprecationLogger itself.
There is ongoing discussion about packaging filebeat and metricbeat with elasticsearch (#49399), but they would be disabled by default. Given the desire to have them disabled and not to add a new process by default, the filebeat approach would require explicit work from the user, whereas if we added this to the deprecation logger, we could enable it more easily. For that reason, the discussion favored the process-internal solution.
There was additional discussion regarding the potential performance impact of indexing deprecation messages. This should not be a problem if we use `deprecateAndMaybeLog` correctly, which should prevent spamming.
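As a rough sketch of the throttling behaviour being discussed (purely illustrative; `deprecateAndMaybeLog` keys messages so each key is only emitted occasionally, but the class and time window below are not the actual Elasticsearch implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative throttler: each deprecation key is logged at most once
// per time window, so a hot code path cannot spam the log or the index.
class DeprecationThrottler {
    private final Map<String, Long> lastLogged = new ConcurrentHashMap<>();
    private final long windowNanos;

    DeprecationThrottler(long windowNanos) {
        this.windowNanos = windowNanos;
    }

    /** Returns true if the caller should actually emit this message. */
    boolean shouldLog(String key, long nowNanos) {
        Long prev = lastLogged.get(key);
        if (prev != null && nowNanos - prev < windowNanos) {
            return false; // seen recently, suppress
        }
        // Benign race: two threads may both log once at the window edge,
        // which is acceptable for a best-effort suppression sketch.
        lastLogged.put(key, nowNanos);
        return true;
    }
}
```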
While I was the one who suggested that we could use deprecateAndMaybeLog here, I do wonder if we should consider trying to log all of these messages to the index. There's a trade-off here between potential performance issues (which maybe we can address in other ways, such as batching) and being able to surface to a user the actual last time that they used deprecated functionality.
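The batching idea mentioned above could look something like this (a hypothetical sketch, not taken from any branch; the class name and bulk-writer callback are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative batcher: buffers deprecation documents and flushes them
// as one bulk write once the batch is full, trading a little latency
// for far fewer indexing requests.
class DeprecationBatcher {
    private final List<String> buffer = new ArrayList<>();
    private final int maxBatchSize;
    private final Consumer<List<String>> bulkWriter;

    DeprecationBatcher(int maxBatchSize, Consumer<List<String>> bulkWriter) {
        this.maxBatchSize = maxBatchSize;
        this.bulkWriter = bulkWriter;
    }

    synchronized void add(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    synchronized void flush() {
        if (!buffer.isEmpty()) {
            bulkWriter.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

A real implementation would also flush on a timer so that a trickle of messages still reaches the index promptly.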
Pinging @elastic/es-ui
> I do wonder if we should consider trying to log all of these messages to the index
Even with batching, I worry we could exhaust resources in any case where a deprecation occurs per document in a query, as is typical when deprecations occur within scripting. I'm supportive of the idea of logging more if we move to an index, but I wanted to point out that we still have edge cases to consider where logging the details of every warning is not practical.
I don't think we should deprecation log anything per document. Per request, that's okay though, and I think that alleviates a lot of the pressure here.
Could we use something like ScriptService#checkCompilationLimit to limit how much we log? If we're batching we're probably already going to synchronize somehow, somewhere, so adding the rate limiting would be pretty cheap.
checkCompilationLimit relies on calling nanoTime, which I don't think we want to do per document. Instead, I think we can find a way to call deprecations through a script specific lock, so that we only call the deprecation on the first use when executing the script.
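One way to read that suggestion (a hypothetical sketch; the class name is invented) is to hang a one-shot flag off the compiled script, so the deprecation fires on the first document only and later documents pay just an atomic compare, with no clock call on the hot path:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative guard: the script carries a flag so its deprecation is
// recorded only on first use, with no nanoTime call per document.
class ScriptDeprecationGuard {
    private final AtomicBoolean warned = new AtomicBoolean(false);

    /** Returns true exactly once; every later call is a cheap no-op. */
    boolean firstUse() {
        return warned.compareAndSet(false, true);
    }
}
```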
I've been playing around with the current deprecation logger. Since the logger is called all over the place, it seems unfeasible to introduce anything that would require changes to the call sites.
Instead, I threw together a DeprecationIndexer, initialised it in Node and passed it to the DeprecationLogger. Now, if a new setting is true, deprecation messages are written in something resembling ECS (only if the logger is also writing a message to Log4J).
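A document "resembling ECS" could be shaped roughly like this (an assumption on my part; the field set in the actual branch may differ, though `@timestamp`, `log.level`, and `event.code` are real ECS field names):

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative ECS-flavoured deprecation document builder.
class DeprecationDoc {
    static Map<String, Object> build(String key, String message, Instant now) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("@timestamp", now.toString());
        doc.put("log.level", "WARN");
        doc.put("log.logger", "deprecation");
        doc.put("message", message);
        doc.put("event.code", key); // stable key identifying the deprecated feature
        return doc;
    }
}
```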
The main issue I had was security - the NodeClient I got from Node is authenticated as _system, which doesn't have permission to create templates, create indices, or write to indices, so I slapped in a dirty hack to see if the rest of it worked. I was running Elasticsearch with ./gradlew run.
I also made the indexer listen to the cluster state so that, once the cluster was ready, it could ensure an index template exists; the indexer then writes to a daily index, and stops listening once it knows the template exists.
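The daily index name could be derived something like this (the `.deprecation-log-` prefix and date pattern are placeholders, not necessarily what the branch uses):

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative daily index naming: one index per UTC day, so old days
// can be dropped wholesale by ILM or a cleanup job.
class DeprecationIndexName {
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy.MM.dd");

    static String forTimestamp(Instant timestamp) {
        LocalDate day = timestamp.atZone(ZoneOffset.UTC).toLocalDate();
        return ".deprecation-log-" + DAY.format(day);
    }
}
```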
So the questions I have are:
@pugnascotia that sounds good to me.
Will this indexer be synchronous and block execution? Can you link your branch?
I was thinking if maybe we could implement this as a log4j appender that would be used together with asynchronous logger?
I was hoping to reuse deprecation logger logic for compatible API warnings. So there will be even more usages.
With regard to ECS, I am meant to tackle this here: https://github.com/elastic/elasticsearch/pull/47105
It is actually almost done; I just need to add more testing.
@pgomulka here's the branch: https://github.com/elastic/elasticsearch/compare/4ff5e03c70a...pugnascotia:index-deprecation-logs
We can take this idea even further, and use this deprecation index as a common collection point for deprecation logs across the Stack, and then expose in Kibana all the deprecated functionality in any Stack product that a user is using, helping to give a full view across the Stack of the changes a user might need to make when preparing to upgrade. We will need to hash out the details of this idea, which @pugnascotia will take the lead on. 🙏
I had a good chat with @jakelandis about this, and we realised that although there are parallels with the existing monitoring code, given that we're ripping all that out and relying on stack features, we should do the same here. We can ship an index template that creates the deprecation index as a data stream with some suitable ILM settings.
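The template-plus-data-stream plan could be sketched roughly as follows, loaded via `PUT _index_template/deprecation-logs-template` (all names here are illustrative assumptions; it references the kind of ILM policy discussed earlier in the thread):

```json
{
  "index_patterns": [".logs-deprecation-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.lifecycle.name": "deprecation-logs-policy"
    }
  }
}
```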
@pgomulka would you mind taking a quick look at a new implementation for writing deprecation logs? See:
https://github.com/elastic/elasticsearch/compare/master...pugnascotia:index-deprecation-logs-v2