We use the ES backend and would like to be able to archive traces.
Archiving is only supported by the Cassandra storage plugin.
I briefly looked at the implementation of archiving. It looks like most of the logic is built on things that already existed, and that it would be fairly straightforward to add an ES implementation. Two options come to mind:
1) Allow configuring a second ES cluster for archiving (the same way we do for Cassandra)
2) Use a separate index for archive traces. The esCleaner.py script would need to ignore this index.
It seems like we'd need a different way of dividing up the indexes to support this if we go with option (1) above. Currently we create an index for each day, which probably doesn't make sense for archiving. Option (2) above would solve this inherently.
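For option (2), the cleaner only has to skip the archive index when it selects indices to delete. A minimal sketch of that filter, assuming the archive index name contains `jaeger-span-archive`; the function name is hypothetical and this is not the actual esCleaner.py code:

```python
from typing import Iterable, List

# Assumed marker for archive indices; the real name is still to be decided.
ARCHIVE_MARKER = "jaeger-span-archive"


def indices_to_clean(all_indices: Iterable[str]) -> List[str]:
    """Return the indices a cleaner may consider, leaving archive indices out."""
    return [name for name in all_indices if ARCHIVE_MARKER not in name]
```

The substring check also covers prefixed names such as `tenant:jaeger-span-archive`.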
I think this needs both 1 and 2 simultaneously. Probably requires #799 and #628 to be implemented first (per cluster), and then add ability to specify archiving cluster.
Is there a specific motivation for wanting a separate cluster for archiving, rather than just a separate keyspace/index?
It doesn't have to be a separate cluster, but it doesn't have to be the same cluster either. The configuration is such that archive storage inherits most settings from primary storage, and you can override some of them.
I am looking into this
@yurishkuro Are the archived traces shown on the /search page? It seems that an archived trace can only be accessed directly via the /trace/id endpoint.
If the above is true, we don't want to create an index for service names, so we will have to make changes to the span writer and reader. I also assume that we will create only one archive index per deployment.
To support multiple tenants, the index prefix will simply be put in front of the archive index, e.g. tenant:jaeger-span-archive or tenant:jaeger-span-1970-01-01 :)
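A minimal sketch of that naming scheme, assuming `:` as the separator as in the examples above. The helper name and signature are hypothetical, not existing Jaeger code:

```python
from datetime import date
from typing import Optional

SPAN_INDEX = "jaeger-span"


def span_index_name(prefix: str = "", day: Optional[date] = None) -> str:
    """Return the archive index name when day is None, else the daily index name."""
    suffix = "archive" if day is None else day.isoformat()
    name = f"{SPAN_INDEX}-{suffix}"
    # The tenant prefix, if any, goes in front of the whole index name.
    return f"{prefix}:{name}" if prefix else name
```

With this, `span_index_name("tenant")` gives the archive index and `span_index_name("tenant", date(1970, 1, 1))` gives the daily one.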
> @yurishkuro Are the archived traces shown on /search page? it seems that an archived trace can be only accessed directly via /trace/id endpoint.
Yes, it only works for direct lookups by ID. It's primarily built to support long-lived hyperlinks that people can put to tickets, postmortem docs, etc.
We should also think about retention for the archive index. We could split it by time period as proposed in #628 (day, month, year), or just allow a different index name to be used, e.g. `jaeger-span-archive-2`. It might also be doable with the prefix.
Note that it's not expected to delete data from an index in ES.
^^ cc @jaegertracing/elasticsearch
Just as a counter-point, we would not use the native Jaeger archive at all.
Instead, we currently rely on our own elasticsearch-curator configuration to route indices from "hot nodes" to "warm nodes". Specifically in our Kube+AWS deployment, this means moving from Elasticsearch nodes sized as r4.4xlarge with gp2 EBS volumes to r4.xlarge nodes with st1 EBS volumes.
This might be a better recommendation for the Jaeger project, even though it is more complexity in the Elasticsearch deployment and surrounding tooling itself.
An older blog post, though still largely relevant in Elasticsearch 6.x, which further explains this architecture: "Hot-Warm" Architecture in Elasticsearch 5.x
I guess, specifically, this feels like possibly the wrong way to fix a performance limitation/regression in the Elasticsearch storage backend. If given time-bounding on the query, it will "automatically" optimize which indexes need to be scanned instead of walking the entire available data-set of spans.
This is all new to me, so I will have to experiment and do some reading. But it seems we could have one archive index (perhaps as an alias). This alias would point to one write index (archive-3) and several read indices (archive-1, archive-2). The write index would be rolled over (based on conditions such as shards or time?) and moved to the read indices. I am not sure whether the rolled-over index can be automatically assigned to another alias.
After playing with the rollover API, here is my proposal for how we could use it for the archive index. If it works well, we could start experimenting with it for the main indices too.
First a brief explanation of rollover API:
- Rolled-over indices are named name-%counter or name-%date-%counter.

The great news is that ES >= 6.4.4 supports is_write_index (https://www.elastic.co/guide/en/elasticsearch/reference/6.4/indices-aliases.html#aliases-write-index, https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html#indices-rollover-is-write-index), which allows an alias that points to multiple indices to be used for writes. The write index in the alias has is_write_index:true. After a rollover the old index stays in the alias as read-only. This simplifies readers and external tooling, since the old index no longer has to be added to a separate read alias.
My proposal is to allow the use of two archive indices: one for writes and one for reads. By default the read index would be the same as the write index. This satisfies simple deployments with no extra configuration. More complex deployments would have to create the archive aliases (or a single alias if ES 6 is used) before deploying and run a cronjob that calls rollover.
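To make the two setups concrete, here is a rough sketch of the request body that could be sent to the `_aliases` endpoint when bootstrapping the archive. The alias names (`jaeger-span-archive-write`, `jaeger-span-archive-read`) and the function itself are assumptions for illustration; only the `actions`/`add`/`is_write_index` fields come from the ES aliases API:

```python
from typing import Optional


def bootstrap_alias_actions(first_index: str, write_alias: str,
                            read_alias: Optional[str] = None) -> dict:
    """Build an _aliases request body for the first archive index."""
    if read_alias is None or read_alias == write_alias:
        # Single alias for reads and writes; needs is_write_index support.
        return {"actions": [
            {"add": {"index": first_index, "alias": write_alias,
                     "is_write_index": True}},
        ]}
    # Separate aliases: the first index starts out in both of them.
    return {"actions": [
        {"add": {"index": first_index, "alias": write_alias}},
        {"add": {"index": first_index, "alias": read_alias}},
    ]}


# Hypothetical names, for illustration only.
es5_body = bootstrap_alias_actions("jaeger-span-archive-000001",
                                   "jaeger-span-archive-write",
                                   "jaeger-span-archive-read")
```

Serializing the returned dict with `json.dumps` gives the body to POST.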
cc @jaegertracing/elasticsearch any feedback is welcome.
The last thing we have to figure out is how to call rollover. We could use curator with a cronjob. The curator would call rollover, parse the response, and (on ES 5) put the old index into the read alias. I am not sure whether the rollover operation in curator returns an object with index info. This is what the rollover API itself returns:
{"old_index":"jaeger-span-archive-000001","new_index":"jaeger-span-archive-000002","rolled_over":true,"dry_run":false,"acknowledged":true,"shards_acknowledged":true,"conditions":{"[max_age: 1s]":true,"[max_docs: 1]":false}}
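A sketch of the "parse the response and put the old index into the read alias" step, assuming the response shape shown above. The function and the read alias name are hypothetical. Whether the new index also has to be added to the read alias depends on how readers query, so that part is left out:

```python
import json
from typing import Optional


def read_alias_update(rollover_response: str, read_alias: str) -> Optional[dict]:
    """Return an _aliases request body adding the old index to read_alias.

    Returns None when nothing was rolled over (or it was only a dry run).
    """
    resp = json.loads(rollover_response)
    if not resp.get("rolled_over") or resp.get("dry_run"):
        return None
    return {"actions": [
        {"add": {"index": resp["old_index"], "alias": read_alias}},
    ]}


# Trimmed version of the response above.
SAMPLE = ('{"old_index":"jaeger-span-archive-000001",'
          '"new_index":"jaeger-span-archive-000002",'
          '"rolled_over":true,"dry_run":false,"acknowledged":true}')

body = read_alias_update(SAMPLE, "jaeger-span-archive-read")
```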
I am linking an issue regarding automatic rollover API https://github.com/elastic/elasticsearch/issues/26092.
NB: are you only thinking of using this for archives? It seems useful for the main storage as well, since we're currently issuing queries over multiple indices, rather than a single alias, which would simplify the code & configuration.
My plan is to start with the archive index and then add an option for the main storage.
that's fair, although in Cassandra there is no difference between main/archive storage implementations, just a configuration, so you would still need to make changes to the ES storage impl - are you thinking a fork or a feature flag?
A good question. I think a feature flag, since there will be a lot of similarities. Only the functions that compute index names should be different; maybe we could resolve that function in the constructor.
My main blocker here is how to get the old index name after rollover and put it into the read alias. I will have to play with the curator.
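Roughly what I mean by resolving the function in the constructor, as a sketch only. The class, the flag name `use_aliases`, and the `-read` alias suffix are made up for illustration and are not existing Jaeger code:

```python
from datetime import date, timedelta
from typing import Callable, List


class SpanReader:
    def __init__(self, use_aliases: bool, prefix: str = ""):
        self._base = f"{prefix}:jaeger-span" if prefix else "jaeger-span"
        # The feature flag is consulted once; the rest of the reader
        # just calls self.indices_for(start, end).
        self.indices_for: Callable[[date, date], List[str]] = (
            self._alias_indices if use_aliases else self._daily_indices
        )

    def _daily_indices(self, start: date, end: date) -> List[str]:
        """One index per day in [start, end], as in the current layout."""
        days = (end - start).days
        return [f"{self._base}-{(start + timedelta(days=i)).isoformat()}"
                for i in range(days + 1)]

    def _alias_indices(self, start: date, end: date) -> List[str]:
        """A single read alias, regardless of the time range."""
        return [f"{self._base}-read"]
```

Everything else in the reader stays shared between the two modes.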
@pavolloffay Would archiving traces only be supported with ES >= 6.4.4?
No, it will be supported on 5.x and newer. On >= 6.4.4 we would just leverage is_write_index to have a single alias for both writes and reads.