Describe the feature:
The current Elasticsearch multi-data-center (or cross-cloud/cross-AWS-region) deployment strategy is incomplete and/or too complex.
For example, one approach uses a messaging queue (which, in my opinion, is overcomplicated and adds dependencies):
https://www.elastic.co/blog/scaling_elasticsearch_across_data_centers_with_kafka
Another approach uses a Tribe node based deployment, which is limited (it is really designed for federated search across clusters rather than for scalability/availability):
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-tribe.html
https://www.elastic.co/blog/clustering_across_multiple_data_centers
It would be useful to explore other approaches, specifically something like Cassandra's quorum replication, with its different consistency levels (e.g. EACH_QUORUM) and deployment using a specified or dynamically derived network/data center topology:
https://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
https://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchesAbout_c.html
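To make the quorum idea concrete, here is a minimal sketch of how per-data-center quorum checks work in Cassandra's model, where a quorum is `floor(replication_factor / 2) + 1`. The function names and the dictionaries of replica and acknowledgement counts are hypothetical, used only for illustration; this is not Cassandra or Elasticsearch code.

```python
from typing import Dict


def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def each_quorum_satisfied(replicas_per_dc: Dict[str, int],
                          acks_per_dc: Dict[str, int]) -> bool:
    """EACH_QUORUM-style check: every data center must reach its own quorum."""
    return all(
        acks_per_dc.get(dc, 0) >= quorum(rf)
        for dc, rf in replicas_per_dc.items()
    )


def local_quorum_satisfied(replicas_per_dc: Dict[str, int],
                           acks_per_dc: Dict[str, int],
                           local_dc: str) -> bool:
    """LOCAL_QUORUM-style check: only the coordinator's data center must reach quorum."""
    return acks_per_dc.get(local_dc, 0) >= quorum(replicas_per_dc[local_dc])


# Hypothetical topology: 3 replicas in each of two data centers.
replicas = {"us-east": 3, "eu-west": 3}
# A write acknowledged by 2 replicas locally but only 1 remotely.
acks = {"us-east": 2, "eu-west": 1}

print(each_quorum_satisfied(replicas, acks))              # False
print(local_quorum_satisfied(replicas, acks, "us-east"))  # True
```

The trade-off is that the stricter level gives cross-data-center consistency at the cost of waiting on remote acknowledgements, while the local level stays available when the link between data centers is slow or down.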
I understand that this is a complicated topic; however, I just wanted to put this out there as an issue for the Elasticsearch team to consider.
Hi @ttaranov
We are actively working on a disaster recovery product for X-Pack which will be based on the Changes API (https://github.com/elastic/elasticsearch/issues/1242). This will be about keeping multiple clusters in sync rather than trying to make a single cluster span multiple data centres.
Cool - glad to hear it's in progress. Thanks, Tim
@clintongormley Is there any progress on this?