Today when we register a remote cluster for cross cluster search we try to connect to it straight away. If the remote cluster is not available and it is configured in the elasticsearch.yml of the CCS node, that node won't start. On the other hand, when a remote cluster is registered through the cluster update settings API we still try to connect to it, but a connection failure won't impact the registration of the remote cluster.
I wonder if we should unify the behaviour between the two ways of registering. Do we prefer the remote cluster to be available in both cases at registration time, or would we want to make remote clusters also optional when registering them in the configuration file?
With #26118 we are introducing a setting that allows skipping remote clusters that are disconnected at search time. Maybe we want to behave differently at registration time too depending on that setting?
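For readers following along, here is a minimal sketch of the request body for the second registration path (the cluster update settings API). It assumes the `search.remote.<alias>.seeds` setting name used at the time of this discussion; the alias `cluster_one`, the seed address, and the helper name `remote_cluster_settings` are hypothetical. The same setting written in elasticsearch.yml would be a single line such as `search.remote.cluster_one.seeds: 10.0.0.1:9300`.

```python
import json


def remote_cluster_settings(alias, seeds, persistent=True):
    """Build a body for PUT /_cluster/settings that registers one remote cluster.

    Hypothetical helper: it only builds the JSON, it does not send the request.
    """
    scope = "persistent" if persistent else "transient"
    return {scope: {"search": {"remote": {alias: {"seeds": list(seeds)}}}}}


# Example values are assumptions for illustration only.
body = remote_cluster_settings("cluster_one", ["10.0.0.1:9300"])
print(json.dumps(body, indent=2))
```

Sending this body with `PUT /_cluster/settings` registers the remote cluster even if the seed is unreachable, whereas the same setting in elasticsearch.yml currently prevents the node from starting.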
Opening this one for discussion.
Relates to #26961
We spoke about this during fixit-friday and we agreed that, given that we now have search.remote.connect, we should rather remove the ability to specify remote clusters in the elasticsearch.yml altogether. The original purpose was to allow certain nodes to act as CCS nodes, but now we have a setting to do that.
cc @elastic/es-search-aggs
The solution discussed above would translate to making the remote cluster seeds a setting that can only be set through the cluster settings API, disallowing it from being set in the elasticsearch.yml configuration file. I was talking to @s1monw and it seems like we do not yet have the infra for this, and I wonder if it's worth adding it. There was also a comment from a user who said they would prefer to set seeds in the configuration file rather than the API, and have it behave the same as setting things from the API. I think that would be ok too. I am not sure that it makes sense to introduce, now and here, the notion of a node setting that can only be set dynamically through the API. What do people think?
As a user, I would rather be able to set everything required for a working elasticsearch cluster via the yml file (or, in my case, docker container environment variables). My use case is a system that is producing logs in multiple, distant data centers, and I am setting up an elasticsearch cluster in each data center to index the logs. However, I would like a single "pane of glass" to search the logs from both data centers, so I need to configure CCS. It makes the most sense to me to configure each datacenter cluster as a remote cluster of the other, and I do that when I launch the containers.
Just my $0.02
As a user I would also find it undesirable to only configure cross cluster search through the API. For our use case the most reliable way, in terms of distributing configuration to our servers, will be for the remote cluster settings to be configured in elasticsearch.yml via our standard process that configures all the other servers in our networks. Having almost everything in elasticsearch.yml, but then some settings that can only be provided to the API, creates additional complexities that don't seem to be strictly required.
thanks for commenting @ebernhardson. we got similar feedback from other users, which is why we have not yet made the change described above. I am planning to rather unify the behaviour between the two ways of registering remote clusters, as the behaviour is rather confusing at the moment. We may also consider honouring skip_unavailable at registration time.
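To make the idea of honouring skip_unavailable at registration time concrete, here is a toy model of one possible semantics. It is not Elasticsearch code: `register_remote`, `RemoteUnavailable`, the `registry` dict, and the `connect` callback are all hypothetical names, and the rule shown (fail registration only when the remote is unreachable and skip_unavailable is false) is an assumption about what unified behaviour could look like.

```python
class RemoteUnavailable(Exception):
    """Raised when a required remote cluster cannot be reached at registration."""


def register_remote(registry, alias, seeds, connect, skip_unavailable=False):
    """Register a remote cluster, applying the same rule for yml and API.

    `connect` is any callable that raises ConnectionError on failure.
    Returns True if the initial connection succeeded, False otherwise.
    """
    try:
        connect(seeds)
        connected = True
    except ConnectionError as exc:
        if not skip_unavailable:
            # Required remote: reject the registration outright.
            raise RemoteUnavailable(f"cannot connect to remote [{alias}]") from exc
        # Optional remote: keep the registration and retry later.
        connected = False
    registry[alias] = {"seeds": list(seeds), "connected": connected}
    return connected


def unreachable(seeds):
    # Stand-in for a connection attempt to a cluster that is down.
    raise ConnectionError("connection refused")


registry = {}
register_remote(registry, "cluster_one", ["10.0.0.1:9300"], unreachable,
                skip_unavailable=True)
```

Under this model the source of the setting (configuration file or API) no longer matters; only the skip_unavailable flag decides whether an unreachable remote blocks registration.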
Hey @javanna I think this solution would work well for me too. Is it still being considered?
heya @StevTheDev yes this is still the plan, but I was busy with other tasks that have higher priority, hence I never got to it. I will try to make time soon-ish but I don't know yet when exactly that will be.