Spring-boot: The Jest Elasticsearch health indicator uses a potentially heavy call to check its status

Created on 1 Jun 2017  路  14Comments  路  Source: spring-projects/spring-boot

The spring-boot-actuator Elasticsearch health check performs a call to /_all/_stats which potentially comes with a big response. In my case, this response is ~8MB at the moment and regular health checks were causing a huge load on the system. We create indices at an hourly basis. IMHO, using /_cat/health is better suited for this task

superseded bug

Most helpful comment

I agree, the output of that command can be quite large.

It seems that the /_cat/health endpoint may not be supported in SaaS environments though; they tend to use REST gateways and limit features in shared environments, see searchly documentation for example. I've just tried with a searchly instance (ES 5.1+) and this doesn't work (I'm getting an HTTP 500).

Do you know of other potential endpoints that we could use?

All 14 comments

I agree, the output of that command can be quite large.

It seems that the /_cat/health endpoint may not be supported in SaaS environments though; they tend to use REST gateways and limit features in shared environments, see searchly documentation for example. I've just tried with a searchly instance (ES 5.1+) and this doesn't work (I'm getting an HTTP 500).

Do you know of other potential endpoints that we could use?

In the case of Searchly, all administrative features of ElasticSearch are restricted from the API, according to their documentation. That doesn't leave us with a lot of room, does it? I'm sure that Searchly have their reasons to do so, but that is one case out of millions of users, and many of them are not developing in shared environments with special restrictions.

I still believe that the right endpoint to call by default should be /_cat/health which could be overridden via configuration properties.

Instead of having management.health.elasticsearch.indices= we could have management.health.elasticsearch.endpoint=/_cat/health

I think we should treat this one as a bug so I've targeted it to 1.5.4. We've only got a few days before that release so we might not get to it in time. It still feels like a solution in 1.5.x would be useful.

Note: it's not a case of Searchly vs. rest of the world. It's more a case of ElasticSearch instances running on premise vs. in the cloud (the official elastic.co cloud offering or the AWS one). Even companies running their own cluster can have X-Pack enabled and disable cluster info access from applications.

The team just discussed this topic and it seems that changing the default for all setups out there might not be the solution we're looking for. Having a strategy that uses one or the other might be more useful; in fact, the HealthIndicator bean could be that strategy.

Right now, defining your own bean named elasticsearchHealthIndicator should replace the existing one and provide something more suited to your needs.

We could consider switching strategies and providing different information automatically, but I'm not aware of any clear signal in the configuration that says if you've got access to those endpoints or not (besides trying and possibly failing). Any idea about that @rodol-fo ?

Let's keep this issue opened until we've figured this out.

@rodol-fo Your input would be useful here. Please see the comment from @bclozel above.

Hello @bclozel,

I think you are right. The only bulletproof way of knowing if a particular ES endpoint is available is by trying a request on it. However, in my experience /_cluster/health was always available when I worked with cloud based ES clusters, especifically AWS and Elastic.

I'm not so sure about /_cat/health in the case Elastic Cloud but it is definitely available for Amazon. See the below:

http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-supported-es-operations.html

I hope this helps

Thanks, @rodol-fo.

I think I've changed my opinion on this. Having spent some more time reading around, and reading the Elasticsearch documentation in particular, I now think that /_cluster/health is a more appropriate default for the health indicator. However, I can't see a way to make that change in 1.5.x without possibly breaking people's applications.

Perhaps we should consider leaving this as-is in 1.5 and making a change in 2.0 instead?

That last option has my vote.

It seems the /_cat/** endpoints can only be leveraged using REST calls directly; this means you can use any HTTP client directly or the official elasticsearch-rest-client (ES 5.0+) / elasticsearch-rest-high-level-client (ES 6.0+) dependencies.

In other words, we can't use the Java native transport API to fetch that information.
The org.elasticsearch.client.Client native transport API or io.searchbox.client.JestClient are what we're auto-configuring, and none of those seem to support that endpoint.

The Java native transport should be deprecated in the future, see here and here.
Ultimately, using elasticsearch-rest-client means using Apache httpclient (4.5.x, so in line with what we've got right now).

With all that in mind, a couple of options:

  1. Leave the current behavior as it is and revisit things in the future
  2. Deprecate both ES health indicators in 1.5.x and replace them with a single indicator that uses elasticsearch-rest-client, for both native and Jest use cases. Note that this choice probably means altering existing starters to include that dependency. Of course, it would also make sense to auto-configure a elasticsearch-rest-client instance for application usage as well.

@snicoll, @wilkinsona, @philwebb, @rodol-fo - let me know what you think and if I've missed something.

I don't think we should do much in 1.5.x except perhaps add a warning to the documentation. For 2.x, I guess option 2 makes the most sense in the long-term, but it's certainly something we can tackle in 2.1 if necessary.

Depends on #12600

12600 is now fixed, and we can try a global approach for that change now.

/_cluster/health is still probably the best way to go, even if some providers don't support it (at least one of them doesn't seem to support /_all/_stats anyway).

The ES REST client will be supported by Spring Data ElasticSearch in the next Moore release train. As of #12600, we're auto-configuring those clients in Spring Boot.

I'm closing this issue in favour of #14914 since Jest is now supporting the cluster endpoint. This aligns the Jest health indicator with what we've got for the transport client already.

We'll figure the rest out in #15008.

Was this page helpful?
0 / 5 - 0 ratings