The following use case may cause all services implemented with Micronaut and connected to Consul to be marked as down even though they are up and running.
Micronaut uses https://github.com/micronaut-projects/micronaut-core/blob/e54f6e6e3077ce24815bdebf9ffe95daf530fc51/discovery-client/src/main/java/io/micronaut/discovery/consul/client/v1/HealthEntry.java to deserialize the responses from Consul.
InetAddress is used in the Jackson deserialization, but since the hostname is not an IP address and is not resolvable from the current context, the following exception is thrown:
io.micronaut.http.codec.CodecException: Error decoding JSON stream for type [interface java.util.List]: simpleapp-6dfdb49cd7-8p9z8: Name or service not known (through reference chain: java.util.ArrayList[0])
2019-08-03 09:20:09 06:20:09.470 [nioEventLoopGroup-3-2] DEBUG i.m.http.client.DefaultHttpClient - Unable to decode response body using codec JsonMediaTypeCodec:Error decoding JSON stream for type [interface java.util.List]: simpleapp-6dfdb49cd7-8p9z8: Name or service not known (through reference chain: java.util.ArrayList[0])
Since the exception is not caught, the health endpoint is marked as down.
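To illustrate the underlying behaviour, here is a minimal standalone sketch (not Micronaut code) of how InetAddress treats IP literals versus hostnames. The class name ResolveDemo, the helper tryResolve and the .invalid hostname are assumptions made up for this example; the real failure happens inside Jackson while it builds the HealthEntry.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

public class ResolveDemo {
    // Returns the resolved address, or empty if the name cannot be resolved.
    static Optional<InetAddress> tryResolve(String host) {
        try {
            return Optional.of(InetAddress.getByName(host));
        } catch (UnknownHostException e) {
            // This is the kind of failure that surfaces as
            // "Name or service not known" in the log above.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // An IP literal is parsed directly; no DNS lookup is performed.
        System.out.println(tryResolve("10.1.2.3").isPresent());
        // Hypothetical pod-style hostname under the reserved .invalid TLD;
        // it is expected to be unresolvable outside its own cluster.
        System.out.println(tryResolve("simpleapp-6dfdb49cd7-8p9z8.invalid").isPresent());
    }
}
```

The point of the sketch is that any hostname registered in Consul must be resolvable by every client that deserializes it, whereas an IP address carries no such requirement.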
Consider two services, A and B.
Since the services do not interact with each other, B's status should not be marked as down.
B's overall status is marked as down even though the services do not interact. Furthermore, a single DNS error is able to bring the whole cluster down.
As a workaround, one may disable a few out-of-the-box features:
consul.client.discovery.enabled: false
endpoints.health.discovery-client.enabled: false
In scenarios where the host name cannot be resolved you should use the prefer-ip-address setting:
https://docs.micronaut.io/latest/guide/configurationreference.html#io.micronaut.discovery.consul.ConsulConfiguration$ConsulRegistrationConfiguration
Even if prefer-ip-address is used, it does not prevent the cascading failure when DNS resolution fails.
Should the prefer-ip-address property be used only with valid IP values? Currently you can set a non-resolvable hostname.
In this scenario, you should replace DiscoveryClientHealthIndicator with an implementation that either ignores any exceptions or just returns OK:
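import io.micronaut.context.annotation.Replaces;
import io.micronaut.health.HealthStatus;
import io.micronaut.management.health.indicator.HealthIndicator;
import io.micronaut.management.health.indicator.HealthResult;
import io.micronaut.management.health.indicator.discovery.DiscoveryClientHealthIndicator;
import io.reactivex.Flowable;
import javax.inject.Singleton;
import org.reactivestreams.Publisher;
@Singleton
@Replaces(DiscoveryClientHealthIndicator.class)
class MyIndicator implements HealthIndicator {
    // Always reports UP, so discovery errors no longer affect overall health.
    public Publisher<HealthResult> getResult() {
        return Flowable.just(HealthResult.builder("discovery-client", HealthStatus.UP).build());
    }
}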
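The other option mentioned above, ignoring exceptions rather than always returning OK, could follow the pattern sketched below. This is a library-free illustration only: LenientCheck, Status and checkLeniently are hypothetical names, not Micronaut APIs, and a real indicator would have to apply the same idea to the Publisher returned by getResult().

```java
import java.util.function.Supplier;

public class LenientCheck {
    // Hypothetical stand-in for a health status; not the Micronaut type.
    enum Status { UP, DOWN, UNKNOWN }

    // Runs the delegate check and falls back to the given status
    // if the check itself throws (for example, on a decoding error).
    static Status checkLeniently(Supplier<Status> delegate, Status fallback) {
        try {
            return delegate.get();
        } catch (RuntimeException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // A check that fails while decoding the discovery response.
        Supplier<Status> failing = () -> {
            throw new IllegalStateException("Error decoding JSON stream");
        };
        System.out.println(checkLeniently(failing, Status.UNKNOWN));
        // A check that completes normally keeps its own result.
        System.out.println(checkLeniently(() -> Status.DOWN, Status.UP));
    }
}
```

Compared with always returning UP, this keeps genuine results from the discovery client and only masks failures of the check itself; which fallback status is appropriate is a judgment call for the application.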
If you wish to supply a PR with configurable exception handling logic then the code that handles this can be found here: