There appears to be some change in behaviour for the Kubernetes-oriented readiness group endpoint on 2.3.2 compared to 2.3.1.
For a service that has no external dependencies (and only readinessState in the health group), the /actuator/health/readiness endpoint is returning a 404.
Configuration we are using:
management.server.port=9083
management.health.probes.enabled=true
management.endpoints.enabled-by-default=false
management.endpoint.info.enabled=true
management.endpoint.health.enabled=true
management.endpoint.health.show-details=always
management.endpoint.health.group.liveness.include=livenessState,diskSpace,refreshScope
management.endpoint.health.group.readiness.include=readinessState
management.endpoint.health.group.liveness.show-details=always
management.endpoint.health.group.readiness.show-details=always
management.endpoints.web.exposure.include=health
Expected Behaviour
We expect this to just return 200 with { "status": "UP" }
Actual Behaviour
$ http http://localhost:9083/actuator/health/readiness
HTTP/1.1 404 Not Found
Full health call:
$ http http://localhost:9083/actuator/health
HTTP/1.1 200 OK
Connection: keep-alive
Content-Type: application/json
Date: Sat, 25 Jul 2020 06:27:55 GMT
Transfer-Encoding: chunked
{
    "components": {
        "discoveryComposite": {
            "components": {
                "discoveryClient": {
                    "description": "Discovery Client not initialized",
                    "status": "UNKNOWN"
                }
            },
            "description": "Discovery Client not initialized",
            "status": "UNKNOWN"
        },
        "diskSpace": {
            "details": {
                "exists": true,
                "free": 287311962112,
                "threshold": 10485760,
                "total": 499963174912
            },
            "status": "UP"
        },
        "livenessStateProbeIndicator": {
            "status": "UP"
        },
        "ping": {
            "status": "UP"
        },
        "reactiveDiscoveryClients": {
            "components": {
                "Simple Reactive Discovery Client": {
                    "description": "Discovery Client not initialized",
                    "status": "UNKNOWN"
                }
            },
            "description": "Discovery Client not initialized",
            "status": "UNKNOWN"
        },
        "readinessStateProbeIndicator": {
            "status": "UP"
        },
        "refreshScope": {
            "status": "UP"
        }
    },
    "groups": [
        "liveness",
        "readiness"
    ],
    "status": "UP"
}
This may relate to #22107.
After a bit more digging, I'm not sure why, or whether it was intended, but the issue seems to be that readinessState has become readinessStateProbeIndicator (and likewise livenessState has become livenessStateProbeIndicator), so the old configuration was not including the indicator at all, leaving the readiness group empty.
The following configuration seems to work as expected:
management.endpoint.health.group.liveness.include=livenessStateProbeIndicator,diskSpace,refreshScope
management.endpoint.health.group.readiness.include=readinessStateProbeIndicator
Yes, this is an unintended side effect of #22107. The workaround you're mentioning is the right one in the meantime.
Thanks for raising this issue!
No problem - feel free to re-title it as appropriate.
Unfortunately this is a silently breaking change for many people. They probably won't realise that the probe status is no longer being included in the group status alongside, say, db, redis etc., because including a non-existent indicator in a group doesn't seem to fail startup :(
I've tagged this issue as a regression.
I'm really sorry for letting in that one.
Does this cover the fact that they are listed under groups at /health, but then don't actually exist?
I have exactly the same issue as @OrangeDog. On my container with management.endpoint.health.probes.enabled=true:
When executing GET /actuator/health:
{
    "status": "UP",
    "groups": [
        "liveness",
        "readiness"
    ]
}
When executing GET /actuator/health/liveness:
404 Not Found
I agree this is potentially confusing, but it doesn't seem to be the main problem here?
I wonder whether the /actuator/health endpoint behaved differently under 2.3.1 if a group had no configured components, i.e. whether it filtered them out of groups: []?
I guess this is a matter of design: the group exists but has no (valid) components, therefore its status is indeterminate, therefore the implementation returns a 404? It certainly can't return 200 OK....
Would we want to be aware the groups exist, so we know we can add components to them with include?
Instead of referencing readinessStateProbeIndicator and livenessStateProbeIndicator, I think you need to set the management.health.livenessstate.enabled and management.health.readinessstate.enabled properties, introduced by Spring Boot 2.3.2, to true. You can then reference readinessState and livenessState.
When the management.health.[readiness|liveness]state.enabled properties are set to false (the default), AvailabilityProbesAutoConfiguration creates the readinessStateProbeIndicator and livenessStateProbeIndicator beans, which need to be referenced as [readiness|liveness]StateProbeIndicator (the full bean name).
On the other hand, when the properties are enabled, AvailabilityHealthContributorAutoConfiguration creates the [readiness|liveness]StateHealthIndicator beans, which can be referenced as [readiness|liveness]State.
The problem is in AvailabilityProbesHealthEndpointGroups, created by AvailabilityProbesHealthEndpointGroupsPostProcessor: it creates the readiness/liveness groups with [readiness|liveness]State as members.
So, if the [readiness|liveness]State indicators are not available, the groups are created but the referenced HealthIndicator beans are not there.
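To make the naming mismatch described above concrete, here is a minimal, self-contained sketch. It does not use Spring Boot; the class HealthGroupNameSketch, its methods, and the suffix list are hypothetical and only model the behaviour described in this thread (a HealthIndicator suffix is dropped from the bean name, a ProbeIndicator suffix is not).

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class HealthGroupNameSketch {

    // Assumption: these bean-name suffixes are dropped when deriving a contributor name.
    private static final String[] SUFFIXES = {"healthindicator", "healthcontributor"};

    // Derives the name under which a bean would be referenced in a group's include list.
    static String contributorName(String beanName) {
        String lower = beanName.toLowerCase(Locale.ENGLISH);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix) && beanName.length() > suffix.length()) {
                return beanName.substring(0, beanName.length() - suffix.length());
            }
        }
        return beanName; // e.g. "readinessStateProbeIndicator" is kept whole
    }

    // Returns the included names that actually match an available contributor.
    static Set<String> resolveGroup(List<String> include, List<String> beanNames) {
        Set<String> available = new LinkedHashSet<>();
        for (String beanName : beanNames) {
            available.add(contributorName(beanName));
        }
        Set<String> members = new LinkedHashSet<>(include);
        members.retainAll(available);
        return members;
    }

    public static void main(String[] args) {
        List<String> include = List.of("readinessState");
        // Probe beans only: the include matches nothing, so the group is empty.
        System.out.println(resolveGroup(include,
                List.of("readinessStateProbeIndicator", "pingHealthIndicator"))); // prints []
        // Health indicator beans: the include matches.
        System.out.println(resolveGroup(include,
                List.of("readinessStateHealthIndicator", "pingHealthIndicator"))); // prints [readinessState]
    }
}
```

Under these assumptions, an include of readinessState only resolves when the readinessStateHealthIndicator bean exists, which is why the group ends up empty when only the probe beans are registered.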
Would we want to be aware the groups exist, so we know we can add components to them with include?
The API response is supposed to be for consumers of the API, not documenting configuration options for the developer. Like the rest of the actuator system, only endpoints that are currently available should be listed as available.
When the management.health.[readiness|liveness]state.enabled properties are set to false (by default)
FYI, surprisingly enough, Spring Boot decided to name the readiness state property management.health.readynessstate.enabled, with a y, in the 2.3.2.RELEASE version (the most recent release as of this date).
See the reference: https://docs.spring.io/spring-boot/docs/2.3.2.RELEASE/reference/html/appendix-application-properties.html#actuator-properties
@antoinegrappin no, that's just a documentation error. The property is readiness.
@OrangeDog indeed, I confirm after tests.
This issue is now fixed in the 2.3.3 and 2.4.0 SNAPSHOTs.
I've carefully read the comments on this issue regarding the following surprising behavior: getting a 404 status on a configured health group when no indicator is present. In this very case it's arguably wrong, but here we are dealing with a regression. But some of you thought that Spring Boot should either fail at startup when a group includes a missing indicator, or not list such a group at all.
The first alternative sounds nice, especially for detecting bad configurations. But it's also likely to fail in perfectly valid cases. Your application could configure a group management.endpoint.health.group.custom.include=ping,redis and fail in a test environment where no redis instance is available. Because Spring Boot reacts to the environment, it's expected to behave differently and adapt to the situation.
The second alternative is debatable. Right now our health groups support is auto-configured with the configuration properties and does not look into the application context to check for the existence of health indicators. We seem to all agree that a 404 response status is right in this case. Removing the group information would, in my opinion, make things less consistent as we wouldn't know that a group has been configured. After all, a health group is just a way to wrap several indicators under the same name and customize its global health status - but health indicators are still dynamic.
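As a rough illustration of the behaviour described above, here is a small sketch in plain Java. It is not Spring Boot code: EmptyGroupSketch, listedGroups and httpStatus are hypothetical names, and the mapping to 200 deliberately ignores the case where a present indicator reports DOWN.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EmptyGroupSketch {

    // Group names come from configuration only, so every configured group is listed.
    static Set<String> listedGroups(Map<String, List<String>> configuredGroups) {
        return configuredGroups.keySet();
    }

    // 404 when none of the included indicators is currently registered, otherwise 200.
    // Simplification: a real endpoint would also map an unhealthy status to an error code.
    static int httpStatus(List<String> include, Set<String> registeredIndicators) {
        for (String name : include) {
            if (registeredIndicators.contains(name)) {
                return 200;
            }
        }
        return 404;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("custom", List.of("ping", "redis"));
        groups.put("readiness", List.of("readinessState"));

        // Hypothetical test environment: no redis and no readinessState indicator.
        Set<String> registered = Set.of("ping", "diskSpace");

        System.out.println(listedGroups(groups));                              // prints [custom, readiness]
        System.out.println(httpStatus(groups.get("custom"), registered));      // prints 200
        System.out.println(httpStatus(groups.get("readiness"), registered));   // prints 404
    }
}
```

The custom group still answers because ping is present even though redis is not, while readiness stays listed but answers 404 because none of its indicators exists.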
After discussing that briefly with the team, we didn't think that this needs to be changed. Note that this behavior has existed since the introduction of the health groups feature. If you can make a stronger case for changing this, please create a dedicated issue and explain how this behavior is inconsistent or could lead to issues.
Thanks!
Thanks @bclozel - the fix is working fine in 2.3.3 after removing the workaround to the probe names I mentioned above :-)
@chadlwilson can you share your configurations in 2.3.3? I am finding the same issue there..
@salaboy If your application runs on Kubernetes, you don't need any specific configuration.
If it doesn't, you need to enable the probes with the following:
management.endpoint.health.probes.enabled=true