Better controls for Health Indicators #18753
I think I suggested this (or something very similar) during the discussion of #14022 (comment). I am happy that more people are interested in this. In my opinion, the ability to specify a "threshold" on a specific health check, e.g.
is easy to implement. On top of that, the actuator endpoints could also support querying by those thresholds, e.g.
This way it is very straightforward to configure something like:
This effectively disables the database health check without losing visibility. People can then map liveness and readiness probes to ( All these will work nicely, since health statuses are already ordered.
A sample health result, when someone configures it, could look something like this:
Thanks for raising these suggestions again but we're not keen to add any more complexity to the health indicator endpoint at this time. We feel that having health contributors that don't actually affect the overall status might cause quite a bit of confusion. A couple of specific points have guided our thinking on this: We want to keep
Hi 👋
This issue was discussed a few years back (#7626 in 2016) but since then things have changed.
With the jump in adoption of systems like Kubernetes, it seems that the way Health Indicators operate should be revisited. An example of this is the new health indicator groups feature, which supports different probes (liveness and readiness), but IMO this is still not enough.
I would like to suggest another change which is the ability to specify if the status of a specific health indicator should affect the overall health check.
This is already being done by some implementations, one of the most high profile ones being the Hystrix health indicator, where when a circuit breaker is open the health endpoint still returns a 200 "UP".
Having the ability to specify this behaviour would allow us to report the status of dependent systems and why they are failing (withDetails, withException) without actually forcing the overall health status to fail.
It would also make the usage less ambiguous and less error prone. For example, the Hystrix and the Resilience4j health indicators have opposite behaviours when dealing with failures: one results in a 200 UP and the other in a 503 DOWN.
I'm not sure if this could be done with a condition like
management.health.foo.some-name-here
or if it would have to be manually configured for each of the indicators included in the spring-boot-actuator, but I believe this is the right time to discuss if this change has merit.
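To make the proposal a bit more concrete, here is a minimal plain-Java sketch of the idea. It is not Spring Boot Actuator code and does not use any Actuator API: the names `HealthAggregationSketch`, `Indicator`, `Status` and the `affectsOverall` flag are all hypothetical, and the three-value status order is an assumption made only for illustration. It shows an indicator that is still reported with its details while being excluded from the overall status.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java, not the Spring Boot Actuator API.
class HealthAggregationSketch {

    // Assumed severity order for this sketch: most severe first.
    enum Status { DOWN, OUT_OF_SERVICE, UP }

    // One health indicator result. "affectsOverall" is the hypothetical
    // switch discussed in this issue.
    static final class Indicator {
        final String name;
        final Status status;
        final String detail;
        final boolean affectsOverall;

        Indicator(String name, Status status, String detail, boolean affectsOverall) {
            this.name = name;
            this.status = status;
            this.detail = detail;
            this.affectsOverall = affectsOverall;
        }
    }

    // Overall status is the most severe status among the indicators
    // that are allowed to affect it; UP if there are none.
    static Status aggregate(List<Indicator> indicators) {
        Status overall = Status.UP;
        for (Indicator indicator : indicators) {
            if (indicator.affectsOverall
                    && indicator.status.ordinal() < overall.ordinal()) {
                overall = indicator.status;
            }
        }
        return overall;
    }

    // Every indicator is reported, including the ones excluded from
    // the overall status, so no visibility is lost.
    static Map<String, String> report(List<Indicator> indicators) {
        Map<String, String> components = new LinkedHashMap<>();
        for (Indicator indicator : indicators) {
            components.put(indicator.name,
                    indicator.status + " (" + indicator.detail + ")");
        }
        return components;
    }

    public static void main(String[] args) {
        List<Indicator> indicators = List.of(
                new Indicator("db", Status.DOWN, "connection refused", false),
                new Indicator("diskSpace", Status.UP, "enough free space", true));

        System.out.println("overall: " + aggregate(indicators));
        System.out.println("components: " + report(indicators));
    }
}
```

With the `db` indicator marked as not affecting the overall status, the aggregate stays UP while the failure and its detail remain visible in the component report. Flipping that flag to `true` makes the same failure bring the overall status DOWN.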