-
Notifications
You must be signed in to change notification settings - Fork 41.2k
Conditional Execution of Health Checks #3441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Are you aware that there is a |
@snicoll -- Yes, I'm aware of the caching capabilities of the health result but I do not think it's the right solution for this need since it unilaterally caches all the results of the health check instead of an identified subset. Setting a high Thanks! |
Either way, this sounds like a nice feature. If you want to give that a try, go ahead and we can discuss more based on the code (make sure to sign the CLA). |
As I said on Stack Overflow, I'm not convinced that this is a good idea. The description above hasn't changed that opinion – I still think this feels like premature optimisation. @coreyjv You've said that some calls can be "expensive". Can you define that more concretely? Checking that a service is available, even if it's remote, should have a far lower cost than actually using that service for any real work. I also don't really understand the point of having a health check that, based on some condition, might not actually check anything. Beyond a cache with predictable TTL, aren't you in danger of believing that your app is healthy when, in fact, it's not? |
Thanks for taking the time to respond!
Sure thing. One example would be checking connectivity to a directory service such as LDAP or Active Directory. In these cases one would want to validate they could query certain parts of the tree successfully. Another example would be a remote call on a web service. This may not only be expensive from a network perspective since it's a remote call but it presents another challenge. These downstream dependencies also have to be able to support the load that these health check calls are placing on which could slow down actual services, and be significant if the calling application is actually multiple application instances.
I think there's a distinction between the application health, and the health of external dependencies. While an external dependency such as a remote web service may report as I hope this provides a bit of clarity and thanks for taking a critical look at this request. |
I remain unconvinced by the expensive argument. A periodic health check, even if it's happening every five seconds from multiple instances, should be very lightweight in comparison to the load that those instances are producing doing real work. If those health checks are sufficient to cause a noticeable degradation in performance then you have bigger problems.
I find this part of the reasoning more compelling. It sounds like you want the health indicators to have an error threshold such that it takes multiple errors in a given sampling window for a service to be considered down, very much like what's supported by Hystrix. |
For the record, I thought this issue was about tuning the |
FYI: We use a similar approach in our current project. If written a single class which decouples returning the health result from calculating it. All that is needed then is to schedule a trigger for the calculation method. If there is any interest I could upload an example. |
Supporting this would require a lot of additional code on our end. We're not convinced there's enough of a need. You can implement this directly by writing a custom health aggregator. |
@philwebb I'd like to reopen this for consideration to support the following use case. My organization uses Spring Boot services running in a Kubernetes environment. Kubernetes has a concept of two types of health probes: liveness and readiness probes. Liveness probes are used to detect if the service needs to be restarted, while readiness probes detect if the service is ready to be added to load balancers. It would be great if there were a way to perform different checks for readiness and liveness. For example, if an application temporarily loses connectivity to its data source it should be taken out of load balancing, but doesn't need to be restarted. So we would like to include the Note that #8685 already allows querying a single health indicator, but not multiple indicators. Perhaps Spring Boot could provide a way to configure named groups of indicators that could be accessed as |
@ngaya-ll That's an interesting use case, thank you. It feels a bit different to what was proposed in this issue so can you please open a new issue so that we can track it separately? |
@wilkinsona Sure, filed #14022 |
Thanks. |
I've been using the Spring Boot Actuator health check functionality for a while now and really enjoy the easy of writing custom
HealthIndicator
s. However, there is one need that Spring Boot Actuator does not currently support (see this Stack Overflow thread for reference here) that I'd like to suggest as a new feature but first I'd like to explain our challenge.For our applications the
/health
endpoint is queried very frequently (say every 5 seconds) to determine if fail over to another site needs to occur. For frequently executed queries the/heath
endpoint should report the health of the application, and not execute anyHealthIndicator
s that check the health of external dependencies such as external web services because those calls are expensive.Additionally, there are times where health checks against external dependencies are needed during post deployment or troubleshooting scenarios. It's very easy to write custom
HealthIndicators
and having the ability to consolidate it all under one/health
endpoint is advantageous instead of writing additional code to expose this through another endpoint.If this feature request is accepted I'd like to propose that the implementation is opt-in so by default custom
HealthIndicator
s are executed like they are today so users don't find themselves in a situation where aHealthIndicator
is not run when they expect it to be.Finally, if possible, I'd be willing to make the contributions to implement this feature as well.
Thank you!
The text was updated successfully, but these errors were encountered: