Reduce backend load by using passive health indicators #6407

mp911de · 2016-07-18T07:16:09Z

Health checks are cheap checks that validate the health of a particular system. They are typically called regularly and often (at 5 to 10-second intervals) by monitoring systems, hardware load balancers, and other clients.

A health check should primarily report the health of the current system by using all known aspects of that system to tell whether it's healthy. Ideally, it should not obtain those details from external sources. While it's convenient to retrieve external data to compose the health check state, external access generates additional load. This can be totally fine in some cases, but in other scenarios (e.g. many application instances creating load on a few database hosts/MongoDB servers/...), health checks can cause load that helps to kill otherwise healthy backends.

Health checks should also be able to report on locally available data (passive health check) without invoking operations on remote hosts. This could be done by using local metrics (JDBC/JMS/generic connection pool utilization, connection state for persistent TCP connections).
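To illustrate the idea, here is a minimal sketch of a passive check that derives its status only from locally held pool metrics. All names (`PoolSnapshot`, `PassiveStatus`, `PassivePoolHealthCheck`) and the threshold rule are assumptions made up for this example; they are not Spring Boot types, and a real integration would have to read the numbers from whatever the connection pool in use actually exposes.

```java
import java.util.function.Supplier;

/** Hypothetical snapshot of local connection pool metrics (not a Spring Boot type). */
final class PoolSnapshot {
    final int active;          // connections currently in use
    final int max;             // configured pool size
    final int pendingThreads;  // callers waiting for a connection

    PoolSnapshot(int active, int max, int pendingThreads) {
        this.active = active;
        this.max = max;
        this.pendingThreads = pendingThreads;
    }
}

/** Hypothetical status type for this sketch. */
enum PassiveStatus { UP, DOWN }

/** Passive check: reads local metrics only and never calls the remote backend. */
final class PassivePoolHealthCheck {
    private final Supplier<PoolSnapshot> metrics;
    private final double maxUtilization;

    PassivePoolHealthCheck(Supplier<PoolSnapshot> metrics, double maxUtilization) {
        this.metrics = metrics;
        this.maxUtilization = maxUtilization;
    }

    PassiveStatus check() {
        PoolSnapshot s = metrics.get();
        if (s.max <= 0) {
            return PassiveStatus.DOWN; // a pool without capacity cannot serve requests
        }
        double utilization = (double) s.active / s.max;
        // Assumed rule: report DOWN only when the pool is nearly exhausted
        // and callers are already queueing for connections.
        boolean saturated = utilization >= maxUtilization && s.pendingThreads > 0;
        return saturated ? PassiveStatus.DOWN : PassiveStatus.UP;
    }
}
```

The trade-off is that such a check can only reflect what the application has already observed locally; it will not detect a backend failure before real traffic runs into it.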

wilkinsona · 2016-07-18T07:23:18Z

See also #3441 where I said:

I remain unconvinced by the expensive argument. A periodic health check, even if it's happening every five seconds from multiple instances, should be very lightweight in comparison to the load that those instances are producing doing real work. If those health checks are sufficient to cause a noticeable degradation in performance then you have bigger problems.

I'm still unconvinced.

mp911de · 2016-07-18T07:50:21Z

I agree with the point:

If those health checks are sufficient to cause a noticeable degradation in performance then you have bigger problems.

Performance degradation caused by lightweight health checks only becomes visible when the application is under high load. Not all applications run at the upper end of their capacity. If your system experiences higher load and the backend services are maxed out, they start queueing work and requests take longer. Health checks that use the same application resources add their load to the existing workload.

In simple environments, you just add more CPU/RAM/disks/machines and your problems are (naively speaking) deferred. There are other scenarios in which one central database host (or a shared host) is used by many clients and can't grow any larger or can't easily be migrated to its own hosts. That's further proof of

then you have bigger problems

In such a case, I'd want my software to support me with a different scope/feature rather than requiring me to implement health checks on my own.

Adding further indicator metrics (pool usage, connection state) enhances health indicators. Being able to enable or disable health check modes lets users adapt the checks to their needs.
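As a rough sketch of what switchable modes could look like, the check below delegates to either an active probe (one that would touch the backend) or a passive probe (local state only), depending on a mode setting. `HealthCheckMode` and `SwitchableHealthCheck` are hypothetical names for this example and do not exist in Spring Boot.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical mode switch for this sketch. */
enum HealthCheckMode { ACTIVE, PASSIVE }

/** Delegates to an active or a passive probe depending on the configured mode. */
final class SwitchableHealthCheck {
    private final BooleanSupplier activeProbe;   // e.g. a validation query against the backend
    private final BooleanSupplier passiveProbe;  // e.g. local pool usage or connection state
    private volatile HealthCheckMode mode;       // volatile so a mode change is visible to all threads

    SwitchableHealthCheck(BooleanSupplier activeProbe,
                          BooleanSupplier passiveProbe,
                          HealthCheckMode initialMode) {
        this.activeProbe = activeProbe;
        this.passiveProbe = passiveProbe;
        this.mode = initialMode;
    }

    void setMode(HealthCheckMode mode) {
        this.mode = mode;
    }

    boolean isHealthy() {
        // Only the probe for the selected mode is invoked, so PASSIVE mode
        // generates no load on the remote host.
        return mode == HealthCheckMode.ACTIVE
                ? activeProbe.getAsBoolean()
                : passiveProbe.getAsBoolean();
    }
}
```

With something like this, an operator could run in passive mode while a shared backend is under pressure and switch back to active probing afterwards.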

philwebb · 2016-07-18T18:06:42Z

We need to balance any new features that we add against the additional complexity that they bring. My feeling is that the current HealthIndicator code works well for most users and I'm not keen to make it any more complex.

If our out-of-the-box solution isn't suitable, then it's possible to add additional Endpoints to do something different. I'm happy to make reuse of the HealthEndpoint easier if needed, but I don't think we should change the core design.

spring-projects-issues added the status: waiting-for-triage (An issue we've not yet triaged) label Jul 18, 2016

philwebb closed this as completed Jul 18, 2016

philwebb added status: declined (A suggestion or change that we don't feel we should currently apply) and removed status: waiting-for-triage (An issue we've not yet triaged) labels Jul 18, 2016
