Reduce backend load by using passive health indicators #6407

Closed
mp911de opened this issue Jul 18, 2016 · 3 comments
Labels
status: declined (a suggestion or change that we don't feel we should currently apply)

Comments

@mp911de
Member

mp911de commented Jul 18, 2016

Health checks are cheap checks that validate the health of a particular system. They are typically called regularly and often (at 5 to 10-second intervals) by monitoring systems, hardware load balancers, and other clients.

A health check should primarily report the health of the current system by using all known aspects of that system to tell whether it's healthy. Ideally, it should not obtain those details from external sources. While it's convenient to retrieve external data to compose the health check state, external access generates additional load. This can be totally fine in some cases, but in other scenarios (e.g. many application instances creating load on a few database hosts/MongoDB servers/...), health checks can cause load that helps to kill otherwise healthy backends.

Health checks should also be able to report on locally available data (a passive health check) without invoking operations on remote hosts. This could be done by using local metrics (JDBC/JMS/generic connection pool utilization, connection state for persistent TCP connections).
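
A minimal sketch of what such a passive check might look like, assuming pool metrics are already available in-process. The types `PoolSnapshot`, `PassiveHealth`, and `PassivePoolHealthCheck` are hypothetical and not part of Spring Boot, and the "pool nearly full and callers waiting means DOWN" rule is only an assumed policy. In a real application the same logic could sit inside a `HealthIndicator` and build its result with `Health.up()` / `Health.down()`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical snapshot of locally available pool metrics; building it needs no remote call. */
final class PoolSnapshot {
    final int activeConnections;
    final int maxConnections;
    final int threadsAwaitingConnection;

    PoolSnapshot(int activeConnections, int maxConnections, int threadsAwaitingConnection) {
        this.activeConnections = activeConnections;
        this.maxConnections = maxConnections;
        this.threadsAwaitingConnection = threadsAwaitingConnection;
    }
}

/** Hypothetical result type: a status string plus the details it was derived from. */
final class PassiveHealth {
    final String status; // "UP" or "DOWN"
    final Map<String, Object> details;

    PassiveHealth(String status, Map<String, Object> details) {
        this.status = status;
        this.details = details;
    }
}

/** Derives health purely from local metrics; it never contacts the backend. */
final class PassivePoolHealthCheck {
    private final double maxUtilization; // assumed threshold, e.g. 0.9

    PassivePoolHealthCheck(double maxUtilization) {
        this.maxUtilization = maxUtilization;
    }

    PassiveHealth check(PoolSnapshot snapshot) {
        // Treat a pool with no capacity as fully utilized to avoid dividing by zero.
        double utilization = snapshot.maxConnections == 0
                ? 1.0
                : (double) snapshot.activeConnections / snapshot.maxConnections;

        // Assumed policy: report DOWN only if the pool is nearly exhausted
        // AND callers are already queueing for a connection.
        boolean saturated = utilization >= maxUtilization
                && snapshot.threadsAwaitingConnection > 0;

        Map<String, Object> details = new LinkedHashMap<>();
        details.put("utilization", utilization);
        details.put("threadsAwaitingConnection", snapshot.threadsAwaitingConnection);
        return new PassiveHealth(saturated ? "DOWN" : "UP", details);
    }
}
```

Because the check only reads numbers the application already holds, polling it every few seconds from a load balancer adds no work on the database or broker.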

@spring-projects-issues added the "status: waiting-for-triage" (an issue we've not yet triaged) label Jul 18, 2016
@wilkinsona
Member

wilkinsona commented Jul 18, 2016

See also #3441 where I said:

> I remain unconvinced by the expensive argument. A periodic health check, even if it's happening every five seconds from multiple instances, should be very lightweight in comparison to the load that those instances are producing doing real work. If those health checks are sufficient to cause a noticeable degradation in performance then you have bigger problems.

I'm still unconvinced.

@mp911de
Member Author

mp911de commented Jul 18, 2016

I agree with the point:

> If those health checks are sufficient to cause a noticeable degradation in performance then you have bigger problems.

The performance impact of lightweight health checks only becomes visible when the application is under high load. Not all applications run at the upper end of their capacity. If your system experiences higher load and the backend services are maxed out, they start queueing work and requests take longer. Health checks that use the same application resources add their load to the existing workload.

In simple environments, you just add more CPU/RAM/disks/machines and your problems are (naively speaking) deferred. There are other scenarios in which one central database host (or a shared host) is used by many clients and can't grow any larger or can't easily be migrated to its own hosts. That is further evidence for:

> then you have bigger problems

In such a case, I'd wish my software supported me with a different scope/feature rather than requiring me to implement health checks on my own.

Adding further indicator metrics (pool usage, connection state) enhances health indicators. Being able to enable or disable health check modes gives users a way to adapt to their needs.
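
To illustrate the enable/disable idea, here is a rough sketch of a check that can be switched between an active and a passive mode. `HealthCheckMode` and `SwitchableHealthCheck` are hypothetical names, not existing Spring Boot types, and the two suppliers stand in for whatever remote probe and local metric an application would actually use.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical switch: ACTIVE probes the backend, PASSIVE relies on local data only. */
enum HealthCheckMode { ACTIVE, PASSIVE }

final class SwitchableHealthCheck {
    private final HealthCheckMode mode;
    private final BooleanSupplier remoteProbe; // e.g. runs a validation query against the backend
    private final BooleanSupplier localSignal; // e.g. reads pool usage or connection state

    SwitchableHealthCheck(HealthCheckMode mode,
                          BooleanSupplier remoteProbe,
                          BooleanSupplier localSignal) {
        this.mode = mode;
        this.remoteProbe = remoteProbe;
        this.localSignal = localSignal;
    }

    /** In PASSIVE mode the remote probe is never invoked, so no backend load is generated. */
    boolean isHealthy() {
        return mode == HealthCheckMode.ACTIVE
                ? remoteProbe.getAsBoolean()
                : localSignal.getAsBoolean();
    }
}
```

The mode could be bound to a configuration property so that operators running against a constrained shared backend can opt into the passive variant without writing their own check.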

@philwebb
Member

We need to balance any new features that we add against the additional complexity that they bring. My feeling is that the current HealthIndicator code works well for most users and I'm not keen to make it any more complex.

If our out-of-the-box solution isn't suitable then it's possible to add additional Endpoints to do something different. I'm happy to make changes that make reuse of the HealthEndpoint easier if needed, but I don't think we should change the core design.

@philwebb added the "status: declined" (a suggestion or change that we don't feel we should currently apply) label and removed the "status: waiting-for-triage" (an issue we've not yet triaged) label Jul 18, 2016

4 participants