-
Notifications
You must be signed in to change notification settings - Fork 41.2k
JmsHealthIndicator can hang in case failover with infinite max reconnects is used with ActiveMQ #10809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@filiphr I am not sure we can do anything about it really. If returning a valid |
@snicoll sorry I was not clearer. The protected void doHealthCheck(Health.Builder builder) throws Exception {
try (Connection connection = this.connectionFactory.createConnection()) {
ExceptionListener original = connection.getExceptionListener();
ExceptionListener bootListener = new ExceptionListener() {
@Override
public void onException(JMSException exception) {
if (original != null) {
original.onException(exception);
}
// spring boot can do something here
// (some counts, or have some timer how long it should try, or something similar)
// and close the connection
}
}
connection.start();
builder.up().withDetail("provider",
connection.getMetaData().getJMSProviderName());
}
} How exactly Spring Boot will decide to close the connection I am not sure yet. Maybe this is out of scope for Spring Boot. I just wanted to check if there might be something that you can do. I was really confused why the |
@filiphr I am not keen to use a feature that has this kind of warning:
Also the Javadoc of
I'll try to look at other options. If you have a working patch, please submit a PR. |
@snicoll I didn't notice that, and you are right it is not good to use something like that. I had another idea and that is to run it wrapped into some separate Thread that we would block on. If it doesn't return in some time we can write "down" or "unstable" (I don't know what is there in the Actuators) and just write that. I can prepare an patch and PR for this in the next few days. |
Starting a JMS connection may block infinitely if the factory is not correctly setup. For example if ActiveMQ with failover transport is used then the default connection retries infinitely, which leads to ActiveMQ trying to establish a connection and never succedding in case none of the configured brokers is up. Fixes spring-projectsgh-10809
I've created a PR as promise. It is WIP and I am open to discussions about it |
See discussion on #10853 |
Inspired somewhat by the suggestion from the ActiveMQ team, I'd like to know what happens if we create the connection and then pass it to another thread that monitors it. The original thread can then call |
The idea above appears to work. With the following changes: diff --git a/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java b/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
index 320d23b900..7fdae6c2fd 100644
--- a/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
+++ b/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
@@ -16,9 +16,15 @@
package org.springframework.boot.actuate.jms;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+
import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
@@ -31,6 +37,8 @@ import org.springframework.boot.actuate.health.HealthIndicator;
*/
public class JmsHealthIndicator extends AbstractHealthIndicator {
+ private final Log logger = LogFactory.getLog(JmsHealthIndicator.class);
+
private final ConnectionFactory connectionFactory;
public JmsHealthIndicator(ConnectionFactory connectionFactory) {
@@ -39,8 +47,25 @@ public class JmsHealthIndicator extends AbstractHealthIndicator {
@Override
protected void doHealthCheck(Health.Builder builder) throws Exception {
+ CountDownLatch started = new CountDownLatch(1);
try (Connection connection = this.connectionFactory.createConnection()) {
+ new Thread(() -> {
+ try {
+ if (!started.await(5, TimeUnit.SECONDS)) {
+ this.logger.warn("Connection failed to start within 5 seconds "
+ + "and will be closed.");
+ connection.close();
+ }
+ }
+ catch (InterruptedException ex) {
+ Thread.currentThread().interrupt();
+ }
+ catch (Exception ex) {
+ // Continue
+ }
+ }).start();
connection.start();
+ started.countDown();
builder.up().withDetail("provider",
connection.getMetaData().getJMSProviderName());
} The request for
The following is logged by the application:
We should perhaps consider making the 5 second timeout configurable. |
@wilkinsona Maybe you can make the connection and CountDownLatch a field (volatile) and only create a new connection if there is no connection running. |
I considered something like that but I'm not sure that it's necessary. The endpoint infrastructure provides support for caching the response for a configurable time-to-live. With appropriate configuration for the health endpoint, multiple calls to it within the time-to-live will only result in each health indicator being called once. |
First of all, I have to say that it would be wrong to use infinite max reconnects with ActiveMQ. However, that is the default starting from ActiveMQ 5.6 (see failover transport-reference).
In case the default is used then the actuator endpoint
/health
will hang forever. Do you think that something like this should be addressed in theJmsHealthIndicator
itself? I think that it would be good if the endpoint doesn't hang forever.The text was updated successfully, but these errors were encountered: