Skip to content

JmsHealthIndicator can hang in case failover with infinite max reconnects is used with ActiveMQ #10809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
filiphr opened this issue Oct 29, 2017 · 10 comments
Assignees
Labels
type: bug A general bug
Milestone

Comments

@filiphr
Copy link
Contributor

filiphr commented Oct 29, 2017

First of all, I have to say that it would be wrong to use infinite max reconnects with ActiveMQ. However, that is the default starting from ActiveMQ 5.6 (see failover transport-reference).

In case the default is used then the actuator endpoint /health will hang forever. Do you think that something like this should be addressed in the JmsHealthIndicator itself? I think that it would be good if the endpoint doesn't hang forever.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Oct 29, 2017
@snicoll
Copy link
Member

snicoll commented Oct 29, 2017

@filiphr I am not sure we can do anything about it really. If returning a valid javax.jms.Connection never returns when the remote broker is down, this is beyond what the JMS API allows us to invoke.

@filiphr
Copy link
Contributor Author

filiphr commented Oct 29, 2017

@snicoll sorry I was not clearer. The connection.start() is the place where it blocks. The connection is correctly created. javax.jms.Connection has setExceptionListener that catches exceptions (and I am 90% sure that it catches those as well). I think that Spring Boot can do something with them (not sure whether it is OK or not though). My idea was something like:

protected void doHealthCheck(Health.Builder builder) throws Exception {
    try (Connection connection = this.connectionFactory.createConnection()) {
        ExceptionListener original = connection.getExceptionListener();
        ExceptionListener bootListener = new ExceptionListener() {
            @Override
            public void onException(JMSException exception) {
                if (original != null) {
                    original.onException(exception);
                }
                // spring boot can do something here 
                // (some counts, or have some timer how long it should try, or something similar)
                // and close the connection
            }
        }
        connection.start();
	builder.up().withDetail("provider",
		     connection.getMetaData().getJMSProviderName());
	}
}

How exactly Spring Boot will decide to close the connection I am not sure yet. Maybe this is out of scope for Spring Boot. I just wanted to check if there might be something that you can do. I was really confused why the /health endpoint was being blocked.

@snicoll
Copy link
Member

snicoll commented Oct 30, 2017

@filiphr I am not keen to use a feature that has this kind of warning:

This method must not be used in a Java EE web or EJB application. Doing so may cause a {@code JMSException} to be thrown though this is not guaranteed.

Also the Javadoc of ExceptionListener states that

A JMS provider should attempt to resolve connection problems itself before it notifies the client of them.

I'll try to look at other options. If you have a working patch, please submit a PR.

@snicoll snicoll added priority: normal type: enhancement A general enhancement and removed status: waiting-for-triage An issue we've not yet triaged labels Oct 30, 2017
@filiphr
Copy link
Contributor Author

filiphr commented Oct 30, 2017

@snicoll I didn't notice that, and you are right it is not good to use something like that.
The thing is that the JMS provider is trying to resolve the connection problem (infinitely trying to reconnect 😄).

I had another idea and that is to run it wrapped into some separate Thread that we would block on. If it doesn't return in some time we can write "down" or "unstable" (I don't know what is there in the Actuators) and just write that.

I can prepare an patch and PR for this in the next few days.

@philwebb philwebb added the status: waiting-for-feedback We need additional information before we can continue label Oct 31, 2017
filiphr added a commit to filiphr/spring-boot that referenced this issue Oct 31, 2017
Starting a JMS connection may block infinitely if the factory is not correctly setup.
For example if ActiveMQ with failover transport is used then the default connection
retries infinitely, which leads to ActiveMQ trying to establish a connection and
never succedding in case none of the configured brokers is up.

Fixes spring-projectsgh-10809
@filiphr
Copy link
Contributor Author

filiphr commented Oct 31, 2017

I've created a PR as promise. It is WIP and I am open to discussions about it

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Oct 31, 2017
@philwebb philwebb added type: bug A general bug and removed status: feedback-provided Feedback has been provided type: enhancement A general enhancement labels Nov 15, 2017
@philwebb
Copy link
Member

See discussion on #10853

@wilkinsona
Copy link
Member

wilkinsona commented Nov 15, 2017

Inspired somewhat by the suggestion from the ActiveMQ team, I'd like to know what happens if we create the connection and then pass it to another thread that monitors it. The original thread can then call start() which may block indefinitely. My hope is that, if the monitoring thread detects that the connection hasn't started within 5 seconds (say) and calls close(), the call to start() will then throw an exception.

@wilkinsona
Copy link
Member

The idea above appears to work. With the following changes:

diff --git a/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java b/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
index 320d23b900..7fdae6c2fd 100644
--- a/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
+++ b/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/jms/JmsHealthIndicator.java
@@ -16,9 +16,15 @@

 package org.springframework.boot.actuate.jms;

+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+
 import javax.jms.Connection;
 import javax.jms.ConnectionFactory;

+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+
 import org.springframework.boot.actuate.health.AbstractHealthIndicator;
 import org.springframework.boot.actuate.health.Health;
 import org.springframework.boot.actuate.health.HealthIndicator;
@@ -31,6 +37,8 @@ import org.springframework.boot.actuate.health.HealthIndicator;
  */
 public class JmsHealthIndicator extends AbstractHealthIndicator {

+       private final Log logger = LogFactory.getLog(JmsHealthIndicator.class);
+
        private final ConnectionFactory connectionFactory;

        public JmsHealthIndicator(ConnectionFactory connectionFactory) {
@@ -39,8 +47,25 @@ public class JmsHealthIndicator extends AbstractHealthIndicator {

        @Override
        protected void doHealthCheck(Health.Builder builder) throws Exception {
+               CountDownLatch started = new CountDownLatch(1);
                try (Connection connection = this.connectionFactory.createConnection()) {
+                       new Thread(() -> {
+                               try {
+                                       if (!started.await(5, TimeUnit.SECONDS)) {
+                                               this.logger.warn("Connection failed to start within 5 seconds "
+                                                               + "and will be closed.");
+                                               connection.close();
+                                       }
+                               }
+                               catch (InterruptedException ex) {
+                                       Thread.currentThread().interrupt();
+                               }
+                               catch (Exception ex) {
+                                       // Continue
+                               }
+                       }).start();
                        connection.start();
+                       started.countDown();
                        builder.up().withDetail("provider",
                                        connection.getMetaData().getJMSProviderName());
                }

The request for /application/health blocks for 5 seconds and then responds with a 503:

$ http :8080/application/status
HTTP/1.1 503
Connection: close
Content-Type: application/vnd.spring-boot.actuator.v2+json;charset=UTF-8
Date: Mon, 20 Nov 2017 11:14:58 GMT
Transfer-Encoding: chunked

{
    "status": "DOWN"
}

The following is logged by the application:

2017-11-20 11:20:24.308  WARN 38450 --- [       Thread-4] o.s.boot.actuate.jms.JmsHealthIndicator  : Connection failed to start within 5 seconds and will be closed.
2017-11-20 11:20:24.317  WARN 38450 --- [nio-8080-exec-1] o.s.boot.actuate.jms.JmsHealthIndicator  : Health check failed

javax.jms.JMSException: Stopped.
	at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:54) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1399) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1478) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.apache.activemq.ActiveMQConnection.start(ActiveMQConnection.java:527) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.springframework.boot.actuate.jms.JmsHealthIndicator.doHealthCheck(JmsHealthIndicator.java:67) ~[classes/:na]
	at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:43) ~[classes/:na]
	at org.springframework.boot.actuate.health.CompositeHealthIndicator.health(CompositeHealthIndicator.java:68) [classes/:na]
	at org.springframework.boot.actuate.health.StatusEndpoint.health(StatusEndpoint.java:43) [classes/:na]
	at org.springframework.boot.actuate.health.StatusEndpointWebExtension.getHealth(StatusEndpointWebExtension.java:44) [classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
	at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:223) [spring-core-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.boot.actuate.endpoint.reflect.ReflectiveOperationInvoker.invoke(ReflectiveOperationInvoker.java:68) [classes/:na]
	at org.springframework.boot.actuate.endpoint.web.servlet.WebMvcEndpointHandlerMapping$OperationHandler.handle(WebMvcEndpointHandlerMapping.java:133) [classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:871) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:777) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:991) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:925) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:978) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:870) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:635) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:855) [spring-webmvc-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:742) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) [tomcat-embed-websocket-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.boot.actuate.trace.WebRequestTraceFilter.doFilterInternal(WebRequestTraceFilter.java:119) [classes/:na]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:108) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:81) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.filterWithMetrics(WebMvcMetricsFilter.java:95) [classes/:na]
	at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:72) [classes/:na]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.0.2.BUILD-SNAPSHOT.jar:5.0.2.BUILD-SNAPSHOT]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1459) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.23.jar:8.5.23]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.io.IOException: Stopped.
	at org.apache.activemq.transport.ResponseCorrelator.stop(ResponseCorrelator.java:131) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.apache.activemq.util.ServiceSupport.dispose(ServiceSupport.java:43) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.apache.activemq.ActiveMQConnection.close(ActiveMQConnection.java:722) ~[activemq-client-5.15.2.jar:5.15.2]
	at org.springframework.boot.actuate.jms.JmsHealthIndicator.lambda$0(JmsHealthIndicator.java:57) ~[classes/:na]
	... 1 common frames omitted

We should perhaps consider making the 5 second timeout configurable.

@ghost
Copy link

ghost commented Nov 24, 2017

@wilkinsona
What happens if doHealthCheck() is called multiple times within the 5 seconds?

Maybe you can make the connection and CountDownLatch a field (volatile) and only create a new connection if there is no connection running.

@wilkinsona
Copy link
Member

I considered something like that but I'm not sure that it's necessary. The endpoint infrastructure provides support for caching the response for a configurable time-to-live. With appropriate configuration for the health endpoint, multiple calls to it within the time-to-live will only result in each health indicator being called once.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug A general bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants