In what version(s) of Spring Integration are you seeing this issue?
6.5.7
Describe the bug
We have a Spring Boot application (an HTTP API), which sends commands to some specialized hardware we connect through raw TCP/IP connections. We use spring-integration-ip to send the commands to the specialized hardware, using persistent shared connections.
The TCP/IP connections had always worked perfectly, but when we updated from spring-integration-ip 6.3.9 to 6.5.7, we started seeing sporadic read errors. Through some log debugging, we were able to track the issue down to this exception in the org.springframework.integration.ip.tcp.connection.TcpNioConnection class:
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.CountDownLatch.await(long, java.util.concurrent.TimeUnit)" because "this.writingLatch" is null
at org.springframework.integration.ip.tcp.connection.TcpNioConnection.convert(TcpNioConnection.java:380)
at org.springframework.integration.ip.tcp.connection.TcpNioConnection.run(TcpNioConnection.java:259)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
To Reproduce
I don't fully understand how all that code works, so I was not able to pin down the exact conditions that trigger the problem. My best guess is that it is a race condition on the TcpNioConnection.writingLatch field, and that it is related to the following change: d07dc70
All I can do at the moment is provide the following information about our application and the environment where it runs, in case it is useful to somebody else:
- An application instance (we have multiple nodes, but that shouldn't matter) uses just 2 persistent, shared TCP connections
- It happens for only about 2 out of 200000 requests a day
- The TCP connection usually responds in less than 4 milliseconds
- The TCP connection configuration in our application code can be summarized like this:
  - 2 TcpOutboundGateway instances
    - remoteTimeout = 2 seconds
    - requestTimeout = 2 seconds
  - Each TcpOutboundGateway instance is given its own FailoverClientConnectionFactory instance
    - refreshSharedInterval = 5 minutes
    - closeOnRefresh = true
  - Each FailoverClientConnectionFactory instance is given an array containing 1 TcpNioClientConnectionFactory instance
    - usingDirectBuffers = true
    - singleUse = false
    - soKeepAlive = true
    - tcpNioConnectionSupport = DefaultTcpNioConnectionSupport
    - serializer = ByteArrayLengthHeaderSerializer
    - deserializer = ByteArrayLengthHeaderSerializer
- spring-integration-ip 6.5.7
- Spring Boot 3.5.11
- Java 21 (Amazon Corretto)
- Spring Boot's virtual threads support (spring.threads.virtual.enabled) is not enabled
- Environment and resources where the application instance runs:
  - Docker image on Kubernetes
  - resource limits:
  - Docker image based on the amazoncorretto:21 alpine image
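To make the configuration above easier to read, here is a rough sketch of how one of the two gateways is wired. It is a simplified approximation and not a copy of our production code: the class name, the method name, and the host and port parameters are placeholders, and sharing one ByteArrayLengthHeaderSerializer instance for both directions is an assumption made for brevity. In the real application these objects are registered as Spring beans so that their lifecycle is managed by the container.

```java
import java.util.List;

import org.springframework.integration.ip.tcp.TcpOutboundGateway;
import org.springframework.integration.ip.tcp.connection.AbstractClientConnectionFactory;
import org.springframework.integration.ip.tcp.connection.DefaultTcpNioConnectionSupport;
import org.springframework.integration.ip.tcp.connection.FailoverClientConnectionFactory;
import org.springframework.integration.ip.tcp.connection.TcpNioClientConnectionFactory;
import org.springframework.integration.ip.tcp.serializer.ByteArrayLengthHeaderSerializer;

// Hypothetical helper class; the name and the host/port parameters are placeholders.
public class HardwareGatewaySketch {

    public static TcpOutboundGateway buildGateway(String host, int port) {
        // Assumption: one serializer instance is used for both directions.
        ByteArrayLengthHeaderSerializer codec = new ByteArrayLengthHeaderSerializer();

        // The NIO client factory that owns the persistent, shared connection.
        TcpNioClientConnectionFactory nioFactory = new TcpNioClientConnectionFactory(host, port);
        nioFactory.setUsingDirectBuffers(true);
        nioFactory.setSingleUse(false);
        nioFactory.setSoKeepAlive(true);
        nioFactory.setTcpNioConnectionSupport(new DefaultTcpNioConnectionSupport());
        nioFactory.setSerializer(codec);
        nioFactory.setDeserializer(codec);

        // Failover factory wrapping exactly one delegate.
        List<AbstractClientConnectionFactory> delegates = List.of(nioFactory);
        FailoverClientConnectionFactory failover = new FailoverClientConnectionFactory(delegates);
        failover.setRefreshSharedInterval(5 * 60 * 1000L); // 5 minutes, in milliseconds
        failover.setCloseOnRefresh(true);

        // Gateway with 2-second timeouts, in milliseconds.
        TcpOutboundGateway gateway = new TcpOutboundGateway();
        gateway.setConnectionFactory(failover);
        gateway.setRemoteTimeout(2000L);
        gateway.setRequestTimeout(2000L);
        return gateway;
    }
}
```

The sample repository linked below contains the actual configuration used to reproduce the problem.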
Expected behavior
TcpNioConnection shouldn't fail to read and process the response data from the network (assuming the connection is healthy).
Sample
https://github.com/emuller84/spring-integration-ip-bug-demo-20260312
It's just a sample API application with the same TCP client configuration we have in the real application where I'm seeing this error. It also includes a simple Python-based TCP server that the application connects to, and a Postman collection that can be used to run a load test against the app to try to trigger the error. It does NOT provide a unit test or anything similar that triggers the issue, since I was not able to figure out the exact error conditions. (More details are in that repository's README file.)
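In case somebody prefers to test against their own server instead of the Python one in the repository, this is a small stand-alone sketch of the wire format as I understand it: with its default settings, ByteArrayLengthHeaderSerializer frames each message as a 4-byte big-endian length followed by the payload. The class below is only an illustration written with the JDK; it is not the library's implementation, and the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of 4-byte length-header framing (not library code).
public class LengthHeaderFraming {

    // Writes a 4-byte big-endian length followed by the payload bytes.
    static byte[] encode(byte[] payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(payload.length); // DataOutputStream writes ints big-endian
        out.write(payload);
        out.flush();
        return bytes.toByteArray();
    }

    // Reads one frame: the 4-byte length first, then exactly that many payload bytes.
    static byte[] decode(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative length header: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until the whole payload has arrived
        return payload;
    }

    public static void main(String[] args) throws IOException {
        byte[] frame = encode("PING".getBytes(StandardCharsets.US_ASCII));
        byte[] decoded = decode(new DataInputStream(new ByteArrayInputStream(frame)));
        // prints "8 bytes on the wire, payload = PING"
        System.out.println(frame.length + " bytes on the wire, payload = "
                + new String(decoded, StandardCharsets.US_ASCII));
    }
}
```

A test server only needs to read and answer frames in this shape for the client configuration above to accept its responses.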