TcpNioConnection.convert() error: NullPointerException when calling this.writingLatch.await() #10876

@emuller84

In what version(s) of Spring Integration are you seeing this issue?

6.5.7

Describe the bug

We have a Spring Boot application (an HTTP API) that sends commands to some specialized hardware, which we connect to through raw TCP/IP connections. We use spring-integration-ip to send the commands to that hardware over persistent shared connections.

The TCP/IP connections had always worked perfectly, but when we updated spring-integration-ip from 6.3.9 to 6.5.7, we started seeing sporadic read errors. Through some log debugging, we were able to track the issue down to this exception in the org.springframework.integration.ip.tcp.connection.TcpNioConnection class:

java.lang.NullPointerException: Cannot invoke "java.util.concurrent.CountDownLatch.await(long, java.util.concurrent.TimeUnit)" because "this.writingLatch" is null
	at org.springframework.integration.ip.tcp.connection.TcpNioConnection.convert(TcpNioConnection.java:380)
	at org.springframework.integration.ip.tcp.connection.TcpNioConnection.run(TcpNioConnection.java:259)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

To Reproduce

I don't fully understand how all of that code works, and I was not able to discover the exact conditions that trigger the problem. I can only guess that it's a race condition on the TcpNioConnection.writingLatch field, and that it seems to be related to the following change: d07dc70
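To illustrate the kind of race I have in mind, here is a minimal standalone sketch. It is NOT the actual TcpNioConnection code, and I have not verified that the library follows this pattern; the class, field, and method names (LatchRaceSketch, latch, clear, awaitUnsafe, awaitSafe) are all hypothetical. The Runnable argument only simulates another thread running between the null check and the use of the field, so the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchRaceSketch {

    // Hypothetical stand-in for a latch field shared between two threads.
    private volatile CountDownLatch latch = new CountDownLatch(1);

    // What the "other" thread might do: release any waiters, then clear the field.
    public void clear() {
        CountDownLatch current = this.latch;
        if (current != null) {
            current.countDown();
        }
        this.latch = null;
    }

    // Racy variant: the field is read twice, so it can become null
    // between the null check and the await() call.
    public boolean awaitUnsafe(Runnable betweenCheckAndUse) {
        try {
            if (this.latch != null) {
                betweenCheckAndUse.run(); // simulates the other thread running here
                return this.latch.await(10, TimeUnit.MILLISECONDS); // NPE if cleared
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Variant that reads the field once into a local variable,
    // so a concurrent clear() cannot null the reference being used.
    public boolean awaitSafe(Runnable betweenCheckAndUse) {
        try {
            CountDownLatch local = this.latch;
            if (local != null) {
                betweenCheckAndUse.run();
                return local.await(10, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        LatchRaceSketch unsafe = new LatchRaceSketch();
        try {
            unsafe.awaitUnsafe(unsafe::clear);
            System.out.println("unsafe: no exception");
        } catch (NullPointerException e) {
            System.out.println("unsafe: NullPointerException");
        }

        LatchRaceSketch safe = new LatchRaceSketch();
        System.out.println("safe: " + safe.awaitSafe(safe::clear));
    }
}
```

In a real application the window between the check and the use would be tiny, which would be consistent with how rarely we see the error, but again this is only a guess about the mechanism.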

All I can do at the moment is provide the following information about our application and the environment where it runs; maybe it's useful to somebody else:

  • An application instance (we have multiple nodes, but that shouldn't matter) uses just 2 persistent, shared TCP connections
  • It happens in only about 2 out of 200,000 requests a day
  • The TCP connection usually responds in less than 4 milliseconds
  • The TCP connection configuration in our application code can be summarized as follows:
    • 2 TcpOutboundGateway instances
      • remoteTimeout = 2 seconds
      • requestTimeout = 2 seconds
    • Each TcpOutboundGateway instance has its own FailoverClientConnectionFactory instance
      • refreshSharedInterval = 5 minutes
      • closeOnRefresh = true
    • Each FailoverClientConnectionFactory instance is given an array containing a single TcpNioClientConnectionFactory instance
      • usingDirectBuffers = true
      • singleUse = false
      • soKeepAlive = true
      • tcpNioConnectionSupport = DefaultTcpNioConnectionSupport
      • serializer = ByteArrayLengthHeaderSerializer
      • deserializer = ByteArrayLengthHeaderSerializer
  • spring-integration-ip 6.5.7
  • Spring Boot 3.5.11
  • Java 21 (Amazon Corretto)
  • Spring Boot's virtual threads support (spring.threads.virtual.enabled) is not enabled
  • Environment and resources where the application instance runs:
    • Docker image on Kubernetes
    • resource limits:
      • cpu: 3.0
      • memory: 1000M
    • Docker image based on the amazoncorretto:21 Alpine image

Expected behavior

TcpNioConnection shouldn't fail to read and process the response data from the network (assuming the connection is healthy).

Sample

https://github.com/emuller84/spring-integration-ip-bug-demo-20260312

It's just a sample API application with the same TCP client configuration we have in the real application where I'm seeing this error. It also includes a simple Python-based TCP server for the application to connect to, and a Postman collection that can be used to run a load test against the app to try to trigger the error. It does NOT provide a unit test or anything similar that triggers the issue; I was not able to figure out the exact error conditions for that. (More details are in that repository's README file.)
