Description
Other methods in ClusterAdmin appear to use the retryOnError mechanism, allowing them to retry upon failure (e.g., CreateTopic, DeleteTopic, CreatePartitions, etc.).
However, ListTopics does not implement any retry logic.
As a result, if the call fails because the broker has closed the connection (e.g., after connections.max.idle.ms expires), it is not retried. Instead, the client appears to deregister the broker and register it again (see the logs below). Once the broker has been re-registered, subsequent calls succeed.
Is there a specific reason why ListTopics does not include retry logic? Was this behavior intentional?
Versions
| Sarama | Kafka | Go |
|--------|-------|--------|
| 1.45.1 | 3.7 | 1.23.0 |
Configuration
Logs
time="2025-03-17T17:14:43+09:00" level=debug msg="client/metadata fetching metadata for all topics from broker kafka-broker-test-0:19000\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/metadata got error from broker 0 while fetching metadata: EOF\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="Closed connection to broker kafka-broker-test-0:19000\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/brokers deregistered broker #0 at kafka-broker-test-0:19000"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/metadata fetching metadata for all topics from broker kafka-broker-test-1:19001\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/metadata got error from broker 1 while fetching metadata: write tcp 172.24.1.6:53158->10.161.82.44:19001: write: broken pipe\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="Error while closing connection to broker kafka-broker-test-1:19001: tls: failed to send closeNotify alert (but connection was closed anyway): write tcp 172.24.1.6:53158->10.161.82.44:19001: write: broken pipe\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/brokers deregistered broker #1 at kafka-broker-test-1:19001"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/metadata fetching metadata for all topics from broker kafka-broker-test-2:19002\n"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/brokers registered new broker #0 at kafka-broker-test-0:19000"
time="2025-03-17T17:14:43+09:00" level=debug msg="client/brokers registered new broker #1 at kafka-broker-test-1:19001"
time="2025-03-17T17:18:05+09:00" level=debug msg="client/metadata fetching metadata for all topics from broker kafka-broker-test-0:19000\n"
time="2025-03-17T17:18:05+09:00" level=debug msg="Connected to broker at kafka-broker-test-0:19000 (registered as #0)\n"
time="2025-03-17T17:23:16+09:00" level=debug msg="client/metadata fetching metadata for all topics from broker kafka-broker-test-0:19000\n"
Additional Context