rptest: fix max_connections swarm sizing for librdkafka >= 2.10.0 #30848
The test sized the swarm assuming each producer holds num_brokers + 1 connections, where the +1 was the persistent bootstrap-broker connection. librdkafka removed that connection in v2.10.0 (confluentinc/librdkafka#4557) by keying brokers on id rather than host:port. client-swarm now bundles librdkafka >= 2.10.0, so each producer holds exactly num_brokers connections, leaving the swarm under-provisioned and unable to reach the advertised connection target. CORE-16659
d95c950 to a24c3ba
Pull request overview
Updates the OMBValidationTest.test_max_connections swarm sizing heuristic to reflect librdkafka behavior changes (>= 2.10.0) where producers no longer maintain an extra persistent bootstrap-broker connection, preventing under-provisioned swarms and connection-target shortfalls in the max-connections cloud validation test.
Changes:
- Adjusts the assumed per-producer connection count from num_brokers + 1 to num_brokers.
- Expands the inline comment to document the historical bootstrap connection and the librdkafka v2.10.0 change that removed it.
ballard26 left a comment
LGTM, I assume we're only going to be using this with librdkafka versions >= 2.10 right?
Basically yes, since the swarm version is pinned here in this same repo in ducktape-deps, it was upgraded in #30671 earlier this month. It got through CI since these don't run in PRs.
CI test results
test results on build#86005
BIG. This also means one less thread?
OMBValidationTest.test_max_connections sizes the swarm of producers from an assumption about how many connections each swarm producer holds. It assumed num_brokers + 1, where the + 1 was a persistent connection to the bootstrap broker.
librdkafka removed that separate bootstrap-broker connection in v2.10.0 (confluentinc/librdkafka#4557): brokers are now keyed by id rather than host:port, so the bootstrap entry is merged into the learned broker list and a producer ends up holding exactly one connection per broker. client-swarm now bundles librdkafka >= 2.10.0, so the old + 1 over-estimated the per-producer connection count and the swarm was provisioned with too few producers to ever reach the advertised connection target, causing the test to fail with e.g. Failed to reach target connections, actual: ~18560, target: 24723.
This drops the stale + 1 so conn_per_swarm_producer == num_brokers, which re-sizes producer_per_swarm_node upward and lets the swarm actually reach the target connection count.
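A rough sketch of the sizing arithmetic described above, in Python. The helper names, the single swarm node, and the 3-broker cluster are assumptions for illustration: the PR does not state the broker count or node count, and the real test may include margins or other terms not shown here. Only the target of 24723 comes from the failure message.

```python
def producers_per_node(target_connections: int,
                       conn_per_producer: int,
                       swarm_nodes: int) -> int:
    """Producers each swarm node must run so the swarm reaches the target.

    Hypothetical helper: integer ceiling of
    target / (swarm_nodes * conn_per_producer).
    """
    per_node_capacity = swarm_nodes * conn_per_producer
    return -(-target_connections // per_node_capacity)


def connections_reached(producers_per_swarm_node: int,
                        swarm_nodes: int,
                        actual_conn_per_producer: int) -> int:
    """Connections the swarm really opens, given what each producer holds."""
    return producers_per_swarm_node * swarm_nodes * actual_conn_per_producer


# Assumed values (not stated in the PR): 3 brokers, 1 swarm node.
num_brokers = 3
swarm_nodes = 1
target = 24723  # taken from the failure message

# With librdkafka >= 2.10.0 each producer holds one connection per broker.
actual_conn_per_producer = num_brokers

# Old heuristic: num_brokers + 1 (stale bootstrap-broker connection).
old_producers = producers_per_node(target, num_brokers + 1, swarm_nodes)
old_reached = connections_reached(old_producers, swarm_nodes,
                                  actual_conn_per_producer)

# Fixed heuristic: conn_per_swarm_producer == num_brokers.
new_producers = producers_per_node(target, num_brokers, swarm_nodes)
new_reached = connections_reached(new_producers, swarm_nodes,
                                  actual_conn_per_producer)

print(f"old: {old_producers} producers -> {old_reached} connections")
print(f"new: {new_producers} producers -> {new_reached} connections")
```

Under these assumed values the stale + 1 provisions 6181 producers and tops out at 18543 connections, roughly 75% of the target and in the same range as the ~18560 reported in the failure message. Dropping it provisions 8241 producers, which is enough to reach 24723.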
Backports Required
Release Notes