Warn when Redis pending timeout is shorter than worker request timeout (+buffer)#2181
Warn when Redis pending timeout is shorter than worker request timeout (+buffer)#2181
Conversation
svix-jbrown
left a comment
There was a problem hiding this comment.
good catch; one style nit (I believe our style is to prefer matches!() over batch blocks for booleans)
I wonder if we could detect the equivalent problem for SQS and RabbitMQ?
Use matches!() for queue type guard Co-authored-by: James Brown <jbrown@svix.com>
There was a problem hiding this comment.
Rustfmt is keeping me humble today 😅.
Re: SQS/RabbitMQ — I took a look at the OSS backends. In this repo we only have Redis and RabbitMQ (no SQS here).
RabbitMQ doesn’t rely on visibility/pending timeouts — messages are only requeued on NACK or if the consumer/channel dies — so the “pending timeout < request timeout” mismatch doesn’t really apply there. That’s why I kept the guardrail Redis-only.
If SQS exists in Cloud/EE, then doing a similar check against the SQS visibility timeout vs worker_request_timeout would probably make sense on that side.
Context
In OSS/self-hosted setups, operators sometimes increase
worker_request_timeoutto accommodate slow endpoints.The Redis visibility timeout (
redis_pending_duration_secs) is configuredseparately and remains global.
If
worker_request_timeoutis increased for slow deliveries butredis_pending_duration_secsis not adjusted accordingly, Redis mayre-queue a task while the original worker is still processing it.
That can result in overlapping deliveries of the same message.
Change
This PR adds a startup guardrail:
If using a Redis-based queue and:
a
tracing::warn!is emitted during startup.Why a Warning (Not a Hard Error)
The configuration is technically valid and may be intentional.
This change does not modify runtime behavior.
It simply highlights a potentially unsafe configuration
that could lead to duplicate in-flight deliveries when
handling slow endpoints.
Scope