Description
Hey Svix team, I've been digging through the repo and found something I'd like to fix. I just want to make sure I'm on the right track before I open a PR.
Right now, when a destination server sends back a 429 or 503 with a Retry-After header, Svix ignores the header and retries on its own schedule. The server may be asking for 120 seconds, but we might hit it again after 30, which wastes retry attempts and doesn't help the server recover.
My proposal is to use whichever is larger: the server's requested delay or our internal backoff. This would only ever increase the delay, never cause earlier retries. We'd also need a cap, since a server could send Retry-After: 86400 and we don't want to respect that blindly. Tom mentioned something like retry_schedule_delay * 2 + 2hrs, which sounds reasonable to me, but I'm curious what the team thinks.
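To make the proposal concrete, here is a rough sketch of the delay calculation. It is written in Python purely for illustration and is not Svix's actual code: `next_retry_delay` and its parameter names are hypothetical, and the cap uses the retry_schedule_delay * 2 + 2hrs formula suggested above, which is still open for discussion.

```python
from datetime import timedelta
from typing import Optional


def next_retry_delay(backoff: timedelta, retry_after: Optional[timedelta]) -> timedelta:
    """Pick the delay before the next attempt (hypothetical helper).

    backoff     -- the delay the internal retry schedule would use anyway
    retry_after -- the delay requested by the server, or None if absent
    """
    if retry_after is None:
        return backoff
    # Assumed cap: retry_schedule_delay * 2 + 2 hours.
    cap = backoff * 2 + timedelta(hours=2)
    # Clamp the server's request to the cap, then never go below our own backoff.
    return max(backoff, min(retry_after, cap))
```

With a 30 second backoff, a Retry-After of 120 seconds yields 120 seconds, a Retry-After of 10 seconds still yields 30 seconds, and a Retry-After of 86400 seconds is clamped to 2 hours and 1 minute.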
Before I touch anything, three questions. Should the cap be relative to our backoff stage or a fixed ceiling? Should we support both the seconds and HTTP-date formats of Retry-After, or just seconds? And should this apply to 429/503 only, or to other 5xx responses too?
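For the format question, here is a minimal sketch of what supporting both forms could look like, again in Python for illustration only. `parse_retry_after` is a hypothetical name, and returning None for unparseable values (so the caller falls back to the normal backoff) is an assumption, not a settled decision.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: str, now: datetime) -> Optional[timedelta]:
    """Parse a Retry-After header value into a delay (hypothetical helper).

    Accepts either a non-negative integer number of seconds or an HTTP-date.
    `now` must be a timezone-aware datetime. Returns None if the value is
    unusable.
    """
    value = value.strip()
    # Form 1: delay in seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        try:
            return timedelta(seconds=int(value))
        except OverflowError:
            # Absurdly large values are treated as unusable (assumption).
            return None
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without a zone as UTC (assumption).
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means "retry now", so never return a negative delay.
    return max(when - now, timedelta(0))
```

Supporting only the seconds form would mean dropping the second branch, at the cost of ignoring servers that send a date.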