
t3022: Cap concurrent opus-4-6 dispatches at N per Anthropic account to prevent rate-limit cascade #21579

@marcusquinn


Problem

A single Anthropic account sustains many concurrent sonnet workers but only ~3-4 concurrent opus workers before triggering 429 rate limits. The dispatch logic currently has no per-model concurrency cap. When fill_floor and cascade-elevation both push opus-tier dispatches, all simultaneous opus workers hit 429 within seconds and become 20-min zombies (see GH#21570 follow-up).

Evidence: 3 opus-4-6 workers were killed within the same minute with rate_limit (headless-runtime-metrics.jsonl ts=1777397345-1777397359). Sonnet workers in the same pool succeeded.

How

EDIT: .agents/scripts/pulse-dispatch-engine.sh, function _dff_compute_max_parallel (~lines 712-727). Add a per-model concurrency check:

# Count in-flight workers by model from active worker process list
local opus_inflight
opus_inflight=$(pgrep -f 'opencode.*-m anthropic/claude-opus' | wc -l | tr -d ' ')
local opus_cap=${AIDEVOPS_OPUS_CONCURRENCY_CAP:-4}
if (( opus_inflight >= opus_cap )); then
    # Skip opus-tier candidates this cycle; sonnet candidates still dispatch
    return $opus_cap
fi

EDIT: .agents/scripts/pulse-dispatch-core.sh near line 1119 — when a candidate's resolved model is opus AND opus_inflight >= cap, defer with reason="opus_concurrency_cap" instead of dispatching.
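
The deferral decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions: the helper name `should_defer_for_opus_cap`, its argument order, and the exact output strings are hypothetical and are not taken from pulse-dispatch-core.sh. Only the `opus_concurrency_cap` reason and the `AIDEVOPS_OPUS_CONCURRENCY_CAP` default of 4 come from this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pulse-dispatch-core.sh): decide whether a
# candidate should dispatch or be deferred because the opus cap is reached.
# Usage: should_defer_for_opus_cap <resolved_model> <opus_inflight> [cap]
# Prints "defer reason=opus_concurrency_cap" or "dispatch".
should_defer_for_opus_cap() {
    local model="$1"
    local inflight="$2"
    # An explicit third argument wins, then the env override, then 4.
    local cap="${3:-${AIDEVOPS_OPUS_CONCURRENCY_CAP:-4}}"

    case "$model" in
        *claude-opus*)
            # Only opus-tier candidates are subject to the cap.
            if [ "$inflight" -ge "$cap" ]; then
                echo "defer reason=opus_concurrency_cap"
                return 0
            fi
            ;;
    esac
    # Sonnet and other models, or opus below the cap, dispatch as usual.
    echo "dispatch"
}
```

Keeping the decision separate from the `pgrep` count makes it testable without spawning processes: the caller would pass in `opus_inflight` from whatever counting method the dispatch engine uses.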

NEW: per-model cap config in .agents/configs/dispatch-model-caps.conf:

opus-4-6=4
opus-4-7=4
sonnet-4-6=24
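
One way to read this file is a plain `key=value` lookup with a fallback. The sketch below assumes that format holds exactly as shown above (one entry per line, `#` comments allowed); the function name `model_cap_lookup` and the fallback default are hypothetical, and how the result should interact with `AIDEVOPS_OPUS_CONCURRENCY_CAP` is left to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: look up a model's concurrency cap in a key=value file.
# Usage: model_cap_lookup <conf_file> <model_key> [default_cap]
# Prints the configured cap, or default_cap (4 if omitted) when the file is
# unreadable or the key is absent.
model_cap_lookup() {
    local conf_file="$1"
    local model_key="$2"
    local default_cap="${3:-4}"
    local key value

    if [ -r "$conf_file" ]; then
        # '|| [ -n "$key" ]' also handles a final line with no trailing newline.
        while IFS='=' read -r key value || [ -n "$key" ]; do
            # Skip blank lines and comment lines.
            case "$key" in
                ''|'#'*) continue ;;
            esac
            if [ "$key" = "$model_key" ]; then
                echo "$value"
                return 0
            fi
        done < "$conf_file"
    fi
    echo "$default_cap"
}
```

For example, `model_cap_lookup .agents/configs/dispatch-model-caps.conf sonnet-4-6` would print the sonnet cap from the file above, and an unknown model key falls back to the default rather than failing the dispatch cycle.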

Reference pattern

The rate-limit threshold logic in shared-gh-wrappers.sh is conceptually similar: it probes the budget before acting and defers when below threshold.

Verification

# Spawn 6 simulated opus workers (sleep loops with matching cmdline)
for i in 1 2 3 4 5 6; do (exec -a 'opencode -m anthropic/claude-opus-4-6' sleep 600 &); done
# Run pulse cycle
pulse-wrapper.sh --canary
# Log should show 'opus_concurrency_cap' deferrals for issues that would have used opus
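
The final log check can be scripted rather than eyeballed. This is a minimal sketch: the helper name `count_cap_deferrals` is hypothetical, and because the pulse log path and line format are not specified in this issue, the log file is passed in as an argument and the helper only counts lines containing the literal reason string.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count log lines that mention the deferral reason.
# Usage: count_cap_deferrals <log_file>
# Prints the number of matching lines (0 if the file is missing or unreadable).
count_cap_deferrals() {
    local log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo 0
        return 0
    fi
    # grep -c prints 0 but exits 1 when nothing matches; '|| true' keeps the
    # helper usable in scripts running under 'set -e'.
    grep -cF 'opus_concurrency_cap' "$log_file" || true
}
```

With 6 simulated workers and the default cap of 4, a non-zero count after the canary cycle would indicate that opus-tier candidates were deferred. The simulated `sleep 600` workers keep running after the check, so they need to be killed manually once verification is done.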

Acceptance

  • Concurrent opus workers never exceed the cap (default 4)
  • Sonnet dispatch unaffected
  • Cap is overridable via AIDEVOPS_OPUS_CONCURRENCY_CAP env
  • Deferred candidates retry next cycle (not NMR'd)

Filed from interactive root-cause session (2026-04-28). Composes with the early-429-detection fix.


aidevops.sh v3.13.7 plugin for OpenCode v1.14.29 with claude-opus-4-7 spent 1h 12m and 19,339 tokens on this with the user in an interactive session.


Labels

  • auto-dispatch (Auto-created from TODO.md tag)
  • enhancement (Auto-created from TODO.md tag)
  • needs-simplification (Issue targets large file(s) needing simplification first)
  • origin:worker (Auto-created by pulse labelless backfill (t2112))
  • priority:high (High severity; significant quality issue)
  • status:done (Task is complete)
  • tier:standard (Auto-created by pulse labelless backfill (t2112))
