fix(providers): add circuit breaker for Responses API fallback #3205
When the Responses API fails repeatedly (3 consecutive compatibility errors), skip it and fall back directly to Chat Completions. Unlike a permanent disable, the circuit re-probes after 5 minutes so recovery is automatic when the API comes back. Success resets the counter. Keyed per (model, reasoning_effort) so a failure with one model does not affect others.
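The behaviour described above can be sketched as a small state holder. This is a minimal illustration, not the PR's actual code: the class name, method names, constant names, and the injectable clock are all assumptions. Only the threshold (3), the re-probe interval (5 minutes), and the (model, reasoning_effort) key come from the description.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Values come from the PR description; the constant names are hypothetical.
FAILURE_THRESHOLD = 3
PROBE_INTERVAL_S = 5 * 60

Key = Tuple[str, Optional[str]]  # (model, reasoning_effort)


class ResponsesCircuitBreaker:
    """Illustrative sketch only; names and structure are assumptions."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._failures: Dict[Key, int] = {}
        self._tripped_at: Dict[Key, float] = {}

    def should_skip(self, key: Key) -> bool:
        """Return True while the circuit for this key is open."""
        tripped_at = self._tripped_at.get(key)
        if tripped_at is None:
            return False  # closed: never tripped, or reset by a success
        if self._clock() - tripped_at >= PROBE_INTERVAL_S:
            return False  # half-open: cooldown elapsed, let a probe through
        return True  # open: go straight to the fallback

    def record_failure(self, key: Key) -> None:
        """Count a failure; trip (or re-arm) once the threshold is reached."""
        count = self._failures.get(key, 0) + 1
        self._failures[key] = count
        if count >= FAILURE_THRESHOLD:
            # A failed probe lands here too, restarting the cooldown.
            self._tripped_at[key] = self._clock()

    def record_success(self, key: Key) -> None:
        """Reset all state for this key so the circuit closes again."""
        self._failures.pop(key, None)
        self._tripped_at.pop(key, None)
```

In this sketch a failed probe restarts the cooldown because the counter is already at the threshold, and a success removes the key from both dicts. Under concurrency, more than one request could pass through during the half-open window. Whether the real implementation guards against that is not stated in the PR text.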
chengyongru
left a comment
Review: fix(providers): add circuit breaker for Responses API fallback
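Verdict: Approve
This is a well-designed circuit breaker implementation.
Pros:
- State is isolated per (model, reasoning_effort) key, so a failure with one model does not affect others
- The threshold (3 failures) and probe interval (5 minutes) are clearly defined as module-level constants
- The half-open state is designed correctly: once the cooldown has elapsed, one probe is allowed
- A success immediately resets all counters, so recovery is fast
- _record_responses_success/failure is correctly integrated in both the non-streaming and streaming code paths
- Test coverage is thorough (7 tests) and covers all key scenarios
Minor suggestions (non-blocking):
- _record_responses_failure contains an inline from loguru import logger. Consider moving it to the top of the file to match the other imports. This is not a hot path, and repeated calls hit the import cache, so there is no performance problem, but a top-level import is more conventional.
- The breaker's _responses_failures and _responses_tripped_at dicts grow as new keys appear (although the key space is small in practice). If extreme cases are a concern, confirm that _record_responses_success cleans up both dicts. The current implementation already calls pop, so this is handled.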
Good to merge.
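For readers unfamiliar with how such a breaker wraps a call site, the sketch below shows one plausible shape for the integration the review mentions. Everything here is hypothetical: the function name, the CompatibilityError exception, and the callback parameters are stand-ins, and the PR's real code paths (streaming and non-streaming) are not shown in this conversation.

```python
from typing import Callable, Optional, Tuple

Key = Tuple[str, Optional[str]]  # (model, reasoning_effort)


class CompatibilityError(Exception):
    """Hypothetical stand-in for a Responses API compatibility error."""


def complete_with_fallback(
    key: Key,
    should_skip: Callable[[Key], bool],
    call_responses: Callable[[], str],
    call_chat_completions: Callable[[], str],
    record_success: Callable[[Key], None],
    record_failure: Callable[[Key], None],
) -> str:
    """Try the Responses API unless the breaker is open, else fall back."""
    if not should_skip(key):
        try:
            result = call_responses()
        except CompatibilityError:
            # Count the failure, then fall through to Chat Completions.
            record_failure(key)
        else:
            # Any success resets the breaker state for this key.
            record_success(key)
            return result
    return call_chat_completions()
```

Passing the breaker operations in as callables keeps the sketch independent of any particular provider class. A streaming path would need the same success and failure bookkeeping around its own call.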
Addresses reviewer suggestion to keep imports conventional.
e9d7a85 to 6c4ba2d
Summary
- OpenAICompatProvider: after 3 consecutive compatibility errors for a (model, reasoning_effort) key, skip the Responses API and go straight to Chat Completions
Test plan
- ruff check --select F401,F841 clean
🤖 Generated with Claude Code