fix(providers): add circuit breaker for Responses API fallback by mohamed-elkholy95 · Pull Request #3205 · HKUDS/nanobot

mohamed-elkholy95 · 2026-04-16T04:11:56Z

Summary

Add a proper circuit breaker for Responses API fallback in OpenAICompatProvider
After 3 consecutive compatibility errors for a (model, reasoning_effort) key, skip the Responses API and go straight to Chat Completions
Circuit probes again after 5 minutes (half-open state) — recovery is automatic
Success immediately resets the failure counter
Addresses the issue where a transient Responses API outage could permanently degrade output for the lifetime of the process
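
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class name `ResponsesCircuit`, the constant names, and the injectable `clock` parameter are all assumptions made for the example. The PR only states that the threshold and probe interval are module-level constants and that state is keyed per (model, reasoning_effort).

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical constant names; values taken from the PR description.
FAILURE_THRESHOLD = 3        # consecutive compatibility errors before opening
PROBE_INTERVAL_S = 300.0     # 5 minutes before a half-open probe is allowed

Key = Tuple[str, Optional[str]]  # (model, reasoning_effort)


class ResponsesCircuit:
    """Illustrative per-key circuit breaker (assumed design, not the PR's code)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock                      # injectable so tests can fake time
        self._failures: Dict[Key, int] = {}
        self._tripped_at: Dict[Key, float] = {}

    def is_available(self, key: Key) -> bool:
        """Return True if the Responses API should be attempted for this key."""
        if self._failures.get(key, 0) < FAILURE_THRESHOLD:
            return True                          # closed: below the threshold
        tripped = self._tripped_at.get(key)
        # Open until the probe interval has elapsed, then half-open.
        return tripped is None or self._clock() - tripped >= PROBE_INTERVAL_S

    def record_success(self, key: Key) -> None:
        """Reset all state for the key; pop keeps both dicts from growing."""
        self._failures.pop(key, None)
        self._tripped_at.pop(key, None)

    def record_failure(self, key: Key) -> None:
        """Count a failure; at or above the threshold, (re)start the cooldown."""
        count = self._failures.get(key, 0) + 1
        self._failures[key] = count
        if count >= FAILURE_THRESHOLD:
            self._tripped_at[key] = self._clock()
```

A failed probe in the half-open state restarts the cooldown, so the key stays on Chat Completions for another interval. This sketch does not limit the half-open state to a single in-flight probe under concurrency; whether the merged code does is not stated in the PR text.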

Test plan

7 new tests covering: default availability, threshold opening, model isolation, success reset, time-based probe, below-threshold pass, reasoning_effort key separation
All 209 existing provider tests pass (no regressions)
ruff check --select F401,F841 clean

🤖 Generated with Claude Code

When the Responses API fails repeatedly (3 consecutive compatibility errors), skip it and fall back directly to Chat Completions. Unlike a permanent disable, the circuit re-probes after 5 minutes so recovery is automatic when the API comes back. Success resets the counter. Keyed per (model, reasoning_effort) so a failure with one model does not affect others.
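
The call flow described in this commit message might look something like the sketch below. All names here (`CompatError`, `Breaker`, `complete_with_fallback`, and the two callables) are hypothetical stand-ins introduced for illustration; the actual integration lives inside OpenAICompatProvider and its streaming and non-streaming paths, whose code is not shown in this conversation.

```python
from typing import Any, Callable, Hashable, Protocol


class CompatError(Exception):
    """Hypothetical stand-in for a Responses API compatibility error."""


class Breaker(Protocol):
    """Assumed breaker interface: availability check plus outcome recording."""

    def is_available(self, key: Hashable) -> bool: ...
    def record_success(self, key: Hashable) -> None: ...
    def record_failure(self, key: Hashable) -> None: ...


def complete_with_fallback(
    key: Hashable,
    breaker: Breaker,
    call_responses: Callable[[], Any],
    call_chat: Callable[[], Any],
) -> Any:
    """Try the Responses API unless the circuit is open, else use Chat Completions."""
    if breaker.is_available(key):
        try:
            result = call_responses()
        except CompatError:
            # Count the failure, then fall through to Chat Completions.
            breaker.record_failure(key)
        else:
            # Any success resets the breaker state for this key.
            breaker.record_success(key)
            return result
    # Circuit open, or the Responses attempt just failed.
    return call_chat()
```

Only the compatibility error is caught in this sketch, so unrelated exceptions propagate to the caller and do not count toward the threshold. That choice is an assumption of the example.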

chengyongru

Review: fix(providers): add circuit breaker for Responses API fallback

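Verdict: Approve

This is a well-designed circuit breaker implementation.

Strengths:

Isolated per (model, reasoning_effort) key, so a failure in one model does not affect other models
The threshold (3 failures) and probe interval (5 minutes) are clearly defined as module-level constants
The half-open state is designed correctly: one probe is allowed once the cooldown period has elapsed
Success immediately resets all counters, so recovery is fast
_record_responses_success/failure is correctly integrated in both the non-streaming and streaming code paths
Thorough test coverage (7 tests) covering all key scenarios

Minor suggestions (non-blocking):

_record_responses_failure contains an inline from loguru import logger; consider moving it to the top of the file alongside the other imports. This is not a hot path, and each call hits the import cache, so there is no performance problem, but a top-level import is more conventional.
The _responses_failures and _responses_tripped_at dicts grow as new keys appear (though the key space is very small in practice). If extreme cases are a concern, you could confirm that _record_responses_success cleans up both dicts. The current implementation already calls pop, so this is already handled.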

Good to merge.

chengyongru · 2026-04-16T07:45:44Z

Review: fix(providers): add circuit breaker for Responses API fallback

Verdict: Approve

Well-designed circuit breaker implementation.

Pros:

Isolated per (model, reasoning_effort) key — one model's failures don't affect others
Threshold (3) and probe interval (5 min) are clearly defined as module-level constants
Correct half-open state: allows one probe attempt after the cooldown period
Success immediately resets all counters for quick recovery
Properly integrated in both non-streaming and streaming code paths
Adequate test coverage (7 tests) covering all key scenarios

Minor suggestions (non-blocking):

_record_responses_failure has an inline from loguru import logger. Consider moving it to the top of the file with the other imports for consistency. This isn't a hot path so the import cache handles it fine, but top-level is more conventional.
The _responses_failures and _responses_tripped_at dicts grow with distinct keys (though the key space is tiny in practice). _record_responses_success already does pop to clean up, so this is handled.

Good to merge.

chengyongru approved these changes Apr 16, 2026

mohamed-elkholy95 added 2 commits April 16, 2026 05:52

style: move loguru import to module top level

6ab0748

Addresses reviewer suggestion to keep imports conventional.

style: fix import sorting (ruff I001)

6c4ba2d

mohamed-elkholy95 force-pushed the fix/responses-api-circuit-breaker branch from e9d7a85 to 6c4ba2d on April 18, 2026 23:50

github-actions bot mentioned this pull request Apr 19, 2026

🦞 OpenClaw 生态日报 2026-04-19 gsscsd/big_model_radar#210

Open

chengyongru merged commit adcd3fe into HKUDS:nightly Apr 19, 2026
3 checks passed

chengyongru mentioned this pull request Apr 19, 2026

fix(providers): add circuit breaker for Responses API fallback #3302

Merged

3 tasks

chengyongru merged 3 commits into HKUDS:nightly from mohamed-elkholy95:fix/responses-api-circuit-breaker
