Skip to content

fix: honor null reasoning effort disable#3587

Open
boogieLing wants to merge 2 commits intoHKUDS:mainfrom
boogieLing:R0/issue-3585-reasoning-null
Open

fix: honor null reasoning effort disable#3587
boogieLing wants to merge 2 commits intoHKUDS:mainfrom
boogieLing:R0/issue-3585-reasoning-null

Conversation

@boogieLing
Copy link
Copy Markdown
Contributor

Summary

Fixes #3585.

This PR preserves the documented distinction between an omitted reasoningEffort and an explicit reasoningEffort: null:

  • omitted reasoningEffort continues to preserve the provider/model default
  • explicit reasoningEffort: null is normalized to the existing internal "none" disable semantic
  • Xiaomi MiMo, including OpenRouter/custom MiMo routes, now receives an explicit JSON reasoning_effort: null disable signal through extra_body

Root Cause

reasoningEffort: null parsed to Python None, but None was also used as the internal meaning for "not configured". The agent/provider path therefore skipped the parameter entirely, so MiMo fell back to its default reasoning behavior.

Design Notes

The MiMo-specific wire behavior is modeled as provider metadata via ProviderSpec.reasoning_disable_style, rather than hard-coding provider names in the request builder. Gateway/custom routes use conservative model-id matching by path part or known MiMo prefixes, avoiding broad substring matches such as mimosa-pro.

Validation

  • uv run --extra dev pytest tests/providers/test_provider_factory.py tests/providers/test_litellm_kwargs.py -q
  • uv run ruff check nanobot/config/schema.py nanobot/providers/factory.py nanobot/providers/openai_compat_provider.py nanobot/providers/registry.py tests/providers/test_provider_factory.py

@chengyongru chengyongru added the bug Something isn't working label May 2, 2026
Copy link
Copy Markdown
Collaborator

@chengyongru chengyongru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug itself is valid — reasoningEffort: null and "field omitted" both resolve to Python None via Pydantic, so there's no way to explicitly disable thinking for models like MiMo that default to thinking-on. The effective_reasoning_effort() method and ProviderSpec.reasoning_disable_style field are a solid foundation. But the PR adds a redundant second mechanism that significantly increases surface area for little gain.


1. Dual mechanism doing the same job

openai_compat_provider.py:96-116

The PR adds reasoning_disable_style on ProviderSpec (the declarative field at registry.py:78) — that's the right abstraction. But then it also builds _model_looks_like_null_reasoning_route with hardcoded markers/prefixes (openai_compat_provider.py:63-64) that duplicate the same decision. This means a custom provider using MiMo gets the null payload from two independent code paths, and they can disagree if the model naming doesn't match the hardcoded patterns. The spec field is sufficient; the fuzzy matching layer is redundant.

Side effect: any user routing through a gateway that renames the model (e.g., strips the "mimo" prefix) silently loses the disable signal via the hardcoded path. Conversely, the hardcoded check means the spec field alone is not the single source of truth — maintainers must keep both in sync.

2. Hardcoded model name matching is fragile

openai_compat_provider.py:63-64

_NULL_REASONING_DISABLE_MARKERS = frozenset({"mimo", "xiaomi", "xiaomimimo"})
_NULL_REASONING_DISABLE_PREFIXES = ("mimo-", "mimo_", "mimo.")

This approach doesn't scale — every new model/vendor that needs reasoning_effort: null requires a source code change. The existing _KIMI_THINKING_MODELS set has the same problem, but that's not a reason to repeat the pattern. With reasoning_disable_style on ProviderSpec, custom route users can self-serve via config without touching source.

The mimosa-pro defense test (test_litellm_kwargs.py) is a sign this is already brittle — you're writing tests to prove the matching *doesn't fire on lookalikes.

3. Test relies on opaque tuple index

tests/providers/test_provider_factory.py:30

assert provider_signature(config)[11] == "none"

provider_signature returns an unnamed tuple. Index [11] breaks silently if anyone inserts or reorders fields. This is testing internal structure rather than behavior — consider testing the actual provider generation settings or the kwargs output instead.


What to keep vs cut:

  • Keep: effective_reasoning_effort(), reasoning_disable_style field on ProviderSpec, the _build_kwargs branch at line 613 (but gated only on spec.reasoning_disable_style)
  • Cut: _NULL_REASONING_DISABLE_MARKERS, _NULL_REASONING_DISABLE_PREFIXES, _model_looks_like_null_reasoning_route(), _needs_null_reasoning_disable() — the spec field alone is the right abstraction, and custom provider users should set it in config rather than relying on source-code heuristic matching

@boogieLing boogieLing force-pushed the R0/issue-3585-reasoning-null branch from 3cecc89 to a69493a Compare May 2, 2026 18:11
@boogieLing
Copy link
Copy Markdown
Contributor Author

The bug itself is valid — reasoningEffort: null and "field omitted" both resolve to Python None via Pydantic, so there's no way to explicitly disable thinking for models like MiMo that default to thinking-on. The effective_reasoning_effort() method and ProviderSpec.reasoning_disable_style field are a solid foundation. But the PR adds a redundant second mechanism that significantly increases surface area for little gain.

1. Dual mechanism doing the same job

openai_compat_provider.py:96-116

The PR adds reasoning_disable_style on ProviderSpec (the declarative field at registry.py:78) — that's the right abstraction. But then it also builds _model_looks_like_null_reasoning_route with hardcoded markers/prefixes (openai_compat_provider.py:63-64) that duplicate the same decision. This means a custom provider using MiMo gets the null payload from two independent code paths, and they can disagree if the model naming doesn't match the hardcoded patterns. The spec field is sufficient; the fuzzy matching layer is redundant.

Side effect: any user routing through a gateway that renames the model (e.g., strips the "mimo" prefix) silently loses the disable signal via the hardcoded path. Conversely, the hardcoded check means the spec field alone is not the single source of truth — maintainers must keep both in sync.

2. Hardcoded model name matching is fragile

openai_compat_provider.py:63-64

_NULL_REASONING_DISABLE_MARKERS = frozenset({"mimo", "xiaomi", "xiaomimimo"})
_NULL_REASONING_DISABLE_PREFIXES = ("mimo-", "mimo_", "mimo.")

This approach doesn't scale — every new model/vendor that needs reasoning_effort: null requires a source code change. The existing _KIMI_THINKING_MODELS set has the same problem, but that's not a reason to repeat the pattern. With reasoning_disable_style on ProviderSpec, custom route users can self-serve via config without touching source.

The mimosa-pro defense test (test_litellm_kwargs.py) is a sign this is already brittle — you're writing tests to prove the matching *doesn't fire on lookalikes.

3. Test relies on opaque tuple index

tests/providers/test_provider_factory.py:30

assert provider_signature(config)[11] == "none"

provider_signature returns an unnamed tuple. Index [11] breaks silently if anyone inserts or reorders fields. This is testing internal structure rather than behavior — consider testing the actual provider generation settings or the kwargs output instead.

What to keep vs cut:

  • Keep: effective_reasoning_effort(), reasoning_disable_style field on ProviderSpec, the _build_kwargs branch at line 613 (but gated only on spec.reasoning_disable_style)
  • Cut: _NULL_REASONING_DISABLE_MARKERS, _NULL_REASONING_DISABLE_PREFIXES, _model_looks_like_null_reasoning_route(), _needs_null_reasoning_disable() — the spec field alone is the right abstraction, and custom provider users should set it in config rather than relying on source-code heuristic matching

Thanks for the detailed review. I agreed with the direction and pushed an update in a69493a.

Changes made:

  • removed the model-name heuristic path (_NULL_REASONING_DISABLE_MARKERS, _NULL_REASONING_DISABLE_PREFIXES, _model_looks_like_null_reasoning_route())
  • now gates the explicit reasoning_effort: null payload only on spec.reasoning_disable_style == "reasoning_effort_null"
  • removed the OpenRouter/custom heuristic tests
  • removed the brittle provider_signature(config)[11] assertions and now assert the actual provider generation setting instead

Validation:

  • uv run --extra dev pytest tests/providers/test_provider_factory.py tests/providers/test_litellm_kwargs.py -q -> 69 passed
  • uv run ruff check nanobot/config/schema.py nanobot/providers/factory.py nanobot/providers/openai_compat_provider.py nanobot/providers/registry.py tests/providers/test_provider_factory.py -> passed

Good point on keeping ProviderSpec.reasoning_disable_style as the single source of truth. The PR body still has stale wording about OpenRouter/custom routing from the previous revision; I’ll update that separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reasoning_effort: null does not disable thinking on Xiaomi MiMo — no way to explicitly disable reasoning

2 participants