Skip to content

fix: strip <think> tags from modal processor fallback responses (closes #159)#236

Merged
LarFii merged 1 commit intoHKUDS:mainfrom
jwchmodx:fix/strip-think-tags-in-modal-processor-fallback
Apr 7, 2026
Merged

fix: strip <think> tags from modal processor fallback responses (closes #159)#236
LarFii merged 1 commit intoHKUDS:mainfrom
jwchmodx:fix/strip-think-tags-in-modal-processor-fallback

Conversation

@jwchmodx
Copy link
Copy Markdown
Contributor

@jwchmodx jwchmodx commented Apr 3, 2026

Problem

Reasoning models such as DeepSeek-R1 and Qwen2.5-think prepend their chain-of-thought inside <think>…</think> blocks before emitting the final answer.

When _robust_json_parse cannot extract a valid JSON object (e.g. because the model emitted only thinking content without a trailing structured answer), the four modal-processor parse methods fell back to returning the raw LLM response string as the enhanced_caption and summary. This caused internal model reasoning to be stored in the knowledge graph instead of the actual content description — exactly the symptom reported in #159.

Affected methods (all shared the same pattern):

  • ImageModalProcessor._parse_response
  • TableModalProcessor._parse_table_response
  • EquationModalProcessor._parse_equation_response
  • GenericModalProcessor._parse_generic_response

Fix

Added a static helper BaseModalProcessor._strip_thinking_tags(text) that removes <think>…</think> and <thinking>…</thinking> blocks (case-insensitive, handles multiline content) and applies it in the except fallback branch of all four methods before the cleaned text is stored or returned.

# Before
return response, fallback_entity          # raw think-tag content stored

# After
cleaned = self._strip_thinking_tags(response)
...
return cleaned, fallback_entity           # only final-answer text stored

The helper is intentionally a @staticmethod on the base class so subclasses and future processors can reuse it without extra imports.

Tests

tests/test_strip_thinking_tags.py — 13 new unit tests:

  • Tag variants (<think>, <thinking>, uppercase)
  • Multiline and multiple blocks
  • No-op on plain text / empty string
  • Full fallback path exercised for all four processor classes

All 13 pass; ruff check + ruff format --check clean.

Checklist

  • Bug reproduced and root cause identified
  • Fix applied to all four affected parse methods
  • Unit tests added (13 tests, all passing)
  • ruff lint + format checks pass
  • No breaking changes to public API

HKUDS#159)

Reasoning models (DeepSeek-R1, Qwen2.5-think, etc.) wrap their
chain-of-thought in <think>…</think> blocks before emitting the
final answer.  When _robust_json_parse fails to extract a valid JSON
object from the response, the four modal-processor parse methods
(_parse_response, _parse_table_response, _parse_equation_response,
_parse_generic_response) were returning the **raw** LLM response as
the fallback caption and summary.  This caused internal model
reasoning to be stored in the knowledge graph instead of the actual
content description.

Fix: add a static helper `BaseModalProcessor._strip_thinking_tags`
that removes <think>/<thinking> blocks (case-insensitive, multiline)
and apply it in every fallback branch so only the final-answer text
is stored or returned.

The helper is tested in tests/test_strip_thinking_tags.py with 13
unit tests covering: tag variants, multiline blocks, multiple blocks,
case-insensitivity, and the full fallback path for all four
processor classes.
@LarFii LarFii merged commit 5fbff29 into HKUDS:main Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants