fix(channels): strip <think> tags from streaming draft updates #5505
Qwen and similar models emit <think>...</think> reasoning blocks that were stripped from the final response but leaked to users during partial streaming via DraftEvent::Content and DraftEvent::Progress. Add strip_think_tags_inline() to sanitize accumulated draft text before sending to the channel, preventing reasoning tokens from appearing in streaming updates. Includes unit tests for single blocks, multiple blocks, unclosed blocks, empty strings, and whitespace trimming.
theonlyhennygod left a comment
Comprehension Summary
What: Adds strip_think_tags_inline() to the streaming draft handler in src/channels/mod.rs, ensuring <think>...</think> reasoning blocks from models like Qwen are stripped before being sent to users during partial streaming updates.
Why: The existing strip_think_tags functions in compatible.rs, loop_.rs, and ollama.rs only operate on the final complete response. During streaming, raw <think> content leaked into DraftEvent::Content and DraftEvent::Progress updates sent to channels.
Blast radius: Streaming draft updates in process_channel_message(). No changes to final response handling, provider logic, or config schema.
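The behaviour summarized above can be sketched roughly as follows. This is an illustrative reconstruction from the description, not the code merged in the PR: the function body is an assumption, and `sanitized_drafts` is a hypothetical helper that stands in for the streaming draft handler.

```rust
/// Illustrative stand-in for the PR's `strip_think_tags_inline()`;
/// the merged implementation may differ.
fn strip_think_tags_inline(text: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut out = String::with_capacity(text.len());
    let mut rest = text;

    while let Some(start) = rest.find(OPEN) {
        // Keep everything before the opening tag.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            // Closed block: resume scanning after the closing tag.
            Some(end) => rest = &after_open[end + CLOSE.len()..],
            // Unclosed block (the model is still reasoning): drop the tail.
            None => {
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}

/// Hypothetical helper: returns the sanitized draft after each chunk arrives.
fn sanitized_drafts(chunks: &[&str]) -> Vec<String> {
    let mut accumulated = String::new();
    let mut drafts = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        accumulated.push_str(chunk);
        // Sanitize the whole accumulated text, not the single chunk, so a
        // block opened in an earlier chunk is still recognized.
        drafts.push(strip_think_tags_inline(&accumulated));
    }
    drafts
}
```

With the chunks `"<think>pl"`, `"an</think>Hel"`, `"lo"`, this sketch yields the drafts `""`, `"Hel"`, `"Hello"`: the unclosed block is hidden while it streams, and the answer appears once the block closes.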
Review
Risk: low (channels subsystem, additive sanitization function).
Verified:
- strip_think_tags_inline() correctly handles single blocks, multiple blocks, unclosed blocks (drops tail), empty strings, and whitespace trimming.
- Applied to both DraftEvent::Progress and DraftEvent::Content paths.
- 6 unit tests cover all edge cases with clear assertions.
- The function uses simple string scanning (no regex) — efficient for streaming hot path.
- CI Required Gate is green across all platforms.
- PR template sections are filled out (summary format is compact but all key info is present).
- Privacy/data hygiene: pass.
Security/Performance Assessment:
- No security impact — sanitization only removes reasoning tokens from user-visible output.
- No performance impact: strip_think_tags_inline is O(n) with no allocations beyond the result string. Called per streaming chunk, which is acceptable overhead.
This PR is ready for maintainer merge.
Thank you for the clean fix and thorough test coverage.
Port upstream fix from zeroclaw-labs/zeroclaw#5505. Models like Qwen emit <think>...</think> reasoning blocks that were stripped from the final response but leaked to users during partial streaming via DraftEvent::Content and DraftEvent::Progress. Add strip_think_tags_inline() that sanitizes draft text before sending to the channel, handling single/multiple blocks, unclosed blocks, and whitespace trimming. Co-Authored-By: DaBlitzStein <DaBlitzStein@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…law-labs#5505)
* fix(channels): strip <think> tags from streaming draft updates
  Qwen and similar models emit <think>...</think> reasoning blocks that were stripped from the final response but leaked to users during partial streaming via DraftEvent::Content and DraftEvent::Progress. Add strip_think_tags_inline() to sanitize accumulated draft text before sending to the channel, preventing reasoning tokens from appearing in streaming updates. Includes unit tests for single blocks, multiple blocks, unclosed blocks, empty strings, and whitespace trimming.
* style: apply rustfmt to strip_think_tags_inline tests
…law-labs#5505) Manual port of upstream c70e86c. Qwen and similar models emit <think>...</think> reasoning blocks that leaked to users during partial streaming via DraftEvent::Content and DraftEvent::Progress. Add strip_think_tags_inline() to sanitize accumulated draft text before sending to the channel. Directly benefits wecom_ws draft flow.
Upstream c70e86c (zeroclaw-labs#5505) appended `.trim()` to strip_think_tags_inline, which eats the trailing `\n` that Progress events carry (e.g. `"⏳ tool\n"`). wecom_ws note_progress_update then push_str'es straight into the work log with no separator, producing stacked progress lines without line breaks. Fix: after trim_start / trim_end, re-append `\n` if the original post-strip text ended with one. Behaviour of 0d2b57e (zeroclaw-labs#4394) is restored without losing the think-tag stripping guarantee. Added regression test strip_think_tags_inline_preserves_trailing_newline.
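A minimal sketch of the newline-preserving behaviour described in that follow-up commit, assuming the shape of the function from the description alone. `remove_think_blocks` is a hypothetical helper introduced here for illustration; the actual code in the port may be organized differently.

```rust
/// Hypothetical helper: removes `<think>` blocks without trimming.
fn remove_think_blocks(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find("<think>") {
        out.push_str(&rest[..start]);
        let after_open = &rest[start + "<think>".len()..];
        match after_open.find("</think>") {
            Some(end) => rest = &after_open[end + "</think>".len()..],
            // Unclosed block: drop the tail.
            None => return out,
        }
    }
    out.push_str(rest);
    out
}

/// Sketch of the fixed function: trim both ends, then re-append `\n`
/// if the post-strip text ended with one.
fn strip_think_tags_inline(text: &str) -> String {
    let stripped = remove_think_blocks(text);
    let ends_with_newline = stripped.ends_with('\n');
    let mut result = stripped.trim_start().trim_end().to_string();
    if ends_with_newline {
        // Progress events rely on this separator between work-log lines.
        result.push('\n');
    }
    result
}
```

Under this sketch a progress line such as `"⏳ tool\n"` passes through unchanged, so consecutive progress updates appended with `push_str` stay on separate lines.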
Summary
- Qwen and similar models emit <think>...</think> reasoning blocks that were stripped from the final response via strip_think_tags() but leaked to users during partial streaming
- DraftEvent::Content and DraftEvent::Progress sent raw accumulated text to update_draft() without sanitization
- Add strip_think_tags_inline() to channels/mod.rs that strips think blocks from streaming draft text before sending to the channel
Root cause
The existing strip_think_tags functions in compatible.rs, loop_.rs, and ollama.rs only operate on the final complete response. The streaming draft handler accumulates text chunks and sends them directly to the channel without any think-tag sanitization.
Test plan
- strip_think_tags_inline: single block, multiple blocks, unclosed block, no tags, empty string, whitespace trimming
- cargo clippy --all-targets -- -D warnings passes
- cargo fmt -- --check passes
- stream_mode = "partial" no longer shows <think> content during streaming