fix(agent): align history trim to user boundary for provider compatibility by WanZheng · Pull Request #5257 · zeroclaw-labs/zeroclaw

WanZheng · 2026-04-03T09:05:55Z

Summary

Base branch target (master for all contributions): master
Problem: After any history trim operation, the first non-system message could be assistant or tool, causing providers like Zhipu GLM to reject the request with a 400 error (they require system* -> user ordering).
Why it matters: Long tool-using conversations trigger trimming frequently; invalid ordering silently breaks provider calls with no automatic recovery.
What changed: Added boundary alignment to every trim path — after removing messages, continue dropping leading non-system messages until the first one has role user. Extracted shared align_to_user_boundary helper to avoid duplication.
What did not change (scope boundary): Tool-pair integrity (addressed by fix(agent): treat tool_use/tool_result as atomic groups in history pruning #4825), compaction logic, channel history management, provider retry/fallback behavior.

Label Snapshot (required)

Risk label (risk: low|medium|high): risk: medium
Size label (size: XS|S|M|L|XL, auto-managed/read-only): size: S
Scope labels: agent
Module labels: agent: history
Contributor tier label: N/A
If any auto-label is incorrect, note requested correction: N/A

Change Metadata

Change type (bug|feature|refactor|docs|security|chore): bug
Primary scope (runtime|provider|channel|memory|security|ci|docs|multi): agent

Linked Issue

Related [Bug]: Telegram channel fails with MiniMax due to invalid role ordering in channel runtime #937 (MiniMax invalid role ordering in channel runtime)
Related [Bug]:DeepSeek 400 after auto-compaction: orphan role=tool message without preceding assistant.tool_calls #2232 (DeepSeek 400 after auto-compaction: orphan role=tool)
Related [Bug]: History pruner severs tool_use/tool_result pairs, causing Anthropic 400 error #4810 (History pruner severs tool_use/tool_result pairs)

Supersede Attribution (required when `Supersedes #` is used)

N/A

Validation Evidence (required)

Commands and result summary:

cargo fmt --all -- --check   # pass
cargo clippy --all-targets -- -D warnings   # pass, zero warnings
cargo test   # 159 passed, 0 failed, 10 ignored

Evidence provided: all three commands pass locally on macOS (darwin 25.3.0)
If any command is intentionally skipped, explain why: N/A

Security Impact (required)

New permissions/capabilities? No
New external network calls? No
Secrets/tokens handling changed? No
File system access scope changed? No

Privacy and Data Hygiene (required)

Data-hygiene status: pass
Redaction/anonymization notes: No personal/sensitive data introduced
Neutral wording confirmation: All test fixtures use neutral labels (sys, u1, a1, t1, etc.)

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No

i18n Follow-Through (required when docs or user-facing wording changes)

i18n follow-through triggered? No (no user-facing wording changes)

Human Verification (required)

What was personally validated beyond CI:

Verified scenarios: constructed message sequences with [system, user, assistant, tool, user, assistant], confirmed trim leaves user as first non-system message across all four trim paths
Edge cases checked: empty history, system-only, multiple system messages, all-assistant (no user), keep_recent protection preventing over-removal
What was not verified: live end-to-end with Zhipu GLM provider (no API key available)

Side Effects / Blast Radius (required)

Affected subsystems/workflows: history::trim_history, history::emergency_history_trim, history_pruner::prune_history, Agent::trim_history
Potential unintended effects: Slightly fewer messages retained after trim when orphaned assistant/tool messages are dropped to reach the next user boundary. This is strictly better than sending an invalid sequence to the provider.
Guardrails/monitoring for early detection: Provider error logs for role-ordering rejections should decrease; if message loss becomes excessive, keep_recent config protects the most recent messages.

Agent Collaboration Notes (recommended)

Agent tools used: Cursor (code review, cherry-pick, polish)
Workflow/plan summary: Reviewed commit, verified fix correctness across all four trim paths, corrected inaccurate commit message (removed false ensure_valid_sequence removal claim), deduplicated Phase 3 logic in history_pruner by reusing shared align_to_user_boundary, validated with fmt/clippy/test
Verification focus: boundary alignment correctness, edge case coverage, no regression in existing tests
Confirmation: naming + architecture boundaries followed (AGENTS.md + CONTRIBUTING.md)

Rollback Plan (required)

Fast rollback command/path: git revert <merge-commit> — restores previous trim behavior
Feature flags or config toggles (if any): None
Observable failure symptoms: Providers that enforce system* -> user ordering (Zhipu GLM, MiniMax) would return 400 errors after history trimming in long conversations

Risks and Mitigations

Risk: If no user message exists after the trim point, trim_history may drain all non-system messages.
- Mitigation: This only occurs in malformed conversations (no user messages). In practice, every valid conversation has user messages. The keep_recent parameter protects the most recent messages from removal.

JordanTheJet · 2026-04-05T03:00:04Z

PR Review: fix(agent): align history trim to user boundary for provider compatibility

Comprehension Summary

What: After any history trim operation (across four code paths: trim_history, emergency_history_trim, prune_history, and Agent::trim_history), the first non-system message could end up being assistant or tool. This PR ensures the first non-system message always has role user by extending the trim past any leading non-user messages until a user boundary is reached.

Why: Providers like Zhipu GLM and MiniMax require system* -> user message ordering. When history trimming left an assistant or tool message at the front, these providers returned 400 errors with no automatic recovery — effectively bricking long tool-using sessions.

Blast radius: Four trim paths in src/agent/: history.rs (trim_history, emergency_history_trim, align_to_user_boundary), history_pruner.rs (prune_history), and agent.rs (Agent::trim_history). No changes to provider logic, channel handling, or compaction. Risk classification: medium (behavior change in agent history management, but strictly corrective).

Security & Performance

Security: No new permissions, network calls, or secrets handling. No impact.
Performance: align_to_user_boundary uses history.remove(idx) in a loop, which is O(n*k) where k is the number of orphaned messages. For typical history sizes and trim scenarios (1-3 orphaned messages), this is negligible. The inline drain-extension in trim_history is O(1) extra work. No concern.

Overlap with PR #5264

PR #5264 (fix(agent): preserve tool_use/tool_result pairs during history trimming by @okhsunrog) addresses a complementary but distinct invariant: it prevents orphaned tool messages that lack a preceding assistant (which causes Anthropic API 400 errors). Both PRs modify trim_history and emergency_history_trim in history.rs and will have textual merge conflicts.

The two fixes are not redundant — they enforce different constraints:

fix(agent): align history trim to user boundary for provider compatibility #5257 (this PR): first non-system message must be user (Zhipu/MiniMax requirement)
fix(agent): preserve tool_use/tool_result pairs during history trimming #5264: every tool message must be preceded by an assistant (Anthropic requirement)

Recommendation: Whoever merges second must reconcile the merge conflict. Ideally both invariants should be enforced in a single post-trim pass. Consider coordinating merge order or combining.

Issues

[nit] Inconsistent deduplication claim. The PR body states "Extracted shared align_to_user_boundary helper to avoid duplication," but trim_history in history.rs uses an inline while loop to extend the drain rather than calling the shared helper. This is defensible for efficiency (drain-extension vs. remove-in-loop), but the PR description is slightly misleading. Consider either calling the shared helper for consistency, or noting the intentional divergence in the PR body.
[low] Edge case: no user message exists. In trim_history, if no user message exists after the drain point, drain_end advances to history.len(), draining all non-system messages. In Agent::trim_history, the same pattern yields an empty other_messages vec. The PR body correctly identifies this as a malformed-conversation edge case and notes keep_recent as a mitigation — but trim_history (the ChatMessage variant) does not have keep_recent protection. This means a conversation with only system+assistant messages could be fully drained. Acceptable given the scenario is degenerate, but worth a doc comment.
[low] Agent::trim_history does not use align_to_user_boundary. The Agent struct uses ConversationMessage (an enum wrapping ChatMessage), so it cannot directly call the ChatMessage-typed helper. The inline implementation is correct but represents a second place this logic must be maintained. A future refactor could unify the types or add a trait-based helper.
[info] Test for Agent::trim_history is missing. The PR adds thorough unit tests for trim_history, emergency_history_trim, align_to_user_boundary, and prune_history, but the Agent::trim_history method in agent.rs is only tested indirectly (if at all). The inline logic there is simple enough that this is acceptable, but a dedicated test would strengthen coverage.
[info] Existing test modified in history_pruner.rs. The prune_collapses_tool_pairs test was updated to insert a leading msg("user", "do something") to satisfy the new alignment invariant. This is correct — the original test had an implicit assumption that system -> assistant was valid, which was always wrong for strict providers. The updated assertions (messages.len() == 5, index shifts) are consistent.

Verdict

APPROVE -- The fix is correct, well-scoped, and addresses a real provider-compatibility bug. Test coverage is strong (12 new test cases across the helper, trim_history, emergency_history_trim, and prune_history). The issues identified are all low-severity or informational. CI is green.

Merge coordination note: this PR will conflict with #5264. The maintainer merging second should ensure both invariants (user-first ordering AND tool-pair integrity) are preserved in the reconciled code.

Reviewed by ZeroClaw PR Review Agent

…ility After any trim operation (trim_history, emergency_history_trim, prune_history, Agent::trim_history), the first non-system message could be `assistant` or `tool`. Providers like Zhipu GLM reject such sequences, requiring `system* -> user` ordering. Add boundary alignment to every trim path: after removing messages, continue dropping leading non-system messages until the first one has role `user`. Extract shared `align_to_user_boundary` helper in history.rs and reuse it in history_pruner Phase 3. Changes: - history::trim_history: extend drain end to next user message - history::emergency_history_trim: call align_to_user_boundary after drop - history::align_to_user_boundary: new pub(crate) helper - history_pruner::prune_history: add Phase 3 via align_to_user_boundary - Agent::trim_history: extend drain_end to next user ConversationMessage Made-with: Cursor

…e APIs After proactive trim and length-based truncation, drop leading non-user turns so a prepended system message still yields system→user ordering (e.g. Zhipu GLM). Add unit tests for the alignment behavior. In OpenAI-compatible provider: centralize tool_stream via tool_stream_for_tools, omit tool_stream when tools are absent, and emit debug logs with message role sequences on HTTP 400 for easier diagnosis. Made-with: Cursor

…and test - Refactor `trim_history` to call shared `align_to_user_boundary` instead of inline while-loop, making the deduplication claim accurate - Add doc comment on no-user edge case in `trim_history` (no `keep_recent` guard; acceptable for degenerate conversations) - Add doc comment on `Agent::trim_history` explaining why it cannot reuse the shared helper (ConversationMessage vs ChatMessage type mismatch) - Add dedicated test for `Agent::trim_history` boundary alignment - Fix pre-existing rustfmt issues in agent.rs (assert! format, import order) Made-with: Cursor

Made-with: Cursor

WanZheng · 2026-04-07T11:13:07Z

Post-review update summary

Reviewer feedback addressed (all items from @JordanTheJet's review)

#	Issue	Severity	Resolution
1	`trim_history` used inline loop instead of shared helper	nit	Fixed in `5eb34d9` — `trim_history` now calls `align_to_user_boundary`
2	Edge case: no `user` message after drain	low	Doc comment added to `trim_history` explaining the degenerate-conversation behavior
3	`Agent::trim_history` can't reuse the `ChatMessage`-typed helper	low	Doc comment added explaining the `ConversationMessage` type boundary
4	Missing test for `Agent::trim_history`	info	Test `trim_history_aligns_to_user_boundary_for_conversation_messages` added
5	Modified `prune_collapses_tool_pairs` test	info	Acknowledged — test correctly updated for new invariant

Additional fix in this push

cargo fmt compliance: Applied rustfmt to agent.rs and channels/mod.rs (macro formatting and import ordering) to pass cargo fmt --all -- --check.

Validation evidence (local, post-update)

cargo fmt --all -- --check   # pass
cargo clippy --all-targets -- -D warnings   # pass, zero warnings
cargo test   # 5926 passed, 2 failed (pre-existing sandbox-only: git_operations), 3 ignored

Label request for maintainers

The PR body specifies risk: medium and size: S, but the fork account cannot add labels. Please apply:

risk: medium
size: S

Note on PR #5264 overlap

PR #5264 (tool_use/tool_result pair preservation) is still open and addresses a complementary invariant. Both PRs modify trim_history / emergency_history_trim in history.rs. Whoever merges second will need to resolve textual conflicts. The two invariants are independent and both needed.

okhsunrog · 2026-04-07T12:13:30Z

I closed #5264

Combine orphan ToolResults guard with user-boundary alignment on ConversationMessage, and keep both unit tests. Made-with: Cursor

Made-with: Cursor

…rim-user-boundary

Audacity88 · 2026-05-01T17:54:25Z

I think this needs a rebase before meaningful review. GitHub reports DIRTY and local merge checking conflicts across provider compatibility, runtime history, the old src/agent/history_pruner.rs path, and channel code. Current master now has substantial history-pruner/tool_use/tool_result preservation work in crates/zeroclaw-runtime/src/agent/history.rs and history_pruner.rs, so the original fix may overlap with later changes.

Could you rebase against current master and confirm the remaining provider-compatibility failure this PR still fixes?

JordanTheJet · 2026-05-03T19:15:52Z

Hi @WanZheng — thanks for the careful work on this. The fix shape is sound, but this PR has been overtaken by parallel work and isn't reviewable as-is:

File paths moved. All four targeted files (src/agent/agent.rs, src/agent/history.rs, src/agent/history_pruner.rs, src/providers/compatible.rs) live under crates/zeroclaw-runtime/ and crates/zeroclaw-providers/ after the workspace split. The branch is DIRTY against master and can't cleanly rebase.
Adjacent fixes have already landed. Master now has tool_use/tool_result pair preservation in Agent::trim_history (agent.rs:738-780) and persistent-session orphan repair in load_interactive_session_history (history.rs:198-204) — both via [Bug]: History pruner severs tool_use/tool_result pairs, causing Anthropic 400 error #4810 and [Bug]: Compaction still orphans tool_result blocks, causing Anthropic API 400 errors (v0.6.9) (Signal Channel) #5813. Your patch overlaps these meaningfully.
The user-boundary alignment piece is being handled in fix(providers): drop leading non-user turns before provider call #6303. That open PR ("fix(providers): drop leading non-user turns before provider call") closes [Bug]: Gemini 400 — assistant tool_call emitted as first non-system turn (history serializer invariant violation) #6302 and targets the current crate layout — it does the same system* → user alignment you proposed, but at the provider-call boundary in crates/zeroclaw-providers/src/history_sanitizer.rs, which catches every code path (CLI, gateway, agent-loop) rather than only the trim sites.

Closing as superseded. If #6303 lands and there's still a missed trim path that produces a leading-assistant or leading-tool message, please open a fresh issue with the exact reproduction — happy to revisit. Thanks again for the patch and for the rigorous validation evidence in the original description.

WanZheng requested review from JordanTheJet and theonlyhennygod as code owners April 3, 2026 09:05

github-actions Bot added the agent Auto scope: src/agent/** changed. label Apr 3, 2026

JordanTheJet mentioned this pull request Apr 3, 2026

fix(session): session integrity, streaming refactor, history pruning #5167

Merged

JordanTheJet added this to ZeroClaw Project Board Apr 4, 2026

github-project-automation Bot moved this to Backlog in ZeroClaw Project Board Apr 4, 2026

JordanTheJet added the agent-approved PR approved by automated review agent label Apr 5, 2026

WanZheng added 4 commits April 7, 2026 18:57

chore: apply rustfmt to align with merge gate formatting rules

76756a6

Made-with: Cursor

WanZheng force-pushed the fix/align-history-trim-user-boundary branch from 5bacecc to 76756a6 Compare April 7, 2026 11:12

github-actions Bot added channel Auto scope: src/channels/** changed. provider Auto scope: src/providers/** changed. provider:compatible Auto module: provider/compatible changed. labels Apr 7, 2026

WanZheng added 3 commits April 8, 2026 18:08

merge(origin/master): resolve agent trim_history conflict

55ef54b

Combine orphan ToolResults guard with user-boundary alignment on ConversationMessage, and keep both unit tests. Made-with: Cursor

docs(agent): clarify trim_history ToolResults vs user-alignment overlap

09bb10f

Made-with: Cursor

Merge remote-tracking branch 'origin/master' into fix/align-history-t…

14a7ef3

…rim-user-boundary

singlerider requested a review from Audacity88 April 29, 2026 08:18

singlerider added bug Something isn't working risk: medium Auto risk: src/** or dependency/config changes. size: M Auto size: 251-500 non-doc changed lines. needs-author-action Author action required before merge labels Apr 29, 2026

bkutasi mentioned this pull request May 1, 2026

🦞 OpenClaw Ecosystem Digest 2026-05-02 bkutasi/big_model_radar#2

Open

JordanTheJet closed this May 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): align history trim to user boundary for provider compatibility#5257

fix(agent): align history trim to user boundary for provider compatibility#5257
WanZheng wants to merge 7 commits intozeroclaw-labs:masterfrom
WanZheng:fix/align-history-trim-user-boundary

WanZheng commented Apr 3, 2026

Uh oh!

JordanTheJet commented Apr 5, 2026

Uh oh!

WanZheng commented Apr 7, 2026

Uh oh!

okhsunrog commented Apr 7, 2026

Uh oh!

Audacity88 commented May 1, 2026

Uh oh!

JordanTheJet commented May 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

WanZheng commented Apr 3, 2026

Summary

Label Snapshot (required)

Change Metadata

Linked Issue

Supersede Attribution (required when Supersedes # is used)

Validation Evidence (required)

Security Impact (required)

Privacy and Data Hygiene (required)

Compatibility / Migration

i18n Follow-Through (required when docs or user-facing wording changes)

Human Verification (required)

Side Effects / Blast Radius (required)

Agent Collaboration Notes (recommended)

Rollback Plan (required)

Risks and Mitigations

Uh oh!

JordanTheJet commented Apr 5, 2026

PR Review: fix(agent): align history trim to user boundary for provider compatibility

Comprehension Summary

Security & Performance

Overlap with PR #5264

Issues

Verdict

Uh oh!

WanZheng commented Apr 7, 2026

Post-review update summary

Reviewer feedback addressed (all items from @JordanTheJet's review)

Additional fix in this push

Validation evidence (local, post-update)

Label request for maintainers

Note on PR #5264 overlap

Uh oh!

okhsunrog commented Apr 7, 2026

Uh oh!

Audacity88 commented May 1, 2026

Uh oh!

JordanTheJet commented May 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Supersede Attribution (required when `Supersedes #` is used)