
fix: keep system messages at the start of chat history #6552

Open
drbparadise wants to merge 3 commits into zeroclaw-labs:master from drbparadise:fix/system-message-normalization

@drbparadise (Contributor) commented May 9, 2026

Summary

  • Base branch: master
  • What changed and why:
    • Normalize runtime chat history so all system messages are merged into the first message before provider dispatch.
    • Make loop-detection feedback merge into the leading system message instead of appending a non-leading system message.
    • Self-heal persisted interactive session history with non-leading system messages on load.
    • Canonicalize OpenAI-compatible provider requests so merge_system_into_user = false still sends at most one leading system message.
  • Scope boundary: No new config, no provider-specific workaround requirement, no changes to custom fork patch tracks or ops metadata.
  • Blast radius: Runtime history/session normalization, the agent tool-loop request path, and OpenAI-compatible provider message serialization.
  • Linked issue(s): Closes #6551 ([Bug]: Non-leading system messages can be sent to OpenAI-compatible providers)
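
As an illustration of the first two bullets under "What changed and why", here is a minimal sketch of merging all system messages into one leading message and of folding loop-detection feedback into it. The `Role` and `ChatMessage` types, the function names, and the blank-line separator are assumptions made for this example; they are not taken from the zeroclaw code and the real implementation may differ.

```rust
// Hypothetical message types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: Role,
    pub content: String,
}

impl ChatMessage {
    pub fn new(role: Role, content: &str) -> Self {
        Self { role, content: content.to_string() }
    }
}

/// Merge every system message into a single leading system message.
/// The relative order of system fragments and of non-system messages is kept.
/// Joining with a blank line is an assumption of this sketch.
pub fn merge_system_messages(history: Vec<ChatMessage>) -> Vec<ChatMessage> {
    let mut system_parts: Vec<String> = Vec::new();
    let mut rest: Vec<ChatMessage> = Vec::with_capacity(history.len());
    for msg in history {
        if msg.role == Role::System {
            system_parts.push(msg.content);
        } else {
            rest.push(msg);
        }
    }
    if system_parts.is_empty() {
        return rest;
    }
    let mut out = Vec::with_capacity(rest.len() + 1);
    out.push(ChatMessage { role: Role::System, content: system_parts.join("\n\n") });
    out.extend(rest);
    out
}

/// Append feedback to the leading system message instead of pushing a new,
/// non-leading system message. Creates a leading system message if none exists.
pub fn merge_feedback_into_leading_system(history: &mut Vec<ChatMessage>, feedback: &str) {
    let has_leading_system = matches!(history.first(), Some(m) if m.role == Role::System);
    if has_leading_system {
        let first = &mut history[0];
        first.content.push_str("\n\n");
        first.content.push_str(feedback);
    } else {
        history.insert(0, ChatMessage::new(Role::System, feedback));
    }
}
```

With a history shaped system,user,assistant,system,user, this sketch yields four messages with a single system message at index 0 that carries both system contents.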

Validation Evidence (required)

Local validation is the signal CI cannot replace. Run the full battery and paste literal output (tails, failures, warnings — not "all passed").

cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo test

Docs-only changes: replace with markdown lint + link-integrity (scripts/ci/docs_quality_gate.sh). Bootstrap scripts: add bash -n install.sh.

  • Commands run and tail output:
$ cargo test -p zeroclaw-runtime system_messages -- --nocapture
running 2 tests
test agent::loop_::tests::load_interactive_session_merges_non_leading_system_messages ... ok
test agent::loop_::tests::tool_loop_normalizes_non_leading_system_messages_before_provider_request ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 1625 filtered out
$ cargo test -p zeroclaw-providers flatten_system_messages -- --nocapture
running 5 tests
test compatible::tests::flatten_system_messages_inserts_synthetic_user_when_no_user_exists ... ok
test compatible::tests::flatten_system_messages_inserts_user_when_missing ... ok
test compatible::tests::flatten_system_messages_keeps_system_only_at_start_without_user_merge ... ok
test compatible::tests::flatten_system_messages_merges_into_first_user_and_removes_system_roles ... ok
test compatible::tests::flatten_system_messages_merges_into_first_user ... ok
test result: ok. 5 passed; 0 failed; 0 ignored; 0 measured; 817 filtered out
$ git diff --check HEAD^ HEAD
exit 0
$ cargo fmt --all -- --check
exit 0
$ cargo clippy --all-targets -- -D warnings
Finished `dev` profile [unoptimized + debuginfo] target(s) in 56.30s
$ cargo test
Root unit/integration/system tests completed before doctests:
- 237 lib tests passed
- 236 main tests passed
- 159 integration tests passed
- 7 live tests ignored
- 5 system tests passed

Doctest failure:
Doc-tests zeroclaw
error: Option 'default-theme' given more than once
error: doctest failed, to rerun pass `--doc`

The failing rustdoc invocation included duplicate flags:
--default-theme=ayu --default-theme=ayu
  • GitHub CI status: Required gate passed for this PR head. Passing checks include CI Required Gate, Test, Lint, Security, Check (all features), Check (no default features), Check (32-bit), Linux/macOS/Windows builds, and Benchmarks Compile.
  • Beyond CI — what did you manually verify? The new regression tests were written first and failed on the pre-fix code with histories shaped system,user,assistant,system,user. After the fix, the runtime provider request and session load paths contain only one leading system message and preserve both system contents. A self-review of commit ff7da168 found no blocking issues; focused runtime/provider regression tests and git diff --check were rerun after review.
  • If any command was intentionally skipped, why: None. cargo test was run, but the local doctest phase failed because rustdoc received duplicate --default-theme=ayu flags from the current local configuration.

Security & Privacy Impact (required)

Yes/No for each. Answer any Yes with a 1–2 sentence explanation.

  • New permissions, capabilities, or file system access scope? (No)
  • New external network calls? (No)
  • Secrets / tokens / credentials handling changed? (No)
  • PII, real identities, or personal data in diff, tests, fixtures, or docs? (No)
  • If any Yes, describe the risk and mitigation: N/A

Compatibility (required)

  • Backward compatible? (Yes)
  • Config / env / CLI surface changed? (No)
  • If No or Yes to either: exact upgrade steps for existing users: N/A

Rollback (required for risk: medium and risk: high)

Low-risk PRs: git revert <sha> is the plan unless otherwise noted.

Medium/high-risk PRs must fill:

  • Fast rollback command/path: git revert ff7da168
  • Feature flags or config toggles: None
  • Observable failure symptoms: Strict OpenAI-compatible endpoints may again reject requests with the error "System message must be at the beginning." if non-leading system messages reappear.

Supersede Attribution (required only when Supersedes # is used)

  • Superseded PRs + authors (#<pr> by @<author>, one per line): N/A
  • Scope materially carried forward: N/A
  • Co-authored-by trailers added in commit messages for incorporated contributors? (No)
  • If No, why (inspiration-only, no direct code/design carry-over): No superseded PR or external contributor code was incorporated.

Labels live in the GitHub label UI, not in the body. Set risk:*, size:*, and scope labels via the sidebar. Auto-label corrections: add risk: manual and the intended label.

Commit trailers capture AI-assisted collaboration (Co-Authored-By: Claude ...) — no separate section needed.

Privacy contract (docs/book/src/contributing/privacy.md) is a merge gate. Never commit real identities, secrets, personal emails, or PII in diff, tests, fixtures, or docs.

@Audacity88 added labels on May 10, 2026: bug (Something isn't working); provider (Auto scope: src/providers/** changed); provider:compatible (Auto module: provider/compatible changed); risk: high (Auto risk: security/runtime/gateway/tools/workflows); runtime (Auto scope: src/runtime/** changed); size: M (Auto size: 251-500 non-doc changed lines)
@github-actions (bot) removed labels on May 10, 2026: provider; runtime; provider:compatible
@relda88 left a comment

Good improvement overall. One edge case: an empty string and null aren't treated the same downstream, so this might need a normalization step.

@drbparadise (Contributor, Author) commented May 11, 2026

Follow-up applied: empty system content is now treated as absent during normalization, so we no longer emit placeholder system messages or lose the fallback prompt on session load.

Verified with:

cargo test -p zeroclaw-providers flatten_system_messages -- --nocapture
cargo test -p zeroclaw-runtime load_interactive_session -- --nocapture
cargo test -p zeroclaw-runtime tool_loop_normalizes_non_leading_system_messages_before_provider_request -- --nocapture
cargo fmt --all -- --check
cargo clippy -p zeroclaw-providers -p zeroclaw-runtime --all-targets -- -D warnings
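
To make the follow-up behavior concrete, here is a minimal sketch of treating empty system content as absent and restoring a fallback prompt on session load. The types, the function name `load_history_with_fallback`, the `fallback_prompt` parameter, and the blank-line separator are all assumptions for illustration and do not reproduce the code in this PR.

```rust
// Hypothetical message types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: Role,
    pub content: String,
}

impl ChatMessage {
    pub fn new(role: Role, content: &str) -> Self {
        Self { role, content: content.to_string() }
    }
}

/// Normalize a persisted history on load:
/// - empty system fragments are dropped rather than merged,
/// - remaining system fragments are merged into one leading message,
/// - if nothing remains, the fallback prompt becomes the leading system message.
pub fn load_history_with_fallback(
    persisted: Vec<ChatMessage>,
    fallback_prompt: &str,
) -> Vec<ChatMessage> {
    let mut system_parts: Vec<String> = Vec::new();
    let mut rest: Vec<ChatMessage> = Vec::with_capacity(persisted.len());
    for msg in persisted {
        match msg.role {
            Role::System => {
                // Empty content counts as "no system message at all".
                if !msg.content.is_empty() {
                    system_parts.push(msg.content);
                }
            }
            _ => rest.push(msg),
        }
    }
    let leading = if system_parts.is_empty() {
        fallback_prompt.to_string()
    } else {
        system_parts.join("\n\n")
    };
    let mut out = Vec::with_capacity(rest.len() + 1);
    out.push(ChatMessage { role: Role::System, content: leading });
    out.extend(rest);
    out
}
```

In this sketch a session that was persisted with only empty system messages comes back with the fallback prompt in front, rather than with an empty placeholder or with no system message.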

@Audacity88 (Collaborator) left a comment

Thanks for taking on #6551. I checked the current head 8f63ec6 against the linked issue, the prior review comment, the current diff in crates/zeroclaw-runtime/src/agent/history.rs, crates/zeroclaw-runtime/src/agent/loop_.rs, and crates/zeroclaw-providers/src/compatible.rs, plus the green CI state. I do not see anything blocking this.

✅ Resolved — Empty system content is treated as absent

The follow-up for the empty-string edge case addresses the earlier review concern. normalize_system_messages() now drops empty system fragments instead of manufacturing an empty leading system message, and load_interactive_session_history() restores the fallback prompt when normalization would otherwise leave the session without a leading system message. The new runtime/provider tests cover both the persisted-session fallback case and compatible-provider flattening.

🟢 What looks good — The provider-facing invariant is enforced at the right boundaries

The runtime now normalizes history immediately before provider dispatch, self-heals persisted interactive sessions on load, and merges loop-detection feedback into the leading system message instead of appending a later one. The OpenAI-compatible provider normalization now collapses non-leading system messages in flatten_system_messages(), and the added provider tests cover the strict-endpoint shape where merge_system_into_user = false. Strict endpoints should therefore see at most one leading system role, even when older history or runtime paths produced the rejected system,user,assistant,system,user shape.

The regression tests are pointed at the right failure mode. I also noted the local doctest failure described in the PR body; with GitHub's Test job green and the failure attributed to duplicate local rustdoc --default-theme=ayu flags, I do not see it as PR-blocking. The compatibility, security/privacy, and rollback notes match the behavior I see in the diff. Approving.
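
The provider-side behavior described in this review can be pictured with the following minimal sketch. It is not the body of flatten_system_messages(); the types, the name `flatten_for_provider`, the blank-line separator, and the choice to place a synthetic user message at the front are assumptions made for illustration. Only the two modes of the `merge_system_into_user` setting come from the PR text.

```rust
// Hypothetical message types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: Role,
    pub content: String,
}

impl ChatMessage {
    pub fn new(role: Role, content: &str) -> Self {
        Self { role, content: content.to_string() }
    }
}

/// Produce a provider-safe message list.
/// - `merge_system_into_user == false`: at most one system message, and it leads.
/// - `merge_system_into_user == true`: no system roles at all; the system text
///   is prepended to the first user message, or sent as a synthetic user
///   message when the history has no user message.
pub fn flatten_for_provider(
    messages: Vec<ChatMessage>,
    merge_system_into_user: bool,
) -> Vec<ChatMessage> {
    let mut system_parts: Vec<String> = Vec::new();
    let mut rest: Vec<ChatMessage> = Vec::with_capacity(messages.len());
    for msg in messages {
        if msg.role == Role::System {
            // Empty system content is treated as absent.
            if !msg.content.is_empty() {
                system_parts.push(msg.content);
            }
        } else {
            rest.push(msg);
        }
    }
    if system_parts.is_empty() {
        return rest;
    }
    let system_text = system_parts.join("\n\n");

    if merge_system_into_user {
        match rest.iter().position(|m| m.role == Role::User) {
            Some(idx) => {
                let user = &mut rest[idx];
                user.content = format!("{system_text}\n\n{}", user.content);
            }
            // Assumption: the synthetic user message goes first.
            None => rest.insert(0, ChatMessage::new(Role::User, &system_text)),
        }
        rest
    } else {
        let mut out = Vec::with_capacity(rest.len() + 1);
        out.push(ChatMessage::new(Role::System, &system_text));
        out.extend(rest);
        out
    }
}
```

Doing this at the serialization boundary, in addition to the runtime normalization, means the invariant holds even if some other code path hands the provider an unnormalized history.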

@Audacity88 added labels on May 11, 2026: provider (Auto scope: src/providers/** changed); runtime (Auto scope: src/runtime/** changed); provider:compatible (Auto module: provider/compatible changed)
@Audacity88 requested a review from relda88 on May 11, 2026 12:06

Labels

bug (Something isn't working); provider:compatible (Auto module: provider/compatible changed); provider (Auto scope: src/providers/** changed); risk: high (Auto risk: security/runtime/gateway/tools/workflows); runtime (Auto scope: src/runtime/** changed); size: M (Auto size: 251-500 non-doc changed lines)
