fix: auto-compact and retry on ContextLengthExceeded#315

Merged
ilblackdragon merged 2 commits into main from fix/auto-compact-on-context-exceeded
Feb 23, 2026

Conversation

@bowenwang1996
Contributor

Summary

  • When the LLM returns a ContextLengthExceeded error mid-turn in the agentic loop, the dispatcher now automatically compacts context messages and retries once instead of propagating the raw error to the user
  • Compaction keeps all system messages (system prompt, skill context), the last user message, and all subsequent messages (current turn's tool calls and results), dropping older conversation history
  • A note is inserted so the LLM knows earlier context was dropped
  • If the retry also fails, the original error is returned
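The compaction strategy above can be sketched in Rust roughly as follows. This is an illustrative stand-in, not the actual dispatcher code: the `Role`/`Message` types and the note text are invented for the example, and the real `compact_messages_for_retry` in `src/agent/dispatcher.rs` may differ in signature and details.

```rust
// Illustrative message shape; the dispatcher's real types may differ.
#[derive(Clone, Debug, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Keep all system messages, the last user message, and everything after it
/// (the current turn's tool calls and results); insert a note so the model
/// knows earlier history was dropped.
fn compact_messages_for_retry(messages: &[Message]) -> Vec<Message> {
    match messages.iter().rposition(|m| m.role == Role::User) {
        Some(idx) => {
            // System messages before the last user message (system prompt,
            // skill context); anything at or after idx is kept via the tail.
            let mut out: Vec<Message> = messages[..idx]
                .iter()
                .filter(|m| m.role == Role::System)
                .cloned()
                .collect();
            // Only note the drop when non-system history was actually removed.
            if messages[..idx].iter().any(|m| m.role != Role::System) {
                out.push(Message {
                    role: Role::System,
                    content: "[Earlier conversation history was dropped to fit \
                              the context window.]"
                        .into(),
                });
            }
            out.extend_from_slice(&messages[idx..]);
            out
        }
        // No user messages: nothing older to drop, keep everything.
        None => messages.to_vec(),
    }
}
```

With a history of `[system prompt, old user turn, old reply, current user turn, tool call]`, this keeps the system prompt, inserts the drop note, and preserves the last user message plus the current turn's tool work.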

Test plan

  • test_compact_keeps_system_and_last_user_exchange — verifies old history is dropped while system prompt, last user message, and current turn tool work are preserved
  • test_compact_preserves_multiple_system_messages — verifies all system messages (prompt, skill context, nudges) are kept
  • test_compact_single_user_message_keeps_everything — edge case with no old history to drop
  • test_compact_no_user_messages_keeps_non_system — edge case fallback when no user messages exist
  • test_compact_drops_old_history_but_keeps_current_turn_tools — multi-turn conversation with multiple tool calls in current turn
  • All existing tests pass (cargo test — 19/19 dispatcher tests pass)
  • Zero clippy warnings (cargo clippy --all --all-features)

Fixes #260

When the LLM returns a context-length-exceeded error mid-turn, the
dispatcher now automatically compacts the conversation history and
retries once instead of propagating the raw error to the user.

The compaction keeps all system messages (system prompt, skill context),
the last user message, and all subsequent messages (current turn's tool
calls and results), dropping older conversation history. A note is
inserted to inform the LLM that earlier context was dropped.

If the retry also fails, the original error is returned.
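The catch-compact-retry control flow can be sketched as below. Everything here is illustrative: the error enum, the closure-based call shape, and the function name are invented for the example, not the dispatcher's actual API. Note that review feedback on this PR later changed the failure path to propagate the retry's own error rather than the original one, which is what this sketch does.

```rust
/// Illustrative stand-in for the provider's error type.
#[derive(Debug, PartialEq)]
enum LlmError {
    ContextLengthExceeded,
    Other(String),
}

/// Call the model once; on ContextLengthExceeded, compact the messages and
/// retry exactly once. The second attempt's result, success or failure, is
/// returned as-is.
fn call_with_compaction<F>(
    messages: Vec<String>,
    compact: impl Fn(&[String]) -> Vec<String>,
    mut call: F,
) -> Result<String, LlmError>
where
    F: FnMut(&[String]) -> Result<String, LlmError>,
{
    match call(&messages) {
        Err(LlmError::ContextLengthExceeded) => {
            let compacted = compact(&messages);
            // Single retry; no further compaction loop.
            call(&compacted)
        }
        other => other,
    }
}
```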

Fixes #260

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 22, 2026 21:29
@github-actions github-actions Bot added the scope: agent, size: L, and risk: medium labels Feb 22, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @bowenwang1996, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the agent's robustness by introducing an automatic context management mechanism. When an LLM encounters a context length limit, the system now intelligently prunes older conversation history to fit within the model's constraints and retries the request. This prevents premature termination of agentic turns due to context overflow, improving the overall reliability and user experience of long-running conversations.

Highlights

  • Automatic Context Compaction: Implemented automatic compaction of context messages when an LLM returns a ContextLengthExceeded error during an agentic loop, preventing immediate failure.
  • Retry Mechanism: After compaction, the dispatcher automatically retries the LLM call once with the reduced context. If this retry also fails, the original ContextLengthExceeded error is returned.
  • Compaction Logic: The compaction strategy preserves all system messages (system prompt, skill context), the last user message, and all subsequent messages related to the current turn (tool calls and results), while dropping older conversation history.
  • LLM Notification: A system message is inserted into the compacted context to inform the LLM that earlier conversation history was dropped.
Changelog
  • src/agent/dispatcher.rs
    • Modified the respond_with_tools method to catch ContextLengthExceeded errors and trigger a retry mechanism.
    • Added a new private function compact_messages_for_retry to intelligently reduce the message history.
    • Implemented logic to rebuild the ReasoningContext with compacted messages for the retry attempt.
    • Ensured that if the retry fails, the original ContextLengthExceeded error is propagated.
    • Added comprehensive unit tests for the compact_messages_for_retry function, covering various message history scenarios.
Activity
  • The author has provided a detailed summary of the changes and the motivation behind them.
  • A comprehensive test plan has been outlined, including specific test cases for the new compaction logic and verification of existing tests and clippy warnings.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The pull request introduces a robust mechanism to handle ContextLengthExceeded errors by automatically compacting the conversation history and retrying the LLM call once. This significantly improves the agent's resilience in long-running or tool-heavy turns. The compaction logic correctly preserves system instructions and the current turn's context. I have identified a potential issue where system messages appearing after the last user message (such as the iteration limit nudge) could be duplicated in the compacted message list, and I've provided a suggestion to fix this.

Comment thread src/agent/dispatcher.rs
Contributor

Copilot AI left a comment

Pull request overview

This pull request adds automatic context compaction and retry when the LLM returns a ContextLengthExceeded error during the agentic loop. When context length is exceeded mid-turn, the dispatcher now:

  1. Automatically compacts the context messages (keeping system messages, last user message, and current turn's work)
  2. Retries the LLM call once with the compacted context
  3. Returns the original error if the retry also fails

The implementation addresses the problem described in issue #260, where users would lose their current turn's work and have to manually run /compact and resend their message.

Changes:

  • Added automatic catch-compact-retry logic in the agentic loop when ContextLengthExceeded occurs
  • Implemented compact_messages_for_retry() function to reduce context while preserving essential information
  • Added comprehensive tests for various compaction scenarios

Comment thread src/agent/dispatcher.rs Outdated
Comment thread src/agent/dispatcher.rs Outdated
Comment thread src/agent/dispatcher.rs Outdated
- Fix system message duplication: only collect system messages before the
  last User message to avoid duplicating nudges in the tail slice (Gemini + Copilot)
- Only add compaction note when earlier history is actually dropped (Copilot)
- Propagate actual retry error instead of masking with original (Copilot)
- Fix else branch to preserve system messages when no User messages exist
- Add test for nudge-after-user deduplication
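
The duplication fix above comes down to where the system-message head scan stops. A minimal self-contained illustration (the `Role` enum and the flag-based function are stand-ins invented for this sketch, not the dispatcher's actual code):

```rust
#[derive(Clone, Debug, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

/// Count system messages collected into the compacted head. With
/// `before_last_user_only` set (the fix), a nudge sitting after the last
/// user message is left to the tail slice instead of being collected twice.
fn head_system_count(roles: &[Role], before_last_user_only: bool) -> usize {
    let last_user = roles
        .iter()
        .rposition(|r| *r == Role::User)
        // No user messages: scan everything, so system messages survive.
        .unwrap_or(roles.len());
    let scan = if before_last_user_only {
        &roles[..last_user]
    } else {
        roles
    };
    scan.iter().filter(|r| **r == Role::System).count()
}
```

For a history of `[system prompt, user, assistant, nudge]`, the full scan counts the trailing nudge even though the tail slice starting at the last user message already carries it; the prefix-only scan does not.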

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ilblackdragon ilblackdragon merged commit 6f21cfa into main Feb 23, 2026
4 checks passed
@ilblackdragon ilblackdragon deleted the fix/auto-compact-on-context-exceeded branch February 23, 2026 00:29
@github-actions github-actions Bot mentioned this pull request Feb 23, 2026
jaswinder6991 pushed a commit to jaswinder6991/ironclaw that referenced this pull request Feb 26, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026

Labels

  • contributor: new (First-time contributor)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • scope: agent (Agent core: agent loop, router, scheduler)
  • size: L (200-499 changed lines)

Development

Successfully merging this pull request may close these issues.

feat: auto-compact and retry on ContextLengthExceeded during agentic loop

3 participants