fix: auto-compact and retry on ContextLengthExceeded #315
Conversation
When the LLM returns a context-length-exceeded error mid-turn, the dispatcher now automatically compacts the conversation history and retries once instead of propagating the raw error to the user. The compaction keeps all system messages (system prompt, skill context), the last user message, and all subsequent messages (the current turn's tool calls and results), dropping older conversation history. A note is inserted to inform the LLM that earlier context was dropped. If the retry also fails, the original error is returned.

Fixes #260

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
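The compaction rule described above can be sketched in Rust. Only the behavior (keep system messages, the last user message, and everything after it; insert a note when older history is dropped) comes from the PR; the `Role`/`Message` types below are simplified stand-ins for the dispatcher's real message types, not its actual API.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Compact history for a retry after a context-length error: keep system
/// messages that precede the last user message, then the last user message
/// and everything after it (the current turn's tool calls and results).
/// A note is inserted only when earlier history is actually dropped.
fn compact_messages_for_retry(messages: &[Message]) -> Vec<Message> {
    match messages.iter().rposition(|m| m.role == Role::User) {
        Some(idx) => {
            // System messages before the last user message (system prompt,
            // skill context). Collection stops at `idx` so that any system
            // nudge in the tail slice is not duplicated.
            let mut out: Vec<Message> = messages[..idx]
                .iter()
                .filter(|m| m.role == Role::System)
                .cloned()
                .collect();
            // Was any non-system history actually dropped?
            let dropped = messages[..idx].iter().any(|m| m.role != Role::System);
            if dropped {
                out.push(Message {
                    role: Role::System,
                    content: "[Earlier conversation history was dropped to fit the context window.]".into(),
                });
            }
            // Last user message plus the current turn's work.
            out.extend_from_slice(&messages[idx..]);
            out
        }
        // No user messages at all: keep everything rather than lose context.
        None => messages.to_vec(),
    }
}
```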
Summary of Changes

Hello @bowenwang1996, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the agent's robustness by introducing an automatic context-management mechanism. When an LLM encounters a context length limit, the system now intelligently prunes older conversation history to fit within the model's constraints and retries the request. This prevents premature termination of agentic turns due to context overflow, improving the overall reliability and user experience of long-running conversations.
Code Review
The pull request introduces a robust mechanism to handle ContextLengthExceeded errors by automatically compacting the conversation history and retrying the LLM call once. This significantly improves the agent's resilience in long-running or tool-heavy turns. The compaction logic correctly preserves system instructions and the current turn's context. I have identified a potential issue where system messages appearing after the last user message (such as the iteration limit nudge) could be duplicated in the compacted message list, and I've provided a suggestion to fix this.
Pull request overview
This pull request adds automatic context compaction and retry when the LLM returns a ContextLengthExceeded error during the agentic loop. When context length is exceeded mid-turn, the dispatcher now:
- Automatically compacts the context messages (keeping system messages, last user message, and current turn's work)
- Retries the LLM call once with the compacted context
- Returns the original error if the retry also fails
The implementation addresses the problem described in issue #260, where users would lose their current turn's work and have to manually run /compact and resend their message.
Changes:
- Added automatic catch-compact-retry logic in the agentic loop when ContextLengthExceeded occurs
- Implemented compact_messages_for_retry() function to reduce context while preserving essential information
- Added comprehensive tests for various compaction scenarios
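The catch-compact-retry flow in the list above can be sketched as follows. This is a minimal illustration, not the crate's real code: `LlmError`, `call_llm`, and the crude "keep the last two messages" compaction are all hypothetical stand-ins, and only the ContextLengthExceeded error name and the retry-once behavior come from the PR.

```rust
/// Illustrative error type; the real crate's error enum is not shown in the PR.
#[derive(Debug, PartialEq)]
enum LlmError {
    ContextLengthExceeded,
}

/// Stand-in for the real LLM call: pretend any history longer than three
/// messages overflows the model's context window.
fn call_llm(messages: &[String]) -> Result<String, LlmError> {
    if messages.len() > 3 {
        Err(LlmError::ContextLengthExceeded)
    } else {
        Ok(format!("reply to {} messages", messages.len()))
    }
}

/// Catch-compact-retry: on ContextLengthExceeded, compact the history and
/// retry exactly once; whatever the retry returns is propagated as-is.
fn call_with_auto_compact(messages: &[String]) -> Result<String, LlmError> {
    match call_llm(messages) {
        Err(LlmError::ContextLengthExceeded) => {
            // Crude stand-in for compact_messages_for_retry():
            // keep only the last two messages.
            let tail = &messages[messages.len().saturating_sub(2)..];
            call_llm(tail)
        }
        other => other,
    }
}
```

Retrying only once keeps the behavior predictable: if the compacted history still overflows, the error reaches the user instead of looping.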
- Fix system message duplication: only collect system messages before the last User message to avoid duplicating nudges in the tail slice (Gemini + Copilot)
- Only add compaction note when earlier history is actually dropped (Copilot)
- Propagate actual retry error instead of masking with original (Copilot)
- Fix else branch to preserve system messages when no User messages exist
- Add test for nudge-after-user deduplication

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
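The duplication fix in the first bullet can be shown with a toy example. Everything here is illustrative (role strings instead of real message types): the point is only that scanning the whole history for system messages double-counts a nudge that sits after the last user message, while stopping the scan at the last user index does not.

```rust
/// Count how many times a system "nudge" placed after the last user message
/// survives compaction, depending on whether system-message collection stops
/// at the last user index (the review fix) or scans the whole history (the bug).
fn nudge_count(stop_at_last_user: bool) -> usize {
    // "S:" = system, "U:" = user, "A:" = assistant.
    let msgs = ["S:prompt", "U:old", "A:old", "U:new", "S:nudge"];
    let last_user = msgs.iter().rposition(|m| m.starts_with("U:")).unwrap();
    let head = if stop_at_last_user { &msgs[..last_user] } else { &msgs[..] };
    let mut out: Vec<&str> = head
        .iter()
        .filter(|m| m.starts_with("S:"))
        .cloned()
        .collect();
    // Tail slice: last user message plus the current turn's messages,
    // which already contains the nudge.
    out.extend_from_slice(&msgs[last_user..]);
    out.iter().filter(|m| **m == "S:nudge").count()
}
```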
* fix: auto-compact and retry on ContextLengthExceeded in agentic loop

  When the LLM returns a context-length-exceeded error mid-turn, the dispatcher now automatically compacts the conversation history and retries once instead of propagating the raw error to the user. The compaction keeps all system messages (system prompt, skill context), the last user message, and all subsequent messages (the current turn's tool calls and results), dropping older conversation history. A note is inserted to inform the LLM that earlier context was dropped. If the retry also fails, the original error is returned.

  Fixes nearai#260

* Address Gemini/Copilot review feedback

  - Fix system message duplication: only collect system messages before the last User message to avoid duplicating nudges in the tail slice (Gemini + Copilot)
  - Only add compaction note when earlier history is actually dropped (Copilot)
  - Propagate actual retry error instead of masking with original (Copilot)
  - Fix else branch to preserve system messages when no User messages exist
  - Add test for nudge-after-user deduplication

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- On a ContextLengthExceeded error mid-turn in the agentic loop, the dispatcher now automatically compacts context messages and retries once instead of propagating the raw error to the user

Test plan

- test_compact_keeps_system_and_last_user_exchange — verifies old history is dropped while system prompt, last user message, and current turn tool work are preserved
- test_compact_preserves_multiple_system_messages — verifies all system messages (prompt, skill context, nudges) are kept
- test_compact_single_user_message_keeps_everything — edge case with no old history to drop
- test_compact_no_user_messages_keeps_non_system — edge case fallback when no user messages exist
- test_compact_drops_old_history_but_keeps_current_turn_tools — multi-turn conversation with multiple tool calls in current turn
- cargo test — 19/19 dispatcher tests pass
- cargo clippy --all --all-features passes

Fixes #260