fix(claude-provider): use [1m] model tag for reliable 1M context #2280
Closed
slambert wants to merge 1 commit into
…eadless containers

The Claude Code CLI determines context window size internally via three paths (checked in order):

1. `[1m]` tag in model name: unconditional, no dependencies
2. `--betas context-1m-2025-08-07`: requires CLI global state
3. Remote config (coral_reef_sonnet): unavailable in headless containers

Path 2 has a race condition: the CLI stores betas via sdkBetas during initialization, but the compaction window can be computed before that store completes. When this happens, CLAUDE_CODE_AUTO_COMPACT_WINDOW gets capped by Math.min(200000, 800000) = 200000, causing compaction at ~160K tokens instead of ~800K.

This commit:

- Adds `model` field to ContainerConfig, RunnerConfig, and ProviderOptions
- Passes model through from container.json → runner → Claude provider
- Introduces `modelForSdk()` which appends `[1m]` to known 1M models
- Replaces the hardcoded compact window with `compactWindowForModel()` that derives the threshold from the model's context window (80%)
- Preserves the CLAUDE_CODE_AUTO_COMPACT_WINDOW env var as an operator override (checked first, before model-based computation)

The CLI's internal normalizer (Fj) strips the [1m] tag before API calls, so the Anthropic API still receives the canonical model name.

To use: set `"model": "claude-sonnet-4-6"` in the group's container.json (or base-container.json). The provider auto-appends [1m] for 1M-capable models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
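The two helpers named in the commit message could look roughly like the sketch below. This is not the PR's actual diff: the set of 1M-capable model names (including `claude-opus-4-6`), the 200K fallback window, and the shape of the `env` parameter are assumptions made for illustration. Only the function names, the `[1m]` suffix, the 80% threshold, and the override-first ordering come from the description above.

```typescript
// Hypothetical list of 1M-capable models; the provider's real list is not shown here.
const ONE_M_MODELS = new Set<string>(["claude-sonnet-4-6", "claude-opus-4-6"]);

const DEFAULT_CONTEXT_WINDOW = 200_000; // assumed fallback for models not in the set
const ONE_M_CONTEXT_WINDOW = 1_000_000;
const COMPACT_PERCENT = 80; // threshold stated in the commit message

// Append the [1m] tag to known 1M models; leave everything else untouched.
function modelForSdk(model: string | undefined): string | undefined {
  if (!model) return model;
  if (/\[1m\]/i.test(model)) return model; // already tagged, do not tag twice
  return ONE_M_MODELS.has(model) ? `${model}[1m]` : model;
}

// Operator override is checked first; otherwise derive 80% of the model's window.
function compactWindowForModel(
  model: string | undefined,
  env: Record<string, string | undefined>,
): number {
  const override = Number(env.CLAUDE_CODE_AUTO_COMPACT_WINDOW);
  if (Number.isFinite(override) && override > 0) return override;

  const base = (model ?? "").replace(/\[1m\]/i, "");
  const contextWindow = ONE_M_MODELS.has(base)
    ? ONE_M_CONTEXT_WINDOW
    : DEFAULT_CONTEXT_WINDOW;
  return Math.floor((contextWindow * COMPACT_PERCENT) / 100);
}
```

With these assumptions, `modelForSdk("claude-sonnet-4-6")` yields `claude-sonnet-4-6[1m]`, and `compactWindowForModel("claude-sonnet-4-6", {})` yields `800000` unless `CLAUDE_CODE_AUTO_COMPACT_WINDOW` is set.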
7 tasks
Collaborator
Thanks for the thorough analysis of the With #2233 merged, per-group model overrides are now wired end-to-end. That said, we're going to pass on auto-expanding the compaction window based on context size. Shorter sessions work better for claw-type agents. We'd rather compact earlier and keep turns focused than let context grow to 800K. The current 165K default is intentional. Closing this one out, but appreciate the investigation.
This was referenced May 10, 2026
Summary

- Pass the `model` field from `container.json` through the agent-runner config, provider options, and into the Claude SDK `query()` call
- Use the `[1m]` model tag (e.g. `claude-sonnet-4-6[1m]`) instead of `--betas` to signal 1M context; the CLI strips the tag before API calls via its internal normalizer
- Derive `CLAUDE_CODE_AUTO_COMPACT_WINDOW` from the model's actual context window (80% of 1M = 800K for Sonnet/Opus) instead of hardcoding 165K
- Respect `CLAUDE_CODE_AUTO_COMPACT_WINDOW` when explicitly set

Why `[1m]` instead of `--betas`?

The CLI's context window function checks three paths in order:

1. `[1m]` tag in model name: unconditional, no dependencies
2. `--betas` flag: requires `sdkBetas` to be stored in global state first
3. Remote config: unavailable in headless containers

In headless containers, path 3 is unavailable and path 2 has a race condition: the betas store may not complete before the compaction window is computed. This causes 1M models to compact at ~160K tokens (80% of the 200K default) instead of ~800K.

The `[1m]` tag (path 1) is checked first via a simple regex (`/\[1m\]/i.test(modelName)`) with no async dependencies, making it reliable in all environments.

Related

- … `CLAUDE_CODE_AUTO_COMPACT_WINDOW` env var, but doesn't help because `Math.min(perceivedContextWindow, envVar)` caps it at 200K when the CLI doesn't know the model is 1M
- … `--betas` approach, which has the same race condition this PR avoids

Test plan

- Set `"model": "claude-sonnet-4-6"` in a group's `container.json`
- Confirm `CLAUDE_CODE_AUTO_COMPACT_WINDOW` is `800000`
- Confirm the `CLAUDE_CODE_AUTO_COMPACT_WINDOW` env var override still works when explicitly set
- … `[1m]` tag

🤖 Generated with Claude Code
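To make the capping behavior concrete, here is a simplified model of the resolution order and the `Math.min` cap described in this PR. It is a sketch of the reported behavior, not the CLI's source: the function names, the `betasStored` flag (standing in for whether the `sdkBetas` store has completed when the window is computed), and the two window constants are assumptions introduced for illustration.

```typescript
const PERCEIVED_DEFAULT_WINDOW = 200_000; // window assumed when the CLI does not know the model is 1M
const PERCEIVED_1M_WINDOW = 1_000_000;

// Hypothetical stand-in for the CLI's context window check.
function perceivedContextWindow(modelName: string, betasStored: boolean): number {
  // Path 1: the tag in the model name, a synchronous regex test.
  if (/\[1m\]/i.test(modelName)) return PERCEIVED_1M_WINDOW;
  // Path 2: the --betas flag, which only counts once the betas are stored.
  if (betasStored) return PERCEIVED_1M_WINDOW;
  // Path 3 (remote config) is unavailable in headless containers, so fall back.
  return PERCEIVED_DEFAULT_WINDOW;
}

// The env var is capped by whatever window the CLI perceives at that moment.
function effectiveCompactWindow(
  modelName: string,
  betasStored: boolean,
  envVar: number,
): number {
  return Math.min(perceivedContextWindow(modelName, betasStored), envVar);
}

// Race lost: betas not stored yet, so an 800000 env var is capped at 200000.
const capped = effectiveCompactWindow("claude-sonnet-4-6", false, 800_000);
// Tagged model: path 1 applies regardless of betas state, so 800000 survives.
const uncapped = effectiveCompactWindow("claude-sonnet-4-6[1m]", false, 800_000);
console.log(capped, uncapped);
```

Under this model, setting the env var alone cannot raise the window past what the CLI perceives, which is why the tag is the only path that does not depend on initialization order.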