Claude CLI Crash Report — Submission to Anthropic Engineering
AI Disclosure: Claude Sonnet 4.6 co-authored this document.
Last Review: Unreviewed
How This Was Discovered
I have been using Claude CLI in long investigation sessions with heavy MCP tool use. The CLI began crashing repeatedly in a pattern I could not explain. I started instrumenting the crashes: collecting the telemetry files the CLI writes at crash time, running macOS sample during sessions, and capturing Activity Monitor spindumps. Over two days I captured six crashes with progressively more diagnostic data.
The most important finding came from crash #5: the silent window began within one second of session startup, before I typed anything or called any tool, and the crash followed 38.8 seconds later. This overturned my initial hypothesis (that MCP tool calls were triggering the crash) and pointed to the session load/resume process itself.
All artifacts referenced in this report are local files listed in Section 7.
1. The Bug
After resuming a session with substantial accumulated context, the Claude CLI process enters a 26–39 second period of full CPU utilization with no output, then dies in a cascade of tengu_uncaught_exception events. The crash is 100% reproducible on resumed sessions from this investigation.
The silent window is not a hang — profiling shows the main event loop thread is fully active the entire time, executing synchronous JavaScript that opens and closes files in a tight loop. All 11 Bun worker pool threads are idle. The event loop is blocked and cannot process user input, I/O callbacks, or timers during this window.
Memory statistics at crash time show heapUsed > heapTotal in every telemetry record that includes heap fields (crashes #1–4). This state should be impossible under normal JavaScriptCore operation and indicates that the runtime's internal accounting has already broken down before the exception cascade fires.
2. Environment
| Field | Value |
| --- | --- |
| Platform | macOS 15.7.4 (24G517), darwin arm64 |
| Hardware | 64 GB RAM; 0 bytes swap at crash time; ~37 GB free+reclaimable |
| Claude CLI versions | 2.1.69, 2.1.70, 2.1.71; crash occurs across all three |
| Binary | Compiled arm64 bun 1.2.19 executable; JS engine is JavaScriptCore |
| Active betas | claude-code-20250219, adaptive-thinking-2026-01-28, context-management-2025-06-27, prompt-caching-scope-2026-01-05 |
| Session type | Long-running investigation sessions with many MCP tool calls; resumed via --resume |
Note on heap metrics: heapUsed/heapTotal in Claude's telemetry come from bun's JSC-to-V8 compatibility shim, not from V8 itself. The heapUsed > heapTotal invariant violation is present in every crash record where heap fields are available, and indicates runtime accounting breakdown regardless of the shim's accuracy.
3. Evidence
3.1 All Six Crashes
All telemetry sourced from ~/.claude/telemetry/1p_failed_events.*.json.
| # | Session | Date (UTC) | CLI | Silent gap | RSS at crash | heapUsed > heapTotal | Exception events |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | b696a6a8 | 2026-03-06T20:36 | 2.1.70 | 28.7s | 1,598 MB | 710 > 257 MB | 399 |
| 2 | b696a6a8 | 2026-03-06T20:45 | 2.1.70 | 35.9s | 1,495 MB | 442 > 181 MB | 389 |
| 3 | 92064525 | 2026-03-06T00:48 | 2.1.69 | 25.6s | 1,474 MB | 300 > 176 MB | 350 |
| 4 | 62f6efbc | 2026-03-07T20:44 | 2.1.70 | 29.5s | 4,040 MB | 577 > 175 MB | 400 |
| 5 | 1594ad87 | 2026-03-07T21:40 | 2.1.71 | 38.8s | ~10,700 MB ¹ | n/a ² | 386 |
| 6 | PID 47007 | 2026-03-07T21:56 | 2.1.71 | n/a ³ | ~7,700 MB ¹ | n/a ³ | n/a ³ |
¹ Physical footprint from sample output captured during the crash window, not from telemetry.
² Crash #5 telemetry event sequence (tengu_native_* startup events) does not include heap fields.
³ No telemetry file was produced for crash #6 — the process may have been OOM-killed before cleanup ran.
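The figures in the table above can be cross-checked with a few lines of Python. This is a minimal sketch: the values are transcribed by hand from the table (crashes #1–5), and the `Crash` record and its field names are illustrative, not the telemetry schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Crash:
    number: int
    silent_gap_s: float
    heap_used_mb: Optional[int]   # None where telemetry had no heap fields
    heap_total_mb: Optional[int]


# Values transcribed from the Section 3.1 table (crash #6 has no telemetry).
CRASHES = [
    Crash(1, 28.7, 710, 257),
    Crash(2, 35.9, 442, 181),
    Crash(3, 25.6, 300, 176),
    Crash(4, 29.5, 577, 175),
    Crash(5, 38.8, None, None),
]


def gap_range(crashes):
    """Shortest and longest silent gap, in seconds."""
    gaps = [c.silent_gap_s for c in crashes]
    return min(gaps), max(gaps)


def heap_violations(crashes):
    """Map crash number to heapUsed / heapTotal where heapUsed > heapTotal."""
    return {
        c.number: round(c.heap_used_mb / c.heap_total_mb, 2)
        for c in crashes
        if c.heap_used_mb is not None and c.heap_used_mb > c.heap_total_mb
    }


if __name__ == "__main__":
    print(gap_range(CRASHES))
    print(heap_violations(CRASHES))
```

The gap range reproduces the 26–39 second window quoted in Section 1 after rounding, and all four records with heap fields violate the invariant, with heapUsed between roughly 1.7 and 3.3 times heapTotal.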
3.2 Crash #5 Eliminates MCP Tool Calls as the Trigger
The complete telemetry event sequence for crash #5 (1594ad87-…d281fd04….json):
| Time (UTC) | Event |
| --- | --- |
| 21:40:20.048 | tengu_status_line_mount |
| 21:40:20.061 | tengu_native_auto_updater_start |
| 21:40:20.108 | tengu_version_check_success |
| 21:40:20.155 | tengu_native_update_complete / tengu_native_auto_updater_success |
| 21:40:21.004 | tengu_native_version_cleanup ← last event before silence |
| (38.8 second gap, no events) | |
| 21:40:59.755 | tengu_uncaught_exception × 386 |
No user message was sent. No tool was called. The session had just started via --resume. The background pass began 1 second after process launch, immediately after the auto-updater finished its startup sequence. This rules out any hypothesis that requires a tool call or model response to trigger the pass.
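The 38.8 second figure follows directly from the timestamps in the event sequence. A minimal sketch, with the timestamps transcribed by hand from the sequence above; the `largest_gap` helper is illustrative and does not read the telemetry file itself:

```python
from datetime import datetime

# Timestamps (UTC) transcribed from the crash #5 event sequence.
EVENTS = [
    ("21:40:20.048", "tengu_status_line_mount"),
    ("21:40:20.061", "tengu_native_auto_updater_start"),
    ("21:40:20.108", "tengu_version_check_success"),
    ("21:40:20.155", "tengu_native_update_complete"),
    ("21:40:21.004", "tengu_native_version_cleanup"),
    ("21:40:59.755", "tengu_uncaught_exception"),
]


def largest_gap(events):
    """Return (gap_seconds, event_before, event_after) for the longest silence."""
    parsed = [(datetime.strptime(t, "%H:%M:%S.%f"), name) for t, name in events]
    best = (0.0, None, None)
    for (t0, n0), (t1, n1) in zip(parsed, parsed[1:]):
        gap = (t1 - t0).total_seconds()
        if gap > best[0]:
            best = (gap, n0, n1)
    return best


if __name__ == "__main__":
    gap, before, after = largest_gap(EVENTS)
    print(f"{gap:.3f}s between {before} and {after}")
```

The longest silence is 38.751 s, between tengu_native_version_cleanup and the first tengu_uncaught_exception, and it starts 0.956 s after the first recorded event.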
3.3 What Runs During the Silent Window
Source: claude-sample-164025.txt — macOS sample of PID 33425 captured at 21:40:40 UTC, midway through crash #5's 38.8s window.
- Physical footprint at sample time: 10.7 GB (peak 11.2 GB)
- Main thread samples: 124,422, continuous, never entering kevent64 (the idle/event-wait syscall)
Dominant main thread call path (collapsed):
start (dyld)
→ bun event loop (2.1.71 +0x2c9a4c)
→ JS dispatch (2.1.71 +0x2cd940, +0x1514bd0)
→ ...17 levels stripped bun binary...
→ JIT-compiled JS (0x11dxxxxxxx — per-session JIT addresses)
→ open (libsystem_kernel) 3,254 samples
→ close (libsystem_kernel) 1,023 samples
The JIT addresses (0x11dxxxxxxx) are compiled JavaScript, not native bun code. The operation is a JavaScript function that opens and closes files in a tight synchronous loop.
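For a rough sense of scale, the leaf counts can be expressed as a share of main-thread samples. This sketch assumes that 124,422 is the main-thread total and that the open and close counts are directly comparable to it, which the collapsed excerpt does not confirm; the helper name is illustrative.

```python
MAIN_THREAD_SAMPLES = 124_422
LEAF_SAMPLES = {"open": 3_254, "close": 1_023}


def leaf_share(leaf_counts, total):
    """Percentage of total samples for each leaf, plus the combined share."""
    shares = {name: 100.0 * n / total for name, n in leaf_counts.items()}
    shares["combined"] = 100.0 * sum(leaf_counts.values()) / total
    return shares


if __name__ == "__main__":
    for name, pct in leaf_share(LEAF_SAMPLES, MAIN_THREAD_SAMPLES).items():
        print(f"{name}: {pct:.2f}%")
```

Under that assumption the two syscalls together account for about 3.4% of main-thread samples. The remaining samples would sit in JavaScript and runtime frames above these leaves, which the collapsed path does not break down.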
All other threads during the same window:
| Thread | Crash window | Baseline (pre-crash session) |
| --- | --- | --- |
| Main thread | 124,422 samples, 100% active | 94% in kevent64 (idle) |
| Bun Pool 0–10 (11 threads) | 100% in __ulock_wait2 (idle) | same |
| libpas scavenger | 97.9% in __psynch_cvwait (idle); 10 samples madvise | same |
| Heap Helper Threads (3) | 97.2% idle; 64 samples GC work | same |
The background pass is single-threaded and synchronous. The bun worker pool is uninvolved.
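The effect of a synchronous open/close loop on an event loop can be illustrated with an analogy in Python's asyncio. This is not Bun or Claude CLI code; the function names and the 10 ms ticker are made up for the illustration. A periodic task stands in for timers and I/O callbacks, and it stops firing for as long as the blocking loop holds the loop thread.

```python
import asyncio
import os
import tempfile
import time


async def ticker(ticks, interval=0.01):
    """Record a timestamp every `interval` seconds while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


def blocking_open_close(path, duration):
    """Open and close a file in a tight synchronous loop for `duration` seconds."""
    deadline = time.monotonic() + duration
    count = 0
    while time.monotonic() < deadline:
        with open(path, "rb"):
            pass
        count += 1
    return count


async def demo(block_for):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0.05)                    # ticker runs normally
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "probe")
        open(path, "wb").close()
        blocking_open_close(path, block_for)     # runs on the loop thread
    await asyncio.sleep(0.05)                    # loop is free again
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return max(b - a for a, b in zip(ticks, ticks[1:]))


def max_tick_gap(block_for=0.3):
    """Longest interval between two ticks, in seconds."""
    return asyncio.run(demo(block_for))


if __name__ == "__main__":
    print(f"longest tick gap: {max_tick_gap():.3f}s")
```

The longest tick gap is at least as long as the blocking call, even though the thread is busy the whole time. That is the same signature as the sample above: a fully active main thread that never returns to its event-wait syscall.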
Crash #6 sample (sample-of-2.1.71-1.txt, PID 47007, 7.7 GB footprint, 9m51s into session):
Same synchronous main thread pattern, plus: posix_spawn (20 samples) + __socketpair (48 samples) + setsockopt / fcntl / __close_nocancel — consistent with spawning and configuring child processes (MCP server startup) as part of the same pass.
3.4 RSS Accumulates With Each Resume
Each --resume reloads the context accumulated before the previous crash, which is larger each time. In the later resumed sessions, memory then grows far faster per minute of uptime than it did in the original session:
| Crash | Process uptime | Physical footprint |
| --- | --- | --- |
| #1 | 2.3 h | 1,598 MB at crash |
| #2 | 3.5 min (resumed) | 1,495 MB at crash |
| #3 | 7.4 h | 1,474 MB at crash |
| #4 | 3.86 h | 4,040 MB at crash |
| #5 | 5.3 min (resumed) | 10,700 MB at sample |
| #6 | 9.8 min (resumed) | 7,700 MB at sample |
A session that took 3.86 hours to accumulate 4,040 MB (about 3.94 GiB, crash #4) reached 10,700 MB in about 5 minutes when resumed as crash #5. The context loaded at resume is the relevant driver, not wall-clock uptime.
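The comparison can be made concrete as an average growth rate. This is a back-of-the-envelope sketch that assumes footprint grew from near zero at process start, which the report does not measure; the helper name is illustrative.

```python
def avg_growth_mb_per_min(footprint_mb, uptime_min):
    """Average footprint per minute of uptime (assumes growth from ~0 at start)."""
    return footprint_mb / uptime_min


# Figures from the table in Section 3.4.
crash4 = avg_growth_mb_per_min(4_040, 3.86 * 60)   # 3.86 h uptime
crash5 = avg_growth_mb_per_min(10_700, 5.3)        # 5.3 min uptime
ratio = crash5 / crash4

if __name__ == "__main__":
    print(f"crash #4: {crash4:.1f} MB/min")
    print(f"crash #5: {crash5:.1f} MB/min")
    print(f"ratio:    {ratio:.0f}x")
```

Crash #4 averages roughly 17 MB per minute and crash #5 roughly 2,000 MB per minute, a difference of about two orders of magnitude.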
4. Chain of Reasoning
The background pass triggers on session load.
Crash #5 telemetry: last event is tengu_native_version_cleanup at T+1s; no user message or tool call precedes the 38.8s gap. The pass is triggered by session initialization, not by user interaction.
The pass runs synchronously on the main event loop thread.
claude-sample-164025.txt: 124,422 consecutive main thread samples with no kevent64 calls. All 11 Bun pool threads are idle throughout. The bun event loop cannot dispatch work while this runs.
The pass performs intensive filesystem I/O via JIT-compiled JavaScript.
Leaf syscalls in the main thread sample are open (3,254 samples) and close (1,023 samples), reached through JIT-compiled code at 0x11dxxxxxxx addresses. The operation is JavaScript, not native C++.
Memory grows in proportion to session context size.
Physical footprint reaches 10.7 GB in 5 minutes (crash #5) vs. 4.0 GB after 3.86 hours (crash #4). The main difference between those runs was the amount of context loaded at resume; the CLI version also changed from 2.1.70 to 2.1.71, but the crash occurs on both.
The JS runtime's accounting breaks down before the crash.
heapUsed > heapTotal in every telemetry record with heap fields (crashes #1–4). This invariant cannot be violated under normal JSC operation. The runtime is in an internally inconsistent state when the exception cascade fires.
context-management-2025-06-27 is the most likely candidate.
Present in all six crash sessions. Not in the default beta list. Name suggests context processing. Timing is consistent with a session-load context pass. Confidence: probable. Claude cannot rule out other betas without symbols or source.
5. Reproduction
Trigger: Resume any sufficiently large session via --resume.
Expected result of the reproduction: the process crashes 26–39 seconds after startup, with no user action required.
Confirmed conditions:
- Claude CLI 2.1.69–2.1.71 (persists across auto-update)
- Beta context-management-2025-06-27 active
- Session has accumulated substantial context from extended MCP tool use
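The minimum context threshold is not determined. Crash #2 (b696a6a8 B) crashed after 3.5 minutes resuming 67K cached tokens, while a fresh v2.1.71 session at 31 minutes / 95K cached tokens did not crash. Because the larger session survived, cached token count alone does not bracket a threshold; the surviving session was also fresh rather than resumed, which is consistent with resume being the trigger.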
Workaround: Do not --resume sessions that have accumulated large context. Use /compact before context grows large. Start fresh sessions (/new) at task boundaries.
6. Prior Art Search
Claude searched anthropics/claude-code GitHub issues before filing. The three closest matches found, and why each is distinct:
| Issue | Title | Similarity | Key distinction |
| --- | --- | --- | --- |
| #24644 (closed duplicate) | Memory leak: CLI grows to 44 GB+ RAM with GC thrashing | macOS, --resume trigger, high RSS | Root cause is large toolUseResult.stdout data in a 67 MB JSONL file (670× memory amplification); GC thrashing over minutes, not a <40s synchronous crash at load; no beta flag identified |
| #1421 | Recurring crashes: JavaScript Heap Out of Memory while 'thinking' | macOS, JS heap OOM crash | Node.js/V8 runtime (not Bun/JSC); crashes during active tool execution, not at session load; no synchronous blocking pass identified |
| #18880 | claude --resume crashes on killed sessions | --resume trigger, startup crash | Root cause is a corrupt JSONL from a hard-killed session ("No messages returned" error); Linux only; no memory involvement |
Searches performed: "crash session resume", "tengu_uncaught_exception", "heapUsed heapTotal memory OOM macOS", "context-management beta flag crash", "RSS memory accumulation resume macOS", "synchronous event loop session load macOS crash".
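One way to re-run these searches is through GitHub's REST issue-search endpoint. The report does not say how the original searches were made, so this is only a sketch of an equivalent query; the `search_url` helper is hypothetical and builds the request URL without sending it.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/issues"
REPO = "anthropics/claude-code"

# Search phrases as listed in this section.
QUERIES = [
    "crash session resume",
    "tengu_uncaught_exception",
    "heapUsed heapTotal memory OOM macOS",
    "context-management beta flag crash",
    "RSS memory accumulation resume macOS",
    "synchronous event loop session load macOS crash",
]


def search_url(query, repo=REPO):
    """Build an issue-search URL restricted to one repository."""
    q = f"{query} repo:{repo} is:issue"
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': q})}"


if __name__ == "__main__":
    for query in QUERIES:
        print(search_url(query))
```

Each printed URL can be opened with any HTTP client; unauthenticated requests to this endpoint are rate-limited.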
7. Files to Send Anthropic
Essential
| File | Signal |
| --- | --- |
| ~/.claude/telemetry/1p_failed_events.1594ad87-…d281fd04….json | Crash #5: proves pass triggers at session start with no tool call |
| ~/Desktop/claude-crash-20260307-161848/claude-sample-164025.txt | 6.1 MB sample captured during crash #5 window: open/close loop |
| ~/Desktop/sample-of-2.1.71-1.txt | Sample during crash #6: adds posix_spawn/socketpair finding |
| ~/Desktop/Spindump.txt | Activity Monitor spindump during crash #5 |
Supporting
| File | Signal |
| --- | --- |
| ~/.claude/telemetry/1p_failed_events.62f6efbc-…87848655….json | Crash #4: full heap fields at 3.94 GB RSS |
| ~/.claude/telemetry/1p_failed_events.b696a6a8-…2d6d60f3….json | Crash #1: earliest event; heapUsed 710 > 257 MB heapTotal |
| ~/.claude/telemetry/1p_failed_events.b696a6a8-…eed481f9….json | Crash #2: resumed session, 3.5 min uptime, same pattern |
| ~/.claude/telemetry/1p_failed_events.92064525-…74061b8b….json | Crash #3: claude-opus-4-6 model, confirms model is not the variable |
| ~/Desktop/claude-crash-20260307-161848/claude-sample-163429.txt | Pre-crash baseline sample: main thread 94% idle for comparison |
Omit
- Session JSONL files (contain conversation content; available on request if useful for context size data)
- crash-timeline.txt (RSS monitor; data summarized in Section 3.4)
- sample-of-2.1.71-2.txt (identical to sample-1; Activity Monitor exported the same sample twice)
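Before sending, the telemetry files can be sanity-checked against the exception counts in Section 3.1. The sketch below makes no assumption about the JSON schema: it counts raw occurrences of the event name in each file, so it will overcount if the name appears more than once per event. The function name is illustrative.

```python
from pathlib import Path


def count_exception_events(telemetry_dir, marker="tengu_uncaught_exception"):
    """Map each 1p_failed_events.*.json file name to its raw count of `marker`."""
    counts = {}
    directory = Path(telemetry_dir).expanduser()
    for path in sorted(directory.glob("1p_failed_events.*.json")):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts[path.name] = text.count(marker)
    return counts


if __name__ == "__main__":
    for name, n in count_exception_events("~/.claude/telemetry").items():
        print(f"{n:5d}  {name}")
```

A count that differs widely from the table would suggest the wrong file was picked up or the file was truncated.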
Version History
| Date | Description | Changes | Review |
| --- | --- | --- | --- |
| 2026-Mar-07 | Initial submission draft, condensed from working investigation report | +185 lines | Unreviewed |
| 2026-Mar-07 | Added prior art search (§6); renumbered files section to §7 | +18 / -2 lines | Unreviewed |