fix(cron): disable auto_save for cron agent jobs to prevent recursive memory bloat #5664

Merged
theonlyhennygod merged 2 commits into zeroclaw-labs:master from guitaripod:fix/cron-autosave-memory-bloat
Apr 12, 2026

@guitaripod
Contributor

Summary

Cron agent jobs inherit the global auto_save = true memory config. When the cron scheduler prepends [Memory context] to the prompt (from recalled memories), the agent loop's auto-save persists the entire enriched prompt — including the [Memory context] wrapper — back into brain.db as a Conversation memory.

On the next cron run, the agent loop's build_context() recalls that saved entry (it has no Conversation category filter, unlike the cron scheduler's own recall at L286), wraps it in another [Memory context], and auto-saves an even larger entry. This creates exponential growth.
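The feedback loop described above can be reproduced with a toy model. This is a minimal sketch, not the project's code: the function name `simulate_entry_sizes`, the in-memory `Vec<String>` standing in for brain.db, and the assumption that every stored entry is recalled on each run are all hypothetical simplifications.

```rust
/// Toy model of the save-recall-save cycle (hypothetical; not the real scheduler).
/// Each run recalls every stored entry, wraps it in a `[Memory context]` block,
/// appends the cron prompt, and "auto-saves" the enriched prompt back to the store.
/// Returns the byte size of each saved entry, oldest first.
fn simulate_entry_sizes(runs: usize, prompt: &str) -> Vec<usize> {
    // Stand-in for brain.db: a plain list of saved Conversation entries.
    let mut store: Vec<String> = Vec::new();
    for _ in 0..runs {
        // Assumption: recall returns all stored entries, with no category filter.
        let recalled = store.join("\n");
        // The wrapper comes first and the cron prefix second, as in the bug report.
        let enriched = format!("[Memory context]\n{recalled}\n[cron:job] {prompt}");
        // Auto-save persists the whole enriched prompt, wrapper included.
        store.push(enriched);
    }
    store.iter().map(String::len).collect()
}
```

In this model each saved entry contains all earlier entries plus the fixed wrapper, so every entry is at least twice the size of the one before it. A real recall step with ranking or limits would grow more slowly, but the shape of the growth is the same.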

Root cause

The existing should_skip_autosave_content() guard catches messages starting with [cron:, but the cron scheduler prepends memory context before the cron prefix:

// scheduler.rs L306
let prefixed_prompt = format!("{memory_context}[cron:{} {name}] {prompt}", job.id);
//                             ^^^^^^^^^^^^^^^ starts with [Memory context], not [cron:

So the skip check fails and auto-save proceeds.
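To make the failure concrete, here is a minimal sketch of a prefix-based guard and the prompt layout quoted above. The helper names `skips_autosave_before_fix` and `build_prefixed_prompt` are hypothetical, and the guard body is an assumption based only on the description "catches messages starting with [cron:"; the real should_skip_autosave_content() may check more than this.

```rust
/// Hypothetical stand-in for the pre-fix guard: it skips auto-save only when
/// the content starts with the `[cron:` prefix.
fn skips_autosave_before_fix(content: &str) -> bool {
    content.trim_start().starts_with("[cron:")
}

/// Builds the prompt in the same order as the `format!` call quoted above:
/// memory context first, then the `[cron:...]` prefix, then the prompt.
fn build_prefixed_prompt(memory_context: &str, job_id: &str, name: &str, prompt: &str) -> String {
    format!("{memory_context}[cron:{job_id} {name}] {prompt}")
}
```

With an empty `memory_context` the guard matches and the prompt is skipped. As soon as recalled memories are prepended, the string starts with `[Memory context]` and the same guard lets it through to auto-save.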

Evidence from production

brain.db after ~2 weeks of daily cron jobs:

| Entry | Size |
| --- | --- |
| user_msg_558a38ed… | 2,061,705 bytes |
| user_msg_76f44c38… | 1,048,326 bytes |
| user_msg_3257a4b4… | 533,103 bytes |
| user_msg_cf57c25f… | 271,209 bytes |
| … (15 entries total) | 4.4 MB total |

Each entry is roughly 2x the previous — classic exponential doubling from the recursive save-recall-save cycle.
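The doubling claim can be checked directly against the four sizes listed above. The sketch below only does arithmetic on those numbers; the names `OBSERVED_SIZES` and `growth_ratios` are illustrative.

```rust
/// Entry sizes in bytes, copied from the table above (largest first).
const OBSERVED_SIZES: [u64; 4] = [2_061_705, 1_048_326, 533_103, 271_209];

/// Ratio of each entry to the next-smaller one in the list.
fn growth_ratios(sizes: &[u64]) -> Vec<f64> {
    sizes
        .windows(2)
        .map(|pair| pair[0] as f64 / pair[1] as f64)
        .collect()
}
```

Calling `growth_ratios(&OBSERVED_SIZES)` gives three ratios, each about 1.97, which is consistent with every entry embedding the previous one plus a small amount of wrapper text.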

This caused the agent loop to log on every cron run:

Preemptive context trim: estimated tokens exceed budget estimated=1028403 budget=32000

After aggressive trimming, the actual cron prompt was destroyed and the provider call failed:

Claude Code exited with non-zero status 1

Fix

  1. scheduler.rs: Set auto_save = false on the cloned config before calling agent::run() for cron jobs. Cron prompts are synthetic — they should never be persisted as user conversation memories.

  2. lib.rs (defense-in-depth): Add [Memory context] to should_skip_autosave_content() so that even if another code path passes an enriched prompt to auto-save, the synthetic wrapper is caught.
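Both layers of the fix can be sketched in a few lines. This is an illustration under stated assumptions, not the merged diff: the `Config` and `MemoryConfig` structs, `cron_config_for`, and `skip_autosave_sketch` are hypothetical names, and only the `memory.auto_save` field path and the two prefixes come from this PR.

```rust
/// Hypothetical minimal config types; the real config has many more fields.
#[derive(Clone, Debug)]
struct MemoryConfig {
    auto_save: bool,
}

#[derive(Clone, Debug)]
struct Config {
    memory: MemoryConfig,
}

/// Layer 1: build a per-job copy of the config with auto-save turned off.
/// The shared config is only borrowed, so interactive sessions keep their setting.
fn cron_config_for(shared: &Config) -> Config {
    let mut cron_config = shared.clone();
    cron_config.memory.auto_save = false;
    cron_config
}

/// Layer 2 (defense-in-depth): a simplified content filter that rejects both
/// the cron prefix and the synthetic memory-context wrapper.
fn skip_autosave_sketch(content: &str) -> bool {
    let trimmed = content.trim_start();
    trimmed.starts_with("[cron:") || trimmed.starts_with("[Memory context]")
}
```

Cloning before mutating keeps the override local to the cron job, and the filter covers any other path that might hand an enriched prompt to auto-save.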

Test plan

  • Existing autosave_content_filter_drops_cron_and_distilled_noise test updated with [Memory context] case
  • All 141 cron tests pass
  • Full workspace test suite passes (6,353 tests, 0 failures)
  • cargo clippy --workspace --features ci-all -- -D warnings clean
  • Verified fix in production: purged bloated entries, restarted daemon, triggered cron catch-up — job completed successfully and delivered to Telegram with no context trim warnings

… memory bloat

Cron agent jobs inherit the global `auto_save = true` config, which
saves the enriched prompt (including the [Memory context] wrapper) back
into brain.db as a Conversation memory. On the next cron run, the agent
loop's build_context() recalls that entry (no Conversation category
filter), wraps it in another [Memory context], and saves an even larger
entry. This creates exponential growth — observed entries of 2MB, 1MB,
533K in production, totalling 4.4MB across 15 context dumps.

The bloated context causes `estimated tokens exceed budget` on every
cron run, aggressive trimming destroys the actual prompt, and the
provider call fails with exit code 1.

Fix: set `auto_save = false` on the cloned config before calling
agent::run() for cron jobs. Also add `[Memory context]` to the
should_skip_autosave_content filter as defense-in-depth.
@theonlyhennygod
Collaborator

Agent Review — PR #5664

Triage Result: Skipped — High-Risk Path

Comprehension Summary: This PR disables auto_save on the cloned config before passing it to agent::run() for cron jobs (in crates/zeroclaw-runtime/src/cron/scheduler.rs) and adds [Memory context] to the should_skip_autosave_content() filter in crates/zeroclaw-memory/src/lib.rs as defense-in-depth. The fix addresses a recursive memory bloat cycle where cron job prompts, enriched with memory context, were being auto-saved and then recalled on subsequent runs — causing exponential growth in brain.db.

Why skipped: This PR modifies crates/zeroclaw-runtime/src/cron/scheduler.rs, which falls under a high-risk path (crates/zeroclaw-runtime/src/**) per AGENTS.md risk tiers. The PR is not primarily a docs change. Per the review protocol, high-risk non-docs PRs require human maintainer review. The agent does not process these.

Initial observations (for the maintainer who picks this up):

  • The fix is a 2-file, 8-line change. Conceptually clean — cron prompts are synthetic and should not be auto-saved.
  • CI is fully green (all checks pass including CI Required Gate).
  • The PR template is thorough with clear root cause analysis, production evidence (exponential doubling in brain.db), and a well-structured test plan (all 5 checkboxes complete).
  • Test coverage includes a new [Memory context] assertion in the existing autosave_content_filter_drops_cron_and_distilled_noise test.
  • Missing labels: no risk:*, no size:* labels. Suggested: risk: high, size: XS.
  • Potential overlap: PRs #5631 (feat(memory): add is_user_autosave_key detector for per-turn user message keys) and #5632 (fix(memory): skip user autosave keys in all memory context paths) are open and touch memory context handling in different code paths (src/memory/mod.rs, src/agent/loop_.rs, src/agent/memory_loader.rs). They address different facets of the same domain and are not duplicates, but the maintainer should verify they compose correctly.

| Field | Content |
| --- | --- |
| PR | #5664: fix(cron): disable auto_save for cron agent jobs to prevent recursive memory bloat |
| Author | @guitaripod |
| Summary | Disable auto_save for cron agent jobs; add [Memory context] to autosave content filter. Fixes exponential memory bloat in brain.db. |
| Action | Skipped: high-risk path (runtime crate) |
| Reason | crates/zeroclaw-runtime/src/cron/scheduler.rs is a high-risk path per AGENTS.md |
| Security/performance | Performance improvement: eliminates exponential memory growth in cron scenarios |
| Notes | Needs human maintainer review. Related open PRs #5631/#5632 touch adjacent memory code paths. |

@theonlyhennygod self-assigned this Apr 12, 2026
@theonlyhennygod
Collaborator

Agent Review — Ready to Merge

Comprehension summary: This PR fixes recursive memory bloat in cron agent jobs. Cron jobs inherit auto_save = true from global config, causing the agent loop to persist the entire enriched prompt (including [Memory context] wrapper) back to brain.db. On the next cron run, this saved entry is recalled and wrapped again, creating exponential doubling (~2x per run, reaching 2MB+ entries within 2 weeks). The fix is two-layered: (1) scheduler.rs sets auto_save = false on the cloned config before calling agent::run() for cron jobs, and (2) lib.rs adds [Memory context] to should_skip_autosave_content() as defense-in-depth. Blast radius: cron job memory behavior only; no impact on interactive sessions.

Thank you, @guitaripod. This is an excellent bug fix with a thorough root cause analysis, production evidence, and a clean implementation.

What was reviewed and verified:

  • Code correctness: The primary fix (cron_config.memory.auto_save = false) is the right approach — cron prompts are synthetic and should never be persisted. The defense-in-depth addition to should_skip_autosave_content() catches edge cases where enriched prompts might reach auto-save through other code paths.
  • Regression analysis: No existing behavior changes for interactive sessions. auto_save is only disabled on the cloned config for cron jobs. The [Memory context] skip check only affects auto-save filtering, not memory recall itself.
  • Test coverage: Existing test autosave_content_filter_drops_cron_and_distilled_noise updated with [Memory context] case. Full workspace suite passes (6,353 tests).
  • Privacy/data hygiene: No PII, no identity leakage. Test data uses system-scoped placeholders.
  • Architecture alignment: Clean separation — config clone prevents mutation of shared state. Follows existing patterns in the codebase.

Security/performance assessment:

  • Security: No security impact. No changes to access control, input validation, or attack surface.
  • Performance: Positive impact — prevents exponential memory growth that was causing context trim failures and provider call errors.

CI Status: All checks pass (both CI and Quality Gate workflows, including CI Required Gate).

Missing labels: No risk:* or size:* labels were applied by automation. This should be risk: medium (touches cron scheduler + memory), size: XS (8+1 lines changed across 2 files).

PR template: Not filled out in standard template format, but the PR body contains equivalent information for all required sections (summary, validation, security, rollback, risks/mitigations, blast radius).

This PR is ready for maintainer merge.


| Field | Content |
| --- | --- |
| PR | #5664: fix(cron): disable auto_save for cron agent jobs to prevent recursive memory bloat |
| Author | @guitaripod |
| Summary | Prevents exponential memory bloat from cron jobs by disabling auto_save and adding defense-in-depth content filtering |
| Action | Ready to merge |
| Reason | Clean fix, thorough analysis, all CI green, no regressions, no outstanding findings |
| Security/performance | No security impact; positive performance impact (prevents memory explosion) |
| Changes requested | None |
| Architectural notes | Config clone pattern is correct for per-job overrides. Defense-in-depth on content filter is good practice. |
| Tests | 6,353 tests pass; existing test updated with new case |
| Notes | Production evidence of the bug (4.4MB of bloated entries) adds confidence in the diagnosis |

@theonlyhennygod added the agent-approved label (PR approved by automated review agent) Apr 12, 2026
@theonlyhennygod merged commit c7bc6ef into zeroclaw-labs:master Apr 12, 2026
20 checks passed
@github-project-automation bot moved this from Backlog to Shipped in ZeroClaw Project Board Apr 12, 2026
@guitaripod deleted the fix/cron-autosave-memory-bloat branch April 12, 2026 20:01
vernonstinebaker added a commit to vernonstinebaker/zeroclaw that referenced this pull request Apr 19, 2026
Raw per-turn user messages are stored under 'user_msg' / 'user_msg_*' keys
by the auto-save path. Without this fix, all three context-building callers
were recalling and injecting these entries back into the LLM context window,
causing exponential bloat: each new turn recalled the previous turn's full
message (which itself contained all prior turns), growing unboundedly.

Wire is_user_autosave_key() (introduced in the preceding commit) into:
- build_context() in zeroclaw-runtime/src/agent/loop_.rs
- DefaultMemoryLoader::load_context() in memory_loader.rs
- should_skip_memory_context_entry() in zeroclaw-channels orchestrator

Placement is consistent across all three callers: after is_assistant_autosave_key
and before should_skip_autosave_content, maintaining filter ordering convention.

Also renames the pass-through key in the existing build_context test from
'user_msg_real' (which would now be filtered) to 'user_preference', and adds
three new tests — one per caller — verifying user_msg_* keys are excluded
while non-prefixed semantic keys (user_preference, user_fact) pass through.

Complementary to zeroclaw-labs#5664 (disabled auto_save on the cron write path); this
PR addresses the read path across all context-assembly callers.
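
The read-path filter described in this commit message can be illustrated with a short sketch. The bodies below are assumptions inferred from the 'user_msg' / 'user_msg_*' key description; `is_user_autosave_key_sketch` and `filter_context_entries` are hypothetical names, and the real is_user_autosave_key() may differ.

```rust
/// Hypothetical detector: true for the raw per-turn keys `user_msg` and `user_msg_*`.
fn is_user_autosave_key_sketch(key: &str) -> bool {
    key == "user_msg" || key.starts_with("user_msg_")
}

/// Drops auto-saved user messages from recalled `(key, content)` pairs before
/// they are injected into the context window; semantic keys pass through.
fn filter_context_entries<'a>(entries: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    entries
        .iter()
        .copied()
        .filter(|(key, _)| !is_user_autosave_key_sketch(key))
        .collect()
}
```

Matching on the `user_msg_` prefix with the trailing underscore keeps semantic keys such as `user_preference` and `user_fact` out of the filter, which is the behaviour the new tests in the commit are said to verify.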
