-
Notifications
You must be signed in to change notification settings - Fork 4.6k
[Feature]: Local-First Mode for Small Models — Compact No-Tools Prompting, Strict Parser Option, and No Prompt-Leakage #5287
Copy link
Copy link
Open
Labels
agentAuto scope: src/agent/** changed.Auto scope: src/agent/** changed.enhancementNew feature or requestNew feature or requestpriority:p2Medium priorityMedium priorityproviderAuto scope: src/providers/** changed.Auto scope: src/providers/** changed.provider: ollamaAuto module: provider/ollama changed.Auto module: provider/ollama changed.risk: highAuto risk: security/runtime/gateway/tools/workflows.Auto risk: security/runtime/gateway/tools/workflows.runtimeAuto scope: src/runtime/** changed.Auto scope: src/runtime/** changed.securityAuto scope: src/security/** changed.Auto scope: src/security/** changed.status:acceptedRFC or work item accepted and ratified by the team.RFC or work item accepted and ratified by the team.status:no-staleExempt from the 60-day stale auto-close policy.Exempt from the 60-day stale auto-close policy.toolAuto scope: src/tools/** changed.Auto scope: src/tools/** changed.
Metadata
Metadata
Assignees
Labels
agentAuto scope: src/agent/** changed.Auto scope: src/agent/** changed.enhancementNew feature or requestNew feature or requestpriority:p2Medium priorityMedium priorityproviderAuto scope: src/providers/** changed.Auto scope: src/providers/** changed.provider: ollamaAuto module: provider/ollama changed.Auto module: provider/ollama changed.risk: highAuto risk: security/runtime/gateway/tools/workflows.Auto risk: security/runtime/gateway/tools/workflows.runtimeAuto scope: src/runtime/** changed.Auto scope: src/runtime/** changed.securityAuto scope: src/security/** changed.Auto scope: src/security/** changed.status:acceptedRFC or work item accepted and ratified by the team.RFC or work item accepted and ratified by the team.status:no-staleExempt from the 60-day stale auto-close policy.Exempt from the 60-day stale auto-close policy.toolAuto scope: src/tools/** changed.Auto scope: src/tools/** changed.
Type
Projects
Status
Backlog
Summary
ZeroClaw would benefit from a compact local-model mode that reduces prompt bloat, disables permissive fallback parsing, and prevents internal tool/system instructions from leaking into user-visible output.
Problem statement
It solves a real local-first user pain: when someone uses ZeroClaw with a smaller Ollama-hosted model, they want the agent to respond cleanly and cheaply on simple supervised tasks, but the current runtime adds too much prompt overhead and too many permissive fallback behaviors, which can lead to slower responses, hangs, bogus tool behavior, and even internal tool/system text leaking into the final answer.
Current behavior is insufficient because:
• trivial no-tools prompts still carry a large tool/system preamble
• that wastes context and inference budget on local models
• fallback parsing is loose enough to create instability
• internal runtime text can appear in user-visible output
• this makes local models feel less reliable than they should in the exact local-first workflows ZeroClaw could otherwise serve well
Proposed solution
Preferred behavior:
• when ZeroClaw is running against a local/smaller model, it should offer a compact local-model mode that minimizes prompt overhead and keeps no-tools turns simple, fast, and deterministic
• plain-response turns should not include large tool-policy blocks unless tool use is actually enabled for that turn
• internal truncation markers and tool/system scaffolding should never be visible to the model in a way that can leak back to the user
• fallback parsing should be optionally strict, so only native/explicit tool calls are honored and permissive text-to-tool inference is disabled
Preferred interfaces:
• a config flag such as local_compact_mode = true
• a parser setting such as strict_tool_parsing = true
• an option like suppress_tool_instructions_when_no_tools = true
• a guarantee that truncation markers are handled internally, not inserted into model-visible prompt text
• optionally, a built-in preset like runtime_profile = "ollama_local" for smaller local models
Non-goals / out of scope
No response
Alternatives considered
No response
Acceptance criteria
No response
Architecture impact
No response
Risk and rollback
No response
Breaking change?
No
Data hygiene checks