[Feature]: Local-First Mode for Small Models — Compact No-Tools Prompting, Strict Parser Option, and No Prompt-Leakage #5287

@ThirDecade2020

Description

@ThirDecade2020

Summary

ZeroClaw would benefit from a compact local-model mode that reduces prompt bloat, disables permissive fallback parsing, and prevents internal tool/system instructions from leaking into user-visible output.

Problem statement

This addresses a real local-first pain point. When someone runs ZeroClaw against a smaller Ollama-hosted model, they want the agent to respond cleanly and cheaply on simple supervised tasks. Today, however, the runtime adds substantial prompt overhead and relies on permissive fallback behaviors, which can cause slow responses, hangs, spurious tool invocations, and even internal tool/system text leaking into the final answer.

Current behavior is insufficient because:
• trivial no-tools prompts still carry a large tool/system preamble
• that wastes context and inference budget on local models
• fallback parsing is permissive enough that plain prose can be misread as tool calls, creating instability
• internal runtime text can appear in user-visible output
• this makes local models feel less reliable than they should in the exact local-first workflows ZeroClaw could otherwise serve well

Proposed solution

Preferred behavior:
• when ZeroClaw is running against a local/smaller model, it should offer a compact local-model mode that minimizes prompt overhead and keeps no-tools turns simple, fast, and deterministic
• plain-response turns should not include large tool-policy blocks unless tool use is actually enabled for that turn
• internal truncation markers and tool/system scaffolding should never be visible to the model in a way that can leak back to the user
• fallback parsing should be optionally strict, so only native/explicit tool calls are honored and permissive text-to-tool inference is disabled
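To make the strict-parsing point concrete, here is a minimal sketch of the distinction between honoring only explicit tool calls and permissively inferring them from prose. The function name, the JSON tool-call shape, and the fallback heuristic are all illustrative assumptions for this issue, not ZeroClaw's actual parser:

```python
import json
import re

def parse_tool_calls(text: str, strict: bool = True):
    """Hypothetical sketch: collect explicit tool-call JSON objects from
    model output. In strict mode, prose is never promoted to a tool call."""
    calls = []
    # Explicit calls: well-formed JSON objects carrying a "tool" key.
    for candidate in re.findall(r"\{[^{}]*\}", text):
        try:
            obj = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and "tool" in obj:
            calls.append(obj)
    if calls or strict:
        # Strict mode: return only explicit calls, possibly none at all.
        return calls
    # Permissive fallback (what strict_tool_parsing would disable):
    # guess a tool name from loose natural-language phrasing.
    guess = re.search(r"run the (\w+) tool", text, re.IGNORECASE)
    return [{"tool": guess.group(1)}] if guess else []
```

With `strict=True`, the phrase "please run the search tool" yields no tool calls; with `strict=False`, the permissive fallback would invent one, which is exactly the instability described above.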

Preferred interfaces:
• a config flag such as local_compact_mode = true
• a parser setting such as strict_tool_parsing = true
• an option like suppress_tool_instructions_when_no_tools = true
• a guarantee that truncation markers are handled internally, not inserted into model-visible prompt text
• optionally, a built-in preset like runtime_profile = "ollama_local" for smaller local models
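Taken together, the proposed knobs might look like this in a config file. All names and the section layout are proposals from this issue, not existing ZeroClaw settings:

```toml
# Proposed settings; none of these exist yet.
[runtime]
runtime_profile = "ollama_local"   # preset tuned for smaller local models

[local]
local_compact_mode = true          # minimize prompt overhead on simple turns

[parser]
strict_tool_parsing = true         # honor only native/explicit tool calls

[prompt]
suppress_tool_instructions_when_no_tools = true  # no tool preamble when tools are off
```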

Non-goals / out of scope

No response

Alternatives considered

No response

Acceptance criteria

No response

Architecture impact

No response

Risk and rollback

No response

Breaking change?

No

Data hygiene checks

  • I removed personal/sensitive data from examples, payloads, and logs.
  • I used neutral, project-focused wording and placeholders.

Metadata

Assignees

Labels

• agent (Auto scope: src/agent/** changed.)
• enhancement (New feature or request)
• priority:p2 (Medium priority)
• provider (Auto scope: src/providers/** changed.)
• provider: ollama (Auto module: provider/ollama changed.)
• risk: high (Auto risk: security/runtime/gateway/tools/workflows.)
• runtime (Auto scope: src/runtime/** changed.)
• security (Auto scope: src/security/** changed.)
• status:accepted (RFC or work item accepted and ratified by the team.)
• status:no-stale (Exempt from the 60-day stale auto-close policy.)
• tool (Auto scope: src/tools/** changed.)

Type

No type

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests