Closed
Commits
22 commits
a237a35
fix(tui): honor nested agent max turns
Apr 25, 2026
aa32bdc
wip: delegation system v2 partial - blocked on delegate_task
Apr 25, 2026
e2cd514
fix: prefer runtime oauth delegation over direct base_url
Apr 24, 2026
add8c74
feat: hardcode kael delegation routing
Apr 25, 2026
5aab5fb
wip: skill selector improvements - mid-task before tmux kill
Apr 25, 2026
86e34e0
fix(delegation): replace raw codex node spawns with native hermes pat…
Apr 25, 2026
b748463
wip: raise delegation concurrency caps and reconcile native codex config
Apr 27, 2026
eb6de73
wip: tighten kael delegation router with validation and edge-case cov…
Apr 27, 2026
dbf7151
wip: add realistic skill selector validation corpus and regression suite
Apr 27, 2026
13b7345
wip: add tracked enzyme daemon systemd units
Apr 27, 2026
acf9344
wip: add live skill selector evaluation cases
Apr 27, 2026
77ba310
feat(rotation): cap heap dumps to last 3
Apr 27, 2026
6b2df2d
feat(rotation): enforce checkpoint cap at 10 with size limit
Apr 27, 2026
55d5185
fix: inherit main context override for compression feasibility
May 1, 2026
27725f8
fix: align codex child routing with live 272k ceiling
May 1, 2026
b6a64e6
feat: add safe runtime exporter for audit pipeline
May 2, 2026
98b8d37
fix: preserve runtime export root contract
May 4, 2026
2d59ec2
test: align delegation tests after upstream rebase
May 5, 2026
b015c66
fix(tui): type heap dump dirents for node 25
May 5, 2026
d901c86
fix: make agent-created skills curator-visible
May 5, 2026
9962256
feat: make background agents full-access orchestrators
May 5, 2026
12ee538
[verified] fix: make curator deletes recoverable
May 6, 2026
12 changes: 12 additions & 0 deletions AGENTS.md
@@ -13,6 +13,18 @@ source .venv/bin/activate # or: source venv/bin/activate
`$HOME/.hermes/hermes-agent/venv` (for worktrees that share a venv with the
main checkout).

## Local fork doctrine note

This checkout also carries Kael's hardcoded delegation router.

Primary references:
- `scripts/kael_delegation_router.py`
- `KAEL-DELEGATION-ROUTING-V2.md`
- `~/.hermes/skills/delegation-routing-v2/SKILL.md`
- `/home/ubuntu/business/reports/delegation-system-hardcoded-2026-04-25.md`

When work in this repo involves deciding whether to keep execution in the parent, spawn a `gpt-5.5` specialist, use Codex CLI for long-context work, or launch a `gpt-5.4` orchestrator child, follow that router rather than ad-hoc judgment.

## Project Structure

File counts shift constantly — don't treat the tree below as exhaustive.
132 changes: 132 additions & 0 deletions KAEL-DELEGATION-ROUTING-V2.md
@@ -0,0 +1,132 @@
# Kael Delegation Routing v2

This fork carries a hardcoded Kael routing helper at `scripts/kael_delegation_router.py`.

It is not generic theory. It encodes the live lane split Kael is supposed to use in this environment:
- parent on `gpt-5.4`
- bounded specialist leaves on `gpt-5.5`
- long-context clean-room probes via Codex CLI
- multi-domain local synthesis via `gpt-5.4` orchestrator CLI children

## Live doctrine pointers
- Live skill: `~/.hermes/skills/delegation-routing-v2/SKILL.md`
- Live config: `~/.hermes/config.yaml` under `delegation` and `kael_delegation`
- Live spec: `/home/ubuntu/business/reports/delegation-system-hardcoded-2026-04-25.md`
- Prior architecture grounding: `/home/ubuntu/business/reports/subagent-kael-delegation-architecture-2026-04-24.md`
- Fork workflow: `/home/ubuntu/business/reports/hermes-fork-workflow-2026-04-25.md`

## Lane summary

### 1. Parent
- lane: `parent`
- model: `gpt-5.4`
- provider: `openai-codex`
- keeps: `STATE.md` writes, final synthesis, ship/no-ship judgments, irreversible actions
- rule: never delegate work that fits in 1-2 tool calls

### 2. gpt-5.5 specialist leaf
- lane: `gpt55_specialist`
- model: `gpt-5.5`
- provider: `openai-codex`
- role: `leaf`
- best for: bounded reasoning, code review, structured JSON/markdown output, classification, summarization, inspection
- default toolsets:
- code → `['file', 'terminal']`
- research → `['file', 'web']`
- inspection → `['file']`
- timeout policy: retry once, then mark partial

### 3. Codex CLI long-context lane
- lane: `codex_cli_long_context`
- role: `standalone_process`
- best for: 5+ files, >300k tokens, clean-room probes, large corpus analysis, write-allowed generation in isolation
- commands:
- read-only → `codex exec --skip-git-repo-check --sandbox read-only --ephemeral '<prompt>'`
- workspace-write → `codex exec --skip-git-repo-check --sandbox workspace-write --ephemeral '<prompt>'`
- json → `codex exec --skip-git-repo-check --sandbox read-only --ephemeral --json '<prompt>'`
- failure policy: retry with alternate sandbox mode if appropriate

### 4. gpt-5.4 orchestrator CLI child
- lane: `gpt54_orchestrator_cli`
- model: `gpt-5.4`
- provider: `openai-codex`
- role: `orchestrator`
- best for: 2+ broad domains that each need local synthesis before parent judgment
- command:
- `hermes chat --provider openai-codex --model gpt-5.4 -s delegation-routing-v2 -Q -q '<self-contained orchestrator prompt>'`

## Hardcoded routing rules
Apply in this order:

1. **A — shared state / irreversible action** → `parent`
2. **B — single tool call or pure reasoning under 50k** → `parent`
3. **F — 2+ broad domains needing local synthesis** → `gpt54_orchestrator_cli`
4. **E — N independent subtasks** → `parallel_fanout`, with child lane chosen by subtask size
5. **D — 5+ files, >300k tokens, or clean-room** → `codex_cli_long_context`
6. **C — bounded structured work under 200k** → `gpt55_specialist`
7. Fallback → `parent`

Why `F` and `E` are checked before `D`: once decomposition is explicit, Kael should choose the correct child topology rather than collapsing everything into one oversized lane.
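The ordered rules can be sketched as a small decision function. This is a hedged illustration: `TaskProfile` and its field names are hypothetical, and the authoritative logic lives in `scripts/kael_delegation_router.py`.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Illustrative task shape; field names are assumptions, not the router's API."""
    touches_shared_state: bool = False   # rule A
    irreversible: bool = False           # rule A
    tool_calls: int = 1                  # rule B
    est_tokens: int = 0                  # rules B, C, D
    broad_domains: int = 1               # rule F
    independent_subtasks: int = 1        # rule E
    file_count: int = 1                  # rule D
    clean_room: bool = False             # rule D


def route(t: TaskProfile) -> str:
    # Rules apply strictly in order A, B, F, E, D, C, fallback.
    if t.touches_shared_state or t.irreversible:
        return "parent"                                      # A
    if t.tool_calls <= 1 and t.est_tokens < 50_000:
        return "parent"                                      # B
    if t.broad_domains >= 2:
        return "gpt54_orchestrator_cli"                      # F
    if t.independent_subtasks >= 2:
        return "parallel_fanout"                             # E
    if t.file_count >= 5 or t.est_tokens > 300_000 or t.clean_room:
        return "codex_cli_long_context"                      # D
    if t.est_tokens < 200_000:
        return "gpt55_specialist"                            # C
    return "parent"                                          # fallback
```

Checking F and E before D falls out naturally here: a multi-domain or decomposable task short-circuits before the long-context size test is ever reached.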

## Concurrency and spawn limits
- `max_concurrent_children=10`
- `max_spawn_depth=3`
- research bursts: spawn all immediately and supervise in parallel
- code review across many independent files: 1 `gpt-5.5` leaf per file when each file is bounded
- architecture decisions spanning multiple domains: 1 `gpt-5.4` orchestrator CLI child per domain, with leaves beneath it
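When a fanout exceeds the concurrency cap, the overflow has to be waved rather than spawned all at once. A minimal sketch, assuming a simple batching helper (illustrative only; the real supervision loop is not shown here):

```python
MAX_CONCURRENT_CHILDREN = 10  # mirrors max_concurrent_children above


def fanout_batches(subtasks: list, cap: int = MAX_CONCURRENT_CHILDREN) -> list:
    """Split N independent subtasks into waves no larger than the cap."""
    return [subtasks[i:i + cap] for i in range(0, len(subtasks), cap)]
```

A 23-file review burst would run as three waves of 10, 10, and 3 children, keeping every wave inside the cap while still supervising each wave in parallel.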

## Failure recovery
- `gpt-5.5` timeout → retry once, then mark partial and continue
- Codex CLI failure → inspect sandbox mode and retry with alternate sandbox when appropriate
- native `delegate_task` failure → fall back to `terminal + hermes chat`
- bad child output → do not trust silently; add a follow-up lane and annotate the gap

## Smoke-test commands

### Router examples
```bash
cd ~/.hermes/hermes-agent
./scripts/kael_delegation_router.py --examples
```

### One-shot gpt-5.5 leaf
```bash
# from a live Kael session
# delegate_task(goal="Reply exactly CHILD_OK", model={provider:"openai-codex", model:"gpt-5.5"}, toolsets=[])
```

### Codex CLI long-context lane
```bash
cd ~/.hermes/hermes-agent
codex exec --skip-git-repo-check --sandbox read-only --ephemeral "Reply exactly CODEX_OK"
```

### gpt-5.4 orchestrator CLI lane
```bash
cd ~/.hermes/hermes-agent
hermes chat --provider openai-codex --model gpt-5.4 -s delegation-routing-v2 -Q -q "Reply exactly ORCHESTRATOR_OK"
```

## Worked examples
1. **Review one module diff and return JSON findings**
- route: `gpt55_specialist`
- toolsets: `['file', 'terminal']`
2. **Read 6 docs and summarize system drift**
- route: `codex_cli_long_context`
3. **Audit 6 small files independently for style issues**
- route: `parallel_fanout` → `gpt55_specialist` children
4. **Update `STATE.md` after reading child reports**
- route: `parent`
5. **Compare config, git workflow, and doctrine docs, then decide final policy**
- route: `gpt54_orchestrator_cli`

## Testable contract
The router emits machine-readable JSON so shell checks and future wrappers can assert:
- lane choice
- model/provider
- default toolsets
- command template
- timeout/failure recovery
- fanout metadata

That makes the routing policy harder to skip than a narrative note in a report.
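A shell check or wrapper can enforce that contract by validating the emitted JSON. The field names below mirror the bullet list but are assumptions about the exact schema; run `./scripts/kael_delegation_router.py --examples` to see the real shape.

```python
import json

# Assumed required fields; adjust to the router's actual output keys.
REQUIRED_KEYS = {"lane", "model", "provider", "toolsets", "command_template"}


def check_decision(raw: str) -> dict:
    """Parse a router decision and fail loudly if the contract is violated."""
    decision = json.loads(raw)
    missing = REQUIRED_KEYS - decision.keys()
    if missing:
        raise ValueError(f"router output missing keys: {sorted(missing)}")
    return decision


sample = json.dumps({
    "lane": "gpt55_specialist",
    "model": "gpt-5.5",
    "provider": "openai-codex",
    "toolsets": ["file"],
    "command_template": "delegate_task(...)",
})
decision = check_decision(sample)
```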
10 changes: 10 additions & 0 deletions README.md
@@ -49,6 +49,16 @@ hermes # start chatting!

---

## Local fork operator note

This fork also carries a Kael-specific delegation doctrine for high-context orchestration:
- router script: `scripts/kael_delegation_router.py`
- repo doc: `KAEL-DELEGATION-ROUTING-V2.md`
- live skill: `~/.hermes/skills/delegation-routing-v2/SKILL.md`
- live spec: `/home/ubuntu/business/reports/delegation-system-hardcoded-2026-04-25.md`

Use it when deciding between the Kael parent (`gpt-5.4`), `gpt-5.5` specialist leaves, Codex CLI long-context probes, and `gpt-5.4` orchestrator CLI children. Shared-state writes and irreversible actions stay in the parent.

## Getting Started

```bash
21 changes: 11 additions & 10 deletions agent/curator.py
@@ -316,9 +316,8 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
"Your output IS the deliverable. Produce the exact same "
"human-readable summary and structured YAML block you would "
"produce on a live run — but describe the actions you WOULD take, "
"not actions you took. A downstream reviewer will read the report "
"and decide whether to approve a live run with "
"`hermes curator run` (no flag).\n"
"not actions you took. The dry-run report is an audit/preview artifact; "
"scheduled live Curator runs do not require operator approval.\n"
"\n"
"If you accidentally take a mutating action, say so explicitly in "
"the summary so the reviewer can revert it.\n"
@@ -387,9 +386,10 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
"copied and modified\n"
" • `scripts/<name>.<ext>` for statically re-runnable actions "
"(verification scripts, fixture generators, probes)\n"
" Then archive the old sibling. Use `terminal` with `mkdir -p "
"~/.hermes/skills/<umbrella>/references/ && mv ... <umbrella>/"
"references/<topic>.md` (or templates/ / scripts/).\n"
" Then archive the old sibling. Use `skill_manage(action=write_file)` "
"to place the support file under the umbrella, then "
"`skill_manage(action=delete, absorbed_into=<umbrella>)` to archive "
"the old sibling.\n"
"4. Also flag skills whose NAME is too narrow (contains a PR number, "
"a feature codename, a specific error string, an 'audit' / "
"'diagnosis' / 'salvage' session artifact). These almost always "
@@ -407,10 +407,11 @@ def apply_automatic_transitions(now: Optional[datetime] = None) -> Dict[str, int
" - skill_manage action=delete — archive a skill. MUST pass "
"`absorbed_into=<umbrella>` when you've merged its content into another "
"skill, or `absorbed_into=\"\"` when you're truly pruning with no "
"forwarding target. This drives cron-job skill-reference migration — "
"guessing from your YAML summary after the fact is fragile.\n"
" - terminal — mv a sibling into the archive "
"OR move its content into a support subfile\n\n"
"forwarding target. This moves the full skill directory into `.archive/` "
"for restore and drives cron-job skill-reference migration — guessing "
"from your YAML summary after the fact is fragile.\n"
" - terminal — inspect files or prepare support "
"subfile content only; do NOT mv skill directories into `.archive/` manually.\n\n"
"'keep' is a legitimate decision ONLY when the skill is already a "
"class-level umbrella and none of the proposed merges would improve "
"discoverability. 'This is narrow but distinct from its siblings' "
11 changes: 5 additions & 6 deletions agent/prompt_builder.py
@@ -912,16 +912,15 @@ def build_skills_system_prompt(

result = (
"## Skills (mandatory)\n"
"Before replying, scan the skills below. If a skill matches or is even partially relevant "
"to your task, you MUST load it with skill_view(name) and follow its instructions. "
"Err on the side of loading — it is always better to have context you don't need "
"than to miss critical steps, pitfalls, or established workflows. "
"Before replying, scan the skills below. If a skill strongly matches the task domain, "
"tool, or requested workflow, you MUST load it with skill_view(name) and follow its instructions. "
"Prefer the most specific skill and normally load only the minimum useful set (often 0-2 skills). "
"Do not load broad workflow skills when the request only needs ordinary execution or 1-2 tool calls. "
"Skills contain specialized knowledge — API endpoints, tool-specific commands, "
"and proven workflows that outperform general-purpose approaches. Load the skill "
"even if you think you could handle the task with basic tools like web_search or terminal. "
"Skills also encode the user's preferred approach, conventions, and quality standards "
"for tasks like code review, planning, and testing — load them even for tasks you "
"already know how to do, because the skill defines how it should be done here.\n"
"for tasks like code review, planning, and testing — load them when the requested phase actually matches.\n"
"Whenever the user asks you to configure, set up, install, enable, disable, modify, "
"or troubleshoot Hermes Agent itself — its CLI, config, models, providers, tools, "
"skills, voice, gateway, plugins, or any feature — load the `hermes-agent` skill "
46 changes: 41 additions & 5 deletions agent/skill_utils.py
@@ -24,7 +24,7 @@
"windows": "win32",
}

EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub", ".archive"))
EXCLUDED_SKILL_DIRS = frozenset((".git", ".github", ".hub", ".archive", "_archived", ".archived"))

# ── Lazy YAML loader ─────────────────────────────────────────────────────

@@ -245,6 +245,38 @@ def get_all_skills_dirs() -> List[Path]:

# ── Condition extraction ──────────────────────────────────────────────────

_TOOLSET_ALIASES = {
"files": "file",
"skills_tools": "skills",
"terminal_tools": "terminal",
}


def _normalize_condition_list(value: Any, *, aliases: dict[str, str] | None = None) -> List[str]:
"""Normalize frontmatter condition values to a clean list of strings.

Accepts a scalar string, list, tuple, or set. Strings are wrapped into a
single-item list, surrounding whitespace is stripped, and simple aliases
like ``files`` -> ``file`` are normalized.
"""
if value is None:
return []
if isinstance(value, str):
raw_items = [value]
elif isinstance(value, (list, tuple, set)):
raw_items = list(value)
else:
return []

normalized: List[str] = []
alias_map = aliases or {}
for item in raw_items:
text = str(item).strip()
if not text:
continue
normalized.append(alias_map.get(text, text))
return normalized


def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
"""Extract conditional activation fields from parsed frontmatter."""
@@ -256,10 +288,14 @@ def extract_skill_conditions(frontmatter: Dict[str, Any]) -> Dict[str, List]:
if not isinstance(hermes, dict):
hermes = {}
return {
"fallback_for_toolsets": hermes.get("fallback_for_toolsets", []),
"requires_toolsets": hermes.get("requires_toolsets", []),
"fallback_for_tools": hermes.get("fallback_for_tools", []),
"requires_tools": hermes.get("requires_tools", []),
"fallback_for_toolsets": _normalize_condition_list(
hermes.get("fallback_for_toolsets", []), aliases=_TOOLSET_ALIASES
),
"requires_toolsets": _normalize_condition_list(
hermes.get("requires_toolsets", []), aliases=_TOOLSET_ALIASES
),
"fallback_for_tools": _normalize_condition_list(hermes.get("fallback_for_tools", [])),
"requires_tools": _normalize_condition_list(hermes.get("requires_tools", [])),
}


12 changes: 10 additions & 2 deletions cli.py
@@ -2182,7 +2182,8 @@ def __init__(
if isinstance(cp_cfg, bool):
cp_cfg = {"enabled": cp_cfg}
self.checkpoints_enabled = checkpoints or cp_cfg.get("enabled", False)
self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 50)
self.checkpoint_max_snapshots = cp_cfg.get("max_snapshots", 10)
self.checkpoint_max_total_bytes = cp_cfg.get("max_total_bytes", 1_000_000_000)
self.pass_session_id = pass_session_id
# --ignore-rules: honor either the constructor flag or the env var set
# by `hermes chat --ignore-rules` in hermes_cli/main.py. When true we
@@ -3685,6 +3686,7 @@ def _init_agent(self, *, model_override: str = None, runtime_override: dict = No
thinking_callback=self._on_thinking,
checkpoints_enabled=self.checkpoints_enabled,
checkpoint_max_snapshots=self.checkpoint_max_snapshots,
checkpoint_max_total_bytes=self.checkpoint_max_total_bytes,
pass_session_id=self.pass_session_id,
skip_context_files=self.ignore_rules,
skip_memory=self.ignore_rules,
@@ -6829,7 +6831,13 @@ def run_background():
acp_command=turn_route["runtime"].get("command"),
acp_args=turn_route["runtime"].get("args"),
max_iterations=self.max_turns,
enabled_toolsets=self.enabled_toolsets,
# /background is an independent operator lane, not a
# restricted child of the current foreground session. Give
# it the full available tool surface so it can orchestrate,
# delegate, use shell/file/web/MCP tools, and write its own
# artifacts unless the prompt itself narrows scope.
enabled_toolsets=None,
disabled_toolsets=[],
quiet_mode=True,
verbose_logging=False,
session_id=task_id,
3 changes: 2 additions & 1 deletion gateway/run.py
@@ -8845,7 +8845,8 @@ async def _handle_rollback_command(self, event: MessageEvent) -> str:

mgr = CheckpointManager(
enabled=True,
max_snapshots=cp_cfg.get("max_snapshots", 50),
max_snapshots=cp_cfg.get("max_snapshots", 10),
max_total_bytes=cp_cfg.get("max_total_bytes", 1_000_000_000),
)

cwd = os.getenv("TERMINAL_CWD", str(Path.home()))
3 changes: 2 additions & 1 deletion hermes_cli/config.py
@@ -571,7 +571,8 @@ def _ensure_hermes_home_managed(home: Path):
# conversation turn (on first write_file/patch call). Use /rollback to restore.
"checkpoints": {
"enabled": True,
"max_snapshots": 50, # Max checkpoints to keep per directory
"max_snapshots": 10, # Max checkpoints to keep per directory
"max_total_bytes": 1_000_000_000, # Cap each shadow repo at ~1 GB, preserving the newest checkpoint
# Auto-maintenance: shadow repos accumulate forever under
# ~/.hermes/checkpoints/ (one per cd'd working directory). Field
# reports put the typical offender at 1000+ repos / ~12 GB. When