[Bug]: `hermes curator run` can lose LLM reports because CLI exits while background daemon thread is still running #20555

## Describe the bug

Manual `hermes curator run` / `hermes curator run --dry-run` can report that the LLM pass is running in the background and then return control to the shell before the report is written. In the short-lived CLI process, the background review runs in a daemon thread; when the process exits, that thread can be terminated before it writes `REPORT.md` / `run.json` or updates `last_report_path`.

The visible symptom is that `hermes curator status` shows either an old/stale `last report` path or a path that does not exist, even though `hermes curator run` appeared to start successfully.

Example observed status:

```text
curator: ENABLED
  runs:           1
  last summary:   auto: no changes
  last report:    /tmp/pytest-of-steve/pytest-215/popen-gw2/test_state_atomic_write_no_tmp0/.hermes/logs/curator/20260506-011414
...
```

The `/tmp/pytest-of-...` path was stale test state, and no fresh report was produced by the manual CLI invocation.

## Expected behavior

A manual CLI run should either:

1. complete synchronously and write its report before returning, or
2. explicitly opt into a reliable long-lived background execution mode.

This is especially important because the RFC/documented debug path is `hermes curator run --sync`, and manual runs are usually used to verify curator behavior and inspect reports.

## Actual behavior

The CLI default can start the LLM/report phase in a background daemon thread and then exit. Because the process exits, the daemon thread may be killed before report generation and state update complete.

## Root cause

The core `agent.curator.run_curator_review(...)` handles its `synchronous` flag correctly. The problem is that the CLI wrapper defaults manual invocations to the background/daemon-thread path.

That background path is appropriate for the gateway/idle hook, where the parent process is long-lived. It is not reliable for a short-lived `hermes curator run` CLI command.
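The lifecycle problem can be reproduced outside Hermes. The sketch below is not Hermes code: it spawns a throwaway child process that stands in for the CLI, and `report.txt`, `worker`, and the `background`/`sync` mode strings are placeholders chosen for illustration. It only shows the general CPython behavior that daemon threads are not waited for at interpreter exit.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Child script standing in for a short-lived CLI command. The worker plays the
# role of the LLM/report phase; "report.txt" is a placeholder, not a Hermes file.
CHILD = textwrap.dedent(
    """
    import sys, threading, time
    from pathlib import Path

    def worker(path):
        time.sleep(0.5)              # simulate a slow LLM pass
        Path(path).write_text("report")

    out, mode = sys.argv[1], sys.argv[2]
    t = threading.Thread(target=worker, args=(out,), daemon=True)
    t.start()
    if mode == "sync":
        t.join()                     # wait for the report before exiting
    # mode == "background": fall off the end; the daemon thread is abandoned
    """
)


def report_written(mode: str) -> bool:
    """Run the child in the given mode and report whether the file was written."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "report.txt"
        subprocess.run([sys.executable, "-c", CHILD, str(out), mode], check=True)
        return out.exists()


if __name__ == "__main__":
    print("background:", report_written("background"))
    print("sync:", report_written("sync"))
```

In the `background` mode the child exits while the worker is still sleeping, so the file is not expected to appear; in the `sync` mode the `join()` keeps the process alive until the write completes.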

## Local fix validated

I patched my local checkout so that `hermes_cli/curator.py` makes manual CLI `run` synchronous by default, while retaining an explicit `--background` flag for the old non-blocking behavior.

Local commit:

```text
c6c74385d fix(curator): make manual runs synchronous by default
```

Main change:

- `hermes curator run` and `hermes curator run --dry-run` now pass `synchronous=True` by default.
- `--background` opts into the legacy non-blocking path.
- `--sync` remains accepted and wins over `--background` if both are supplied.
- `hermes curator status` marks a saved report path as missing/stale when the path no longer exists.
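The flag precedence described above can be sketched as follows. This is an illustration and not the actual contents of `hermes_cli/curator.py`: `build_parser` and `resolve_synchronous` are hypothetical names, and the real wrapper may wire its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for `hermes curator run` (illustrative only)."""
    parser = argparse.ArgumentParser(prog="hermes curator run")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--sync", action="store_true")
    parser.add_argument("--background", action="store_true")
    return parser


def resolve_synchronous(args: argparse.Namespace) -> bool:
    """Return the value to pass as `synchronous`.

    Synchronous is the default; `--background` opts out, and `--sync`
    wins if both flags are supplied.
    """
    if args.sync:
        return True
    return not args.background


if __name__ == "__main__":
    for argv in ([], ["--dry-run"], ["--background"], ["--sync", "--background"]):
        args = build_parser().parse_args(argv)
        print(argv, "-> synchronous =", resolve_synchronous(args))
```

Keeping the precedence in one small function makes it straightforward to cover all four flag combinations in a unit test.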

## Verification performed

Focused tests:

```text
python -m pytest \
  tests/hermes_cli/test_curator_run.py \
  tests/hermes_cli/test_curator_status.py \
  tests/hermes_cli/test_curator_archive_prune.py \
  tests/agent/test_curator_reports.py \
  tests/agent/test_curator.py \
  tests/agent/test_curator_activity.py \
  tests/agent/test_curator_classification.py \
  tests/agent/test_curator_backup.py \
  -q
```

Result:

```text
145 passed
```

Static/syntax checks:

```text
ruff check hermes_cli/curator.py tests/hermes_cli/test_curator_run.py tests/hermes_cli/test_curator_status.py
python -m py_compile hermes_cli/curator.py tests/hermes_cli/test_curator_run.py tests/hermes_cli/test_curator_status.py
```

Result:

```text
All checks passed
```

I also ran an isolated temp `HERMES_HOME` E2E smoke test with a stub LLM response. The synchronous dry-run path created both:

```text
REPORT.md
run.json
```

and updated state to the fresh report directory before returning.
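A check of this kind could be written roughly as below. The helper name `report_is_complete` is hypothetical, and the sketch assumes that `run.json` holds a JSON document; only the two file names come from the run described above.

```python
import json
from pathlib import Path


def report_is_complete(report_dir: Path) -> bool:
    """Return True if the directory holds REPORT.md and a parseable run.json."""
    report_md = report_dir / "REPORT.md"
    run_json = report_dir / "run.json"
    if not (report_md.is_file() and run_json.is_file()):
        return False
    try:
        # Assumption: run.json is a JSON document; its schema is not checked.
        json.loads(run_json.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False
    return True
```

A smoke test can call this on the report directory recorded in state immediately after the CLI returns, which fails if the process exited before either file was written.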

## Suggested fix

Change the CLI wrapper so manual `hermes curator run` defaults to synchronous execution, and make background mode explicit:

```text
hermes curator run              # synchronous, reliable report write
hermes curator run --dry-run    # synchronous, reliable preview report
hermes curator run --background # legacy non-blocking behavior
```

Also helpful: when `curator status` shows `last_report_path`, check whether the path exists and annotate missing paths as stale/missing rather than presenting them as valid latest reports.
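A minimal sketch of the status annotation, assuming the status line is built from a stored path string. `format_last_report` and the `(none)` / `[missing/stale]` markers are illustrative choices, not the wording used by the local patch.

```python
from pathlib import Path
from typing import Optional


def format_last_report(last_report_path: Optional[str]) -> str:
    """Render the `last report` status line, flagging paths that no longer exist."""
    if not last_report_path:
        return "  last report:    (none)"
    # Annotate rather than hide the path, so stale state stays visible.
    suffix = "" if Path(last_report_path).exists() else "  [missing/stale]"
    return f"  last report:    {last_report_path}{suffix}"
```

Showing the stale path together with a marker, instead of dropping it, keeps the evidence needed to diagnose leftover state such as the `/tmp/pytest-of-...` path above.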

## Related context

This appears distinct from the existing first-run dry-run/approval safety fix (#18389) and the per-run reports feature (#17307). It is specifically a CLI lifecycle issue: daemon-thread background work is not reliable after a short-lived CLI command exits.

