kanban dispatcher: macOS zombie detection is a no-op — _pid_alive returns True for defunct workers

## Summary

`_pid_alive()` in `hermes_cli/kanban_db.py` only implements zombie detection on Linux (parsing `/proc/<pid>/status` for `State: Z`). On macOS, `os.kill(pid, 0)` returns success for defunct/zombie processes, so a worker that crashes immediately stays "alive" to the dispatcher until `claim_expires` times out (~15 min default).

## Where

`hermes_cli/kanban_db.py:2158-2173`

The docstring at lines 2136-2144 acknowledges this:
```
On Linux we additionally peek at /proc/<pid>/status and treat State: Z
as dead. On other POSIX or on Windows the zombie check is a no-op.
```
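For readers who have not opened the file, the Linux branch described by that docstring can be approximated as follows. This is a minimal sketch, not the code in `kanban_db.py`: the helper names `parse_state` and `linux_pid_is_zombie` are hypothetical, and the only thing taken from the docstring is the idea of reading `/proc/<pid>/status` and treating `State: Z` as dead.

```python
from pathlib import Path
from typing import Optional


def parse_state(status_text: str) -> Optional[str]:
    """Return the one-letter state code from /proc/<pid>/status text, if present."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()  # e.g. ["State:", "Z", "(zombie)"]
            return fields[1] if len(fields) > 1 else None
    return None


def linux_pid_is_zombie(pid: int) -> bool:
    """Hypothetical helper: True only if /proc reports the PID in state Z."""
    try:
        text = Path(f"/proc/{pid}/status").read_text(errors="replace")
    except OSError:
        # No /proc (macOS, Windows) or no such PID: the check has nothing
        # to say, which is the "no-op" the docstring refers to.
        return False
    return parse_state(text) == "Z"
```

On macOS the `read_text` call always fails because there is no `/proc`, so the helper returns `False` for every PID, including defunct ones.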

## Reproduction

1. Run the kanban dispatcher on macOS with a ~5 min cadence.
2. Assign a worker a task that causes an immediate crash (e.g., require a skill it doesn't have, or a missing credential that triggers an unhandled exception at startup).
3. `os.kill(pid, 0)` succeeds against the defunct process because the process table entry still exists.
4. The dispatcher sees the worker as alive and does NOT re-queue the task until `claim_expires` (~15 min later).
5. This creates a zombie-respawn loop where the dispatcher tries again every N minutes, gets the same crash, and the task stays stuck until manual SQL intervention.
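Step 3 can be checked in isolation, without the dispatcher. The sketch below is an illustration only: `signal_zero_alive` and `demo` are hypothetical names, and the short-lived child merely stands in for a worker that crashes at startup. It assumes the child is a direct child of the probing process and that `SIGCHLD` is left at its default disposition.

```python
import os
import subprocess
import sys
import time


def signal_zero_alive(pid: int) -> bool:
    """Naive liveness probe: signal 0 only checks that the PID exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the PID exists but belongs to another user
    return True


def demo() -> tuple:
    # Child that fails immediately, like a worker crashing at startup.
    child = subprocess.Popen(
        [sys.executable, "-c", "raise SystemExit(1)"],
        stdout=subprocess.PIPE,
    )
    child.stdout.read()  # returns at EOF, i.e. once the child has closed stdout
    time.sleep(0.2)      # small margin so the exit has completed
    # Nobody has called wait() yet, so the child is a zombie at this point.
    before_reap = signal_zero_alive(child.pid)
    child.wait()         # reap: removes the process table entry
    after_reap = signal_zero_alive(child.pid)
    return before_reap, after_reap
```

`demo()` is expected to return `(True, False)`: the probe reports the defunct child as alive until it is reaped, which is the behavior the dispatcher sees on macOS.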

## Impact

On macOS, tasks stay stuck in `running` for up to 15 minutes, and breaking the loop requires manual `sqlite3` surgery. With a 5-minute dispatcher cadence and the default `claim_expires` of 15 minutes, users see 3+ wasted spawn attempts per stuck task.

## Suggested Fix

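On Darwin, query the process state explicitly, for example with `proc_pidinfo()`, or use `kqueue` with `EVFILT_PROC` (`NOTE_EXIT`) to be notified when the worker exits. One correction to my first draft of this suggestion: as far as I can tell the status field is `pbi_status` in `struct proc_bsdinfo` (flavor `PROC_PIDTBSDINFO`), not a `pti_status` field in the `PROC_PIDTASKINFO` result, and the value to compare against is `SZOMB` rather than 0. Whether `proc_pidinfo` succeeds at all for a defunct PID needs to be verified. A simpler fallback: check if the process group leader is still alive.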
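If binding libproc from Python turns out to be too invasive, one alternative not listed above is to ask `ps` for the state column, which works on both macOS and Linux. This is a rough sketch under that assumption: `stat_is_zombie` and `pid_is_zombie` are hypothetical names, not existing functions in `kanban_db.py`, and the helper deliberately returns `False` when `ps` is missing or fails so that behavior is never worse than today.

```python
import subprocess


def stat_is_zombie(stat: str) -> bool:
    """True if a ps state string such as 'Z', 'Z+' or 'Zs' marks a zombie."""
    return stat.strip().startswith("Z")


def pid_is_zombie(pid: int) -> bool:
    """Hypothetical helper: ask ps for the state column of a single PID."""
    try:
        result = subprocess.run(
            ["ps", "-o", "stat=", "-p", str(pid)],  # '=' suppresses the header
            capture_output=True,
            text=True,
            timeout=5,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ps unavailable or hung: report "not a zombie" and let the
        # existing os.kill(pid, 0) result stand.
        return False
    # For a PID that does not exist, ps prints nothing, so this is False.
    return stat_is_zombie(result.stdout)
```

A liveness check could then treat a PID as dead when `os.kill(pid, 0)` succeeds but `pid_is_zombie(pid)` is true. The cost is one `ps` subprocess per probed worker, which seems acceptable at a multi-minute dispatcher cadence.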

## Environment

- macOS (any version)
- Hermes Agent v0.11.0 (a7fb79efb)
