Bug
On gateway startup, the kanban dispatcher crashes with:
sqlite3.OperationalError: duplicate column name: consecutive_failures
Root Cause
Two async tasks are created concurrently in the gateway (gateway/run.py):
- Line 3335: asyncio.create_task(self._kanban_notifier_watcher())
- Line 3341: asyncio.create_task(self._kanban_dispatcher_watcher())
Both watchers call _kb.connect(board=slug) → _migrate_add_optional_columns(conn) via asyncio.to_thread().
The _INITIALIZED_PATHS set (module-level, kanban_db.py:~917) is used as a cache to skip re-initialization, but the membership check and the later update are not guarded by a lock, so it is not thread-safe. When both threads race on the first tick:
- Thread A checks needs_init = resolved not in _INITIALIZED_PATHS → True
- Thread B checks needs_init = resolved not in _INITIALIZED_PATHS → True (set not yet updated by A)
- Both threads run _migrate_add_optional_columns()
- Both read cols via PRAGMA table_info(tasks); neither sees consecutive_failures yet
- Thread A succeeds with ALTER TABLE tasks ADD COLUMN consecutive_failures ...
- Thread B crashes with duplicate column name: consecutive_failures
The error is caught by the outer exception handler (gateway/run.py:3889), so the gateway keeps running, but that kanban dispatcher tick is lost.
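The interleaving above can be replayed without the gateway. The following is a minimal sketch, not the actual kanban_db.py code: it assumes a stripped-down tasks table, and the helper names column_names and replay_race are hypothetical. It runs the two "threads" as two SQLite connections stepped through in the racy order, so the failure is deterministic rather than timing-dependent.

```python
import os
import sqlite3
import tempfile


def column_names(conn: sqlite3.Connection, table: str) -> set[str]:
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def replay_race(db_path: str) -> str:
    """Replay the racy interleaving; return the second connection's error text."""
    # Hypothetical minimal schema standing in for the real tasks table.
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY)")
    setup.commit()
    setup.close()

    conn_a = sqlite3.connect(db_path)  # stands in for thread A
    conn_b = sqlite3.connect(db_path)  # stands in for thread B
    try:
        # Both sides inspect the schema before either one alters it.
        a_missing = "consecutive_failures" not in column_names(conn_a, "tasks")
        b_missing = "consecutive_failures" not in column_names(conn_b, "tasks")
        assert a_missing and b_missing

        alter = (
            "ALTER TABLE tasks ADD COLUMN consecutive_failures INTEGER DEFAULT 0"
        )
        conn_a.execute(alter)  # A adds the column first
        conn_a.commit()
        try:
            conn_b.execute(alter)  # B acts on its stale check
        except sqlite3.OperationalError as exc:
            return str(exc)
        return ""
    finally:
        conn_a.close()
        conn_b.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(replay_race(os.path.join(tmp, "kanban.db")))
```

The second ALTER TABLE fails even though the second connection checked the column list beforehand, because that check was made before the first connection's change landed.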
Reproduction
Start the gateway with a fresh or existing kanban.db that already has consecutive_failures in the schema (i.e., after a previous successful migration). The race window is tight but triggers reliably on startup when both watchers hit their first tick close together.
Environment
- Hermes v0.6+ (323 commits behind → updated to latest main as of bbff2f6)
- Python 3.14, SQLite 3.x
- Linux (NixOS)
Suggested Fix
Either:
- Quick fix: Wrap each ALTER TABLE in _migrate_add_optional_columns with try/except sqlite3.OperationalError catching only duplicate column errors. Other errors still propagate.
- Proper fix: Use a threading.Lock around the needs_init check + migration block in connect(), or use CREATE TABLE IF NOT EXISTS style guards.
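A rough sketch of the quick fix, assuming the migration issues one ALTER TABLE per optional column. The helper name add_column_if_missing is hypothetical and not part of kanban_db.py. It matches on the error text because sqlite3.OperationalError covers many unrelated failures.

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, decl: str
) -> bool:
    """Add a column, treating "already exists" as success.

    Returns True if this call added the column, False if it was already there.
    Table and column names are interpolated directly, so they must come from
    trusted migration code, never from user input.
    """
    try:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    except sqlite3.OperationalError as exc:
        # Swallow only the duplicate-column case; re-raise everything else
        # (missing table, locked database, syntax errors, ...).
        if "duplicate column name" not in str(exc).lower():
            raise
        return False
    return True
```

This makes the migration idempotent, so losing the race becomes harmless, but it does not stop both threads from running the migration.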
I can submit a PR for either option if desired. Thanks!
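For reference, a sketch of the lock-based option. It assumes connect() resolves the database path and tracks initialized paths in a module-level set, as described under Root Cause. The _INIT_LOCK name, the _migrate helper, and the simplified schema are assumptions for illustration, not the existing code.

```python
import sqlite3
import threading
from pathlib import Path

_INITIALIZED_PATHS: set[str] = set()
_INIT_LOCK = threading.Lock()  # hypothetical: guards check + migration + update


def _migrate(conn: sqlite3.Connection) -> None:
    # Simplified stand-in for _migrate_add_optional_columns.
    cols = {row[1] for row in conn.execute("PRAGMA table_info(tasks)")}
    if "consecutive_failures" not in cols:
        conn.execute(
            "ALTER TABLE tasks ADD COLUMN consecutive_failures INTEGER DEFAULT 0"
        )


def connect(db_path: str) -> sqlite3.Connection:
    resolved = str(Path(db_path).resolve())
    conn = sqlite3.connect(resolved)
    # Holding the lock across the membership check, the migration, and the
    # set update makes the whole sequence atomic between threads.
    with _INIT_LOCK:
        if resolved not in _INITIALIZED_PATHS:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tasks (id INTEGER PRIMARY KEY)"
            )
            _migrate(conn)
            conn.commit()
            _INITIALIZED_PATHS.add(resolved)
    return conn
```

A threading.Lock only serializes threads inside one process. If several processes can open the same kanban.db, the tolerant ALTER TABLE from the quick fix is still useful as a second line of defence.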