Problem
NanoClaw can lose visibility into messages that have been selected for agent dispatch but never receive a durable bot response because the run was interrupted, crashed, or timed out. Today there is no explicit persisted marker for “this message batch left the normal polling path,” which makes diagnosis unreliable.
B-01 adds detection state and observability only. It must not change polling eligibility or perform recovery.
Scope
Add processing_started_at tracking to persist when source messages are dispatched to GroupQueue.
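As a rough illustration of the batch marking described here, the sketch below stamps every message in a batch with one shared processing_started_at value, or stamps none of them. It is an in-memory stand-in for a database transaction, not the actual src/db.ts code. The MessageRow shape and the markBatchDispatched name are hypothetical.

```typescript
// Hypothetical row shape; only processing_started_at is named by this ticket.
interface MessageRow {
  id: string;
  chat_jid: string;
  timestamp: string;
  processing_started_at: string | null;
}

// In-memory stand-in for a transactional UPDATE: validate the whole batch
// first, then return a new array in which every batch member carries the
// same startedAt value. If validation fails, the input is left untouched.
function markBatchDispatched(
  rows: readonly MessageRow[],
  batchIds: readonly string[],
  startedAt: string,
): MessageRow[] {
  const known = new Set(rows.map((row) => row.id));
  for (const id of batchIds) {
    if (!known.has(id)) {
      throw new Error("Unknown message id in batch: " + id);
    }
  }
  const batch = new Set(batchIds);
  return rows.map((row) =>
    batch.has(row.id) ? { ...row, processing_started_at: startedAt } : row,
  );
}
```

In a real database layer the same all-or-nothing property would come from wrapping the updates in a single transaction.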
Use this field to detect stale interrupted processing rows at startup and log them for operators. Do not requeue, replay, unlock, or recover messages in B-01.
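The startup scan could look roughly like the sketch below. It only selects and logs rows and never modifies them, which matches the detection-only scope. Several things here are assumptions: processing_started_at is taken to hold a timestamp string that Date.parse understands, the five-minute STALE_THRESHOLD_MS value is a placeholder (the ticket names the constant but not its value), and the row shape, function names, and log line format are hypothetical.

```typescript
// Hypothetical row shape; only processing_started_at is named by this ticket.
interface MessageRow {
  id: string;
  chat_jid: string;
  timestamp: string;
  processing_started_at: string | null;
}

// Placeholder value for illustration only.
const STALE_THRESHOLD_MS = 5 * 60 * 1000;

// Returns rows whose marker is set and lies more than thresholdMs before nowMs.
// Rows with an unparseable marker are not reported by this sketch.
function findStaleRows(
  rows: readonly MessageRow[],
  nowMs: number,
  thresholdMs: number = STALE_THRESHOLD_MS,
): MessageRow[] {
  return rows.filter((row) => {
    if (row.processing_started_at === null) return false;
    const startedMs = Date.parse(row.processing_started_at);
    if (Number.isNaN(startedMs)) return false;
    return nowMs - startedMs > thresholdMs;
  });
}

// Logs one line per stale row and returns the rows. No row is changed,
// requeued, or recovered.
function logStaleRows(
  rows: readonly MessageRow[],
  nowMs: number,
  log: (line: string) => void = console.warn,
): MessageRow[] {
  const stale = findStaleRows(rows, nowMs);
  for (const row of stale) {
    log(
      "stale processing row: id=" + row.id +
        " chat_jid=" + row.chat_jid +
        " processing_started_at=" + row.processing_started_at,
    );
  }
  return stale;
}
```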
Acceptance Criteria
- messages table gains processing_started_at TEXT via safe additive migration.
- processing_started_at is set immediately before dispatching a message batch to GroupQueue, in one transaction for the batch.
- processing_started_at is cleared when a bot response is stored for that chat after the source message timestamp.
- When clearing processing_started_at after a bot response, clear only rows matching:
  chat_jid = bot_response.chat_jid
  AND processing_started_at IS NOT NULL
  AND timestamp <= bot_response.timestamp
- Startup stale scan logs rows where processing_started_at is older than STALE_THRESHOLD_MS.
- B-01 does not change getNewMessages().
- B-01 does not requeue, replay, unlock, or recover messages.
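The clearing rule in the criteria above can be sketched as a predicate plus an equivalent UPDATE statement. This is an illustration under assumptions, not the actual implementation: the row and response shapes, the function names, and the @-style named parameters are hypothetical. The plain string comparison on timestamp is only correct if all timestamps share one fixed-width, sortable format.

```typescript
// Hypothetical shapes; the column names come from the acceptance criteria.
interface MessageRow {
  id: string;
  chat_jid: string;
  timestamp: string;
  processing_started_at: string | null;
}

interface BotResponse {
  chat_jid: string;
  timestamp: string;
}

// Equivalent SQL, assuming named parameters bound from the bot response.
const CLEAR_PROCESSING_SQL = [
  "UPDATE messages",
  "   SET processing_started_at = NULL",
  " WHERE chat_jid = @chat_jid",
  "   AND processing_started_at IS NOT NULL",
  "   AND timestamp <= @timestamp",
].join("\n");

// True only when all three conditions from the criteria hold.
function shouldClear(row: MessageRow, response: BotResponse): boolean {
  return (
    row.chat_jid === response.chat_jid &&
    row.processing_started_at !== null &&
    row.timestamp <= response.timestamp
  );
}

// Returns a new array with the marker cleared on matching rows only.
function applyBotResponse(
  rows: readonly MessageRow[],
  response: BotResponse,
): MessageRow[] {
  return rows.map((row) =>
    shouldClear(row, response) ? { ...row, processing_started_at: null } : row,
  );
}
```

The timestamp bound is what keeps a message that arrived after the bot response marked, so a later interrupted run in the same chat stays visible.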
Files Likely Touched
- src/db.ts
- src/index.ts
- src/db.test.ts
- relevant index/router tests
Test Plan
- Migration test confirms processing_started_at TEXT is added safely to existing messages tables.
- Batch dispatch test confirms all messages in a dispatched batch receive the same processing_started_at in one transaction.
- Bot-response test confirms processing_started_at is cleared only for rows in the same chat with processing_started_at IS NOT NULL and timestamp <= bot_response.timestamp.
- Regression test confirms future messages in the same chat are not cleared by an older bot response.
- Startup stale scan test confirms rows older than STALE_THRESHOLD_MS are logged.
- Regression test confirms getNewMessages() query behavior is unchanged in B-01.
- Regression test confirms no requeue/replay/unlock/recovery path is invoked.
- Run full test suite.
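For the migration item in this plan, one way to keep the change additive and safe to re-run is to add the column only when it is missing. The sketch below assumes a SQLite-style database, which the ticket does not state, where existing column names could be read with PRAGMA table_info(messages). The function name is hypothetical.

```typescript
// Returns the statement to run, or null when the column already exists.
// existingColumns is assumed to come from the database's own schema listing.
function additiveMigrationSql(existingColumns: readonly string[]): string | null {
  const alreadyPresent = existingColumns.some(
    (name) => name === "processing_started_at",
  );
  if (alreadyPresent) return null;
  // A nullable column with no default leaves existing rows valid (value NULL).
  return "ALTER TABLE messages ADD COLUMN processing_started_at TEXT";
}
```

A migration test could call this once against a pre-migration schema and once against a post-migration schema, expecting a statement the first time and null the second.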
Risks
- Incorrect clearing logic could hide still-interrupted messages.
- Logging stale rows without recovery may surface stuck state but leave it unresolved until B-02.
- Batch marking must be transactional to avoid partial state if dispatch setup fails.
- Timestamp comparison must be consistent with stored message timestamps.
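The timestamp risk can be made concrete. If timestamps are stored as text and compared as strings, two valid ISO 8601 values in slightly different formats can sort in the wrong order. The sketch below assumes ISO 8601 text timestamps, which the ticket does not confirm, and normalizes them to one canonical form before comparison. The helper name is hypothetical.

```typescript
// Converts any timestamp Date.parse accepts into the fixed-width UTC form
// produced by Date.prototype.toISOString, so string order matches time order.
function toCanonicalTimestamp(input: string): string {
  const ms = Date.parse(input);
  if (Number.isNaN(ms)) {
    throw new Error("Unparseable timestamp: " + input);
  }
  return new Date(ms).toISOString();
}

// Two valid spellings: the second is half a second later than the first.
const earlier = "2024-01-01T00:00:00Z";
const later = "2024-01-01T00:00:00.500Z";

// Raw string comparison gets the order wrong because "." sorts before "Z".
const rawOrderIsWrong = later < earlier;

// After normalization the string order agrees with the chronological order.
const canonicalOrderIsRight =
  toCanonicalTimestamp(earlier) < toCanonicalTimestamp(later);
```

Writing processing_started_at and message timestamps through the same normalization would keep the timestamp <= bot_response.timestamp comparison meaningful.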
Dependencies
- Existing messages table migration path.
- Existing GroupQueue dispatch flow.
- Existing bot-response storage flow.
- B-02 recovery will build on this persisted detection marker.