Skip to content

chore(tooling): switch pre-commit hook to lint-staged#2171

Closed
topcoder1 wants to merge 986 commits into
nanocoai:mainfrom
topcoder1:chore/lint-staged-precommit
Closed

chore(tooling): switch pre-commit hook to lint-staged#2171
topcoder1 wants to merge 986 commits into
nanocoai:mainfrom
topcoder1:chore/lint-staged-precommit

Conversation

@topcoder1
Copy link
Copy Markdown

Summary

Pre-commit was running `npm run format:fix` which calls `prettier --write "src/**/*.ts"` on the entire `src` tree on every commit — slow on a large repo and writes to files the user didn't touch.

`lint-staged` was already in `devDependencies` but had no config and wasn't invoked. This wires it up.

What changed

```diff

  • // .husky/pre-commit
  • npm run format:fix
  • // .husky/pre-commit

  • npx lint-staged

    // package.json

  • "lint-staged": {

  • "src/**/*.ts": ["prettier --write", "eslint --fix"]

  • }
    ```

Self-validation: when committing this PR, the new hook fired and correctly reported `→ lint-staged could not find any staged files matching configured tasks` (this commit only changes config + lockfile, no `.ts`).

Net effect

Before After
What gets formatted All files matching `src/**/*.ts` Only staged `.ts` files
ESLint --fix in hook
Commit speed (typical 1-3 file change) ~3-5s ~200-400ms

Test plan

  • After merge, stage a deliberately mis-formatted `.ts` file and commit — pre-commit should reformat the staged file in place
  • Confirm `format:check` passes after the commit completes

🤖 Generated with Claude Code

topcoder1 and others added 30 commits April 19, 2026 18:17
Exhaustive QA pass on src/mini-app/. 13 findings across security,
correctness, and consistency. 9 fixed in atomic commits this branch,
1 blocked on user architecture decision (public tunnel auth), 3
deferred as follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(qa): auto-expire + test fix (v1.2.55)
…(v1.2.56)

src/index.test.ts was failing to load because its
vi.mock('./channels/registry.js') omitted `registerChannel`.
`telegram.ts` calls `registerChannel('telegram', ...)` at module
top-level and `triage/dashboards.ts` imports telegram.ts directly
(bypassing the empty `./channels/index.js` mock), so vitest couldn't
resolve the call and aborted before any of the 19 tests in the file
ever ran.

Adding the missing mock export makes the full suite 1630/1630 green
(up from 1611 — all 19 previously-dark index tests now execute).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test(index): fix missing registerChannel mock (v1.2.56)
Pulls in 7 commits hardening the mini-app module:
- ISSUE-003 archive trust (42806aa)
- ISSUE-004+009 CORS/bulk cap (988062c)
- ISSUE-002+008 task-detail JS escape (c751e3e)
- ISSUE-006 any-cast drop (fe12763)
- ISSUE-007 escape centralization (955db64)
- ISSUE-005+012 lint + age (2de0e9a)
- QA report (9cbf832)
ISSUE-010 (MED, follow-up): older mini-app endpoints (archive, bulk
archive, revert) returned one of three bespoke shapes while the new
reply-send routes used {ok, error?, code?} per the approved spec.
Clients branched differently per endpoint.

Standardize:
- All error responses now carry {ok: false, error, code}.
- `success: true/false` preserved alongside `ok` on routes where
  existing callers may still key on it (archive single, draft revert).
- Codes added: INVALID_BODY, BATCH_TOO_LARGE, GMAIL_UNAVAILABLE,
  ITEM_NOT_FOUND, GMAIL_API_ERROR, WATCHER_UNAVAILABLE, INTERNAL.

Backward compatible — every field that existed before still exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dates

ISSUE-011 (LOW, follow-up): the task-detail template called
location.reload() every time updated_at changed. Long-running tasks
reloaded the page on every step tick — lost scroll position, flashed
the tab, re-requested all assets.

Render steps + logs into id="steps-slot" / id="logs-slot" server-side,
then patch them client-side from the SSE payload. Status, title, and
log autoscroll also update in place. Client mirrors the server's
renderSteps/renderLogs shape so initial paint and updates are visually
identical.

Falls back to keeping the prior DOM when JSON.parse fails, so a single
bad SSE frame doesn't blank the page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ISSUE-013 (LOW, follow-up): five catch blocks in mini-app/server.ts
silently swallowed errors. Most were JSON.parse over tracked_items
metadata — expected to fail cleanly on bad rows, but with zero signal
a DB corruption looked identical to "table missing in test".

Add logger.debug with { err, emailId/id, component } on:
- extractAccount JSON.parse (home listing)
- bulk archive metadata parse
- /email/:emailId tracked_items lookup + metadata parse
- /api/email/:emailId/archive tracked_items lookup + metadata parse

Debug level keeps default logs quiet in prod; visible at LOG_LEVEL=debug
when actually triaging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The old message read "not found (already resolved?)" — misleading for
proposals that got auto-expired by the cron or manually cleaned up,
since nobody actually *resolved* them via Merge/Close. Now reads
"(resolved, expired, or manually cleaned up)" so the reviewer doesn't
wonder whether they clicked something they didn't.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(qa): clarify 'not found' copy for missing proposals (v1.2.57)
…raft-with-AI

Approved design for the next layer of mini-app capability:
- Classification-driven button rows (push/human, digest, transactional,
  fallback), with a universal ⋯ More escape hatch
- Canned reply chips (Thanks / Got it / Will do) for human senders,
  reusing the existing reply-send + undo machinery
- Snooze (1h / Tomorrow 8am / Next Mon / Next week / Custom) with a
  DB-backed 60s wake-tick scheduler
- Unsubscribe via List-Unsubscribe header (RFC 8058 one-click, mailto,
  legacy GET), with fallback to Gmail when headers absent
- Mute thread — SSE intake filter skips muted thread_ids entirely
- Draft with AI: Quick (no prompt) and Prompt (intent seed), both
  spawning an agent container that creates a Gmail draft the existing
  reply mode picks up

Scoped out (v2): agent body-scraping for unsubscribe links, dedicated
Muted/Snoozed list views, auto-unmute on user reply.

See: docs/superpowers/specs/2026-04-19-miniapp-ux-expansion-design.md
18 tasks across 10 phases:
  1-2. Migrations + sender/subtype classifier
  3.   Mute thread (filter, routes, invariant)
  4.   Snooze (scheduler, routes, Telegram wake)
  5.   Unsubscribe (method picker, executor, route)
  6.   Context-aware action row template
  7.   Canned reply chips (reuse PendingSendRegistry)
  8.   Snooze/Unsubscribe/Mute UI wiring
  9.   Draft with AI (Quick + Prompt + polling)
  10.  Integration tests + prod smoke + merge

~60 new test cases. Every task is TDD (fail → implement → pass →
commit). Atomic commits land each slice of value independently.

Spec: docs/superpowers/specs/2026-04-19-miniapp-ux-expansion-design.md
Add muted_threads, snoozed_items, unsubscribe_log tables. Add
sender_kind + subtype columns to tracked_items. FK on snoozed_items
cascades on tracked_items delete.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 1
Review flagged that the snoozed_items FK cascade advertised by the
Phase 1 migration would not actually cascade in production because
SQLite requires PRAGMA foreign_keys = ON per connection. createSchema
now enables the pragma before any CREATE TABLE runs, so initDatabase
+ _initTestDatabase + runMigrations all benefit.

Added a regression test that verifies cascade without manually
enabling the pragma — previously failed, now passes.

Review: Task 1 code-quality I2
Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 1
Pure functions for classifying email sender (human/bot) and subtype
(transactional). Used by the SSE intake path to populate the new
tracked_items columns and by the mini-app template to select the
right button row.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 2
…Thread

Pure helpers over muted_threads. muteThread cascade-resolves existing
tracked_items. isThreadMuted fails open on DB error to prevent a blip
from silently dropping inbound email.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3
Before writing a new tracked_items row, check muted_threads. If
matched, skip the insert, archive the thread in Gmail, log, return.
Also populate sender_kind + subtype columns on insert for
classification-aware rendering downstream.

Added in two places:
- classifyFromSSE (sse-classifier.ts) — production hot path. Mute
  check fires before dedup; sender_kind/subtype computed best-effort
  from the SSE-available fields (sender, subject, snippet).
- processIncomingEmail (email-sse.ts) — new testable seam that
  accepts injected db + gmailOps, so the mute hook + classification
  wiring can be unit-tested end-to-end without the global DB
  singleton or the full classifier pipeline. Carries richer fields
  (headers, body, gmailCategory) that the SSE payload lacks.

insertTrackedItem (tracked-items.ts) extended to persist the two
new columns.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 4)
…yFromSSE path

Task 4 review flagged that processIncomingEmail was never called
by production and duplicated insertTrackedItem's INSERT SQL. The
test exercised the dead seam, leaving the actual hot path
(classifyFromSSE) untested for the new mute hook and sender_kind/
subtype columns.

Delete the dead seam + its types. Rewrite email-sse-mute-hook.test.ts
against classifyFromSSE so the test validates what actually runs in
production.

Review: Task 4 critical issues 1+2
Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 4)
POST /api/email/:id/mute inserts muted_threads row, cascade-resolves
all open tracked_items in the thread, archives on Gmail. DELETE
removes the mute.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 5)
Asserts that no tracked_items row is both unresolved and in a muted
thread. Runs in the QA invariants suite alongside the existing
predicates. Catches mute-filter bugs or races where a muted thread
still has a visible row.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 6)
9 commits from the UX expansion plan Phases 1-3:
- DB migrations for mute/snooze/unsubscribe + sender_kind/subtype
  (Task 1: 6c5de50, 78b98ec, e8f323d)
- classifySender / classifySubtype helpers (Task 2: 2e512bc)
- isThreadMuted / muteThread / unmuteThread helpers (Task 3: 9e024a4)
- Wire mute check + sender/subtype into SSE intake via classifyFromSSE
  (Task 4: d884dcf, c136a88)
- /api/email/:id/mute POST + DELETE routes in mini-app (Task 5: 458f638)
- muted-threads-never-visible QA invariant (Task 6: 7e02f95)

Miniapp now supports muting a Gmail thread: any future messages on
the same thread_id are skipped at intake and auto-archived on Gmail.
Existing tracked_items on the thread cascade to resolved/mute:retroactive.

Phases 4-10 (snooze, unsubscribe, context-aware UI, canned replies,
Draft-with-AI) remain pending.
60s tick wakes snoozed items whose wake_at has passed: restores
tracked_items.state/queue, deletes snooze row, emits
email.snooze.waked. A push subscriber posts a Telegram reminder
and the scheduler is cleaned up during graceful shutdown.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 4
POST /api/email/:id/snooze accepts '1h' | 'tomorrow-8am' |
'next-monday-8am' | 'next-week' | 'custom' (with ISO wake_at).
Caps at 90 days. Wraps in a transaction so state/queue backup and
tracked_items state change are atomic. DELETE is idempotent.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 4
…ilMeta

Constructs a minimal RFC 2822 message (To/Subject/MIME + optional
In-Reply-To / References), base64url-encodes, sends via
gmail.users.messages.send. getMessageMeta also now returns a small
headers map (List-Unsubscribe, List-Unsubscribe-Post, List-Id,
Precedence, Message-ID, References, In-Reply-To) used by the
unsubscribe executor.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
pickUnsubscribeMethod inspects List-Unsubscribe / List-Unsubscribe-Post
headers and returns the best available method, rejecting javascript:
and data: schemes. executeUnsubscribe does the HTTP POST/GET or
delegates mailto sends to gmailOps.sendEmail. 5s AbortController timeout
on network calls with status=0 on failure.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
…T + archive

Fetches message headers via GmailOps.getMessageMeta, picks the best
method (one-click POST, mailto, legacy GET), executes it, logs to
unsubscribe_log, always archives the thread regardless of remote
outcome. Returns 422 / NO_UNSUBSCRIBE_HEADER when absent. Remote
4xx/5xx maps to 502 / UNSUBSCRIBE_REMOTE_FAILED. Adds fetchImpl DI to
MiniAppServerOpts + ActionDeps for tests.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
topcoder1 and others added 20 commits April 28, 2026 12:51
…ts alive (#43)

Chronic 'Email intelligence trigger failed' alerts (113 timeouts in
the error log over the past few days) traced to a stdout-only liveness
check. The Claude SDK writes tool-call debug logs to stderr while the
agent is doing real work — deep research, multiple /recall calls,
file reads, MCP probes. Previous code only reset the 30-min idle
timer on stdout OUTPUT_MARKER chunks (user-facing emissions), so an
agent doing 30+ min of internal tool calls before its first reply got
killed despite being alive throughout.

Changes in src/container-runner.ts:

  - Track lastStdoutAt + lastStderrAt timestamps. Stderr handler now
    updates lastStderrAt on every chunk (was: ignored entirely).
  - Replace the single setTimeout with a setInterval-driven liveness
    check (cadence min(timeoutMs/10, 60_000)). Kills only when:
      • stdout idle > timeoutMs AND stderr idle > 5min, OR
      • total runtime > HARD_CAP_MS (max(timeoutMs * 2, 60min))
  - Hard cap bounds runaway agents whose stderr never goes quiet —
    can't keep a noisy-but-stuck container alive forever.
  - clearTimeout → clearInterval at the close + error sites.

3 new tests in container-runner.test.ts:
  • Container alive past IDLE_TIMEOUT when stderr is active (regression)
  • HARD_CAP_MS fires even with continuous stderr churn
  • IDLE_TIMEOUT fires when both streams quiet (real-hang case)
Plus loggerMock hoisted via vi.hoisted so tests can introspect the
'Container timeout, stopping gracefully' error log as the canonical
kill signal.

Existing 'timeout with no output' test updated to advance one extra
interval-tick past IDLE_TIMEOUT + 30s grace (the interval-based check
fires up to 60s late vs the old single-shot timer).

Full test suite green: 2447/2447.
…s (PR 3) (#45)

* feat(brain): identity-merge engine — pivot ku_entities + aliases atomically

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(brain): identity-merge — self/missing/type/double-merge rejections

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(events): EntityMergeRequestedEvent type

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(channels): claw merge text trigger on Signal + Discord with shared parser

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): identity-merge handler — resolve handles + ack reply

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): start/stop identity-merge handler with chat-ingest

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): attachment-summary helper with vision tier + fallback

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): include attachment summaries in window transcript

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(env): document BRAIN_MERGE_AUTO_LOW_CONF_REJECT (reserved)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…st (#44)

Two follow-ups to PR #43:

1. Add CONTAINER_STDERR_GRACE_MS env override in src/config.ts.
   Default 300000 (5min). container-runner reads from config instead
   of the local hardcoded constant. Lets ops bump the grace window
   if a workload regularly produces longer stderr-quiet stretches
   (e.g. an MCP tool that blocks for 6+ min on a slow upstream)
   without recompiling.

2. New test in container-runner.test.ts: 'recovers when stderr goes
   briefly silent then resumes (gap shorter than stdout idle)'. The
   reviewer-flagged edge case — stderr active, then 6min silence
   (past 5min grace), then stderr resumes for another 30min before
   stdout emits. Confirms the kill condition correctly requires BOTH
   stdout idle > timeoutMs AND stderr idle > grace; an isolated
   stderr gap within the stdout-idle window does NOT kill.

Full suite: 2448/2448 green (was 2447, +1 new test).
Adds a read-only audit command surfacing four classes of issue that
compound silently as the brain accumulates KUs:

  1. Near-duplicate KUs (cosine >= 0.95 in same (entity, topic_key))
  2. Temporal contradictions (overlapping intervals, conflicting text)
  3. Orphan entities (<2 linked KUs, >30 days old)
  4. Stale wiki pages (last_synthesis_at older than newest KU valid_from)

v1 does NO autonomous CRUD. Each finding includes a "merge /
mark-superseded / ignore" suggestion in the report; the user runs the
action manually. Top anti-pattern from both deep-research passes.

Wired into src/index.ts as /wikilint slash command (intercept order
matters — /wikilint check is placed BEFORE /wiki since they share the
prefix). Cron piggybacks the existing digest scheduler with a 7-day
debounce via system_state.last_wikilint, so no new setInterval.

Class 1 uses a new fetchKuVectors helper in qdrant.ts (Qdrant retrieve
with with_vector: true). Pair budget capped at 500 per run; groups
larger than 32 KUs are themselves anomalous and logged + skipped.
Group key uses a NUL separator since topic_key is space-joined by
extract.ts:normalizeTopic ("current employer") — regression-tested.

Tests: 21 new (15 in wikilint.test.ts, 6 in wikilint-command.test.ts).
Brain suite 431/431 green; full repo 2468/2468; typecheck + build clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a channel-aware reply hook so the identity-merge handler can post
ack/error messages back to the chat where the operator typed
`claw merge`. Loose coupling via a module-level setter
(setIdentityMergeReply) avoids threading channel routing through
chat-ingest's options.

index.ts wires it after channels connect: maps (chat_id, platform) to
the proper JID (sig:+phone, sig:group:base64, dc:channel_id) and calls
channel.sendMessage. Failures log warn, never propagate.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e diagnostic (#48)

Two related fixes:

1. WhatsApp reconnect storm (reason 405 firing every 1-2s) was hammering
   the daemon and burning resources. Replaced the bare 5s-on-error retry
   with proper exponential backoff (1s → 2s → 4s → 8s → 16s → 30s cap)
   plus ±25% jitter. Counter resets on successful connection. Reconnect
   timer is unref()'d so it doesn't keep the event loop alive.

2. Signal: when an envelope has no resolvable dataMsg (Note-to-Self
   syncMessage edits/deletes hit this path), capture the full envelope
   JSON instead of just syncMessage. This lets us identify the exact
   field signal-cli uses for sync deletes when the user retries the
   PR 4 e2e.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git status was showing 4 untracked entries on every check:
  .claude/worktrees/  — concurrent agent worktrees (this repo runs
                        many parallel claude/* + qa/* branches)
  .claire/            — assistant local state
  .env.bak            — pre-existing env backup (now matched by *.bak)
  scripts/brain-p0-smoke.ts — pre-existing untracked (left alone)

*.bak under Secrets so future env backups don't leak.
…gnals (#50)

The reflection prompt was being fed `caller='agent-auto'` queries, whose
`query_text` is the chat envelope (`<context>...<messages>...</messages>`),
not a user-typed question. Manual smoke runs against live brain.db
emitted hollow rules that just templated the inputs back ("retrieve from
these KU IDs") — exactly the anti-pattern the prompt warned against.

Two fixes in `collectSignals`:

1. **Caller deny-list** — `agent-auto` excluded from zero-result queries,
   recurring-retrieval grouping, and the per-KU sample-query subquery.
   Open-ended deny-list (vs. allow-list) is more forgiving when new
   user-initiated callers land. Add new noisy callers to NOISY_CALLERS.

2. **`stripChatEnvelope`** — extracts the most recent `<message>` content
   from `query_text`, falls through unchanged if no `<message>` tag is
   present. Defensive — covers any non-auto-recall caller that still
   wraps in a chat envelope. Lookahead `<message(?=\s|>)` prevents
   accidentally matching the `<messages>` wrapper tag.

Live impact (manual run against real brain.db):
- BEFORE: 11 recurring-retrieval signals, all polluted; prompt was 4138
  chars of XML; emitted 2 hollow rules with cosine ~0.9 confidence.
- AFTER: 1 recurring signal (a real "ping" healthcheck pattern); prompt
  is 979 chars; emitted 0 rules — model honestly reports "no usable
  signal" rather than fabricating.

Quality > quantity for the D8 gate (≥20 reviewed brain-reflection rules
before Phase 5 prompt injection ships). Better to wait for real signals
than pollute the rule pool.

3 new tests in `procedural-reflect.test.ts`:
- excludes agent-auto from zero-result queries
- excludes agent-auto from recurring retrievals
- strips chat-window XML envelope from surfaced query_text

22/22 procedural-reflect tests green; pre-existing flake in
`startDigestSchedule(daily)` unrelated to this change (reproduces on
clean origin/main).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… top-level timestamp (#51)

Sync envelopes from the bbernhard signal-cli-rest-api wrapper place
editMessage/remoteDelete inside syncMessage.sentMessage with no top-level
`timestamp` on the sentMessage object — the inner timestamp lives at
editMessage.dataMessage.timestamp / remoteDelete.timestamp. The handler
previously called `new Date(dataMsg.timestamp).toISOString()` before the
editMessage/remoteDelete branches, which threw RangeError. The throw was
swallowed by the poll-loop catch at debug level, so PR 4's chat-edit-sync
handlers — which subscribe to chat.message.edited / chat.message.deleted
events emitted in those branches — never fired in production.

Fall back dataMsg.timestamp to envelope.timestamp so the early-return
branches are reached. The fallback doesn't affect their semantics; they
extract their own targets from editMessage.targetSentTimestamp and
remoteDelete.timestamp respectively.

Two regression tests cover the previously-uncovered sync envelope shapes
(Note-to-Self-style sentMessage with no top-level timestamp), confirmed
to fail without the fix and pass with it.

Note: a separate wrapper-level limitation remains for Note-to-Self
"Delete for Everyone" actions, which arrive as `syncMessage: {}` (empty)
because bbernhard does not serialize syncMessage.delete payloads. That
gap is upstream of nanoclaw and tracked separately; non-NTS deletes via
dataMessage.remoteDelete are unaffected.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…action (#52)

When a user edits a message originally sent via `claw save <text>`, the
edit envelope carries the FULL edited body (`claw save <new text>`),
not just the tail. The original ingest path stripped `claw save ` before
emitting `chat.message.saved`, but PR 4's edit-sync handler re-ran
extractPipeline on the raw new_text — producing KUs whose text included
the literal `claw save ` prefix, inconsistent with the original.

Live repro on 2026-04-28:
- Original: `claw save Pay $5,000 to Acme by Friday`
  → KU text: `Pay $5,000 to Acme by Friday` ✓
- Edit:    `claw save Pay $7,500 to Acme by Monday`
  → New KU text: `claw save Pay $7,500 to Acme by Monday` ✗

Add a small `stripClawTriggerPrefix(text)` helper used both for the
single-message path and inside `rebuildWindowTranscript`. Same regex
shape the channel-side text trigger uses, so KU text stays consistent
across original ingest + edit re-extraction.

Two regression tests cover save and merge prefixes; LLM caller used as
a spy to verify the prompt doesn't carry the prefix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(brain): identity-unmerge engine — round-trip mergeEntities with rich snapshot

Enriches pre_merge_snapshot to capture pre-merge state of ku_entities,
aliases, and relationships for both kept + merged entities (schema_v2).
Adds unmergeEntities(mergeId) that atomically restores the snapshot and
deletes the merge_log row. Guardrail refuses if either entity has new
rows added after the merge (force:true to override).

v1-era snapshots without the rich data are rejected with a clear error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(brain): tighten unmergeEntities guardrail JSDoc to match impl

Reviewer noted the prior comment claimed the guardrail covered
ku_entities, aliases, and relationships, but only ku_entities are
checked today. Reword to match reality and flag the gap for follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): claw unmerge — event type, channel triggers, handler, lifecycle

Adds the operator-facing `claw unmerge <merge_id_or_prefix> [--force]`
command across both Signal and Discord channels:

- `EntityUnmergeRequestedEvent` type + EventMap entry
- Signal/Discord text triggers parse the prefix and optional `--force`
  trailing flag
- `handleEntityUnmergeRequested` resolves exact-match-then-prefix-match
  on entity_merge_log.merge_id, refuses ambiguous prefixes, calls
  unmergeEntities, formats ack reply
- startIdentityMergeHandler now subscribes to both merge + unmerge
  events; reuses the channelReply wiring from PR #47

The engine work + rich snapshot landed earlier in this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ns (#54)

* docs(brain): auto-merge for duplicate entities — design spec

Brainstorm output for the next step in the identity-merge series. v1 is a
nightly batch sweep with three confidence tiers: silent auto-merge for
hard-identifier matches, chat suggestions for name-only matches, drops for
fuzzy matches. Reuses mergeEntities/unmergeEntities/setIdentityMergeReply
from PRs 45/47/53.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plan(brain): auto-merge for duplicate entities — implementation plan

22 TDD-staged tasks executing the spec from 2026-04-28. Each task is
2-5 minutes: failing test, expected fail output, minimal code, expected
pass output, commit. Covers schema, classifier (high + medium tiers),
sweep with dry-run / env-gate / idempotency, mergeEntities lifecycle
hook, entity-id prefix resolution, chat suggestion formatter, claw
merge-reject parsing in Signal + Discord (with ordering safeguard),
auto-suppression on operator unmerge of auto:high, and nightly schedule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): schema for auto-merge suggestions and suppressions

* feat(events): add entity.merge.suggested and entity.merge.reject.requested

* feat(brain): auto-merge lexOrdered helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): normalizePhone helper for hard-identifier matching

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): high-confidence duplicate detector (hard-identifier match)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(brain): derive reason_code post-hoc in findHighConfidenceCandidates

Address Task-5 reviewer feedback: deriving reason_code from the matched
fields (in HARD_IDENTIFIER_FIELDS declaration order) instead of recording
it in a parallel Map removes the dependency on SQLite row order and Map
insertion order. Also document that pairKey.split('|') is safe because
ULIDs (Crockford base32) never contain '|'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): medium-confidence duplicate detector with conflict short-circuit

Add findMediumConfidenceCandidates that groups person entities by
lower(trim(canonical.name)) and emits medium-confidence pairs, skipping
any pair where both entities have conflicting (non-overlapping) normalized
values for a hard-identifier field. Also exports MediumConfidencePair
interface. Includes 7 new tests covering name-exact match, case/trim
normalization, empty-name guard, entity_type mismatch, conflict
short-circuit, one-sided alias, and a production-fixture regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): isSuppressed check for auto-merge candidate filter

* feat(brain): mark matching merge-suggestion accepted on manual merge

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): runAutoMergeSweep — high-confidence path

* fix(brain): include discord_snowflake and whatsapp_jid in MergeEvidenceField

Task-9 reviewer flagged that HARD_IDENTIFIER_FIELDS includes
discord_snowflake and whatsapp_jid, but MergeEvidenceField (and the
matching MergeEvidence.matched_field union in identity-merge) only
listed the first three. The 'as MergeEvidenceField' cast in the sweep's
high-confidence path was therefore silently passing values outside the
declared union into entity_merge_log.evidence — a soft data-integrity
issue for downstream consumers (Task 10 notifications, Task 14 unmerge
display) that pattern-match on matched_field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): runAutoMergeSweep — medium-confidence path with event emission

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(brain): auto-merge sweep idempotency on re-run

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(brain): auto-merge sweep is a no-op when disabled

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(brain): auto-merge sweep dry-run writes nothing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(brain): correct dry-run medium-conf count to 2

The plan's expected count of 1 assumed high-conf and medium-conf classifiers
would not overlap, but in dry-run the high-conf merge is skipped, so the
medium-conf classifier still observes the Alice/Alice pair (their emails
overlap, so hasConflictingIdentifier returns false) on top of the
Jonathan/Jonathan pair. Both legitimately surface — dry-run reports what
the medium classifier sees on the current entity table, not what would
remain after a hypothetical high-conf merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(brain): auto-suppress entity pair when operator unmerges an auto:high merge

* feat(brain): resolveHandle accepts entity_id prefixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): handler + chat formatter for entity.merge.suggested

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(index): route entity.merge.suggested via main-group resolution

Task-15 reviewer flagged that the handler emits 'main'/'signal' as a
sentinel asking the channel layer to default to the main group, but the
channel layer didn't actually handle the sentinel — it would construct
sig:group:main (an invalid Signal group ID) and silently fail in the
channel.sendMessage catch. Suggestions would never reach chat in
production.

Detect the 'main' sentinel before the platform-specific JID construction
and route via the same isMain + ownsJid pattern that deliverBrainMessage
uses (index.ts:1365). Works correctly when the main group is on
Telegram, WhatsApp, Signal, or Discord — independent of the literal
'signal' platform token in the sentinel pair.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(signal): claw merge-reject trigger emits entity.merge.reject.requested

Insert the merge-reject matcher before the claw merge matcher so the
existing \b word boundary does not swallow the hyphenated command.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(discord): claw merge-reject trigger emits entity.merge.reject.requested

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): handler for claw merge-reject — writes suppression and updates suggestion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(brain): startAutoMergeSchedule — daily sweep with stop function

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(index): wire auto-merge nightly schedule

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: env vars for auto-merge feature

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(brain): merge-reject same-entity guard + env-var v1-tuning notes

Holistic Task-21 reviewer flagged a coverage gap (the same-entity branch
in handleEntityMergeRejectRequested at line 347-349 had no test) and a
documentation gap (BRAIN_MERGE_AUTO_HIGH_CONF_THRESHOLD and
BRAIN_MERGE_AUTO_SUGGEST_THRESHOLD are reserved-for-future env vars but
were silently undocumented as such — operators changing them would see
no effect since v1 hardcodes confidence at 1.0 and 0.6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… vars

The launchd-managed wrapper exec'd node directly, leaving the running
service unable to read .env. Vars only made it through if explicitly
listed in the plist's EnvironmentVariables, which means each new
feature-gate env var (BRAIN_MERGE_AUTO_ENABLED, etc.) silently no-ops
in production unless the user remembers to update the plist by hand.

Sourcing .env at wrapper entry (with set -a / set +a) gives launchd
runs the same env surface as 'npm run dev'. Caught while end-to-end
testing the auto-merge schedule from PR #54 — the schedule was wired
correctly but never fired because process.env.BRAIN_MERGE_AUTO_ENABLED
was undefined at runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ci-bot <ci@local>
…#58)

* chore: prettier --write on src/ to clear pre-existing format drift

Required to unblock CI's format:check gate. Whitespace-only.

* fix(index): suppress 'Email intelligence trigger failed' alert on transient upstream errors

Anthropic API socket drops (UND_ERR_SOCKET), 529 overloaded_error,
ECONNRESET / ENETUNREACH, and 502/503 gateway errors were surfacing as
chat-facing alerts even though the next debounced email batch reliably
recovered within ~1 minute. Empirically every observed alert was followed
by a successful retry, so the alert was pure noise.

Adds isTransientAgentError() classifier; runAgent now returns
{ status, error? } so the error string survives to the email-trigger
callsite, which suppresses the chat alert (logs warn instead) for
transient errors. Real failures (timeouts, code-1 exits, parse errors,
budget) still alert as before.
… taps (#59)

* fix(triage): live-refresh archive dashboard + visible toast on button taps

Pinned "Archive queue — N pending" drifted because the gmail-reconciler,
junk-reaper, and per-card archive paths resolved items without re-rendering
the dashboard. After enough drift, "Archive all 53" matched 0 rows in DB and
silently no-op'd; per-card Archive/Dismiss/Snooze gave no UI feedback either,
so the whole feature read as broken.

Refresh the pinned dashboard from every resolution path that touches the
archive queue, return a `{toast}` from the callback router so Telegram can
surface visible feedback via answerCallbackQuery, and detect the empty-queue
archive_all case explicitly with an explanatory toast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(triage): self-heal pinned dashboard when its id drifts from Telegram

A second failure mode of the same bug: the cached pinned_msg_id in
triage_dashboards can drift from Telegram's actual current pin (DB
migration, manual unpin, out-of-band repin from an older build). Edits
to a no-longer-pinned message succeed silently — Telegram allows
arbitrary edits on past messages, but the chat header keeps showing
whichever message is the active pin, so the user sees stale content
even though the bot reports success.

Verify the cached id matches Telegram's getChat.pinned_message before
editing. On drift, drop the tracking row and post a fresh dashboard
that gets pinned and tracked. One extra getChat call per upsert (~100ms)
is acceptable on the dashboard refresh path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(triage): collapse card text on archive/dismiss/snooze success

Match the existing confirm_archive "✅ Archived" pattern for the per-card
triage callbacks: replace the card text with a one-liner status and clear
buttons in the same edit. Keeps an audit trail (timestamp + status) but
visually marks the card as done so the chat doesn't fill with handled-but-
still-actionable-looking cards.

Failure paths (gmail_failed) keep the original card text intact so the
user has retry context — only the buttons are cleared.

Status text mirrors the toast strings:
  archive    → 🗃 Archived
  dismiss    → ✓ Dismissed
  snooze 1h  → ⏰ Snoozed 1h
  snooze tom → ⏰ Snoozed until tomorrow 8am
  override→archive    → 🗃 Moved to archive queue
  override→attention  → 📥 Moved to attention

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: prettier formatting on callback-router fixes

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… items (#61)

* fix(triage): suppress reminders + dashboard noise for already-handled items

Two related noise sources:

1. The attention reminder sweep trusted the local `tracked_items.state`
   column. If the user archived (or replied to) a thread directly in
   Gmail, the row stayed `pushed` until the gmail-reconciler converged
   (2-4 min minimum, longer if the reconciler was hung), and a "Still
   waiting on you" reminder could fire on an already-handled email.

   Now the sweep does a synchronous `getThreadInboxStatus` check per
   gmail-sourced candidate before sending. If the thread is out of
   INBOX (or the user replied in-thread) the row is resolved in place
   with `gmail:external` / `gmail:user-replied` and no reminder fires.
   Gmail timeouts/errors fall through to send the reminder — suppressing
   a real reminder due to a transient outage would be the worse failure.

2. `renderArchiveDashboard` posted + pinned a fresh "Archive queue — 0
   pending" message whenever no pinned dashboard existed (clean install,
   after a state reset, recovery from a stale pin). Both `sendMessage`
   and `pinChatMessage` fire Telegram notifications, so the user got
   pinged for "nothing to archive" — pure noise. Now mirrors the
   attention-dashboard guard: skips the create path when total=0 and
   no pinned dashboard exists yet. Existing dashboards still get edited
   silently so the count visibly drops to 0.

Tests cover: precheck suppression on `out` status, fallthrough on
timeout, no-op create at total=0, silent edit-down-to-0.

* style: prettier --write triage-reminder.test.ts

* fix(ci): unblock CI — install Playwright Chromium + fix dated digest test

Two pre-existing CI failures, blocking every PR:

1. `src/brain/__tests__/weekly-digest.test.ts:415` reset the daily-digest
   debounce using `Date.now() - 23h` (real wall clock) but compared it
   against a fixed `tueMorning` of 2026-04-28 10:00 inside `nowFn`. Once
   real time drifted past 2026-04-28, the "23h ago" anchor landed AFTER
   the simulated `tueMorning`, so `now.getTime() - lastMs` went negative,
   the debounce check tripped, and the third delivery was suppressed.
   Time-bomb test. Fixed by anchoring the reset to `tueMorning` itself
   so the assertion is wall-clock-independent.

2. `src/__tests__/signer-integration.test.ts` and
   `src/signer/__tests__/docusign-executor.test.ts` call
   `chromium.launch()` from `playwright-core`, which ships without
   browser binaries. CI had no install step, so Chromium was missing on
   every run ("Executable doesn't exist at .../chromium_headless_shell").
   Added an `npx playwright-core install --with-deps chromium` step
   before tests. Uses the `playwright-core` bin since the repo doesn't
   depend on the full `playwright` meta-package.

Local: 267/267 test files, 2580/2580 tests passing.
Pre-commit was running `npm run format:fix` which calls
`prettier --write "src/**/*.ts"` on the entire src tree on every
commit — slow on a large repo and writes to files the user didn't
touch.

`lint-staged` was already in devDependencies but had no config and
wasn't invoked. This adds:

  "lint-staged": {
    "src/**/*.ts": ["prettier --write", "eslint --fix"]
  }

…and switches `.husky/pre-commit` from `npm run format:fix` to
`npx lint-staged` so we only format/lint the staged subset and pick
up the eslint --fix step automatically.

Net effect: faster commits, narrower writes, no behavior loss.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gavrielc
Copy link
Copy Markdown
Collaborator

gavrielc commented May 1, 2026

@topcoder1 good contribution. please reopen a clean pr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants