chore(tooling): switch pre-commit hook to lint-staged #2171
Closed
topcoder1 wants to merge 986 commits into
Exhaustive QA pass on src/mini-app/. 13 findings across security, correctness, and consistency. 9 fixed in atomic commits this branch, 1 blocked on user architecture decision (public tunnel auth), 3 deferred as follow-ups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(qa): auto-expire + test fix (v1.2.55)
…(v1.2.56)
src/index.test.ts was failing to load because its
vi.mock('./channels/registry.js') omitted `registerChannel`.
`telegram.ts` calls `registerChannel('telegram', ...)` at module
top-level and `triage/dashboards.ts` imports telegram.ts directly
(bypassing the empty `./channels/index.js` mock), so vitest couldn't
resolve the call and aborted before any of the 19 tests in the file
ever ran.
Adding the missing mock export makes the full suite 1630/1630 green
(up from 1611 — all 19 previously-dark index tests now execute).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test(index): fix missing registerChannel mock (v1.2.56)
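A minimal, dependency-free sketch of this failure mode; `Registry`, `loadTelegramModule`, and both mock objects below are illustrative stand-ins, not the project's actual code.

```typescript
// A module that calls a dependency's export at import time (as telegram.ts
// does with registerChannel) throws if the test double omits that export.
type Registry = {
  getChannel: (name: string) => unknown;
  registerChannel?: (name: string, channel: unknown) => void;
};

// Stands in for telegram.ts's top-level `registerChannel('telegram', ...)`.
function loadTelegramModule(registry: Registry): string {
  registry.registerChannel!("telegram", { send: () => {} });
  return "loaded";
}

const partialMock: Registry = { getChannel: () => undefined }; // registerChannel missing
const fullMock: Registry = { ...partialMock, registerChannel: () => {} };

function tryLoad(registry: Registry): string {
  try {
    return loadTelegramModule(registry);
  } catch {
    return "aborted before any test ran"; // what vitest saw: module load failed
  }
}
```

With vitest, the corresponding fix is to include the missing export in the `vi.mock` factory's returned object (e.g. `registerChannel: vi.fn()`).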
Pulls in 7 commits hardening the mini-app module:
- ISSUE-003 archive trust (42806aa)
- ISSUE-004+009 CORS/bulk cap (988062c)
- ISSUE-002+008 task-detail JS escape (c751e3e)
- ISSUE-006 any-cast drop (fe12763)
- ISSUE-007 escape centralization (955db64)
- ISSUE-005+012 lint + age (2de0e9a)
- QA report (9cbf832)
ISSUE-010 (MED, follow-up): older mini-app endpoints (archive, bulk
archive, revert) returned one of three bespoke shapes while the new
reply-send routes used {ok, error?, code?} per the approved spec.
Clients branched differently per endpoint.
Standardize:
- All error responses now carry {ok: false, error, code}.
- `success: true/false` preserved alongside `ok` on routes where
existing callers may still key on it (archive single, draft revert).
- Codes added: INVALID_BODY, BATCH_TOO_LARGE, GMAIL_UNAVAILABLE,
ITEM_NOT_FOUND, GMAIL_API_ERROR, WATCHER_UNAVAILABLE, INTERNAL.
Backward compatible — every field that existed before still exists.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
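A sketch of what the standardized envelope could look like; the codes are the ones listed in this commit, but `ErrorCode`, `ErrorBody`, and `errorBody` are illustrative names, not the shipped implementation.

```typescript
// The standardized mini-app error shape: every error carries {ok, error, code},
// with the legacy `success` flag preserved only where old callers key on it.
type ErrorCode =
  | "INVALID_BODY" | "BATCH_TOO_LARGE" | "GMAIL_UNAVAILABLE"
  | "ITEM_NOT_FOUND" | "GMAIL_API_ERROR" | "WATCHER_UNAVAILABLE" | "INTERNAL";

interface ErrorBody {
  ok: false;
  error: string;
  code: ErrorCode;
  success?: false; // legacy field, archive single + draft revert only
}

function errorBody(code: ErrorCode, error: string, legacySuccess = false): ErrorBody {
  const body: ErrorBody = { ok: false, error, code };
  if (legacySuccess) body.success = false; // kept alongside `ok` for old callers
  return body;
}
```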
…dates

ISSUE-011 (LOW, follow-up): the task-detail template called location.reload() every time updated_at changed. Long-running tasks reloaded the page on every step tick — lost scroll position, flashed the tab, re-requested all assets.

Render steps + logs into id="steps-slot" / id="logs-slot" server-side, then patch them client-side from the SSE payload. Status, title, and log autoscroll also update in place. Client mirrors the server's renderSteps/renderLogs shape so initial paint and updates are visually identical.

Falls back to keeping the prior DOM when JSON.parse fails, so a single bad SSE frame doesn't blank the page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
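The keep-prior-DOM fallback can be sketched as a pure helper; `Slots` and `applySsePatch` are assumed names (the real client patches id="steps-slot" / id="logs-slot" directly).

```typescript
// Parse one SSE frame and patch the two slots; on a bad frame, return the
// current content untouched so the page never blanks.
interface Slots { steps: string; logs: string }

function applySsePatch(frame: string, current: Slots): Slots {
  try {
    const payload = JSON.parse(frame) as Partial<Slots>;
    return {
      steps: typeof payload.steps === "string" ? payload.steps : current.steps,
      logs: typeof payload.logs === "string" ? payload.logs : current.logs,
    };
  } catch {
    return current; // single bad SSE frame: keep prior DOM
  }
}
```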
ISSUE-013 (LOW, follow-up): five catch blocks in mini-app/server.ts
silently swallowed errors. Most were JSON.parse over tracked_items
metadata — expected to fail cleanly on bad rows, but with zero signal
a DB corruption looked identical to "table missing in test".
Add logger.debug with { err, emailId/id, component } on:
- extractAccount JSON.parse (home listing)
- bulk archive metadata parse
- /email/:emailId tracked_items lookup + metadata parse
- /api/email/:emailId/archive tracked_items lookup + metadata parse
Debug level keeps default logs quiet in prod; visible at LOG_LEVEL=debug
when actually triaging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
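The pattern above, sketched with an assumed pino-style logger shape; `parseMetadata` is illustrative, not the repo's helper.

```typescript
// A debug-level breadcrumb in a previously-silent catch: bad rows still fail
// cleanly, but DB corruption no longer looks identical to "table missing".
type DebugLogger = { debug: (ctx: object, msg: string) => void };

function parseMetadata(
  raw: string,
  emailId: string,
  logger: DebugLogger,
): Record<string, unknown> | null {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch (err) {
    // visible only at LOG_LEVEL=debug, quiet in prod
    logger.debug({ err, emailId, component: "mini-app" }, "tracked_items metadata parse failed");
    return null;
  }
}
```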
The old message read "not found (already resolved?)" — misleading for proposals that got auto-expired by the cron or manually cleaned up, since nobody actually *resolved* them via Merge/Close. Now reads "(resolved, expired, or manually cleaned up)" so the reviewer doesn't wonder whether they clicked something they didn't. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(qa): clarify 'not found' copy for missing proposals (v1.2.57)
…raft-with-AI Approved design for the next layer of mini-app capability: - Classification-driven button rows (push/human, digest, transactional, fallback), with a universal ⋯ More escape hatch - Canned reply chips (Thanks / Got it / Will do) for human senders, reusing the existing reply-send + undo machinery - Snooze (1h / Tomorrow 8am / Next Mon / Next week / Custom) with a DB-backed 60s wake-tick scheduler - Unsubscribe via List-Unsubscribe header (RFC 8058 one-click, mailto, legacy GET), with fallback to Gmail when headers absent - Mute thread — SSE intake filter skips muted thread_ids entirely - Draft with AI: Quick (no prompt) and Prompt (intent seed), both spawning an agent container that creates a Gmail draft the existing reply mode picks up Scoped out (v2): agent body-scraping for unsubscribe links, dedicated Muted/Snoozed list views, auto-unmute on user reply. See: docs/superpowers/specs/2026-04-19-miniapp-ux-expansion-design.md
18 tasks across 10 phases: 1-2. Migrations + sender/subtype classifier 3. Mute thread (filter, routes, invariant) 4. Snooze (scheduler, routes, Telegram wake) 5. Unsubscribe (method picker, executor, route) 6. Context-aware action row template 7. Canned reply chips (reuse PendingSendRegistry) 8. Snooze/Unsubscribe/Mute UI wiring 9. Draft with AI (Quick + Prompt + polling) 10. Integration tests + prod smoke + merge ~60 new test cases. Every task is TDD (fail → implement → pass → commit). Atomic commits land each slice of value independently. Spec: docs/superpowers/specs/2026-04-19-miniapp-ux-expansion-design.md
Add muted_threads, snoozed_items, unsubscribe_log tables. Add sender_kind + subtype columns to tracked_items. FK on snoozed_items cascades on tracked_items delete. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 1
Review flagged that the snoozed_items FK cascade advertised by the Phase 1 migration would not actually cascade in production because SQLite requires PRAGMA foreign_keys = ON per connection. createSchema now enables the pragma before any CREATE TABLE runs, so initDatabase + _initTestDatabase + runMigrations all benefit.

Added a regression test that verifies cascade without manually enabling the pragma — previously failed, now passes.

Review: Task 1 code-quality I2
Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 1
Pure functions for classifying email sender (human/bot) and subtype (transactional). Used by the SSE intake path to populate the new tracked_items columns and by the mini-app template to select the right button row. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 2
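Hypothetical shapes for the two pure classifiers; the heuristics below are illustrative guesses, not the shipped rules.

```typescript
type SenderKind = "human" | "bot";
type Subtype = "transactional" | null;

// Guess-level heuristic: automated sender local-parts mark the sender as a bot.
function classifySender(from: string): SenderKind {
  const addr = from.toLowerCase();
  return /\b(no-?reply|notifications?|mailer|do-?not-?reply)\b/.test(addr) ? "bot" : "human";
}

// Guess-level heuristic: receipt/invoice-style subjects are transactional.
function classifySubtype(subject: string): Subtype {
  return /\b(receipt|invoice|order|verification code|password reset)\b/i.test(subject)
    ? "transactional"
    : null;
}
```

Being pure over strings is what lets both the SSE intake path and the mini-app template call them without any DB or network dependency.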
…Thread

Pure helpers over muted_threads. muteThread cascade-resolves existing tracked_items. isThreadMuted fails open on DB error to prevent a blip from silently dropping inbound email.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3
Before writing a new tracked_items row, check muted_threads. If matched, skip the insert, archive the thread in Gmail, log, return. Also populate sender_kind + subtype columns on insert for classification-aware rendering downstream.

Added in two places:
- classifyFromSSE (sse-classifier.ts) — production hot path. Mute check fires before dedup; sender_kind/subtype computed best-effort from the SSE-available fields (sender, subject, snippet).
- processIncomingEmail (email-sse.ts) — new testable seam that accepts injected db + gmailOps, so the mute hook + classification wiring can be unit-tested end-to-end without the global DB singleton or the full classifier pipeline. Carries richer fields (headers, body, gmailCategory) that the SSE payload lacks.

insertTrackedItem (tracked-items.ts) extended to persist the two new columns.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 4)
…yFromSSE path

Task 4 review flagged that processIncomingEmail was never called by production and duplicated insertTrackedItem's INSERT SQL. The test exercised the dead seam, leaving the actual hot path (classifyFromSSE) untested for the new mute hook and sender_kind/subtype columns.

Delete the dead seam + its types. Rewrite email-sse-mute-hook.test.ts against classifyFromSSE so the test validates what actually runs in production.

Review: Task 4 critical issues 1+2
Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 4)
POST /api/email/:id/mute inserts muted_threads row, cascade-resolves all open tracked_items in the thread, archives on Gmail. DELETE removes the mute. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 5)
Asserts that no tracked_items row is both unresolved and in a muted thread. Runs in the QA invariants suite alongside the existing predicates. Catches mute-filter bugs or races where a muted thread still has a visible row. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 3 (Task 6)
9 commits from the UX expansion plan Phases 1-3:
- DB migrations for mute/snooze/unsubscribe + sender_kind/subtype (Task 1: 6c5de50, 78b98ec, e8f323d)
- classifySender / classifySubtype helpers (Task 2: 2e512bc)
- isThreadMuted / muteThread / unmuteThread helpers (Task 3: 9e024a4)
- Wire mute check + sender/subtype into SSE intake via classifyFromSSE (Task 4: d884dcf, c136a88)
- /api/email/:id/mute POST + DELETE routes in mini-app (Task 5: 458f638)
- muted-threads-never-visible QA invariant (Task 6: 7e02f95)

Miniapp now supports muting a Gmail thread: any future messages on the same thread_id are skipped at intake and auto-archived on Gmail. Existing tracked_items on the thread cascade to resolved/mute:retroactive.

Phases 4-10 (snooze, unsubscribe, context-aware UI, canned replies, Draft-with-AI) remain pending.
60s tick wakes snoozed items whose wake_at has passed: restores tracked_items.state/queue, deletes snooze row, emits email.snooze.waked. A push subscriber posts a Telegram reminder and the scheduler is cleaned up during graceful shutdown. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 4
POST /api/email/:id/snooze accepts '1h' | 'tomorrow-8am' | 'next-monday-8am' | 'next-week' | 'custom' (with ISO wake_at). Caps at 90 days. Wraps in a transaction so state/queue backup and tracked_items state change are atomic. DELETE is idempotent. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 4
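The wake tick could look roughly like this; row shape and helper names are assumed, and the real scheduler also restores tracked_items.state/queue and emits email.snooze.waked.

```typescript
// Snooze rows whose wake_at has passed are due; a 60s interval drains them.
interface SnoozeRow { id: number; item_id: number; wake_at: string }

function dueSnoozes(rows: SnoozeRow[], now: Date): SnoozeRow[] {
  return rows.filter((r) => Date.parse(r.wake_at) <= now.getTime());
}

function startWakeTick(
  load: () => SnoozeRow[],
  wake: (row: SnoozeRow) => void,
): () => void {
  const timer = setInterval(() => {
    for (const row of dueSnoozes(load(), new Date())) wake(row);
  }, 60_000);
  (timer as unknown as { unref?: () => void }).unref?.(); // don't hold the event loop open
  return () => clearInterval(timer); // called during graceful shutdown
}
```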
…ilMeta

Constructs a minimal RFC 2822 message (To/Subject/MIME + optional In-Reply-To / References), base64url-encodes, sends via gmail.users.messages.send.

getMessageMeta also now returns a small headers map (List-Unsubscribe, List-Unsubscribe-Post, List-Id, Precedence, Message-ID, References, In-Reply-To) used by the unsubscribe executor.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
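A minimal sketch of the raw-message construction; the helper name and exact header set are assumptions, though Gmail's messages.send does expect the base64url string in its `raw` field.

```typescript
// Build an RFC 2822 message and base64url-encode it for Gmail's `raw` field.
function buildRawMessage(opts: {
  to: string;
  subject: string;
  body: string;
  inReplyTo?: string; // Message-ID of the message being replied to
}): string {
  const lines = [
    `To: ${opts.to}`,
    `Subject: ${opts.subject}`,
    'Content-Type: text/plain; charset="UTF-8"',
    "MIME-Version: 1.0",
  ];
  if (opts.inReplyTo) {
    lines.push(`In-Reply-To: ${opts.inReplyTo}`, `References: ${opts.inReplyTo}`);
  }
  // RFC 2822: CRLF line endings, blank line separates headers from body
  const message = lines.join("\r\n") + "\r\n\r\n" + opts.body;
  return Buffer.from(message, "utf8").toString("base64url");
}
```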
pickUnsubscribeMethod inspects List-Unsubscribe / List-Unsubscribe-Post headers and returns the best available method, rejecting javascript: and data: schemes. executeUnsubscribe does the HTTP POST/GET or delegates mailto sends to gmailOps.sendEmail. 5s AbortController timeout on network calls with status=0 on failure. Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
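A sketch of the method-picker logic as described; names and parsing details are assumed.

```typescript
// Preference order per the commit: RFC 8058 one-click POST, then mailto,
// then legacy GET. javascript: and data: schemes are rejected outright.
type UnsubMethod =
  | { kind: "one-click-post"; url: string }
  | { kind: "mailto"; address: string }
  | { kind: "get"; url: string }
  | null;

function pickUnsubscribeMethod(headers: Record<string, string | undefined>): UnsubMethod {
  const list = headers["list-unsubscribe"];
  if (!list) return null;
  // The header is a comma-separated list of <uri> entries
  const uris = [...list.matchAll(/<([^>]+)>/g)].map((m) => m[1].trim());
  const safe = uris.filter((u) => !/^(javascript|data):/i.test(u));
  const http = safe.find((u) => /^https?:/i.test(u));
  const mailto = safe.find((u) => /^mailto:/i.test(u));
  const oneClick = /one-click/i.test(headers["list-unsubscribe-post"] ?? "");
  if (http && oneClick) return { kind: "one-click-post", url: http };
  if (mailto) return { kind: "mailto", address: mailto.slice("mailto:".length) };
  if (http) return { kind: "get", url: http };
  return null;
}
```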
…T + archive

Fetches message headers via GmailOps.getMessageMeta, picks the best method (one-click POST, mailto, legacy GET), executes it, logs to unsubscribe_log, always archives the thread regardless of remote outcome.

Returns 422 / NO_UNSUBSCRIBE_HEADER when absent. Remote 4xx/5xx maps to 502 / UNSUBSCRIBE_REMOTE_FAILED.

Adds fetchImpl DI to MiniAppServerOpts + ActionDeps for tests.

Plan: docs/superpowers/plans/2026-04-19-miniapp-ux-expansion.md — Phase 5
…ts alive (#43)

Chronic 'Email intelligence trigger failed' alerts (113 timeouts in the error log over the past few days) traced to a stdout-only liveness check. The Claude SDK writes tool-call debug logs to stderr while the agent is doing real work — deep research, multiple /recall calls, file reads, MCP probes. Previous code only reset the 30-min idle timer on stdout OUTPUT_MARKER chunks (user-facing emissions), so an agent doing 30+ min of internal tool calls before its first reply got killed despite being alive throughout.

Changes in src/container-runner.ts:
- Track lastStdoutAt + lastStderrAt timestamps. Stderr handler now updates lastStderrAt on every chunk (was: ignored entirely).
- Replace the single setTimeout with a setInterval-driven liveness check (cadence min(timeoutMs/10, 60_000)). Kills only when:
  • stdout idle > timeoutMs AND stderr idle > 5min, OR
  • total runtime > HARD_CAP_MS (max(timeoutMs * 2, 60min))
- Hard cap bounds runaway agents whose stderr never goes quiet — can't keep a noisy-but-stuck container alive forever.
- clearTimeout → clearInterval at the close + error sites.

3 new tests in container-runner.test.ts:
• Container alive past IDLE_TIMEOUT when stderr is active (regression)
• HARD_CAP_MS fires even with continuous stderr churn
• IDLE_TIMEOUT fires when both streams quiet (real-hang case)

Plus loggerMock hoisted via vi.hoisted so tests can introspect the 'Container timeout, stopping gracefully' error log as the canonical kill signal. Existing 'timeout with no output' test updated to advance one extra interval-tick past IDLE_TIMEOUT + 30s grace (the interval-based check fires up to 60s late vs the old single-shot timer).

Full test suite green: 2447/2447.
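The kill decision reduces to a pure predicate, sketched here with the constants from the commit message; `shouldKill` and `Liveness` are illustrative names, and the real check runs inside a setInterval in src/container-runner.ts.

```typescript
interface Liveness {
  now: number;
  startedAt: number;
  lastStdoutAt: number;
  lastStderrAt: number;
}

const STDERR_GRACE_MS = 5 * 60_000;

function shouldKill(l: Liveness, timeoutMs: number): boolean {
  const hardCapMs = Math.max(timeoutMs * 2, 60 * 60_000);
  if (l.now - l.startedAt > hardCapMs) return true; // runaway: noisy stderr can't keep it alive forever
  const stdoutIdle = l.now - l.lastStdoutAt;
  const stderrIdle = l.now - l.lastStderrAt;
  return stdoutIdle > timeoutMs && stderrIdle > STDERR_GRACE_MS; // both streams quiet: real hang
}
```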
…s (PR 3) (#45)

* feat(brain): identity-merge engine — pivot ku_entities + aliases atomically
* test(brain): identity-merge — self/missing/type/double-merge rejections
* feat(events): EntityMergeRequestedEvent type
* feat(channels): claw merge text trigger on Signal + Discord with shared parser
* feat(brain): identity-merge handler — resolve handles + ack reply
* feat(brain): start/stop identity-merge handler with chat-ingest
* feat(brain): attachment-summary helper with vision tier + fallback
* feat(brain): include attachment summaries in window transcript
* docs(env): document BRAIN_MERGE_AUTO_LOW_CONF_REJECT (reserved)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…st (#44)

Two follow-ups to PR #43:

1. Add CONTAINER_STDERR_GRACE_MS env override in src/config.ts. Default 300000 (5min). container-runner reads from config instead of the local hardcoded constant. Lets ops bump the grace window if a workload regularly produces longer stderr-quiet stretches (e.g. an MCP tool that blocks for 6+ min on a slow upstream) without recompiling.

2. New test in container-runner.test.ts: 'recovers when stderr goes briefly silent then resumes (gap shorter than stdout idle)'. The reviewer-flagged edge case — stderr active, then 6min silence (past 5min grace), then stderr resumes for another 30min before stdout emits. Confirms the kill condition correctly requires BOTH stdout idle > timeoutMs AND stderr idle > grace; an isolated stderr gap within the stdout-idle window does NOT kill.

Full suite: 2448/2448 green (was 2447, +1 new test).
Adds a read-only audit command surfacing four classes of issue that
compound silently as the brain accumulates KUs:
1. Near-duplicate KUs (cosine >= 0.95 in same (entity, topic_key))
2. Temporal contradictions (overlapping intervals, conflicting text)
3. Orphan entities (<2 linked KUs, >30 days old)
4. Stale wiki pages (last_synthesis_at older than newest KU valid_from)
v1 does NO autonomous CRUD. Each finding includes a "merge /
mark-superseded / ignore" suggestion in the report; the user runs the
action manually. Top anti-pattern from both deep-research passes.
Wired into src/index.ts as /wikilint slash command (intercept order
matters — /wikilint check is placed BEFORE /wiki since they share the
prefix). Cron piggybacks the existing digest scheduler with a 7-day
debounce via system_state.last_wikilint, so no new setInterval.
Class 1 uses a new fetchKuVectors helper in qdrant.ts (Qdrant retrieve
with with_vector: true). Pair budget capped at 500 per run; groups
larger than 32 KUs are themselves anomalous and logged + skipped.
Group key uses a NUL separator since topic_key is space-joined by
extract.ts:normalizeTopic ("current employer") — regression-tested.
Tests: 21 new (15 in wikilint.test.ts, 6 in wikilint-command.test.ts).
Brain suite 431/431 green; full repo 2468/2468; typecheck + build clean.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a channel-aware reply hook so the identity-merge handler can post ack/error messages back to the chat where the operator typed `claw merge`.

Loose coupling via a module-level setter (setIdentityMergeReply) avoids threading channel routing through chat-ingest's options. index.ts wires it after channels connect: maps (chat_id, platform) to the proper JID (sig:+phone, sig:group:base64, dc:channel_id) and calls channel.sendMessage. Failures log warn, never propagate.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e diagnostic (#48)

Two related fixes:

1. WhatsApp reconnect storm (reason 405 firing every 1-2s) was hammering the daemon and burning resources. Replaced the bare 5s-on-error retry with proper exponential backoff (1s → 2s → 4s → 8s → 16s → 30s cap) plus ±25% jitter. Counter resets on successful connection. Reconnect timer is unref()'d so it doesn't keep the event loop alive.

2. Signal: when an envelope has no resolvable dataMsg (Note-to-Self syncMessage edits/deletes hit this path), capture the full envelope JSON instead of just syncMessage. This lets us identify the exact field signal-cli uses for sync deletes when the user retries the PR 4 e2e.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
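The backoff schedule can be sketched as follows; the function name is assumed, and the random source is injectable so the math is testable.

```typescript
// Exponential reconnect delay: 1s, 2s, 4s, 8s, 16s, capped at 30s,
// with uniform jitter in [-25%, +25%] to avoid thundering-herd retries.
function reconnectDelayMs(attempt: number, rand: () => number = Math.random): number {
  const base = Math.min(1000 * 2 ** attempt, 30_000);
  const jitter = (rand() - 0.5) * 0.5 * base;
  return Math.round(base + jitter);
}
```

The caller would reset `attempt` to 0 on a successful connection and `unref()` the timer so it cannot keep the event loop alive.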
git status was showing 4 untracked entries on every check:
.claude/worktrees/ — concurrent agent worktrees (this repo runs
many parallel claude/* + qa/* branches)
.claire/ — assistant local state
.env.bak — pre-existing env backup (now matched by *.bak)
scripts/brain-p0-smoke.ts — pre-existing untracked (left alone)
Add *.bak under the Secrets section so future env backups don't leak.
…gnals (#50)

The reflection prompt was being fed `caller='agent-auto'` queries, whose `query_text` is the chat envelope (`<context>...<messages>...</messages>`), not a user-typed question. Manual smoke runs against live brain.db emitted hollow rules that just templated the inputs back ("retrieve from these KU IDs") — exactly the anti-pattern the prompt warned against.

Two fixes in `collectSignals`:

1. **Caller deny-list** — `agent-auto` excluded from zero-result queries, recurring-retrieval grouping, and the per-KU sample-query subquery. Open-ended deny-list (vs. allow-list) is more forgiving when new user-initiated callers land. Add new noisy callers to NOISY_CALLERS.

2. **`stripChatEnvelope`** — extracts the most recent `<message>` content from `query_text`, falls through unchanged if no `<message>` tag is present. Defensive — covers any non-auto-recall caller that still wraps in a chat envelope. Lookahead `<message(?=\s|>)` prevents accidentally matching the `<messages>` wrapper tag.

Live impact (manual run against real brain.db):
- BEFORE: 11 recurring-retrieval signals, all polluted; prompt was 4138 chars of XML; emitted 2 hollow rules with cosine ~0.9 confidence.
- AFTER: 1 recurring signal (a real "ping" healthcheck pattern); prompt is 979 chars; emitted 0 rules — model honestly reports "no usable signal" rather than fabricating.

Quality > quantity for the D8 gate (≥20 reviewed brain-reflection rules before Phase 5 prompt injection ships). Better to wait for real signals than pollute the rule pool.

3 new tests in `procedural-reflect.test.ts`:
- excludes agent-auto from zero-result queries
- excludes agent-auto from recurring retrievals
- strips chat-window XML envelope from surfaced query_text

22/22 procedural-reflect tests green; pre-existing flake in `startDigestSchedule(daily)` unrelated to this change (reproduces on clean origin/main).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
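A sketch of `stripChatEnvelope` consistent with the described behavior; the shipped implementation may differ in detail.

```typescript
// Extract the most recent <message> body from a chat-envelope query_text.
// The lookahead after `<message` keeps the `<messages>` wrapper from matching;
// plain text with no <message> tag falls through unchanged.
function stripChatEnvelope(queryText: string): string {
  const matches = [...queryText.matchAll(/<message(?=[\s>])[^>]*>([\s\S]*?)<\/message>/g)];
  if (matches.length === 0) return queryText;
  return matches[matches.length - 1][1].trim(); // most recent <message> wins
}
```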
… top-level timestamp (#51)

Sync envelopes from the bbernhard signal-cli-rest-api wrapper place editMessage/remoteDelete inside syncMessage.sentMessage with no top-level `timestamp` on the sentMessage object — the inner timestamp lives at editMessage.dataMessage.timestamp / remoteDelete.timestamp. The handler previously called `new Date(dataMsg.timestamp).toISOString()` before the editMessage/remoteDelete branches, which threw RangeError. The throw was swallowed by the poll-loop catch at debug level, so PR 4's chat-edit-sync handlers — which subscribe to chat.message.edited / chat.message.deleted events emitted in those branches — never fired in production.

Fall back dataMsg.timestamp to envelope.timestamp so the early-return branches are reached. The fallback doesn't affect their semantics; they extract their own targets from editMessage.targetSentTimestamp and remoteDelete.timestamp respectively.

Two regression tests cover the previously-uncovered sync envelope shapes (Note-to-Self-style sentMessage with no top-level timestamp), confirmed to fail without the fix and pass with it.

Note: a separate wrapper-level limitation remains for Note-to-Self "Delete for Everyone" actions, which arrive as `syncMessage: {}` (empty) because bbernhard does not serialize syncMessage.delete payloads. That gap is upstream of nanoclaw and tracked separately; non-NTS deletes via dataMessage.remoteDelete are unaffected.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…action (#52)

When a user edits a message originally sent via `claw save <text>`, the edit envelope carries the FULL edited body (`claw save <new text>`), not just the tail. The original ingest path stripped `claw save ` before emitting `chat.message.saved`, but PR 4's edit-sync handler re-ran extractPipeline on the raw new_text — producing KUs whose text included the literal `claw save ` prefix, inconsistent with the original.

Live repro on 2026-04-28:
- Original: `claw save Pay $5,000 to Acme by Friday` → KU text: `Pay $5,000 to Acme by Friday` ✓
- Edit: `claw save Pay $7,500 to Acme by Monday` → New KU text: `claw save Pay $7,500 to Acme by Monday` ✗

Add a small `stripClawTriggerPrefix(text)` helper used both for the single-message path and inside `rebuildWindowTranscript`. Same regex shape the channel-side text trigger uses, so KU text stays consistent across original ingest + edit re-extraction.

Two regression tests cover save and merge prefixes; LLM caller used as a spy to verify the prompt doesn't carry the prefix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
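A sketch of `stripClawTriggerPrefix`; the regex is an assumption mirroring the described `claw save` / `claw merge` text triggers, not the repo's exact pattern.

```typescript
// Strip a leading claw trigger so re-extraction on edits produces the same
// KU text as original ingest. Non-trigger text passes through unchanged.
function stripClawTriggerPrefix(text: string): string {
  return text.replace(/^\s*claw\s+(save|merge)\s+/i, "");
}
```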
* feat(brain): identity-unmerge engine — round-trip mergeEntities with rich snapshot

  Enriches pre_merge_snapshot to capture pre-merge state of ku_entities, aliases, and relationships for both kept + merged entities (schema_v2). Adds unmergeEntities(mergeId) that atomically restores the snapshot and deletes the merge_log row. Guardrail refuses if either entity has new rows added after the merge (force:true to override). v1-era snapshots without the rich data are rejected with a clear error.

* docs(brain): tighten unmergeEntities guardrail JSDoc to match impl

  Reviewer noted the prior comment claimed the guardrail covered ku_entities, aliases, and relationships, but only ku_entities are checked today. Reword to match reality and flag the gap for follow-up.

* feat(brain): claw unmerge — event type, channel triggers, handler, lifecycle

  Adds the operator-facing `claw unmerge <merge_id_or_prefix> [--force]` command across both Signal and Discord channels:
  - `EntityUnmergeRequestedEvent` type + EventMap entry
  - Signal/Discord text triggers parse the prefix and optional `--force` trailing flag
  - `handleEntityUnmergeRequested` resolves exact-match-then-prefix-match on entity_merge_log.merge_id, refuses ambiguous prefixes, calls unmergeEntities, formats ack reply
  - startIdentityMergeHandler now subscribes to both merge + unmerge events; reuses the channelReply wiring from PR #47

  The engine work + rich snapshot landed earlier in this branch.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ns (#54)

* docs(brain): auto-merge for duplicate entities — design spec

  Brainstorm output for the next step in the identity-merge series. v1 is a nightly batch sweep with three confidence tiers: silent auto-merge for hard-identifier matches, chat suggestions for name-only matches, drops for fuzzy matches. Reuses mergeEntities/unmergeEntities/setIdentityMergeReply from PRs 45/47/53.

* plan(brain): auto-merge for duplicate entities — implementation plan

  22 TDD-staged tasks executing the spec from 2026-04-28. Each task is 2-5 minutes: failing test, expected fail output, minimal code, expected pass output, commit. Covers schema, classifier (high + medium tiers), sweep with dry-run / env-gate / idempotency, mergeEntities lifecycle hook, entity-id prefix resolution, chat suggestion formatter, claw merge-reject parsing in Signal + Discord (with ordering safeguard), auto-suppression on operator unmerge of auto:high, and nightly schedule.

* feat(brain): schema for auto-merge suggestions and suppressions
* feat(events): add entity.merge.suggested and entity.merge.reject.requested
* feat(brain): auto-merge lexOrdered helper
* feat(brain): normalizePhone helper for hard-identifier matching
* feat(brain): high-confidence duplicate detector (hard-identifier match)
* refactor(brain): derive reason_code post-hoc in findHighConfidenceCandidates

  Address Task-5 reviewer feedback: deriving reason_code from the matched fields (in HARD_IDENTIFIER_FIELDS declaration order) instead of recording it in a parallel Map removes the dependency on SQLite row order and Map insertion order. Also document that pairKey.split('|') is safe because ULIDs (Crockford base32) never contain '|'.

* feat(brain): medium-confidence duplicate detector with conflict short-circuit

  Add findMediumConfidenceCandidates that groups person entities by lower(trim(canonical.name)) and emits medium-confidence pairs, skipping any pair where both entities have conflicting (non-overlapping) normalized values for a hard-identifier field. Also exports MediumConfidencePair interface. Includes 7 new tests covering name-exact match, case/trim normalization, empty-name guard, entity_type mismatch, conflict short-circuit, one-sided alias, and a production-fixture regression.

* feat(brain): isSuppressed check for auto-merge candidate filter
* feat(brain): mark matching merge-suggestion accepted on manual merge
* feat(brain): runAutoMergeSweep — high-confidence path
* fix(brain): include discord_snowflake and whatsapp_jid in MergeEvidenceField

  Task-9 reviewer flagged that HARD_IDENTIFIER_FIELDS includes discord_snowflake and whatsapp_jid, but MergeEvidenceField (and the matching MergeEvidence.matched_field union in identity-merge) only listed the first three. The 'as MergeEvidenceField' cast in the sweep's high-confidence path was therefore silently passing values outside the declared union into entity_merge_log.evidence — a soft data-integrity issue for downstream consumers (Task 10 notifications, Task 14 unmerge display) that pattern-match on matched_field.

* feat(brain): runAutoMergeSweep — medium-confidence path with event emission
* test(brain): auto-merge sweep idempotency on re-run
* test(brain): auto-merge sweep is a no-op when disabled
* test(brain): auto-merge sweep dry-run writes nothing
* test(brain): correct dry-run medium-conf count to 2

  The plan's expected count of 1 assumed high-conf and medium-conf classifiers would not overlap, but in dry-run the high-conf merge is skipped, so the medium-conf classifier still observes the Alice/Alice pair (their emails overlap, so hasConflictingIdentifier returns false) on top of the Jonathan/Jonathan pair. Both legitimately surface — dry-run reports what the medium classifier sees on the current entity table, not what would remain after a hypothetical high-conf merge.

* feat(brain): auto-suppress entity pair when operator unmerges an auto:high merge
* feat(brain): resolveHandle accepts entity_id prefixes
* feat(brain): handler + chat formatter for entity.merge.suggested
* fix(index): route entity.merge.suggested via main-group resolution

  Task-15 reviewer flagged that the handler emits 'main'/'signal' as a sentinel asking the channel layer to default to the main group, but the channel layer didn't actually handle the sentinel — it would construct sig:group:main (an invalid Signal group ID) and silently fail in the channel.sendMessage catch. Suggestions would never reach chat in production.

  Detect the 'main' sentinel before the platform-specific JID construction and route via the same isMain + ownsJid pattern that deliverBrainMessage uses (index.ts:1365). Works correctly when the main group is on Telegram, WhatsApp, Signal, or Discord — independent of the literal 'signal' platform token in the sentinel pair.

* feat(signal): claw merge-reject trigger emits entity.merge.reject.requested

  Insert the merge-reject matcher before the claw merge matcher so the existing \b word boundary does not swallow the hyphenated command.

* feat(discord): claw merge-reject trigger emits entity.merge.reject.requested
* feat(brain): handler for claw merge-reject — writes suppression and updates suggestion
* feat(brain): startAutoMergeSchedule — daily sweep with stop function
* feat(index): wire auto-merge nightly schedule
* chore: env vars for auto-merge feature
* test(brain): merge-reject same-entity guard + env-var v1-tuning notes

  Holistic Task-21 reviewer flagged a coverage gap (the same-entity branch in handleEntityMergeRejectRequested at line 347-349 had no test) and a documentation gap (BRAIN_MERGE_AUTO_HIGH_CONF_THRESHOLD and BRAIN_MERGE_AUTO_SUGGEST_THRESHOLD are reserved-for-future env vars but were silently undocumented as such — operators changing them would see no effect since v1 hardcodes confidence at 1.0 and 0.6).

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
… vars

The launchd-managed wrapper exec'd node directly, leaving the running service unable to read .env. Vars only made it through if explicitly listed in the plist's EnvironmentVariables, which means each new feature-gate env var (BRAIN_MERGE_AUTO_ENABLED, etc.) silently no-ops in production unless the user remembers to update the plist by hand. Sourcing .env at wrapper entry (with set -a / set +a) gives launchd runs the same env surface as 'npm run dev'.

Caught while end-to-end testing the auto-merge schedule from PR #54 — the schedule was wired correctly but never fired because process.env.BRAIN_MERGE_AUTO_ENABLED was undefined at runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ci-bot <ci@local>
…#58)

* chore: prettier --write on src/ to clear pre-existing format drift
  Required to unblock CI's format:check gate. Whitespace-only.
* fix(index): suppress 'Email intelligence trigger failed' alert on transient upstream errors
  Anthropic API socket drops (UND_ERR_SOCKET), 529 overloaded_error, ECONNRESET / ENETUNREACH, and 502/503 gateway errors were surfacing as chat-facing alerts even though the next debounced email batch reliably recovered within ~1 minute. Empirically every observed alert was followed by a successful retry, so the alert was pure noise.
  Adds isTransientAgentError() classifier; runAgent now returns { status, error? } so the error string survives to the email-trigger callsite, which suppresses the chat alert (logs warn instead) for transient errors. Real failures (timeouts, code-1 exits, parse errors, budget) still alert as before.
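A minimal sketch of what such a classifier could look like. The matched tokens (UND_ERR_SOCKET, overloaded_error, ECONNRESET, ENETUNREACH, 502/503) come from the commit message above; the plain substring matching and the `handleAgentResult` helper are assumptions for illustration, not the repo's actual code.

```typescript
// Error-string fragments treated as transient (list taken from the
// commit message; substring matching is an assumption).
const TRANSIENT_PATTERNS = [
  "UND_ERR_SOCKET", // undici socket drop
  "overloaded_error", // Anthropic 529
  "ECONNRESET",
  "ENETUNREACH",
  "502",
  "503",
];

export function isTransientAgentError(error: string): boolean {
  return TRANSIENT_PATTERNS.some((p) => error.includes(p));
}

// Hypothetical callsite shape: runAgent returns { status, error? };
// transient errors downgrade the chat alert to a log warning.
export function handleAgentResult(result: {
  status: "ok" | "error";
  error?: string;
}): "alert" | "warn" | "none" {
  if (result.status === "ok") return "none";
  return isTransientAgentError(result.error ?? "") ? "warn" : "alert";
}
```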
… taps (#59)

* fix(triage): live-refresh archive dashboard + visible toast on button taps
  Pinned "Archive queue — N pending" drifted because the gmail-reconciler, junk-reaper, and per-card archive paths resolved items without re-rendering the dashboard. After enough drift, "Archive all 53" matched 0 rows in DB and silently no-op'd; per-card Archive/Dismiss/Snooze gave no UI feedback either, so the whole feature read as broken.
  Refresh the pinned dashboard from every resolution path that touches the archive queue, return a `{toast}` from the callback router so Telegram can surface visible feedback via answerCallbackQuery, and detect the empty-queue archive_all case explicitly with an explanatory toast.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(triage): self-heal pinned dashboard when its id drifts from Telegram
  A second failure mode of the same bug: the cached pinned_msg_id in triage_dashboards can drift from Telegram's actual current pin (DB migration, manual unpin, out-of-band repin from an older build). Edits to a no-longer-pinned message succeed silently — Telegram allows arbitrary edits on past messages, but the chat header keeps showing whichever message is the active pin, so the user sees stale content even though the bot reports success.
  Verify the cached id matches Telegram's getChat.pinned_message before editing. On drift, drop the tracking row and post a fresh dashboard that gets pinned and tracked. One extra getChat call per upsert (~100ms) is acceptable on the dashboard refresh path.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(triage): collapse card text on archive/dismiss/snooze success
  Match the existing confirm_archive "✅ Archived" pattern for the per-card triage callbacks: replace the card text with a one-liner status and clear buttons in the same edit. Keeps an audit trail (timestamp + status) but visually marks the card as done so the chat doesn't fill with handled-but-still-actionable-looking cards. Failure paths (gmail_failed) keep the original card text intact so the user has retry context — only the buttons are cleared.
  Status text mirrors the toast strings:
    archive → 🗃 Archived
    dismiss → ✓ Dismissed
    snooze 1h → ⏰ Snoozed 1h
    snooze tom → ⏰ Snoozed until tomorrow 8am
    override→archive → 🗃 Moved to archive queue
    override→attention → 📥 Moved to attention
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* style: prettier formatting on callback-router fixes
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
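The `{toast}` return from the callback router described above can be sketched as follows. Only the `{toast}` shape, the empty-queue archive_all case, and the answerCallbackQuery delivery path come from the commit message; the function names and the toast wording for the empty-queue case are hypothetical.

```typescript
type CallbackResult = { toast?: string };

// Hypothetical router: each callback handler returns an optional toast
// string that the channel layer can surface to the user.
function routeCallback(action: string, pendingCount: number): CallbackResult {
  if (action === "archive_all" && pendingCount === 0) {
    // Empty-queue case detected explicitly with an explanatory toast.
    return { toast: "Nothing to archive: queue is already empty" };
  }
  if (action === "archive") return { toast: "🗃 Archived" };
  return {}; // no toast for this action
}

// Telegram layer builds the answerCallbackQuery payload from the result.
function buildAnswerCallbackQuery(
  queryId: string,
  result: CallbackResult,
): { callback_query_id: string; text?: string } {
  return {
    callback_query_id: queryId,
    ...(result.toast ? { text: result.toast } : {}),
  };
}
```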
… items (#61)

* fix(triage): suppress reminders + dashboard noise for already-handled items
  Two related noise sources:
  1. The attention reminder sweep trusted the local `tracked_items.state` column. If the user archived (or replied to) a thread directly in Gmail, the row stayed `pushed` until the gmail-reconciler converged (2-4 min minimum, longer if the reconciler was hung), and a "Still waiting on you" reminder could fire on an already-handled email.
     Now the sweep does a synchronous `getThreadInboxStatus` check per gmail-sourced candidate before sending. If the thread is out of INBOX (or the user replied in-thread) the row is resolved in place with `gmail:external` / `gmail:user-replied` and no reminder fires. Gmail timeouts/errors fall through to send the reminder — suppressing a real reminder due to a transient outage would be the worse failure.
  2. `renderArchiveDashboard` posted + pinned a fresh "Archive queue — 0 pending" message whenever no pinned dashboard existed (clean install, after a state reset, recovery from a stale pin). Both `sendMessage` and `pinChatMessage` fire Telegram notifications, so the user got pinged for "nothing to archive" — pure noise.
     Now mirrors the attention-dashboard guard: skips the create path when total=0 and no pinned dashboard exists yet. Existing dashboards still get edited silently so the count visibly drops to 0.
  Tests cover: precheck suppression on `out` status, fallthrough on timeout, no-op create at total=0, silent edit-down-to-0.
* style: prettier --write triage-reminder.test.ts
* fix(ci): unblock CI — install Playwright Chromium + fix dated digest test
  Two pre-existing CI failures, blocking every PR:
  1. `src/brain/__tests__/weekly-digest.test.ts:415` reset the daily-digest debounce using `Date.now() - 23h` (real wall clock) but compared it against a fixed `tueMorning` of 2026-04-28 10:00 inside `nowFn`. Once real time drifted past 2026-04-28, the "23h ago" anchor landed AFTER the simulated `tueMorning`, so `now.getTime() - lastMs` went negative, the debounce check tripped, and the third delivery was suppressed. Time-bomb test. Fixed by anchoring the reset to `tueMorning` itself so the assertion is wall-clock-independent.
  2. `src/__tests__/signer-integration.test.ts` and `src/signer/__tests__/docusign-executor.test.ts` call `chromium.launch()` from `playwright-core`, which ships without browser binaries. CI had no install step, so Chromium was missing on every run ("Executable doesn't exist at .../chromium_headless_shell"). Added an `npx playwright-core install --with-deps chromium` step before tests. Uses the `playwright-core` bin since the repo doesn't depend on the full `playwright` meta-package.
  Local: 267/267 test files, 2580/2580 tests passing.
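The time-bomb fix in (1) amounts to deriving the "23h ago" anchor from the simulated clock rather than the real one. A few lines of TypeScript illustrate the difference; only `tueMorning` and the 23h offset come from the commit message, the variable names and the UTC reading of the timestamp are assumptions:

```typescript
const HOUR = 60 * 60 * 1000;

// Simulated "now" used by the test via nowFn (timezone assumed UTC).
const tueMorning = new Date("2026-04-28T10:00:00Z").getTime();

// Broken anchor: real wall clock. Correct only while Date.now() is
// before tueMorning; after that date the "23h ago" point lands AFTER
// the simulated now, elapsed goes negative, and the debounce trips.
const brokenLastMs = Date.now() - 23 * HOUR;
const brokenElapsed = tueMorning - brokenLastMs; // sign depends on today's date

// Fixed anchor: derive "23h ago" from the simulated clock itself.
const fixedLastMs = tueMorning - 23 * HOUR;
const fixedElapsed = tueMorning - fixedLastMs; // always exactly 23h
```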
Pre-commit was running `npm run format:fix` which calls
`prettier --write "src/**/*.ts"` on the entire src tree on every
commit — slow on a large repo and writes to files the user didn't
touch.
`lint-staged` was already in devDependencies but had no config and
wasn't invoked. This adds:
"lint-staged": {
"src/**/*.ts": ["prettier --write", "eslint --fix"]
}
…and switches `.husky/pre-commit` from `npm run format:fix` to
`npx lint-staged` so we only format/lint the staged subset and pick
up the eslint --fix step automatically.
Net effect: faster commits, narrower writes, no behavior loss.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@topcoder1 good contribution. please reopen a clean pr
Summary
Pre-commit was running `npm run format:fix` which calls `prettier --write "src/**/*.ts"` on the entire `src` tree on every commit — slow on a large repo and writes to files the user didn't touch.
`lint-staged` was already in `devDependencies` but had no config and wasn't invoked. This wires it up.
What changed
```diff
 // .husky/pre-commit
-npm run format:fix
+npx lint-staged

 // package.json
+"lint-staged": {
+  "src/**/*.ts": ["prettier --write", "eslint --fix"]
+}
```
Self-validation: when committing this PR, the new hook fired and correctly reported `→ lint-staged could not find any staged files matching configured tasks` (this commit only changes config + lockfile, no `.ts`).
Net effect
Faster commits, narrower writes, no behavior loss.
Test plan
🤖 Generated with Claude Code