Skip to content

incident response: interactive sync + notification suppression#1

Open
mickenordin wants to merge 17 commits intomainfrom
incident-response/interactive-sync
Open

incident response: interactive sync + notification suppression#1
mickenordin wants to merge 17 commits intomainfrom
incident-response/interactive-sync

Conversation

@mickenordin
Copy link
Copy Markdown
Member

Summary

Response to the 2026-04-16 calendar mass-delete incident, where a silent JMAP adapter bug cascaded into hundreds of M365 event deletes with attendee cancellation emails.

  • Interactive default: both groupware-sync-calendar sync and groupware-sync-contacts sync now print the plan and prompt [y/N] before writing. --non-interactive preserves auto-execute for cron; non-TTY without that flag fails fast with exit 2. --dry-run unchanged and wins silently over --non-interactive.
  • Notification suppression: every adapter write attempts to suppress attendee notifications — JMAP adds sendSchedulingMessages: false, Graph adds Prefer: outlook.send-notifications=false to POST/PATCH/DELETE. Each adapter declares a notification_policy (SUPPRESSED / BEST_EFFORT / UNSUPPORTED per op type); the plan output annotates risky ops with [!] notifications best-effort/unsupported.
  • JMAP notCreated fix: the incident's root cause — adapter now raises ValueError instead of silently KeyError-ing when Stalwart rejects a create.

Spec: ~/.claude/projects/-home-micke-sources-groupware-sync/specs/2026-04-17-interactive-sync-mode-design.md
Plan: ~/.claude/projects/-home-micke-sources-groupware-sync/plans/2026-04-17-interactive-sync-mode-plan.md

Structure

15 commits, landed in 11 plan tasks (TDD throughout):

  • c4a1ab3 fix(jmap): check notCreated in CalendarEvent/set response
  • 4a90e3c feat(provider): declare NotificationCapability and NotificationPolicy
  • 51ed3cf feat(models): add aborted flag to SyncSummary
  • 185e5f0 feat(jmap-calendar): suppress scheduling messages, declare policy
  • 2b0a833 feat(graph-calendar): suppress attendee notifications on writes
  • 2ab8cfa feat(caldav): declare UNSUPPORTED notification policy
  • 9bec403 feat(contacts): declare SUPPRESSED notification policy
  • 4803b13 refactor(engine): extract _format_plan with suppression annotations
  • 019073f feat(engine): add confirm callback for interactive sync
  • ae2c340 feat(calendar-cli): interactive default, --non-interactive opt-in
  • 36e422d feat(contacts-cli): interactive default, --non-interactive opt-in
  • Plus four small cleanup/review-fix commits (6f8620c, 68c131b, 11af0a6, 6045b67).

Test plan

  • python3 -m pytest tests --ignore=tests/e2e — expect ~97 passed (or ~90 if vobject is missing; two files require it for collection)
  • Run groupware-sync-calendar sync --dry-run against staging — plan output uses PLAN ... format with [!] annotations where BEST_EFFORT
  • Run echo "" | groupware-sync-calendar sync — exit 2, stderr says "interactive mode requires a TTY", no network calls
  • Run groupware-sync-calendar sync in a TTY with a test attendee on M365, answer n — verify no writes and no emails
  • Repeat, answer y — verify writes apply, still no attendee emails (regression test for 2026-04-16)
  • If suppression confirmed silent on a given op type, upgrade the adapter's notification_policy field from BEST_EFFORT to SUPPRESSED (manual verification step in the plan, Task 11)

Out of scope

  • Delete thresholds / create-error thresholds (interactive prompt is the safety gate)
  • JSON/structured plan output (text only for now)
  • CalDAV Schedule-Reply: F iTIP suppression (CalDAV declared UNSUPPORTED; adapter isn't wired to any CLI yet)
  • Recovery tooling for the 2026-04-16 incident

Stalwart returned notCreated for failed creates, but the adapter only
looked at 'created' and raised KeyError when the response was empty.
The engine caught the exception, logged an error, and continued into
phase 4 deletes — the root cause of the 2026-04-16 incident.

Now the adapter raises ValueError with the server's error type and
description, so _execute_creates records a real error instead of a
surprise KeyError.
Adapters will declare per-op-type suppression capability via a class
attribute. The sync engine reads it when formatting the plan for
interactive confirmation; the attribute has no default so missing
declarations fail loudly.
Interactive mode uses this to signal 'user declined to execute the
plan' — distinct from 'no changes to sync' (empty plan) and from
'execution errors' (errors > 0).
Consolidate imports into the module header. No behavior change;
silences ruff E402 and keeps the file tidy as more tests get
appended in later tasks.
Every CalendarEvent/set request now includes sendSchedulingMessages:
false per the JMAP Calendars spec, and the adapter declares a
BEST_EFFORT notification_policy. The capability is conservative until
verified against a live Stalwart — if the server honors the flag
silently, individual op types can be upgraded to SUPPRESSED.
POST/PATCH/DELETE now go through _write_request which attaches
Prefer: outlook.send-notifications=false. Read paths still use
_request so GETs stay header-free. Policy declared as BEST_EFFORT
until a live M365 tenant confirms silence end-to-end.
The CalDAV adapter has no iTIP suppression yet and is not wired into
any CLI. The explicit declaration makes the gap visible in plan
output if the adapter ever becomes active. Actual Schedule-Reply: F
wiring is a follow-up spec.
Contacts writes don't emit scheduling traffic, but declaring the
policy explicitly keeps the SyncProvider interface uniform and
prevents future silent regressions.
Plan output now uses the neutral 'PLAN ...' prefix and appends
'[!] notifications best-effort/unsupported' when the target adapter's
notification_policy can't guarantee silence. _log_dry_run becomes a
thin wrapper. Enables identical plan text in dry-run and interactive
modes.
Dead-weight local import removed; the module-level import already
references provider types. Also tightens cap_for's return
annotation (the dataclass field is non-Optional).
sync_trees now accepts an optional confirm(ops, plan_summary) -> bool
callback. When provided and the plan is non-empty, the engine prints
the plan + summary + NOTE line to stdout and calls confirm. On False
it records aborted=True and skips phase 4-5. dry_run still wins when
both are set. Empty plans short-circuit before the prompt.
test_confirm_true_runs_writes previously hit the empty-plan short-
circuit; now it builds a non-empty plan (matching containers, leaf
only on b) and asserts phase 4 write actually runs.

Drop conflicts=0 from the interactive SUMMARY line: conflicts are
detected during merge execution, not at plan time, so showing 0
in the pre-execute summary could mislead the user.
Default mode now prints the plan via sync_trees and prompts via
typer.confirm before executing. --non-interactive preserves the old
auto-execute behavior for cron. A non-TTY stdin without either flag
exits 2 with a clear message before any remote access.
Adds test_aborted_summary_exits_zero_with_message — verifies the
primary incident-mitigation path (user declines the prompt, engine
returns aborted=True, CLI prints 'aborted by user' and exits 0).

Also moves the make_session_factory monkeypatch to the cli module
namespace where the symbol is actually bound, and drops an unused
dataclass import.
Mirrors the calendar CLI changes: default mode prompts via
typer.confirm, --non-interactive executes immediately, --dry-run
wins silently over --non-interactive, non-TTY default exits 2.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens sync execution after the 2026-04-16 incident by making sync runs interactive-by-default (with explicit non-interactive opt-in) and by attempting to suppress attendee/notification side effects on all adapter writes, while also fixing a JMAP create failure mode.

Changes:

  • Adds interactive confirmation support to the sync engine and updates the calendar/contacts CLIs to require a TTY unless --non-interactive (with --dry-run taking precedence).
  • Introduces NotificationCapability/NotificationPolicy on providers and implements notification suppression in JMAP + Graph calendar writes (and declares policies for other adapters).
  • Fixes JMAP calendar create handling to surface notCreated failures as a clear exception instead of failing indirectly.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/groupware_sync/provider.py Adds NotificationCapability + NotificationPolicy and makes providers declare a notification policy.
src/groupware_sync/models.py Extends SyncSummary with an aborted flag.
src/groupware_sync/engine.py Adds plan formatting + confirmation callback path; annotates plans with notification suppression capability.
src/groupware_sync_calendar/cli.py Makes calendar sync interactive-by-default with TTY guard and --non-interactive; passes confirm callback to engine.
src/groupware_sync_contacts/cli.py Makes contacts sync interactive-by-default with TTY guard and --non-interactive; passes confirm callback to engine.
src/groupware_sync_calendar/adapters/jmap_adapter.py Adds sendSchedulingMessages: false, declares notification policy, and handles notCreated on create.
src/groupware_sync_calendar/adapters/graph_adapter.py Adds _write_request() to attach Prefer: outlook.send-notifications=false to write calls; declares policy.
src/groupware_sync_calendar/adapters/caldav_adapter.py Declares UNSUPPORTED notification policy for CalDAV calendar adapter.
src/groupware_sync_contacts/adapters/jmap_adapter.py Declares SUPPRESSED notification policy for JMAP contact adapter.
src/groupware_sync_contacts/adapters/graph_adapter.py Declares SUPPRESSED notification policy for Graph contact adapter.
src/groupware_sync_contacts/adapters/carddav_adapter.py Declares SUPPRESSED notification policy for CardDAV contact adapter.
tests/test_engine_interactive.py Adds unit tests for plan formatting annotations and confirm/abort behavior in sync_trees().
tests/test_jmap_calendar_suppression.py Verifies JMAP calendar writes include sendSchedulingMessages: false and policy is declared.
tests/test_graph_calendar_suppression.py Verifies Graph calendar writes include the suppression Prefer header and reads do not.
tests/test_cli_calendar.py Adds CLI behavior tests for TTY guard / flag precedence for calendar CLI.
tests/test_cli_contacts.py Adds CLI behavior tests for TTY guard / flag precedence for contacts CLI.
tests/test_contacts_notification_policy.py Verifies all contacts adapters declare SUPPRESSED.
tests/test_caldav_calendar_suppression.py Verifies CalDAV calendar adapter declares UNSUPPORTED.
tests/test_models.py Adds tests for new notification model enums/policy and SyncSummary.aborted.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/groupware_sync/engine.py Outdated
Comment on lines +105 to +106
if not operations:
log.info("Plan is empty — nothing to confirm, no writes to execute")
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interactive confirm branch currently prompts whenever operations is non-empty, but a common “no changes” run returns a SKIP_SUBTREE op from compare_trees(). That leads to a confirmation prompt even though there are no writes to execute, and users can “abort” a no-op plan (setting summary.aborted=True) which is confusing.

Consider skipping the confirm prompt unless the plan contains at least one write-producing op (e.g., CREATE_CONTAINER/CREATE_ITEM/MERGE_ITEM, DELETE_ITEM where target_side != "both", DELETE_CONTAINER). For skip-only / no-write plans, you can populate summary.skipped and return without prompting.

Suggested change
if not operations:
log.info("Plan is empty — nothing to confirm, no writes to execute")
def _is_write_operation(op: SyncOp) -> bool:
if op.op_type in (
OpType.CREATE_CONTAINER,
OpType.CREATE_ITEM,
OpType.MERGE_ITEM,
OpType.DELETE_CONTAINER,
):
return True
if op.op_type == OpType.DELETE_ITEM:
return op.target_side != "both"
return False
has_writes = any(_is_write_operation(op) for op in operations)
if not operations or not has_writes:
_populate_plan_summary(operations, summary)
log.info("Plan has no writes — nothing to confirm, no writes to execute")

Copilot uses AI. Check for mistakes.
Comment thread src/groupware_sync_contacts/cli.py Outdated
cfg.m365.auth_provider_name,
)
except ValueError as e:
except (ValueError, Exception) as e:
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except (ValueError, Exception) is equivalent to except Exception (since ValueError is already an Exception). It’s redundant and makes it less clear which error types are intentionally handled here; consider catching just Exception (or a narrower set of expected exceptions) for clarity.

Suggested change
except (ValueError, Exception) as e:
except Exception as e:

Copilot uses AI. Check for mistakes.
The lint/test/vuln-scan jobs ran 'uv run <tool>' without --extra dev,
so ruff/pytest/pip-audit weren't installed into the venv — all three
failed with 'Failed to spawn: <tool> — No such file or directory'.

The e2e job declared radicale under services:, which start BEFORE
actions/checkout runs — so the workspace-relative volume paths didn't
exist yet and Docker failed with 'not a directory'. Start radicale
explicitly after checkout via docker run, with a curl poll to wait
for readiness.
…atch

Copilot review on PR #1:

1. A plan consisting only of SKIP_SUBTREE or 'both-gone' DELETE_ITEM
   ops has no writes — prompting the user to confirm a no-op was
   confusing, and answering 'n' would spuriously set aborted=True.
   New _has_write_op() predicate short-circuits the confirm branch for
   these cases: populate the summary's skipped count and return cleanly.

2. 'except (ValueError, Exception)' in both CLIs is equivalent to
   'except Exception' since ValueError is already an Exception. Drop
   the redundant listing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants