feat(replay): client-owned session_id + per-tab window_id + idle handling#350

Open
ayushjhanwar-png wants to merge 2 commits into main from feat/session-replay-window-id

Conversation

@ayushjhanwar-png ayushjhanwar-png commented Jul 3, 2026

What & why

Fixes the session-replay reliability issues found on prod Frameo data. On prod, ~44% of eligible sessions showed no replay and ~46% of replay-bearing sessions had chunks spanning far beyond the session (stale-closure / multi-tab collisions). Root cause: the server owned session_id and the recorder captured it once, so events and chunks diverged; multiple tabs collided on chunk_index; and background DOM churn kept sessions alive for hours.

This ports PostHog's proven model. Fully backward compatible — old SDKs that don't send session_id/window_id fall through to the exact existing server behaviour (verified by diff: the new worker path only activates when a client session_id is present).

Changes

SDK (@openpanel/web)

  • SessionIdManager — client generates + owns session_id in localStorage, rotates on 30-min idle / 24-hr cap.
  • window_id per tab/page-load (in-memory) — concurrent recorders never collide on chunk_index; rotates with the session.
  • Recorder starts synchronously (removes the 10s poll).
  • PostHog-style idle handling — only real user interactions count as activity (DOM mutations don't). After 5 min idle the recorder drops events and resumes on interaction with a fresh FullSnapshot. Kills multi-hour ghost recordings.
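
The rotation rule for the client-owned session_id can be sketched as below. This is a minimal illustration, not the PR's SessionIdManager: the class name `SessionIdManagerSketch`, the `KeyValueStore` interface, the storage key, and the injected `newId` generator are all assumptions made so the example runs outside a browser. Only the 30-minute idle and 24-hour cap values come from the description above.

```typescript
// Hypothetical sketch: names and storage shape are assumptions, not the PR's code.
const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30-min idle rotation (from the description)
const MAX_SESSION_MS = 24 * 60 * 60 * 1000; // 24-hr cap (from the description)
const STORAGE_KEY = "op_session_sketch"; // assumed key name

// Minimal subset of the Storage API, so localStorage or a test double can be passed in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface SessionState {
  sessionId: string;
  startedAt: number;
  lastActivityAt: number;
}

class SessionIdManagerSketch {
  private readonly store: KeyValueStore;
  private readonly newId: () => string;

  constructor(store: KeyValueStore, newId: () => string) {
    this.store = store;
    this.newId = newId;
  }

  // Returns the current session id and treats the call itself as activity.
  // A new id is minted when there is no stored session, the idle gap reached
  // 30 minutes, or the session reached the 24-hour cap.
  getSessionId(nowMs: number): string {
    const state = this.read();
    let next: SessionState;
    if (
      state !== null &&
      nowMs - state.lastActivityAt < IDLE_TIMEOUT_MS &&
      nowMs - state.startedAt < MAX_SESSION_MS
    ) {
      next = { sessionId: state.sessionId, startedAt: state.startedAt, lastActivityAt: nowMs };
    } else {
      next = { sessionId: this.newId(), startedAt: nowMs, lastActivityAt: nowMs };
    }
    this.store.setItem(STORAGE_KEY, JSON.stringify(next));
    return next.sessionId;
  }

  private read(): SessionState | null {
    const raw = this.store.getItem(STORAGE_KEY);
    if (raw === null) return null;
    try {
      return JSON.parse(raw) as SessionState;
    } catch {
      return null; // corrupt value: behave as if no session exists
    }
  }
}
```

Passing the clock and the id generator in as arguments is a choice made for this sketch so the rotation rule can be exercised deterministically; a real implementation would likely read the time and generate ids itself.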

Server

  • api: trust client session_id; thread window_id into replay chunks.
  • worker: use client session_id when present; a rotated id starts a new session. Legacy path unchanged.

DB

  • session_replay_chunks: add window_id column (migration 20).
  • getSessionWindows + windowId filter on all chunk-fetch queries.
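
To illustrate why a window_id column removes the chunk_index collisions, here is a small in-memory sketch of window listing and window-scoped chunk fetching. The `ReplayChunk` shape and the helper names `listWindows` and `chunksForWindow` are hypothetical; the real queries run against the session_replay_chunks table and are not reproduced here.

```typescript
// Hypothetical in-memory model of replay chunk rows; field names mirror the
// columns mentioned in the PR, everything else is an assumption.
interface ReplayChunk {
  session_id: string;
  window_id: string;
  chunk_index: number;
  payload: string;
}

// Distinct window ids for a session, in first-seen order.
function listWindows(chunks: ReplayChunk[], sessionId: string): string[] {
  const seen: string[] = [];
  chunks.forEach((chunk) => {
    if (chunk.session_id === sessionId && seen.indexOf(chunk.window_id) === -1) {
      seen.push(chunk.window_id);
    }
  });
  return seen;
}

// Chunks for one recorder only, ordered by chunk_index. Two tabs can both
// have chunk_index 0 without being mixed, because window_id is part of the filter.
function chunksForWindow(
  chunks: ReplayChunk[],
  sessionId: string,
  windowId: string,
): ReplayChunk[] {
  return chunks
    .filter((c) => c.session_id === sessionId && c.window_id === windowId)
    .sort((a, b) => a.chunk_index - b.chunk_index);
}
```

With two tabs in one session, each tab's chunks can be fetched and played in isolation, which is what the per-tab selector in the dashboard relies on.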

Dashboard

  • Per-tab window selector on the session detail page (plays one recorder at a time — no mixed rrweb mirror states).
  • Skip-idle toggle in the player (default on).

Testing (local worker/API against dev backend)

  • session_id generation + idle rotation ✓
  • window_id per tab, zero chunk_index collisions across concurrent tabs ✓
  • server creates session rows under the client session_id ✓
  • idle-stop verified (recorder goes quiet shortly after the idle threshold, resumes on interaction) ✓
  • window selector + skip-idle in the dashboard ✓

Deploy plan

Merge → dev first. Before prod: run analytics regression (session duration/bounce after the worker change), verify the CH migration via the framework, then ship the SDK to @dashverse/openpanel-web and roll out on Frameo behind a flag.

Deferred (follow-ups, tracked)

  • Top-level "Session Replays" list page (built then removed — needs UI polish).
  • Funnel → "View Replays" drill-down (Mixpanel-style; ~1 day, most infra exists).
  • Ghost-window filter in the selector.

Summary by CodeRabbit

  • New Features

    • Session replays are now available per browser window/tab, with a window selector when multiple recorders exist.
    • Added a “Skip idle” toggle to jump over inactive periods during playback.
    • Improved replay/tracking correlation as sessions rotate, including per-window identifiers.
    • Added documentation for how session replay rendering and chunk fetching work.
  • Bug Fixes

    • Reduced mixed-up replay chunks from concurrent recorders/tabs.
    • Improved stability for long idle stretches and canvas-heavy recordings.
    • Updated session handling so client-provided session IDs are respected.

…ling

Fixes the session-replay reliability issues found on prod Frameo data,
matching PostHog's proven model. Fully backward compatible — old SDKs that
don't send session_id/window_id fall through to the exact existing behaviour.

SDK (@openpanel/web):
- SessionIdManager: client generates + owns session_id in localStorage,
  rotates on 30-min idle / 24-hr cap (no more server/client divergence).
- window_id per tab / page-load (in-memory): distinguishes concurrent
  recorders so multi-tab sessions never collide on chunk_index. Rotates
  with the session (PostHog parity).
- Recorder starts synchronously (no 10s poll).
- PostHog-style idle handling: only real user interactions (mouse/click/
  scroll/input) count as activity — DOM mutations do NOT. After 5 min idle
  the recorder drops events instead of recording; resumes with a fresh
  FullSnapshot on the next interaction. Kills multi-hour "ghost" recordings.
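
The idle rule described above can be sketched as a small gate in front of the chunk buffer. This is an illustration under assumptions, not the recorder's code: `IdleGateSketch`, `ReplayEvent`, and the injected `takeFullSnapshot` callback are hypothetical names, and the numeric values are rrweb's IncrementalSnapshot event type (3) and IncrementalSource values for MouseMove (1), MouseInteraction (2), Scroll (3), and Input (5) as I understand rrweb's enums.

```typescript
// Hypothetical sketch of interaction-only activity tracking.
interface ReplayEvent {
  type: number;
  timestamp: number;
  data?: { source?: number };
}

const INCREMENTAL_SNAPSHOT = 3; // rrweb EventType.IncrementalSnapshot (assumed value)
// MouseMove, MouseInteraction, Scroll, Input. Mutation (0) is deliberately absent.
const ACTIVE_SOURCES = new Set<number>([1, 2, 3, 5]);
const IDLE_THRESHOLD_MS = 5 * 60 * 1000; // 5 min, from the commit message

class IdleGateSketch {
  private lastActivityMs: number;
  private idle = false;
  private readonly takeFullSnapshot: () => void;

  constructor(startMs: number, takeFullSnapshot: () => void) {
    this.lastActivityMs = startMs;
    this.takeFullSnapshot = takeFullSnapshot;
  }

  // Returns true when the event should be recorded, false when it is dropped.
  accept(event: ReplayEvent): boolean {
    const source = event.data === undefined ? undefined : event.data.source;
    const interactive =
      event.type === INCREMENTAL_SNAPSHOT && source !== undefined && ACTIVE_SOURCES.has(source);

    if (interactive) {
      this.lastActivityMs = event.timestamp;
      if (this.idle) {
        // Leaving idle: request a fresh FullSnapshot so playback has a valid base.
        this.idle = false;
        this.takeFullSnapshot();
      }
      return true;
    }

    // Non-interactive events (for example DOM mutations) never reset the idle clock.
    if (!this.idle && event.timestamp - this.lastActivityMs >= IDLE_THRESHOLD_MS) {
      this.idle = true;
    }
    return !this.idle;
  }
}
```

Because only interactive events move `lastActivityMs`, a page with constant background DOM churn still goes idle after five minutes without input, which is the property that stops the multi-hour ghost recordings.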

Server:
- api: trust client session_id + thread window_id into replay chunks.
- worker: use client session_id when present; treat a rotated id as a new
  session. Legacy path (no client id) unchanged.

DB:
- session_replay_chunks: add window_id column (migration 20).
- getSessionWindows + windowId filter on all chunk-fetch queries.

Dashboard:
- Per-tab window selector on the session detail page (plays one recorder at
  a time — no mixed rrweb mirror states / "Node not found").
- Skip-idle toggle in the player (default on).

Docs: session-replay-rendering.md — how a recording renders in the player.

coderabbitai Bot commented Jul 3, 2026

Walkthrough

This PR adds window-scoped session replay storage and retrieval, client-owned session ID rotation, idle-aware replay recording and playback, multi-window replay selection in the UI, worker session handling changes, and updated replay documentation.

Changes

Session replay window scoping and idle-skip feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Database schema and replay queries | packages/db/code-migrations/20-add-window-id-to-replay-chunks.ts, packages/db/src/buffers/replay-buffer.ts, packages/db/src/services/session.service.ts | Adds window_id to replay chunks, extends replay chunk types, filters replay queries by windowId, and adds window listing output. |
| Session router window endpoints | packages/trpc/src/routers/session.ts | Adds windowId inputs to replay chunk procedures and exposes replayWindows. |
| Replay chunk ingestion and session job routing | apps/api/src/controllers/track.controller.ts, apps/worker/src/jobs/events.incoming-event.ts, apps/worker/src/utils/session-handler.ts | Stores window_id on replay chunks, trusts client session_id for session job routing, and deduplicates delayed session-end jobs. |
| Storage and session ID management | packages/sdks/web/src/storage.ts, packages/sdks/web/src/session-id-manager.ts | Adds safe storage wrappers and a client-side session ID manager with rotation and callbacks. |
| Idle-aware replay recording | packages/sdks/web/src/replay/recorder.ts | Tracks interactive rrweb activity, stops chunk emission during idle periods, and resumes with a forced full snapshot. |
| SDK payload contracts and web replay wiring | packages/sdks/sdk/src/index.ts, packages/sdks/web/src/index.ts | Adds window_id/session_id payload fields and wires session rotation, replay tagging, and recorder restarts into the web SDK. |
| Replay player and multi-window session replay UI | apps/start/src/components/sessions/replay/index.tsx, apps/start/src/components/sessions/replay/replay-player.tsx | Adds skipInactive playback support, window-scoped chunk loading, replay window tabs, and the skip-idle toggle. |
| Session replay rendering documentation | docs/session-replay-rendering.md | Adds documentation for the replay format, render pipeline, window-id behavior, and UI surface. |
Estimated code review effort: 4 (Complex) ~75 minutes
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main changes: client-owned session_id, per-tab window_id, and replay idle handling. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/worker/src/jobs/events.incoming-event.ts (1)

180-213: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Replace the existing delayed session-end job when session_id changes. createSessionEndJob dedupes on projectId + deviceId, so the stale job keeps the old sessionId payload. After a client-side rotation, later events keep matching that old job, which leads to duplicate session_start events and no end job for the new session. Update/remove the pending job before enqueueing the new one, or reuse it with the new payload.
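
A minimal sketch of the replace-before-enqueue idea, using an in-memory stand-in for the delayed queue. Everything here is hypothetical: `DelayedJobsSketch`, `scheduleSessionEnd`, and the job id format are assumptions for illustration, and the stand-in only mimics the behaviour described above, where adding a job under an existing id is a no-op. This sketch replaces the pending job only when the session id differs, which is one possible reading of the suggestion.

```typescript
// Hypothetical in-memory stand-in for a delayed job queue keyed by job id.
interface SessionEndJob {
  jobId: string;
  sessionId: string;
  runAtMs: number;
}

class DelayedJobsSketch {
  private readonly jobs = new Map<string, SessionEndJob>();

  // Adding under an id that already exists is a no-op (returns false).
  add(job: SessionEndJob): boolean {
    if (this.jobs.has(job.jobId)) return false;
    this.jobs.set(job.jobId, job);
    return true;
  }

  get(jobId: string): SessionEndJob | undefined {
    return this.jobs.get(jobId);
  }

  remove(jobId: string): void {
    this.jobs.delete(jobId);
  }
}

// Keeps at most one pending session-end job per project + device, and makes
// sure that job carries the current session id after a client-side rotation.
function scheduleSessionEnd(
  queue: DelayedJobsSketch,
  projectId: string,
  deviceId: string,
  sessionId: string,
  runAtMs: number,
): void {
  const jobId = "sessionEnd:" + projectId + ":" + deviceId; // assumed id format
  const existing = queue.get(jobId);
  if (existing !== undefined && existing.sessionId !== sessionId) {
    // Stale payload from the previous session: drop it so add() is not a no-op.
    queue.remove(jobId);
  }
  queue.add({ jobId, sessionId, runAtMs });
}
```

Without the removal step, the second call would leave the old session id in the pending job, which is the stale-payload situation the comment describes.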

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/worker/src/jobs/events.incoming-event.ts` around lines 180 - 213, The
delayed session-end job handling in incoming-event processing is leaving a stale
job behind when `session_id` changes, so later events keep matching the old
payload. Update the `createSessionEndJob` flow in `events.incoming-event.ts` to
remove or replace any existing pending job before enqueueing the new one, and
make sure the new job uses the current `payload`/`sessionId` from the
`isNewSession` path so `createSessionStart` and `createSessionEndJob` stay in
sync.
🧹 Nitpick comments (3)
apps/start/src/components/sessions/replay/index.tsx (1)

324-342: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add aria-pressed to the Skip idle toggle.

It's a two-state toggle button; exposing aria-pressed={skipInactive} lets assistive tech announce the on/off state (the visual styling alone doesn't convey it).

♻️ Proposed change
                   <button
                     type="button"
+                    aria-pressed={skipInactive}
                     onClick={() => setSkipInactive((v) => !v)}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/start/src/components/sessions/replay/index.tsx` around lines 324 - 342,
The Skip idle control in the replay session component is a two-state toggle but
does not expose its pressed state to assistive technologies. Update the button
in the replay UI to include aria-pressed driven by skipInactive so screen
readers can announce the on/off state; use the existing toggle handler and
label/title in the same button component to locate it.
packages/sdks/web/src/index.ts (1)

219-263: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Extract the duplicated recorder-start block.

The startReplayRecorder(opts, (chunk) => this.send({ type: 'replay', payload: { ...chunk, session_id: activeSessionId } }), bumpActivity) invocation is duplicated verbatim for initial start and rotation restart. Extracting a small local helper (closing over bumpActivity) keeps the two paths from drifting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdks/web/src/index.ts` around lines 219 - 263, The recorder start
logic is duplicated in the initial setup and the session-rotation restart inside
the sessionManager.onSessionIdChanged handler. Extract the shared
startReplayRecorder call into a small local helper near activeSessionId and
bumpActivity that closes over the send payload construction, then use that
helper in both places to keep the behavior in sync and reduce drift.
packages/sdks/web/src/replay/recorder.ts (1)

147-151: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

onUserActivity fires on every interactive emit, including high-frequency MouseMove/Scroll.

ACTIVE_SOURCES includes MouseMove (1) and Scroll (3), which emit many times per second. The downstream onUserActivity (bumpActivity) performs a synchronous localStorage.setItem on each call, so this can turn scroll/move into a localStorage-write hot path. Consider throttling activity bumps (e.g., at most once per second) since idle thresholds are in minutes.
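
One way to sketch the suggested throttle: update the in-memory timestamp on every interactive event, but forward to the (potentially expensive) callback at most once per interval. `createActivityTracker` and its parameter names are hypothetical, and the clock is passed in explicitly so the behaviour is easy to verify.

```typescript
// Hypothetical sketch: a leading-edge throttle around an activity callback.
function createActivityTracker(
  onUserActivity: (nowMs: number) => void,
  minIntervalMs: number = 1000,
) {
  let lastActivityMs = 0;
  let lastNotifiedMs = Number.NEGATIVE_INFINITY;

  return {
    // Call on every interactive event, including high-frequency ones.
    record(nowMs: number): void {
      lastActivityMs = nowMs; // always cheap, always updated
      if (nowMs - lastNotifiedMs >= minIntervalMs) {
        lastNotifiedMs = nowMs;
        onUserActivity(nowMs); // e.g. the localStorage write, now rate-limited
      }
    },
    getLastActivityMs(): number {
      return lastActivityMs;
    },
  };
}
```

Since the idle thresholds are measured in minutes, losing sub-second precision on the persisted timestamp should not change when a session is considered idle.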

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdks/web/src/replay/recorder.ts` around lines 147 - 151,
`onUserActivity` in `Recorder` is firing on every interactive event, including
high-frequency MouseMove and Scroll events, which can trigger excessive
synchronous `localStorage` writes via `bumpActivity`. Update the
`record`/interactive-event path to throttle activity notifications so
`onUserActivity` is called at most once per second (or similar), while still
updating `lastActivityMs` for all interactive events; use the existing
`isInteractiveEvent` and `ACTIVE_SOURCES` flow in `recorder.ts` to place the
guard cleanly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/sdks/web/src/index.ts`:
- Around line 235-268: The replay stop flow in the web SDK still leaves the
session-rotation listener active, so `stopReplay()` does not fully stop
recording. Capture the unsubscribe returned by
`this.sessionManager.onSessionIdChanged(...)` in the same class that defines
`startReplayRecorder()`/`stopReplay()`, and make `stopReplay()` invoke that
cleanup before or alongside `stopReplayRecorder()`. Ensure the callback
registered in `onSessionIdChanged` cannot fire after a manual stop, so it does
not call `startReplayRecorder()` again on the next session rotation.
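
The unsubscribe pattern this comment asks for can be sketched as follows. The classes `SessionEventsSketch` and `ReplayControllerSketch` are hypothetical stand-ins for the session manager and the web SDK class; the only behaviour taken from the comment is that `onSessionIdChanged` returns an unsubscribe function and that `stopReplay()` must call it.

```typescript
// Hypothetical sketch: the listener registration returns its own cleanup.
type SessionListener = (sessionId: string) => void;

class SessionEventsSketch {
  private readonly listeners = new Set<SessionListener>();

  onSessionIdChanged(listener: SessionListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(sessionId: string): void {
    this.listeners.forEach((listener) => listener(sessionId));
  }
}

class ReplayControllerSketch {
  // Records which session ids the recorder was (re)started for.
  readonly starts: string[] = [];
  private unsubscribe: (() => void) | null = null;
  private readonly sessions: SessionEventsSketch;

  constructor(sessions: SessionEventsSketch) {
    this.sessions = sessions;
  }

  startReplay(initialSessionId: string): void {
    this.stopReplay(); // avoid stacking listeners on repeated starts
    this.starts.push(initialSessionId);
    this.unsubscribe = this.sessions.onSessionIdChanged((id) => {
      this.starts.push(id); // stands in for restarting the recorder
    });
  }

  stopReplay(): void {
    if (this.unsubscribe !== null) {
      this.unsubscribe(); // detach first so a later rotation cannot restart
      this.unsubscribe = null;
    }
  }
}
```

After `stopReplay()`, a later rotation emitted by the session manager no longer reaches the controller, so the recorder stays stopped until `startReplay` is called again.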

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c05bd9a-725d-453d-9d75-10a9d5084785

📥 Commits

Reviewing files that changed from the base of the PR and between ab32860 and cb291eb.

📒 Files selected for processing (14)
  • apps/api/src/controllers/track.controller.ts
  • apps/start/src/components/sessions/replay/index.tsx
  • apps/start/src/components/sessions/replay/replay-player.tsx
  • apps/worker/src/jobs/events.incoming-event.ts
  • docs/session-replay-rendering.md
  • packages/db/code-migrations/20-add-window-id-to-replay-chunks.ts
  • packages/db/src/buffers/replay-buffer.ts
  • packages/db/src/services/session.service.ts
  • packages/sdks/sdk/src/index.ts
  • packages/sdks/web/src/index.ts
  • packages/sdks/web/src/replay/recorder.ts
  • packages/sdks/web/src/session-id-manager.ts
  • packages/sdks/web/src/storage.ts
  • packages/trpc/src/routers/session.ts

Comment thread packages/sdks/web/src/index.ts Outdated
… throttle, a11y

- web SDK: stopReplay() now detaches the session-rotation listener so a later
  rotation can't restart the recorder after a manual stop. Extract a shared
  startForSession() helper (was duplicated between initial start + restart).
- worker: createSessionEndJob replaces any existing delayed job for the device
  before enqueueing. Previously a client session_id rotation left the stale
  (deviceId-keyed) job with the old payload, so every later event saw
  isNewSession=true and spawned duplicate session_start events.
- recorder: throttle onUserActivity to <=1/sec — MouseMove/Scroll fired a
  localStorage write on every event. lastActivityMs still updates on all.
- dashboard: aria-pressed on the Skip-idle toggle for screen readers.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
apps/worker/src/utils/session-handler.ts (1)

39-45: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Log the swallowed remove() failure.
When remove() throws on a locked job, the fallback add() is a no-op and the stale session payload stays in the queue. A warn/debug log here would make that failure mode observable without changing the non-fatal behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/worker/src/utils/session-handler.ts` around lines 39 - 45, The session
cleanup path in session-handler’s existing job removal swallows failures from
existing.remove(), which hides locked-job cleanup problems. Update the
getJob/remove flow to catch the remove error and emit a warn/debug log with the
jobId and error details before continuing with the fallback add; keep the
non-fatal behavior intact in the same existing/session queue logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 41cdef73-dcd3-4cbf-aa63-05b6f1dd6200

📥 Commits

Reviewing files that changed from the base of the PR and between cb291eb and 4388b87.

📒 Files selected for processing (4)
  • apps/start/src/components/sessions/replay/index.tsx
  • apps/worker/src/utils/session-handler.ts
  • packages/sdks/web/src/index.ts
  • packages/sdks/web/src/replay/recorder.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • packages/sdks/web/src/replay/recorder.ts
  • packages/sdks/web/src/index.ts
  • apps/start/src/components/sessions/replay/index.tsx
