Implement computed-airtable-fields (framework for pushing aggregate fields back into Airtable) #2614

Merged
Will-Howard merged 12 commits into master from wh-2582-computed-airtable-fields-2026-06
Jun 10, 2026

Conversation

Will-Howard commented Jun 5, 2026
Collaborator

Description

Creates a new library, computed-airtable-fields:

  • Computed fields are defined in libraries/computed-airtable-fields/src/definitions.ts, analogous to schema.ts
  • A new cron job, recomputeComputedAirtableFieldsCron, in pg-sync-service runs every 2 hours to recalculate the fields
  • I suggest looking at the tests (libraries/computed-airtable-fields/src/core.test.ts) to get an idea of the properties of this recalculation logic, e.g. only writing changed rows
  • I ran this locally and timed it: roughly 400 ms per row, or around 12 min for a full refresh of the 3 fields I have added here. Most of that time is spent in the individual db.update calls for changed rows
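
The "only writing changed rows" property mentioned above can be illustrated with a small sketch. This is not the code in core.ts: `diffComputed`, `FieldValue`, and the `Row` shape are hypothetical names chosen for the example, and the comparison assumes values are primitives or null.

```typescript
// Hypothetical value type; the real library's type may differ.
type FieldValue = number | string | null;

interface Row {
  id: string;
  current: FieldValue; // value currently stored for the computed field
}

// Return only the rows whose freshly computed value differs from the stored one.
function diffComputed(
  rows: Row[],
  computed: Record<string, FieldValue>,
): { id: string; value: FieldValue }[] {
  const changes: { id: string; value: FieldValue }[] = [];
  for (const row of rows) {
    if (!(row.id in computed)) continue; // no fresh value: leave the row alone
    const next = computed[row.id] ?? null;
    if (next !== row.current) changes.push({ id: row.id, value: next });
  }
  return changes;
}

const storedRows: Row[] = [
  { id: "a", current: 3 },    // unchanged, so no write
  { id: "b", current: null }, // null -> 2, so one write
  { id: "c", current: 4.5 },  // 4.5 -> null, so one write
];
const pendingWrites = diffComputed(storedRows, { a: 3, b: 2, c: null });
console.log(pendingWrites);
```

Since the timing above is dominated by per-row writes, a second run over unchanged data should produce an empty list and therefore no db.update calls.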

Issue

Fixes #2582

Developer checklist

coderabbitai Bot commented Jun 5, 2026
Contributor

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e0428dde-76c7-405f-b6a4-46d4e7b9c1e0

📥 Commits

Reviewing files that changed from the base of the PR and between 13222b2 and 12d65e4.

📒 Files selected for processing (2)
  • libraries/computed-airtable-fields/src/definitions.test.ts
  • libraries/computed-airtable-fields/src/definitions.ts
📝 Walkthrough

Walkthrough

This PR adds a new package @bluedot/computed-airtable-fields (types, implementation, tests, configs, and README) that defines computed Airtable fields and a recomputeValues routine which paginates rows, computes new values per-chunk, diffs and writes only changed values (with optional beforeWrite hooks). It registers concrete computed-field definitions for exercise response counts, resource completion counts, and resource average ratings. The pg-sync-service now depends on the package (excluded from tsup) and schedules recomputeComputedAirtableFieldsCron to run every two hours with a reentry guard and shared rate-limiting.
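
A reentry guard of the kind described in the walkthrough could look roughly like the sketch below. It is an illustration under assumptions, not the code in cron.ts: `withReentryGuard` and `demoReentryGuard` are hypothetical names, and the guard is a simple in-process boolean flag.

```typescript
// Wrap an async job so a tick that arrives while a run is in progress is skipped.
function withReentryGuard(job: () => Promise<void>): () => Promise<boolean> {
  let running = false;
  return async () => {
    if (running) return false; // previous run still in progress: skip this tick
    running = true;
    try {
      await job();
      return true;
    } finally {
      running = false; // release the guard even if the job throws
    }
  };
}

// Demo: a second tick arrives while the first run is still blocked.
async function demoReentryGuard(): Promise<{ second: boolean; first: boolean; runs: number }> {
  let runs = 0;
  let unblock: () => void = () => {};
  const guarded = withReentryGuard(async () => {
    runs += 1;
    await new Promise<void>((resolve) => { unblock = resolve; });
  });
  const firstRun = guarded();     // starts and blocks inside the job
  const second = await guarded(); // skipped because the first run is still active
  unblock();                      // let the first run finish
  const first = await firstRun;
  return { second, first, runs };
}
```

A flag like this only protects against overlap within one process. If the service ever ran as several instances, some shared lock would be needed instead.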

Possibly related PRs

  • bluedotimpact/bluedot#2605: Adds the underlying computed Airtable columns and enums that the new computed-field definitions depend on.
  • bluedotimpact/bluedot#1326: Also modifies pg-sync-service cron wiring to add scheduled work, intersecting with this PR's cron scheduling changes.
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description is incomplete. While it explains the changes and links to issue #2582, it lacks specific explanations and is missing several standard sections from the template. | Expand the description to explain the 'why' behind the changes, clarify any design decisions or potential concerns, and ensure all relevant checklist items are properly addressed. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately and concisely summarises the main change: implementing a new computed-airtable-fields framework for pushing aggregate fields back into Airtable. |
| Linked Issues check | ✅ Passed | The PR fully implements all requirements from issue #2582: a new library for computing aggregate fields is created, a cron job for periodic updates is implemented, and three computed fields (numResponses, numCompletions, averageRating) are populated. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly scoped to issue #2582: new library creation, cron job implementation, and computed field definitions. No unrelated modifications detected. |

greptile-apps Bot commented Jun 5, 2026
Contributor

Greptile Summary

This PR introduces @bluedot/computed-airtable-fields, a framework for periodically recomputing aggregate values and writing them back to Airtable, and wires it into pg-sync-service as a cron job that runs every 2 hours.

  • Core engine (core.ts): Cursor-based pagination (500-row chunks), per-chunk and per-write error isolation, change-detection to avoid spurious writes, and an optional beforeWrite hook used to share the existing Airtable rate-limiter budget.
  • Three computed fields (definitions.ts): computedNumResponses on exercise (scoped inArray query), computedNumCompletions and computedAverageRating on resource (PostgreSQL unnest + arrayOverlaps pattern with correct post-filtering for multi-resource rows).
  • Test coverage: Both core.test.ts and definitions.test.ts are thorough, covering idempotency, null transitions, multi-chunk boundaries, unknown-field validation, and error resilience at both compute and write layers.
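
To make the cursor-based pagination in the first bullet concrete, here is a minimal sketch. It assumes a page fetcher that returns rows ordered by id with id greater than the cursor. `FetchPage`, `forEachChunk`, and the in-memory table are hypothetical stand-ins for the real query, and the demo uses a chunk size of 2 rather than 500.

```typescript
interface Row {
  id: string;
  value: number | null;
}

// Hypothetical page fetcher: rows with id > cursor, ordered by id, at most `limit` rows.
type FetchPage = (cursor: string | null, limit: number) => Promise<Row[]>;

// Walk the whole table chunk by chunk, advancing the cursor to the last id seen.
async function forEachChunk(
  fetchPage: FetchPage,
  chunkSize: number,
  onChunk: (rows: Row[]) => Promise<void>,
): Promise<number> {
  let cursor: string | null = null;
  let checked = 0;
  for (;;) {
    const rows: Row[] = await fetchPage(cursor, chunkSize);
    const last = rows[rows.length - 1];
    if (last === undefined) break; // empty page: nothing left to process
    await onChunk(rows);
    checked += rows.length;
    cursor = last.id; // the next page starts strictly after this id
  }
  return checked;
}

// In-memory stand-in for the table, already sorted by id.
const table: Row[] = ["r01", "r02", "r03", "r04", "r05"].map((id) => ({ id, value: null }));
const fetchFromMemory: FetchPage = async (cursor, limit) =>
  table.filter((r) => cursor === null || r.id > cursor).slice(0, limit);
```

Unlike OFFSET-based paging, a cursor keyed on id does not skip or repeat rows when earlier rows are inserted or deleted between pages.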

Confidence Score: 5/5

Safe to merge — the new library is additive and read-path failures are fully isolated; writes only happen for rows where the value actually changed.

The computation engine is well-tested across all meaningful edge cases (idempotency, null transitions, multi-chunk, error isolation), and the cron integration correctly guards against concurrent runs while sharing the existing rate-limiter. The two suggestions are minor observability and API-surface nits that do not affect correctness.

No files require special attention. The SQL patterns in definitions.ts (unnest + arrayOverlaps) are the most novel part and are covered by dedicated integration tests.

Important Files Changed

| Filename | Overview |
| --- | --- |
| libraries/computed-airtable-fields/src/core.ts | Core recompute engine: cursor-based pagination, per-chunk error isolation, beforeWrite hook, and change-detection logic. Well-implemented with no functional issues. |
| libraries/computed-airtable-fields/src/definitions.ts | Three computed fields using correct SQL patterns (inArray scoped query, unnest+arrayOverlaps, avg with NO_RESPONSE exclusion); chunk-ID post-filtering is correct. |
| apps/pg-sync-service/src/lib/cron.ts | Adds recomputeComputedAirtableFieldsCron with guard flag, correct 6-field cron schedule (every 2h), sequential field processing, and shared rateLimiter. Individual row errors are not surfaced beyond the counts. |
| libraries/computed-airtable-fields/src/core.test.ts | Comprehensive tests covering single/multi-chunk, idempotency, null handling, beforeWrite sequencing, unknown-field validation, and error resilience at both compute and write layers. |
| libraries/computed-airtable-fields/src/definitions.test.ts | Happy-path tests for all three compute functions wired through the real definitions registry; covers completed-only counting, multi-resource array rows, and average-rating edge cases. |
| libraries/computed-airtable-fields/src/index.ts | Exports recomputeValues and two core types but omits ComputedAirtableFieldGroup, which is needed to type external definition arrays. |
| libraries/computed-airtable-fields/package.json | New package scaffold; only runtime dependency is @bluedot/db, test deps are appropriate. |
| libraries/computed-airtable-fields/README.md | Minimal how-to for adding new computed fields; clear and complete. |

Sequence Diagram

sequenceDiagram
    participant Cron as recomputeComputedAirtableFieldsCron
    participant Core as recomputeValues
    participant PG as PostgreSQL
    participant RL as RateLimiter
    participant AT as Airtable (via db.update)

    Cron->>Core: "recomputeValues({ db, definition, beforeWrite })"
    Core->>PG: "SELECT id, field WHERE id > cursor LIMIT 500"
    PG-->>Core: chunk of rows
    Core->>Core: "compute(db, chunkIds) → { id: newValue }"
    Core->>Core: diff newValue vs currentValue
    loop for each changed row
        Core->>RL: beforeWrite() → rateLimiter.acquire()
        RL-->>Core: slot acquired
        Core->>AT: "db.update(table, { id, field: newValue })"
        AT-->>Core: ok / error (caught, counted as failed)
    end
    Core->>PG: next chunk (cursor advances)
    Core-->>Cron: "{ checked, updated, failed, errors }"
    Cron->>Cron: logger.info(counts)

Reviews (2): Last reviewed commit: "Make compute functions more query-effici..." | Re-trigger Greptile

Comment thread libraries/computed-airtable-fields/src/definitions.ts
Comment thread libraries/computed-airtable-fields/README.md Outdated
@Will-Howard Will-Howard marked this pull request as draft June 5, 2026 14:14

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🧹 Nitpick comments (1)
libraries/computed-airtable-fields/src/core.ts (1)

55-59: 💤 Low value

Consider validating that compute returns values for all input ids.

The diff loop iterates Object.entries(computed), so if the compute function returns fewer ids than requested, those missing ids are silently skipped (their stored values remain unchanged). Whilst all current definitions follow the contract of returning a value for every input id, adding validation would catch future bugs earlier.

🛡️ Proposed validation
 const processChunk = async (rows: { id: string; current: ComputedAirtableFieldValue }[]) => {
   let computed: Record<string, ComputedAirtableFieldValue>;
   try {
     computed = await definition.compute(db, rows.map((r) => r.id));
   } catch (err) {
     // Whole chunk fails — we couldn't determine fresh values, so count every row as failed.
     errors.push(err);
     failed += rows.length;
     return;
   }
+
+  // Validate contract: compute must return a value for every input id
+  for (const row of rows) {
+    if (!(row.id in computed)) {
+      const err = new Error(`Compute function did not return value for id: ${row.id}`);
+      errors.push(err);
+      failed += rows.length;
+      return;
+    }
+  }

   const currentById = Object.fromEntries(rows.map((r) => [r.id, r.current]));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libraries/computed-airtable-fields/src/core.ts` around lines 55 - 59, The
loop over Object.entries(computed) can silently ignore missing ids from the
compute result; after calling compute (the function that produces `computed`)
validate that every expected id (use the keys from `currentById` or the original
requested ids array) exists in `computed`, and if any are missing throw or
return a clear error specifying the missing ids (include the id list in the
message) so you fail-fast; reference the `computed`, `currentById` and the
compute call to locate where to add this validation.
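
One way to act on this prompt is to collect every missing id before failing, so the error message names all of them. The sketch below is hypothetical: `findMissingIds` and `assertComputeContract` are illustrative names rather than functions in core.ts. A null value counts as present; only an absent key is a contract violation.

```typescript
type FieldValue = number | string | null;

// Return the requested ids that have no entry at all in the compute result.
function findMissingIds(
  requestedIds: string[],
  computed: Record<string, FieldValue>,
): string[] {
  return requestedIds.filter(
    (id) => !Object.prototype.hasOwnProperty.call(computed, id),
  );
}

// Fail fast with a message that lists every missing id.
function assertComputeContract(
  requestedIds: string[],
  computed: Record<string, FieldValue>,
): void {
  const missing = findMissingIds(requestedIds, computed);
  if (missing.length > 0) {
    throw new Error(`compute did not return values for ids: ${missing.join(", ")}`);
  }
}

// "c" maps to null, which is a legitimate computed value, so only "b" is missing.
const missingIds = findMissingIds(["a", "b", "c"], { a: 1, c: null });
console.log(missingIds);
```

Whether a violation should fail the whole chunk or only the affected rows is a design decision the review leaves open.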
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/pg-sync-service/src/lib/cron.ts`:
- Around line 64-76: The recompute loop should not abort on a single field
error: wrap the await recomputeValues call inside a per-field try/catch so each
{table, fields} -> [field, compute] iteration catches its own errors, logs them
(use logger.error with context including getTableName(table.pg) and field), and
continues to the next field; keep beforeWrite: () => rateLimiter.acquire() as-is
and still log successful runs with logger.info when recomputeValues returns
checked/updated/failed.

---

Nitpick comments:
In `@libraries/computed-airtable-fields/src/core.ts`:
- Around line 55-59: The loop over Object.entries(computed) can silently ignore
missing ids from the compute result; after calling compute (the function that
produces `computed`) validate that every expected id (use the keys from
`currentById` or the original requested ids array) exists in `computed`, and if
any are missing throw or return a clear error specifying the missing ids
(include the id list in the message) so you fail-fast; reference the `computed`,
`currentById` and the compute call to locate where to add this validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d5caac56-b0d5-41f7-9b7a-73c0c2281427

📥 Commits

Reviewing files that changed from the base of the PR and between 854befb and 9133ee5.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (13)
  • apps/pg-sync-service/package.json
  • apps/pg-sync-service/src/lib/cron.ts
  • libraries/computed-airtable-fields/.env.test
  • libraries/computed-airtable-fields/README.md
  • libraries/computed-airtable-fields/eslint.config.mjs
  • libraries/computed-airtable-fields/package.json
  • libraries/computed-airtable-fields/src/core.test.ts
  • libraries/computed-airtable-fields/src/core.ts
  • libraries/computed-airtable-fields/src/definitions.test.ts
  • libraries/computed-airtable-fields/src/definitions.ts
  • libraries/computed-airtable-fields/src/index.ts
  • libraries/computed-airtable-fields/tsconfig.json
  • libraries/computed-airtable-fields/vitest.config.mjs

Comment thread apps/pg-sync-service/src/lib/cron.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/pg-sync-service/src/lib/cron.ts`:
- Around line 68-73: The current destructuring from recomputeValues ignores its
returned errors array, so when failed > 0 we only log counts and lose diagnostic
details; update the call to destructure errors (e.g., const { checked, updated,
failed, errors } = await recomputeValues(...)) and after the existing
logger.info add a logger.error branch: if (failed > 0) log an error-level
message that includes the key `${getTableName(table.pg)}.${field}` and at least
one item from errors (errors[0] or a summarized errors[0].message) so callers
can see an example failure and stack/message for debugging.
- Around line 85-88: The three cron registrations are currently executed at
module import; move the COMPUTED_AIRTABLE_FIELDS_RECOMPUTE_SCHEDULE registration
(the call that schedules recomputeComputedAirtableFieldsCron) out of
apps/pg-sync-service/src/lib/cron.ts so it is not started at import time and
instead register it from the service post-start path (e.g., after the optional
initial sync in apps/pg-sync-service/src/index.ts post-start hook). Concretely:
stop calling cron.schedule for COMPUTED_AIRTABLE_FIELDS_RECOMPUTE_SCHEDULE in
the module top-level, expose a function like
registerComputedAirtableRecomputeSchedule or startComputedAirtableCron that
performs cron.schedule(recomputeComputedAirtableFieldsCron,
COMPUTED_AIRTABLE_FIELDS_RECOMPUTE_SCHEDULE), and invoke that function from the
index.ts post-start/after-initial-sync code path so the schedule only starts
after bootstrap completes.
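
The second suggestion (register the schedule after bootstrap instead of at import time) might be sketched as follows. Everything here is an assumption made for illustration: the `Scheduler` type is injected so the sketch does not depend on any particular cron library's signature, the cron expression is a guessed 6-field "every 2 hours" value, and `startService` stands in for the post-start path in index.ts.

```typescript
// Hypothetical scheduler signature, injected rather than imported from a cron library.
type Scheduler = (expression: string, job: () => Promise<void>) => void;

// Assumed 6-field expression (with seconds) meaning "at minute 0 of every 2nd hour".
const COMPUTED_AIRTABLE_FIELDS_RECOMPUTE_SCHEDULE = "0 0 */2 * * *";

// Nothing is scheduled at module load; callers decide when registration happens.
function registerComputedAirtableRecomputeSchedule(
  schedule: Scheduler,
  job: () => Promise<void>,
): void {
  schedule(COMPUTED_AIRTABLE_FIELDS_RECOMPUTE_SCHEDULE, job);
}

// Hypothetical bootstrap: run the initial sync first, then register the cron.
async function startService(opts: {
  initialSync: () => Promise<void>;
  schedule: Scheduler;
  recompute: () => Promise<void>;
}): Promise<void> {
  await opts.initialSync();
  registerComputedAirtableRecomputeSchedule(opts.schedule, opts.recompute);
}
```

Injecting the scheduler also makes the ordering easy to test, since a fake scheduler can record when registration happened relative to the initial sync.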

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c3d4a0e0-64dd-468d-b4a1-4b978d095e87

📥 Commits

Reviewing files that changed from the base of the PR and between 9133ee5 and a81a1fc.

📒 Files selected for processing (2)
  • apps/pg-sync-service/src/lib/cron.test.ts
  • apps/pg-sync-service/src/lib/cron.ts

Comment thread apps/pg-sync-service/src/lib/cron.ts Outdated
Comment thread apps/pg-sync-service/src/lib/cron.ts

@marn-in-prod marn-in-prod left a comment

Collaborator

Mostly good, mostly changes on edge cases. I'll add them in a second

@marn-in-prod marn-in-prod left a comment

Collaborator

Looks good. Just the edge case that we should probably make a decision on

.where(and(
arrayOverlaps(resourceCompletionTable.pg.resourceId, ids),
// != NO_RESPONSE also excludes NULL (NULL != 0 → UNKNOWN, filtered out).
ne(resourceCompletionTable.pg.resourceFeedback, RESOURCE_FEEDBACK.NO_RESPONSE),

Collaborator

computedNumCompletions filters on isCompleted = true but this average doesn't, so a row with isCompleted: false and an actual rating still moves the average while not counting as a completion.

This is currently possible in the UI: rate a resource and then uncomplete it.

The data state already exists, but since this field is new we should consciously decide what we intend here and add a test (or a comment) to make the decision obvious.

Collaborator Author

Seems reasonable 👍. I think I'll go with isCompleted = true everywhere to keep things simple, so we can keep the mental model that there's not much difference between a row having isCompleted = false and the row simply not existing.
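
The agreed rule can be expressed as a small in-memory sketch, which could also serve as the shape of the test the reviewer asked for. It is not the SQL in definitions.ts and rests on several assumptions: `NO_RESPONSE` is taken to be 0 (inferred from the "NULL != 0" comment above), each completion references a single resource rather than an array, and a resource with no qualifying ratings averages to null.

```typescript
// Assumed sentinel value, inferred from the "NULL != 0" comment in the reviewed snippet.
const NO_RESPONSE = 0;

interface Completion {
  resourceId: string; // simplified: the real column holds an array of resource ids
  isCompleted: boolean;
  resourceFeedback: number | null;
}

// Average rating over completed rows only, ignoring NO_RESPONSE and null feedback.
function averageRating(completions: Completion[], resourceId: string): number | null {
  const ratings = completions
    .filter((c) => c.resourceId === resourceId && c.isCompleted)
    .map((c) => c.resourceFeedback)
    .filter((r): r is number => r !== null && r !== NO_RESPONSE);
  if (ratings.length === 0) return null; // no ratings: no average
  return ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
}

const sampleCompletions: Completion[] = [
  { resourceId: "r1", isCompleted: true, resourceFeedback: 5 },
  { resourceId: "r1", isCompleted: true, resourceFeedback: 3 },
  { resourceId: "r1", isCompleted: false, resourceFeedback: 1 }, // rated, then uncompleted: ignored
  { resourceId: "r1", isCompleted: true, resourceFeedback: NO_RESPONSE },
  { resourceId: "r1", isCompleted: true, resourceFeedback: null },
];
console.log(averageRating(sampleCompletions, "r1"));
```

With this rule, the rated-then-uncompleted row affects neither the completion count nor the average, which matches the "isCompleted = false is like not existing" mental model.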

@Will-Howard Will-Howard temporarily deployed to wh-2582-computed-airtable-fields-2026-06 - bluedot-preview PR #2614 June 10, 2026 09:08 — with Render Destroyed
@Will-Howard Will-Howard temporarily deployed to wh-2582-computed-airtable-fields-2026-06 - bluedot-storybook-preview PR #2614 June 10, 2026 09:08 — with Render Destroyed
@Will-Howard Will-Howard merged commit f132ad2 into master Jun 10, 2026
7 checks passed
@Will-Howard Will-Howard deleted the wh-2582-computed-airtable-fields-2026-06 branch June 10, 2026 09:13
Will-Howard added a commit that referenced this pull request Jun 10, 2026
Squashed rebase of the original branch onto master: the getFirst refactor
commits merged separately as #2620, and the computed-airtable-fields
library (#2614) is adapted to read the flipped tables.
Will-Howard added a commit that referenced this pull request Jun 11, 2026
Squashed rebase of the original branch onto master: the getFirst refactor
commits merged separately as #2620, and the computed-airtable-fields
library (#2614) is adapted to read the flipped tables.
Development

Successfully merging this pull request may close these issues.

Moving resource/exercise responses into Postgres-only: Create framework for pushing aggregate fields back into Airtable

2 participants