HNT-1890 (2/4): RedisCorpusCache + corpus_cache config + unit tests #1439

Open
mmiermans wants to merge 1 commit into hnt-1890-cache-1-infra from hnt-1890-cache-2-impl

@mmiermans mmiermans commented Apr 27, 2026

Collaborator

PR 2/4 to implement shared caching of curated recommendations between pods.

Stack: 1/4 infra + architecture doc → 2/4 (this) → 3/4 integration tests → 4/4 wire-up.

⚠️ Base of this PR is hnt-1890-cache-1-infra (PR 1/4), not main. Review after PR 1/4 has been approved — that PR establishes the architecture this one implements.

References

JIRA: HNT-1890

Description

Implements RedisCorpusCache — the distributed stale-while-revalidate (SWR) cache that wraps the Pocket Corpus GraphQL backends. Not yet wired into the providers in this PR; consumer wiring lands in PR 4/4. With cache = "none" (the default added in this PR), the class is unreachable from production code, so this PR is behavior-preserving.
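As a rough illustration of the SWR semantics described above, the sketch below classifies a cache entry by its age. The `Freshness` enum and `classify` function are hypothetical names for this example only and are not taken from `redis_cache.py`; the real class may enforce the hard TTL through Redis key expiry rather than an explicit age check.

```python
from enum import Enum


class Freshness(Enum):
    FRESH = "fresh"      # serve from cache, no backend call needed
    STALE = "stale"      # serve from cache, and trigger a background refresh
    EXPIRED = "expired"  # past the hard TTL; treat as a cache miss


def classify(age_sec: float, soft_ttl_sec: int = 60, hard_ttl_sec: int = 86400) -> Freshness:
    """Classify an entry by age (hypothetical helper; defaults mirror this PR's config)."""
    if age_sec < soft_ttl_sec:
        return Freshness.FRESH
    if age_sec < hard_ttl_sec:
        return Freshness.STALE
    return Freshness.EXPIRED
```

Under these assumptions, an entry older than the soft TTL is still served to the caller while one pod refreshes it, and only an entry older than the hard TTL forces a cold fetch.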

The architecture and full rationale live in docs/operations/curated-recommendations/corpus-cache.md (added in PR 1/4).

What's in this PR

  • merino/curated_recommendations/corpus_backends/redis_cache.py (~322 LOC) — RedisCorpusCache and the RedisCachedScheduledSurface / RedisCachedSections adapters. Accepts an injected CorpusCacheConfig dataclass, so the class is independently testable without Dynaconf or app startup.
  • merino/configs/default.toml — new [default.curated_recommendations.corpus_cache] section (cache="none", soft_ttl_sec=60, hard_ttl_sec=86400, lock_ttl_sec=30, key_prefix="curated:v1").
  • merino/configs/__init__.py — Dynaconf validators for the new section (allowed values, type/range checks).
  • tests/unit/curated_recommendations/corpus_backends/test_redis_cache.py (~616 LOC) — unit suite covering SWR semantics, lock contention between concurrent callers, cold-miss/503 path, stale deserialization, retry-after-lock-held, and error handling.
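The lock contention case in the unit suite is easier to picture with a small sketch of a `SET NX EX` style refresh lock. Everything here is an assumption for illustration: `InMemoryRedisStub` is a hypothetical stand-in that imitates only the `set(name, value, ex=..., nx=...)` behaviour of redis-py, and the `<key_prefix>:lock:<surface_id>` key layout and `try_acquire_refresh_lock` name are invented, not taken from the PR.

```python
import secrets
import time
from typing import Optional


class InMemoryRedisStub:
    """Hypothetical stand-in mimicking redis-py's set(..., nx=True, ex=...) for this demo."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def set(self, name: str, value: str, ex: Optional[int] = None, nx: bool = False):
        now = time.monotonic()
        current = self._data.get(name)
        if current is not None and current[1] <= now:
            current = None  # expired entries behave as if absent
        if nx and current is not None:
            return None  # redis-py returns None when the NX condition fails
        expires_at = now + ex if ex is not None else float("inf")
        self._data[name] = (value, expires_at)
        return True


def try_acquire_refresh_lock(
    client, key_prefix: str, surface_id: str, lock_ttl_sec: int = 30
) -> Optional[str]:
    """Return a lock token if this caller won the refresh lock, else None."""
    token = secrets.token_hex(8)
    # Assumed key layout; the real key format in redis_cache.py may differ.
    acquired = client.set(f"{key_prefix}:lock:{surface_id}", token, nx=True, ex=lock_ttl_sec)
    return token if acquired else None


client = InMemoryRedisStub()
first = try_acquire_refresh_lock(client, "curated:v1", "example_surface")
second = try_acquire_refresh_lock(client, "curated:v1", "example_surface")
```

In this sketch the first caller receives a token and would go on to refresh the entry, while the second gets `None` and would either serve stale data or retry later. The lock's expiry (`ex`) bounds how long a crashed holder can block other pods.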

Implementation decisions specific to this PR

| Decision | Choice | Why |
|---|---|---|
| Cache format | Pydantic model dicts via orjson | Processed models are cached, not raw GraphQL responses, which saves CPU across the ~300 pods |
| CorpusCacheConfig injection | Dataclass passed at construction | Keeps redis_cache.py decoupled from merino.configs.settings and trivially unit-testable |
| Soft TTL default | 60s | Average propagation time to NewTab is about half this value (~30s) |
| Hard TTL default | 1 day | Safety net so data survives extended Corpus API outages |
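To make the config-injection decision concrete, here is a minimal sketch of what an injected config dataclass could look like. The field names and defaults come from the `corpus_cache` section listed in this PR; the class name `ExampleCorpusCacheConfig`, the `"redis"` cache value, and the `__post_init__` checks are assumptions for illustration (the PR does its validation through Dynaconf validators, whose exact rules are not shown here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleCorpusCacheConfig:
    """Hypothetical stand-in for CorpusCacheConfig; defaults mirror default.toml in this PR."""

    cache: str = "none"
    soft_ttl_sec: int = 60
    hard_ttl_sec: int = 86400
    lock_ttl_sec: int = 30
    key_prefix: str = "curated:v1"

    def __post_init__(self) -> None:
        # Assumed rules: "redis" as the non-default cache value is a guess.
        if self.cache not in ("none", "redis"):
            raise ValueError(f"unsupported cache backend: {self.cache!r}")
        if self.soft_ttl_sec <= 0 or self.lock_ttl_sec <= 0:
            raise ValueError("TTLs must be positive")
        if self.hard_ttl_sec < self.soft_ttl_sec:
            raise ValueError("hard_ttl_sec must be >= soft_ttl_sec")


# A test can build a config directly, with no Dynaconf or app startup involved.
test_config = ExampleCorpusCacheConfig(cache="redis", soft_ttl_sec=1, hard_ttl_sec=5, lock_ttl_sec=1)
```

Passing a plain dataclass like this at construction time is what lets a unit test use short TTLs without touching global settings.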

Integration tests (real Redis via testcontainers) ship in PR 3/4.

PR Review Checklist

  • Conforms to Contribution Guidelines
  • PR title starts with JIRA reference
  • [load test: (abort|skip|warn)] keywords applied (n/a — code unreachable until PR 4/4)
  • Documentation updated (in PR 1/4)
  • Test coverage expanded (unit; integration in PR 3/4)

Adds the shared L2 cache class that wraps the Pocket Corpus GraphQL
backends, plus its config defaults and Dynaconf validators. Not yet
wired into the providers — consumer wiring lands in PR 4/4. With the
default cache="none" this code is unreachable, so this PR is behavior-
preserving on production today.

Files:
- merino/curated_recommendations/corpus_backends/redis_cache.py:
  RedisCorpusCache implementation (distributed SWR + SET NX EX lock).
  Accepts an injected CorpusCacheConfig dataclass — no settings import.
- merino/configs/default.toml: [default.curated_recommendations.corpus_cache]
  with cache/soft_ttl_sec/hard_ttl_sec/lock_ttl_sec/key_prefix.
- merino/configs/__init__.py: Dynaconf validators for the new section
  (allowed values, type/range checks).
- tests/unit/.../test_redis_cache.py: 616-line unit suite covering
  SWR semantics, lock contention between concurrent callers, the
  cold-miss/503 path, stale deserialization, retry-after-lock-held,
  and error handling.

Architecture and design tradeoffs are documented in
docs/operations/curated-recommendations/corpus-cache.md (PR 1/4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mmiermans mmiermans force-pushed the hnt-1890-cache-2-impl branch from e65da9d to 80f8f34 Compare April 27, 2026 20:45
@mmiermans mmiermans changed the title from "HNT-1890 (2/4): RedisCorpusCache implementation + tests + docs" to "HNT-1890 (2/4): RedisCorpusCache + corpus_cache config + unit tests" Apr 27, 2026