Background
We run vector search over a ~6.2 M-row corpus of vector(1024)
candidates on PostgreSQL 18 with VectorChord. Most of our production
queries are cohort-shaped: a vector ORDER BY combined with several
mandatory equality / range / bitmask predicates over fixed-width
columns (feed, status, visibility, geo, time bucket, etc.). The
predicates typically cut the working set to 5–30 % of rows, but the
ORDER BY still needs ANN over the survivors.
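For concreteness, a typical cohort-shaped query has roughly this shape (table and column names are illustrative, not our real schema):

```sql
-- Illustrative cohort-shaped query; $1 is the 1024-dim query vector.
SELECT id
FROM items
WHERE feed_id = 42                          -- equality
  AND status  = 1                           -- equality
  AND (visibility_flags & 12) = 12          -- bitmask
  AND bucket BETWEEN 7300 AND 7400          -- range (time bucket)
ORDER BY embedding <#> $1                   -- inner-product ANN over the survivors
LIMIT 100;
```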
Stock vchordrq.prefilter (the existing prefilter from #247, thanks!)
already helps by letting us scan the index instead of falling back to a
quicksort. But it's a heap-side prefilter: the IVF scan returns candidates,
the index fetches each candidate's full f32 vector, then the heap row is
checked, and the vector reads for rejected candidates are wasted I/O.
For us this matters most on memory-constrained instances. With a 33 GB
vchord index that doesn't fit in shared_buffers, every wasted vector
fetch is a random page fault on the index, and at the cohort
selectivities we see (~10 %), that means 90 % of the IVF candidates
get read, scored, and discarded.
Proposed feature
Allow INCLUDE columns of bigint type on vchordrq indexes, store their
values in the H0 leaf tuples, and add a candidate-filter callback that
runs before the vector page fetch. Introduce a GUC
vchordrq.metadata_prefilter with three modes: off, reject_only
(definitely-false predicates short-circuit further work), and
covered_skip_heap (lossless predicates can also skip the heap
recheck).
In the IVF scan: for each H0 candidate, evaluate the compiled metadata
predicates against the candidate's INCLUDE values; if any predicate
proves "definitely false", skip the candidate without faulting in its
vector. Otherwise proceed as today (vector fetch + rerank + heap
recheck if the predicate is lossy).
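In SQL terms, we imagine the surface looking roughly like this (a sketch of the proposed interface, reusing the illustrative schema from the Background section; the plain INCLUDE form here sidesteps the lossless/lossy opclass question raised below):

```sql
-- Sketch of the proposed surface, not a finalized design.
-- bigint metadata columns stored alongside the vector in the H0 leaf tuples:
CREATE INDEX items_vec_idx ON items
  USING vchordrq (embedding vector_ip_ops)
  INCLUDE (visibility_flags, bucket);

-- reject_only: candidates whose INCLUDE values already fail a predicate are
-- dropped before their f32 vector page is fetched; everything else proceeds
-- as today (vector fetch + rerank + heap recheck).
SET vchordrq.metadata_prefilter = 'reject_only';
```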
Prototype + numbers
We have a working prototype in a private fork (it carries some
company-specific tooling and configuration in its history that we can't
publish). It implements all of the above. Happy to share the repo
privately with a maintainer if that would help review; we'd rather not
upstream-PR off that branch directly anyway, since it hardcodes our
domain's column names and would need to be generalized first.
Two boxes, two bottleneck regimes, identity recall verified vs the
no-metadata path on 20 random queries:
Clean attribution test (32 GiB / 4 vCPU, CPU-bound).
Stock 1.0.0 vs the prototype's metadata path on the same hardware,
with vchordrq.prefilter=on, enable_sort=off, and read_stream IO
applied to both runs (the only delta is the metadata feature itself
plus the wide-INCLUDE index):
| config | peak QPS | p95 at peak |
| --- | --- | --- |
| stock 1.0.0, plain index, GUCs tuned | 5.68 (c=8) | 2 324 ms |
| prototype, wide-INCLUDE + metadata_prefilter=reject_only | 6.84 (c=4) | 836 ms |
That's +20 % peak QPS, −64 % p95. The QPS gain is modest because
this box is CPU-bound at c≥4 with only 4 vCPUs. The latency gain is
where the metadata reject earns its keep: sub-second p95 at the same
concurrency where stock-tuned is still over 2 seconds.
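For reference, the two GUCs named above are session-settable; a minimal sketch of the tuned baseline (other site-specific tuning omitted):

```sql
-- GUCs applied to both attribution runs, per the setup above.
SET vchordrq.prefilter = on;  -- index-scan prefilter from #247
SET enable_sort = off;        -- keep the planner off the sort path
```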
Memory-bound regime (16 vCPU host, 16 GiB cgroup cap). Same
prototype, toggling whether the metadata predicates are emitted:
| config | peak QPS @ c=16 | p95 @ c=16 |
| --- | --- | --- |
| prototype, no metadata predicates emitted | 11.86 | 2 635 ms |
| prototype, wide-metadata reject | 37.79 | 732 ms |
+218 % peak QPS, −72 % p95 under memory pressure. The gain is much
larger here because rejecting a candidate before its f32 vector page
faults in is exactly the right thing to do when shared_buffers can't
hold the index. The 33 GB index on this corpus is dominated by per-row
f32 vector copies on VectorTuple pages (~6.2 M rows × 1024 dims × 4 bytes
≈ 25 GB of raw vector payload alone). Those copies exist solely so
rerank can avoid a heap fetch, and they're exactly what the metadata
reject lets us avoid reading in the first place.
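As a quick sanity check of which regime a deployment is in, the index size can be compared against shared_buffers (items_vec_idx is the illustrative index name from the sketch above):

```sql
-- Is the vchordrq index larger than shared_buffers? If so, wasted vector
-- fetches turn into page faults and the memory-bound numbers above apply.
SELECT pg_size_pretty(pg_relation_size('items_vec_idx')) AS index_size;
SHOW shared_buffers;
```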
A side note from the attribution work: the prototype's non-metadata
improvements (read_stream IO, prefetcher batching, search-loop
refactor) produced essentially zero throughput delta over stock 1.0.0
with the same GUCs applied. The metadata-prefilter feature is the only
novel contribution worth upstreaming.
Open questions for the maintainers
A couple of decisions are upstream-shaped and we'd value your steer
before writing the public PR:
- How users mark "lossless" vs "lossy" INCLUDE columns.
Some metadata predicates prove the heap row matches (e.g. a
stored bitmask passing (col & mask) = mask is exact); some only
accelerate the reject case (e.g. a 64-bit hash where a non-match
is conclusive but a match is "probably matches, recheck the heap").
Lossless predicates let us skip the heap recheck; lossy ones can't
(an illustrative query pair follows after this list).
The most natural way to surface this in PG would be per-column
opclass options. We'd define two bigint opclasses,
bigint_lossless_ops and bigint_lossy_ops, and use them on the
INCLUDE columns:
```sql
CREATE INDEX … USING vchordrq (vec vector_ip_ops)
  INCLUDE (flags_col bigint_lossless_ops,
           hash_col bigint_lossy_ops);
```
That keeps the SQL declarative and uses standard PG plumbing. Any
pushback on that approach, or a different shape you'd prefer?
- Tuple format change. The prototype adds a versioned metadata
tail to FrozenTuple and AppendableTuple
(METADATA_TAIL_VERSION = 2). For indexes built without INCLUDE
columns, attr_count = 0 ⇒ no tail emitted ⇒ existing 1.x indexes
read identically. Is that backward-compat shape acceptable, or
would you rather the entire feature be gated behind a new index
reloption / opclass so the on-disk format is unchanged for existing
users?
- Tests. We'd plan to add ≥6 sqllogictest files in
tests/vchordrq/ covering eq / IN / range / bitmask / NULL / mixed
with ORDER BY, plus an identity-recall test against the
no-metadata path. Anything else you'd want before review?
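To make the lossless vs lossy distinction from the first question concrete, here is the kind of query pair we have in mind. This is a sketch under assumptions: flags_col stores the exact heap bitmask, hash_col stores a 64-bit hash of a hypothetical region_code column; neither name comes from the prototype.

```sql
-- Illustrative sketch, not the prototype's code; schema names are assumptions.

-- Lossless case: the index's flags_col holds the exact heap bitmask, so a
-- candidate passing (flags_col & 12) = 12 in the index is proven to match,
-- and under covered_skip_heap the heap recheck could be skipped.
SELECT id FROM items
WHERE (visibility_flags & 12) = 12
ORDER BY embedding <#> $1
LIMIT 100;

-- Lossy case: the index's hash_col holds a 64-bit hash of region_code. A hash
-- mismatch conclusively rejects a candidate before its vector is read, but a
-- hash match only means "probably matches", so the heap row is still rechecked
-- (reject_only behaviour).
SELECT id FROM items
WHERE region_code = 'eu-west'
ORDER BY embedding <#> $1
LIMIT 100;
```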
If this is on your roadmap or you'd rather take a different shape,
please let us know. We'd rather align early than rebuild later.
Acknowledgements
The prototype was implemented by @SamORichards (cc'd). I (@spenc-r) am
happy to do the upstream-generalization work and own the public PR; I'll
add Sam as Co-authored-by: on each commit.