pandaproxy/sr: add global_context compatibility fallback#29713
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates Schema Registry compatibility lookups to include the .__GLOBAL context as an intermediate fallback tier (behind the defaultToGlobal query parameter), aligning the resolution hierarchy with the reference implementation.
Changes:
- Implement multi-tier compatibility fallback for context-level and subject-level lookups, including
.__GLOBALas an intermediate fallback whendefaultToGlobal=true. - Update store/sharded-store APIs and add extensive unit tests covering fallback behavior across default, non-default, and global contexts.
- (Stacked) Add reserved subject/context validation (
__GLOBAL,__EMPTY,.__GLOBAL) with a newsubject_invaliderror mapping and corresponding tests.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/rptest/tests/schema_registry_test.py | Adds an rptest covering reserved subject/context rejection behavior. |
| src/v/pandaproxy/schema_registry/types.h | Introduces global_context and default_top_level_compat; declares subject/context validation helper. |
| src/v/pandaproxy/schema_registry/types.cc | Implements validate_context_subject() throwing subject_invalid. |
| src/v/pandaproxy/schema_registry/store.h | Reworks compatibility resolution to support default_to_global and consult global_context appropriately. |
| src/v/pandaproxy/schema_registry/sharded_store.h | Updates get_compatibility(context, ...) signature to include fallback flag. |
| src/v/pandaproxy/schema_registry/sharded_store.cc | Wires fallback-aware context compatibility through to the underlying store. |
| src/v/pandaproxy/schema_registry/handlers.cc | Plumbs defaultToGlobal into GET /config and context-only GET /config/{subject} paths; adds reserved-name validation calls. |
| src/v/pandaproxy/schema_registry/seq_writer.cc | Updates internal compatibility reads to use the new fallback-aware context API. |
| src/v/pandaproxy/schema_registry/errors.h | Adds compatibility_not_found(context) and subject_invalid(...) helpers. |
| src/v/pandaproxy/schema_registry/error.h / error.cc | Adds subject_invalid to SR error codes and maps it to reply error codes. |
| src/v/pandaproxy/error.h / error.cc | Adds/mappings for reply error code 42208 (subject_invalid). |
| src/v/pandaproxy/schema_registry/test/sharded_store.cc | Adds comprehensive sharded-store tests validating fallback chains across contexts/subjects. |
| src/v/pandaproxy/schema_registry/test/store.cc | Updates store tests for the new fallback-aware context compatibility API; removes an invalid/obsolete test. |
| src/v/pandaproxy/schema_registry/test/consume_to_store.cc | Updates test to call the new context compatibility API. |
| src/v/pandaproxy/schema_registry/test/context_subject.cc | Adds unit tests for reserved subject/context validation and error code correctness. |
| auto fallback = parse::query_param<std::optional<default_to_global>>( | ||
| *rq.req, "defaultToGlobal") | ||
| .value_or(default_to_global::no); | ||
|
|
||
| // Ensure we see latest writes | ||
| co_await rq.service().writer().read_sync(); | ||
|
|
||
| auto res = co_await rq.service().schema_store().get_compatibility( | ||
| default_context); | ||
| default_context, fallback); |
There was a problem hiding this comment.
The new global_context fallback behavior is exposed via the defaultToGlobal query param on GET /config and GET /config/{subject}, but there is no rptest/integration coverage asserting that setting compatibility on :.__GLOBAL: is reflected when defaultToGlobal=true (and not reflected when it is absent/false). Adding an integration test would help catch regressions in the handler/query-param wiring (as opposed to only store-level resolution logic).
e3c7db8 to
1c15ab3
Compare
|
Force push to rebase against latest base |
Retry command for Build#81116please wait until all jobs are finished before running the slash command |
9b807fb to
19012db
Compare
Retry command for Build#81192please wait until all jobs are finished before running the slash command |
19012db to
c749564
Compare
|
Force push to condense fallback code. |
c749564 to
d273045
Compare
|
Force push to rebase on latest base. |
d273045 to
6616407
Compare
|
Force push to add back |
6616407 to
8fdbe6a
Compare
|
Force push to rebase on base. |
933631f to
6e0d3ef
Compare
|
Force pushes:
The resulting code is roughly the same as before these force pushes; unit tests were moved around for better organization. It'd likely be easier for the reviewer to review the new series of commits instead of looking at the changes between force pushes. |
pgellert
left a comment
There was a problem hiding this comment.
Looks great! Thanks for bearing with all the change requests!
6e0d3ef to
4dca154
Compare
6a8e0c5 to
6d30353
Compare
|
Force push for formatting |
pgellert
left a comment
There was a problem hiding this comment.
I think you need to update this with a rebase, but otherwise this looks good to go
Wires the `default_to_global` parameter into the context-only overload of `get_compatibility` for consistency with the subject-based overload, and threads the `defaultToGlobal` query param through to context-only lookups. The parameter itself is unused for now and the fallback logic is left as a TODO for a follow-up commit.
Replace the placeholder get_compatibility(context) implementation with proper fallback semantics: non-default contexts return an error when no config is set and fallback is disabled, while the default context always resolves to the hardcoded default. Adds sharded_store tests covering all context/fallback combinations and updates audit_log_test accordingly.
The reference implementation does not require a subject to exist before its compatibility config can be queried. Fix the fallback path so a missing subject falls through to context-level resolution instead of immediately returning compatibility_not_found. Add sharded_store tests covering subject config resolution in both default and non-default contexts with and without fallback enabled.
The reference implementation supports a special global_context
(.__GLOBAL) that sits at the top of the compatibility fallback
hierarchy. Redpanda's get_compatibility had no awareness of this
context: when no context-level config was set, both the context
and subject overloads fell through to the hardcoded default
directly, skipping global_context entirely.
Add global_context as a fallback tier so that the resolution
chains become:
With fallback enabled:
context: context config → global config → hardcoded default
subject: subject config → context config → global → hardcoded
global_context itself always resolves to its own config or the
hardcoded default regardless of the fallback flag, and subjects
within global_context always fall through to global_context
resolution.
Extend sharded_store tests to cover all context type (default,
non-default, global) * subject presence * fallback flag
combinations, verifying that each tier in the chain is consulted
in order and that clearing a tier correctly exposes the next one.
Wire the `default_to_global` parameter into the context-only overload of `get_mode` for consistency with the subject-based overload, and thread the `defaultToGlobal` query param through to context-only lookups. The parameter itself is unused for now and the fallback logic is left for a follow-up commit.
Replace the placeholder get_mode(context) implementation with proper fallback semantics: non-default contexts return an error when no mode is set and fallback is disabled, while the default context always resolves to the hardcoded default. Adds sharded_store tests covering all context/fallback combinations for both context-level and subject-level mode lookups.
The reference implementation supports a special global_context
(.__GLOBAL) that sits at the top of the mode fallback
hierarchy. Redpanda's get_mode had no awareness of this
context: when no context-level mode was set, both the context
and subject overloads fell through to the hardcoded default
directly, skipping global_context entirely.
Add global_context as a fallback tier so that the resolution
chains become:
With fallback enabled:
context: context mode → global mode → hardcoded default
subject: subject mode → context mode → global → hardcoded
global_context itself always resolves to its own mode or the
hardcoded default regardless of the fallback flag, and subjects
within global_context always fall through to global_context
resolution.
Extend sharded_store tests to cover all context type (default,
non-default, global) * subject presence * fallback flag
combinations, verifying that each tier in the chain is consulted
in order and that clearing a tier correctly exposes the next one.
6d30353 to
304348c
Compare
|
Force push to rebase on latest dev |
Previously, the schema registry's
GET /configandGET /config/{subject}endpoints did not support the global context (.__GLOBAL) fallback tier. When no context-level compatibility config was set, lookups returned the hardcoded default directly, ignoring any config set on the global context. This diverges from the reference implementation's fallback hierarchy, where.__GLOBALsits between per-context config and the hardcoded default.This PR implements the full multi-tier fallback chain. With the
defaultToGlobalquery parameter enabled:With the
defaultToGlobalquery parameter disabled, only the default context and the global context resolve to the hardcoded default; all other lookups return compatibility_not_found.GET /configandGET /config/{subject}now consult the global context (.__GLOBAL) whendefaultToGlobal=trueand no context-level config is set. Users who have set compatibility on.__GLOBALwill see it reflected in lookups from other contexts.Part of CORE-15192.
Additionally, this change also fixes a bug where attempting to retrieve a config for non-existent subjects immediately returned a not found error (regardless of the
defaultToGlobalparameter), which did not match the reference implementation.Backports Required
Release Notes
Bug Fixes
defaultToGlobal=trueinstead of always returning an error.