schema_registry/test: lower json recursion depth locally#29546
Conversation
Locally the test is failing at a lower recursion limit, likely because of machine/OS/build-type differences. So, lower the limit to ensure that tests pass locally, while still keeping the limit as is for CI to pick up any regressions.
There was a problem hiding this comment.
Pull request overview
This PR refactors the CI environment detection logic by moving it from a local helper function in security/tests/license_utils.h to a shared utility function in test_utils/test_env. The new shared function is then used to adjust the JSON schema recursion depth test limits based on whether the tests are running in CI or locally, addressing test failures on local development machines.
Changes:
- Added a shared
is_on_ci()function intest_utils/test_envto detect CI environment - Refactored existing CI detection code to use the new shared utility
- Adjusted JSON schema recursion depth test to use lower limits locally (17) while maintaining higher limits (30) in CI
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/v/test_utils/test_env.h | Declares new is_on_ci() function |
| src/v/test_utils/test_env.cc | Implements CI detection using environment variable check |
| src/v/test_utils/BUILD | Adds abseil strings dependency for case-insensitive comparison |
| src/v/security/tests/license_utils.h | Removes local is_on_ci() implementation and uses shared version |
| src/v/security/tests/BUILD | Updates dependency from abseil strings to test_env |
| src/v/pandaproxy/schema_registry/test/test_json_schema.cc | Changes max_test_depth from constant to runtime value based on CI detection |
| src/v/pandaproxy/schema_registry/test/BUILD | Adds test_env dependency |
| @@ -2376,7 +2377,7 @@ SEASTAR_THREAD_TEST_CASE(test_object_recursion_depths) { | |||
| // With validation disabled, setting the limit above ~130 causes corruption | |||
| // of the heap due to stack overflow, which typically manifests as a crash | |||
| // during Seastar shutdown, or during is_superset. | |||
There was a problem hiding this comment.
The change from constexpr int to const int and the different values (30 vs 17) warrant a comment explaining why the local environment requires a lower recursion depth. This would help future maintainers understand the reasoning behind these specific values and the platform-dependent behavior.
| // during Seastar shutdown, or during is_superset. | |
| // during Seastar shutdown, or during is_superset. | |
| // CI builds typically have a larger effective stack (and different build | |
| // settings) than local developer runs, so they can safely test up to depth | |
| // 30. Local environments have been observed to overflow the stack at lower | |
| // depths (e.g. with sanitizers or smaller thread stacks), so we cap them at | |
| // 17 to avoid heap/stack corruption. These values are empirical and | |
| // platform-dependent; adjust only with care. |
Retry command for Build#80231please wait until all jobs are finished before running the slash command |
dotnwat
left a comment
There was a problem hiding this comment.
likely because of machine/OS/build-type differences.
in what way is it failing?
With a stack overflow. Json schemas need to be validated against a json metaschema and we use the |
Locally the test is failing with a stack overflow at a lower recursion limit, likely because of machine/OS/build-type differences.
So, lower the limit to ensure that tests pass locally, while still keeping the limit as is for CI to pick up any regressions.
Related to: #29290
Backports Required
Release Notes