schema_registry/test: lower json recursion depth locally by pgellert · Pull Request #29546 · redpanda-data/redpanda

pgellert · 2026-02-05T18:31:51Z

Locally the test is failing with a stack overflow at a lower recursion limit, likely because of machine/OS/build-type differences.

So, lower the limit to ensure that tests pass locally, while still keeping the limit as is for CI to pick up any regressions.

Related to: #29290

Backports Required

Release Notes

none

Locally the test is failing at a lower recursion limit, likely because of machine/OS/build-type differences. So, lower the limit to ensure that tests pass locally, while still keeping the limit as is for CI to pick up any regressions.

Copilot

Pull request overview

This PR refactors the CI environment detection logic by moving it from a local helper function in security/tests/license_utils.h to a shared utility function in test_utils/test_env. The new shared function is then used to adjust the JSON schema recursion depth test limits based on whether the tests are running in CI or locally, addressing test failures on local development machines.

Changes:

- Added a shared is_on_ci() function in test_utils/test_env to detect the CI environment
- Refactored existing CI detection code to use the new shared utility
- Adjusted the JSON schema recursion depth test to use a lower limit locally (17) while keeping the higher limit (30) in CI

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
src/v/test_utils/test_env.h	Declares new `is_on_ci()` function
src/v/test_utils/test_env.cc	Implements CI detection using environment variable check
src/v/test_utils/BUILD	Adds abseil strings dependency for case-insensitive comparison
src/v/security/tests/license_utils.h	Removes local `is_on_ci()` implementation and uses shared version
src/v/security/tests/BUILD	Updates dependency from abseil strings to test_env
src/v/pandaproxy/schema_registry/test/test_json_schema.cc	Changes max_test_depth from constant to runtime value based on CI detection
src/v/pandaproxy/schema_registry/test/BUILD	Adds test_env dependency
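
For readers unfamiliar with this kind of helper, here is a minimal sketch of an environment-variable-based CI check. It is not the code from this PR: the variable name `CI`, the accepted value `true`, and the names `parse_ci_flag` and `is_on_ci_sketch` are assumptions for illustration, and a hand-written comparison stands in for the abseil case-insensitive comparison that the summary mentions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string_view>

// Case-insensitive ASCII comparison (standard-library stand-in for an
// abseil helper such as absl::EqualsIgnoreCase).
inline bool equals_ignore_case(std::string_view a, std::string_view b) {
    return a.size() == b.size()
           && std::equal(a.begin(), a.end(), b.begin(), [](char x, char y) {
                  return std::tolower(static_cast<unsigned char>(x))
                         == std::tolower(static_cast<unsigned char>(y));
              });
}

// Hypothetical: treats only "true" (in any letter case) as "running on CI".
// An unset variable (nullptr) counts as "not on CI".
inline bool parse_ci_flag(const char* value) {
    return value != nullptr && equals_ignore_case(value, "true");
}

// Hypothetical: the environment variable name "CI" is an assumption,
// not taken from the PR.
inline bool is_on_ci_sketch() {
    return parse_ci_flag(std::getenv("CI"));
}
```

Splitting the parsing from the `std::getenv` call keeps the decision logic testable without mutating the process environment.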

Copilot · 2026-02-05T18:32:37Z

@@ -2376,7 +2377,7 @@ SEASTAR_THREAD_TEST_CASE(test_object_recursion_depths) {
    // With validation disabled, setting the limit above ~130 causes corruption
    // of the heap due to stack overflow, which typically manifests as a crash
    // during Seastar shutdown, or during is_superset.


The change from constexpr int to const int and the different values (30 vs 17) warrant a comment explaining why the local environment requires a lower recursion depth. This would help future maintainers understand the reasoning behind these specific values and the platform-dependent behavior.

Suggested change

-    // during Seastar shutdown, or during is_superset.
+    // during Seastar shutdown, or during is_superset.
+    // CI builds typically have a larger effective stack (and different build
+    // settings) than local developer runs, so they can safely test up to depth
+    // 30. Local environments have been observed to overflow the stack at lower
+    // depths (e.g. with sanitizers or smaller thread stacks), so we cap them at
+    // 17 to avoid heap/stack corruption. These values are empirical and
+    // platform-dependent; adjust only with care.
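
The `constexpr int` to `const int` change discussed above follows from the value now depending on the runtime environment. A minimal sketch of that selection, assuming a hypothetical helper named `max_test_depth_for` (the depths 30 and 17 come from the review summary; the helper and its shape are not from the PR):

```cpp
#include <cassert>

// Hypothetical helper: picks the recursion depth the test should exercise.
// 30 on CI and 17 locally are the values quoted in the review.
inline int max_test_depth_for(bool on_ci) {
    constexpr int ci_depth = 30;
    constexpr int local_depth = 17;
    return on_ci ? ci_depth : local_depth;
}
```

In a test this might be used as `const int max_test_depth = max_test_depth_for(is_on_ci());`. Because `is_on_ci()` reads the environment at run time, the result cannot be `constexpr`.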

vbotbuildovich · 2026-02-05T20:05:01Z

Retry command for Build#80231

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/random_node_operations_smoke_test.py::RedpandaNodeOperationsSmokeTest.test_node_ops_smoke_test@{"cloud_storage_type":1,"mixed_versions":false}

vbotbuildovich · 2026-02-05T20:13:27Z

CI test results

test results on build#80231

test_class	test_method	test_arguments	test_kind	job_url	test_status	passed	reason	test_history
RedpandaNodeOperationsSmokeTest	test_node_ops_smoke_test	{"cloud_storage_type": 1, "mixed_versions": false}	integration	https://buildkite.com/redpanda/redpanda/builds/80231#019c2f23-6dd4-4e69-b5fb-ec8d1ee94d6a	FLAKY	8/11	Test FAILS after retries.Significant increase in flaky rate(baseline=0.0110, p0=0.0052, reject_threshold=0.0100)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=RedpandaNodeOperationsSmokeTest&test_method=test_node_ops_smoke_test
WriteCachingFailureInjectionE2ETest	test_crash_all	{"use_transactions": false}	integration	https://buildkite.com/redpanda/redpanda/builds/80231#019c2f21-a617-4018-9c66-a84a22045d3a	FLAKY	23/31	Test PASSES after retries.No significant increase in flaky rate(baseline=0.1098, p0=0.0405, reject_threshold=0.0100. adj_baseline=0.2947, p1=0.3032, trust_threshold=0.5000)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all
VerifyConsumerOffsetsThruUpgrades	test_consumer_group_offsets	{"versions_to_upgrade": 2}	integration	https://buildkite.com/redpanda/redpanda/builds/80231#019c2f21-a61f-4ec3-af8c-faddbfd88b7e	FLAKY	10/11	Test PASSES after retries.No significant increase in flaky rate(baseline=0.0009, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000)	https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=VerifyConsumerOffsetsThruUpgrades&test_method=test_consumer_group_offsets

dotnwat

> likely because of machine/OS/build-type differences.

in what way is it failing?

pgellert · 2026-02-06T09:20:59Z

> in what way is it failing?

With a stack overflow. JSON schemas need to be validated against a JSON metaschema, and we use the jsoncons library to do that for us. Its validation logic walks the schema with a recursion-based DFS, which can trigger a stack overflow when the schema is deeply nested.
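
To illustrate why nesting depth translates into stack usage, here is a small sketch of a recursive depth-first walk with an explicit depth guard. It is not jsoncons code and not the validator used in Redpanda: `schema_node`, `make_nested`, and `walk` are hypothetical names, and the tree is a plain stand-in for a parsed schema.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Stand-in for a parsed schema: each node owns its nested sub-schemas.
struct schema_node {
    std::vector<std::unique_ptr<schema_node>> children;
};

// Builds a single chain of nodes nested `depth` levels deep (root is level 1).
inline std::unique_ptr<schema_node> make_nested(int depth) {
    assert(depth >= 1);
    auto root = std::make_unique<schema_node>();
    schema_node* cur = root.get();
    for (int i = 1; i < depth; ++i) {
        cur->children.push_back(std::make_unique<schema_node>());
        cur = cur->children.back().get();
    }
    return root;
}

// Recursive DFS: every nesting level costs one stack frame, so stack usage
// grows linearly with depth. The guard throws before recursing past
// max_depth instead of letting the stack overflow.
// Returns the deepest level visited.
inline int walk(const schema_node& node, int depth, int max_depth) {
    if (depth > max_depth) {
        throw std::length_error("schema nesting exceeds max_depth");
    }
    int deepest = depth;
    for (const auto& child : node.children) {
        deepest = std::max(deepest, walk(*child, depth + 1, max_depth));
    }
    return deepest;
}
```

How many frames fit before overflow depends on the frame size of the real validator, the thread's stack size, and build settings such as sanitizers, which is consistent with the limit differing between machines.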

pgellert added 2 commits February 5, 2026 18:29

test_utils: move is_on_ci() from license_utils

190ff81

pgellert requested a review from a team February 5, 2026 18:31

pgellert self-assigned this Feb 5, 2026

pgellert requested review from IoannisRP, Copilot and nguyen-andrew and removed request for a team February 5, 2026 18:31

github-actions Bot added area/build area/redpanda labels Feb 5, 2026

Copilot AI reviewed Feb 5, 2026

dotnwat reviewed Feb 6, 2026

pgellert requested a review from dotnwat February 6, 2026 09:21

IoannisRP approved these changes Feb 6, 2026

pgellert merged commit 5e266c2 into redpanda-data:dev Feb 6, 2026
21 checks passed

pgellert merged 2 commits into redpanda-data:dev from pgellert:fix/local-json-recursion-limit

5 participants

