
[CORE-15281] Cloud Storage Clients: Implement batch delete for GCS #29246

Merged
oleiman merged 8 commits into redpanda-data:dev from oleiman:ct/core-15281/multi-part-delete-gcs
Jan 21, 2026

@oleiman

@oleiman oleiman commented Jan 13, 2026

Member

This PR adds support for multipart plural deletes on GCS.

Proof: https://buildkite.com/redpanda/redpanda/builds/78997#019bba33-c39b-473f-9cd0-01a8424fd158

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

Improvements

  • Add batch delete support for GCS cloud storage clients

…ndler

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
@oleiman oleiman force-pushed the ct/core-15281/multi-part-delete-gcs branch from ec84329 to 76680d3 Compare January 14, 2026 00:04
@oleiman oleiman marked this pull request as ready for review January 14, 2026 00:07
Copilot AI review requested due to automatic review settings January 14, 2026 00:07

Copilot AI left a comment

Contributor

Pull request overview

This PR implements batch delete support for Google Cloud Storage (GCS) in Redpanda's cloud storage clients. GCS does not support S3-style batch delete operations, so the implementation uses GCS's native batch API with the multipart/mixed format.

Changes:

  • Added GCS batch delete API implementation using multipart/mixed format
  • Modified multipart response parser to handle empty response bodies
  • Updated tests to validate batch delete behavior for success, partial errors, and edge cases
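To make the multipart/mixed format concrete, here is a minimal sketch of how a GCS batch delete body could be assembled. It is not the code from this PR: `percent_encode` and `make_batch_delete_body` are hypothetical helpers, the `Content-ID` values are illustrative, and the sketch assumes each part wraps a `DELETE /storage/v1/b/<bucket>/o/<object>` request with the body sent to the `/batch/storage/v1` endpoint using `Content-Type: multipart/mixed; boundary=<boundary>`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: percent-encode an object name so it can be used as a
// single URL path segment (everything except unreserved characters).
std::string percent_encode(std::string_view in) {
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                                || (c >= '0' && c <= '9') || c == '-'
                                || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}

// Hypothetical helper: build a multipart/mixed body with one nested DELETE
// request per key, terminated by the closing boundary.
std::string make_batch_delete_body(
  std::string_view boundary,
  std::string_view bucket,
  const std::vector<std::string>& keys) {
    std::string body;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        body += "--";
        body += boundary;
        body += "\r\n";
        body += "Content-Type: application/http\r\n";
        body += "Content-ID: <" + std::to_string(i) + ">\r\n\r\n";
        body += "DELETE /storage/v1/b/";
        body += bucket;
        body += "/o/" + percent_encode(keys[i]) + " HTTP/1.1\r\n\r\n";
    }
    body += "--";
    body += boundary;
    body += "--\r\n";
    return body;
}
```

Each nested request has no body of its own, which is why every part ends right after the blank line that follows the request line.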

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/v/http/tests/utils.h | Added content_type_overrides parameter to flexible_function_handler |
| src/v/http/tests/utils.cc | Implemented content_type_overrides handling in response processing |
| src/v/cloud_storage_clients/util.cc | Fixed multipart parser to accept responses with multiple trailing CRLFs |
| src/v/cloud_storage_clients/tests/util_test.cc | Added test for multipart parser with empty body case |
| src/v/cloud_storage_clients/tests/gcs_client_test.cc | Added comprehensive test suite for GCS batch delete operations |
| src/v/cloud_storage_clients/tests/BUILD | Added build target for gcs_client_test |
| src/v/cloud_storage_clients/s3_client.h | Added method declarations for GCS batch delete |
| src/v/cloud_storage_clients/s3_client.cc | Implemented GCS batch delete request creation and response parsing |
| src/v/cloud_storage_clients/BUILD | Added rapidjson dependency |
| src/v/cloud_storage/tests/remote_test.cc | Updated tests to validate GCS batch delete behavior |
| src/v/cloud_io/tests/s3_imposter.h | Added content_type_overrides parameter and batch delete parsing function |
| src/v/cloud_io/tests/s3_imposter.cc | Implemented batch delete request handling in test imposter |
| src/v/cloud_io/remote.cc | Set GCS batch delete limit to 100 keys per request |

Comment thread src/v/cloud_storage_clients/tests/gcs_client_test.cc Outdated
Comment thread src/v/cloud_storage_clients/tests/gcs_client_test.cc Outdated
Comment thread src/v/cloud_storage_clients/s3_client.cc Outdated
@oleiman oleiman force-pushed the ct/core-15281/multi-part-delete-gcs branch from 76680d3 to cb8941e Compare January 14, 2026 00:13
@vbotbuildovich

vbotbuildovich commented Jan 14, 2026

Collaborator

CI test results

test results on build#78994

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ControllerForcedReconfiguration_Size5 | test_cluster_recovery | {"scenario": "Simple"} | integration | https://buildkite.com/redpanda/redpanda/builds/78994#019bb9e9-e92b-493d-a216-74abb7a918ac | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0350, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1013, p1=0.3438, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ControllerForcedReconfiguration_Size5&test_method=test_cluster_recovery |
| DataMigrationsMultiClusterTest | test_with_consumer_groups | null | integration | https://buildkite.com/redpanda/redpanda/builds/78994#019bb9e9-e929-4e4b-b4c9-6452a54aa3fe | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DataMigrationsMultiClusterTest&test_method=test_with_consumer_groups |
| WriteCachingFailureInjectionE2ETest | test_crash_all | {"use_transactions": false} | integration | https://buildkite.com/redpanda/redpanda/builds/78994#019bb9f2-7e7d-4c7a-8713-9b71198f6c8c | FLAKY | 9/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.1053, p0=0.6714, reject_threshold=0.0100. adj_baseline=0.2839, p1=0.1761, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all |

test results on build#79009

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ControllerForcedReconfiguration_Size5 | test_cluster_recovery | {"scenario": "Simple"} | integration | https://buildkite.com/redpanda/redpanda/builds/79009#019bbb38-a2e9-49f1-b765-22eab4bf54f5 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0323, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ControllerForcedReconfiguration_Size5&test_method=test_cluster_recovery |

test results on build#79075

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NodesDecommissioningTest | test_decommission_status | null | integration | https://buildkite.com/redpanda/redpanda/builds/79075#019bc055-e779-48bf-a0dc-4871c2863082 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0399, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1150, p1=0.2946, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_decommission_status |

test results on build#79354

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RedpandaKerberosTest | test_init | {"acl": false, "fail": false, "req_principal": "client", "topics": ["always_visible"]} | integration | https://buildkite.com/redpanda/redpanda/builds/79354#019bddf8-db48-4447-a3f7-41f868dff189 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate (baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=RedpandaKerberosTest&test_method=test_init |

@oleiman

oleiman commented Jan 14, 2026

Member Author

CI Failure: ducktape-release failed to generate final report, but build looks clean

@oleiman

oleiman commented Jan 14, 2026

Member Author

/cdt
rp_version=build
provider=gcp
region=us-west2
cdt_instance_type=n2d-standard-4
tests/rptest/tests/cloud_topics/l0_gc_test.py

Comment thread src/v/cloud_storage_clients/s3_client.cc Outdated

@dotnwat dotnwat left a comment

Member

LGTM!

Comment thread src/v/cloud_io/tests/s3_imposter.cc Outdated

return R"xml(<DeleteResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"></DeleteResult>)xml";
} else if (
request._method == "POST" && request._url.contains("batch")) {

Member

request._url.contains("batch")

how is POST + url.contains('batch') precise enough to identify a batch delete request?

Member Author

just happens to be the case. better to use the whole "/batch/storage/v1" though
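As a rough illustration of the stricter check suggested above, the sketch below matches on the request method plus the full batch path rather than any URL that contains "batch". `is_gcs_batch_request` is a hypothetical helper, not the imposter's actual code, and it assumes the URL passed in is a path with an optional query string (no scheme or host).

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true only for a POST whose path is exactly the GCS
// batch endpoint. Assumes `url` is a path plus optional query string.
bool is_gcs_batch_request(std::string_view method, std::string_view url) {
    constexpr std::string_view batch_path = "/batch/storage/v1";
    // Drop the query string, if any, before comparing the path.
    const auto path = url.substr(0, url.find('?'));
    return method == "POST" && path == batch_path;
}
```

With this shape, a POST to an object key that merely contains the word "batch" is no longer mistaken for a batch delete request.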

Comment thread src/v/cloud_io/tests/s3_imposter.cc Outdated
"Content-Type: application/http\r\n"
"Content-ID: response-{}\r\n\r\n"
"HTTP/1.1 {}\r\n"
"X-GUploader-UploadID: test-upload-id-{}\r\n\r\n",

Member

is s3_imposter GCS specific?

Member Author

no, but this endpoint is. incidentally, I meant to remove all of these guploader headers. they wouldn't realistically appear in a batch delete response

Comment thread src/v/cloud_storage_clients/s3_client.h Outdated
@oleiman oleiman force-pushed the ct/core-15281/multi-part-delete-gcs branch from 9b47476 to e4b52fb Compare January 15, 2026 06:00
Very crude request parsing, but it works well enough for existing tests.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
@oleiman oleiman force-pushed the ct/core-15281/multi-part-delete-gcs branch from e4b52fb to 55e475e Compare January 15, 2026 19:05
@oleiman oleiman requested review from Copilot and dotnwat January 15, 2026 19:06

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

src/v/cloud_storage_clients/tests/gcs_client_test.cc:1

  • The format string has 4 placeholders but 5 arguments are provided (boundary, i, code, i). The extra 'i' at the end (line 432) will be ignored by fmt::format, which may indicate a copy-paste error or incorrect format string.
/*

@oleiman

oleiman commented Jan 15, 2026

Member Author

/cdt
rp_version=build
provider=gcp
region=us-west2
cdt_instance_type=n2d-standard-4
tests/rptest/tests/cloud_topics/l0_gc_test.py

@oleiman

oleiman commented Jan 16, 2026

Member Author

/cdt
provider=gcp
region=us-west2
cdt_instance_type=n2d-standard-4
tests/rptest/tests/cloud_topics/l0_gc_test.py

@oleiman

oleiman commented Jan 20, 2026

Member Author

@Lazin @dotnwat @rockwotj @nvartolomei - Bump for reviews. TIA!

@dotnwat

dotnwat commented Jan 20, 2026

Member

> @Lazin @dotnwat @rockwotj @nvartolomei - Bump for reviews. TIA!

Thanks! Will get to this today

dotnwat previously approved these changes Jan 20, 2026
Comment thread src/v/cloud_storage_clients/util.cc
Comment on lines +670 to +671
if (cfg.is_gcs) {
return ss::make_shared<gcs_client>(

Member

😎

Comment thread src/v/cloud_storage_clients/s3_client.cc Outdated
Comment thread src/v/cloud_storage_clients/s3_client.cc
This commit relaxes the line break accounting in multipart_response_parser::get
to accept arbitrarily many CRLFs preceding the next boundary.

A subresponse with a non-empty body looks like this:

- header CRLF
- CRLF
- body CRLF
- CRLF
- boundary

Previously, the multipart response parser expected to find _exactly_ two CRLF
sequences preceding the next boundary. However, if the subresponse body is
empty, then we get the following:

- header CRLF
- CRLF
- CRLF
- boundary

Since the boundary is then preceded by three CRLF sequences, the parser would
treat the subresponse as malformed and stop, discarding the rest of the outer
multipart response body.

Note that a subresponse body could be anything at all, including some string
ending in one or more CRLFs.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
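
The relaxed behavior described in this commit message can be sketched as follows. This is not the `multipart_response_parser` from the PR: `split_parts` is a hypothetical, simplified splitter that assumes the boundary string never occurs inside a part, and it strips every trailing CRLF from a part, which would also drop CRLFs that genuinely belong to a body.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch: split a multipart body into parts, tolerating any
// number of CRLF sequences between a part and the next boundary.
std::vector<std::string>
split_parts(std::string_view body, std::string_view boundary) {
    const std::string delim = "--" + std::string(boundary);
    std::vector<std::string> parts;
    auto pos = body.find(delim);
    while (pos != std::string_view::npos) {
        auto start = pos + delim.size();
        // "--" directly after the delimiter marks the closing boundary.
        if (body.substr(start, 2) == "--") {
            break;
        }
        // Skip the CRLF that terminates the boundary line itself.
        if (body.substr(start, 2) == "\r\n") {
            start += 2;
        }
        const auto next = body.find(delim, start);
        if (next == std::string_view::npos) {
            break; // truncated input: no further boundary
        }
        auto part = body.substr(start, next - start);
        // Accept two, three, or more CRLFs before the boundary.
        while (part.size() >= 2 && part.substr(part.size() - 2) == "\r\n") {
            part.remove_suffix(2);
        }
        parts.emplace_back(part);
        pos = next;
    }
    return parts;
}
```

A part with an empty body (three CRLFs before the boundary) and a part with a non-empty body (two CRLFs) are both returned, so one empty subresponse no longer causes the rest of the outer response to be discarded.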
Infer cloud storage backend at config creation time, then use that to choose
between s3_client & gcs_client in the client_pool.

At this stage gcs_client is just a thin extension of s3_client w/ no added
functionality. A subsequent commit will add an override for GCS-native
multipart batch deletes.
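
The client split described here might look roughly like the sketch below. The types are simplified stand-ins that reuse the names from this PR for readability only: `client_config`, `delete_strategy`, and `make_client` are hypothetical, and `std::make_shared` stands in for the `ss::make_shared` call used in the real factory.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-in for the client configuration; only the backend flag
// inferred at config creation time is modeled.
struct client_config {
    bool is_gcs = false;
};

// Stand-in base client. `delete_strategy` is a hypothetical method used only
// to show which implementation was selected.
class s3_client {
public:
    virtual ~s3_client() = default;
    virtual std::string delete_strategy() const {
        return "s3-style plural delete";
    }
};

// Thin extension that overrides only the batch delete behavior.
class gcs_client : public s3_client {
public:
    std::string delete_strategy() const override {
        return "gcs-native batch delete";
    }
};

// Choose the concrete client from the config, as the pool would.
std::shared_ptr<s3_client> make_client(const client_config& cfg) {
    if (cfg.is_gcs) {
        return std::make_shared<gcs_client>();
    }
    return std::make_shared<s3_client>();
}
```

Keeping `gcs_client` as a subclass means callers continue to hold an `s3_client` handle and only the overridden delete path differs.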
This is only needed for GCS, so various identifiers will reflect that.

Includes json response body parsing to get GCS-native error reasons.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
- Wire the batch request limit (100) into remote.cc
- Update cloud_io/remote_test

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
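
To illustrate what a per-request limit of 100 keys implies for callers, here is a minimal sketch that splits a key list into batches. `chunk_keys` is a hypothetical helper, not the code in remote.cc; only the limit of 100 comes from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split keys into consecutive batches of at most
// `limit` entries, one batch per delete request.
std::vector<std::vector<std::string>>
chunk_keys(const std::vector<std::string>& keys, std::size_t limit = 100) {
    std::vector<std::vector<std::string>> batches;
    if (limit == 0) {
        return batches; // nothing sensible to do with a zero limit
    }
    for (std::size_t i = 0; i < keys.size(); i += limit) {
        const std::size_t end = std::min(keys.size(), i + limit);
        batches.emplace_back(
          keys.begin() + static_cast<std::ptrdiff_t>(i),
          keys.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

For example, 250 keys would be sent as three requests carrying 100, 100, and 50 keys.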
Bit simpler than retrofitting s3_client_test.

Just flexing the batch delete code.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
@oleiman oleiman force-pushed the ct/core-15281/multi-part-delete-gcs branch from 55e475e to b24f8d2 Compare January 21, 2026 00:10
@oleiman oleiman requested a review from dotnwat January 21, 2026 00:12
@oleiman

oleiman commented Jan 21, 2026

Member Author

@Lazin @nvartolomei - I'm going to merge this on the basis that it's quite similar to the ABS version (modulo gcs_client split, which is trivial). Feel free to comment here if you get around to it and have concerns.

@oleiman oleiman merged commit a781320 into redpanda-data:dev Jan 21, 2026
19 checks passed
@nvartolomei nvartolomei mentioned this pull request Apr 17, 2026
7 tasks