Add validations for upload in S3 multipart client #6282

Open. Wants to merge 2 commits into base: master

@zoewangg (Contributor) commented Jul 20, 2025

Motivation and Context

The S3 multipart upload client lacked proper validation of upload parameters, which could lead to runtime failures or incorrect behavior when publishers provide malformed data. This addresses potential issues where:
• AsyncRequestBody parts lack content length information
• Individual parts exceed the configured part size
• The actual number of parts doesn't match expected calculations
• Publisher implementations continue sending data after completion

Changes Made

Validation Enhancements

• Content Length Validation: Ensure all AsyncRequestBody parts have content length present
• Part Size Validation: Verify each part (except the last) matches the expected part size exactly
• Last Part Size Validation: Validate the final part size matches the calculated remainder or full part size
• Part Count Validation: Ensure the total number of parts received matches the expected count
• Publisher State Guards: Add isDone checks in onNext to prevent processing after completion
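
The content length and part size rules listed above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: the class name `PartValidationSketch`, the method names, and the use of `IllegalStateException` are assumptions made for the example, and the content length is modeled as an `Optional<Long>`.

```java
import java.util.Optional;

// Illustrative only: names and exception type are assumptions, not the SDK's implementation.
final class PartValidationSketch {

    // The last part carries the remainder, or a full part when totalSize divides evenly.
    static long expectedLastPartSize(long totalSize, long partSize) {
        long remainder = totalSize % partSize;
        return remainder == 0 ? partSize : remainder;
    }

    // Rejects a part with no content length or with a size other than the expected one.
    static void validatePart(Optional<Long> contentLength, int partNumber,
                             int expectedNumParts, long totalSize, long partSize) {
        long actual = contentLength.orElseThrow(
            () -> new IllegalStateException("Content length is missing on part " + partNumber));

        boolean isLastPart = partNumber == expectedNumParts;
        long expected = isLastPart ? expectedLastPartSize(totalSize, partSize) : partSize;

        if (actual != expected) {
            throw new IllegalStateException(String.format(
                "Part %d has size %d but %d was expected", partNumber, actual, expected));
        }
    }
}
```

For a 10-byte upload with a 4-byte part size, parts 1 and 2 must be exactly 4 bytes and part 3 must be 2 bytes; any other size, or a missing length, is rejected.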

Code Structure Improvements

• Enhanced MpuRequestContext: Added expectedNumParts field to track anticipated part count
• Improved Error Handling: Better exception chaining in MultipartUploadHelper.failRequestsElegantly
• Consistent Naming: Renamed variables for clarity (contentLength → totalSize, partCount → expectedNumParts)

Testing

Added tests

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@zoewangg zoewangg requested a review from a team as a code owner July 20, 2025 18:43
@zoewangg zoewangg requested a review from Copilot July 20, 2025 18:45

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request closes validation gaps in the S3 multipart client by implementing additional safety checks for multipart upload operations. The changes enhance error handling and prevent potential issues with malformed uploads by validating content lengths, part sizes, and part counts before attempting uploads.

  • Add content length and part size validation for AsyncRequestBody objects in multipart uploads
  • Implement checks to ensure the number of parts matches expected values
  • Add isDone guards to prevent processing after completion in subscriber implementations
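
An isDone guard of the kind described in the last bullet might look like the sketch below. It uses a plain class standing in for a Reactive Streams subscriber; `GuardedSubscriberSketch` and its members are hypothetical names, and the real subscribers in this PR do considerably more work per part.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a subscriber; not the SDK's subscriber classes.
final class GuardedSubscriberSketch {
    // volatile so a completion signal from another thread is visible to onNext
    private volatile boolean isDone;
    private final List<String> processed = new ArrayList<>();

    void onNext(String part) {
        if (isDone) {
            // A misbehaving publisher kept emitting after completion: ignore the item.
            return;
        }
        processed.add(part);
    }

    void onComplete() {
        isDone = true;
    }

    void onError(Throwable t) {
        isDone = true;
    }

    List<String> processed() {
        return processed;
    }
}
```

With this guard, an item delivered after `onComplete` or `onError` is dropped instead of starting another upload.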

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| UploadWithUnknownContentLengthHelper.java | Adds validation for content length presence, part size limits, and expected part count matching |
| UploadWithKnownContentLengthHelper.java | Updates multipart request context to include expected number of parts |
| KnownContentLengthAsyncRequestBodySubscriber.java | Implements comprehensive part validation including content length, part size, and last part size checks |
| MpuRequestContext.java | Adds expectedNumParts field with validation to the multipart request context |
| MultipartUploadHelper.java | Enhances error handling with better exception wrapping |
| Test files | Add comprehensive test coverage for the new validation scenarios |
| .changes/next-release/bugfix-AmazonS3-6522f77.json | Documents the bug fix in the changelog |

int expectedNumPart = genericMultipartHelper.determinePartCount(totalLength, partSizeInBytes);
if (parts.length != expectedNumPart) {
SdkClientException exception = SdkClientException.create(
String.format("The number of UploadParts requests is not equal to the expected number of parts. "

Copilot AI Jul 20, 2025

The error message uses 'UploadParts requests' which should be 'upload part requests' for consistency with other error messages in the codebase.

Suggested change
- String.format("The number of UploadParts requests is not equal to the expected number of parts. "
+ String.format("The number of upload part requests is not equal to the expected number of parts. "
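
For context, the part count check this message belongs to can be sketched as below. `expectedPartCount` is a hypothetical stand-in for the `determinePartCount` call in the excerpt, assumed here to be a ceiling division of the total length by the part size; the exception type is also an assumption.

```java
// Illustrative only: a stand-in for the part count check, not the SDK's code.
final class PartCountSketch {

    // Ceiling division: a trailing partial part still counts as one part.
    static int expectedPartCount(long totalLength, long partSizeInBytes) {
        return (int) ((totalLength + partSizeInBytes - 1) / partSizeInBytes);
    }

    // Fails when the number of parts actually sent differs from the expected count.
    static void validatePartCount(int actualNumParts, long totalLength, long partSizeInBytes) {
        int expected = expectedPartCount(totalLength, partSizeInBytes);
        if (actualNumParts != expected) {
            throw new IllegalStateException(String.format(
                "The number of upload part requests (%d) is not equal to the expected number of parts (%d).",
                actualNumParts, expected));
        }
    }
}
```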

@zoewangg zoewangg force-pushed the zoewang/validationsMultipartS3Client-upload branch from e843444 to 0144760 Compare July 20, 2025 19:00
@joviegas (Contributor)

  1. How are these validations handled by the S3 CRT client?
  2. Do we have, or should we write, a test suite that runs against both the S3 multipart client and the S3 CRT client to check that their behavior is the same for these validations?

return;
}

if (existingParts.containsKey(partNumber.get())) {
partNumber.getAndIncrement();
int currentPartNum = partNumber.getAndIncrement();
Contributor

Can you please help me understand why we previously called containsKey on the result of get(), whereas now we increment first and then do the containsKey check?

Contributor Author

I don't know why we did that earlier, but I changed it to avoid another atomic integer get call (micro perf optimization).
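
The two shapes under discussion can be compared in isolation. `AtomicInteger.getAndIncrement()` returns the value held before the increment, so the value checked is the same in both cases. The sketch below is a simplified reading of the diff excerpt, with the hypothetical class `PartNumberSketch` standing in for the subscriber's part-number handling.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: simplified from the diff excerpt, not the subscriber's full logic.
final class PartNumberSketch {

    // Earlier shape: read with get(), then increment in a second atomic call.
    static int readThenIncrement(AtomicInteger partNumber) {
        int current = partNumber.get();
        partNumber.getAndIncrement();
        return current;
    }

    // New shape: a single atomic call that returns the pre-increment value.
    static int incrementOnce(AtomicInteger partNumber) {
        return partNumber.getAndIncrement();
    }
}
```

In single-threaded use both methods return the same part number and leave the counter one higher; the new shape simply does it with one atomic operation.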

@@ -179,6 +178,49 @@ public void onNext(AsyncRequestBody asyncRequestBody) {
subscription.request(1);
}

private void validatePart(AsyncRequestBody asyncRequestBody, int currentPartNum) {
Contributor

I see we already have JUnit tests for each class. However, I was wondering how we end up with the validation failures below from an external API perspective, such as when users pass invalid AsyncRequestBody or files get corrupted in transit. Is it possible to write end-to-end test cases for these scenarios so that we can test them with UnknownContentLength publisher or with S3CrtClient?

Contributor Author

Yup, good point, added

@zoewangg (Contributor Author)

> How are these validations handled by the S3 CRT client?
> Do we have, or should we write, a test suite that runs against both the S3 multipart client and the S3 CRT client to check that their behavior is the same for these validations?

The CRT team is working on adding the same validations on the CRT side. I'm not sure we want to add tests on our side for the CRT-based S3 client, since I'd expect those tests to be added on their side.

@zoewangg zoewangg force-pushed the zoewang/validationsMultipartS3Client-upload branch from e1ad65a to c64fc4c Compare July 22, 2025 21:48

Quality Gate failed

Failed conditions
77.2% Coverage on New Code (required ≥ 80%)
C Reliability Rating on New Code (required ≥ A)

2 participants