use a consistent part size so that ETags are predictable #7

Closed
@andrewrk

Description

S3 uses this algorithm for ETags for multipart uploads:

  1. Compute the MD5 sum of each part
  2. Compute the MD5 sum of the concatenation of the MD5 sum digests of each part
  3. The ETag is the hex digest of that, plus '-', plus the part count (see the sketch below)
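
In other words, something like this minimal Python sketch (`multipart_etag` is a hypothetical name, not an existing API, and it assumes every part except the last is exactly `part_size` bytes):

```python
import hashlib

def multipart_etag(path, part_size):
    # 1. MD5 each fixed-size part of the file.
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            digests.append(hashlib.md5(chunk).digest())
    # 2. MD5 the concatenation of the per-part binary digests.
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    # 3. The ETag is that hex digest, '-', and the part count.
    return "{}-{}".format(combined, len(digests))
```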

Sadly this means that when you want to check whether a local file matches an S3 object that was uploaded via multipart, you must also know what size each part was, and this information is not available in S3 metadata.

One way to mitigate this problem is to use consistent part sizes when uploading files. For example, if I set maxPartSize to 5MB, then each part uploaded to S3 should be exactly 5MB, except for the last one. Currently the code flushes a part once its size is slightly above maxPartSize, so part sizes vary and client-side ETag calculation is impossible.
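
With exact part flushing, verification would reduce to recomputing the ETag with the known part size. A sketch building on the hypothetical `multipart_etag` above (the function name and the 5MB default are assumptions, not this project's API):

```python
def matches_s3_etag(path, s3_etag, part_size=5 * 1024 * 1024):
    # S3 reports ETags wrapped in double quotes, e.g. '"9b2cf...-3"'.
    return multipart_etag(path, part_size) == s3_etag.strip('"')
```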

Note that s3cmd, which by default does 15MB multipart uploads, behaves the way I am describing: each part is exactly 15MB, except for the last one.
