Skip to content

badsyntax/s3-etag

Repository files navigation

S3 ETAG

Build, Test & Publish CodeQL

Generate an accurate S3 ETAG in Node.js for any file (including multipart).

Please Note, this only works for unencrypted buckets.

Installation

npm install s3-etag

Usage

import { generateETag } from 's3-etag';

// Simple MD5 hash of contents for non-multipart files
const etag = generateETag(absoluteFilePath);

// MD5 hash of combined contents & part number (see below) for multipart files
const partSizeInBytes = 10 * 1024 * 1024; // 10mb
const etag = generateETag(absoluteFilePath, partSizeInBytes);

How It Works

This is a Node.js implementation of this algorithm.

At a high level:

  • If no partSizeInBytes is specified, return MD5 hash of file contents
  • If partSizeInBytes is specified:
    • Generate parts by comparing partSizeInBytes to the file size
    • Read each part from the file, MD5 hash the part, and append it to a global combined hash
    • Once all parts are processed, generate a new MD5 from the global combined hash, and suffix with the amount of parts

If the partSizeInBytes is unknown, you can find it by using AWS CLI:

  1. Use head-object to retrieve the object's metadata using the following command, where raw-files is the bucket name, and IMG2345.CR2 is the key aws s3api head-object --bucket raw-files --key IMG2345.CR2 this will return
      {
      "AcceptRanges": "bytes",
      "ContentType": "text/html",
      "LastModified": "2024-05-22T08:45:10+00:00"
      "ContentLength": 28333464,
      "ETag": "\"85ae33db28930d3afe594da14cd190bb-2\"",
      "VersionId": "28P4UkX5sCO.8vbyMojvecHndkHDwf",
      "ContentType": "binary/octet-stream",
      "ServerSideEncryption": "AES256",
      "Metadata": {}
      }
    
  2. Look for a -n at the end of the eTag, where n is a number >= 2, representing the number of parts/chunks. For single part objects, the eTag will simply be the MD5 of the object.
  3. Use aws s3api head-object --bucket raw-files --key IMG2345.CR2 --part-number 1to get the metadata of the first part/chunk. This will return
       {
       "AcceptRanges": "bytes",
       "ContentType": "text/html",
       "LastModified": "2024-05-22T08:45:10+00:00"
       "ContentLength": 17179870,
       "ETag": "\"85ae33db28930d3afe594da14cd190bb-2\"",
       "VersionId": "28P4UkX5sCO.8vbyMojvecHndkHDwf",
       "ContentType": "binary/octet-stream",
       "ServerSideEncryption": "AES256",
       "Metadata": {},
       "PartsCount": 2
       }
    
  4. Use the ContentLength value as the partSizeInBytes

License

See LICENSE.md.

About

Generate an accurate S3 ETAG in Node.js for any file (including multipart)

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •