perf: Stream file contents during hashing to lower memory usage by anthonyshew · Pull Request #12059 · vercel/turborepo

anthonyshew · 2026-02-28T04:41:32Z

Summary

Both hash_file (gix path) and git_like_hash_file (manual fallback) previously called std::fs::read() / read_to_end(), loading entire files into memory before hashing. When rayon parallelizes hashing across many large files, several whole-file buffers are live at once, which can cause out-of-memory failures in memory-constrained environments.
Now both paths stat the file for its size, write the git blob header into the hasher, then stream through a 64KB BufReader. Peak memory per hash call is bounded regardless of file size.
Hash output is identical — verified by tests comparing against git hash-object.

What changed

crates/turborepo-scm/src/hash_object.rs — hash_file() now uses gix_index::hash::hasher + gix_object::encode::loose_header to build the hasher with the blob header, then streams via BufReader instead of std::fs::read.

crates/turborepo-scm/src/manual.rs — git_like_hash_file() writes the blob header using the file size from metadata, then streams through the sha1::Sha1 hasher in 64KB chunks instead of read_to_end.
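The manual path described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code from the PR: the `Sha1` struct here is a hand-rolled stand-in for the `sha1::Sha1` hasher so the example needs no crates, `git_blob_hash` is a hypothetical name, and a plain 64KB buffer is used where the PR uses a BufReader. The caller is assumed to pass the size obtained from file metadata.

```rust
use std::io::{self, Read};

/// Hand-rolled SHA-1 (stand-in for the `sha1` crate; assumption for this sketch).
struct Sha1 {
    state: [u32; 5],
    block: [u8; 64],
    block_len: usize,
    total_len: u64,
}

impl Sha1 {
    fn new() -> Self {
        Sha1 {
            state: [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0],
            block: [0; 64],
            block_len: 0,
            total_len: 0,
        }
    }

    /// Absorbs bytes incrementally; only one 64-byte block is ever buffered.
    fn update(&mut self, mut data: &[u8]) {
        self.total_len += data.len() as u64;
        while !data.is_empty() {
            let take = (64 - self.block_len).min(data.len());
            self.block[self.block_len..self.block_len + take].copy_from_slice(&data[..take]);
            self.block_len += take;
            data = &data[take..];
            if self.block_len == 64 {
                self.compress();
                self.block_len = 0;
            }
        }
    }

    /// Standard SHA-1 compression of the current 64-byte block.
    fn compress(&mut self) {
        let mut w = [0u32; 80];
        for i in 0..16 {
            let b = &self.block[4 * i..4 * i + 4];
            w[i] = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }
        let [mut a, mut b, mut c, mut d, mut e] = self.state;
        for i in 0..80 {
            let (f, k) = match i {
                0..=19 => ((b & c) | (!b & d), 0x5A827999u32),
                20..=39 => (b ^ c ^ d, 0x6ED9EBA1),
                40..=59 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDC),
                _ => (b ^ c ^ d, 0xCA62C1D6),
            };
            let t = a
                .rotate_left(5)
                .wrapping_add(f)
                .wrapping_add(e)
                .wrapping_add(k)
                .wrapping_add(w[i]);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = t;
        }
        for (s, v) in self.state.iter_mut().zip([a, b, c, d, e].iter()) {
            *s = s.wrapping_add(*v);
        }
    }

    /// Applies SHA-1 padding and returns the digest as lowercase hex.
    fn finalize_hex(mut self) -> String {
        let bit_len = self.total_len.wrapping_mul(8);
        self.update(&[0x80]);
        while self.block_len != 56 {
            self.update(&[0]);
        }
        self.update(&bit_len.to_be_bytes());
        self.state.iter().map(|w| format!("{:08x}", w)).collect()
    }
}

const CHUNK: usize = 64 * 1024;

/// Hashes `reader` as a git blob: header first, then the content in 64KB chunks.
/// `size` must be the exact content length (for a file, taken from its metadata).
fn git_blob_hash<R: Read>(mut reader: R, size: u64) -> io::Result<String> {
    let mut hasher = Sha1::new();
    hasher.update(format!("blob {}\0", size).as_bytes());
    let mut buf = vec![0u8; CHUNK];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finalize_hex())
}
```

Because the header carries the length, the size has to be known before any content is hashed, which is why the file is statted first. Memory use is one 64KB buffer per call, independent of file size.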

Testing

Extended test_blob_hash_matches_git_hash_object with 128KB (multi-buffer) and 64KB (exact-buffer-boundary) cases.
Added test_manual_hash_matches_git_hash_object — the manual path previously had no test verifying hash correctness against git hash-object. This new test covers the same edge cases including streaming buffer boundaries.
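The buffer-boundary cases can also be exercised without git installed. The sketch below is an illustration only, not the tests from the PR, which compare against git hash-object. It assumes hypothetical helpers `stream_blob` and `check_size`, and replaces the hasher with a closure so that the exact bytes fed to it can be compared with the header plus content.

```rust
use std::io::{self, Read};

const BUF_SIZE: usize = 64 * 1024;

/// Feeds the git blob header, then the reader's bytes, to `sink` in chunks of
/// at most BUF_SIZE. Returns how many non-empty chunks were read.
fn stream_blob<R: Read>(
    mut reader: R,
    size: u64,
    mut sink: impl FnMut(&[u8]),
) -> io::Result<usize> {
    sink(format!("blob {}\0", size).as_bytes());
    let mut buf = vec![0u8; BUF_SIZE];
    let mut chunks = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        sink(&buf[..n]);
        chunks += 1;
    }
    Ok(chunks)
}

/// Streams `len` bytes of patterned content and asserts that the sink saw
/// exactly header + content. Returns the chunk count for that size.
fn check_size(len: usize) -> io::Result<usize> {
    let content: Vec<u8> = (0..len).map(|i| (i % 251) as u8).collect();
    let mut streamed = Vec::new();
    let chunks = stream_blob(io::Cursor::new(&content), len as u64, |bytes| {
        streamed.extend_from_slice(bytes)
    })?;
    let mut expected = format!("blob {}\0", len).into_bytes();
    expected.extend_from_slice(&content);
    assert_eq!(streamed, expected, "mismatch at {} bytes", len);
    Ok(chunks)
}
```

With an in-memory reader, `check_size(64 * 1024)` fills the buffer exactly once and `check_size(128 * 1024)` takes two full reads, mirroring the exact-buffer-boundary and multi-buffer cases named above.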

Both hash_file (gix path) and git_like_hash_file (manual fallback) previously read entire files into memory before hashing. For large files hashed in parallel on rayon, this could cause OOM on memory-constrained CI runners. Now both paths stat the file for its size, write the git blob header, then stream through a 64KB BufReader. Peak memory per hash call is bounded regardless of file size. Hash output is identical — verified against git hash-object.

- Propagate metadata error with ? instead of silently falling back to size 0, which would produce an incorrect blob hash header.
- Use std::fs::write in test to correctly write binary content instead of silently writing empty files via str::from_utf8 fallback.
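To illustrate why the size-0 fallback is a problem, here is a small sketch with two hypothetical helpers (neither name comes from the PR). If the stat call fails, the lossy version still emits a header, but one that claims an empty blob, so any hash built on it would be wrong without the caller ever seeing an error.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Builds the git blob header, propagating a failed stat to the caller with `?`.
fn blob_header(path: &Path) -> io::Result<Vec<u8>> {
    let size = fs::metadata(path)?.len();
    Ok(format!("blob {}\0", size).into_bytes())
}

/// Anti-pattern shown for contrast: a failed stat is silently treated as size 0.
fn blob_header_lossy(path: &Path) -> Vec<u8> {
    let size = fs::metadata(path).map(|m| m.len()).unwrap_or(0);
    format!("blob {}\0", size).into_bytes()
}
```

For a path that cannot be statted, `blob_header` returns an `Err`, while `blob_header_lossy` returns the bytes of `blob 0\0`.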

github-actions · 2026-02-28T04:56:13Z

Coverage Report

Metric	Coverage
Lines	81.39%
Functions	53.22%
Branches	0.00%

## Release v2.8.13-canary.8

Versioned docs: https://v2-8-13-canary-8.turborepo.dev

### Changes

- fix: Exclude peer dependencies from workspace external dep resolution (#12050) (`3a75547`)
- test: Port all 15 workspace-configs prysk tests to Rust (#12058) (`55442be`)
- release(turborepo): 2.8.13-canary.7 (#12060) (`495afdc`)
- perf: Stream file contents during hashing to lower memory usage (#12059) (`f03cdce`)
- fix: Treat `npm: alias` dependencies as external, not workspace references (#12061) (`b179cb8`)
- test: Port 18 more prysk tests to Rust (other/ + lockfile-aware-caching/) (#12062) (`7887af2`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>

anthonyshew requested a review from a team as a code owner February 28, 2026 04:41

anthonyshew requested review from tknickman and removed request for a team February 28, 2026 04:41

vercel Bot deployed to Preview – turborepo-test-coverage February 28, 2026 04:41 View deployment

vercel Bot deployed to Preview – turborepo-agents February 28, 2026 04:41 View deployment

vercel Bot reviewed Feb 28, 2026

View reviewed changes

Comment thread crates/turborepo-scm/src/manual.rs Outdated

Comment thread crates/turborepo-scm/src/manual.rs Outdated

vercel Bot deployed to Preview – turborepo-test-coverage February 28, 2026 04:48 View deployment

vercel Bot deployed to Preview – turborepo-agents February 28, 2026 04:48 View deployment

anthonyshew changed the title ~~perf: Stream file contents during hashing to prevent OOM on large repos~~ perf: Stream file contents during hashing to lower memory usage Feb 28, 2026

anthonyshew merged commit f03cdce into main Feb 28, 2026
73 checks passed

anthonyshew deleted the anthonyshew/streaming-file-hash branch February 28, 2026 05:00

github-actions Bot mentioned this pull request Feb 28, 2026

release(turborepo): 2.8.13-canary.8 #12063

Merged

anthonyshew merged 2 commits into main from anthonyshew/streaming-file-hash
