perf: Use `Arc<str>` for task dependency hashes to avoid heap clones #11962
Merged
anthonyshew merged 12 commits into main from … on Feb 22, 2026
…very

Three targeted optimizations to the turbo run hot path:

1. Engine builder: Cache turbo.json chain per package and move the visited check before the expensive task_definition() call. The chain only depends on the package name, so multiple tasks in the same package reuse the cached result.
2. Task visitor: Defer env() computation to non-dry-run branches. The execution environment is unused during dry runs, avoiding per-task RwLock acquisition and env var map cloning.
3. find_untracked_files: Replace Mutex<Vec> with per-thread local buffers flushed via mpsc channel on drop, eliminating per-file mutex contention in the parallel walker.
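The third item (per-thread buffers flushed through a channel on drop) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: `LocalBuffer` and `collect_untracked` are hypothetical names, and plain threads stand in for the parallel walker.

```rust
use std::path::PathBuf;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Hypothetical per-thread buffer: collects paths locally and sends the whole
/// batch through the channel once, when the buffer is dropped.
struct LocalBuffer {
    items: Vec<PathBuf>,
    tx: Sender<Vec<PathBuf>>,
}

impl LocalBuffer {
    fn new(tx: Sender<Vec<PathBuf>>) -> Self {
        Self { items: Vec::new(), tx }
    }

    fn push(&mut self, path: PathBuf) {
        // No lock is taken here: the Vec is owned by exactly one thread.
        self.items.push(path);
    }
}

impl Drop for LocalBuffer {
    fn drop(&mut self) {
        if !self.items.is_empty() {
            // Ignore the error case where the receiver has already gone away.
            let _ = self.tx.send(std::mem::take(&mut self.items));
        }
    }
}

/// Stand-in for a parallel walker: each worker "discovers" `per_worker` files.
fn collect_untracked(workers: usize, per_worker: usize) -> Vec<PathBuf> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let tx = tx.clone();
            thread::spawn(move || {
                let mut buf = LocalBuffer::new(tx);
                for i in 0..per_worker {
                    buf.push(PathBuf::from(format!("worker{w}/file{i}.txt")));
                }
                // `buf` is dropped here, flushing its batch in a single send.
            })
        })
        .collect();
    // Drop the original sender so `rx` ends once every worker has finished.
    drop(tx);
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let mut all: Vec<PathBuf> = rx.into_iter().flatten().collect();
    all.sort();
    all
}
```

Each worker touches shared state once (the channel send in `drop`) rather than once per file, which is the contention the commit message describes removing.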
Parallelize several sequential phases of turbo run's pre-execution pipeline: dependency resolution, turbo.json loading, and task summary construction. Also reduce per-call allocation overhead in the task hash tracker and gix index classification.
…parallelize-hot-path # Conflicts: # crates/turborepo-scm/src/repo_index.rs
…parallelize-hot-path
Adds profiling visibility to functions that were invisible in --profile output: TLS initialization, rayon-spawned hash tasks, Visitor constructor, Engine scheduler, and per-task cache phases.
…/add-profiling-spans
…/add-profiling-spans
Store task hashes as Arc<str> in TaskHashTrackerState instead of String. In calculate_dependency_hashes, cloning an Arc<str> is a ref count bump instead of a heap allocation. For the api monorepo (1687 tasks, ~3 deps each), this eliminates ~5000 String heap allocations per run.
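As a rough illustration of why this helps, the sketch below stores hashes as `Arc<str>` and collects dependency hashes by cloning the `Arc`. `Tracker` and `dependency_hashes` are simplified, hypothetical stand-ins, not the actual `TaskHashTrackerState` or `calculate_dependency_hashes`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified tracker: task id -> hash.
struct Tracker {
    hashes: HashMap<String, Arc<str>>,
}

impl Tracker {
    fn new() -> Self {
        Self { hashes: HashMap::new() }
    }

    fn insert(&mut self, task: &str, hash: &str) {
        // The hash bytes are allocated once, when the hash is recorded.
        self.hashes.insert(task.to_string(), Arc::from(hash));
    }

    /// Collect the hashes of `deps` in sorted order, skipping unknown tasks.
    /// Each `cloned()` only increments a reference count; the string bytes
    /// are shared with the entry held by the tracker.
    fn dependency_hashes(&self, deps: &[&str]) -> Vec<Arc<str>> {
        let mut out: Vec<Arc<str>> = deps
            .iter()
            .filter_map(|dep| self.hashes.get(*dep).cloned())
            .collect();
        out.sort();
        out
    }
}
```

With `String` values, the same `cloned()` call would allocate and copy a new buffer for every dependency lookup.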
…Summary Removes the .to_string() conversion in the HashTrackerInfo::hash() trait impl by changing the trait return type from Option<String> to Option<Arc<str>>. Also changes SharedTaskSummary.hash to Arc<str> so the Arc flows all the way to serialization without any String allocation.
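A minimal sketch of the shape of this change, assuming a trait that hands out hashes and a summary struct that stores them. `HashInfo`, `SingleTask`, `Summary`, and `summarize` are hypothetical names for illustration only; the real `HashTrackerInfo` trait and `SharedTaskSummary` type have more fields and methods.

```rust
use std::sync::Arc;

/// Hypothetical trait: returns a shared handle to the hash, not an owned String.
trait HashInfo {
    fn hash(&self, task: &str) -> Option<Arc<str>>;
}

/// Hypothetical tracker that knows about exactly one task.
struct SingleTask {
    task: String,
    hash: Arc<str>,
}

impl HashInfo for SingleTask {
    fn hash(&self, task: &str) -> Option<Arc<str>> {
        // Arc::clone bumps the ref count; no `.to_string()` conversion needed.
        (self.task == task).then(|| Arc::clone(&self.hash))
    }
}

/// Hypothetical summary that keeps the Arc as-is until it is written out.
struct Summary {
    task: String,
    hash: Arc<str>,
}

fn summarize(info: &impl HashInfo, task: &str) -> Option<Summary> {
    let hash = info.hash(task)?;
    Some(Summary { task: task.to_string(), hash })
}
```

Because the summary holds the same `Arc<str>` the tracker handed out, the hash bytes are never copied between the tracker and the summary.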
The Arc<str> fields in TaskHashTrackerState and SharedTaskSummary require serde's "rc" feature for Serialize. Previously this worked via transitive feature activation; make it explicit so it doesn't break if the transitive path changes.
github-actions Bot added a commit that referenced this pull request on Feb 22, 2026
## Release v2.8.11-canary.22

Versioned docs: https://v2-8-11-canary-22.turborepo.dev

### Changes

- release(turborepo): 2.8.11-canary.21 (#11961) (`83774bc`)
- perf: Use `Arc<str>` for task dependency hashes to avoid heap clones (#11962) (`56329a6`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>
Summary

- Store task hashes as `Arc<str>` in `TaskHashTrackerState` instead of `String`, so that `calculate_dependency_hashes` clones a ref-counted pointer (atomic increment) instead of heap-allocating a new `String` for each dependency hash lookup.

Why

`queue_task` is called once per task in the topological dispatch loop. For each task, `calculate_dependency_hashes` looks up every dependency's hash and clones it into a `Vec`. On large monorepos like `api` (1687 tasks, ~3 deps each), that's ~5000 `String` heap allocations per run, all for 16-char hex strings that are already stored in the tracker and never mutated.

Switching to `Arc<str>` makes each "clone" a pointer-width copy + atomic ref count increment instead of a heap allocation + memcpy.

Testing

- …`Arc<str>` derefs to `str`).
- …`task_hashable_multiple_dependency_hashes` test with a pinned hash value to guard against serialization regressions.
- …`queue_task` self-time dropped from ~208ms to ~193ms.