perf: Parallelize turbo run pre-execution hot path #11958

Merged
anthonyshew merged 5 commits into main from perf/parallelize-hot-path on Feb 22, 2026

@anthonyshew anthonyshew commented Feb 22, 2026

Summary

Parallelizes several sequential phases of turbo run's pre-execution pipeline and reduces per-call allocation overhead. Individual functions show significant improvement in --profile traces, though end-to-end wall-clock improvement is within noise on hyperfine benchmarks due to uninstrumented overhead (stdout serialization, daemon negotiation) dominating total runtime.

Changes

Parallel to_summary task construction (tracker.rs)

The loop that builds TaskSummary structs was sequential. On a large repo with ~1700 tasks this was ~92ms. Moved to rayon::par_iter. Each task_summary() call is read-only on the engine, hash tracker (RwLock read), and package graph.

Profile: to_summary 92ms → 10ms
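
The pattern above can be sketched with only the standard library. This is a minimal illustration, not the code in tracker.rs: `HashTracker`, `TaskSummary`, `task_summary`, and the `workers` parameter are hypothetical stand-ins, and `std::thread::scope` is used in place of rayon so the example has no external dependencies. The point it shows is that per-task work which only takes read locks can be fanned out across threads without changing the output order.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::thread;

// Hypothetical stand-in for the real hash tracker.
struct HashTracker {
    hashes: RwLock<HashMap<String, String>>,
}

#[derive(Debug, PartialEq)]
struct TaskSummary {
    task_id: String,
    hash: Option<String>,
}

// Read-only per-task work: takes a read lock, never a write lock.
fn task_summary(tracker: &HashTracker, task_id: &str) -> TaskSummary {
    let hashes = tracker.hashes.read().unwrap();
    TaskSummary {
        task_id: task_id.to_string(),
        hash: hashes.get(task_id).cloned(),
    }
}

// Splits the task list into chunks and builds summaries on scoped threads.
// Output order matches input order because chunks are joined in sequence.
fn to_summary(tracker: &HashTracker, task_ids: &[String], workers: usize) -> Vec<TaskSummary> {
    let workers = workers.max(1);
    // chunks() panics on a size of 0, so clamp to at least 1.
    let chunk_size = ((task_ids.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = task_ids
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|id| task_summary(tracker, id))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

With rayon the chunking and joining would collapse into a single parallel iterator chain; the read-only requirement on shared state is the same either way.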

Parallel turbo.json preloading (loader.rs, builder.rs)

The engine builder loaded each package's turbo.json lazily — sequentially on first access. Added preload_all() that reads and parses all package turbo.json files in parallel via rayon before the engine builder needs them. The FixedMap cache uses OnceLock per key, so concurrent loads are non-blocking.

Profile: build_engine 74ms → 43ms
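
A rough sketch of a per-key `OnceLock` cache with a parallel warm-up step follows. The `FixedMap` here is a hypothetical reimplementation for illustration only (the real type's fields and methods are not shown in this PR description), the loader closure stands in for reading and parsing a turbo.json, and scoped threads stand in for the rayon pool.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;
use std::thread;

// Hypothetical fixed-key cache: the key set is decided up front, and each
// slot is a OnceLock that is initialized at most once.
struct FixedMap {
    slots: HashMap<String, OnceLock<String>>,
}

impl FixedMap {
    fn new(keys: &[&str]) -> Self {
        let slots = keys
            .iter()
            .map(|k| (k.to_string(), OnceLock::new()))
            .collect();
        FixedMap { slots }
    }

    // Returns the cached value, running `load` only if the slot is empty.
    // Unknown keys return None.
    fn get_or_load(&self, key: &str, load: impl FnOnce() -> String) -> Option<&String> {
        self.slots.get(key).map(|slot| slot.get_or_init(load))
    }

    // Warms every slot on its own scoped thread (a stand-in for a rayon
    // pool; one thread per key is only reasonable for a small sketch).
    fn preload_all(&self, load: impl Fn(&str) -> String + Sync) {
        let load = &load;
        thread::scope(|s| {
            for (key, slot) in &self.slots {
                s.spawn(move || {
                    slot.get_or_init(|| load(key.as_str()));
                });
            }
        });
    }
}
```

Because each key has its own slot, loads for different packages do not contend with one another, and a later lazy `get_or_load` for a preloaded key returns the cached value without calling the loader again.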

Parallel connect_internal_dependencies (builder.rs, dep_splitter.rs)

Dependencies::new calls that resolve internal vs external deps were sequential. Each call is read-only on the workspaces map, so moved to rayon::par_iter. Also hoisted package_manager.link_workspace_packages() (which reads a config file from disk for pnpm/Berry) above the parallel loop so it's computed once instead of N times.

Profile: connect_internal_dependencies 52ms → 24ms
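
The hoisting described above might look roughly like the following. Everything here is an assumption made for illustration: the `Dependencies` fields, `split_deps`, the `read_link_setting` closure (standing in for `package_manager.link_workspace_packages()`), and the rule used to classify a dependency as internal. Scoped threads again replace rayon.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::thread;

// Hypothetical result of splitting one package's dependencies.
#[derive(Debug, Default, PartialEq)]
struct Dependencies {
    internal: BTreeSet<String>,
    external: BTreeSet<String>,
}

// Read-only classification (assumed rule): a dependency is internal when it
// names a workspace package and workspace linking is enabled.
fn split_deps(
    workspaces: &BTreeSet<String>,
    link_workspaces: bool,
    deps: &[String],
) -> Dependencies {
    let mut out = Dependencies::default();
    for dep in deps {
        if link_workspaces && workspaces.contains(dep) {
            out.internal.insert(dep.clone());
        } else {
            out.external.insert(dep.clone());
        }
    }
    out
}

fn connect_internal_dependencies(
    workspaces: &BTreeSet<String>,
    manifests: &BTreeMap<String, Vec<String>>,
    read_link_setting: impl FnOnce() -> bool,
) -> BTreeMap<String, Dependencies> {
    // Hoisted: the (potentially disk-backed) setting is read once here,
    // not once per package inside the parallel section.
    let link_workspaces = read_link_setting();
    thread::scope(|s| {
        let handles: Vec<_> = manifests
            .iter()
            .map(|(name, deps)| {
                let handle = s.spawn(move || split_deps(workspaces, link_workspaces, deps));
                (name.clone(), handle)
            })
            .collect();
        handles
            .into_iter()
            .map(|(name, h)| (name, h.join().unwrap()))
            .collect()
    })
}
```

Taking the setting as `FnOnce` makes the "computed once" property part of the signature: the closure cannot be called from inside the per-package work.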

Faster hex encoding in gix index (repo_index.rs)

Replaced e.id.to_hex().to_string() (which goes through HexDisplay's Display::fmt and produces a heap String) with hex::encode_to_slice into a stack buffer, skipping the intermediate allocation.
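
To illustrate the idea of encoding into a stack buffer, here is a dependency-free sketch. It hand-rolls the nibble lookup rather than calling `hex::encode_to_slice`, and it assumes a 20-byte SHA-1 object id; `encode_oid` is a hypothetical helper name, not a function from repo_index.rs.

```rust
const HEX: &[u8; 16] = b"0123456789abcdef";

// Writes the lowercase hex form of a 20-byte object id into a
// caller-provided stack buffer and returns it as &str, with no heap
// allocation.
fn encode_oid<'a>(id: &[u8; 20], buf: &'a mut [u8; 40]) -> &'a str {
    for (i, &byte) in id.iter().enumerate() {
        buf[2 * i] = HEX[(byte >> 4) as usize];
        buf[2 * i + 1] = HEX[(byte & 0x0f) as usize];
    }
    // Every byte written above is ASCII, so this conversion cannot fail.
    std::str::from_utf8(&buf[..]).expect("hex digits are valid UTF-8")
}
```

The caller declares `let mut buf = [0u8; 40];` on its own stack frame and passes `&mut buf`, so the returned `&str` lives exactly as long as that buffer.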

Reduce find_untracked_files allocations (repo_index.rs)

Eliminated the Vec<String> + Arc that cloned every RepoStatusEntry.path for binary search. status_entries is pre-sorted before the call, and walker threads binary search directly on the borrowed &[RepoStatusEntry] slice.
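
A small sketch of searching the borrowed slice directly, assuming the entries are sorted by path. The `RepoStatusEntry` fields shown and the `find_status` helper are hypothetical; only the use of `binary_search_by` on `&[RepoStatusEntry]` reflects the description above.

```rust
// Hypothetical shape of a status entry; only `path` matters for lookup.
#[derive(Debug)]
struct RepoStatusEntry {
    path: String,
    is_delete: bool,
}

// Looks up `path` in entries that the caller has already sorted by path.
// No paths are cloned: the comparison borrows each entry's String as &str.
fn find_status<'a>(sorted: &'a [RepoStatusEntry], path: &str) -> Option<&'a RepoStatusEntry> {
    sorted
        .binary_search_by(|entry| entry.path.as_str().cmp(path))
        .ok()
        .map(|idx| &sorted[idx])
}
```

A shared `&[RepoStatusEntry]` can be handed to any number of reader threads as long as the slice outlives them, which is why no `Arc` is needed when the threads are scoped.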

Reduce TaskHashTracker per-call overhead (lib.rs)

Changed external_deps_hash_cache from HashMap<PackageName, String> to HashMap<String, String>. Lookups use task_id.package() directly instead of allocating a PackageName via to_workspace_name() on every calculate_task_hash call.
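
The difference between the two key types can be shown in a few lines. `PackageName` below is a hypothetical newtype that mirrors the "before" shape (the real type may differ), and `lookup_before` / `lookup_after` are illustrative names.

```rust
use std::collections::HashMap;

// Hypothetical newtype key, mirroring the "before" shape: looking up by
// &str means building an owned key first.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PackageName(String);

fn lookup_before<'a>(
    cache: &'a HashMap<PackageName, String>,
    package: &str,
) -> Option<&'a String> {
    let key = PackageName(package.to_string()); // allocates on every call
    cache.get(&key)
}

// "After" shape: String implements Borrow<str>, so a &str works as the
// lookup key directly and nothing is allocated.
fn lookup_after<'a>(cache: &'a HashMap<String, String>, package: &str) -> Option<&'a String> {
    cache.get(package)
}
```

Both functions return the same entry for the same package name; only the per-call allocation differs.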

Measurement

Profile-based measurements (turbo run build --dry=json --profile, 5-run median on a large ~1000 package repo) show the instrumented portion of the run dropping from ~761ms to ~663ms. However, hyperfine benchmarks on --dry runs across three repos of varying size show no statistically significant end-to-end improvement — results are 1.00-1.03× with error bars of ±0.09 to ±0.22.

Testing

  • All existing tests pass
  • Added a regression test for connect_internal_dependencies verifying graph edges and external dependency classification are correct after parallelization
  • Existing git_index_regression_tests (31 tests) validate the hex encoding and find_untracked_files changes
  • Deadlock analysis: all parallelization uses either pure read-only shared references, non-blocking OnceLock CAS, or read-only RwLock acquisition with no concurrent writers

…very

Three targeted optimizations to the turbo run hot path:

1. Engine builder: Cache turbo.json chain per package and move the
   visited check before the expensive task_definition() call. The
   chain only depends on the package name, so multiple tasks in the
   same package reuse the cached result.

2. Task visitor: Defer env() computation to non-dry-run branches.
   The execution environment is unused during dry runs, avoiding
   per-task RwLock acquisition and env var map cloning.

3. find_untracked_files: Replace Mutex<Vec> with per-thread local
   buffers flushed via mpsc channel on drop, eliminating per-file
   mutex contention in the parallel walker.
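
Item 3 above (thread-local buffers flushed through a channel on drop) can be sketched as follows. This is an illustration under stated assumptions, not the walker code: `LocalBuffer` and `collect_matches` are hypothetical names, the ".tmp" suffix check is an arbitrary stand-in for "untracked", and scoped threads stand in for the parallel walker.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

// Per-thread buffer: pushes are plain Vec writes with no locking. The whole
// batch is sent through the channel once, when the buffer is dropped.
struct LocalBuffer {
    items: Vec<String>,
    tx: Sender<Vec<String>>,
}

impl Drop for LocalBuffer {
    fn drop(&mut self) {
        if !self.items.is_empty() {
            // A send error only means the receiver is gone; ignore it.
            let _ = self.tx.send(std::mem::take(&mut self.items));
        }
    }
}

// Each worker scans one partition of paths and records the matches locally.
fn collect_matches(partitions: &[Vec<String>]) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    thread::scope(|s| {
        for part in partitions {
            let mut local = LocalBuffer { items: Vec::new(), tx: tx.clone() };
            s.spawn(move || {
                for path in part {
                    if path.ends_with(".tmp") {
                        local.items.push(path.clone());
                    }
                }
                // `local` is dropped here, flushing its batch in one send.
            });
        }
    });
    drop(tx); // close the channel so the receive loop below terminates
    let mut all: Vec<String> = rx.into_iter().flatten().collect();
    all.sort(); // batches arrive in arbitrary order
    all
}
```

Compared with a shared `Mutex<Vec<_>>`, synchronization happens once per thread rather than once per file.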

Parallelize several sequential phases of turbo run's pre-execution
pipeline: dependency resolution, turbo.json loading, and task summary
construction. Also reduce per-call allocation overhead in the task
hash tracker and gix index classification.
@anthonyshew anthonyshew requested a review from a team as a code owner February 22, 2026 03:08
@anthonyshew anthonyshew requested review from tknickman and removed request for a team February 22, 2026 03:08

…parallelize-hot-path

# Conflicts:
#	crates/turborepo-scm/src/repo_index.rs
github-actions Bot commented Feb 22, 2026

Coverage Report

Metric Coverage
Lines 75.16%
Functions 46.86%
Branches 0.00%

@anthonyshew anthonyshew changed the title from "perf: Parallelize turbo run pre-execution hot path" to "perf: Parallelize `turbo run` pre-execution hot path" Feb 22, 2026
@anthonyshew anthonyshew merged commit b79b680 into main Feb 22, 2026
101 of 102 checks passed
@anthonyshew anthonyshew deleted the perf/parallelize-hot-path branch February 22, 2026 13:32
github-actions Bot added a commit that referenced this pull request Feb 22, 2026
## Release v2.8.11-canary.20

Versioned docs: https://v2-8-11-canary-20.turborepo.dev

### Changes

- perf: Optimize engine builder, task visitor, and untracked file
discovery (#11956) (`e145bc6`)
- release(turborepo): 2.8.11-canary.19 (#11957) (`be8c782`)
- perf: Parallelize `turbo run` pre-execution hot path (#11958)
(`b79b680`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>