perf: Defer TLS initialization to a background thread #11967
Merged
anthonyshew merged 19 commits into main on Feb 23, 2026
Conversation
…very

Three targeted optimizations to the turbo run hot path:

1. Engine builder: Cache the turbo.json chain per package and move the visited check before the expensive task_definition() call. The chain depends only on the package name, so multiple tasks in the same package reuse the cached result.
2. Task visitor: Defer env() computation to the non-dry-run branches. The execution environment is unused during dry runs, so this avoids per-task RwLock acquisition and env var map cloning.
3. find_untracked_files: Replace Mutex<Vec> with per-thread local buffers flushed via an mpsc channel on drop, eliminating per-file mutex contention in the parallel walker.
Parallelize several sequential phases of turbo run's pre-execution pipeline: dependency resolution, turbo.json loading, and task summary construction. Also reduce per-call allocation overhead in the task hash tracker and gix index classification.
…parallelize-hot-path # Conflicts: # crates/turborepo-scm/src/repo_index.rs
…parallelize-hot-path
Adds profiling visibility to functions that were invisible in --profile output: TLS initialization, rayon-spawned hash tasks, Visitor constructor, Engine scheduler, and per-task cache phases.
…/add-profiling-spans
Store task hashes as Arc<str> in TaskHashTrackerState instead of String. In calculate_dependency_hashes, cloning an Arc<str> is a ref count bump instead of a heap allocation. For the api monorepo (1687 tasks, ~3 deps each), this eliminates ~5000 String heap allocations per run.
…Summary

Removes the .to_string() conversion in the HashTrackerInfo::hash() trait impl by changing the trait return type from Option<String> to Option<Arc<str>>. Also changes SharedTaskSummary.hash to Arc<str> so the Arc flows all the way to serialization without any String allocation.
The Arc<str> fields in TaskHashTrackerState and SharedTaskSummary require serde's "rc" feature for Serialize. Previously this worked via transitive feature activation; make it explicit so it doesn't break if the transitive path changes.
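Making the feature explicit is a one-line dependency change. A sketch of what the declaration might look like (version number illustrative; serde's real "rc" feature gates the `Serialize`/`Deserialize` impls for `Rc`/`Arc`):

```toml
# Cargo.toml — enable serde's "rc" feature explicitly so the Arc<str>
# fields keep serializing even if transitive feature activation changes.
serde = { version = "1", features = ["derive", "rc"] }
```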
…arc-str-dependency-hashes
The --profile output previously had no visibility into the pre-execution phases of turbo run. This made it impossible to diagnose where startup overhead was coming from. Add info-level tracing spans to every significant phase:

- shim_run: top-level shim execution
- cli_run: CLI dispatch (arg parsing, http client, telemetry)
- http_client_init: TLS initialization for reqwest
- telemetry_init: telemetry client setup
- command_base_new: config loading and opts construction
- run_builder_new: API client and auth setup
- pkg_dep_graph_build: package graph construction
- scm_task_await: waiting for background SCM/git index
- async_cache_new: cache initialization
- calculate_filtered_packages: package filtering
- env_infer: environment variable snapshot
- turbo_json_preload: turbo.json cache warming
- build_engine: task graph construction
- hash_scope: parallel file hashing
- start_ui: UI initialization
- repo_inference: repository root detection

These spans appear in both the chrome trace JSON and the generated markdown summary when using --profile.
Holding an EnteredSpan guard across .await corrupts tracing's thread-local state when the future resumes on a different tokio worker thread. Use .instrument() which correctly re-enters the span on whichever thread the future polls on.
TLS initialization (loading root certificates, setting up the TLS backend) takes ~95ms and previously blocked the critical path at the start of every turbo run. No work could proceed until it completed. Move TLS init to a spawn_blocking task that starts immediately when cli::run begins. The HTTP client is resolved via a shared OnceCell — the background task and any consumer race to initialize it, and OnceCell guarantees only one wins. Telemetry uses a new DeferredTelemetryClient that resolves the HTTP client lazily on first flush rather than at construction, so telemetry initialization never blocks on TLS. The API client for remote cache and analytics is resolved in RunBuilder::build() after the package graph and SCM tasks have been running concurrently. By that point, TLS init has had the full duration of arg parsing, config loading, package graph construction, and SCM indexing to complete in the background. RunBuilder::build() now returns (Run, Option<AnalyticsHandle>) since analytics startup moved inside build() (it needs the resolved API client).
…' into perf/defer-tls-init # Conflicts: # crates/turborepo-api-client/src/telemetry.rs
… init

When the tokio runtime shuts down before the background TLS init task completes (common for short-lived commands like `unlink`), the spawn_blocking JoinHandle returns JoinError::Cancelled. The previous code called .expect() on this, crashing the process. Replace all three spawn_blocking + expect sites with proper error propagation via a new HttpClientCancelled error variant. Telemetry failures silently return errors (telemetry is never worth crashing over), and the run path surfaces the error normally.
github-actions bot added a commit that referenced this pull request on Feb 23, 2026
## Release v2.8.11-canary.24

Versioned docs: https://v2-8-11-canary-24.turborepo.dev

### Changes

- perf: Add more tracing spans into startup path (#11965) (`23e144d`)
- release(turborepo): 2.8.11-canary.23 (#11966) (`b425c39`)
- perf: Defer TLS initialization to a background thread (#11967) (`f1d487f`)

Co-authored-by: Turbobot <turbobot@vercel.com>
## Summary

TLS initialization (~95ms) previously blocked the critical path at the start of every turbo run. This PR moves it to a background thread so it overlaps with package graph construction, SCM indexing, and other startup work.

## How it works

- `cli::run` spawns TLS init on `spawn_blocking` immediately — before arg parsing
- A new `DeferredTelemetryClient` resolves the HTTP client lazily on first flush, so telemetry init never blocks
- `RunBuilder::build()` resolves the HTTP client via a shared `OnceCell` after the package graph and SCM tasks have been running concurrently — by that point, TLS has had the full startup pipeline to complete in the background
- Analytics startup moved inside `build()` since it needs the resolved API client, so `build()` now returns `(Run, Option<AnalyticsHandle>)`

## Profiling results

Measured with `--profile` on three repos of different sizes. The key metric is `resolve_api_client` — how long the main thread blocks waiting for TLS init after all the overlapping work has completed. Lower is better (0ms = fully hidden). (Table: per-repo timings for `build_http_client` (TLS) and `resolve_api_client` (wait).)

Savings scale with repo size: larger repos have more `build()` work (package graph construction, lockfile parsing, SCM indexing) running concurrently with TLS init. Small repos see minimal improvement since there's almost nothing to overlap with, but they don't regress — the worst case is parity with the baseline.

## Testing

- `cargo check --workspace` clean (zero errors, zero warnings)
- `cargo test -p turborepo-api-client --lib` — 18 passed
- `--profile` captures `resolve_api_client`, `http_client_init`, and `build_http_client` spans correctly
- Only one `build_http_client` call occurs (`OnceCell` deduplication works)