perf: Deduplicate file hashing and parallelize globwalks#11902

Merged
anthonyshew merged 1 commit into main from faster-and-faster
Feb 18, 2026

@anthonyshew
Contributor

@anthonyshew anthonyshew commented Feb 18, 2026

Summary

Optimizes turbo run --dry wall-clock time by up to 1.48x on large monorepos by eliminating redundant file hashing work and removing a serialization bottleneck in globwalk operations.

Benchmarks

Tested across three repos of varying size:

| Repo | Packages | Before | After | Speedup |
| --- | --- | --- | --- | --- |
| large | ~1000 | 5.903s | 3.999s | 1.48x |
| medium | ~120 | 1.461s | 1.380s | 1.06x |
| small | ~6 | 0.659s | 0.693s | ~1.0x (noise) |

The improvement scales with repo size — specifically with how many tasks share the same (package, inputs) combination.
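
For anyone re-running the benchmark, the Speedup column is simply Before divided by After. A minimal sketch (the function name is made up for illustration):

```rust
/// Speedup factor: how many times faster the "after" run is than the "before" run.
/// Values below 1.0 mean the "after" run was slower.
fn speedup(before_secs: f64, after_secs: f64) -> f64 {
    before_secs / after_secs
}
```

With the timings above, `speedup(5.903, 3.999)` rounds to 1.48 and `speedup(1.461, 1.380)` rounds to 1.06. The small repo comes out at about 0.95, which is within run-to-run noise at that scale.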

Changes

File hash deduplication — Multiple tasks in the same package with identical inputs config (e.g. build, lint, typecheck all in one package) previously each ran an independent globwalk + file hash computation. Now tasks are grouped by (package_path, globs, include_default) and each unique combination is computed once, with results shared across tasks.
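
The grouping idea can be sketched roughly as follows. This is an illustration only, not the code in this PR: `HashKey`, `Task`, and `hash_files_deduplicated` are hypothetical names, and the expensive globwalk + hash step is stood in for by a caller-supplied closure.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical key: everything that determines the outcome of one file-hash pass.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct HashKey {
    package_path: String,
    globs: Vec<String>,
    include_default: bool,
}

/// Hypothetical task description; a real task carries far more than this.
struct Task {
    id: String,
    key: HashKey,
}

/// Runs `compute` once per unique key and hands every task a shared result.
/// `compute` stands in for the globwalk + file hashing step
/// and returns a map of file path -> hash.
fn hash_files_deduplicated<F>(
    tasks: &[Task],
    mut compute: F,
) -> HashMap<String, Arc<HashMap<String, String>>>
where
    F: FnMut(&HashKey) -> HashMap<String, String>,
{
    let mut by_key: HashMap<&HashKey, Arc<HashMap<String, String>>> = HashMap::new();
    let mut by_task = HashMap::with_capacity(tasks.len());
    for task in tasks {
        // Only the first task with a given key pays for the computation.
        let shared = Arc::clone(
            by_key
                .entry(&task.key)
                .or_insert_with(|| Arc::new(compute(&task.key))),
        );
        by_task.insert(task.id.clone(), shared);
    }
    by_task
}
```

With `build`, `lint`, and `typecheck` in one package sharing the same inputs, `compute` runs once and all three task IDs point at the same `Arc`.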

Parallel globwalks via retry-on-EMFILE — The previous IoSemaphore (max=1) serialized all globwalk operations to prevent fd exhaustion, making this the dominant bottleneck on large repos. This replaces the semaphore with retry-with-exponential-backoff on EMFILE errors (the same pattern Node's graceful-fs uses), allowing globwalks to run fully parallel on rayon. If the OS returns "too many open files", the operation sleeps briefly and retries — up to 10 times with exponential backoff capped at 1s.
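
A rough, std-only sketch of the retry wrapper described above. The retry count and the 1s cap come from this description; the initial delay, the doubling factor, and all names are assumptions, and the rayon parallelism is left out.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// errno for "too many open files" on Linux and macOS. Hard-coded here to stay
/// dependency-free; real code would more likely use `libc::EMFILE`.
const EMFILE: i32 = 24;
/// Retry limit and delay cap as stated in the PR description.
const MAX_RETRIES: u32 = 10;
const MAX_DELAY: Duration = Duration::from_secs(1);
/// Assumed starting delay; the description does not state one.
const INITIAL_DELAY: Duration = Duration::from_millis(1);

/// Delay before retry number `attempt` (0-based): doubles each time, capped at MAX_DELAY.
fn backoff_delay(attempt: u32) -> Duration {
    INITIAL_DELAY
        .saturating_mul(2u32.saturating_pow(attempt))
        .min(MAX_DELAY)
}

/// Runs `op`, retrying only when it fails with EMFILE. Any other error,
/// or an EMFILE after the last retry, is returned to the caller unchanged.
fn retry_on_emfile<T, F>(mut op: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut attempt = 0;
    loop {
        match op() {
            Err(err) if err.raw_os_error() == Some(EMFILE) && attempt < MAX_RETRIES => {
                thread::sleep(backoff_delay(attempt));
                attempt += 1;
            }
            other => return other,
        }
    }
}
```

A call site would wrap the fd-consuming operation, for example `retry_on_emfile(|| std::fs::read_dir(path))`. Compared with a semaphore, nothing is serialized up front; callers only pay a delay when the OS actually reports fd exhaustion.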

Zero-copy lockfile dependency lookups — Lockfile::all_dependencies now returns Cow<'_, HashMap<String, String>> instead of cloning the HashMap on every call. For pnpm (which pre-builds a dependency index), this eliminates ~329k HashMap clones during transitive closure resolution.
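
To illustrate why `Cow` helps here, a simplified sketch. The trait and both structs are hypothetical stand-ins, not the real `Lockfile` trait (whose exact signature is not shown in this description): an implementation with a pre-built index can lend out a borrowed map, while one that assembles the map on demand still returns an owned value through the same signature.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the lockfile trait.
trait DependencySource {
    fn all_dependencies(&self, key: &str) -> Option<Cow<'_, HashMap<String, String>>>;
}

/// Keeps a pre-built index, so lookups can borrow instead of cloning.
struct IndexedLockfile {
    index: HashMap<String, HashMap<String, String>>,
}

impl DependencySource for IndexedLockfile {
    fn all_dependencies(&self, key: &str) -> Option<Cow<'_, HashMap<String, String>>> {
        self.index.get(key).map(Cow::Borrowed)
    }
}

/// Stores flat (package key, dependency name, version) rows and must build
/// the map on each call, so it returns an owned value.
struct FlatLockfile {
    entries: Vec<(String, String, String)>,
}

impl DependencySource for FlatLockfile {
    fn all_dependencies(&self, key: &str) -> Option<Cow<'_, HashMap<String, String>>> {
        let deps: HashMap<String, String> = self
            .entries
            .iter()
            .filter(|(pkg, _, _)| pkg.as_str() == key)
            .map(|(_, name, version)| (name.clone(), version.clone()))
            .collect();
        if deps.is_empty() {
            None
        } else {
            Some(Cow::Owned(deps))
        }
    }
}
```

Callers that only read the map use it through `Deref` and never notice the difference; a caller that needs ownership can call `into_owned()`, which clones only in the borrowed case.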

Optimized transitive closure cache keys — The DashMap resolve cache now uses a single null-byte-separated String key built into a reusable buffer, instead of allocating a (String, String, String) tuple per lookup.
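
A sketch of the key-building approach, under stated assumptions: the three key components (`workspace`, `name`, `specifier`) and the function names are hypothetical, and a plain `HashMap` stands in for the DashMap to keep the example std-only and single-threaded.

```rust
use std::collections::HashMap;

/// Writes "workspace\0name\0specifier" into `buf`, reusing its allocation.
/// Assumes no component contains a null byte, otherwise keys could collide.
fn write_key(buf: &mut String, workspace: &str, name: &str, specifier: &str) {
    buf.clear();
    buf.push_str(workspace);
    buf.push('\0');
    buf.push_str(name);
    buf.push('\0');
    buf.push_str(specifier);
}

/// Looks up a cached resolution, calling `resolve` only on a miss.
/// A hit allocates nothing for the key: the lookup borrows `key_buf` as `&str`.
fn resolve_cached<F>(
    cache: &mut HashMap<String, String>,
    key_buf: &mut String,
    workspace: &str,
    name: &str,
    specifier: &str,
    resolve: F,
) -> String
where
    F: FnOnce() -> String,
{
    write_key(key_buf, workspace, name, specifier);
    if let Some(hit) = cache.get(key_buf.as_str()) {
        return hit.clone();
    }
    let value = resolve();
    // Only a miss pays for a key allocation, and it is one String instead of three.
    cache.insert(key_buf.clone(), value.clone());
    value
}
```

The caller keeps one `String` buffer alive across the whole resolution loop, so repeated lookups reuse the same heap allocation.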

HashMap importers for pnpm — Converted pnpm's importers field from BTreeMap to HashMap (with sorted serialization) for O(1) workspace lookups during resolve_package.
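
The trade-off can be sketched like this: a `HashMap` gives O(1) lookups but iterates in arbitrary order, so the output has to be sorted explicitly at write time to keep the lockfile stable. The `Importers` struct and its string payload are hypothetical, and the hand-rolled writer only stands in for whatever serializer the real code uses.

```rust
use std::collections::{BTreeMap, HashMap};
use std::fmt::Write;

/// Hypothetical container: workspace path -> some importer payload.
struct Importers {
    by_path: HashMap<String, String>,
}

impl Importers {
    /// O(1) average-case lookup by workspace path.
    fn get(&self, workspace: &str) -> Option<&str> {
        self.by_path.get(workspace).map(String::as_str)
    }

    /// Emits entries in sorted key order so the output is deterministic
    /// even though HashMap iteration order is not.
    fn to_sorted_lines(&self) -> String {
        let sorted: BTreeMap<&str, &str> = self
            .by_path
            .iter()
            .map(|(path, value)| (path.as_str(), value.as_str()))
            .collect();
        let mut out = String::new();
        for (path, value) in sorted {
            writeln!(out, "{}: {}", path, value).expect("writing to a String cannot fail");
        }
        out
    }
}
```

Sorting happens once per serialization, while lookups happen once per `resolve_package` call, so the cost moves from the hot path to the cold one.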

- Deduplicate file hashing across tasks that share the same package and
  inputs, reducing redundant globwalks and file hash computations
- Remove the IoSemaphore (max=1) that serialized all globwalk operations,
  replacing it with retry-on-EMFILE backoff at the globwalk and
  hash_objects layers
- Change Lockfile::all_dependencies to return Cow<HashMap> for zero-copy
  pnpm lookups during transitive closure resolution
- Optimize transitive closure cache keys to use a single reusable String
  buffer instead of allocating 3 Strings per lookup
- Convert pnpm importers from BTreeMap to HashMap for O(1) workspace
  lookups
@anthonyshew anthonyshew requested a review from a team as a code owner February 18, 2026 21:55
@anthonyshew anthonyshew requested review from tknickman and removed request for a team February 18, 2026 21:55
@vercel
Contributor

vercel Bot commented Feb 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| examples-basic-web | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-designsystem-docs | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-gatsby-web | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-kitchensink-blog | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-nonmonorepo | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-svelte-web | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-tailwind-web | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| examples-vite-web | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| turbo-site | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| turborepo-agents | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |
| turborepo-test-coverage | Ready | Preview, Comment, Open in v0 | Feb 18, 2026 9:56pm |

@github-actions
Contributor

Coverage Report

| Metric | Coverage |
| --- | --- |
| Lines | 74.35% |
| Functions | 46.22% |
| Branches | 0.00% |

@anthonyshew
Contributor Author

The query test on Windows is hanging and is unrelated to this change. Merging through.

@anthonyshew anthonyshew merged commit 57cf69c into main Feb 18, 2026
101 of 102 checks passed
@anthonyshew anthonyshew deleted the faster-and-faster branch February 18, 2026 22:09
github-actions Bot added a commit that referenced this pull request Feb 18, 2026
## Release v2.8.11-canary.4

Versioned docs: https://v2-8-11-canary-4.turborepo.dev

### Changes

- fix: Resolve npm packages in @turbo/gen compiled binary (#11900)
(`c2266b0`)
- release(turborepo): 2.8.11-canary.3 (#11901) (`b21423e`)
- perf: Deduplicate file hashing and parallelize globwalks (#11902)
(`57cf69c`)
- perf: Improve transitive dependency resolution cache sharing across
workspaces (#11903) (`fd1b6e8`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>