Monorepo realities
A monorepo holds many projects in one repository — better refactors, atomic cross-project changes, single dependency graph. The cost: scale. Repos can grow to gigabytes and millions of files, where naive Git becomes painful. The good news: modern Git has tools designed for exactly this.
The performance stack
- Sparse checkout (cone mode): only check out the directories you work on.
- Sparse index: skip non-sparse paths in the index entirely (Git 2.37+).
- Partial clone: fetch blobs lazily, on demand.
- Commit-graph + Bloom filters: fast log/blame on subsets.
- Fsmonitor: git status driven by OS file change events, so its cost tracks what changed rather than the size of the working tree.
- Background maintenance: keep gc, repacking, and indexes fresh.
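The commit-graph item in the list above can also be set up by hand. This is a minimal sketch, assuming you want to write the changed-path Bloom filters yourself rather than wait for a maintenance run; `enable_path_filters` is a hypothetical helper name, not a Git command.

```shell
# enable_path_filters is a hypothetical helper (not part of Git).
# It writes a commit-graph file that includes changed-path Bloom filters,
# which lets path-limited history queries skip commits that cannot match.
enable_path_filters() {
    repo=${1:-.}
    # --reachable covers every commit reachable from any ref;
    # --changed-paths computes the Bloom filters for each commit.
    git -C "$repo" commit-graph write --reachable --changed-paths
}
```

After `enable_path_filters .`, a path-limited query such as `git log --oneline -- apps/web` can consult the filters instead of diffing every commit.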
One-shot setup
git clone --filter=blob:none --sparse https://example.com/big.git
cd big
git sparse-checkout init --cone --sparse-index
git sparse-checkout set apps/web libs/ui
git config feature.manyFiles true
git config core.fsmonitor true
git config core.untrackedCache true
git maintenance start
feature.manyFiles turns on a bundle of optimizations including index.version=4 and core.untrackedCache, so the explicit core.untrackedCache line above is redundant but harmless.
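Once the setup above is in place, the sparse cone can be widened later without recloning. A minimal sketch, assuming cone mode is already enabled in the current repository; `widen_checkout` is a hypothetical helper name.

```shell
# widen_checkout is a hypothetical helper, not a Git command.
# It adds directories to the existing cone and prints the resulting list.
widen_checkout() {
    # "add" keeps the directories already in the cone;
    # "set" would replace them with the new list.
    git sparse-checkout add "$@" &&
    git sparse-checkout list
}
```

For example, `widen_checkout libs/db` would bring a (hypothetical) `libs/db` directory into the working tree. In a partial clone, widening the cone also fetches the blobs that are newly needed.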
CODEOWNERS and branch protection
Monorepos rely on per-directory ownership. Combine GitHub/GitLab CODEOWNERS files with required reviews so only the right teams gate the right paths. Couple with required CI checks per path.
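As a rough illustration of per-directory ownership, the sketch below writes a small CODEOWNERS file at the repository root. The helper name `write_codeowners` and all team handles are placeholders invented for this example; the required-review rule itself is turned on in the hosting platform's branch protection settings, not in this file.

```shell
# write_codeowners is a hypothetical helper; the team handles are placeholders.
write_codeowners() {
    dest=${1:-.}
    cat > "$dest/CODEOWNERS" <<'EOF'
# Later rules take precedence over earlier ones.
*            @example-org/platform-team
/apps/web/   @example-org/web-team
/libs/ui/    @example-org/design-system-team
EOF
}
```

With this file, a change under `apps/web/` would request review from the web team, while anything not matched by a more specific rule falls back to the catch-all owner.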
Path-aware CI
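Use git diff --name-only origin/main...HEAD in CI to test only the projects a branch changed. The three-dot form diffs against the merge base, so commits that landed on main after the branch point are not counted as changes:
# Assumes project directories sit two levels deep (apps/web, libs/ui).
changed=$(git diff --name-only origin/main...HEAD | cut -d/ -f1-2 | sort -u)
for pkg in $changed; do
  # Skip paths without a Makefile; stop at the first failing project.
  if [ -f "$pkg/Makefile" ]; then
    make -C "$pkg" test || exit 1
  fi
done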
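Cutting paths at a fixed depth breaks down when projects are nested at different levels. One alternative, sketched here under the assumption that a project is any non-root directory containing a Makefile, is to walk up from each changed file to its nearest such directory. `owning_project` and `affected_projects` are hypothetical helper names.

```shell
# owning_project (hypothetical helper): print the nearest ancestor directory
# of a changed file that contains a Makefile. The repository root is
# deliberately excluded, so root-level files map to no project.
owning_project() {
    dir=$(dirname "$1")
    while [ "$dir" != "." ] && [ "$dir" != "/" ]; do
        if [ -f "$dir/Makefile" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}

# affected_projects (hypothetical helper): read changed paths on stdin and
# print each owning project once. Files outside any project are skipped.
affected_projects() {
    while IFS= read -r path; do
        owning_project "$path" || :
    done | sort -u
}
```

In CI this could be fed from a merge-base diff, for example `git diff --name-only origin/main...HEAD | affected_projects`, with each printed directory passed to `make -C`.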
Tooling layer
Build orchestrators (Bazel, Buck2, Pants, Nx, Turborepo) use the diff, or their own dependency graph and caches, to re-run only affected targets. Scalar, originally developed at Microsoft and shipped with Git since 2.38, applies many of the above settings with one command. See "Scalar: batteries-included large repo management".
Common mistakes
- Treating a monorepo like a small repo and skipping the optimizations: many operations scale with the number of tracked files, so they slow down steadily as the repo grows.
- Pushing many small, fork-style personal branches to the same shared remote: rely on PR-based workflows instead.
- Forgetting that git status time includes the kernel work of scanning the working tree; fsmonitor removes most of that scan.
- Underestimating gc time on a multi-gigabyte repo; let background maintenance handle it.
Related
See "Sparse checkout for monorepos", "The sparse index: operating without a full index", "Partial clone: promise and promisor remotes", and "Scalar: batteries-included large repo management".