By admin, 29 April 2026

Two ends of one wire

Fetch and push performance depends on negotiation efficiency, transfer size, and server-side computation. Each has levers, and the right ones differ by repo size.

Protocol v2

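Protocol v2 (the default since Git 2.26) lets the client ask for only the refs it needs instead of receiving the full ref advertisement up front, which helps most on repos with thousands of refs. Verify it is in use, or enable it explicitly on older clients: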

GIT_TRACE_PACKET=1 git ls-remote origin 2>&1 | head -2
git config protocol.version 2

See "Protocol v2 for efficient fetch negotiation".

Why maintenance, not gc

git maintenance (Git 2.31+) is the modern, task-oriented replacement for gc --auto. It runs specific tasks (commit-graph, prefetch, incremental-repack, loose-objects, pack-refs, gc) on schedules tuned for each, in the background, without blocking your interactive commands.

What gc does

git gc performs maintenance: repacks loose objects, prunes unreachable ones past the expiry window, packs loose refs into packed-refs, expires reflogs, and writes commit-graph and MIDX where configured. It runs automatically when certain thresholds are exceeded.

Auto-trigger

Git invokes gc --auto at the end of operations such as commit, merge, and rebase. The thresholds:

The full repack cost

Traditional git gc runs an all-into-one repack (git repack -a -d), rewriting every object into a single packfile. On a multi-gigabyte repo this can take a long time and a great deal of CPU and IO. Geometric repacking (Git 2.32+) avoids this by maintaining a series of packs whose object counts follow a geometric progression, so only the smallest packs are merged each cycle.
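
The choice of which packs to merge can be sketched in a few lines. This is a simplified model, not Git's implementation: it assumes packs are described only by their object counts, and the function name geometric_split and the default factor of 2 are chosen here for illustration.

```python
def geometric_split(object_counts, factor=2):
    """Split packs into (to_merge, to_keep) so the kept packs, together with
    the merged pack, form a geometric progression by object count."""
    packs = sorted(object_counts)  # smallest first
    n = len(packs)
    if n == 0:
        return [], []

    # Walk down from the largest pack while each pack holds at least
    # `factor` times as many objects as the next-smaller one.
    i = n - 1
    while i > 0 and packs[i] >= factor * packs[i - 1]:
        i -= 1
    # If the walk stopped early, everything up to and including i is merged.
    split = i + 1 if i > 0 else 0

    # The merged pack may now be too large relative to its neighbour;
    # absorb larger packs until the progression holds again.
    merged = sum(packs[:split])
    while split < n and packs[split] < factor * merged:
        merged += packs[split]
        split += 1

    return packs[:split], packs[split:]


to_merge, to_keep = geometric_split([100, 1, 8, 2, 1])
print("merge:", to_merge, "keep:", to_keep)
```

With these hypothetical counts, only the three small packs are rewritten and the two large ones are left untouched, which is the cost saving the paragraph above describes.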

The set operation problem

Operations like clone, fetch, and gc need to compute "what objects are reachable from these commits?" — a graph traversal that touches every reachable object. Reachability bitmaps store this answer as compressed bitmaps, turning the traversal into bitwise OR/AND operations.
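
To make the bitwise idea concrete, here is a toy model that uses Python integers as bitsets. The object names, the three-commit history, and the helper names are all hypothetical, and real bitmaps are stored compressed (EWAH) rather than as plain integers.

```python
# Hypothetical objects, numbered by their position in a pack.
objects = ["c1", "t1", "b1", "c2", "t2", "b2", "c3", "t3"]
position = {name: i for i, name in enumerate(objects)}


def bitmap(object_names):
    """Encode a set of object names as an integer with one bit per object."""
    bits = 0
    for name in object_names:
        bits |= 1 << position[name]
    return bits


def decode(bits):
    """Turn a bitmap back into object names, in pack order."""
    return [name for name in objects if bits >> position[name] & 1]


# Precomputed reachability: each commit's bitmap is its own objects
# OR-ed with everything reachable from its parent.
reachable = {}
reachable["c1"] = bitmap(["c1", "t1", "b1"])
reachable["c2"] = bitmap(["c2", "t2", "b2"]) | reachable["c1"]
reachable["c3"] = bitmap(["c3", "t3"]) | reachable["c2"]


def objects_to_send(wants, haves):
    """Objects reachable from `wants` but not from `haves`, with no graph walk."""
    want_bits = 0
    for commit in wants:
        want_bits |= reachable[commit]
    have_bits = 0
    for commit in haves:
        have_bits |= reachable[commit]
    return decode(want_bits & ~have_bits)


print(objects_to_send(wants=["c3"], haves=["c1"]))
```

The fetch computation becomes one OR per tip and a single AND-NOT, regardless of how many objects sit between the two commits.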

The many-pack problem

A repo with many packfiles must search each one to locate an object — a binary search per pack. With dozens or hundreds of packs (common in active repos using geometric repack), this O(packs × log objects) cost adds up. The multi-pack-index (MIDX) consolidates all pack indexes into one binary search, restoring O(log total objects).
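
The difference between the two lookup strategies can be sketched as follows. The pack names, shortened object IDs, and offsets are made up for illustration, and the in-memory lists stand in for Git's on-disk index formats.

```python
from bisect import bisect_left

# Hypothetical pack indexes: each is a sorted list of (object_id, offset).
pack_indexes = {
    "pack-a": [("1f", 12), ("9c", 340)],
    "pack-b": [("04", 12), ("7e", 96), ("d2", 511)],
    "pack-c": [("3b", 12), ("a8", 77)],
}


def lookup_per_pack(oid):
    """Binary-search every pack index in turn; return (pack, offset, searches)."""
    searches = 0
    for pack, index in pack_indexes.items():
        searches += 1
        i = bisect_left(index, (oid,))
        if i < len(index) and index[i][0] == oid:
            return pack, index[i][1], searches
    return None, None, searches


# A multi-pack index in miniature: one sorted list covering all packs.
midx = sorted(
    (oid, pack, offset)
    for pack, index in pack_indexes.items()
    for oid, offset in index
)


def lookup_midx(oid):
    """A single binary search over the consolidated index."""
    i = bisect_left(midx, (oid,))
    if i < len(midx) and midx[i][0] == oid:
        return midx[i][1], midx[i][2], 1
    return None, None, 1


print(lookup_per_pack("a8"))  # found only after searching all three packs
print(lookup_midx("a8"))      # found with one search
```

A miss is the worst case for the per-pack strategy, because every index has to be searched before Git can conclude the object is absent.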

The path-restricted log problem

git log -- path/to/file must, in principle, walk every commit, diff each against its parent, and emit those that touched path/to/file. On large repos this is dominated by tree comparisons. Changed-path Bloom filters (Git 2.27+) accelerate this dramatically by storing, for each commit, a probabilistic set of paths it touched.
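
A Bloom filter can report "definitely not touched" or "possibly touched", never a false "not touched", so the expensive diff is skipped only when it is safe to do so. The sketch below illustrates the idea with an assumed 64-bit filter and SHA-256-derived hash positions; Git's real filters use different sizes and hashing, and the commit data is hypothetical.

```python
import hashlib


class PathBloom:
    """A tiny Bloom filter over the paths one commit changed."""

    def __init__(self, paths, bits=64, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.word = 0
        for path in paths:
            for pos in self._positions(path):
                self.word |= 1 << pos

    def _positions(self, path):
        # Derive `hashes` bit positions from seeded SHA-256 digests.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{path}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def might_contain(self, path):
        # False means "definitely absent"; True means "possibly present".
        return all(self.word >> pos & 1 for pos in self._positions(path))


# Hypothetical history, newest first: (commit, paths it changed).
commits = [
    ("c4", ["docs/README"]),
    ("c3", ["src/net/http.c", "src/net/http.h"]),
    ("c2", ["Makefile"]),
    ("c1", ["src/net/http.c"]),
]
filters = {cid: PathBloom(paths) for cid, paths in commits}


def log_path(path):
    """Return (commits that touched path, number of full diffs performed)."""
    hits, diffs = [], 0
    for cid, changed in commits:
        if not filters[cid].might_contain(path):
            continue  # filter rules this commit out; skip the tree diff
        diffs += 1    # stand-in for the real tree comparison
        if path in changed:
            hits.append(cid)
    return hits, diffs


print(log_path("src/net/http.c"))
```

False positives only cost an unnecessary diff; they never change the result of the log.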

The reachability bottleneck

Many Git operations need to answer "is commit X reachable from commit Y?" or "which commit is the merge base?" Naively this means walking the commit graph from raw object reads — slow on large repos. The commit-graph file precomputes parent pointers, generation numbers, and (optionally) Bloom filters into a binary side file.
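
Generation numbers are what let a reachability query stop early: a commit's generation is one more than the largest generation among its parents, so an ancestor always has a strictly smaller generation than its descendants. The sketch below assumes a small hypothetical history held in a dictionary; it shows the pruning idea only and is not how Git stores or walks the graph.

```python
# Hypothetical history: commit -> list of parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
    "E": ["D"],
    "F": ["C"],
}


def generations(parents):
    """Generation = 1 + max generation of the parents (roots get 1)."""
    gen = {}

    def visit(commit):
        if commit not in gen:
            gen[commit] = 1 + max((visit(p) for p in parents[commit]), default=0)
        return gen[commit]

    for commit in parents:
        visit(commit)
    return gen


def is_ancestor(parents, gen, ancestor, descendant):
    """Return (reachable?, commits walked), pruning by generation number."""
    stack, seen, walked = [descendant], {descendant}, 0
    while stack:
        commit = stack.pop()
        walked += 1
        if commit == ancestor:
            return True, walked
        for p in parents[commit]:
            # A commit with a smaller generation than the target
            # cannot have the target in its history, so skip it.
            if p not in seen and gen[p] >= gen[ancestor]:
                seen.add(p)
                stack.append(p)
    return False, walked


gen = generations(parents)
print(is_ancestor(parents, gen, "A", "E"))
print(is_ancestor(parents, gen, "F", "E"))
```

In the second query the walk gives up after two commits, because everything below D has a generation too small to contain F.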

Where it lives

Single-file layout: .git/objects/info/commit-graph. Split layout (written with --split): .git/objects/info/commit-graphs/, a chain of files that allows incremental writes.

The Trace2 facility

Trace2 (introduced in Git 2.22) is the structured tracing facility built into Git. It emits region begin/end events, child process tracking, and timing information in a stable schema, suitable for both human inspection and automated analysis.

Linear tools, exponential repos

Git was originally tuned for the Linux kernel, which was large by 2005 standards but is small next to today's biggest repositories. Modern repos can hold millions of files, hundreds of gigabytes of history, and tens of thousands of refs. Many Git operations were O(working tree size) or O(history) by default, and at scale they became visibly slow. Performance work since 2018 has added optional features that turn many operations into O(changed) rather than O(total).

Where time goes

Common slow paths: