Knowing your weight
Before optimizing, measure. Git ships git count-objects for basic stats; git-sizer, a separate tool from GitHub, gives a deeper analysis and flags the metrics most likely to cause trouble. Together they tell you whether your repo needs LFS, partial clone, or a history rewrite.
git count-objects
git count-objects -v
git count-objects -vH # human-readable
git count-objects --human-readable
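With -v, the output lists count (loose objects), size (disk space used by loose objects), in-pack (packed objects), packs (number of packs), size-pack, prune-packable (loose objects that also exist in a pack), garbage, and size-garbage. Sizes are in KiB unless -H is given. By default, auto-gc kicks in once the loose object count exceeds roughly 6,700 (the gc.auto setting).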
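To make the auto-gc relationship concrete, here is a minimal sketch that reads the loose object count and compares it with gc.auto. The helper name loose_object_report is hypothetical, and the fallback of 6700 assumes gc.auto has not been set anywhere in your config.

```shell
#!/usr/bin/env bash
# Sketch only: loose_object_report is a hypothetical helper, not a Git command.
loose_object_report() {
  local count threshold
  # The "count:" line of count-objects -v is the number of loose objects.
  count=$(git count-objects -v | awk '$1 == "count:" {print $2}')
  # Fall back to Git's documented default when gc.auto is unset.
  threshold=$(git config --get gc.auto || echo 6700)
  # gc.auto=0 disables auto-gc, so it can never be "over".
  if [ "$threshold" -gt 0 ] && [ "$count" -gt "$threshold" ]; then
    echo "loose=$count threshold=$threshold status=over"
  else
    echo "loose=$count threshold=$threshold status=ok"
  fi
}
```

Run loose_object_report from inside any clone. Git itself only estimates the loose count when deciding whether to run auto-gc, so treat the result as a rough indicator.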
git-sizer
Install: brew install git-sizer or download a binary. Run from any clone:
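git-sizer # default report: only metrics with some level of concern
git-sizer --verbose # report every statistic
git-sizer --no-progress --threshold 1 # no progress meter; threshold 1 is the default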
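It analyzes commits, trees, blobs, and references, then reports each metric's value alongside a level of concern (shown as asterisks). A single 800MB blob, for example, would stand out in the report and is a signal to consider Git LFS.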
Sample insights
- Maximum number of files in a tree: warns above 100k.
- Maximum blob size: warns above 50MB.
- Total size of all commits: tracks growth.
- Maximum tag depth: detects pathological tag chains.
- Total reachable objects: indicates clone time.
Finding big files
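git rev-list --objects --all | \
git cat-file --batch-check='%(objecttype) %(objectsize:disk) %(objectname) %(rest)' | \
sed -n 's/^blob //p' | sort -rn | head
This pipeline lists the ten largest blobs in history as size, object ID, and path (paths containing spaces are kept intact). The sizes are on-disk, so compressed and possibly deltified; use %(objectsize) instead for uncompressed sizes. Combine with git log --all --find-object=<sha> to identify when each was introduced.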
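The --find-object lookup can be wrapped in a small function. This is a sketch under assumptions: first_commit_with_blob is a hypothetical name, and "first" means the oldest commit that git log reports as adding or removing the blob.

```shell
#!/usr/bin/env bash
# Sketch only: first_commit_with_blob is a hypothetical helper.
# $1 is a blob object ID, e.g. one taken from a largest-blobs listing.
first_commit_with_blob() {
  # --find-object selects commits whose diff changes the number of
  # occurrences of the object; --reverse puts the oldest match first.
  git log --all --reverse --date=short --format='%h %ad %s' \
    --find-object="$1" | head -n 1
}
```

Usage: first_commit_with_blob <sha> prints the abbreviated commit ID, the author date, and the subject line of the introducing commit.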
Action items by finding
- Big blobs: migrate to Git LFS or rewrite history with filter-repo.
- Many small files: enable feature.manyFiles, sparse checkout.
- Many refs: enable protocol v2; consider reftable.
- Slow walks: write commit-graph with Bloom filters.
- Slow lookups: enable MIDX with bitmaps.
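Some of the items above map to one or two commands each. The following is a minimal sketch, not a recommended policy: the function names are hypothetical, the MIDX bitmap step assumes a reasonably recent Git release, and the history-rewriting options (LFS migration, filter-repo) are deliberately left out because they change commit IDs.

```shell
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical wrappers.

# Many small files: opt in to Git's bundle of large-working-tree defaults.
enable_many_files() {
  git config feature.manyFiles true
}

# Slow walks: write a commit-graph covering all reachable commits,
# including changed-path Bloom filters for faster path-limited log.
write_commit_graph() {
  git commit-graph write --reachable --changed-paths
}

# Slow lookups: repack, then write a multi-pack-index with a bitmap.
# Assumes a Git version that supports "multi-pack-index write --bitmap".
write_midx_bitmap() {
  git repack -a -d -q &&
    git multi-pack-index write --bitmap
}
```

Each function acts on the repository in the current directory, so it is worth trying them on a scratch clone first.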
Common mistakes
- Looking only at .git size on disk: packed objects are compressed and share bytes via deltas, so on-disk size can be misleading. Use git-sizer's logical sizes.
- Confusing reachable size with total size: unreachable objects and garbage files inflate disk usage but not clone bandwidth.
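The gap between logical and on-disk size can be checked directly. A minimal sketch, assuming a hypothetical helper name repo_size_summary: it sums the uncompressed and on-disk sizes of every reachable object.

```shell
#!/usr/bin/env bash
# Sketch only: repo_size_summary is a hypothetical helper.
repo_size_summary() {
  # rev-list prints "<id> <path>"; keep only the ID, because cat-file
  # would otherwise treat the whole line as the object name.
  git rev-list --objects --all | cut -d' ' -f1 |
    git cat-file --batch-check='%(objectsize) %(objectsize:disk)' |
    awk '{logical += $1; disk += $2}
         END {printf "logical=%.0f disk=%.0f\n", logical, disk}'
}
```

Both totals are in bytes and cover reachable objects only, so unreachable objects and garbage files are excluded from either number.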
Tracking growth
git-sizer --json > sizer-$(date +%Y%m%d).json
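# sizer-old.json and sizer-new.json stand for two dated snapshots; -S sorts keys for a stable diff
diff <(jq -S . sizer-old.json) <(jq -S . sizer-new.json)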
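A raw diff gets noisy, so one option is to print only the numeric metrics that changed. This is a sketch under assumptions: flatten_metrics and growth_report are hypothetical helpers, and the code makes no assumption about git-sizer's field names; it simply walks every numeric leaf in the JSON.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and work on any JSON file.

# Print "dotted.path<TAB>value" for every numeric leaf, sorted by path.
flatten_metrics() {
  jq -r 'paths(type == "number") as $p
         | "\($p | map(tostring) | join("."))\t\(getpath($p))"' "$1" |
    LC_ALL=C sort
}

# Print "path<TAB>old<TAB>new" for metrics present in both snapshots
# whose values differ.
growth_report() {
  local old new tab
  tab=$(printf '\t')
  old=$(mktemp)
  new=$(mktemp)
  flatten_metrics "$1" > "$old"
  flatten_metrics "$2" > "$new"
  LC_ALL=C join -t "$tab" "$old" "$new" |
    awk -F '\t' '$2 != $3 {print $1 "\t" $2 "\t" $3}'
  rm -f "$old" "$new"
}
```

Usage: growth_report sizer-old.json sizer-new.json. Metrics that appear in only one snapshot are skipped by join, which may or may not be what you want.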
Related
See "Git garbage collection: gc, prune, and pack-refs", "filter-repo: rewriting history safely", and "Recovery and repair of corrupt repositories".