Synopsis
git filter-repo --path <path> [--invert-paths]
git filter-repo --replace-text <file>
git filter-repo --strip-blobs-bigger-than <size>
Description
The git filter-repo tool is the modern, recommended way to rewrite Git history. It supersedes git filter-branch, which Git's own documentation discourages because it is slow and error-prone. filter-repo can purge large files, remove sensitive data (such as credentials committed by accident), restructure paths, rewrite authors via a mailmap, or extract subdirectories into their own repository.
It is distributed separately from core Git (typically via pip install git-filter-repo or your package manager). Operations are destructive: always work on a fresh clone, and coordinate with collaborators, because the hash of every rewritten commit and of all its descendants changes.
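The fresh-clone advice above can be sketched as a small shell helper. This is an illustration under assumptions, not a prescribed workflow: rewrite_in_fresh_clone and its variable names are hypothetical, and the example URL is a placeholder.

```shell
# Sketch only: rewrite_in_fresh_clone is a hypothetical helper, not part of filter-repo.
rewrite_in_fresh_clone() {
    frc_src=$1; shift                      # URL or local path of the repository
    frc_work=$(mktemp -d) || return 1      # throwaway directory for the clone
    # --no-local makes Git copy objects normally even when the source is a local path
    git clone --no-local "$frc_src" "$frc_work/repo" >&2 || return 1
    # Remaining arguments go to filter-repo unchanged; its output is sent to stderr
    (cd "$frc_work/repo" && git filter-repo "$@" >&2) || return 1
    printf '%s\n' "$frc_work/repo"         # print where the rewritten clone lives
}

# Example (not run here):
#   rewrite_in_fresh_clone https://example.com/repo.git --path src/
```

The original repository is never touched, so a bad rewrite costs nothing more than deleting the temporary directory.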
In day-to-day use, git filter-repo is often wrapped in shell aliases or scripts. Power users add aliases that combine the flags they always pass, or wrap the command in scripts that enforce team conventions, including in continuous integration. When something goes wrong, a useful first diagnostic step is to re-run the command with GIT_TRACE=1 in the environment, which traces the Git commands run underneath. For unusual situations, git filter-repo --help opens the full manual page (when it is installed) with details on every option, including those rarely used in casual workflows but essential for debugging or scripting at scale.
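A wrapper that enforces a team convention might look like the sketch below. The function name safe_filter_repo, the FILTER_REPO_DEBUG variable, and the convention itself (never allow --force) are assumptions chosen for illustration.

```shell
# Sketch only: safe_filter_repo and FILTER_REPO_DEBUG are hypothetical names.
safe_filter_repo() {
    # Example convention: refuse to bypass the fresh-clone safety check
    for sfr_arg in "$@"; do
        case $sfr_arg in
            --force|-f)
                echo "refusing --force; work in a fresh clone instead" >&2
                return 2
                ;;
        esac
    done
    if [ -n "${FILTER_REPO_DEBUG:-}" ]; then
        # GIT_TRACE=1 makes Git print the commands it executes
        GIT_TRACE=1 git filter-repo "$@"
    else
        git filter-repo "$@"
    fi
}

# Example (not run here):
#   FILTER_REPO_DEBUG=1 safe_filter_repo --analyze
```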
Understanding how git filter-repo interacts with the rest of Git's data model (the object database, the index, refs, and the working tree) pays dividends. A rewrite creates new commit objects, moves refs to point at them, and then resets the index and working tree to match; by default it also expires reflogs and repacks, so the old objects do not linger in the clone. Knowing which pieces are touched helps predict outcomes and recover from mistakes. Reading the official documentation alongside hands-on practice in a throwaway repository is the fastest way to internalize the nuances. The most serious problems come from rewriting history that was already shared, so a working mental model of git filter-repo's side effects is worth having before running it on anything that matters.
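One aid when tracing what a rewrite did is the commit map that filter-repo leaves behind. The sketch below assumes the map lives at .git/filter-repo/commit-map and consists of a header line followed by "old new" hash pairs; new_sha_for is a hypothetical helper name.

```shell
# Sketch only: new_sha_for is a hypothetical helper. It assumes the commit map
# has one header line followed by whitespace-separated "old new" hash pairs.
new_sha_for() {
    nsf_map=${2:-.git/filter-repo/commit-map}
    # Appending "" forces string comparison, so hashes are never compared as numbers
    awk -v old="$1" '
        NR > 1 && ($1 "") == (old "") { print $2; found = 1 }
        END { exit !found }
    ' "$nsf_map"
}

# Example (not run here):
#   new_sha_for <old-full-hash>
```

The function exits non-zero when the old hash is not listed, which makes it usable in scripts that translate references to pre-rewrite commits.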
Common Options
| Option | Description |
|---|---|
| --path <p> | Keep only the specified path (can be given more than once). |
| --invert-paths | Invert the selection: remove the specified paths instead of keeping them. |
| --path-glob <g> | Select paths with a glob pattern. |
| --replace-text <file> | Replace strings in file contents throughout history. |
| --strip-blobs-bigger-than <size> | Drop blobs larger than the given size. |
| --mailmap <file> | Rewrite author/committer info. |
| --analyze | Produce a report on repo size before rewriting. |
| --force | Override the fresh-clone safety check. |
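For --mailmap, the file uses Git's usual mailmap syntax. The sketch below writes a small example file; write_mailmap is a hypothetical helper, and the names and addresses are placeholders.

```shell
# Sketch only: write_mailmap is a hypothetical helper; all identities are made up.
write_mailmap() {
    cat > "$1" <<'EOF'
Jane Doe <jane@example.com> <jdoe@old-host.example>
Jane Doe <jane@example.com> J. Doe <jane@laptop.local>
EOF
}

# Example (not run here):
#   write_mailmap my-mailmap && git filter-repo --mailmap my-mailmap
```

The first line maps any commit made with the old address to the canonical name and address; the second matches on both the old name and the old address.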
Examples
git filter-repo --analyze
# Inspect what's eating space (the report is written under .git/filter-repo/analysis/)
git filter-repo --path docs/ --invert-paths
# Remove docs/ from all history
git filter-repo --strip-blobs-bigger-than 50M
# Purge any file over 50 MB
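echo 'API_KEY==>REDACTED' > replacements.txt
git filter-repo --replace-text replacements.txt
# Scrub a leaked secret: the text before ==> is matched literally in file contents and replaced with the text after it
# Put the secret value itself on the left; as written, this line replaces the string API_KEY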
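A replacements file can hold several expressions, one per line. The sketch below shows the three forms as they are commonly used: a plain literal with a custom replacement, a literal that falls back to the default replacement text ***REMOVED***, and a regex: entry. write_replacements is a hypothetical helper and the secrets are made-up placeholders.

```shell
# Sketch only: write_replacements is a hypothetical helper; the secrets are fake.
write_replacements() {
    cat > "$1" <<'EOF'
hunter2==>REDACTED
literal:s3cr3t-token
regex:password\s*=\s*\S+==>password=REDACTED
EOF
}

# Example (not run here):
#   write_replacements replacements.txt
#   git filter-repo --replace-text replacements.txt
```

The quoted heredoc delimiter keeps the shell from expanding backslashes or dollar signs, so the expressions reach the file exactly as typed.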
Common Mistakes
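Running filter-repo in your live repo without a backup risks losing data. Always clone fresh; filter-repo refuses to run outside a fresh clone unless --force is given. After rewriting, force-push and have collaborators re-clone: anyone who merges or pushes from a clone that still holds the old SHAs will reintroduce the old history. The tool removes the origin remote by default to prevent accidental pushes; re-add it deliberately.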
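Publishing the rewritten history could be sketched as follows. publish_rewrite is a hypothetical helper and the URL is a placeholder; plain --force is used on the assumption that, with origin freshly re-added, there are no remote-tracking refs for --force-with-lease to compare against.

```shell
# Sketch only: publish_rewrite is a hypothetical helper. Run it inside the
# rewritten clone, and only after collaborators have been warned.
publish_rewrite() {
    pr_url=$1
    git remote add origin "$pr_url" || return 1   # origin was removed by the rewrite
    git push --force --all origin || return 1     # overwrite every branch
    git push --force --tags origin                # overwrite rewritten tags
}

# Example (not run here):
#   publish_rewrite https://example.com/repo.git
```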
Related Commands
git filter-branch (discouraged in favor of filter-repo), git gc, git reflog, git push --force-with-lease