Why filter-repo
git filter-repo is the modern replacement for git filter-branch, which the Git documentation itself now warns against and recommends replacing with tools such as filter-repo. It is dramatically faster (sometimes 100x), safer by default, and provides the high-level operations that history surgery actually needs: removing files, renaming paths, splitting subdirectories into their own repositories, and rewriting author names and email addresses across all commits.
Installation
filter-repo is not bundled with Git itself but is widely packaged. On macOS: brew install git-filter-repo. On Debian/Ubuntu: apt install git-filter-repo. Or download the single Python file from the project and place it in PATH.
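A quick way to confirm the install is to check that the executable is visible on PATH. This is a minimal sketch that assumes the executable is named git-filter-repo, as in the packages above; if the script was copied into Git's exec-path instead, this check may report it as missing even though git filter-repo runs.

```shell
# Report whether the git-filter-repo executable can be found on PATH.
if command -v git-filter-repo >/dev/null 2>&1; then
  filter_repo_status="installed"
else
  filter_repo_status="missing"
fi
echo "git-filter-repo: $filter_repo_status"
```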
Removing a file from all history
git filter-repo --path secrets.env --invert-paths
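Here --path selects secrets.env and --invert-paths flips the selection, so everything except that file is kept. Every commit that ever touched secrets.env is rewritten without it, and filter-repo then repacks the repository and prunes the orphaned objects. Force-push the result and rotate any leaked credentials immediately, since anyone who fetched the old history still has them. See "Removing sensitive data from history" for the full incident response.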
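Before force-pushing, it is worth confirming that no reachable commit still mentions the file. The following is a minimal sketch using plain git log; commits_touching is a hypothetical helper name, and the throwaway repository exists only so the check can be tried safely without filter-repo installed.

```shell
# Hypothetical helper: count commits on any ref that touch the given path.
commits_touching() {
  git -C "$1" log --all --oneline -- "$2" | wc -l | tr -d ' '
}

# Throwaway demo repository containing one commit that adds secrets.env.
demo=$(mktemp -d)
git -C "$demo" init -q
echo "TOKEN=abc" > "$demo/secrets.env"
git -C "$demo" add secrets.env
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "add secrets"

before=$(commits_touching "$demo" secrets.env)
echo "commits touching secrets.env: $before"
rm -rf "$demo"
```

Run against the rewritten clone, the same helper should report 0 if the removal reached every ref.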
Extracting a subdirectory as a new repo
git clone --no-local original/ extracted/
cd extracted
git filter-repo --subdirectory-filter packages/foo
The result is a repository whose root is what used to be packages/foo, with all relevant history preserved.
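To get a rough idea of how much history the extraction will keep, you can count the commits that touch the subdirectory before running the filter. This is only an approximation (filter-repo makes its own decisions about merges and empty commits), and the demo repository below is a stand-in for original/.

```shell
# Throwaway repository with one commit inside packages/foo and one outside it.
demo=$(mktemp -d)
git -C "$demo" init -q
commit() { git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }
mkdir -p "$demo/packages/foo"
echo "lib" > "$demo/packages/foo/lib.txt"
echo "docs" > "$demo/README"
git -C "$demo" add packages/foo/lib.txt
commit "add foo"
git -C "$demo" add README
commit "add readme"

# Commits that touch the subdirectory versus all commits on the current branch.
kept=$(git -C "$demo" log --oneline -- packages/foo | wc -l | tr -d ' ')
total=$(git -C "$demo" log --oneline | wc -l | tr -d ' ')
echo "$kept of $total commits touch packages/foo"
rm -rf "$demo"
```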
Renaming paths
git filter-repo --path-rename old/dir/:new/dir/
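Trailing slashes matter. The old name is matched as a path prefix, so old/dir/ renames only what lives inside that directory, while old/dir without the slash also matches siblings such as old/dir2/. To rename a single file, give its full path on both sides. If either name ends with a slash, both must.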
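The effect of the trailing slash is easier to see with a small model. The sketch below is not filter-repo code: rename_path is a hypothetical shell function that imitates a prefix-style rename rule, on the assumption that --path-rename matches the old name against the start of each path.

```shell
# Model of a prefix-style rename rule: rename_path OLD NEW PATH
rename_path() {
  case "$3" in
    "$1"*)
      rest=${3#"$1"}            # part of the path after the matched prefix
      printf '%s\n' "$2$rest"
      ;;
    *)
      printf '%s\n' "$3"        # no match: path is left unchanged
      ;;
  esac
}

# With the slash, only paths inside old/dir/ match.
with_slash=$(rename_path old/dir/ new/dir/ old/dir2/file.txt)
# Without it, the sibling directory old/dir2/ matches as well.
without_slash=$(rename_path old/dir new/dir old/dir2/file.txt)
echo "with slash:    $with_slash"
echo "without slash: $without_slash"
```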
Updating author emails
Build a mailmap-style file:
cat > mailmap.txt <<'EOF'
Jane Doe <jane.new@example.com> <jane.old@example.com>
EOF
git filter-repo --mailmap mailmap.txt
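To decide which lines the mailmap needs, and to check the result afterwards, it helps to list the distinct author identities recorded in history. A minimal sketch with plain git log follows; the demo repository and the example.com address are placeholders. The lowercase %an and %ae are used deliberately because they print the raw recorded values, ignoring any .mailmap already in the repository.

```shell
# Throwaway repository with one commit under an old address.
demo=$(mktemp -d)
git -C "$demo" init -q
echo "x" > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" -c user.name=demo -c user.email=demo@example.com \
  commit -q --author="Jane Doe <jane.old@example.com>" -m "initial"

# Distinct raw author identities across all refs.
identities=$(git -C "$demo" log --all --format='%an <%ae>' | sort -u)
echo "$identities"
rm -rf "$demo"
```

After a successful --mailmap run, the old address should no longer appear in this list.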
Safety features
filter-repo refuses to run on a non-fresh clone by default — it expects a freshly cloned repository so a mistake does not destroy your only copy. Override only when you understand the implications:
git filter-repo --force
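If you do need --force on an existing working clone, one way to keep a safety net is to write every ref into a bundle file first. This is a sketch of that idea rather than something filter-repo requires; the demo repository and the backup path are placeholders.

```shell
# Throwaway repository standing in for the clone you are about to rewrite.
demo=$(mktemp -d)
git -C "$demo" init -q
echo "x" > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Write all refs into a single backup file, then check that it is usable.
backup="$demo-backup.bundle"
git -C "$demo" bundle create "$backup" --all >/dev/null 2>&1
if git -C "$demo" bundle verify "$backup" >/dev/null 2>&1; then
  backup_ok=yes
else
  backup_ok=no
fi
echo "backup verified: $backup_ok"
rm -rf "$demo" "$backup"
```

A bundle can be cloned like any other repository (git clone path/to/backup.bundle) if the rewrite goes wrong.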
Common mistakes
Running on your only clone with --force and no backup. Always clone first.
Forgetting that filter-repo removes the origin remote to prevent accidental pushes. Re-add it explicitly before publishing.
Mixing filter-repo and filter-branch on the same repository. filter-branch leaves backup refs under refs/original/ that keep the old objects reachable, so content you meant to remove can survive.
Using filter-repo on a shared repository without coordinating. Every collaborator must reclone, or rebase their unpublished work onto the new history.
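Publishing a rewritten history after the remote has been removed can be sketched as follows. The bare repository here is a local stand-in for the real server, and the single commit stands in for the rewritten history; substitute your actual remote URL.

```shell
# Local stand-ins: a bare "server" repository and a working clone.
remote=$(mktemp -d)
work=$(mktemp -d)
git init -q --bare "$remote"
git -C "$work" init -q
echo "x" > "$work/file.txt"
git -C "$work" add file.txt
git -C "$work" -c user.name=demo -c user.email=demo@example.com commit -q -m "rewritten history"

# Re-add the remote, then force-push every branch and tag.
git -C "$work" remote add origin "$remote"
git -C "$work" push -q --force origin --all
git -C "$work" push -q --force origin --tags

pushed=$(git -C "$remote" rev-list --all --count)
echo "commits on remote: $pushed"
rm -rf "$remote" "$work"
```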
Performance
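filter-repo streams history through git fast-export and git fast-import instead of checking out each commit, which is why it is fast. For very large repos, run on a fast SSD, and lower core.bigFileThreshold if you encounter memory issues: blobs larger than the threshold are stored without delta compression, which reduces memory use at some cost in disk space.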
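Setting the threshold is a one-line, per-repository config change. In the sketch below, 64m is an arbitrary illustrative value rather than a recommendation (Git's default is 512 MiB), and the demo repository stands in for the clone being rewritten.

```shell
# Throwaway repository standing in for the clone being rewritten.
demo=$(mktemp -d)
git -C "$demo" init -q

# 64m is an arbitrary example value; blobs above it skip delta compression.
git -C "$demo" config core.bigFileThreshold 64m
threshold=$(git -C "$demo" config core.bigFileThreshold)
echo "core.bigFileThreshold=$threshold"
rm -rf "$demo"
```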
Related
See "Removing sensitive data from history" for incident workflows and "Ahead-of-time packfile building with git fast-export" for the underlying machinery.