By admin, 29 April 2026

Why filter-repo

git filter-repo is the modern replacement for git filter-branch, which Git's own documentation now warns against using. It is dramatically faster (sometimes 100x), safer by default, and provides the high-level operations history surgery actually needs: removing files, renaming paths, stripping authors, splitting subdirectories, and changing email addresses across all commits.

Installation

filter-repo is not bundled with Git itself but is widely packaged. On macOS: brew install git-filter-repo. On Debian/Ubuntu: sudo apt install git-filter-repo. Alternatively, download the single Python file from the project and place it in a directory on your PATH; it needs Python 3 to run.
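
One way to confirm the install worked is to check that the executable is visible on PATH. This is a minimal sketch that assumes the tool was installed under its usual name, git-filter-repo; the function name filter_repo_status is only illustrative.

```shell
# Report whether a git-filter-repo executable is visible on PATH.
# Note: Git can also find subcommands in its own exec path, which this check does not cover.
filter_repo_status() {
  if command -v git-filter-repo >/dev/null 2>&1; then
    echo "installed"
  else
    echo "missing"
  fi
}

filter_repo_status
```

If this prints "installed", running git filter-repo -h should show the option summary.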

Removing a file from all history

git filter-repo --path secrets.env --invert-paths

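This rewrites history so that no commit contains secrets.env; every commit from the first one that touched the file onward gets a new hash. filter-repo then repacks the repository and prunes the objects that are no longer reachable. Note that --path is matched against the path from the repository root, so copies of the file in other directories need their own --path or --path-glob argument. Force-push the result and rotate any leaked credentials immediately, because the old commits may already have been fetched or cached elsewhere. See "Removing sensitive data from history" for the full incident response.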
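
After the rewrite it is worth checking that the path really is gone from every ref. The helper below is a sketch using only plain git commands; the name path_is_gone is hypothetical, and it must be run from inside the rewritten repository.

```shell
# Succeed (exit 0) if no commit reachable from any ref touches the given path.
# $1 = path relative to the repository root, e.g. secrets.env
path_is_gone() {
  [ -z "$(git log --all --oneline -- "$1")" ]
}
```

For example, path_is_gone secrets.env && echo "clean" prints "clean" only when no commit on any branch or tag still references the file.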

Extracting a subdirectory as a new repo

git clone --no-local original/ extracted/
cd extracted
git filter-repo --subdirectory-filter packages/foo

The result is a repository whose root is what used to be packages/foo, with all relevant history preserved.

Renaming paths

git filter-repo --path-rename old/dir/:new/dir/

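Trailing slashes matter. The old name is matched as a prefix of each path, so old/dir/ renames only what lives inside that directory, while old/dir without the slash also matches siblings such as old/dir2/ or old/dir.txt. To rename a single file, give its full path on both sides of the colon.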
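
The effect of the trailing slash is easiest to see with a small model. The sketch below is not filter-repo code; it assumes the old name is matched as a plain prefix (consistent with filter-repo describing --to-subdirectory-filter dir as --path-rename :dir/), and rename_prefix is a hypothetical helper written for illustration.

```shell
# Model of prefix-based renaming: if PATH starts with OLD, swap that prefix for NEW.
# $1 = old prefix, $2 = new prefix, $3 = path to transform
rename_prefix() {
  old=$1
  new=$2
  path=$3
  case $path in
    "$old"*) printf '%s\n' "$new${path#"$old"}" ;;
    *)       printf '%s\n' "$path" ;;
  esac
}

rename_prefix old/dir/ new/dir/ old/dir/a.txt    # prints new/dir/a.txt
rename_prefix old/dir/ new/dir/ old/dir2/b.txt   # prints old/dir2/b.txt (untouched)
rename_prefix old/dir  new/dir  old/dir2/b.txt   # prints new/dir2/b.txt (sibling caught by the prefix)
```

The third call shows why omitting the slash can rename more than intended.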

Updating author emails

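Build a mailmap-style file. Each line gives the name and address you want, followed by the address recorded in the old commits (the addresses below are placeholders):

cat > mailmap.txt <<'EOF'
Jane Doe <jane@example.com> <jane.old@example.com>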
EOF
git filter-repo --mailmap mailmap.txt
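
To check the result, list every distinct author and committer identity left in the history. This is a sketch using plain git; list_identities is a hypothetical helper name, and it should be run inside the rewritten repository.

```shell
# Print each distinct "Name <email>" that appears as an author or committer on any ref.
list_identities() {
  git log --all --format='%an <%ae>%n%cn <%ce>' | sort -u
}
```

If the old address still appears in the output, the mailmap line did not match it exactly.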

Safety features

By default filter-repo refuses to run unless the repository looks like a fresh clone, so that a mistake cannot destroy your only copy of the history. Override this only when you understand the implications, by adding --force to the command you intend to run:

git filter-repo --path secrets.env --invert-paths --force
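
If you do need --force, taking a backup first keeps the old history recoverable. One option, sketched here with git bundle, is to write every ref into a single file; the helper name backup_bundle and the file locations are illustrative assumptions.

```shell
# Write all refs of a repository into one bundle file, then verify the bundle.
# $1 = repository directory, $2 = absolute path of the bundle file to create
backup_bundle() {
  git -C "$1" bundle create "$2" --all &&
  git -C "$1" bundle verify "$2"
}
```

A bundle can be cloned like any other remote (git clone /path/to/backup.bundle restored/), so it works as a restore point if the rewrite goes wrong.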

Common mistakes

Running on your only clone with --force and no backup. Always clone first.

Forgetting that filter-repo removes the origin remote to prevent accidental pushes. Re-add it explicitly before pushing.

Mixing filter-repo and filter-branch on the same repository. filter-branch leaves backup refs under refs/original/ that keep the old objects reachable, so data you meant to remove can survive.

Using filter-repo on a shared repository without coordinating. Every collaborator must reclone, or rebase their unpublished work onto the new history.
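
Re-adding the remote and publishing the rewritten history might look like the sketch below. The function name republish is hypothetical, and the URL is whatever your real remote is; nothing here is specific to filter-repo beyond the fact that origin has to be added back.

```shell
# Re-add the remote that filter-repo removed, then overwrite its branches and tags.
# $1 = URL of the remote to publish to (substitute your own)
republish() {
  git remote add origin "$1" &&
  git push --force --all origin &&
  git push --force --tags origin
}
```

Branches and tags are pushed separately because git push does not accept --all and --tags together. Agree on a time with collaborators before running this, since their existing clones will no longer share history with the remote.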

Performance

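filter-repo streams history through git fast-export and git fast-import instead of checking out each commit, which is why it is fast. For very large repos, run on a fast SSD. If you encounter memory issues, consider adjusting core.bigFileThreshold, the size above which Git stores files without attempting delta compression; lowering it generally reduces memory use at the cost of a larger repository.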
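
core.bigFileThreshold is an ordinary Git config key: files larger than the threshold are stored without attempting delta compression. The helpers below are a sketch for inspecting and setting it per repository; the function names are hypothetical and 64m is an illustrative value, not a recommendation.

```shell
# Print the threshold configured for a repository (no output means Git's built-in default applies).
# $1 = repository directory (defaults to the current directory)
show_threshold() {
  git -C "${1:-.}" config --get core.bigFileThreshold || true
}

# Set a repository-local threshold.
# $1 = repository directory, $2 = size such as 64m
set_threshold() {
  git -C "$1" config core.bigFileThreshold "$2"
}
```

For example, set_threshold . 64m followed by show_threshold . prints 64m.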

Related

See "Removing sensitive data from history" for incident workflows and "Ahead-of-time packfile building with git fast-export" for the underlying machinery.