By admin, 29 April 2026

Why custom diffs

Git's default diff is line-by-line text. For binary formats (PDF, DOCX, images) and structured text where line diffs are unhelpful (minified JS, generated SQL dumps), custom diff drivers produce something humans can read.

Built-in language drivers

Git ships with hunk-header patterns for many languages: ada, bash, bibtex, cpp, csharp, css, dts, elixir, fortran, fountain, golang, html, java, kotlin, markdown, matlab, objc, pascal, perl, php, python, ruby, rust, scheme, tex. Activate via gitattributes:

# .gitattributes
*.py    diff=python
*.rs    diff=rust
*.go    diff=golang

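With a driver active, each hunk header in git diff names the enclosing function or class rather than the nearest preceding unindented line, so a reviewer can tell where a change sits without opening the file.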
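The difference is easiest to see on an indented method. The following is a minimal sketch in a throwaway repository; the file shapes.py and its contents are made up for illustration, and the exact header text may vary slightly between Git versions.

```shell
#!/bin/sh
# Sketch: hunk header for an indented method, before and after enabling
# the built-in python driver. File names and contents are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo

cat > shapes.py <<'EOF'
class Circle:
    def area(self, r):
        pi = 3.14159
        r2 = r * r
        scaled = pi * r2
        rounded = round(scaled, 2)
        label = "area"
        return rounded
EOF
git add shapes.py
git commit -q -m "add shapes"

# Change the last line, far enough below "def" that the def line
# is not part of the three context lines.
sed 's/return rounded/return label, rounded/' shapes.py > shapes.tmp
mv shapes.tmp shapes.py

# Without a driver, Git falls back to the nearest unindented line.
before=$(git diff -- shapes.py | grep '^@@')

# With the python driver, the indented "def" line is recognised.
echo '*.py diff=python' > .gitattributes
after=$(git diff -- shapes.py | grep '^@@')

echo "default driver: $before"
echo "python driver:  $after"
```

In this sketch the first header ends in the class line and the second in the method line, which is the improvement the driver buys for nested code.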

Defining a custom driver

Drivers live in your gitconfig:

[diff "exif"]
textconv = exiftool
cachetextconv = true

And gitattributes selects them:

*.jpg diff=exif
*.png diff=exif

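Git runs the textconv command with the path of the file to convert as its last argument; the command prints a textual representation to stdout, and Git diffs the outputs. cachetextconv caches the converted text per blob (as Git notes under refs/notes/textconv/<driver>), so repeated diffs of the same content skip the conversion.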
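To try the textconv contract without installing exiftool, the sketch below swaps in od as the converter. The driver name "hex" and the file blob.dat are assumptions made for the demo, not part of the setup above.

```shell
#!/bin/sh
# Sketch: a textconv driver backed by od, used as a stand-in for exiftool.
# The driver name "hex" and the file name are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo

# Git appends the file path, so this runs: od -An -tx1 <file>
git config diff.hex.textconv "od -An -tx1"
git config diff.hex.cachetextconv true
echo '*.dat diff=hex' > .gitattributes

# The NUL byte makes Git's heuristic classify the file as binary.
printf '\000\001\002' > blob.dat
git add .gitattributes blob.dat
git commit -q -m "add blob"
printf '\000\001\377' > blob.dat

# Without the converter Git only reports that the binary files differ.
plain=$(git diff --no-textconv -- blob.dat)
# With it, Git diffs the hex dumps line by line.
converted=$(git diff -- blob.dat)

printf '%s\n' "$plain"
printf '%s\n' "$converted"

# Cached conversions of committed blobs are kept in a notes ref.
git for-each-ref refs/notes/textconv
```

Any command that accepts a file path and writes text to stdout can fill the same role, which is why a short wrapper script is often enough for a proprietary format.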

Word-level diff for prose

[diff "tex"]
wordRegex = "[^[:space:]\\\\]+"

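With *.tex diff=tex in gitattributes, git diff --word-diff marks changed words inline instead of whole lines, using wordRegex to decide what counts as a word. Because tex is also a built-in driver name, this setting overrides the word pattern Git ships for it.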
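As a quick check of the word-level output, here is a sketch that sets the same regex from the command line in a throwaway repository. The file paper.tex and its sentence are invented for the example.

```shell
#!/bin/sh
# Sketch: word-level diff using the wordRegex shown above.
# paper.tex and its contents are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo

# Single quotes keep the shell from interpreting the backslashes;
# git config handles the escaping when it writes the file.
git config diff.tex.wordRegex '[^[:space:]\\]+'
echo '*.tex diff=tex' > .gitattributes

printf '%s\n' 'The quick brown fox jumps over the lazy dog.' > paper.tex
git add .gitattributes paper.tex
git commit -q -m "add paper"
printf '%s\n' 'The quick red fox jumps over the lazy dog.' > paper.tex

# Removed words appear as [-...-], added words as {+...+}.
words=$(git diff --word-diff -- paper.tex)
printf '%s\n' "$words"
```

Only the replaced word is marked; the rest of the sentence is printed once as context.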

Binary detection override

Force Git to treat a file as text or binary regardless of heuristics:

*.svg   diff
*.pdf   -diff
*.bin   binary

The binary macro is shorthand for -diff -merge -text.
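git check-attr reports how each pattern resolves, including what the binary macro expands to. The sketch below uses made-up file names; the files do not need to exist for the lookup to work.

```shell
#!/bin/sh
# Sketch: inspect the attributes Git derives from the patterns above.
# logo.svg, manual.pdf and blob.bin are hypothetical paths.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

printf '%s\n' '*.svg   diff' '*.pdf   -diff' '*.bin   binary' > .gitattributes

# Ask for the three attributes the binary macro touches.
attrs=$(git check-attr diff text merge -- logo.svg manual.pdf blob.bin)
printf '%s\n' "$attrs"
```

Each output line has the form path: attribute: value, where the value is set, unset, or unspecified.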

Real example: SQL dumps

[diff "sqldump"]
textconv = "sed 's/),(/),\\n(/g'"

Puts each row of a multi-row INSERT on its own line (GNU sed syntax), so a changed row shows up as a one-line diff instead of one enormous changed line.
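A sketch of one way to break up a multi-row INSERT, run on a sample line outside Git. It assumes GNU sed (for \n in the replacement) and a dump written in the extended-insert style; the table and values are invented.

```shell
#!/bin/sh
# Sketch: split a multi-row INSERT so each row lands on its own line.
# Assumes GNU sed; the table name and rows are hypothetical.
set -eu
dump=$(mktemp)
printf '%s\n' "INSERT INTO users VALUES (1,'ann'),(2,'bob'),(3,'cy');" > "$dump"

# Same invocation shape as a textconv command: the file path comes last.
split=$(sed 's/),(/),\n(/g' "$dump")
printf '%s\n' "$split"

rm -f "$dump"
```

The pattern also fires on a literal ),( inside a string value. That is acceptable here because textconv output is only displayed and never written back to the file.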

Common mistakes

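Drivers configured in ~/.gitconfig or .git/config are local to your machine. The .gitattributes file is versioned but the driver definitions are not, so colleagues who have not defined the driver fall back to Git's default output (for binary files, just "Binary files differ"). Document driver installation alongside the repository.

Heavy textconv commands without cachetextconv slow every diff.

git log -p and git show apply textconv drivers too, so a misconfigured driver affects how history reads everywhere, not only in git diff.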
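One possible way to make driver installation a single documented step is to keep the definitions in a versioned file and have each clone include it. This is a sketch of that approach, not the only option; the file name .gitconfig at the repository root is a convention assumed here, not something Git treats specially.

```shell
#!/bin/sh
# Sketch: share driver definitions through a versioned config file.
# The root-level file name .gitconfig is an assumed convention.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Versioned file holding the driver definitions.
cat > .gitconfig <<'EOF'
[diff "exif"]
    textconv = exiftool
    cachetextconv = true
EOF

# One-time step for each clone; the path is relative to .git/config.
git config --local include.path ../.gitconfig

# The driver now resolves as if it were defined in .git/config.
resolved=$(git config --get diff.exif.textconv)
echo "diff.exif.textconv = $resolved"
```

Git never applies such a file automatically, because an included config can name arbitrary commands to run. The include step is therefore something each colleague opts into, and it belongs in the README.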

External diff entirely

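# GIT_EXTERNAL_DIFF is called with seven arguments (path old-file old-hex old-mode
# new-file new-hex new-mode), so point it at a wrapper script, not at meld itself.
GIT_EXTERNAL_DIFF=/path/to/wrapper.sh git diff
# For an interactive viewer, difftool does the argument mapping for you.
git config diff.tool meld
git difftool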
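Git calls a GIT_EXTERNAL_DIFF program with seven arguments for a modified file, so a GUI tool usually needs a small wrapper that picks out the two file paths. The sketch below uses a stand-in wrapper that only reports what it received, so it runs without meld installed; the script name ext-diff.sh is hypothetical.

```shell
#!/bin/sh
# Sketch: what an external diff program receives from Git.
# ext-diff.sh is a hypothetical wrapper name.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo

printf 'one\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"
printf 'two\n' > notes.txt

# Keep the wrapper outside the work tree so it is not an untracked file.
tools=$(mktemp -d)
wrapper="$tools/ext-diff.sh"
cat > "$wrapper" <<'EOF'
#!/bin/sh
# Git passes: path old-file old-hex old-mode new-file new-hex new-mode
echo "args=$# path=$1"
# A real wrapper would instead hand over the two files, e.g.:
#   exec meld "$2" "$5"
EOF
chmod +x "$wrapper"

report=$(GIT_EXTERNAL_DIFF="$wrapper" git diff -- notes.txt)
echo "$report"
```

git difftool avoids the wrapper entirely because it already knows how to invoke meld and other common tools with the right pair of files.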

Related

See "Gitattributes: line endings, exports, and encoding" and "Custom merge strategies and drivers".