Why custom diffs
Git's default diff is line-by-line text. For binary formats (PDF, DOCX, images) and structured text where line diffs are unhelpful (minified JS, generated SQL dumps), custom diff drivers produce something humans can read.
Built-in language drivers
Git ships with hunk-header patterns for many languages: ada, bash, bibtex, cpp, csharp, css, dts, elixir, fortran, fountain, golang, html, java, kotlin, markdown, matlab, objc, pascal, perl, php, python, ruby, rust, scheme, tex. Activate via gitattributes:
# .gitattributes
*.py diff=python
*.rs diff=rust
*.go diff=golang
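Hunk headers in git diff now name the enclosing function or class using language-specific patterns instead of Git's generic heuristic, so a reviewer can tell where a change sits without opening the file.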
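To see the effect, here is a minimal sketch that builds a throwaway repository and prints the hunk header Git produces for a change deep inside a method. The file name shapes.py, the Circle class, and the commit identity are illustrative assumptions.

```shell
#!/bin/sh
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Select the built-in Python driver for *.py files.
printf '*.py diff=python\n' > .gitattributes

# The method body is long enough that the "def" line falls outside
# the three lines of context Git prints around a change.
cat > shapes.py <<'EOF'
class Circle:
    def area(self):
        r = self.radius
        pi = 3.14159
        sq = r * r
        scale = 1
        result = pi * sq * scale
        return result
EOF
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m init

# Change the last line of the method, then capture the hunk header.
sed 's/return result/return round(result, 2)/' shapes.py > shapes.tmp &&
    mv shapes.tmp shapes.py
hunk_header=$(git diff -- shapes.py | grep '^@@')
printf '%s\n' "$hunk_header"
```

With the python driver the header should name the method, def area(self):. Without the attribute, Git's generic heuristic picks the nearest preceding unindented line, which here is class Circle:.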
Defining a custom driver
Drivers live in your gitconfig:
[diff "exif"]
textconv = exiftool
cachetextconv = true
And gitattributes selects them:
*.jpg diff=exif
*.png diff=exif
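A textconv command receives the path of the file to convert as its final argument and prints a textual representation to stdout; Git diffs the outputs. cachetextconv caches the converted text in Git notes (under refs/notes/textconv/<driver>), so each blob is converted only once.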
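The same mechanism works for any command that turns a file into text. As a sketch under stated assumptions, the following defines a hypothetical driver named gz that decompresses gzip files, then checks which driver the attribute selects and what the resulting diff looks like. The driver name, file name, and commit identity are invented for illustration, and gzip is assumed to be installed.

```shell
#!/bin/sh
# Throwaway repository; driver and file names are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Hypothetical driver "gz": Git appends the file path to this command.
git config diff.gz.textconv 'gzip -dc'
printf '*.gz diff=gz\n' > .gitattributes

printf 'alpha\nbeta\n' | gzip > notes.txt.gz
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m init

# Modify the compressed file.
printf 'alpha\ngamma\n' | gzip > notes.txt.gz

# Confirm the attribute selects the driver, then diff the converted text.
attr_line=$(git check-attr diff -- notes.txt.gz)
diff_text=$(git diff -- notes.txt.gz)
printf '%s\n%s\n' "$attr_line" "$diff_text"
```

git check-attr is a quick way to debug a driver that seems to be ignored: if it does not report the expected driver name, the problem is in gitattributes rather than in gitconfig.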
Word-level diff for prose
[diff "tex"]
wordRegex = "[^[:space:]\\\\]+"
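Select the driver in gitattributes (*.tex diff=tex), then run git diff --word-diff. Git uses the regex to decide what counts as a word and marks changed words inline instead of flagging whole lines.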
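A minimal sketch of the word-level output, assuming a throwaway repository and an invented file paper.tex. The wordRegex is set with git config here, which stores the same effective pattern as the gitconfig snippet above.

```shell
#!/bin/sh
# Throwaway repository; file name and commit identity are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Words are runs of characters that are neither whitespace nor backslashes.
git config diff.tex.wordRegex '[^[:space:]\\]+'
printf '*.tex diff=tex\n' > .gitattributes

printf 'The quick brown fox\n' > paper.tex
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m init

# Change a single word, then keep only the last line of the word diff.
printf 'The slow brown fox\n' > paper.tex
word_diff=$(git diff --word-diff -- paper.tex | tail -n 1)
printf '%s\n' "$word_diff"
```

In the default word-diff format, removed words appear as [-word-] and added words as {+word+}, so only the changed word is marked rather than the whole line.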
Binary detection override
Force Git to treat a file as text or binary regardless of heuristics:
*.svg diff
*.pdf -diff
*.bin binary
The binary macro is shorthand for -diff -merge -text.
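To check how these attributes resolve, git check-attr reports the state of each attribute for a path. The sketch below uses a throwaway repository; the file names are invented and do not need to exist on disk.

```shell
#!/bin/sh
# Throwaway repository; the queried paths are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

printf '*.svg diff\n*.pdf -diff\n*.bin binary\n' > .gitattributes

# Ask for the three attributes that the binary macro touches.
attrs=$(git check-attr diff merge text -- logo.svg manual.pdf blob.bin)
printf '%s\n' "$attrs"
```

Each output line has the form path: attribute: state, where the state is set, unset, or unspecified. For blob.bin all three attributes come back unset, which is the macro expansion at work.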
Real example: SQL dumps
[diff "sqldump"]
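textconv = "sed 's/),(/),\\n(/g'"
This puts each row of a multi-row INSERT on its own line, so a changed row shows up as a one-line diff instead of one enormous changed line. It assumes GNU sed, which interprets \n in the replacement as a newline, and it will also split on any ),( inside string data; that is tolerable because textconv output is for reading only. Select the driver with a gitattributes entry such as *.sql diff=sqldump.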
Common mistakes
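Driver definitions live in gitconfig (~/.gitconfig or .git/config), which is never cloned or pushed, so a colleague who has the .gitattributes entry but not the driver falls back to Git's default diff. Document driver installation alongside the repository. Heavy textconv commands without cachetextconv rerun the conversion on every diff. git log -p and git show apply drivers too, so a misconfigured driver affects how history reads, not just git diff.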
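Using an external diff tool
Git calls the GIT_EXTERNAL_DIFF program with seven arguments (path old-file old-hex old-mode new-file new-hex new-mode), so a GUI tool such as meld needs a wrapper script rather than being named directly. The path here is a placeholder:
GIT_EXTERNAL_DIFF=/path/to/wrapper.sh git diff
git difftool handles that argument mapping itself:
git config diff.tool meld
git difftool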
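Because Git invokes a GIT_EXTERNAL_DIFF program with seven arguments (path old-file old-hex old-mode new-file new-hex new-mode), a viewer that expects two file names needs a small wrapper. The following is a hypothetical sketch, not a standard script: the function name external_diff is invented, and plain diff -u stands in for whatever tool you prefer.

```shell
#!/bin/sh
# Hypothetical wrapper for GIT_EXTERNAL_DIFF; external_diff is an invented name.
# Git passes: path old-file old-hex old-mode new-file new-hex new-mode
external_diff() {
    path=$1
    old_file=$2
    new_file=$5
    printf 'Comparing %s\n' "$path"
    # Swap this line for a GUI tool, e.g. meld "$old_file" "$new_file"
    diff -u "$old_file" "$new_file"
    # diff exits 1 when the files differ; only a status above 1 is a real failure.
    [ "$?" -le 1 ]
}

# Run only when invoked with Git's seven-argument form.
if [ "$#" -eq 7 ]; then
    external_diff "$@"
fi
```

Saved as an executable file, this could be used as GIT_EXTERNAL_DIFF=/path/to/wrapper.sh git diff. The exit-status check matters because, by default, Git expects the external program to exit 0 and reports a fatal error otherwise.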
Related
See "Gitattributes: line endings, exports, and encoding" and "Custom merge strategies and drivers".