Introduction
Git names every object by the cryptographic hash of its content. This is content-addressed storage: the name is the content's fingerprint. The default hash is SHA-1 (160 bits, 40 hex characters). Experimental SHA-256 support has been available since Git 2.29.
How a hash is computed
Git prepends a header of the form <type> <size>\0, where <size> is the content length in bytes written in decimal, to the raw content and hashes the result:
printf 'blob 6\0hello\n' | sha1sum
# ce013625030ba8dba906f756967f9e9ca394464a -
The same value comes out of git hash-object:
printf 'hello\n' | git hash-object --stdin
# ce013625030ba8dba906f756967f9e9ca394464a
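To make the header rule concrete, here is a minimal sketch that computes a blob's name for an arbitrary file with coreutils alone. The function name blob_sha1 is made up for this example, and it assumes sha1sum, wc, tr and cut are on the PATH. It handles blobs only, not trees or commits.

```shell
# blob_sha1 FILE: hypothetical helper that mimics `git hash-object FILE`
# for blobs, without calling Git at all.
blob_sha1() {
    file=$1
    # Content length in bytes; tr strips the padding some wc builds add.
    size=$(wc -c < "$file" | tr -d ' ')
    # Header "blob <size>\0" followed by the raw bytes, hashed as one stream.
    { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
printf 'hello\n' > "$tmp"
blob_sha1 "$tmp"    # should match the hash shown in the examples above
rm -f "$tmp"
```

Because the size is part of the hashed bytes, a file and a truncated copy of it can never share a name by accident of prefix.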
Why content addressing
- Deduplication: identical content stored once.
- Integrity: any corruption changes the hash, so it is detected as soon as the object is re-hashed.
- Reproducibility: the same tree always has the same name.
- Distributed sync: Git can ask "do you have abc123?" without explaining what it is.
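The deduplication point can be observed directly. The sketch below is an illustration only: it works in a throwaway repository created with mktemp, and the file names a.txt and b.txt are arbitrary. It writes the same bytes under two names and counts how many loose objects result.

```shell
# Throwaway repository so nothing touches a real project.
repo=$(mktemp -d)
git init -q "$repo"

# Two differently named files with byte-identical content.
printf 'same bytes\n' > "$repo/a.txt"
printf 'same bytes\n' > "$repo/b.txt"

# hash-object -w writes each blob into the object store and prints its name.
id_a=$(git -C "$repo" hash-object -w "$repo/a.txt")
id_b=$(git -C "$repo" hash-object -w "$repo/b.txt")
echo "a.txt -> $id_a"
echo "b.txt -> $id_b"

# Count loose object files (objects/xx/yyyy...): both writes share one object.
count=$(find "$repo/.git/objects" -type f | grep -c -E '/[0-9a-f]{2}/[0-9a-f]{38,}$')
echo "objects stored: $count"

rm -rf "$repo"
```

Both names should be identical and the count should be 1, since the file name is not part of what a blob hashes.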
Abbreviated hashes
Git accepts any unambiguous prefix of at least 4 characters and prints 7 by default. As a repo grows, Git lengthens its abbreviations automatically, and you can also request a specific length:
git rev-parse --short HEAD
git rev-parse --short=12 HEAD
git config --global core.abbrev 12
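To get a feel for how quickly short prefixes run out, a rough birthday estimate helps. The sketch below assumes hashes behave like uniformly random values and uses the approximation n(n-1)/2 divided by 16^k for the expected number of object pairs sharing a k-digit prefix. The function name expected_collisions is invented for this example.

```shell
# expected_collisions N K: approximate number of pairs among N objects that
# share the same K-hex-digit prefix, assuming uniformly random hashes.
expected_collisions() {
    awk -v n="$1" -v k="$2" 'BEGIN { printf "%.4g\n", n * (n - 1) / 2 / (16 ^ k) }'
}

for k in 7 10 12; do
    printf '%2d hex digits, 1000000 objects: ' "$k"
    expected_collisions 1000000 "$k"
done
```

Under these assumptions, a million objects give well over a thousand expected colliding pairs at 7 digits and far less than one at 12, which is why longer abbreviations are the safer choice in large repositories.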
SHA-1 versus SHA-256
SHA-1 has known collision attacks (SHAttered, 2017). The Git project responded with SHA-256 as an alternative object format. To create a SHA-256 repo:
git init --object-format=sha256
Note: SHA-1 and SHA-256 repos cannot interoperate yet; tooling and hosting support is still maturing as of Git 2.45+. For now, most projects continue with SHA-1, augmented by Git's hardened collision-detecting SHA-1 implementation.
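A SHA-256 repository uses the same header-plus-content scheme, only with a different hash function and a 64-character name. The following sketch checks that by hand against a throwaway repository. It assumes sha256sum is installed, and it falls back to a message if the local Git is too old to accept --object-format.

```shell
# Hash the same header-plus-content bytes with SHA-256 by hand.
manual=$(printf 'blob 6\0hello\n' | sha256sum | cut -d' ' -f1)
echo "manual:  $manual (${#manual} hex characters)"

# Compare against a throwaway SHA-256 repository, if this Git supports one.
repo=$(mktemp -d)
git_id=
if git init -q --object-format=sha256 "$repo" 2>/dev/null; then
    git -C "$repo" rev-parse --show-object-format    # prints the repo's hash format
    git_id=$(printf 'hello\n' | git -C "$repo" hash-object --stdin)
    echo "git:     $git_id"
else
    echo "this Git build does not accept --object-format=sha256"
fi
rm -rf "$repo"
```

git rev-parse --show-object-format is also a convenient way to find out which format an existing repository uses before scripting against its object names.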
Verifying integrity
git fsck --full
git fsck --strict
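git fsck checks the object database: it recomputes the hash of every object and reports any mismatch with the object's name, along with missing or malformed objects. --full is already the default in current Git versions; --strict enables additional, stricter checks.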
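The core of that check can be reproduced for a single object. The sketch below is a simplified illustration of the idea, not a description of how fsck is implemented: it reads one object back out of a throwaway repository, hashes its content again, and compares the result with the name it was stored under.

```shell
repo=$(mktemp -d)
git init -q "$repo"
oid=$(printf 'integrity check\n' | git -C "$repo" hash-object -w --stdin)

# Read the object back and hash its content again; the result must equal its name.
type=$(git -C "$repo" cat-file -t "$oid")
rehash=$(git -C "$repo" cat-file "$type" "$oid" |
         git -C "$repo" hash-object -t "$type" --stdin)

if [ "$rehash" = "$oid" ]; then
    echo "ok: $oid"
else
    echo "MISMATCH: stored as $oid but content hashes to $rehash"
fi
rm -rf "$repo"
```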
Inspecting an object
git cat-file -t <sha> # type
git cat-file -s <sha> # size
git cat-file -p <sha> # pretty-printed content
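For a quick tour without touching a real project, the sketch below builds one blob and one tree in a throwaway repository and inspects both. The file name hello.txt and the variable names are arbitrary choices for this example; no commit is created, so no user identity is needed.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Store a blob, stage it under a file name, and write the index out as a tree.
blob=$(printf 'hello\n' | git -C "$repo" hash-object -w --stdin)
git -C "$repo" update-index --add --cacheinfo 100644,"$blob",hello.txt
tree=$(git -C "$repo" write-tree)

blob_type=$(git -C "$repo" cat-file -t "$blob")
blob_size=$(git -C "$repo" cat-file -s "$blob")
tree_type=$(git -C "$repo" cat-file -t "$tree")
tree_listing=$(git -C "$repo" cat-file -p "$tree")

echo "$blob is a $blob_type of $blob_size bytes"
echo "$tree is a $tree_type containing:"
echo "$tree_listing"    # one line: mode, type, object name, file name

rm -rf "$repo"
```

The tree listing shows that file names live in trees, while the blob itself carries only content.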
Hardened SHA-1
Since Git 2.13 (2017), Git has shipped with a "collision-detecting" SHA-1 implementation (sha1dc) that detects inputs crafted with the known collision technique, including the published SHAttered files, and refuses to hash them. This makes practical attacks against Git repositories essentially impossible without discovering a new collision technique. The cost is a small slowdown on hashing; when building from source you can opt into OpenSSL's stock SHA-1 instead (for example with make OPENSSL_SHA1=YesPlease), but few users have any reason to.
git version --build-options | grep -i sha # older Git versions may not report the hash backend here
Common mistakes
- Worrying that SHA-1 collisions will compromise your repo. The known attacks require crafted inputs, and Git additionally detects the published collision pattern. The practical risk for a normal project is essentially zero.
- Truncating hashes too aggressively in scripts. In a repo with a million objects, 7 characters can collide. Use git rev-parse --short to let Git choose.
- Editing object files directly under .git/objects. That breaks the content-address invariant and corrupts the repo. All writes must go through Git plumbing.