By admin, 28 April 2026

Introduction

Git names every object by the cryptographic hash of its content. This is content-addressed storage: the name is the content's fingerprint. The default hash is SHA-1 (160 bits, 40 hex chars). Experimental SHA-256 support has been available since Git 2.29.

How a hash is computed

Git prepends a header of the form <type> <size>\0 to the raw content, then hashes:

printf 'blob 6\0hello\n' | sha1sum
# ce013625030ba8dba906f756967f9e9ca394464a  -

The same value comes out of git hash-object:

printf 'hello\n' | git hash-object --stdin
# ce013625030ba8dba906f756967f9e9ca394464a
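
The same computation can be reproduced outside Git. Here is a minimal Python sketch, assuming only the header format described above. The helper name git_object_id is made up for illustration and is not part of any Git API.

```python
import hashlib


def git_object_id(content: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 object ID Git would assign to `content`.

    Illustrative helper only; the name is hypothetical.
    """
    # Header is "<type> <size>\0", where size is the content length in bytes.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_object_id(b"hello\n"))
# ce013625030ba8dba906f756967f9e9ca394464a
```

Because the size is part of the hashed header, the ID of a blob differs from the plain SHA-1 of the file's bytes.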

Why content addressing

  • Deduplication: identical content stored once.
  • Integrity: any corruption changes the hash, so it is detected as soon as the object is re-hashed.
  • Reproducibility: the same tree always has the same name.
  • Distributed sync: Git can ask "do you have abc123?" without explaining what it is.
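
The properties above can be seen in a toy model. This is a rough sketch of a content-addressed store, not Git's actual storage format: the ContentStore class and its methods are hypothetical, and it hashes the raw bytes without Git's object header.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not Git's format)."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data  # identical content always lands on the same key
        return key

    def has(self, key: str) -> bool:
        # "Do you have abc123?" needs only the name, not the content.
        return key in self.objects

    def get(self, key: str) -> bytes:
        data = self.objects[key]
        # Integrity check: the stored bytes must still hash to their name.
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError(f"object {key} is corrupt")
        return data


store = ContentStore()
first = store.put(b"same bytes")
second = store.put(b"same bytes")
print(first == second, len(store.objects))  # True 1
```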

Abbreviated hashes

Git accepts any unambiguous prefix of at least 4 hex characters. By default it prints 7, and recent versions scale that length up automatically as the repo grows. To ask for, or configure, a specific length:

git rev-parse --short HEAD
git rev-parse --short=12 HEAD
git config --global core.abbrev 12
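
To make "shortest unambiguous prefix" concrete, here is a small Python sketch. It assumes you already have the full list of object IDs in memory; the function name shortest_unique_prefix is hypothetical, and Git's own lookup is considerably more efficient than this linear scan.

```python
import hashlib


def shortest_unique_prefix(target: str, all_ids: list, minimum: int = 4) -> int:
    """Return the smallest prefix length (>= minimum) at which `target`
    no longer shares its prefix with any other ID in `all_ids`."""
    others = [oid for oid in all_ids if oid != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        if not any(other.startswith(prefix) for other in others):
            return length
    return len(target)


# Made-up IDs for demonstration: hashes of the numbers 0..999.
ids = [hashlib.sha1(str(n).encode()).hexdigest() for n in range(1000)]
print(shortest_unique_prefix(ids[0], ids))
```

The more IDs there are, the more often a short prefix is shared, which is why larger repos need longer abbreviations.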

SHA-1 versus SHA-256

SHA-1 has known collision attacks (SHAttered, 2017). The Git project responded with SHA-256 as an alternative object format. To create a SHA-256 repo:

git init --object-format=sha256

Note: SHA-1 and SHA-256 repos cannot interoperate yet; tooling and hosting support is still maturing as of Git 2.45. For now, most projects continue with SHA-1, augmented by Git's hardened collision-detecting SHA-1 implementation.
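
The visible difference between the two formats is the length of the object ID. The sketch below assumes that a SHA-256 repo hashes the same <type> <size>\0 header plus content, only with a different algorithm. The object_id helper is hypothetical.

```python
import hashlib


def object_id(content: bytes, algo: str = "sha1", obj_type: str = "blob") -> str:
    """Hash header + content with the given algorithm (illustrative helper)."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


for algo in ("sha1", "sha256"):
    oid = object_id(b"hello\n", algo)
    # 40 hex characters for SHA-1, 64 for SHA-256
    print(algo, len(oid), oid)
```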

Verifying integrity

git fsck --full
git fsck --strict

git fsck recomputes hashes for every object and reports any mismatch.
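
The core of that check can be sketched in a few lines of Python. This is a simplified illustration, not a replacement for git fsck: it assumes SHA-1 object IDs, looks only at loose objects (zlib-compressed files under two-character directories), and ignores packfiles and connectivity checks. The function name verify_loose_objects is hypothetical.

```python
import hashlib
import os
import tempfile
import zlib


def verify_loose_objects(objects_dir: str) -> list:
    """Return IDs of loose objects whose content does not hash to their name."""
    bad = []
    for fanout in sorted(os.listdir(objects_dir)):
        subdir = os.path.join(objects_dir, fanout)
        # Loose objects live in two-hex-digit directories; skip "pack" and "info".
        if len(fanout) != 2 or not os.path.isdir(subdir):
            continue
        for name in sorted(os.listdir(subdir)):
            expected = fanout + name
            with open(os.path.join(subdir, name), "rb") as fh:
                raw = fh.read()
            try:
                actual = hashlib.sha1(zlib.decompress(raw)).hexdigest()
            except zlib.error:
                actual = None  # not even valid zlib data
            if actual != expected:
                bad.append(expected)
    return bad


# Demo on a throwaway directory laid out like .git/objects (no real repo needed).
with tempfile.TemporaryDirectory() as tmp:
    good = b"blob 6\0hello\n"
    oid = hashlib.sha1(good).hexdigest()
    os.makedirs(os.path.join(tmp, oid[:2]))
    path = os.path.join(tmp, oid[:2], oid[2:])
    with open(path, "wb") as fh:
        fh.write(zlib.compress(good))
    print(verify_loose_objects(tmp))  # []
    with open(path, "wb") as fh:
        fh.write(zlib.compress(b"blob 6\0jello\n"))  # simulate corruption
    print(verify_loose_objects(tmp))  # ['ce013625030ba8dba906f756967f9e9ca394464a']
```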

Inspecting an object

git cat-file -t <sha>       # type
git cat-file -s <sha>       # size
git cat-file -p <sha>       # pretty-printed content
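
The type and size reported by cat-file come from the object header. As a rough Python sketch of that parsing step, assuming you already have the decompressed bytes of an object (parse_object is a hypothetical helper, not a Git API):

```python
import zlib


def parse_object(raw: bytes):
    """Split a decompressed Git object into (type, size, body)."""
    # Split at the first NUL only; the body itself may contain NUL bytes.
    header, _, body = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, int(size), body


# A loose object on disk is the zlib-compressed form of header + body.
stored = zlib.compress(b"blob 6\0hello\n")
print(parse_object(zlib.decompress(stored)))
# ('blob', 6, b'hello\n')
```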

Hardened SHA-1

Since Git 2.13 (2017), Git has used a "collision-detecting" SHA-1 implementation (sha1dc) by default. It recognizes the telltale patterns of the SHAttered style of attack and refuses to hash such inputs. This makes practical attacks against Git repositories essentially impossible without discovering a new collision technique. The cost is a small slowdown on hashing; when building from source you can opt into OpenSSL's stock SHA-1 with make OPENSSL_SHA1=YesPlease, but few users have any reason to.

git version --build-options | grep -i sha
# older Git builds may not list the SHA backend, in which case this prints nothing

Common mistakes

  • Worrying that SHA-1 collisions will compromise your repo. The known attacks require crafted inputs, and Git additionally detects the published collision pattern. The practical risk for a normal project is essentially zero.
  • Truncating hashes too aggressively in scripts. In a repo with a million objects, 7 characters can collide. Use git rev-parse --short to let Git choose.
  • Editing object files directly under .git/objects. That breaks the content-address invariant and corrupts the repo. All writes must go through Git plumbing.
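
The claim about 7-character prefixes can be checked with the birthday approximation. The sketch below assumes object IDs behave like uniformly random values, which is the usual working assumption for a cryptographic hash; the function name prefix_collision_probability is hypothetical.

```python
import math


def prefix_collision_probability(num_objects: int, hex_digits: int) -> float:
    """Approximate chance that at least two of `num_objects` random IDs
    share the same `hex_digits`-character prefix (birthday bound)."""
    space = 16 ** hex_digits
    # 1 - exp(-n(n-1) / 2N), written with expm1 for accuracy at small values
    return -math.expm1(-num_objects * (num_objects - 1) / (2 * space))


for digits in (7, 10, 12):
    print(digits, f"{prefix_collision_probability(1_000_000, digits):.6f}")
```

With a million objects, some pair sharing a 7-character prefix is a near certainty, while at 12 characters it is well under one percent. Note that this is the chance that any two objects clash, which is much higher than the chance that one particular abbreviation you wrote down is ambiguous.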