Introduction
Git's repository is, at its heart, a content-addressed object store with exactly four object types: blob, tree, commit, and tag. Every commit, checkout, merge, and push ultimately reads or writes these objects, so understanding the four types demystifies most of Git.
Blobs
A blob stores file contents, nothing more. No filename, no permissions, no history. Two files with identical contents share one blob, regardless of where they live in the tree.
echo "hello" | git hash-object --stdin
# ce013625030ba8dba906f756967f9e9ca394464a
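To check the deduplication claim, the sketch below hashes two files with identical contents stored under different paths. The scratch directory and file names are hypothetical, and the expected ID assumes Git's default SHA-1 object format. Without -w, git hash-object only computes the ID and stores nothing.

```shell
# Scratch directory holding two files with identical contents (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/docs" "$tmp/src"
printf 'hello\n' > "$tmp/docs/a.txt"
printf 'hello\n' > "$tmp/src/b.txt"

# Without -w, hash-object only computes the object ID; nothing is written.
id_a=$(git hash-object "$tmp/docs/a.txt")
id_b=$(git hash-object "$tmp/src/b.txt")

echo "$id_a"
echo "$id_b"
rm -rf "$tmp"
```

Both lines should print the same ID as the echo example above, because the path plays no part in a blob's hash.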
Trees
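A tree is a directory listing: a sorted list of entries, each recording a mode, a name, and the hash of the object it names. The type column printed by git ls-tree is not stored separately; it follows from the mode. Subdirectories are themselves trees, recursively. Every commit points at exactly one root tree.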
git ls-tree HEAD
# 100644 blob a1b2... README.md
# 040000 tree c3d4... src
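The recursion can be seen without making a commit at all. This is a minimal sketch in a throwaway repository; the file names are hypothetical, and git write-tree is used only to turn the current index into tree objects.

```shell
# Throwaway repository with one file at the top level and one in a subdirectory.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
mkdir src
printf 'hello\n' > README.md
printf 'hello\n' > src/greeting.txt
git add README.md src/greeting.txt

# write-tree turns the index into tree objects and prints the root tree's ID.
root_tree=$(git write-tree)
git ls-tree "$root_tree"

# The src entry is itself a tree; resolve it and list it the same way.
src_tree=$(git rev-parse "$root_tree:src")
git ls-tree "$src_tree"
```

The first listing should show README.md as a blob and src as a tree, and the second should show the blob inside src.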
Commits
A commit object contains:
- A pointer to a root tree.
- Zero or more parent commits (zero for a root commit, one for an ordinary commit, two or more for a merge).
- Author and committer with timestamps.
- A message.
- Optional GPG/SSH signature.
git cat-file -p HEAD
# tree 9f1a...
# parent b2c3...
# author Ada <[email protected]> 1714300000 +0000
# committer Ada <[email protected]> 1714300000 +0000
#
# Add greeting
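The fields listed above can also be assembled by hand with the plumbing command git commit-tree. The following is a sketch under stated assumptions: the identity values and commit messages are placeholders, and the repository is a throwaway one.

```shell
# Throwaway repository; the identity below is a placeholder, not a real person.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

printf 'hello\n' > README.md
git add README.md
tree=$(git write-tree)

# A root commit: a tree and a message, but no parent.
root=$(echo "Add greeting" | git commit-tree "$tree")

# A second commit that reuses the same tree and names the root as its parent.
child=$(echo "Second commit" | git commit-tree "$tree" -p "$root")

git cat-file -p "$root"
git cat-file -p "$child"
```

Only the second output should contain a parent line. Neither commit is attached to a branch, since commit-tree creates the object and leaves refs untouched.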
Tags
An annotated tag is its own object pointing at another object (almost always a commit), with tagger info and a message. A lightweight tag is just a ref that points directly at its target; no tag object is created for it.
git cat-file -t v1.0.0
# tag
git cat-file -p v1.0.0
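The difference between the two kinds of tag shows up as soon as you ask for the object type. This sketch uses a throwaway repository; the tag names, tag message, and identity are placeholders.

```shell
# Throwaway repository with a single commit; the identity is a placeholder.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"
printf 'hello\n' > README.md
git add README.md
git commit -q -m "Add greeting"

git tag light-v1                       # lightweight: a ref only
git tag -a v1.0.0 -m "Release 1.0.0"   # annotated: creates a tag object

light_type=$(git cat-file -t light-v1)
annotated_type=$(git cat-file -t v1.0.0)
echo "light-v1 resolves to a $light_type"
echo "v1.0.0 resolves to a $annotated_type"

# Peeling the annotated tag leads back to the same commit.
peeled=$(git rev-parse "v1.0.0^{commit}")
```

The lightweight tag resolves straight to the commit, while the annotated tag resolves to a tag object that must be peeled to reach it.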
Putting it together
Walk a commit's tree manually:
git cat-file -p HEAD^{tree}
git cat-file -p HEAD^{tree}:src
git cat-file -p HEAD:README.md
The ^{tree} peel and the commit:path form are part of Git's revision syntax, so most commands that accept an object name accept them as well.
Storage
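Every object is addressed by the SHA-1 (or, in SHA-256 repositories, the SHA-256) of a short header followed by its uncompressed content. The header is the object type, a space, the content size in bytes, and a NUL byte. Header and content are then zlib-compressed for storage. Identical content, anywhere in history, hashes to the same ID and is therefore stored only once.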
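The object ID can be reproduced by hand, which makes the addressing scheme concrete. This sketch assumes a SHA-1 repository format and that the sha1sum utility is available (on some systems the equivalent is shasum); the scratch file is hypothetical.

```shell
# Scratch file with the same six bytes as the earlier echo "hello" example.
file=$(mktemp)
printf 'hello\n' > "$file"

# Header is "blob <size>" followed by a NUL byte, then the raw content.
size=$(wc -c < "$file" | tr -d ' ')
manual_id=$( { printf 'blob %s\000' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1 )

# Git's own answer for the same file, for comparison.
git_id=$(git hash-object "$file")

echo "manual: $manual_id"
echo "git:    $git_id"
rm -f "$file"
```

Both lines should show ce013625030ba8dba906f756967f9e9ca394464a. Compression plays no part in the ID; it is applied only when the object is written to disk.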
Tree entry modes
The mode field in a tree entry is a small set of POSIX-like file modes:
- 100644: regular non-executable file (blob).
- 100755: executable file (blob).
- 120000: symbolic link (blob whose content is the target).
- 040000: subdirectory (tree).
- 160000: gitlink (submodule reference to a commit SHA).
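git ls-tree HEAD scripts/
git update-index --chmod=+x scripts/run.sh
git ls-files --stage scripts/run.sh   # the index now records mode 100755
git commit -m "Make run.sh executable"
git ls-tree HEAD scripts/             # only after the commit does HEAD's tree show 100755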
Git is intentionally limited here: for regular files it records only whether the file is executable. Other permission bits and ownership are not stored.
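A quick way to confirm which modes Git records is to stage files with different on-disk permissions and inspect the index. This sketch assumes a filesystem that supports symlinks and the executable bit (the usual case on Linux and macOS); the file names are hypothetical.

```shell
# Throwaway repository with a private file, a script, and a symlink.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1

printf 'secret\n' > private.txt
chmod 600 private.txt          # owner-only permissions on disk
printf '#!/bin/sh\n' > run.sh
chmod 755 run.sh               # executable on disk
ln -s private.txt link         # symbolic link

git add private.txt run.sh link

# --stage prints the mode Git recorded for each path.
git ls-files --stage

private_mode=$(git ls-files --stage private.txt | cut -d' ' -f1)
run_mode=$(git ls-files --stage run.sh | cut -d' ' -f1)
link_mode=$(git ls-files --stage link | cut -d' ' -f1)
```

Under those assumptions private.txt should appear as 100644 even though its on-disk mode is 600, run.sh as 100755, and link as 120000.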
Common mistakes
- Believing Git stores diffs. It does not; it stores full snapshots, deduplicated by hash and later delta-compressed in pack files.
- Confusing trees with directories on disk. Trees are immutable objects.
- Mistaking lightweight tags for annotated ones. Only annotated tags carry metadata and signatures.
- Expecting that renaming a file changes the blob. The blob is the same; only the tree's name entry changes.
Spend ten minutes with git cat-file -p on a real repository and the model becomes second nature.
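The point about renames is easy to verify. The sketch below renames a file in a throwaway repository and compares object IDs before and after; the file names, commit messages, and identity are placeholders.

```shell
# Throwaway repository; the identity is a placeholder.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

printf 'hello\n' > README.md
git add README.md
git commit -q -m "Add greeting"

# Rename without touching the contents, then commit.
git mv README.md GREETING.md
git commit -q -m "Rename greeting"

old_blob=$(git rev-parse "HEAD~1:README.md")
new_blob=$(git rev-parse "HEAD:GREETING.md")
old_tree=$(git rev-parse "HEAD~1^{tree}")
new_tree=$(git rev-parse "HEAD^{tree}")

echo "blob before: $old_blob"
echo "blob after:  $new_blob"
echo "tree before: $old_tree"
echo "tree after:  $new_tree"
```

The two blob IDs should match while the two tree IDs differ, because the name lives in the tree entry rather than in the blob.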