Inside the .git directory
Introduction
The goal of this article is to demystify the .git directory a little bit. It’s not that hard to pick up a basic workflow with git: create a branch, change some files, commit and push. But people often feel intimidated by git once the task at hand goes beyond a few simple commands.
Note: This article does assume that the reader has some basic familiarity with git already.
Basic orientation
Before diving into it, let’s touch on a few important concepts:
- -> repository
Technically, the repository itself is the contents of the .git directory, not the files outside it. This may sound counter-intuitive, but the files and directories immediately outside of .git are actually the working directory, which is entirely optional for git. It’s possible to have a git repository with nothing but the contents of the .git directory. This is called a bare git repository, and it’s very common for remote repositories on servers to be bare repos.
- -> working directory
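These are the actual project files, outside of the .git directory. By default, this directory reflects the latest commit on the currently checked-out branch, plus any local modifications that have not been committed yet. Once git checkout is issued for another branch or commit, it’ll reflect the state of that commit instead.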
- -> staging area
Also known as the index, this is the file (.git/index) that tracks what will go into the next commit. When something in the working directory is modified, git knows that a change exists, and it is up to the user to issue a git add to move it to the staging area. That’s when the added files are flagged for inclusion in the next commit. Basically, think of it as “the proposed contents of the next commit”.
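To make these three concepts tangible, here is a minimal sketch that can be run in a scratch directory. The names work, bare.git and notes.txt are made up for illustration, and everything happens inside a temporary directory.

```shell
#!/bin/sh
# Sketch: repository vs. working directory vs. staging area.
set -e
demo=$(mktemp -d)
cd "$demo"

# A regular repository: .git is the repository, the rest is the working directory.
git init -q work
cd work
echo "hello" > notes.txt

# The file exists only in the working directory, so git reports it as untracked.
before=$(git status --porcelain)
echo "before git add: $before"

# After git add, the file is part of the proposed next commit (the staging area).
git add notes.txt
after=$(git status --porcelain)
echo "after git add:  $after"

# A bare repository: only the repository, with no working directory.
cd "$demo"
git init -q --bare bare.git
is_bare=$(git -C bare.git rev-parse --is-bare-repository)
echo "bare.git is bare: $is_bare"
ls bare.git
```

The listing of bare.git at the end should look very similar to the contents of a regular .git directory, since that is essentially what a bare repository is.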
The .git directory
As established, this directory is the actual repository, so let’s dive into one:
root@linuxpc:/datadisk/git/repo01# (master) ls .git
branches hooks info logs objects refs COMMIT_EDITMSG config description HEAD index ORIG_HEAD packed-refs
Let’s unpack these one by one, in no particular order:
- -> config
This is a text file that contains the configuration of the current repository. This is where a repository can be set to bare, where the URL of the remote repository is configured, where branch tracking rules live (which local branch maps to which remote branch), and where any local overrides of the user settings (such as name and email) go.
This is not an exhaustive list, and there’s a lot of other fun stuff like sparse-checkout (change the working directory from having all tracked files present to having just a subset).
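As a small illustration, the following sketch shows that git config and git remote add (without --global) simply edit this file. The user name, email and remote URL are placeholders, and no network connection is made.

```shell
#!/bin/sh
# Sketch: repository-level settings end up in .git/config.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Local overrides for this repository only (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# Adding a remote only records its URL and fetch rule; nothing is contacted.
git remote add origin https://example.com/demo.git

# The same values can be read from the file or through git config.
cat .git/config
bare=$(git config --get core.bare)
url=$(git config --get remote.origin.url)
echo "bare: $bare, origin: $url"
```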
- -> HEAD
This will either contain a reference to a branch (for example ref: refs/heads/master), or it will directly contain a commit ID. This entirely depends on whether the user is on a branch (i.e. git status says On branch <name>) or in a detached HEAD state, pointing directly to a commit.
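Both states of HEAD can be observed with a quick experiment. This is only a sketch in a throwaway repository; the demo identity passed with -c is a placeholder so that the commit works without any global configuration.

```shell
#!/bin/sh
# Sketch: HEAD on a branch vs. detached HEAD.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# On a branch, HEAD is a symbolic reference to that branch.
on_branch=$(cat .git/HEAD)
echo "on a branch: $on_branch"

# Detached, HEAD contains the commit ID itself.
git checkout -q --detach
detached=$(cat .git/HEAD)
commit_id=$(git rev-parse HEAD)
echo "detached:    $detached"
```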
- -> description
A small text file where a description of what the repository represents can be given. Web frontends such as GitWeb display it.
- -> info
The info directory is less commonly used, but, for example, if the earlier mentioned sparse checkout feature is used (e.g. via git config core.sparseCheckout true), then a file called sparse-checkout can be created inside it, listing patterns for the files and directories that should be present in the working directory.
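A minimal sketch of that manual sparse checkout workflow is shown below. It assumes the classic (non-cone) pattern mode and uses made-up docs and src directories; newer git versions also offer a dedicated git sparse-checkout command, which is not used here.

```shell
#!/bin/sh
# Sketch: manual sparse checkout through .git/info/sparse-checkout.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir docs src
echo "guide" > docs/guide.txt
echo "code" > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Enable the feature and list what should stay in the working directory.
git config core.sparseCheckout true
mkdir -p .git/info
echo "docs/" > .git/info/sparse-checkout

# Re-read HEAD so the working directory is updated to match the patterns.
git read-tree -mu HEAD

ls             # only docs should be left in the working directory
git ls-files   # both files are still tracked in the repository
tracked=$(git ls-files | wc -l)
```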
- -> hooks
This is the first directory with some more significant substance to it. Scripts placed here are run by git before or after certain actions, such as committing, merging or pushing.
Git usually comes with a few samples, to give an idea of the type of hooks that are possible:
root@linuxpc:/datadisk/git/repo01# (master) ls .git/hooks
applypatch-msg.sample fsmonitor-watchman.sample pre-applypatch.sample pre-merge-commit.sample pre-push.sample pre-receive.sample update.sample
commit-msg.sample post-update.sample pre-commit.sample prepare-commit-msg.sample pre-rebase.sample push-to-checkout.sample
Sidenote on git hooks
There are a few ways I find these useful. I’ll give two brief examples:
- -> hide or remove a variable relevant to my local/test environment before it gets committed and pushed to the remote (pre-commit hook):
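#!/bin/bash
# pre-commit: strip a local-only setting from staged .cnf files.
SETTING_REGEX='^(enforce_read_only_slaves=).+$'

# Read the staged paths line by line, so names with spaces survive.
git diff --cached --name-only | while IFS= read -r FILE; do
    if [[ "$FILE" == *.cnf && -f "$FILE" ]]; then
        # Delete matching lines in place (GNU sed), then stage the file again.
        sed -Ei "/${SETTING_REGEX}/d" "$FILE"
        git add -- "$FILE"
        echo "$FILE re-added due to setting removal..."
    fi
done
exit 0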
It looks through the staged .cnf files, deletes every line matching the regex, and stages the files again before the commit is made. Note that the line is removed from the working copy as well.
- -> fix directory ownership after basing a deploy on git pull (post-merge hook):
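#!/bin/bash
# post-merge: reset ownership of the files changed by a merge (e.g. git pull).
OWNER="apache:apache"
DIR_PATH="/etc/httpd/htdocs/<application directory>"
cd "$DIR_PATH" || exit 1
# A hook may inherit GIT_DIR from git; unset it so the repository is found from DIR_PATH.
unset GIT_DIR
# List the paths that differ between the previous and the current HEAD.
git diff-tree -r --name-only --no-commit-id 'HEAD@{1}' HEAD |
while IFS= read -r file; do
    [ -e "$file" ] && chown "$OWNER" "$file"
done
echo "post-merge ownership processing done..."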
It basically gets a list of files changed during git pull, and resets their ownership status.
If a hook turns out to be useful across many git repositories, it can be put into a template directory such as $HOME/.git_template/hooks/. Once git is pointed at that directory (git config --global init.templateDir ~/.git_template), the hook is copied into every newly created or cloned repository.
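The template mechanism can be tried out without touching the global configuration. In this sketch a temporary directory stands in for the template directory, the hook body is a trivial placeholder, and the one-off --template option of git init is used instead of a global setting.

```shell
#!/bin/sh
# Sketch: hooks from a template directory are copied into new repositories.
set -e
demo=$(mktemp -d)
template="$demo/git_template"   # stand-in for a real template directory
mkdir -p "$template/hooks"

# A placeholder hook; a real one would do something useful.
cat > "$template/hooks/pre-commit" <<'EOF'
#!/bin/sh
echo "pre-commit hook from the template ran"
EOF
chmod +x "$template/hooks/pre-commit"

# --template applies to this single git init only.
git init -q --template="$template" "$demo/newrepo"
ls -l "$demo/newrepo/.git/hooks"
```

Existing repositories are not affected by a template directory; the hook would have to be copied into their .git/hooks manually.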
With that note out of the way, let’s continue with the rest of the .git directory:
- -> objects
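This directory essentially stores all of the repository’s content as a file-system based key-value store. The key is the hash (object ID) of an object: its first two characters name a sub-directory, and the remaining characters name the file inside it. The value is the compressed content of that file.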
The value can be of four main types: commit, tree, blob and tag.
Each of these is worth talking about on its own merit, as there’s quite a lot to unpack here.
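Before looking at each type, the key-value idea can be checked by hand. The sketch below writes a small piece of content into the object store with git hash-object and then locates the resulting file; the content "hello" is arbitrary.

```shell
#!/bin/sh
# Sketch: an object is stored under a path derived from its ID.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store some content and get back its key (the object ID).
id=$(echo "hello" | git hash-object -w --stdin)
echo "object ID: $id"

# First two characters: sub-directory. Remaining characters: file name.
dir=$(printf '%s' "$id" | cut -c1-2)
file=$(printf '%s' "$id" | cut -c3-)
ls -l ".git/objects/$dir/$file"

# The stored file is compressed, so git cat-file is used to read it back.
type=$(git cat-file -t "$id")
content=$(git cat-file -p "$id")
echo "type: $type, content: $content"
```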
Commit
Let’s check a commit so that we can explore this:
commit 40db5b2c3155e0056d3f2fb38a672ad2eb29bb87 (HEAD -> master, origin/master)
Author: root
Date: Sat Oct 4 22:44:13 2025 +0200
remove redundant explainer
So a commit with hash 40db5b2c3155e0056d3f2fb38a672ad2eb29bb87 was made. We can use git cat-file to inspect the object:
root@linuxpc:/datadisk/git/repo01# (master) git cat-file -p 40db5b2c3155e0056d3f2fb38a672ad2eb29bb87
tree 3881df217cd25cf597f86e71ffa8c5c210e65b6b
parent bf3da7913687524f683906bc6cb6420515bc5725
author root 1759610653 +0200
committer root 1759610653 +0200
remove redundant explainer
What this tells us is that a commit object is just a pointer to a tree object with some additional information, like the parent commit, author, committer, commit time, commit message, and signature, if there is one. Here, the commit points to the tree 3881df217cd25cf597f86e71ffa8c5c210e65b6b.
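The same structure can be reproduced in a throwaway repository. The following sketch creates two commits and checks that the tree and parent lines of the second commit object match what git rev-parse reports; the helper function commit and the demo identity are only there to keep the example self-contained.

```shell
#!/bin/sh
# Sketch: a commit object points to a tree and to its parent commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

echo "v1" > README
git add README
commit -m "first"
echo "v2" > README
commit -am "second"

# Print the raw commit object, then pick out its first two lines.
git cat-file -p HEAD
tree_line=$(git cat-file -p HEAD | sed -n '1p')
parent_line=$(git cat-file -p HEAD | sed -n '2p')

# Compare with what git resolves for the tree and the parent of HEAD.
expected_tree=$(git rev-parse 'HEAD^{tree}')
expected_parent=$(git rev-parse 'HEAD^')
```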
Let’s explore this tree.
Tree
root@linuxpc:/datadisk/git/repo01# (master) git cat-file -p 3881df217cd25cf597f86e71ffa8c5c210e65b6b
100644 blob 1b58029ebb69d4119920101f0ef9367c20879815 LICENSE
100644 blob 0c03f81996732db9c2b469b3c37cff1b0591df8b Makefile
100644 blob 02b7ee29dd0d4f07811e8fd553857c962c86ff16 README
100644 blob f97a69b8497848184d7f1a2dfc39b0116d790e00 compat.h
...
The output lists all the files in the top-level directory of the project at the time of the commit. That is pretty accurate as to what a tree generally is: a snapshot of a directory listing. If the project contains sub-directories, each of them shows up as an entry of type tree, pointing to another tree object.
What all of this shows us so far is that a commit isn’t really a patch, or a set of changes, but actually a complete snapshot of the entire project at a specific point in time.
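The snapshot idea is easy to verify. In the sketch below (file names are made up), the second commit changes only README, yet its tree still lists every file, and the unchanged file points to exactly the same blob as before.

```shell
#!/bin/sh
# Sketch: every commit is a full snapshot, and sub-directories are nested trees.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

mkdir src
echo "int main(void) { return 0; }" > src/main.c
echo "readme v1" > README
git add .
commit -m "first"

echo "readme v2" > README
commit -am "second: only README changed"

# Top-level tree of the latest commit: a blob for README, a tree for src.
git ls-tree HEAD

# The snapshot still contains both files, not just the changed one.
files_in_snapshot=$(git ls-tree -r --name-only HEAD | wc -l)

# The unchanged file is represented by the same blob in both commits.
old_blob=$(git rev-parse 'HEAD~1:src/main.c')
new_blob=$(git rev-parse 'HEAD:src/main.c')
```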
Blob
In the above tree listing, we can see multiple “blobs”. Basically, a blob object stores only the content of a file, without metadata like the filename, extension, timestamps, or permissions. That metadata is not stored here because it’s either already present in the tree (the name and the mode), or doesn’t need to be tracked by git at all, such as creation timestamps.
Looking into a blob, we can basically recover the file at the time of the relevant commit:
root@linuxpc:/datadisk/git/repo01# (master) git cat-file -p 0c03f81996732db9c2b469b3c37cff1b0591df8b | sed 8q
.POSIX:
NAME =
VERSION =
# paths
PREFIX = /usr/local
MANPREFIX = ${PREFIX}/man
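One consequence of blobs carrying no filename is that two files with identical content share a single blob. A small sketch with made-up file names:

```shell
#!/bin/sh
# Sketch: identical content is stored once, whatever the files are called.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "same content" > a.txt
cp a.txt b.txt
git add a.txt b.txt

# ":path" names the blob that is staged for that path.
id_a=$(git rev-parse :a.txt)
id_b=$(git rev-parse :b.txt)
echo "a.txt -> $id_a"
echo "b.txt -> $id_b"

# The blob itself holds only the content.
content=$(git cat-file -p "$id_a")
```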
Tag
Tags are pointers to other git objects, most commonly to commits. An annotated tag is an object of its own, with additional metadata like the tagger information, a date, and a message; a lightweight tag is just a reference to a commit.
The main idea of tags is that they are (generally speaking) static. Once a commit is tagged, the label sticks with it forever. This can be good if some deployment process outside of git needs to hook into git, in a way that it can track what happened independently of git internals.
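The difference between a plain tag and a tag object shows up when asking git for the object type. This sketch uses made-up tag names and a demo identity for the tagger.

```shell
#!/bin/sh
# Sketch: lightweight tag (just a reference) vs. annotated tag (a tag object).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "release candidate"

# Lightweight: the tag name refers directly to the commit.
git tag v1.0-light
# Annotated: a separate tag object is created, carrying tagger and message.
git -c user.name=demo -c user.email=demo@example.com \
    tag -a v1.0 -m "first release"

light_type=$(git cat-file -t v1.0-light)
annotated_type=$(git cat-file -t v1.0)
echo "v1.0-light is a $light_type, v1.0 is a $annotated_type"

# The tag object itself: target object, type, tag name, tagger, message.
git cat-file -p v1.0

# Both tags lead to the same commit in the end.
target=$(git rev-parse 'v1.0^{commit}')
head=$(git rev-parse HEAD)
```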
- -> refs
The best way to understand this directory is to think about what some of the basic git commands actually do for us.
When a new repository is cloned, we know that it results in a local repository with a default master branch, which is linked to a remote master branch, with some remote connection “origin”, and all of the remote branches.
Initially, the working directory of the repo would only contain a copy of the files that are present in the remote master branch. Although the contents of the remote branches are cloned and present in the local repo, their file contents aren’t initially visible.
What this tells us is that, on top of keeping track of local branches, there must be some way to keep track of, and access the contents of, remote branches.
This is where references (refs) come into the picture.
As mentioned before, cloning a repo creates an “origin” pointing back to the remote repository. Git stores these remote-tracking branches as references, and updates the config file accordingly.
So basically, we can link all of this back to commands we’re already familiar with:
git branch -> basically list out the contents of .git/refs/heads/ and its sub-directories
git branch -r -> basically list out the contents of .git/refs/remotes/ and its sub-directories
git tag -l -> .git/refs/tags/
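The main takeaway is that, in a simplified way, git commands just walk through the contents of those directories to know what branches, tags, and remote branches are available. (Simplified, because references can also be stored in the packed-refs file seen in the listing earlier.)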
We can always take a look manually:
root@linuxpc:/datadisk/git/repo01# (master) cat .git/refs/heads/master
40db5b2c3155e0056d3f2fb38a672ad2eb29bb87
What this shows is that a branch is just a text file containing the commit ID it currently points to. So, basically:
branch -> points to a commit ID, which effectively represents the current state of things in that branch.
commit -> a snapshot of the project at a particular moment in time. When a commit is created, a new commit object is added, and the branch reference is updated to point to it.
Each commit contains a parent, pointing to the previous commit, which is how changes over time can be delved into.
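This can be observed directly. The sketch below assumes a fresh repository where references are still stored as individual files (they have not been packed); the branch name feature and the demo identity are placeholders.

```shell
#!/bin/sh
# Sketch: a branch is a file containing a commit ID, updated on every commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

commit --allow-empty -m "first"
head_id=$(git rev-parse HEAD)

# Creating a branch creates a file under .git/refs/heads/.
git branch feature
feature_ref=$(cat .git/refs/heads/feature)
echo "feature -> $feature_ref"

# All references with the commits they point to.
git for-each-ref --format='%(refname) %(objectname)'

# A new commit on the branch rewrites that file with the new commit ID.
git checkout -q feature
commit --allow-empty -m "second"
new_ref=$(cat .git/refs/heads/feature)
parent=$(git rev-parse 'feature^')
echo "feature -> $new_ref (parent: $parent)"
```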