Git: Understanding HEAD
HEAD is perhaps the single most confusing concept when learning git, so this short article aims to give a practical, non-rigorous explanation of what it is, and why you might sometimes want a detached HEAD.
What is HEAD?
To answer this question, we must remember that git works with branches. A branch does not hold a copy of the project, and it holds no information about what has changed; it only marks a state of things by pointing at a single commit, the latest one on that branch. The history lives in the commits: each commit records the state of the project at that moment along with a link to its parent commit, and git does not store unchanged files a second time, so editing one file leaves the stored copies of all the others untouched. Once a commit has been made, it is immutable and becomes part of the project’s history, and it can be used to roll back to past versions of files.
If you examine closely what git does, you will notice that the branch file itself contains just the checksum of a commit, reinforcing the notion that the branch is merely a state of things. This is also why checking out another branch is usually very fast: the project isn’t reloaded from scratch, git just jumps to another commit and updates only the files that differ.
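To see this for yourself, here is a minimal sketch that builds a throwaway repository and reads the branch file directly. The repository location, file name and identity settings are made up for the demonstration, and it assumes git's default on-disk layout, where a fresh branch lives under .git/refs/heads (git may later move refs into .git/packed-refs).

```shell
# Build a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

# The branch is a small file whose only content is a commit hash.
branch_file=$(cat .git/refs/heads/master)

# The same hash, this time resolved through git itself.
branch_hash=$(git rev-parse master)

echo "file: $branch_file"
echo "git:  $branch_hash"
```

Both lines should print the same hash, which is all a branch really is.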
Now with this in mind, the usual descriptions of HEAD will be easier to follow. It is often said that HEAD is a reference to the latest commit in your currently checked out branch. More precisely, HEAD normally points at the branch, which in turn points at that commit. In less technical words, HEAD is whatever you, the user, are looking at when inside a git repository.
Whatever HEAD points to will be the parent of the next commit you make.
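A short sketch of both claims, again in a throwaway repository with made-up file names and identity settings, and again assuming git's default on-disk layout: HEAD is a file that names the current branch, and the commit it resolved to becomes the parent of the next commit.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"

# HEAD is a plain file; while on a branch it names that branch.
cat .git/HEAD
head_target=$(git symbolic-ref HEAD)

# Remember where HEAD resolves to before committing again.
before=$(git rev-parse HEAD)

echo "two" >> file.txt
git commit -q -am "second commit"

# The parent of the new commit is the commit HEAD pointed to earlier.
parent=$(git rev-parse HEAD^)
echo "HEAD was $before, new commit's parent is $parent"
```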
Detached HEAD
When HEAD points directly to a commit, and not a branch label, you’ll see the following message:
You are in 'detached HEAD' state.
Any commits you make here won’t belong to any branch, and they will become unreachable once you check out something else. If you see this, don’t panic, you don’t need to rm -rf and git clone.
You can simply create a new branch (remember, branches are free!):
git checkout -b reattach-my-head
HEAD will now point to this newly created branch, and you can then decide whether the changes are worth keeping, or if you’d rather delete the branch along with the commits. In other words, if you have checked out a commit, HEAD is detached; if you have checked out a branch, HEAD is “attached”.
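The whole round trip can be sketched in a throwaway repository. The file names and identity settings below are invented for the demonstration; the check relies on git symbolic-ref, which fails when HEAD is not pointing at a branch.

```shell
# Throwaway repository with two commits on master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "two" >> file.txt
git commit -q -am "second commit"

# Checking out a commit id instead of a branch name detaches HEAD.
first=$(git rev-parse HEAD~1)
git checkout "$first"

# symbolic-ref only succeeds while HEAD names a branch.
if git symbolic-ref -q HEAD >/dev/null; then
    state_before="attached"
else
    state_before="detached"
fi

# Commit something while detached, then give it a home.
echo "experiment" > notes.txt
git add notes.txt
git commit -q -m "experiment in detached HEAD"
git checkout -q -b reattach-my-head

current=$(git symbolic-ref --short HEAD)
echo "before: $state_before, now on branch: $current"
```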
If you simply ignore the message and decide to create commits in detached HEAD state, nothing particularly terrible will happen from the point of view of the repository overall, but you will find it difficult to locate these commits once you switch away, and git will eventually get rid of them as part of its garbage collection cycle.
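Until garbage collection runs, such commits can usually still be found through the reflog, git's local record of where HEAD has been. The sketch below is one way to do it, assuming the reflog is enabled (the default for a normal working repository); the branch name rescued and the file names are made up.

```shell
# Throwaway repository with one commit on master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"

# Detach HEAD, commit, and walk away without creating a branch.
git checkout -q --detach
echo "stray work" > stray.txt
git add stray.txt
git commit -q -m "commit made while detached"
orphan=$(git rev-parse HEAD)
git checkout -q master

# HEAD@{1} is where HEAD pointed before the last checkout.
lost=$(git rev-parse 'HEAD@{1}')

# Pin the commit to a branch so it is reachable again.
git branch rescued "$lost"
echo "rescued commit: $lost"
```

Running git reflog on its own lists every recent position of HEAD, which helps when the lost commit is more than one step back.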
Why might you sometimes want a detached HEAD?
A common example of a purposefully detached HEAD is automated deployment. Git can act as the backbone of a deployment pipeline, especially in the case of script repositories, where there are no compilation steps. As we established earlier, a branch is the current state of things, which makes it rather unsuitable for deployment. The state of a branch can change at any moment with no warning (unless you are a solo dev…), but a commit is fixed: its id always refers to exactly the same content and the same history.
This makes commits perfectly suitable for deployments. Once we check out a specific commit, we will be in a detached HEAD state, but that is OK, since we just want to deploy a known state, regardless of whether the branch has since moved on.
For example, a simple script deployment system can latch onto git and check what merges have been made into master. In a lot of cases, master is the main source of truth, and all branches on the remote repository end up being merged into master. A new merge therefore implies that something should be deployed.
We can use git in the following way to check for this:
git log --merges -1 --format=%H master
At the time of execution, this will grab the latest merge into master. Assuming your repository isn’t getting absolutely hammered with merges, putting this into crontab and having it run every few minutes will pick up your latest merges. These commit ids can be saved into a database table for later reference, possibly with a datetime column like release_date that is set to NULL, to mark this commit as not yet deployed.
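A minimal sketch of such a polling step follows. To stay self-contained it builds a throwaway repository with one merge in it, and it uses a plain text file as a stand-in for the database table, one "commit_id release_date" pair per line. The file, the record_latest_merge function and the branch name feature are all hypothetical; a real setup would run an INSERT against whatever database you use.

```shell
# Throwaway repository containing a single merge into master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > a.txt
git add a.txt
git commit -q -m "initial commit"
git checkout -q -b feature
echo "two" > b.txt
git add b.txt
git commit -q -m "feature work"
git checkout -q master
git merge -q --no-ff -m "Merge feature" feature

# Stand-in for the database table (hypothetical flat file).
queue=$(mktemp)

record_latest_merge() {
    latest_merge=$(git log --merges -1 --format=%H master)
    # Nothing to do if master has no merges yet.
    [ -n "$latest_merge" ] || return 0
    # Only record a commit id the first time it is seen.
    if ! grep -q "^$latest_merge " "$queue"; then
        echo "$latest_merge NULL" >> "$queue"
    fi
}

# Simulate two cron runs; the second must not add a duplicate row.
record_latest_merge
record_latest_merge
cat "$queue"
```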
A second script, also running from crontab every few minutes, can then read this database table, get each commit id where release_date is NULL, and check out that commit to see what changes should be deployed:
git -c advice.detachedHead=false checkout -q "$COMMIT"
At this point, git will intentionally enter a detached HEAD state, because we want to ensure that we are deploying exactly what was merged, even if more commits have since landed on top of it on the master branch. This can also be used as a very simple rollback mechanism, by resetting release_date to NULL for any old commit in your database table.
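The deploying side could look roughly like the sketch below. As before, a plain text file with one "commit_id release_date" pair per line stands in for the database table, the repository is a throwaway one, and the actual deployment step is left as a comment; all of these are assumptions made for illustration, not a finished pipeline.

```shell
# Throwaway repository; master has moved on past the commit to deploy.
repo=$(mktemp -d)
cd "$repo" || exit 1
git -c init.defaultBranch=master init -q
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "one" > file.txt
git add file.txt
git commit -q -m "release candidate"
commit_to_deploy=$(git rev-parse HEAD)
echo "two" >> file.txt
git commit -q -am "later work on master"

# Stand-in for the database table (hypothetical flat file).
queue=$(mktemp)
echo "$commit_to_deploy NULL" > "$queue"

updated=$(mktemp)
while read -r commit release_date; do
    if [ "$release_date" = "NULL" ]; then
        # Deliberately detach HEAD at the exact commit that was recorded.
        git -c advice.detachedHead=false checkout -q "$commit"
        deployed=$(git rev-parse HEAD)
        # A real script would copy or sync the working tree here.
        echo "deploying $deployed"
        release_date=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    fi
    echo "$commit $release_date" >> "$updated"
done < "$queue"

# Replace the old table with the one carrying release dates.
mv "$updated" "$queue"
cat "$queue"
```

Rolling back is then a matter of putting NULL back in the release_date field of an older row and letting the next run pick it up.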
Conclusion
The detached HEAD message can seem scary on a first encounter, since git’s own terminology never mentions an “attached” HEAD, but once you understand what a branch represents, and what a commit represents, you may even find practical uses for a detached HEAD.