This post is actually the by-product of my research for another post about git (.git/, actually). I was so astonished by how many myths about the complexity of the git system were cleared from my mind that I decided to make it today’s post. To me the git system appears much simpler now than it did yesterday.
Commit
Commit is the central piece of the git system. The git world is simply a collection of commit objects, each of which holds a tree, which in turn holds references to other trees and blobs. The branches, the tags and HEAD are just fancy aliases for commits (more on these in some other post, maybe the next one).
A commit is basically a snapshot of the present working tree (more precisely, of what has been staged). I will spare the details for a future post (it’s worth it).
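For the impatient, here is a rough sketch of how that chain (commit, then tree, then blob) can be inspected in any repository. It assumes a throwaway repository in a temporary directory; the file name, identity and commit message are placeholders chosen for illustration only.

```shell
set -e
# Throwaway repository; every name below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello world!" > hello_world
git add hello_world
git commit -q -m "Initial commit"

# A commit object names exactly one tree...
commit=$(git rev-parse HEAD)
git cat-file -p "$commit"
tree=$(git rev-parse "HEAD^{tree}")

# ...the tree lists blobs (and sub-trees) together with their names and modes...
git ls-tree "$tree"
blob=$(git rev-parse "HEAD:hello_world")

# ...and the blob is nothing but the file content.
git cat-file -p "$blob"
```

Here ‘git cat-file -p’ pretty-prints an object and ‘git cat-file -t’ would report its type.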
Let’s now discuss what this post is about: revealing the secrets involved in every git user’s ritual of ‘commit’ing, by performing a commit manually.
This should reveal quite some details about the internal working of git (no you don’t need to run away, it’s not that deep).
You might already know the concepts, but knowing sex and having sex are kind of different things.
Ok! Let’s start the exercise for manual commit.
First we need to create an empty directory, call it ‘work’, and put some simple file in it.
=> mkdir work
=> cd work
=> echo "Hello world!" > hello_world
Initiate a git repo in it; we will add the ‘hello_world’ file to it shortly.
=> git init
We will keep an eye on the changes that happen in the ‘.git’ directory throughout our exercise. For now check out what’s saved in the HEAD.
=> cat .git/HEAD
ref: refs/heads/master
HEAD is basically just a reference to the commit currently associated with the working tree. Here it names refs/heads/master, so one might guess that .git/refs/heads/master would point to the tip of a branch. Let’s check it
=> ls .git/refs/heads/master
=> git branch
There is nothing in there. Since we have not made any commits yet, there are no branches (since branches are merely named references to commits).
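The same check can be scripted. The sketch below assumes a brand-new repository in a temporary directory and git’s default file-based ref storage; the default branch may be called ‘master’ or ‘main’ depending on your configuration, so the script reads the name rather than hard-coding it.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# HEAD is a plain text file holding a symbolic reference.
cat .git/HEAD
target=$(git symbolic-ref HEAD)

# The branch it names does not exist yet: there is no commit to point at.
if git rev-parse --verify --quiet "$target" >/dev/null; then
    born=yes
else
    born=no
fi
echo "HEAD -> $target (exists: $born)"

# Consequently there are no branches to list either.
branches=$(git branch --list)
```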
If you are feeling adventurous, you can try ‘git log’.
=> git log
fatal: bad default revision ‘HEAD’
Let’s now add our file to the staging area.
=> git add hello_world
Staging area is the middle system which keeps our content after ‘git add’ and before ‘git commit’.
This command converted the content of the ‘hello_world’ file into a blob and placed it in the index (aka staging area). A ‘blob’ is how our content is represented in git. You can check the .git dir: a new file, ‘index’, has been created. This file contains the references to all blobs and trees which get added to our staging area.
=> ls .git
branches config description HEAD hooks *index* info objects refs
A blob is git’s representation of a file. It’s not actually a file, but just the content. A blob does not have any name or other metadata. It’s referenced in trees, which contain the metadata for blobs.
A tree is the object which stores references to other trees and to blobs, the blobs being its leaf nodes.
At this point we would generally just ‘commit’ the ‘index’, but not this time. The ‘git commit’ command hides many details and is a great convenience. You’ll value it after this exercise.
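To see what ‘git add’ actually left behind on disk, something like the following could be tried. It is only a sketch in a throwaway temporary repository, and it relies on git storing a freshly added object as a “loose” file whose path is derived from its hash.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Hello world!" > hello_world

before=$(find .git/objects -type f | wc -l)
git add hello_world
after=$(find .git/objects -type f | wc -l)

# 'git add' wrote one new object: the blob. Its path is the first two
# characters of the hash as a directory, and the rest as the file name.
blob=$(git hash-object hello_world)
dir=$(printf '%s' "$blob" | cut -c1-2)
file=$(printf '%s' "$blob" | cut -c3-)
ls -l ".git/objects/$dir/$file"

# It also created the index, which now lists the blob.
git ls-files --stage
```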
Git stores all our content in the form of blobs. The blobs do not have any kind of metadata attached to them (like a name, creation date or something). They are just nameless ‘blobs’. To identify a blob, it is saved in a ‘tree’ as a leaf node. Different trees can hold references to the same blob with different metadata attached, but a git repository will have exactly one copy of a blob. This is one reason for git’s compact storage.
We can see the blob for our content in ‘hello_world’ present in the staging area (index)
=> git ls-files --stage
100644 802992c4220de19a90767f3000a79a31b98d0df7 0 hello_world
If you entered the same content as me, your hash and mine should be the same. We can check what type of object the above hash belongs to
=> git cat-file -t 802992c
The above blob is not referenced by any tree. It’s only referenced from .git/index (which stores references to the objects (blobs and trees) which make up our staging area).
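The “exactly one copy of a blob” claim is easy to try out. The sketch below stages the same content under two different names in a throwaway repository (‘another_name’ is a made-up file name) and compares the hashes the index records for them.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two files with different names but identical content.
echo "Hello world!" > hello_world
cp hello_world another_name

git add hello_world another_name
git ls-files --stage

# Both index entries carry the same hash, so only one blob is stored.
hash_a=$(git ls-files --stage hello_world | cut -d' ' -f2)
hash_b=$(git ls-files --stage another_name | cut -d' ' -f2)
objects=$(find .git/objects -type f | wc -l)

# The hash comes from the content alone; the file name plays no part.
hash_c=$(echo "Hello world!" | git hash-object --stdin)
```

The names live only in the index (and later in a tree), which is why two entries can share one blob.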
A ‘commit’ object in git holds a single tree. A tree may have references to more trees or blobs. So to ‘commit’ the above created blob of our content, we need a tree.
So we now need to create a tree.
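=> git write-tree
The ‘write-tree’ command makes a tree from the contents of the ‘index’ and prints its hash (the one abbreviated as cdbf8e1 below).
Now that we have the tree, let’s create a commit object with it.
=> echo "Initial commit" | git commit-tree cdbf8e1
The command prints a hash (a5a86835ba72e3ca7d5267c68c06c212392f9b7d in my case; yours will differ, because the commit also records your name, email and the time). That’s the hash for our commit object. You directly use ‘git commit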
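Copying hashes by hand gets tedious, so the two plumbing steps can also be strung together with shell variables. This is only a sketch: it runs in a throwaway temporary repository, and the identity set with ‘git config’ is a placeholder that ‘commit-tree’ needs in order to fill in the author and committer.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Hello world!" > hello_world
git add hello_world

# Capture the hashes instead of copying them by hand.
tree=$(git write-tree)
commit=$(echo "Initial commit" | git commit-tree "$tree")

# The tree holds the name and mode that the blob itself lacks.
git ls-tree "$tree"

# The commit records the tree, author, committer and message.
# A first commit has no 'parent' line.
git cat-file -p "$commit"
```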
So our commit object is ready, and we are done. Right? Not quite. What we have created is called an ‘unreachable commit’.
An unreachable commit is a commit which cannot be reached from any reference (such as the files in .git/refs/heads/), either directly or by following parent links from a referenced commit. Having no parents has nothing to do with it: that only makes ours the first (root) commit. Unreachable commits are eventually removed by git’s garbage collection.
To make our commit reachable, we need to create a reference in a file in .git/refs/heads.
=> echo a5a86835ba72e3ca7d5267c68c06c212392f9b7d > .git/refs/heads/hello
Actually we should instead use the safer way to update references in the git system.
=> git update-ref refs/heads/hello a5a86835ba72e3ca7d5267c68c06c212392f9b7d
Here if we had used the name ‘master’ instead of ‘hello’, we could have used ‘git log’. But for now it’s still giving ‘fatal:’ because HEAD is referring to ‘refs/heads/master’, which does not exist.
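A small sketch of why ‘update-ref’ is the safer route: unlike a bare ‘echo’, it checks that the hash names a real object before writing the reference. The repository, identity and the all-digits hash below are made up for the demonstration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Hello world!" > hello_world
git add hello_world
commit=$(echo "Initial commit" | git commit-tree "$(git write-tree)")

# Create the branch through git rather than by writing the file ourselves.
git update-ref refs/heads/hello "$commit"

# A branch is just a name for a commit hash.
git show-ref
pointed=$(git rev-parse refs/heads/hello)
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)

# update-ref refuses a hash that does not name an existing object.
if git update-ref refs/heads/bogus 1234567890123456789012345678901234567890 2>/dev/null; then
    bogus=accepted
else
    bogus=rejected
fi
echo "bogus hash was $bogus"
```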
Now try the command ‘git branch’.
=> git branch
Here we see what a branch actually is to git: a reference to a ‘commit’ object.
Wait, there is more to it. Now that we have created the branch ‘hello’, we need to make HEAD refer to it.
=> git symbolic-ref HEAD refs/heads/hello
This command associated our working tree with the newly created branch/commit of ours. Repointing HEAD like this is part of what normally happens on a checkout (a real checkout also updates the index and the working tree to match).
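Putting the last step in context, the following sketch runs the whole manual commit in a throwaway repository (placeholder identity again) and checks that ‘git log’ only starts working once HEAD has been repointed.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Hello world!" > hello_world
git add hello_world
commit=$(echo "Initial commit" | git commit-tree "$(git write-tree)")
git update-ref refs/heads/hello "$commit"

# HEAD still names the default branch, which has no commits, so 'git log' fails.
if git log --oneline >/dev/null 2>&1; then
    before=ok
else
    before=fails
fi

# Point HEAD at our hand-made branch.
git symbolic-ref HEAD refs/heads/hello

# Now the history is visible and the working tree counts as clean.
git log --oneline
head_commit=$(git rev-parse HEAD)
current=$(git symbolic-ref --short HEAD)
status=$(git status --short)
```

The working tree is reported as clean because the index, the committed tree and the file on disk all hold the same content.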
Now we can use ‘git log’. If you are using ‘zsh’ with an appropriate theme, at this point the git branch indicator will change from the uncommitted ‘master’ to the committed ‘hello’.
Now we are done. Officially. The git system is this frighteningly simple inside. I hope this was as helpful for you as it was for me. I really enjoyed writing this post.