Mastering git branches

By Henrique Mota

Branches are one of the most misunderstood concepts in git, and yet they are so simple to understand. How many uncomfortable situations have you been in when dealing with branches? All the rebases and merges your boss asked for, all those conflicts that appeared along the way.

The good news is that after you read this post you will understand branches for good, and your stress levels will go down.

We are going to start with an image of a git graph.

image showing 3 branches

Looking at this image, it’s undeniable that we are looking at 3 branches. But the questions are: What is the Branch 1 branch? What is the master branch? What is the Branch 2 branch? How are they stored?

By definition a branch is any deviation, and we are indeed in the presence of 3 deviations. But for git a branch is not simply a deviation; it is even simpler than that.

You know the answer if you already read my first post about git. If you didn’t, let me shock you a little bit: a branch, in a nutshell, is just a pointer to a single commit. Of course, this pointer is only useful for maintaining the deviations that we call branches. From now on in this post I will say branch when I mean the deviation and branch reference when I mean the pointer itself.

image showing what is a branch reference
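If you want to see the pointer with your own eyes, here is a small sketch. It assumes a throwaway repository created in a temporary directory and the default "files" ref storage, where a fresh branch reference is a loose file under .git/refs/heads (git can later pack refs into .git/packed-refs, so don’t rely on the file always being there). The user name and email are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master, as in this post
git config user.name "Demo"
git config user.email "demo@example.com"

echo "first line" > file.txt
git add file.txt
git commit -q -m "1"

# The branch reference is a small file holding exactly one commit hash.
ref_file_content=$(cat .git/refs/heads/master)
commit_hash=$(git rev-parse master)
echo "ref file: $ref_file_content"
echo "commit:   $commit_hash"

# Creating a branch only creates one more pointer to the same commit.
git branch branch-1
branch_1_hash=$(git rev-parse branch-1)
echo "branch-1: $branch_1_hash"
```

All three lines should print the same hash: nothing was copied, only a second pointer was created.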

This reference has other characteristics that we are going to see next, but the simple fact that a branch is a pointer is what makes advanced operations like merges and rebases so seamless.

How can a single pointer be enough? The answer is in the way git stores the connection between commits. Each commit references its ancestor, so the references point in the opposite direction of the graph, as shown in the image below:

image showing how a commit references an ancestor

As you can see, from a reference to a single commit we can reconstruct the whole tree by visiting its ancestors, pretty sweet. By visiting the ancestors of each branch, you can tell that the last common commit is number 4.
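To make this concrete, here is a sketch that rebuilds a simplified version of the graph (only master and branch-1) in a temporary repository and asks git to walk the ancestors. The `commit` helper and the file names are my own invention for the demo; the commit messages are the numbers from the image.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Demo helper: append to a file named after the current branch and commit it.
commit() {
  echo "$1" >> "$(git symbolic-ref --short HEAD).txt"
  git add .
  git commit -q -m "$1"
}

for n in 1 2 3 4; do commit "$n"; done
git checkout -q -b branch-1
commit 5
commit 7
git checkout -q master
commit 6
commit 8

# Each commit object stores the hash of its parent.
parent_line=$(git cat-file -p branch-1 | grep '^parent')
echo "commit 7 has: $parent_line"

# Following the parents from branch-1 visits 7, 5, 4, 3, 2, 1.
ancestor_count=$(git rev-list --count branch-1)

# The last common commit of the two branches.
base_subject=$(git log -1 --format=%s "$(git merge-base branch-1 master)")
echo "last common commit: $base_subject"   # prints 4
```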

Something is missing here, right? OK, a branch is a reference to a single commit, but what happens to the reference when we commit or do a hard reset inside a branch? Well, both of those actions also move the branch pointer.

Starting with the commit, let’s imagine we want to commit a change to branch-1:

> git checkout branch-1
> echo "add one more line" >> existing_file.something
> git add existing_file.something
> git commit -m "Add a change"

Describing the actions as if I were talking to git:

  • “Hey go to branch-1”
  • “Add one more line to an existing file”
  • “Put those changes in the staging area”
  • “Given that we are in branch-1, commit the staged changes and move the branch-1 reference to the new commit”
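The four steps above can be checked in a throwaway repository. This is only a sketch: the repository, the first commit and the placeholder identity are assumptions made for the demo, while the file name and commit message are the ones used above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "first line" > existing_file.something
git add existing_file.something
git commit -q -m "1"
git checkout -q -b branch-1

before=$(git rev-parse branch-1)      # where the reference points now

echo "add one more line" >> existing_file.something
git add existing_file.something
git commit -q -m "Add a change"

after=$(git rev-parse branch-1)       # the reference moved to the new commit
parent=$(git rev-parse branch-1^)     # the new commit points back to the old one
master_now=$(git rev-parse master)    # master was not touched

echo "before: $before"
echo "after:  $after"
```

Only the reference of the branch we are on moves; master still points to the old commit.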

On the other hand, it’s even easier to reason about a hard reset:

> git checkout branch-1
> git reset --hard HEAD~1

This is even simpler to put in words:

“Given that we are in branch-1, forget about the current commit, point to the previous commit instead, and make the working files match it.”
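Here is a minimal sketch of that sentence, again in a temporary repository with made-up commits. It also shows that the commit we "forgot" is not deleted right away: it is just no longer referenced by the branch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "1" > file.txt; git add file.txt; git commit -q -m "1"
git checkout -q -b branch-1
echo "2" >> file.txt; git add file.txt; git commit -q -m "2"
echo "3" >> file.txt; git add file.txt; git commit -q -m "3"

before=$(git rev-parse branch-1)        # commit "3"
expected=$(git rev-parse branch-1~1)    # commit "2"

git reset -q --hard HEAD~1

after=$(git rev-parse branch-1)
subject=$(git log -1 --format=%s branch-1)
echo "branch-1 now points to commit: $subject"   # prints 2

# The old commit object still exists; only the pointer moved.
old_type=$(git cat-file -t "$before")
```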

In the next sections we are going to see the two main operations that can be done when dealing with branches, namely rebase and merge.

Rebase is the operation we use when we want to rewrite history. This is not clear from the operation’s name, but that is the reality. In this section I am going to show the use of rebase for two different needs:

  • Rewrite a commit message, plus squash, remove or reorganise commits. This is when you intentionally want to change the history between the deviation point and the branch reference, which is to say within the entire branch.
  • Change the point where the branch started, the deviation point. As you are going to see, when we change the deviation point the whole history of the branch must change, because the starting point is different.

this is the scenario we are going to start with

Let’s start with the simplest scenario: changing the base of our branch. Our goal here is to make branch-1 start from the master reference (commit 8). Imagine that the numbers in the commits are the commit messages.

To do this we only need to specify a commit or a branch reference, and all the commits of the branch will be re-applied on top of that base commit.

> git log --pretty --graph --oneline --all
* 34d036a (HEAD -> branch-1) 7
* 982b9cc 5
| * cde59d7 (master) 8
| * 4323998 6
|/
* 98e63b8 4
* b9cb989 3
* adbf8d4 2
* d42dc27 1
> git checkout branch-1
> git rebase master
> git log --pretty --graph --oneline --all
* eee7e1f (HEAD -> branch-1) 7
* a9fd961 5
* cde59d7 (master) 8
* 4323998 6
* 98e63b8 4
* b9cb989 3
* adbf8d4 2
* d42dc27 1

You can see that the hashes of commits 5 and 7 are now different. That’s perfectly normal: the ancestor of commit 5 changed, and because a commit is immutable, git has to create a new one. The same applies to the commits that follow it.
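The rebase above can be reproduced with the sketch below. The hashes you get will differ from the ones in this post, so the script compares them instead of printing fixed values. The `commit` helper and file names are demo assumptions; each branch writes to its own file so the rebase never conflicts.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Demo helper: append to a file named after the current branch and commit it.
commit() {
  echo "$1" >> "$(git symbolic-ref --short HEAD).txt"
  git add .
  git commit -q -m "$1"
}

for n in 1 2 3 4; do commit "$n"; done
git checkout -q -b branch-1
commit 5
commit 7
git checkout -q master
commit 6
commit 8

old_tip=$(git rev-parse branch-1)

git checkout -q branch-1
git rebase -q master

new_tip=$(git rev-parse branch-1)
new_base=$(git merge-base branch-1 master)          # now the master commit itself
subjects=$(git log --format=%s master..branch-1)    # messages survive: 7 then 5

echo "old tip: $old_tip"
echo "new tip: $new_tip"
```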

Let’s now try the scenario where we change the history of our branch branch-1.

Say that we want to rename the commit with the message 5 to 20. The message is of course up to the git user; this is just for the sake of the example. Since we intend to change a specific commit, we have to use rebase in interactive mode. In interactive mode we set the base for our branch (let’s keep the same one, commit 4), and our text editor opens with a list where we choose the action to apply to each commit, from the base forward.

> git log --pretty --graph --oneline --all
* 34d036a (branch-1) 7
* 982b9cc 5
| * cde59d7 (HEAD -> master) 8
| * 4323998 6
|/
* 98e63b8 4
* b9cb989 3
* adbf8d4 2
* d42dc27 1

So let’s do the interactive rebase:

> git checkout branch-1
> git rebase -i 98e63b8
pick 982b9cc 5
pick 34d036a 7

As you can see, pick is the default action for each commit; it just keeps the commit as it is. If we want, we can change the action of each commit to:

  • r/reword, keep the commit but change its message
  • e/edit, use the commit but stop for amending
  • s/squash, meld into the previous commit
  • f/fixup, like “squash” but discard this commit’s log message
  • x/exec, run a command (the rest of the line) using the shell
  • d/drop, remove the commit

In this example we are going to use reword, but I encourage you to experiment with the other actions in a toy project.

r 982b9cc 5
pick 34d036a 7

Then an editor prompts you for the new message:

20

Running the log again, we see the changes:

* ced67cf (HEAD -> branch-1) 7
* 165bf17 20
| * cde59d7 (master) 8
| * 4323998 6
|/
* 98e63b8 4
* b9cb989 3
* adbf8d4 2
* d42dc27 1

We’ve already seen this pattern: every time a commit needs to change, a new commit has to be created, and if a new commit is created it affects the following ones as well.
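If you want to play with this in a toy project without typing in an editor, the reword can be scripted. This is a sketch under a few assumptions: git reads the GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables to decide which "editor" to run, and here both are replaced by tiny commands. The set-message.sh helper is hypothetical and only exists for this demo, as does the simplified history (master with commits 1 to 4, branch-1 with 5 and 7).

```shell
set -e
work=$(mktemp -d)
mkdir "$work/repo"
cd "$work/repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

for n in 1 2 3 4; do
  echo "$n" >> master.txt; git add .; git commit -q -m "$n"
done
git checkout -q -b branch-1
for n in 5 7; do
  echo "$n" >> branch-1.txt; git add .; git commit -q -m "$n"
done

# Hypothetical helper: overwrite the message file git hands to the editor.
cat > "$work/set-message.sh" <<'EOF'
echo "20" > "$1"
EOF

base=$(git rev-parse master)   # commit 4, the base we keep
old_tip=$(git rev-parse branch-1)

# The first command turns the first "pick" of the todo list into "reword";
# the second one answers the prompt for the new message.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/reword/'" \
GIT_EDITOR="sh $work/set-message.sh" \
git rebase -q -i "$base"

subjects=$(git log --format=%s "$base"..branch-1)
new_tip=$(git rev-parse branch-1)
echo "$subjects"
```

The log should now show 7 on top of 20, and the tip hash changes even though only the older commit was reworded.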

Merging is completely different from rebasing and, in my opinion, simpler. Imagine that we want to reflect the changes of master in branch-1 but, unlike with a rebase, we don’t want to change the existing commits up to 7. Well, that is the perfect scenario to use merge.

this is the scenario to rebase

Merge comes in two flavours, fast-forward and non fast-forward. A fast-forward simply moves the pointer of the current branch to the tip of the branch we want to merge. It is not always possible, and the scenario above is the proof: you cannot move branch-1 to the commit referenced by master, because branch-1 would lose commits 5 and 7. If you rebased branch-1 on top of master, though, you could fast-forward merge branch-1 into master.

git graph after rebase of branch-1 on the top of master

It is then intuitive that to merge branch-1 into master we only need to move the master reference to commit 7.

> git checkout master
> git merge branch-1 --ff

git graph after merge fast-forward
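The sketch below walks through the whole sequence in a temporary repository: rebase branch-1 on top of master, then fast-forward master. I use --ff-only here so that git refuses to do anything other than move the pointer. The `commit` helper and file names are demo assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Demo helper: append to a file named after the current branch and commit it.
commit() {
  echo "$1" >> "$(git symbolic-ref --short HEAD).txt"
  git add .
  git commit -q -m "$1"
}

for n in 1 2 3 4; do commit "$n"; done
git checkout -q -b branch-1
commit 5
commit 7
git checkout -q master
commit 6
commit 8

git checkout -q branch-1
git rebase -q master            # branch-1 now starts from commit 8

git checkout -q master
git merge -q --ff-only branch-1 # only the master pointer moves

master_tip=$(git rev-parse master)
branch_tip=$(git rev-parse branch-1)
merge_commits=$(git rev-list --merges --count master)
echo "merge commits created: $merge_commits"   # prints 0
```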

But now let’s remember our first scenario:

this is the scenario to rebase

Let’s try to merge master into branch-1 with fast-forward only, to see what happens:

> git merge master --ff-only
fatal: Not possible to fast-forward, aborting.

As expected, it is not possible. Let’s now try without forcing fast-forward:

> git merge master
> git log --pretty --graph --oneline --all
*   9a4eb5c (HEAD -> branch-1) Merge branch 'master' into branch-1
|\
| * cde59d7 (master) 8
| * 4323998 6
* | 34d036a 7
* | 982b9cc 5
|/
* 98e63b8 4
* b9cb989 3
* adbf8d4 2
* d42dc27 1

As you can see, a new commit was created, and this commit has a peculiarity: it has 2 ancestors. It is called a merge commit.
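You can inspect the two ancestors yourself. In this sketch (temporary repository, demo `commit` helper, placeholder identity) I pass --no-edit so that git accepts the default merge message without opening an editor; the ^1 and ^2 suffixes select the first and second parent of a commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Demo helper: append to a file named after the current branch and commit it.
commit() {
  echo "$1" >> "$(git symbolic-ref --short HEAD).txt"
  git add .
  git commit -q -m "$1"
}

for n in 1 2 3 4; do commit "$n"; done
git checkout -q -b branch-1
commit 5
commit 7
git checkout -q master
commit 6
commit 8
git checkout -q branch-1

old_tip=$(git rev-parse branch-1)   # commit 7

# The branches diverged, so a fast-forward is refused.
if git merge --ff-only master >/dev/null 2>&1; then
  ff_result="possible"
else
  ff_result="refused"
fi

git merge -q --no-edit master

first_parent=$(git rev-parse HEAD^1)    # commit 7, where branch-1 was
second_parent=$(git rev-parse HEAD^2)   # commit 8, where master is
message=$(git log -1 --format=%s HEAD)
echo "fast-forward: $ff_result"
echo "$message"
```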

Now our graph will be like this:

merge without fast-forward

When we want to reflect the changes of one branch in another, should we use rebase or merge? Well, to be honest, it depends.

If you are on a public branch that you and your colleagues are working on at the same time, merge is the answer, because it keeps all the existing commits untouched and puts the incoming changes on top of your last commit. This allows everyone to keep pulling the changes normally.

If you are working on a branch of your own, rebase will keep your history linear, and then, when you merge your branch into master, it will be smoother and cleaner.

If you found this article useful check my other article:

If you found the merge vs rebase interesting read this article from @fredrikmorken, pretty cool:

Kind regards,

Henrique Mota