Merge recursive strategy
You must have heard about git merge recursive strategy which is the default algorithm that git uses when merging two branches.
How does it work and why is it good?
The basics: elements of a merge
You've got two branches you want to merge. The basic elements to consider are:- The source: the changeset you're merging from. Changeset 16 in the example below.
- The destination: the changeset you're merging to. Changeset 15 in the example below.
- The ancestor: the changeset (or commit) which is the nearest parent of the source and the destination. This is Changeset 10 in the example below.
When is merge recursive needed?
What if we find "two common ancestors"? The branch explorer view below shows an alternative in which there are two possible "common ancestors".One solution is to "select" one of the ancestors as the valid one for the merge (which is the option Mercurial takes) but as we will see below, it has many drawbacks.
How merge recursive works?
When more than one valid ancestor is found, the recursive-merge strategy will create a new unique "virtual ancestor" merging the ones initially found. The following image depicts the algorithm:A new ancestor 2 will be used as "ancestor" to merge the "src" and "dst".
The "merge recursive strategy" is able to find a better solution than just "selecting one of the two" as I'll describe below.
Why merge recursive is better: a step by step example
Let me use the following "notation" in the next example: a file foo.c with three lines like these:b c dIt will be described as: /foo.c = bcd.
Of course, for the sake of simplicity I'll be using "stupid lines" like "abc" but assume the example is valid for real code too.
Let's take a look at the following case in the diagram below:
I’ll try to describe it changeset by changeset:
- 0. We had a file foo.c with content foo=bcd (three lines, first is b, second is c and third is d)
- 1. We edit foo.c on a branch and add a new line so it ends up being foo=bcde
- 2. We modify the second line of the file on main. Now the file is foo=bCd
- 3. We create a new branch from changeset 2 and add a new line at the beginning so it is now foo.c=abCd
- 4. Going back to task001, we modify the line we just added: foo.c=bcdE
- 5. We "undo" the change we just did on main: foo.c=bcd
- 6. We merge 4 and 3 and create 6 as foo.c=abCdE. (We combine the changes we made on task002 (adding a new line at the beginning) with the ones in task001 (adding E at the end) and also the change coming from 2 on /main.)
- 7. We now merge 4 and 5 introducing the change from 4 (last line added) into main: foo.c=bcdE
We should get the "addition" on task002 (a new line a at the beginning) on top of 7.
The expected result is: foo.c=abcdE.
We shouldn't get the line with the uppercase C as it is on task002 because we fixed it afterwards on main.
As you can see in the diagram, I highlighted the changesets 4 and 2 because they're the two possible common ancestors from 6 and 7.
Which one should we choose?
Mercurial will choose 4 because its algorithm chooses the "deepest" ancestor in the case were more than one is found.
What happens if we choose 4?
We will merge 4=bcdE, 6=abCdE and 7=abcdE and the automatic result (by any 3-way merge tool will be):
- First line -> a (it's there on 6 and 7)
- Second line -> b (it's there on the three contributors)
- Third line -> C (changed on 6 but unchanged on 4 and 7)
- Fourth line -> d (unchanged)
- Fifth line -> E (unchanged)
We took C instead of c due to the wrong ancestor selection.
How recursive merge fixes the mess?
As I described above, the first thing recursive merge is going to do is to calculate a new "virtual ancestor" merging 4 and 2 as the following picture shows:The result changeset X is foo=bCdE.
Later, changeset X is used as the "ancestor" of 6 and 7 and then we get:
- Ancestor: foo=bCdE
- Source: foo=abCdE
- Destination: foo=bcdE
- Result: foo=abcdE, which is what we were looking for!
Why it is so good?
If you have to deal with branching and merging and you don't have a good merge algorithm, you can end up with broken files without warning! Branching and merging are the two weapons you must have in your developer's toolset... but ensure you have the best possible ones, the ones that really do the job. In short: Git will do it correctly, Hg will break the result, and SVN and others will simply mess up the whole thing.Plastic SCM also includes a powerful merge-recursive algorithm, so it is able to produce the same result. (In fact, our algorithm is even more powerful, correctly handling cases that even Git is unable to deal with successfully).
Learn how Plastic SCM's MergeMachine works |
Try Plastic SCM free: a full stack VCS
More on recursive merge strategy
Recursive merge is not only good for “criss-cross” merge situations but also for more “regular” ones where merge history simply gets complicated. Watch our on spot, detail-rich explanation of recursive merge in Plastic SCMContinue reading part II: Only one ancestor is the right option
10 comentarios: