More on recursive merge strategy
[Edit - 2013/10/02 - screencast available at http://youtu.be/Lg1igaCtAck]

A few days ago I had the chance to read a comment on my previous blog post about the recursive merge strategy, pointing to the Mercurial mailing list where Matt Mackall (of Mercurial fame) strongly criticized my explanation of it.
Edit: I replied to Matt on the mailing list and things were clarified. Plastic is not a 'git wrapper', as he suggested, but a full-featured SCM, and it implements a quite strong merge :P
Some of his criticisms were simply about the "rendering style" of Plastic's Branch Explorer (which is what I used to drive the examples), and others were about the explanation itself.
Mercurial's author admits that recursive merge is better, although he argues that it is only worthwhile in some cases and that its implementation cost is bigger than its benefits. Implementing a recursive merge isn't cheap, as we well know here at Codice, but the big difference between a great product and an exceptional one usually lies in the top 2% of features that make it rock.
Back to the drawing board

Recursive merge is not only good for "criss-cross" merge situations but also for more "regular" ones where the merge history simply gets complicated.
I'd like to go back to the original example and explain it now in more detail:
DAG format

Just in case anyone has trouble understanding that the green lines are merges and that they're rendered from source to destination, here's an alternative diagram:
Common ancestors

In order to run a merge, you need to identify the "source", the "destination" and the nearest common ancestor.
In this case the "src" and "dst" are clear, but the problem is the "common ancestor", because there's more than one candidate.
If you look carefully you’ll see that both changesets “2” and “3” can be taken as common ancestors.
A non-recursive merge algorithm will only use one of them, and depending on the choice, the result can be quite different.
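To see how more than one candidate can exist, here is a minimal sketch of finding all "best" common ancestors in a changeset DAG. The parent links below are an assumption made to mimic the criss-cross example (the real diagram isn't reproduced here), and this is illustrative code, not Plastic's actual implementation:

```python
# Sketch: find all merge-base candidates in a DAG of changesets.
# The history below is hypothetical: 5 and 6 both merged 2 and 3.

def ancestors(node, parents):
    """All ancestors of `node`, including the node itself."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen

def merge_bases(a, b, parents):
    """Common ancestors of a and b that are not themselves
    ancestors of another common ancestor."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {n for n in common
            if not any(n != m and n in ancestors(m, parents)
                       for m in common)}

parents = {1: [], 2: [1], 3: [1], 5: [2, 3], 6: [3, 2]}
print(sorted(merge_bases(5, 6, parents)))  # → [2, 3]
```

Both "2" and "3" qualify, and neither is "better" than the other, which is exactly the ambiguity a non-recursive algorithm has to resolve by picking one arbitrarily.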
Selecting "3" as the common ancestor

If the algorithm chooses "3" (which, as we explained in our previous post, is the choice taken by Hg, and would also have been taken by Plastic's former algorithm, prior to the recursive merge implementation), then the following situation will happen:
Selecting "2" as the common ancestor – the lucky shot

If the algorithm uses "2" as the common ancestor, then the merge will be between contents "B", "B" and "C", so, since two of the contributors are the same, the result will be "C".
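That "two contributors are the same" shortcut is just the textbook three-way merge rule: if one side matches the base, the other side's change wins. A hedged sketch (per-file contents only, not Plastic's real merge code):

```python
# Textbook three-way merge rule for a single piece of content.

def three_way(base, src, dst):
    if src == dst:   # both sides agree
        return src
    if src == base:  # only the destination changed
        return dst
    if dst == base:  # only the source changed
        return src
    raise ValueError("manual conflict: both sides changed")

# Base "B", source "B", destination "C": only dst changed.
print(three_way("B", "B", "C"))  # → C
```

With base "3" instead, all three contents differ and the same rule falls through to a manual conflict.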
This time the result matches the expected one, but sometimes this approach simply leads to an incorrect automatic choice, which is the real problem and the real reason why "recursive" is implemented by Git and Plastic.
The recursive option

Again, here's how "recursive" works: since there are two ancestors at the same distance, it will merge them, creating a new virtual ancestor that will be used as the "base" for the final merge between the merge contributors "5" and "6":
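The two-step idea can be sketched in a few lines. The per-changeset contents below are assumptions chosen to match the example's outcome (they are not taken from the real diagram), and the merge rule is the textbook three-way rule rather than Plastic's actual code:

```python
# Sketch of "recursive" merge: merge the two base candidates first,
# then use the result as a virtual ancestor for the real merge.

def three_way(base, src, dst):
    """Textbook three-way merge of one piece of content."""
    if src == dst:
        return src
    if src == base:
        return dst
    if dst == base:
        return src
    raise ValueError("manual conflict")

# Hypothetical contents: 1 is the ancestor of both candidates 2 and 3.
contents = {1: "A", 2: "B", 3: "A", 5: "B", 6: "C"}

# Step 1: merge candidates 2 and 3 over their own ancestor 1,
# producing the virtual ancestor's content.
virtual = three_way(contents[1], contents[2], contents[3])

# Step 2: use the virtual ancestor as the base for merging 5 and 6.
result = three_way(virtual, contents[5], contents[6])
print(virtual, result)  # → B C
```

If the candidate ancestors themselves had more than one common ancestor, step 1 would recurse again, which is where the strategy's name comes from.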
Where is the huge difference?

Comparing the recursive and non-recursive algorithms in this example, you might only see that the recursive one is able to avoid a manual conflict, because it can find out how that conflict was solved before. Is it such a big difference? Just avoiding one manual merge?
The real world has the answer: try a "real" merge involving several hundred files and the answer becomes clear: usable if you don't have to solve the conflicts manually vs. not usable.