The Plastic SCM blog

Linus on branching...

A few months ago, Linus Torvalds shared some interesting thoughts and concerns regarding the Git branching patterns being used in kernel development.

Since learning what Torvalds has to say is always enlightening, I wanted to delve into the points he mentioned, because they align pretty closely with the techniques we recommend with Plastic SCM. Obviously, the points apply to Git, Plastic SCM, and any other SCM with good branching support, too.

Linus’ words


This is what Linus wrote (I only extracted a fragment; for the complete text, go to http://lkml.org/lkml/2010/9/28/362).

The real problem is that maintainers often pick random - and not at all stable - points for their development to begin with. They just pick some random "this is where Linus -git tree is today", and do their development on top of that. THAT is the problem - they are unaware that there's some nasty bug in that version.


Shooting a moving target


What is Linus talking about? Actually, it’s one of the very well-known issues with trunk development that I described here in the section titled “don’t shoot moving targets!”.

This problem hits “mainline/trunk pattern” followers hard, since they keep updating to “what’s more recent”, probably due to fear of merging. Continuous integration delays the problem, but doesn’t solve it. But this problem can also hit those following a “feature branch” pattern, unless they follow all the rules.

The problem they’re facing is shown in the following picture:



The kernel team uses branches, but what if they’re using the “master” as an integration branch and they’re branching off commits that are not tagged? (As the picture shows).

The problem is that the master branch can be used as an integration point and hence get commits that leave the branch in an intermediate, unstable state. (This is especially true when using fast forward merges on Git, which I don’t really like precisely due to this issue. But I guess Linus must be doing real merges because he DOES know how to use Git. :P) Then you branch from an unstable commit and… you end up in trouble!

Baselines are key



We had exactly the same issue with some teams using Plastic because they failed to understand the importance of baselines. Once your code is stable, tag it, label it, and create branches ONLY from this well-known point!
That’s basically the rule of thumb: create baselines frequently (as many as you can), each of which must of course be fully tested (that’s the time-consuming part, since integration is pretty fast nowadays with modern SCMs), and then CREATE BRANCHES ONLY FROM STABLE BASELINES, as the figure shows:



This way, if something fails on your task branch (feature branch), you know… it’s your fault!! All tests were fine on the baseline -- not just the fast integration ones (like the ones you can run on checkin with continuous integration tools) but also the slow ones that you use to validate a release.
Very easy rule of thumb, great savings!
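To make the rule concrete, here is a minimal Python sketch. The `Commit` class, its `labels` field and the `latest_baseline` helper are hypothetical names made up for this illustration (they are not Plastic SCM or Git APIs). The only assumption is that a labeled commit on the mainline is one that passed the full test suite.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Commit:
    id: str
    # Hypothetical convention: a non-empty list means "fully tested baseline".
    labels: list[str] = field(default_factory=list)


def latest_baseline(mainline: list[Commit]) -> Commit:
    """Return the most recent labeled commit; mainline is ordered oldest first."""
    for commit in reversed(mainline):
        if commit.labels:
            return commit
    raise ValueError("no stable baseline found; label one before branching")


mainline = [
    Commit("c1", ["BL100"]),
    Commit("c2"),             # intermediate integration commit
    Commit("c3", ["BL101"]),  # last validated baseline
    Commit("c4"),             # integration in progress, not validated yet
]

# Branch from the baseline, not from whatever happens to be the head (c4).
base = latest_baseline(mainline)
print(base.id, base.labels)
```

The point of the sketch is the choice it refuses to make: the head of the mainline is never a candidate unless it has been labeled.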

Moved code detection

Since we launched the Community Edition we've seen a huge increase in the number of downloads and also the number of teams jumping to Plastic all around the globe.

While Plastic has always been free for education and open source, our main intention with CE is to reach small teams and help them in their transition to DVCS and parallel development. Most of the software companies out there have fewer than 15 users and they're exactly the ones we're targeting with CE.

That being said, I wonder if all the new people using Plastic are aware of some of the really cool features we've implemented in 3.0. Do you know Plastic can detect moved code both in the diff tool and the merge tool??

Check the following posts:

  • Moved (and modified) code detection on diff
  • Moved (and modified) code detection during merge

    And, you know... tell your colleagues!!! :P
  • Live to merge, merge to live...

    This blog post was initially published back in 2008 at DDJ, but since the DDJ "Guru blogs" moved to a new location some images have been broken, so I've decided to publish it again here.



    As a professional programmer you’re familiar with a variety of programming languages, and you know by heart the basics and the not so basics of data structures and algorithms. You are an expert working in your favorite IDE. You master software patterns and you’re aware of the newest trends in agile methods. But there’s a useful tool in the programmer’s toolbox which is normally more feared than used: the merge tool! This article will explain, step by step, the very basics of merging and will explore the different merge types, their uses and advantages.

    Fear of merging


    So, you write code, don’t you? And there’s a big chance that you don’t develop code in isolation, right? The code you write for your projects is normally scattered across a number of files. And, according to the Pareto Principle, 20% of the files in your project will receive 80% of the changes. You can try to trace a bad design smell here, you can try to refactor your code from top to bottom day and night, but, unfortunately, that’s just reality: if you and your team work on a project, there’s a huge chance you’ll end up editing the same files at the same time.

    For a number of projects out there this is a big problem. I’ve found a number of project managers and software designers trying to avoid concurrent modification by wrestling with their project plans and software designs. Wouldn't it be better if they put that effort into making better software and finishing it on time?

    But there’s an ancestral fear behind this behavior: the arcane fear of merging. “Hey, if you and I modify the same file... we’ll have to reconcile all our changes!!” And of course, they assume it will be a painful and error-prone process. “So, let's schedule our changes so that only one of us touches the file at a time.” Ouch!

    Long ago software development was about lone eagles working alone. Now software development is about new languages, new tools, but at the end of the day it’s all about collaboration. And, avoiding collaboration doesn’t look like the smartest way of getting the best out of a team.
    Then, why are we initially so scared of merging our file changes? I find two key reasons:

  • Lack of knowledge about how merge tools really work. People tend to think about code merging as some sort of magic process able to understand their code in order to combine changes. They don’t believe code can always be correctly understood by the system, and they don’t trust merge tools. Of course, under their assumptions, they’re right. The thing is, merge tools don’t analyze or understand the code; they just apply some simple and clever rules to combine texts. The same way you trust your compiler to generate the right machine code, you should trust your merge tool.

  • Past bad experiences: merge tools, and their big brothers the version control tools, have evolved during the last two decades. You’ve probably experienced some awkward issue with an old-fashioned (but still alive) version control system. Believing they’re still the same is like still preferring to code in assembler because you don’t trust compilers.

    Merging explained: automated conflicts


    Do you know how a merge tool works? Let’s take a look at a very simple example. I won’t dive into the obscure algorithm details but just make a 1000ft flyby.

    Suppose we have a piece of code like the one in the next figure. Then you and I start making changes to the file at the same time. I make a couple of changes at the beginning of the file, and you add a new method below.



    Merging our changes manually is possible; it is a small file so it will just take a few minutes to do. It will require both of us to carefully look at our changes, but we’ll make it.

    Of course the picture changes if we’ve changed not one but 15 files in total, and up to 8 have been modified in parallel. Also, what would happen if, instead of only the two of us, 5 other people were working on the code at the same time? Yes, the process is doable, but it is time consuming, error prone and... boring!! I bet you have better things to do than manually combine files.

    But, what would a merge tool be able to do in our previous example?

    The tool will find an automatic conflict. Look carefully: our changes don’t collide, so what we would do manually is just copy and paste my changes into the right part of your file, or vice versa. It is very simple, but doing it manually is error prone. This is exactly what a merge tool will do: just put the two sets of changes together, with no possible collision or conflict.

    Normally, during a merge, the tool won’t even bother asking you to look into such a conflict; it is so simple it can solve it by itself. Of course, almost all the tools out in the market will allow you to set a mode in which all conflicts are reviewed by the user. It will just propose the changes, but you will be the one actually making the decision. Do you feel safer now? Ok, I bet after a week of manually reviewing trivial conflicts you’ll switch to automated mode.

    The next figure shows the results of the first merge and how a merge tool will combine the changes together to create the result.



    You’re in trouble: manual conflicts


    A developer’s life can be exciting and full of challenges, but it is not easy. So, eventually you’ll face a situation like the one depicted in the next figure:



    Yes, now we’ve modified exactly the same code in one of our changes to the file, which makes things much more complicated.

    Now you can say “the tool can’t know the right solution!”.

    And you’re right. But, as I told you, the merge tool is not a wizard’s device; it is just a programmer’s tool. So, use it correctly and it will make your life much easier.

    The 4th figure shows a merge tool in action letting you decide what to do with your manual merge conflict.



    Under these circumstances the tool will always prompt the user. It will still save you precious time because it directly focuses you on the problem, but you’ll have to make the decision yourself.

    So, the merge tool will resolve automatic conflicts without even asking you, unless you want it to ask (and honestly, not being asked is the right choice), and it will ask for your help whenever it finds a manual conflict, which is basically a code fragment changed by two developers at the same time.

    The rule of thumb is very easy and will help you trust the tool, because there’s no complex code analysis behind it. It just looks at the lines of code: if only one contributor changed the fragment, it is an automatic conflict; otherwise, it is not trivial and the tool will ask.
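The rule of thumb can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular merge tool is implemented: it assumes the three versions have the same number of lines and are already aligned line by line, whereas a real tool first runs a diff to align fragments. The function name `three_way_merge` and the sample lines are made up for the example.

```python
def three_way_merge(base, mine, yours):
    """Merge three line-aligned versions of a file.

    Returns (merged_lines, conflicts). conflicts holds the indexes that need
    a manual decision; those positions are left as None in merged_lines.
    """
    merged, conflicts = [], []
    for index, (b, m, y) in enumerate(zip(base, mine, yours)):
        if m == y:            # identical on both sides (untouched or same edit)
            merged.append(m)
        elif m == b:          # only "yours" changed the line: take it
            merged.append(y)
        elif y == b:          # only "mine" changed the line: take it
            merged.append(m)
        else:                 # both changed it differently: manual conflict
            merged.append(None)
            conflicts.append(index)
    return merged, conflicts


base = ["int a = 0;", "int b = 0;", "return a + b;"]
mine = ["int a = 1;", "int b = 0;", "return a + b;"]    # I touch line 0
yours = ["int a = 0;", "int b = 0;", "return a * b;"]   # you touch line 2

merged, conflicts = three_way_merge(base, mine, yours)
print(merged, conflicts)      # both edits are combined, nothing to ask

clash = ["int a = 2;", "int b = 0;", "return a + b;"]   # you also touch line 0
print(three_way_merge(base, mine, clash))               # line 0 needs a human
```

Notice that the function never inspects what the lines mean. It only compares each side against the base, which is why the result is predictable.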

    2-way and 3-way merging


    What’s all this fuss about 2-way and 3-way merge tools? What are they all about? Ok, that’s what I’ll be explaining in the next few paragraphs. It basically depends on the number of file versions you consider for your merge operations.

    So far, what you’ve seen is a 3-way merge in action:
  • You have the original file: it is the file as it was at the beginning before a specific set of changes were performed by our developers.
  • Then you have the file you have modified (remember the previous examples).
  • And finally the file I’ve modified.

    The result file is the one created after the changes are combined, the one at the bottom of the previous figure (some tools prefer to hide the base file and just show the result one).

    I didn’t explain 2-way merge yet, but it’s not very complicated: it doesn’t consider the base file (also known as the common ancestor) for the merge.

    Is it better? Simply put: no, it isn’t. 3-way merge knows what you’ve added to or removed from a file, while 2-way merge can’t, because it doesn’t know what the file looked like at the beginning.

    But, still, I’ve found developers who seem to be more used to 2-way merge tools. Let’s try to figure out why.

    Let’s go back to the original Java file, make a couple of very simple changes, and try to merge them with a two-way merge tool. Check the results on the next image.



    Do you see the problem? Basically, at each difference the 2-way merge tool won’t be able to decide whether it is modified or removed code, so it will always have to ask you!

    This may be good for the paranoid but, believe me, if you have to manage a good number of merges, you’ll end up wasting your time.

    The same two conflicts would be automatically solved by a 3-way merge tool.
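A small Python sketch shows why the base file makes the difference. It uses the same simplification as any toy example: the versions are assumed to be aligned line by line, and the function names and the sample Java lines are invented for the illustration.

```python
def two_way_questions(mine, yours):
    """Without a base, every differing line has to be shown to the user."""
    return [i for i, (m, y) in enumerate(zip(mine, yours)) if m != y]


def three_way_questions(base, mine, yours):
    """With a base, only lines changed differently on BOTH sides need the user."""
    return [
        i
        for i, (b, m, y) in enumerate(zip(base, mine, yours))
        if m != y and m != b and y != b
    ]


base = ["class Calc {", "  int total = 0;", "  void reset() { total = 0; }", "}"]
mine = ["class Calc {", "  long total = 0;", "  void reset() { total = 0; }", "}"]
yours = ["class Calc {", "  int total = 0;", "  void clear() { total = 0; }", "}"]

print(two_way_questions(mine, yours))           # two lines differ: two questions
print(three_way_questions(base, mine, yours))   # the base settles both of them
```

With only `mine` and `yours` there is no way to tell which side holds the new text at lines 1 and 2. Once `base` is available, each line turns out to have been changed by exactly one of us.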

    Of course, there’s a remark here: you can only use 3-way merge with a version control tool handling your code. Otherwise you won’t have access to the base file unless you have a very good memory or a crazy naming convention to keep your old files...

    Merge tracking, what’s in it for me?



    If you’ve never heard of merge tracking... well, welcome to a whole new world. You’ve probably heard about it since Subversion 1.5 was released, finally introducing merge tracking. It still has some caveats, but it’s evolving in the right direction. Many other systems out there have had merge tracking for a long time, and most have had it as part of the core product since their inception.
    Anyway, what does it mean?

    Merge tracking is deeply related to version control tools. You can run a file merge in isolation, but with merge tracking you rapidly enter the field of SCM (whether you want to translate it as Software Configuration Management or Source Code Management is just your choice).

    Merge tracking is also deeply related to branching, but I’ll try to postpone the topic as much as possible.

    Have a look at the next image. It represents the merge we’ve been running in the previous examples. We have the original file and then your changes and mine drawn as some sort of tree or graph.

    After I merge your changes with mine, a merge link is created telling the system I’ve merged your changes with mine. Also, I would like to highlight that during this change we modified exactly the same lines of code, so it will be a manual merge... Remember it because I’ll use it below.

    What’s the benefit? First: you know what you’ve done since your version control system takes care of this information. If you don’t have it, it’ll be harder to figure out what happened.



    But let’s make another set of changes with our sample files. You can check what our tree looks like after the changes in the next figure.



    You make a new modification and I make another one, and once the two of us are finished, I decide to merge your changes back with mine again.

    How does merge tracking help here? First of all, you remember I mentioned above our first merge was not automatic, don’t you? So, what’s the benefit of merge tracking? It will just merge the changes after the last merge happened, and you won’t have to solve the same manual conflict again. It greatly simplifies merging because it will let you focus on what’s new and you won’t have to merge all the old stuff.

    How does the merge tool know what to merge? In any three-way merge you’ll need a base (or original file) and two contributors. In the sample highlighted in the previous figure, the version control tool, with the help of merge tracking, will first try to find the base file for the two changes we have (two revisions, after all).

    And how does the system locate it? It will use your tree of versions and try to locate the closest common parent of the revisions you’re trying to merge. The algorithm is known as nearest common ancestor, and it is about finding the closest shared ancestor of a pair of nodes in a directed graph.

    In our sample the next figure shows the base or common ancestor for the two revisions we’re trying to merge.



    Look carefully: if the merge link wasn’t there, the parent would be the original revision, and then the merge tool would have to ask you again about the previously solved conflict (the code would be the same at the two revisions but different from the base).

    The merge arrow solves the problem allowing the underlying system to correctly identify which one is the new base for the merge.
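Here is a minimal Python sketch of that base lookup. The revision names (`base`, `mine1`, `yours1` and so on) and the helper names are invented for the example, and the brute-force search is only meant to show the idea. Real version control systems use more efficient algorithms. A merge link is modeled simply as a second parent of the merged revision.

```python
def ancestors(parents, rev):
    """Return rev plus every revision reachable by following parent links."""
    seen, stack = set(), [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_base(parents, a, b):
    """Nearest common ancestors of a and b (usually a single revision)."""
    common = ancestors(parents, a) & ancestors(parents, b)
    # Drop any common ancestor that is itself an ancestor of another one.
    return sorted(
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    )


# Revision graph after the first merge and one more change on each side.
with_link = {
    "mine1": ["base"],
    "yours1": ["base"],
    "mine2": ["mine1", "yours1"],   # first merge: the second parent is the merge link
    "mine3": ["mine2"],
    "yours2": ["yours1"],
}
# Same history, but the system forgot that the merge ever happened.
without_link = dict(with_link, mine2=["mine1"])

print(merge_base(with_link, "mine3", "yours2"))
print(merge_base(without_link, "mine3", "yours2"))
```

With the merge link the base is `yours1`, so only the changes made after the first merge are considered. Without it the search falls back to `base`, and the conflict that was already solved by hand shows up again.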

    Branching


    I don’t know whether you realized or not but... we’ve been using branching!

    Look back at Figure 7 (two figures back). There’s a set of changes named your set of changes and a set of changes named my set of changes, right?

    Well, they’re actually two different branches, and a branch is nothing more than a set of revisions.

    They allow you to have parallel sets of changes, which is great when you’re doing development.

    It can’t be easier.

    I feel like a myth buster today, but as you can see, branching, one of the concerns for a number of developers, is not an issue at all.

    Unfortunately branch management is a nightmare with some old fashioned version control tools (think about CVS or SourceSafe, for instance, and even SVN until merge tracking becomes mainstream and stable), and that’s the reason behind all this fear...

    Wrapping up...


    So far we’ve introduced all the basics (and not so basics) of merging. As you have noticed it is not a difficult task at all once you correctly understand the steps and contributors involved.

    Merging is one of the daily tools of a professional developer, but it is still unknown to a large number of users. Mastering the process will make them more productive and will allow projects to evolve faster.
  • Fixing a bug – the branch-per-task way

    This is based upon a real story: I have a piece of code that launches an external application, but it wasn’t taking into account the possibility of having programs with spaces in their pathnames. Something like “c:\mypath\tool.exe” worked, but not something like “c:\program files\my tool\tool.exe”. To be able to launch such programs, we need to enclose the pathnames in quotes.

    The code


    The original code was something like the following and obviously unable to deal with quotes.

    Issue tracking system


    First things first: we need to start with a task in an issue tracking system. Internally at Codice we use TTS (task tracking system), our web-based app for issues. (Yes, if you’re interested in it, I can tell you we’ll be releasing it soon.) The task I’ll be working on is 8651.

    Creating a branch for the task


    Nothing fancy here: just go to the BranchExplorer and create a new branch for the new task. The branch will be based at a well-known point, the latest known stable release.

    Next, I’ll switch my workspace to the new branch and start editing the code.

    Initial refactoring of the code


    First, I’m going to modify the code to extract the argument parsing to a new class named ArgsParser. I modify the original code as follows, to create an ArgsParser object and then use its members Program and Args.

    Just after I write this code, I check in the changes, explaining it was a first step in my refactoring. REMEMBER: Because I’m using a task branch, I’m free to create as many checkins as I need, without affecting the other developers!

    Inspect the first change


    Look at the BranchExplorer after the first checkin:

    Now I inspect the changes contained in the changeset on branch SCM8651. (It’s the second one you see in the picture. The first cset in a child branch is just there to keep track of its connection to the parent branch).

    And then I launch the Diff changeset content view, which contains two panes: the top pane shows all the files modified in a cset; the bottom pane shows the line-by-line differences for the currently selected file.

    It is a nice interface, isn’t it?

    We detect moved code!


    But, wait, let’s take a deeper look at this diff. Do you notice the two buttons close to the difference and the line going down? The next figure shows these details.

    Scrolling down following the line, we get to the ArgsParser class implementation I talked about before:

    Plastic is able to track the moved code! It has figured out that the original lines of code have been moved to a method in the new ArgsParser class. Amazing, isn’t it??
    Even better is the second button over the difference. It runs a sub-diff. I click on it.

    The sub-diff shows how I’ve modified the code AFTER moving it to the new location: I renamed variables, introduced some class members and so on. Plastic is still able to track it!!

    Completing the change


    After my initial refactoring, the next step is to actually add the code that handles the quoted program pathnames.
    I’ll make this change and also add some unit tests to check that my code is fine.
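The code from the original post is not reproduced here, so what follows is only a hypothetical Python sketch of the idea, not the real implementation. It borrows the `ArgsParser` name and its two members from the description above, and it assumes the simplest possible rule: if the command line starts with a quote, the program path runs up to the closing quote; otherwise it runs up to the first space.

```python
class ArgsParser:
    """Split a command line into the program path and its arguments."""

    def __init__(self, command_line: str):
        command_line = command_line.strip()
        if command_line.startswith('"'):
            # Quoted program path: everything up to the closing quote.
            end = command_line.find('"', 1)
            if end == -1:
                raise ValueError("unterminated quote in program path")
            self.program = command_line[1:end]
            self.args = command_line[end + 1:].strip()
        else:
            # Unquoted program path: everything up to the first space.
            program, _, args = command_line.partition(" ")
            self.program = program
            self.args = args.strip()


plain = ArgsParser(r'c:\mypath\tool.exe -v file.txt')
quoted = ArgsParser(r'"c:\program files\my tool\tool.exe" -v file.txt')

print(plain.program, "|", plain.args)
print(quoted.program, "|", quoted.args)
```

A couple of assertions on `plain` and `quoted` would already make a reasonable first pair of unit tests for the change.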
    Once I’m done I can go to the Pending changes view in the Plastic GUI (or invoke View Changes in the Visual Studio plugin):

    and check in three files.
    Now I look at how the BranchExplorer renders the current situation, with a new changeset being created containing the second change.

    Inspecting the changes


    One of the really cool features in Plastic is the ability to “walk changesets” inside branches, one by one. If you’re careful doing your checkins (as I have been here) your branches will tell a story, a very complete one, describing every change in detail and greatly helping reviewers. (Sometimes just seeing all the differences in one step is not as useful as walking the changes one by one, nor so easy to understand.)

    Here’s the view created by the Explore changesets in branch command:

    This view is very much like the one I used above, Diff changeset content, but with an extra dimension. Instead of exploring the contents of a single changeset, I can now explore all the changesets of my branch.
    As you can see, walking the changes one by one is very, very easy this way.

    For distributed developers – push your changes back


    Once I’m done, I’ll run my test suite and then go to our issue tracking system and set the task as finished.
    Since I’m working with a local Plastic SCM server on my laptop (remember Plastic is a DVCS), the last step is pushing my branch back to the main server. I can use the BranchExplorer for this, too, using the Push this branch command.

    Wrapping up


    Now it is your turn to ask questions! Do you see the beauty of branch per task (or task oriented, if you prefer) development, using Plastic SCM’s powerful visual tools?

    The version control timeline

    Software Configuration Management (or source code management, for you real hard core coders) has been around for quite a few years, slowly moving from an almost manual-labor, dark prehistory to the shiny days of the DVCS (distributed version control system).

    You’ve all used at least one of the SCMs on the following list, but are you aware of how long the system you’re using has been around? Do you know the big names? Ok, that’s what I’ll try to supply with this short compilation.

    The big picture


    Look at the following diagram to find some of the main names in SCM history. Yes, I must be missing a good number of them, so don’t be shy: post a comment and I’ll update the list with your favorite one I missed :)



    If you’re still feeling good about using really “old irons”, I’ve added some pictures of what cell phones looked like when the SCMs were released, so you feel older and bloody outdated :). So yes, if you’re using CVS and still think it’s ok, look at the cell phone directly below “CVS” in the diagram. Do you feel like Gordon Gekko (Michael Douglas) in “Wall Street: Money Never Sleeps”, getting his brick-sized cell phone back as he leaves jail?

    Prehistory


    There was a time when you stored your versions manually. Ok, for many of you this time wasn’t the 80s, but a few years back when you were at college naming your source-code archives exercise.zip, exercise-0.zip, exercise-good.zip, exercise-good-final.zip, and so on. Well, believe it or not, there was a time without real SCMs. It was always dark and people were living in caves.

    RCS


    Then 1982 came and RCS was released. RCS is not a huge piece of technology, but you can still find it around in Unix distros. It is simple and straight to the point.

    One nice feature was that text changes were stored as deltas (pretty important, considering hard drives used to be small!). Deltas are still used nowadays by most SCMs.
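To give a feeling for what storing a delta means, here is a small Python sketch built on the standard `difflib` module. It is an illustration of the general idea only: the `make_delta` and `apply_delta` helpers and the tuple format are made up for the example and have nothing to do with the actual RCS file format.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Record only the edits needed to turn old into new.

    Each entry is (start, end, replacement): replace old[start:end] with
    the replacement lines. Unchanged lines are not stored at all.
    """
    opcodes = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in opcodes if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the new version from the old one plus the stored delta."""
    result, position = [], 0
    for start, end, replacement in delta:
        result.extend(old[position:start])  # copy the untouched lines
        result.extend(replacement)          # add the changed lines
        position = end                      # skip what was replaced or deleted
    result.extend(old[position:])
    return result


v1 = ["int main() {", "  return 0;", "}"]
v2 = ["int main() {", '  puts("hi");', "  return 0;", "}"]

delta = make_delta(v1, v2)
print(delta)                        # one inserted line instead of a whole file
print(apply_delta(v1, delta) == v2)
```

Keeping one full version plus a chain of small deltas like this one is what made it affordable to keep every revision around on small disks.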

    Some RCS drawbacks worth mentioning:
  • It is text only.
  • There is no central repository; each version-controlled file has its own repo, in the form of an RCS file, stored near the file itself. For example, the RCS file for /usr/project/foo.c is /usr/project/foo.c,v -- or a little better, in a subdirectory, /usr/project/RCS/foo.c,v.
  • Developers make private workspaces by creating symbolic links to RCS subdirectories – say, a symlink from /usr/home/john/RCS to /usr/project/RCS.
  • Naming of versions and branches is downright hostile. A version might be named 1.3, and a branch might be named 1.3.1, and a version on the branch might be named 1.3.1.7.

    The classic era


    In the SCM arena, the 90s are the classic era.

    CVS


    It all started with CVS (Concurrent Versions System) in 1990. It was able to handle multiple versions being developed concurrently on different machines and stored on a central server. The client-server age was upon us and developers took full advantage of it.

    CVS was able to handle versions in a decent way. And it even supported branching and merging, though it wasn’t very good at doing it. That’s one of the reasons many people are scared of the “B” word and the “M” word.

    CVS didn’t track directories or filename changes (no refactoring allowed here!) and heavily relied on locking the whole repository. It is outdated now, but it worked in the 90s! (If you have it, just walk away and go on to something else!)

    PVCS


    Polytron Version Control System (PVCS) was initially released in 1985 and then went through a series of mergers and acquisitions: Polytron, then Sage, Merant, and finally Serena.

    It’s an old, outdated system (initially designed to avoid branching/merging, using file-locking instead), but it’s still supported by Serena Software.

    ClearCase


    In 1992, one of the major beasts in the SCM world was born. ClearCase was clearly ahead of its time and for some it is still the most powerful SCM ever built.
    Outdated, slow moving, overpriced, and overly complicated to administer (in the early days, you had to generate a new Unix kernel to run the beast!), good-old CC isn’t the cool guy anymore -- you can hardly find anything positive about it on the net. But it’s still very good at branching and merging and still has unique features, such as its legendary “dynamic views”. While powerful, CC came from a time when disk space was scarce and networks were mostly LANs, with no concerns for things like latency or working through firewalls.

    Atria (the developer of ClearCase) merged with Pure (which was run by Reed Hastings, now the head of Netflix), was purchased by Rational and then IBM. And lo, the powerful CC stopped evolving. Well, it did evolve towards UCM in the early 2000s, which basically got rid of all the good things and left the weak ones, together with a huge price. Not a very good idea.

    ClearCase is still one of the most-used SCMs in the corporate world, and certainly one of the revenue leaders.

    VSS


    All the systems on my list had their moment and their clear advantages over previous systems. All except Visual SourceSafe. VSS was a weak system from day one, forcing developers to work with a “locking” approach, discouraging parallel development and creating a huge “fear of merging”.

    Slow, error prone, and utterly limited, VSS has been one of the most-used systems by Windows developers around the world. It is still in use, spreading pain and fear among good-hearted coders. But VSS was ahead of its time in one sense: it more properly belongs in the “dark SCM middle ages” (see below), instead of the classic era.

    VSS was entirely graphical, which was probably one of the reasons why it was widely adopted (along with being closely tied in with Visual Studio distributions).

    Perforce


    Perforce (P4) is one of the independent vendors who are totally focused on SCM, battling for the SCM gold. It is still one of the market leaders among mid-range companies with huge teams, and it has a strong presence in some market niches, such as the gaming industry.

    When it was released in the mid 90s, P4 was one of the most affordable and powerful systems to date. Worlds ahead of VSS and CVS, it was never at the level of ClearCase. But it was able to clearly beat CC in cost, performance, and ease of use.

    Being centralized and not very good with branching and merging (branches are implemented as subdirectory trees – didn’t they ever hear of metadata?) P4 doesn’t seem to be the best option for the future, but it is rock solid, mature, and well established. That will help it keep growing. At the time of this writing, P4 is the biggest code repository inside Google. Cool!

    Enter the middle ages


    A time of darkness, when most of the previous advances were lost and a degraded environment emerged…

    Subversion


    Subversion (SVN) was conceived as “enhanced CVS” and its developers hit their target: it is better than CVS. Period.

    Although systems like ClearCase were perfectly capable of branching and merging, SVN educated an entire developer generation on the following dogma: fear branching and merging at all cost! This caused environmental damage that persists to this day, only starting to be healed by the new DVCS generation.

    SVN was close to P4 in features, and spread like crazy: more than 5 million developers around the world use SVN on a daily basis. Huge!

    SVN is extremely simple to use and evangelized everyone on the “mainline development model”. Error-prone (break the build!) on non-toy projects, it helped develop techniques like “continuous integration” as a way to “avoid integrations”. While the idea is good, most of the surrounding concepts were clearly limited by the tool itself.

    Linus himself raged against SVN when he presented Git at Google back in 2007.
    During 2009 and 2010, many major open-source projects gravitated away from SVN, a good sign of how wrong SVN was. But it’s still big and won’t die for ages.


    AccuRev


    Born in an age of darkness, AccuRev was developed as an entirely new approach to source control. Its original way of doing things still seems new to lots of developers nowadays.

    AccuRev has strong support for branching (“streams” in its jargon) and merging. It has played a valuable role in helping the community move away from ClearCase and older tools like CVS.

    Enter The Renaissance


    After an age of darkness, an entirely new generation of SCM systems broke the established status quo. “SCM is a mature market” was the analysts’ conventional wisdom, but the new generation broke onto the scene and blew everything apart.

    Able to sever ties with the Internet and work unplugged (like cool rock stars), the new generation also excels at branching and merging, which was touted as the root of all evil during the “dark ages”. These new systems have successfully shifted the tide in the “branching/merging is good” direction.


    BitKeeper


    BitKeeper was one of the innovators in the DVCS field. Designed by Larry McVoy (who previously worked on TeamWare, Sun’s internal version control system, built on top of SCCS, long evolution story here…) it rose to fame in 2002 when the Linux kernel development team started using it. A huge flame war started, with some developers complaining about using commercial tools for the world’s premier open-source project.

    Things only got worse in 2005 when fights with the core kernel developers grew even bigger. BitMover, the company behind the product, became concerned about people reverse-engineering their code. They discontinued support for open-source development and, ironically, thus prompted the creation of Git to fill the gap.
    For more, see http://en.wikipedia.org/wiki/Bitkeeper.

    Git


    Linus Torvalds, the father of Linux himself, designed and implemented the first version of Git (almost over a weekend, in pure-hacker style) to give his kernel developers an alternative to BitKeeper. Linus not only did the original design (simple, clean, genius), but helped promote the project with his unique style. (See http://codicesoftware.blogspot.com/2007/05/linus-torvalds-on-git-and-scm.html.)

    During his famous speech, he heavily criticized (ok, insulted) CVS, SVN, and Perforce: “Subversion has been the most pointless project ever started”, “If you like using CVS, you should be in some kind of mental institution or somewhere else” and finally “Get rid of Perforce, it is sad, but it is so, so true”.

    You can love him or hate him, but he definitely made his point: the Middle Ages were over and now distributed systems were to rule the world, including removing the arcane fear of branching and merging, a key concept behind every DVCS.

    Over the following years, every major open-source project migrated away from Subversion towards Git (and www.github.com provided a really huge, huge hosting service), making it the strongest and coolest SCM on earth.

    Git is based on a DAG structure (Directed Acyclic Graph), in which the main unit of change is the changeset. It implements full merge-tracking, but at the commit level instead of the individual file revision level (as, for instance, ClearCase does). It is extremely fast, with the only caveats being management of large binary files and the requirement to replicate repositories in their entirety.
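    To make the DAG idea a bit more concrete, here is a minimal sketch of commit-level merge tracking. It is not how Git is actually implemented: the commit names and the `history` dictionary are made up for illustration, and the search is a naive one. It only shows that once every commit records its parents, the common ancestor of two lines of work can be derived from the graph itself, with no per-file bookkeeping.

```python
from collections import deque

# Hypothetical toy history: each commit maps to the list of its parents.
# A merge commit simply has two parents, which is what makes this a DAG.
history = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],        # work on the main line
    "t1": ["c2"],        # task branch started from c2
    "t2": ["t1"],
    "m1": ["c3", "t2"],  # merge of the task branch back into the main line
}

def ancestors(commit, parents):
    """Return the set containing `commit` and every commit reachable from it."""
    seen = set()
    queue = deque([commit])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents[current])
    return seen

def merge_bases(a, b, parents):
    """Return the common ancestors of a and b that are not themselves
    ancestors of another common ancestor (the "nearest" ones)."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents) for other in common)
    }

# Before the merge, the two lines of work diverge from c2.
print(sorted(merge_bases("c3", "t2", history)))
# After the merge, the base with the task branch is t2 itself,
# so the graph already "remembers" that t2 has been merged.
print(sorted(merge_bases("m1", "t2", history)))
```

    The second call is the interesting one: because the merge commit points to both parents, asking again for the base with the task branch returns the branch head itself, which is the commit-level equivalent of "nothing left to merge".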

    Git is clearly influenced by its kernel roots, and it’s obviously not the easiest thing on earth to use. But it will definitely be the SCM of the next decade. Check out this awesome book.

    Mercurial


    Mercurial (Hg) was first announced in April 2005, also rushing in after the BitMover decision to remove support for the free version. Hg is also one of the key open-source DVCSs, along with Git. They can even work together quite well: Scott Chacon, the Git evangelist and one of the best SCM tech writers ever, wrote a nice integration -- see http://mercurial.selenic.com/wiki/HgGit.

    But Hg differs quite a bit from Git in terms of design. They share the concept of commit/changeset as the unit of change. Git implements this based on trees; each tree points to an older tree, and so on – hence the DAG. With Hg, every changeset is a flat list of files and directories, called a revlog.

    (For more on Hg, including internals, see http://mercurial.selenic.com/wiki/Design and http://mercurial.selenic.com/wiki/DeveloperInfo.)

    Mercurial provides very strong merging, but it’s a bit different from other SCMs in its branching model: it has “named branches” but the preference is to create a new repository as a separate branch instead of hosting “many heads” inside a single one.

    Joel Spolsky has written an extremely good Hg tutorial (hginit.com), which will help a lot of new users. Spolsky’s company, Fog Creek Software, has recently released Kiln, a commercial wrapper around the Hg core.

    Darcs


    Darcs (Darcs Advanced Revision Control System) is another open source attempt to get rid of CVS and Subversion. It started in 2002 and has been continuously evolving since then, reaching version 2.5 in November 2010.

    The major shortcomings of Darcs have been performance and its different way of handling history: instead of managing “snapshots” (commits or changesets) it manages patches, but in a way that makes traversing history difficult to understand (the current state may never have existed as a real snapshot).

    Bazaar


    Bazaar (bzr) is another open-source DVCS, which tries to provide some fresh air to the SCM world. While less used than Git and Mercurial, Bazaar offers interesting features, such as the ability to work in a centralized way, if needed. (The “pure” DVCSs didn’t include central servers in their original design.)

    Bazaar was developed by Canonical (yes, the Ubuntu company!) and became a GNU project in early 2008.

    Plastic SCM


    Plastic is a DVCS designed with commercial use in mind instead of open-source projects (unlike Git and Mercurial). Plastic was first released in late 2006, featuring strong branching and merging, including full merge tracking and rename support in merges. It provides a highly graphical working environment, with many data-visualization capabilities (including a 3D revision tree). This distinguishes it from DVCSs that are oriented toward the hard-core, CLI-oriented hacker community.

    The motivation of Plastic’s developers (BTW, I’m one of them) is to target small and medium teams, closing the gap between expensive high-end systems like ClearCase and low-end ones like SVN.

    Plastic is built around the concept of parallel development, encouraging use of the “branch per task” pattern (feature branches). It can handle thousands of branches without breaking a sweat. Plastic is also distributed, supporting disconnected development, pushing and pulling of changesets on branches, and conflict resolution.

    A Community Edition of Plastic SCM was launched in November 2010.

    Team Foundation Server


    Microsoft, wanting to play a role in the SCM/ALM market, came up with Team Foundation Server (TFS). It’s an effort to heal the pain caused by its own VSS devil.
    While TFS is not very strong as a source-control system (kind of a new guy on the block, but using previous-generation technology), it comes fully packaged with a huge set of tools, from issue tracking to test management, in the pure “corporate-huge-integrated-thing-style”.

    You won’t be doing branching, merging, or DVCS if you go for it, but maybe your company already purchased it, along with an MSDN subscription.
  • Concerto, a cross-platform .net profiler

    I'm going to introduce the basics behind Concerto, the cross-platform profiler we've developed to help us optimize Plastic SCM.

    The problem


    The problem we face is simple: we need to optimize our code on Mono/Linux, Mono/Solaris, Mono/Mac and .NET/Windows. The best way to find issues is to run exactly the same test on the same hardware but with a different OS (ok, Mac and Solaris are slightly more complex, but for the main ones, Linux and Windows, the same iron will work), and then check the results.

    We've been doing that for months, but we always lacked good, comparable profiling data. You know, we do use things like AQTime on Windows and then the Mono Profiler on Linux, but the generated data is not easy to compare.

    We needed a different tool.

    Let's start the music


    We had the following idea: let's instrument the code, adding some instructions to measure the time spent on each method. This way we can run the same instrumented assemblies with .NET and Mono and then compare the results.
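    As a rough illustration of the idea (and not of Concerto itself, which rewrites .NET assemblies), here is a minimal Python sketch that wraps every method of a class so that each call records its elapsed time. All the names in it (`profile_data`, `instrument`, `instrument_class`, the `Hello` class) are made up for the example.

```python
import inspect
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: method name -> [call count, total seconds].
profile_data = defaultdict(lambda: [0, 0.0])

def instrument(func, name):
    """Wrap func so each call adds its elapsed time to profile_data[name]."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = profile_data[name]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper

def instrument_class(cls):
    """Instrument every function defined directly on cls (dunder methods are skipped)."""
    for attr, value in list(vars(cls).items()):
        if inspect.isfunction(value) and not attr.startswith("__"):
            setattr(cls, attr, instrument(value, f"{cls.__name__}.{attr}"))
    return cls

@instrument_class
class Hello:
    def do_stuff(self, n):
        return n + 1

    def do_more_stuff(self):
        # Calls the instrumented do_stuff, so both methods get recorded.
        return self.do_stuff(1)

hello = Hello()
hello.do_stuff(0)
hello.do_more_stuff()

for name, (calls, seconds) in sorted(profile_data.items()):
    print(f"{name}: {calls} calls, {seconds:.6f}s")
```

    Note that the times recorded this way are inclusive: the time spent in `do_more_stuff` also contains the nested call to `do_stuff`. The point is simply that the measuring code travels with the program, so the same instrumented code produces comparable numbers wherever it runs.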

    Cecil was there to help with the tough instrumentation part, so I emailed our resident Mono hacker, Dick Porter, and told him the idea. It took him a few hours to come up with a prototype, and he has been refining it for the last week or so...

    And, since it was all about instrumenting code... Dick named it Concerto.

    Welcome Concerto


    Here is how the toy works:


    $ mono Instrument.exe --help
    Usage: Instrument [OPTIONS] assemblies
    Instrument one or more assemblies to record profiling data.
    If only one assembly is specified, instrumented assembly output name and
    data filename can be set. Otherwise defaults are chosen for each assembly.

    Options:
      -v, --verbose           Increase verbosity
      -h, -?, --help          Show this message and exit
      -f, --filename=VALUE    The filename where profiling data is written to
                                at runtime
      -o, --out=VALUE         The filename where the instrumented assembly is
                                written
      -p, --private           Include private types
      -c, --class=VALUE       The specific class to instrument (can be given
                                more than once)



    Usage is pretty simple:

    1) run Instrument.exe on an assembly of your choice
    2) copy the output assembly back to your application, along with the generated Concerto-blah.dll
    3) run your application
    4) Look at the output file with mprof-decoder (or equivalent)

    For example:


    $ ls -l hello.* Concerto*dll *.mprof
    -rwxr-xr-x. 1 dick dick 13312 2010-11-08 14:06 Concerto.dll
    -rw-r--r--. 1 dick dick 384 2010-11-04 20:10 hello.cs
    -rwxr-xr-x. 1 dick dick 3072 2010-11-05 16:51 hello.exe

    $ mono Instrument.exe -vvv hello.exe
    Instrumenting hello.exe, creating hello.exe.ins
    The helper assembly is Concerto-hello_exe.dll
    Data shall be written to concerto-hello_exe.mprof
    Instrumenting class hello
    Instrumenting method System.Void hello::.ctor()
    Instrumenting method System.Int32 hello::DoStuff(System.Int32)
    Instrumenting method System.Void hello::DoMoreStuff()
    Instrumenting method System.Void hello::Main()
    Done.

    $ ls -l hello.* Concerto*dll
    -rw-r--r--. 1 dick dick 13824 2010-11-12 11:53 Concerto-hello_exe.dll
    -rwxr-xr-x. 1 dick dick 13312 2010-11-08 14:06 Concerto.dll
    -rw-r--r--. 1 dick dick 384 2010-11-04 20:10 hello.cs
    -rwxr-xr-x. 1 dick dick 3072 2010-11-05 16:51 hello.exe
    -rw-r--r--. 1 dick dick 3584 2010-11-12 11:53 hello.exe.ins

    $ mono hello.exe.ins
    Hello, world!
    1
    1
    2

    $ ls -l hello.* Concerto*dll *.mprof
    -rw-r--r--. 1 dick dick 13824 2010-11-12 11:53 Concerto-hello_exe.dll
    -rwxr-xr-x. 1 dick dick 13312 2010-11-08 14:06 Concerto.dll
    -rw-r--r--. 1 dick dick 365 2010-11-12 11:54 concerto-hello_exe.mprof
    -rw-r--r--. 1 dick dick 384 2010-11-04 20:10 hello.cs
    -rwxr-xr-x. 1 dick dick 3072 2010-11-05 16:51 hello.exe
    -rw-r--r--. 1 dick dick 3584 2010-11-12 11:53 hello.exe.ins


    Now let's inspect the output with the mprof-decoder


    $ mono mprof-decoder.exe concerto-hello_exe.mprof

    ------------------------------------------------
    Reporting execution time (on 4 methods)
    97.19% (0.007282s) hello.System.Void hello::Main()
    2.48% (0.000186s) hello.System.Int32 hello::DoStuff(System.Int32)
        1 calls from hello.System.Void hello::Main()
        1 calls from hello.System.Void hello::DoMoreStuff()
    0.33% (0.000025s) hello.System.Void hello::DoMoreStuff()
        1 calls from hello.System.Void hello::Main()


    ------------------------------------------------
    Reporting execution time by stack frame
    97.19% (0.007282s, 1 calls) hello.System.Void hello::Main()
        2.33% (0.000169s, 1 calls) hello.System.Int32 hello::DoStuff(System.Int32)
        0.34% (0.000025s, 1 calls) hello.System.Void hello::DoMoreStuff()
            66.67% (0.000017s, 1 calls) hello.System.Int32 hello::DoStuff(System.Int32)


    Finally an example of picking classes to instrument would look like this:

    Get the list of classes with verbosity level 2:


    $ mono Instrument.exe -vv plastictcpchannel.dll
    Instrumenting plastictcpchannel.dll, creating plastictcpchannel.dll.ins
    The helper assembly is Concerto-plastictcpchannel_dll.dll
    Data shall be written to concerto-plastictcpchannel_dll.mprof
    Instrumenting class Codice.Channels.ClientSinkProvider
    Instrumenting class Codice.Channels.ClientSink
    Instrumenting class Codice.Channels.PlasticBinaryServerFormatterSink
    Instrumenting class Codice.Channels.PlasticBinaryServerFormatterSinkProvider
    Instrumenting class Codice.Channels.PlasticTcpChannel
    Done.


    Pick a couple of classes to instrument:


    $ mono Instrument.exe -vv \
        -c Codice.Channels.PlasticBinaryServerFormatterSinkProvider \
        -c Codice.Channels.ClientSinkProvider plastictcpchannel.dll
    Instrumenting plastictcpchannel.dll, creating plastictcpchannel.dll.ins
    The helper assembly is Concerto-plastictcpchannel_dll.dll
    Data shall be written to concerto-plastictcpchannel_dll.mprof
    Instrumenting class Codice.Channels.ClientSinkProvider
    Instrumenting class Codice.Channels.PlasticBinaryServerFormatterSinkProvider
    Done.


    Enjoy!

    Branch explorer tour

    It's been a week since we announced the Community Edition, and a large number of developers have already downloaded Plastic and started to work with it.

    People are coming from different SCMs and jumping into Plastic: developers with a background in CVS, SVN, Mercurial, Git and even ClearCase and TFS. So, this is a really heterogeneous group.

    My intention today is to focus on one of the key features in Plastic: the branch explorer. This is to help newcomers get the best out of our SCM (existing users, and anyone who has attended a demo, are already more than familiar with it).

    So I just recorded a short screencast showing some of the key functionalities within it. Almost everything is doable from the BrEx (as we call it internally), and you guys should get familiar with it!

    Here we go (I recorded it in high-res so configure your viewer accordingly):

    Mylyn integration

    One of our key goals with the new release is to continue improving Plastic SCM support for the Java/Eclipse ecosystem.

    The new release includes a number of important improvements to the core Eclipse integration (such as the new Synchronize view, the ability to directly import projects from Plastic SCM during workspace creation and revisited check-in functionality) and the totally new Mylyn integration to improve parallel development within Eclipse.
    The Mylyn integration represents a significant step for Codice as it will streamline parallel development for Eclipse users.

    New Mylyn integration


    Mylyn is the task and application lifecycle management (ALM) framework for Eclipse and as such is all about creating a task-focused interface for developers. Plastic SCM is designed to implement a task-oriented cycle through extensive use of parallel development with branching and merging. It is clear that the two tools can team up perfectly to create a really strong task-oriented environment.

    One limitation of Mylyn, when the underlying version control system has limited branching capabilities (as happens with Subversion, CVS or Perforce), is its inability to correctly manage overlapping changes: what if two different tasks have to modify a colliding set of files? Because of these SCM (Software Configuration Management) limitations, Mylyn cannot handle the situation correctly unless the changes for one task are committed before switching to a different task. So, having several open tasks with pending changes at the same time is troublesome.

    The Plastic SCM Mylyn plugin resolves the issue by extending the basic Mylyn functionality and associating each task with a different Plastic branch, hence implementing real parallel development and powerful isolation between branches.

    As the following figure shows, when the developer activates a different task, the Plastic SCM Mylyn plugin assists the user in creating or switching to a different branch associated with that task.

    Task-to-branch association couldn’t be simpler or more effective than the way the new Mylyn plugin implements it: the entire branch management is driven by the tool, assisting the user at every step.
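    The create-or-switch behaviour described above can be sketched in a few lines. This is only an illustration of the idea, not the plugin's code: the `TaskBranchSwitcher` class, the `task-<id>` naming scheme and the `main` starting branch are all assumptions made for the example.

```python
class TaskBranchSwitcher:
    """Hypothetical sketch: keep one branch per task and switch between them."""

    def __init__(self):
        self.branch_for_task = {}     # task id -> branch name
        self.current_branch = "main"  # assumed starting branch

    def activate(self, task_id):
        """Switch to the task's branch, creating it on first activation."""
        created = task_id not in self.branch_for_task
        if created:
            # Assumed naming scheme; a real tool would let the user choose.
            self.branch_for_task[task_id] = f"task-{task_id}"
        self.current_branch = self.branch_for_task[task_id]
        return self.current_branch, created

switcher = TaskBranchSwitcher()
print(switcher.activate("101"))  # first activation creates the branch
print(switcher.activate("102"))  # a second task gets its own branch
print(switcher.activate("101"))  # going back just switches, nothing is created
```

    Since each task lives on its own branch, two open tasks that touch the same files never share pending changes, which is exactly the overlap problem described earlier.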


    Plastic SCM integrates natively with a number of issue tracking and project management systems such as Atlassian Jira, FogBugZ, Mantis, VersionOne, Rally, Bugzilla and others. Now the integration and task visualization are available from the Eclipse side just as they are from the GUI. The following figure shows the Plastic SCM graphical user interface running on Ubuntu Linux integrated with Jira.



    Eclipse plugin enhancements


    There are several new features on the core Eclipse plugin for Plastic SCM, all focused on improving the general usability within the IDE.

    The first functionality to be described is the new synchronize view, available from the team menu.


    The new synch view introduces an alternative way to perform checkins of pending changes, more familiar than the former checkin dialog especially for Eclipse users previously using Subversion or CVS.

    The team synchronization view allows an easy navigation and review of the pending changes and also enables an easy way to commit all changes together.



    The second important change introduced by the release of this revised Eclipse plugin is the new checkin pending changes dialog, improved to match the user’s expectations.



    Finally, we’ve added a new import option to the IDE so developers can easily configure new workspaces by downloading the code directly from Plastic, as they’d do using Subversion or CVS.



    The import wizard will guide users through the workspace creation process and will also help them select the repository and branch or label to work on.



    Download and try!


    It's been only a few days since we released the Community Edition, which as you know is free for small teams (up to 15 developers!). Interest is pretty high and we expect Eclipse users to be taking a look at Plastic too with this new enhanced plug-in.

    So, you know: spread the word!! Tell everyone Plastic is now free and they can start using it asap at no cost!

    Plastic 3.0.7 is out!!

    Together with the launch of the new Community Edition we're announcing a new version ready to be downloaded: 3.0.7! And it comes loaded with new features: new Visual Studio integration and Mylyn integration for Eclipse users.


    Our policy is to release a couple of major releases a year, but our dev team is so active and there are so many ideas floating around that we need to publish new features more often than that.

    That’s why, after we released 3.0 a few months back, instead of just “retaining” new features for the next major version, we’re publishing them on a frequent basis.

    And that’s why today I’m announcing a good number of new features in 3.0.7, together with some fixes and customer requests.

    You can find all the details in the release notes, but let me explain in a little more depth what 3.0.7 is about.

    New Visual Studio integration package


    Visual Studio is one of the most used IDEs by Plastic SCM users out there and that’s why we’re trying to continuously improve our integration with it. As part of the new release, we’re introducing a new Visual Studio Package for Visual Studio 2005 and higher. The new package allows our customers to implement full SCM operations.

    Why are we coming up with a new package? Here are the reasons:


  • We had originally offered very basic SCC integration, which is still maintained and enables compatibility with older versions of Visual Studio. However, getting rid of the old SCC limitations will enable us to continue evolving the integration rapidly and without meaningless restrictions.
  • We are now able to add more integrated functionality: checkin behavior is now shared with the GUI, the same happens with the access to all the GUI views and so on.
  • We have created a better base to continue evolving plugins and enable more innovative functionalities. We’re considering a bunch of ideas to extract info in real time from the SCM and make it available while the programmer is coding, helping him to make the right decisions. Stay tuned because there will be a lot more coming.

    Now let me describe in more detail what the new package looks like.

    Configuring the new Version Control Package


    Going to Tools/Options you’ll be able to select the new “Source Control Package.” Keep in mind that you can choose between the old SCC integration and the new one; it’s up to you!


    New context menus available



    Thanks to the new integration we were able to include much more functionality in the Visual Studio IDE.
    Look at the new context menu options that were simply not available before, like annotating code or showing the history and the 3D tree (before, only the tree was available).



    All the views in the Plastic GUI are also available from View/Plastic SCM menu (it was there also in the former 2005 package, but now we added a “workspace explorer” too).



    Integrated history


    Several users requested it in the past, and in fact it is something we should have added long ago. Finally, the history of an item can be browsed from within Visual Studio.



    Integrated annotate


    Annotate is now available within the Visual Studio IDE.


    New pending changes view


    The checkin window is now shared with the GUI, displaying the “pending changes” dialog.



    Integrated diff


    We’ve made several changes in the diff infrastructure for 3.0.7 and now there’s a new configuration page on the wizard to let you set up the diff tools.



    You can choose the “embedded diff viewer” (new) or the Plastic SCM diff tool (the default) or set up your own diff tool.
    In case you go for the “embedded” one the diffs will be displayed as follows in the GUI and Visual Studio.





    Hope you enjoy the changes!!!
  • Building Plastic SCM's Community With a Community Edition

    We're just announcing a Community Edition of Plastic SCM. We will license Plastic SCM for free to environments with up to 15 developers, which will cover a huge percentage of the development teams.

    For the ones new to Plastic SCM:

  • It is a distributed version control system that can act as a centralized one too
  • It is very, very strong at branching and especially merging (no issues with renaming, full merge tracking, visual merge diagrams (branch explorer), 3D version tree and many more). It includes tools like xmerge and xdiff, the only "refactor aware" diff and merge tools on the market.
  • It focuses on visualization: from the branch explorer, the 3D version tree, the view based GUI, integration with Visual Studio and many more




    We now have hundreds of customers whose developers rely on Plastic SCM to manage and overcome many of the challenges related to distributed development. We hear from our users often, and that feedback is very important to us. We want to build a community of Plastic users who exchange information about the product and their successes, while giving us constructive feedback on the product.


    The new edition aims to build our community. With the new licensing, our customers receive a full-featured version of the product, with all the bells and whistles one receives from our enterprise licensed version. Our customers often tell us that Plastic is the only product out there that can handle merging of distributed code while providing an easy-to-use graphical interface. Our interface is not an “add-on,” it’s a cornerstone. We built Plastic with things such as visualization, usability, advanced diff and merge tools in mind. And now the full features of Plastic are available to small teams for free.

    Community edition customers will be able to subscribe to priority support in case they need more assistance than our user community forums provide.

    We invite all developers to use Plastic SCM for their smaller projects by capitalizing on the community edition licensing. We then invite you to visit and contribute to our forums.

    For more information, visit the community edition product page, which also leads to a FAQ.

    Also today, we're introducing new enhancements to Plastic SCM, as outlined in two additional blog posts.
  • Real Time Web Analytics