You’ve all used at least one of the SCMs on the following list, but are you aware of how long the system you’re using has been around? Do you know the big names? Ok, that’s what I’ll try to supply with this short compilation.
The big picture
Look at the following diagram to find some of the main names in SCM history. Yes, I must be missing a good number of them, so don’t be shy: post a comment and I’ll update the list with your favorite one I missed :)
If you’re still feeling good about using really “old irons”, I’ve added some pictures of what cell phones looked like when the SCMs were released, so you feel older and bloody outdated :). So yes, if you’re using CVS and still think it’s ok, look at the cell phone directly below “CVS” in the diagram. Do you feel like Gordon Gekko (Michael Douglas) in “Wall Street: Money Never Sleeps”, getting his brick-sized cell phone back as he leaves jail?
There was a time when you stored your versions manually. Ok, for many of you this time wasn’t the 80s, but a few years back when you were at college naming your source-code archives exercise.zip, exercise-0.zip, exercise-good.zip, exercise-good-final.zip, and so on. Well, believe it or not, there was a time without real SCMs. It was always dark and people were living in caves.
Then 1982 came and RCS was released. RCS is not a huge piece of technology, but you can still find it around in Unix distros. It is simple and straight to the point.
One nice feature was that text changes were stored as deltas (pretty important, considering hard drives used to be small!). Deltas are still used nowadays by most SCMs.
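To make the delta idea concrete, here is a minimal sketch in Python. It is not RCS’s actual on-disk format; the function names `make_delta` and `apply_delta` are made up for this post. The point is that a new version is stored as “copy these old lines, add these new ones” instead of as a full copy.

```python
from difflib import SequenceMatcher

def make_delta(old, new):
    """Describe `new` as copy/add instructions relative to `old` (hypothetical format)."""
    delta = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))       # reuse lines old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("add", new[j1:j2]))    # only genuinely new lines are stored
        # "delete" needs no entry: deleted lines are simply never copied
    return delta

def apply_delta(old, delta):
    """Rebuild the new version from the old one plus the delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out

v1 = ["int main() {\n", "    return 0;\n", "}\n"]
v2 = ["int main() {\n", '    puts("hi");\n', "    return 0;\n", "}\n"]

delta = make_delta(v1, v2)
rebuilt = apply_delta(v1, delta)
```

Here `rebuilt` equals `v2`, while the delta itself only carries the lines that changed. On a big file with a one-line fix, that is the whole saving.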
Some RCS drawbacks worth mentioning:
The classic era
In the SCM arena, the 90s are the classic era.
It all started with CVS (Concurrent Versions System) in 1990. It was able to handle multiple versions being developed concurrently on different machines and stored on a central server. The client-server age was upon us and developers took full advantage of it.
CVS was able to handle versions in a decent way. And it even supported branching and merging, though it wasn’t very good at doing it. That’s one of the reasons many people are scared of the “B” word and the “M” word.
CVS didn’t track directories or filename changes (no refactoring allowed here!) and heavily relied on locking the whole repository. It is outdated now, but it worked in the 90s! (If you have it, just walk away and go on to something else!)
Polytron Version Control System (PVCS) was initially released in 1985 and then went through a series of mergers and acquisitions: Polytron, then Sage, Merant, and finally Serena.
It’s an old, outdated system (initially designed to avoid branching/merging, using file-locking instead), but it’s still supported by Serena Software.
In 1992, one of the major beasts in the SCM world was born. ClearCase was clearly ahead of its time and for some it is still the most powerful SCM ever built.
Outdated, slow-moving, overpriced, and overly complicated to administer (in the early days, you had to generate a new Unix kernel to run the beast!), good-old CC isn’t the cool guy anymore -- you can hardly find anything positive about it on the net. But it’s still very good at branching and merging and still has unique features, such as its legendary “dynamic views”. While powerful, CC came from a time when disk space was scarce and networks were mostly LANs, with no concerns for things like latency or working through firewalls.
Atria (the developer of ClearCase) merged with Pure (which was run by Reed Hastings, now the head of Netflix), and the combined company was purchased by Rational, which in turn was bought by IBM. And lo, the powerful CC stopped evolving. Well, it did evolve towards UCM in the early 2000s, which basically got rid of all the good things and left the weak ones, together with a huge price. Not a very good idea.
ClearCase is still one of the most-used SCMs in the corporate world, and certainly one of the revenue leaders.
All the systems on my list had their moment and their clear advantages over previous systems. All except Visual SourceSafe. VSS was a weak system from day one, forcing developers to work with a “locking” approach, discouraging parallel development and creating a huge “fear of merging”.
Slow, error prone, and utterly limited, VSS has been one of the most-used systems by Windows developers around the world. It is still in use, spreading pain and fear among good-hearted coders. But VSS was ahead of its time in one sense: it more properly belongs in the “dark SCM middle ages” (see below), instead of the classic era.
VSS was entirely graphical, which was probably one of the reasons why it was widely adopted (along with being closely tied in with Visual Studio distributions).
Perforce (P4) is one of the independent vendors who are totally focused on SCM, battling for the SCM gold. It is still one of the market leaders among mid-range companies with huge teams, and it has a strong presence in some market niches, such as the gaming industry.
When it was released in the mid 90s, P4 was one of the most affordable and powerful systems to date. Worlds ahead of VSS and CVS, it was never at the level of ClearCase. But it was able to clearly beat CC in cost, performance, and ease of use.
Being centralized and not very good with branching and merging (branches are implemented as subdirectory trees – didn’t they ever hear of metadata?) P4 doesn’t seem to be the best option for the future, but it is rock solid, mature, and well established. That will help it keep growing. At the time of this writing, P4 is the biggest code repository inside Google. Cool!
Enter the middle ages
A time of darkness, when most of the previous advances were lost and a degraded environment emerged…
Subversion (SVN) was conceived as “enhanced CVS” and its developers hit their target: it is better than CVS. Period.
Although systems like ClearCase were perfectly capable of branching and merging, SVN educated an entire developer generation on the following dogma: fear branching and merging at all cost! This caused environmental damage that persists to this day, only starting to be healed by the new DVCS generation.
SVN was close to P4 in features, and spread like crazy: more than 5 million developers around the world use SVN on a daily basis. Huge!
SVN is extremely simple to use and evangelized everyone on the “mainline development model”. Error-prone (break the build!) on non-toy projects, it helped develop techniques like “continuous integration” as a way to “avoid integrations”. While the idea is good, most of the surrounding concepts were clearly limited by the tool itself.
Linus himself raged against SVN when he presented Git in his Google tech talk back in 2007.
During 2009 and 2010, all major open-source projects on earth gravitated away from SVN. A good sign of how wrong SVN was. But it’s still big and won’t die for ages.
Born in an age of darkness, AccuRev was developed as an entirely new approach to source control. Its original way of doing things still seems new to lots of developers nowadays.
AccuRev has strong support for branching (“streams” in its jargon) and merging. It has played a valuable role in helping the community move away from ClearCase and older tools like CVS.
Enter The Renaissance
After an age of darkness, an entirely new generation of SCM systems broke the established status quo. “SCM is a mature market” was the analysts’ conventional wisdom, but the new generation broke onto the scene and blew everything apart.
Able to sever ties with the Internet and work unplugged (like cool rock stars), the new generation also excels at branching and merging, which was touted as the root of all evil during the “dark ages”. These new systems have successfully shifted the tide in the “branching/merging is good” direction.
BitKeeper was one of the innovators in the DVCS field. Designed by Larry McVoy (who previously worked on TeamWare, Sun’s internal version control system, built on top of SCCS, long evolution story here…) it rose to fame in 2002 when the Linux kernel development team started using it. A huge flame war started, with some developers complaining about using commercial tools for the world’s premier open-source project.
Things only got worse in 2005 when fights with the core kernel developers grew even bigger. BitMover, the company behind the product, became concerned about people reverse-engineering their code. They discontinued support for open-source development and, ironically, thus prompted the creation of Git to fill the gap.
For more, see http://en.wikipedia.org/wiki/Bitkeeper.
Linus Torvalds, the father of Linux himself, designed and implemented the first version of Git (almost over a weekend, in pure-hacker style) to give his kernel developers an alternative to BitKeeper. Linus not only did the original design (simple, clean, genius), but helped promote the project with his unique style. (See http://codicesoftware.blogspot.com/2007/05/linus-torvalds-on-git-and-scm.html.)
During his famous speech, he heavily criticized (ok, insulted) CVS, SVN, and Perforce: “Subversion has been the most pointless project ever started”, “If you like using CVS, you should be in some kind of mental institution or somewhere else” and finally “Get rid of Perforce, it is sad, but it is so, so true”.
You can love him or hate him, but he definitely made his point: the Middle Ages were over and now distributed systems were to rule the world, including removing the arcane fear of branching and merging, a key concept behind every DVCS.
Over the following years, every major open-source project migrated away from Subversion towards Git (and www.github.com provided a really huge, huge hosting service), making it the strongest and coolest SCM on earth.
Git is based on a DAG structure (Directed Acyclic Graph), in which the main unit of change is the changeset. It implements full merge-tracking, but at the commit level instead of the individual file revision level (as, for instance, ClearCase does). It is extremely fast, with the only caveats being management of large binary files and the requirement to replicate repositories in their entirety.
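A tiny sketch of what commit-level merge tracking on a DAG means. This is a toy model in Python, not Git’s implementation: the `parents` table, the commit names, and the helper functions are invented for illustration. Each commit records its parents, a merge commit simply has two of them, and the common ancestor for the next merge falls out of the graph.

```python
# Hypothetical commit graph: each commit maps to the list of its parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],        # mainline continues after B
    "D": ["B"],        # feature branch forks from B
    "E": ["C", "D"],   # merge commit: two parents
}

def ancestors(commit, parents):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def merge_bases(a, b, parents):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c in ancestors(other, parents) - {other} for other in common)
    }

before_merge = merge_bases("C", "D", parents)   # the fork point
after_merge = merge_bases("E", "D", parents)    # the merge moved the base forward
```

Before the merge, the base of `C` and `D` is the fork point `B`. Once `E` exists, the base of `E` and `D` is `D` itself, so a later merge from the feature branch only has to consider what happened after `D`. Nothing per-file is recorded; the parent links on commits carry all the merge history.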
Git is clearly influenced by its kernel roots, and it’s obviously not the easiest thing on earth to use. But it will definitely be the SCM of the next decade. Check out this awesome book.
Mercurial (Hg) was first announced in April 2005, also rushing in after the BitMover decision to remove support for the free version. Hg is also one of the key open-source DVCSs, along with Git. They can even work together quite well: Scott Chacon, the Git evangelist and one of the best SCM tech writers ever, wrote a nice integration -- see http://mercurial.selenic.com/wiki/HgGit.
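But Hg differs quite a bit from Git in terms of design. They share the concept of commit/changeset as the unit of change, and in both the changesets point to their parents, hence the DAG. Git implements each commit as a pointer to a tree of directories and files, with nested trees for subdirectories. With Hg, every changeset points to a manifest, a flat list of all the files in that revision, and changesets, manifests, and file histories are each stored in an append-only structure called a revlog.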
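One way to picture the contrast between a nested tree and a flat file list is the toy sketch below. These are plain Python dictionaries, not the real storage formats of either tool, and the ids and the `flatten` helper are made up for the example.

```python
def flatten(tree, prefix=""):
    """Turn a nested {name: id-or-subtree} structure into a flat {path: id} listing."""
    manifest = {}
    for name, entry in sorted(tree.items()):
        path = prefix + name
        if isinstance(entry, dict):
            # A subdirectory: recurse and prepend its path to every file below it.
            manifest.update(flatten(entry, path + "/"))
        else:
            # A file: record its full path and content id.
            manifest[path] = entry
    return manifest

# Tree-style snapshot: directories are nested objects (hypothetical ids).
tree = {
    "README": "id1",
    "src": {
        "main.c": "id2",
        "util": {"io.c": "id3"},
    },
}

# Flat-style snapshot: one entry per file, keyed by full path.
flat = flatten(tree)
```

Both forms describe the same snapshot. In the nested form an unchanged subdirectory can be reused as a whole from one snapshot to the next; in the flat form every file shows up as its own line in a single list.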
(For more on Hg, including internals, see http://mercurial.selenic.com/wiki/Design and http://mercurial.selenic.com/wiki/DeveloperInfo.)
Mercurial provides very strong merging, but it’s a bit different from other SCMs in its branching model: it has “named branches” but the preference is to create a new repository as a separate branch instead of hosting “many heads” inside a single one.
Joel Spolsky has written an extremely good Hg tutorial (hginit.com), which will help a lot of new users. Spolsky’s company, Fog Creek Software, has recently released Kiln, a commercial wrapper around the Hg core.
Darcs (Darcs Advanced Revision Control System) is another open source attempt to get rid of CVS and Subversion. It started in 2002 and has been continuously evolving since then, reaching version 2.5 in November 2010.
The major shortcomings of Darcs have been performance and its different way of handling history: instead of managing “snapshots” (commits or changesets) it manages patches, but in a way that makes traversing history difficult to understand (the current state may never have existed as a real snapshot).
Bazaar (bzr) is another open-source DVCS, which tries to provide some fresh air to the SCM world. While less used than Git and Mercurial, Bazaar offers interesting features, such as the ability to work in a centralized way, if needed. (The “pure” DVCSs didn’t include central servers in their original design.)
Bazaar was developed by Canonical (yes, the Ubuntu company!) and became a GNU project in early 2008.
Plastic is a DVCS designed with commercial use in mind instead of open-source projects (unlike Git and Mercurial). Plastic was first released in late 2006, featuring strong branching and merging, including full merge tracking and rename support in merges. It provides a highly graphical working environment, with many data-visualization capabilities, including a 3D revision tree. This distinguishes it from DVCSs that are oriented toward the hard-core, CLI-oriented hacker community.
The motivation of Plastic’s developers (BTW, I’m one of them) is to target small and medium teams, closing the gap between expensive high-end systems like ClearCase and low-end ones like SVN.
Plastic is built around the concept of parallel development, encouraging use of the “branch per task” pattern (feature branches). It can handle thousands of branches without breaking a sweat. Plastic is also distributed, supporting disconnected development, pushing and pulling of changesets on branches, and conflict resolution.
A Community Edition of Plastic SCM was launched in November 2010.
Team Foundation Server
Microsoft, wanting to play a role in the SCM/ALM market, came up with Team Foundation Server (TFS). It’s an effort to heal the pain caused by its own VSS devil.
While TFS is not very strong as a source-control system (kind of a new guy on the block, but using previous-generation technology), it comes fully packaged with a huge set of tools, from issue tracking to test management, in the pure “corporate-huge-integrated-thing-style”.
You won’t be doing branching, merging, or DVCS if you go for it, but maybe your company already purchased it, along with an MSDN subscription.