Version control scalability shoot-out!

July 23, 2010

Let's go straight to the point: we took 2 of the biggest mainstream version control systems and put them to work under really heavy load: 1 single server and 100 concurrent clients. And then we compared them with Plastic SCM 3.0.

Test environment

A mainstream quad-core 64-bit server with 4 GB of RAM. Nothing fancy at all, just what you can purchase for about $500.

100 desktop computers like the ones your team might use, all of them running Linux. They're quite heterogeneous: from five-year-old single-core machines to newer quad-core ones, with 1.5 GB to 4 GB of RAM.

Server and clients are connected by a simple 100Mbps network.

Sample repository

We tested with a variety of different repositories, from really small ones to larger ones.

The one I'm describing today is just a small one (and I can tell you results only get worse for the slow SCMs with more data...):

  • 1376 files
  • 66 folders
  • 22.4 MB when downloaded to a workspace

    Test automation

    In order to automate all the client machines we used PNUnit, you know, the extension we made to NUnit for "parallel" testing. Quite useful for load testing.

    Test scenario 1 - working on trunk

    A really simple scenario every developer is familiar with: just commit changes to the main branch.

    Every client will do the following:
  • Modify 100 files
  • Checkin
  • Repeat 5 times.
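
The loop each client runs can be sketched roughly as follows. This is only an illustration, not the actual PNUnit test code: the function name `run_trunk_scenario`, the way files are picked, and the `checkin` callable are all assumptions, and the real checkin command depends on the SCM under test.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def run_trunk_scenario(
    workspace: Path,
    checkin: Callable[[List[Path]], None],
    files_per_round: int = 100,
    rounds: int = 5,
) -> int:
    """Modify files, check them in, repeat. Returns the number of checkins."""
    # Hypothetical selection rule: take the first N files in sorted order.
    files = sorted(p for p in workspace.rglob("*") if p.is_file())[:files_per_round]
    for round_number in range(rounds):
        for path in files:
            # Append a line so the content really changes in every round.
            with path.open("a", encoding="utf-8") as handle:
                handle.write(f"change {round_number}\n")
        # 'checkin' stands in for the SCM-specific commit command.
        checkin(files)
    return rounds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for i in range(120):
            (root / f"file_{i:03d}.txt").write_text("initial\n", encoding="utf-8")
        batches: List[List[Path]] = []
        run_trunk_scenario(root, batches.append)
        print(f"{len(batches)} checkins of {len(batches[0])} files each")
```

Passing the checkin step in as a callable keeps the loop identical for every system tested, so only the adapter changes between runs.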

    Test scenario 1 - working on trunk - results

    OK, how do our beautiful friends behave under really heavy load? Considering we tested with Subversion and Perforce, 2 of the most widely used version control systems on the market, we expected high scalability... :)

    We used SVN 1.5.7, Perforce 2009.2 64-bit and Plastic SCM 3.0.

    All results are using a Windows server and Linux clients, except for Subversion: we ran the SVN server on Linux (dual-boot server machine) because on Windows it couldn't handle more than 30 concurrent clients without consistently crashing (out of memory, 4 GB!!! and gone).

    We ran the same test described above with 1, 10, 20, 50 and 100 clients. Check the results here:


    The two old irons don't scale that well at all, huh? ;-)

    Plastic is using a SQL Server backend and it seems it can handle the load much better than the others, even doing trunk development.

    Test scenario 2 - working on branches

    The second scenario tries to reproduce a "branch per task" pattern, something we strongly recommend with Plastic.

    The scenario is as follows:

  • Update to trunk
  • Create a branch from trunk
  • Switch to it
  • Perform changes on the branch (about 50 modified files)
  • Checkin changes
  • Go to step number 2 (5 times)
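
As a rough sketch, the steps above could be driven like this. Everything here is hypothetical: the `VersionControl` interface, the `RecordingClient` stand-in, the branch naming and the trunk name `main` are assumptions made for illustration, and "go to step number 2 (5 times)" is read as one initial update followed by five branch rounds.

```python
from typing import List, Protocol


class VersionControl(Protocol):
    """Hypothetical interface; each SCM under test would need its own adapter."""

    def update(self, branch: str) -> None: ...
    def create_branch(self, name: str, parent: str) -> None: ...
    def switch(self, branch: str) -> None: ...
    def modify_files(self, count: int) -> None: ...
    def checkin(self, message: str) -> None: ...


class RecordingClient:
    """Stand-in that only records the operations it is asked to perform."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def update(self, branch: str) -> None:
        self.log.append(f"update {branch}")

    def create_branch(self, name: str, parent: str) -> None:
        self.log.append(f"create {name} from {parent}")

    def switch(self, branch: str) -> None:
        self.log.append(f"switch {branch}")

    def modify_files(self, count: int) -> None:
        self.log.append(f"modify {count}")

    def checkin(self, message: str) -> None:
        self.log.append(f"checkin {message}")


def run_branch_scenario(
    vcs: VersionControl,
    client_id: int,
    rounds: int = 5,
    files_per_round: int = 50,
    trunk: str = "main",
) -> None:
    """Update to trunk once, then branch, switch, modify and check in per round."""
    vcs.update(trunk)
    for task in range(1, rounds + 1):
        branch = f"task-{client_id}-{task}"  # unique per client and round
        vcs.create_branch(branch, parent=trunk)
        vcs.switch(branch)
        vcs.modify_files(files_per_round)
        vcs.checkin(f"work on {branch}")


if __name__ == "__main__":
    client = RecordingClient()
    run_branch_scenario(client, client_id=7)
    print("\n".join(client.log))
```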

    Test scenario 2 - working on branches - results

    We always say most of the version control systems out there are not ready to handle branching, and we always hear people asking why.

    Ok, a picture is worth a thousand words.

    If a data point is missing for one of the version control systems compared, it is not a mistake: the server simply starts locking too much, rejecting clients and making the test fail (even though the test retries when it gets rejection errors).
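
To make the retry handling concrete, here is a minimal sketch of a client-side retry wrapper. It is not the code used in the test: the `ServerRejected` exception, the `with_retries` helper, the attempt limit and the exponential backoff with jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ServerRejected(Exception):
    """Hypothetical error raised when the server refuses a client."""


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run 'operation', retrying on ServerRejected up to 'attempts' times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ServerRejected:
            if attempt == attempts:
                # Out of retries: let the failure reach the test runner.
                raise
            # Assumed policy: exponential backoff with jitter, so rejected
            # clients do not all come back at the same instant.
            sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))
    raise RuntimeError("unreachable")
```

With a wrapper like this, a missing data point means the operation was still being rejected after the last attempt.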

    More data

    I'll be sharing the data regarding the Plastic server running on Linux in the coming weeks. We used MySQL on Linux and while it is slightly slower, it still consistently beats all competitors.

    Anonymous said...

    you should do a comparison to git.

    pablo said...

    Hi @anonymous!

    Yes, that would be a very good idea, indeed we were talking about it today.

    But in order to really compare Git under heavy load, what do you think the scenario should be?

    We considered something like:

    - make changes locally
    - push

    And then of course have 100 nodes doing the same...

    I think it could be fair.

    bionic said...

    This benchmark does make you look rather good.

    However, the repository size simply doesn't reflect something 100 developers would work on. How about you try with something closer to 50-100k files. Also, how large was the revision history?

    At what frequency were the operations submitted? As fast as possible? Or something more closely resembling human behaviour?

    pablo said...

    Hi bionic,

    The repository size is tiny. As I mentioned in the post, we have more test results, but I started by publishing the tiny one.

    We ran exactly the same tests on a repository with 25k files on trunk (LAST) in two modes: without any data (empty files, so we aren't affected by the network) and with data. Plastic is always faster under heavy load.

    The repositories are empty for SVN and Perforce when the test starts. If they are not, those systems get even slower, something that doesn't affect Plastic.

    Finally, yes, the operations are submitted as fast as possible, so in fact we're simulating much more than 100 users. I wouldn't know exactly how many but of course more than 100 since obviously a human is not able to work that fast.

    I can tell you that the 100-user test generates, in 30 minutes on a 25k-file repo, the same workload that we consistently track on a real team of the same size in 8 hours of work (a regular day).

    Hope I've answered all your questions.

    Filip Navara said...

    It is a bit unfair to compare against Subversion 1.5.7 when the latest version is 1.6.12. Also, there's no information on whether the FSFS or BDB backend was used, although that can make quite a difference under such load.

    pablo said...

    Hi Filip,

    Actually we did run against the latest SVN too, but we didn't find any differences in terms of performance. When we ran the latest test suite last week we just tried with 1.5.7.

    In fact, we've been testing against SVN under heavy load since 2006, and 1.4 was (and still is) the only one not crashing when running on a Windows server.

    About the backend type, we're using the default FSFS. We can easily rerun using BDB and share the results.

    Elaine said...

    We have Subversion implementations with 40,000 users, 18 million transactions per day. That's real life not a lab. That's why Subversion has 5 million implementations and probably why bogus 'bench marks' are published that are NOT independent or endorsed by a REAL customer.

    I am more than happy to put your product into our lab and push it through its paces with our simulation spec which is, by definition, a MASSIVE implementation.

    Filip Navara said...

    Hi Pablo,

    I wouldn't expect dramatic differences with the newer SVN version, but it seemed a bit unfair to compare the latest and greatest Plastic SCM version against a year-old version of SVN.

    If it's not too much effort I would welcome the comparison to SVN w/ BDB backend since it has very different performance characteristics than the FSFS one. It is faster for certain operations and slower for others. Also I hope the SVN protocol was used for accessing the repository and not the HTTP one, which arguably performs much worse than the native protocol.

    That said, I welcome all the improvements to Plastic SCM and applaud all the work on the performance side.

    pablo said...

    Hi Filip,

    Absolutely. I can tell you that what we tried on the Windows server is 1.6.4 (r38063), and on the clients we have 1.6.11 (r934486). But we were never able to avoid crashes with the Windows server; that's why we tried with a Linux server and, as expected, it went smoother.

    We used the svn protocol, not http.

    We will definitely give a try to the other database backend and I'll be more than happy to share the results, of course. We've been working on performance for a long time and of course Subversion is a widely adopted product, so trying to beat it is a big challenge for a team our size.

    Thanks for the remarks and comments, I hope to be sharing more info soon.

    I'd also like to add Hg and Git to the comparison, although we're still trying to define a fair scenario, as I mentioned above in a previous comment.

    Thanks! :)


    pablo said...

    Hi Elaine,

    40k users! That's impressive!

    I guess you're using a good number of SVN servers to support that, aren't you?

    I mean, that's still great, but it is not exactly the same kind of test.

    We're just testing a simple $500 server handling as much load as it can, and there I can tell you SVN doesn't do a good job.

    Of course, maybe the scenario I described is not fair to SVN (although I doubt we can find anything simpler than the first one, working on trunk, as I wrote in my post), or maybe our configuration is not the best for SVN and there are some tweaks we can use to improve performance, as Filip was mentioning.

    I'll be happy to share the commands we're running on plastic and SVN for the test too, of course.

    And, definitely, I'd love you guys to give a try to Plastic. I know you've done an excellent job trying to get the best out of Subversion, so if you're looking for something more scalable and functional, we'll be more than happy to talk. If you just want to try to crunch plastic through your test suite, of course we'd love to see that too. :)

    Only one remark: being smaller than Wandisco doesn't make us "bogus", it just makes us smaller.

    Thanks for your comments.


    pablo said...

    Hi Filip,

    We're testing *right now* on our testing cluster (where we've another 100 machines, like the customer site we used before) with the Berkeley backend on Linux, with the tiny repos.

    The SVN Linux server runs out of memory as soon as the 100 clients start updating.

    majestic:~ # svn log svn://localhost:8084 -r head
    svn: Berkeley DB error for filesystem '/root/svnrep/db' while opening environment:

    svn: Cannot allocate memory
    svn: bdb: unable to allocate memory for mutex; resize mutex region

    We'll continue testing and trying to correctly configure it.

    pablo said...

    With only 10 clients, the Subversion Berkeley DB backend is much, much slower than the FSFS one.

    So I think we selected the fastest for a fair comparison.

    We're also trying with the latest SVN server on Linux and it is slightly slower than the one we used (1.5.x) for testing.

    robertc said...

    It would be interesting to see more details - e.g. how you provoked or avoided merge conflicts.

    Also potentially interesting to reproduce the test :)

    Anyway, good to see you making progress!


    pablo said...

    Hi Robert,

    The test bots (as we call them) are configured so they always modify file ranges that don't collide; that's how we avoid merging (that would be a subject for another test).
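
One simple way to get non-colliding ranges is to give each bot its own contiguous slice of the file list. The sketch below is an assumption about how that could be done, not a description of the actual bot configuration; the `file_range` helper is hypothetical, and the 25,000 files and 100 bots are just the figures mentioned earlier in these comments.

```python
def file_range(bot_index: int, total_files: int, bots: int) -> range:
    """Return the file indices owned by one bot; slices of different bots never overlap."""
    if bots < 1 or not 0 <= bot_index < bots:
        raise ValueError("bot_index must be in [0, bots)")
    base, extra = divmod(total_files, bots)
    # The first 'extra' bots get one additional file each, so every file
    # is owned by exactly one bot even when the division is not exact.
    start = bot_index * base + min(bot_index, extra)
    size = base + (1 if bot_index < extra else 0)
    return range(start, start + size)


if __name__ == "__main__":
    for bot in (0, 1, 99):
        owned = file_range(bot, total_files=25_000, bots=100)
        print(f"bot {bot}: files {owned.start}..{owned.stop - 1}")
```

Since no two bots ever touch the same file, concurrent checkins never need a merge.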

    We do use PNUnit as the load test coordinator (you know, our humble contribution to NUnit) and yes, I'd be more than happy to share the code and the setup, no prob. I just have to find some time to make it happen ;)

    Thanks for the support!

