GIT vs Plastic performance
Performance! I’m starting to be obsessed by this word. Almost since the beginning of the project one of my main concerns has been performance: being able to run update operations faster and faster and faster.
If I remember correctly, months before our initial release I was already checking whether Plastic was faster than Perforce updating the workspace, which is one of the most common version control operations.
At a certain point in time we were able to beat Perforce (which claims to be the fast scm out there), so I was extremely happy. I knew Perforce was still faster than Plastic doing group check ins and add operations, but we were faster updating.
Later on I run a test to compare Plastic to SVN and… doing “update forced” (which means downloading again everything into our workspace) was about three times faster!! Good old days!
The last weeks I’ve been working on a big change to the “solve selector” code. What does it mean? Well, selectors are the key piece of the Plastic system, whether they are visible or not (it depends on whether users are using professional version instead of the standard one, and even then many users don’t feel like bothering with them, so prefer the “switch to branch” mechanism) they still play a core role into the Plastic system. Simply put: revisions are not “directly linked” to their containing directories (as it happens with “path space branching systems” like TFS, Perforce, Subversion and so on) but the system has to “solve” where a revision is depending on the selector.
Does it mean that the same revision of a file could be placed in different locations only changing the selector? Yes! This is exactly what it means.
The mechanism is extremely powerful, but it comes with a cost: it is not especially simple to implement. In exchange you get true rename and many other possibilities, but the system has to perform some work to calculate where each element is located.
Well, the point is that BL064 brings a whole new selector solving mechanism that greatly outperforms the old one. Now not only “update forced” is fast, but also detecting what has to be changed if someone committed a change to the branch you’re working on. The new system is much, much, much faster.
-“BL064? What are you talking about? You released Plastic 1.5 a few weeks ago… what’s that?” – I can hear some of you complaining… :-)
Well, we do use an internal build numbering (which is also available on the external releases, please take a look into the versioning info of the executables) scheme (as everybody else does, nothing new under the sun), so our last 1.5 release is actually BL063. But we’re currently using BL065 internally to support our development. So, yes, I’m afraid the enhancements I’m talking about are not yet available, please be patient.
Well, I was very happy with the new “selector system”, but then I suddenly remembered GIT… Oh! We thought we were the fastest ones in town doing updates but then I tried GIT and… well, it is stupid and ugly :-), but it is damn fast…
So I installed again GIT 220.127.116.11 on our Linux FC4 box, which has a couple of Intel(R) Xeon(TM) CPU 3.00GHz. It is already >2 years old but still runs quite fast.
$ uname -a
$ uname -a
Linux juno 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386 GNU/Linux.
Then I imported one release of our own code into a git repository. I removed all the local files (except the .git subdirectory, of course!) and run “git checkout .” which is supposed to recreate all the files. It took about 20s…
Then I tried exactly the same scenario with Plastic (remember, BL065). It took about 29s. Yes, GIT is faster but… hey! Plastic server was actually installed on another box so… all the contents was downloaded through our local 100Mbps network!!!
To make it more dramatic I’d say that the test is made against our real server which is not only holding a copy of the repository (like GIT was doing) but all the whole development but… it won’t really make a big difference if the server were containing only one copy because fortunately size doesn’t affect update performance.
The bad news (for me) is that I then repeated the “git checkout . “ (after removing everything again) and it took only 8s!!! I don’t know exactly why (disk caches?) but if I run the “rm –rf *” just before running “git checkout .” it takes much longer than if I leave some time between the operations. Plastic always performed exactly the same ranging from 28 to 30 seconds, but not being affected by the remove operation being run immediately before.
“Ok, but what happens when Plastic server is installed in the same box as the client? Wouldn’t it be a fairer scenario and it would perform even better?” you may ask…
Well, I’m afraid if you want to get the answer you’ll have to wait till the next chapter… :-)