Directory Notifications to find changes

October 22, 2014

Pending Changes is now faster than ever because it no longer needs to traverse the whole workspace on every refresh. We have implemented a new mechanism based on Windows Directory Notifications to detect workspace changes.

It is available only on Windows but we’ll eventually implement it for Linux and Mac (based on their corresponding notification mechanisms).

What does it mean for you? Well, as soon as you install 5.0.44.608 or 5.4.15.604 (or higher), Pending Changes will be faster. You’ll clearly notice the speedup with really large workspaces (in number of files) and with slow disks. The slower the disk, the clearer the speedup will be.

How does Pending Changes work? (without directory notifications)

Whenever you click “refresh” on Pending Changes, Plastic triggers a search to find the files that have been modified in your workspace.

The diagram below describes the process in detail:

  • The process checks the “pending changes options” first: if only checkouts are requested, then there’s nothing to look for, just print the list. That’s why working with checkouts makes sense for huge workspaces (>400k files).
  • If the options to find changed files on disk are set, the directory walk will start.
  • For each directory, starting at the root of the workspace, Plastic will try to find changed files. It compares the timestamp on disk with the timestamp stored in the plastic.wktree file (inside .plastic), which holds the metadata of each file: it knows “how it was” after the last update or checkin. If neither the timestamp nor the size matches, the file was changed. If the timestamp doesn’t match but the size does, Plastic hashes the file; it is slower, but it makes sure the file is really different. Alternatively, there’s an option to force Plastic to always find changes based on file contents (ignoring the timestamp), which is definitely slower but required in some scenarios.
  • At the end of the disk walk, Plastic has a list of all the modified files on your workspace.
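The checks in the list above can be sketched in a few lines of Python. This is only an illustration, not the actual Plastic code: StoredEntry, is_changed and find_changed are hypothetical names, the use of SHA-1 is an assumption, and so is treating an unchanged timestamp as "not modified".

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class StoredEntry:
    """Hypothetical record of a file as it was after the last update or checkin."""
    mtime: float
    size: int
    sha1: str


def file_hash(path):
    """Hash the file contents; this is the slow part of the check."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def is_changed(path, stored, always_hash=False):
    info = os.stat(path)
    if not always_hash:
        if info.st_mtime == stored.mtime:
            return False  # assumption: same timestamp means unchanged
        if info.st_size != stored.size:
            return True   # timestamp and size both differ: changed
    # Timestamp differs but size matches (or content check forced): compare hashes.
    return file_hash(path) != stored.sha1


def find_changed(root, stored_entries, always_hash=False):
    """Walk the tree and return the relative paths of changed, already known files."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            stored = stored_entries.get(rel)
            # Files without a stored entry (new files) are ignored in this sketch.
            if stored is not None and is_changed(path, stored, always_hash):
                changed.append(rel)
    return sorted(changed)
```

Note that a file that was only "touched" (new timestamp, same size and contents) is hashed once and then reported as unchanged, which is why the hash step is worth its cost.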

The diagram doesn’t show the last step (if the option is set): find “moved and renamed files” by matching potential added and deleted files.
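How the matching is done is not described here, so the following is just one possible sketch: it pairs a deleted path with an added path when both have the same content hash. The pair_moves function and the path-to-hash dictionaries are hypothetical, and a real implementation could also use similarity instead of exact matches.

```python
def pair_moves(deleted, added):
    """Pair deleted and added files that share a content hash.

    Both arguments map a path to a content hash (an assumption of this sketch).
    Returns (moves, still_deleted, still_added).
    """
    added_by_hash = {}
    for path in sorted(added):
        added_by_hash.setdefault(added[path], []).append(path)

    moves = []
    still_deleted = []
    for old_path in sorted(deleted):
        candidates = added_by_hash.get(deleted[old_path])
        if candidates:
            # Same contents showed up somewhere else: treat it as a move or rename.
            moves.append((old_path, candidates.pop(0)))
        else:
            still_deleted.append(old_path)

    still_added = sorted(p for paths in added_by_hash.values() for p in paths)
    return moves, still_deleted, still_added
```

For example, a deleted src/a.c and an added lib/a.c with identical contents would be reported as a single move instead of one deletion plus one addition.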

What I want to explain with the diagram is that there is at least one IO operation per directory. If you keep pressing “refresh” on the Pending Changes view, chances are the list will be filled quickly: after the first traversal the workspace will be loaded in the file system cache, so the next reads will be blazing fast. But if your disk is not very fast, or your computer is performing a lot of IO or using a lot of RAM, your workspace probably won’t be entirely loaded in the file system cache, and then walking it will take longer.

You have probably noticed this when, after some coding, you go back to Plastic, click refresh, and it takes longer than usual. This is exactly what we wanted to improve with this feature.

How does Pending Changes work with Directory Notifications?

It is rather simple: we use Windows directory notifications to listen to events on the workspace directory. Each time a file is written, deleted, added, moved or renamed inside the workspace, Plastic gets a notification.

So, while we perform an initial directory traversal the first time the Pending Changes view loads, no other full directory walk will be needed later, greatly speeding up the operation.

What we do is the following: after the first traversal we keep a tree with the metadata of what is on disk, and we invalidate parts of it (on a directory basis) each time a change happens inside it. This way Pending Changes only has to reload parts of the tree instead of walking the workspace entirely. It saves precious time while still being a robust solution.
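The idea of keeping a tree and invalidating it on a directory basis can be illustrated with a small Python sketch. It is not the Plastic implementation: WorkspaceCache and its methods are hypothetical names, the notification listener itself is left out, and on_notification stands for whatever callback the listener would invoke.

```python
import os


class WorkspaceCache:
    """Keeps per-directory metadata and reloads only invalidated directories."""

    def __init__(self, root):
        self.root = root
        self.entries = {}     # directory path -> {file name: (mtime, size)}
        self.dirty = set()    # directories invalidated since the last refresh
        self.reloads = 0      # number of directory reads, kept for illustration
        self._load_dir(root)  # initial traversal: recurses into every directory

    def _load_dir(self, dirpath):
        listing, subdirs = {}, []
        with os.scandir(dirpath) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file():
                    info = entry.stat()
                    listing[entry.name] = (info.st_mtime, info.st_size)
        self.entries[dirpath] = listing
        self.reloads += 1
        for sub in subdirs:
            if sub not in self.entries:  # directory not seen before: load it too
                self._load_dir(sub)
        # Forget cached subdirectories that no longer exist on disk.
        for known in list(self.entries):
            if os.path.dirname(known) == dirpath and known not in subdirs:
                self._forget(known)

    def _forget(self, dirpath):
        prefix = dirpath + os.sep
        for known in list(self.entries):
            if known == dirpath or known.startswith(prefix):
                del self.entries[known]

    def on_notification(self, changed_path):
        """Hypothetical callback: invalidate the directory that holds the path."""
        self.dirty.add(os.path.dirname(changed_path))

    def refresh(self):
        """Reload only the invalidated directories instead of walking everything."""
        for dirpath in sorted(self.dirty):
            if os.path.isdir(dirpath):
                self._load_dir(dirpath)
        self.dirty.clear()
        return self.entries
```

With a workspace of three directories, a change to a single file costs one directory read on the next refresh instead of three, and the saving grows with the size of the workspace.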

One of the issues with Directory Notifications on Windows is that they can’t really notify file or directory moves, so you have to match pairs of added/deleted items. Instead of trying to pair the notifications, we just invalidate parts of the tree and let the regular Pending Changes code do the rest.

So, there’s still room for improvement, but our initial tests showed that the extra complexity of more precise event tracking didn’t pay off compared to just invalidating parts of the tree.

Availability

Chances are you are already using it :-)

If you’re using 5.0.44.608 or higher or 5.4.15.604 or higher, you’re already enjoying the Directory Notifications powered Pending Changes view.



Branch differences

September 22, 2014

We’ve implemented a batch of improvements in the branch/cset/label diff window over the last few months. Some of the improvements are pretty recent and some others have been around for months already. I’ll be walking through them and explaining how they help when running diffs on a daily basis.

Improved diff groups

Availability: 5.0.44.603 (Sep 12th 2014) and 5.4.15.605 (Sep 19th 2014).

The first feature is the ability to group together files and directories that differ only in file system permissions (a typical scenario when some files get the executable flag on Linux), so you can better focus on what was actually modified.

The goal here is to let you focus on the real changes when diffing code for a review or when figuring out why a bug occurred, by grouping away potentially uninteresting changes.

The second feature is a slight modification since we released Item Merge Tracking last year. Now the files that were in changed/changed merge conflicts will show up in their own category:

And this is how the same info was grouped before:

As you can see, we now highlight the “changed/changed” files and put them at the very beginning of the list so you can focus first on the diffs of files which were modified by both contributors during a merge, and hence are potentially worth reviewing.

Analyze diffs

Availability: 5.0.44.536 (March 3rd 2014) and 5.4.7.538 (March 10th 2014).

There’s a story behind this feature: sometimes you find yourself reviewing a task where the developer modified a huge number of files, like 200 or so.

You launch the diff and then you see the counter with a big number on it: 100, 200, 250 files… whatever. Your feeling will be: “oh god! Let’s go and grab some coffee first”. Which, in short, means that huge reviews are extremely unproductive because you get into a “this will take a loooooong time” mood.

And sometimes that’s correct and you truly have to spend a long time.

But chances are that many of the files changed by your colleague only contain trivial changes, like a method rename that affects a bunch of files, where each of these files only has a trivial modification.

How can you figure out beforehand? Well, that’s why we’ve introduced “analyze changes”.

You click on the new “analyze differences” button and then Plastic will use cloc to calculate the lines of code being modified. You can see it running in my review with 284 files below:

And here goes the result once the changes were classified:

As you can see, there’s a new SLOC column where you see a summary of the lines of code being added, changed and deleted (since it uses SLOC and not LOC, there will be cases where changes only affected comments or blank lines and hence they’ll show up as zero changes!).
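To make the difference between SLOC and LOC concrete, here is a rough Python sketch. It is not how cloc works internally: source_lines and sloc_diff are hypothetical helpers, only single-line comments with a given prefix are recognized, and lines are compared after stripping whitespace, so pure indentation changes also count as zero.

```python
import difflib


def source_lines(text, comment_prefix="//"):
    """Keep only lines that are neither blank nor single-line comments."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            lines.append(stripped)
    return lines


def sloc_diff(old_text, new_text, comment_prefix="//"):
    """Count added, changed and deleted source lines between two versions."""
    old = source_lines(old_text, comment_prefix)
    new = source_lines(new_text, comment_prefix)
    added = changed = deleted = 0
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Lines replaced one-to-one count as changed; the rest as added/deleted.
            common = min(i2 - i1, j2 - j1)
            changed += common
            deleted += (i2 - i1) - common
            added += (j2 - j1) - common
        elif tag == "delete":
            deleted += i2 - i1
        elif tag == "insert":
            added += j2 - j1
    return {"added": added, "changed": changed, "deleted": deleted}
```

A file where only a comment was reworded and some blank lines were added reports zero everywhere, while a namespace rename shows up as a single changed line.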

You can sort the column and then check how many files have really “big changes” and it will have a positive effect on your mood and willingness to go through the entire list, which at the end of the day means increased productivity :-)

In my screenshot you can see how the selected file just contains a namespace change… which is not a big deal to review and in fact in my example this refactor affected more than 180 files out of 280, not bad.

This is the first step towards true “semantic multi-file diff” :-)

Find in files

Available since: 5.0.44.574 (Jun 6th 2014) and 5.4.12.575 (July 1st 2014).

Sometimes you’re diffing a branch and then you would like to find something (a method call, a given text or comment) inside the files being diffed.

That’s why we added “find in files” to the diff window: click on the new “find in files” button or “CTRL+SHIFT+F” and the following panel will show up:

You can enter the pattern to find, then decide whether you want to search in both sides or only one of them, select the extension, and choose whether to restrict the search to the filtered files in the diff window.

And the results are displayed as follows:

With the occurrences of the left and right sides separated to ease the navigation.

In my case I was looking for the word “comment”, and it is clearly more common before the change, which means many occurrences of the word were removed by the change.

Our plan is to implement another feature on top: “search only in diffs” to restrict the search to the text blocks which were really modified. This is useful, for instance, to see if a certain method call has been really used in the new code or not.

Annotate each contributor

Available since: 5.0.44.577 (Jun 18th 2014) and 5.4.13.578 (Jun 26th 2014).

We added the option to launch the annotate from within the diff view so you can instantly see the detailed information about each line of each of the two sides of a diff.

I find it especially useful when going through complex merges: look at the screenshot above. Do you see the “Yes” at the end of the annotate? It means this line was modified during a merge. In other words, neither of the two merge contributors contains the line as it is; it was actually edited during the merge (at least prior to the checkin). This information is very useful to figure out what was really changed manually and what simply comes directly from one of the merge contributors.

As it happens with some of the other features explained in this blog post, this is just a first step. We’re working on more improvements to item merge tracking to be able to display on files in the “changed-changed” group exactly which lines come from merges or were modified on the branch.

Info about binary files

Available since: 5.0.44.592 (Aug 6th 2014) and 5.4.14.593 (Aug 8th 2014).

Sometimes you’re diffing a branch and there are binary files that can’t be diffed. So far we were just displaying a message saying “diff not available” (although you can press CTRL+D to launch the external diff, for images for instance). Now we display useful metadata to understand the change:

The date and size and author are especially useful to understand what is going on with the file.

For images our plan is to embed the Image Diff here (check the gallery to find out more about the tool).



Update progress

September 17, 2014

The update operation is responsible for downloading files and directories from the repository into your workspace. It can be a time-consuming action if there’s a high number of files to be downloaded (or updated) or if the overall size is big. The equivalent of “update” in Plastic is “checkout” in SVN and Git jargon.

We added a new “update progress” so that when there’s a lot to be downloaded you get more precise feedback:

The progress is updated in 4 MB blocks, which is the transfer unit we use. A block can be a chunk of a big file or a group of small files adding up to 4 MB.
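As a hedged illustration of block-based progress, the Python sketch below splits a list of files into fixed-size blocks and reports one progress step per block. The names plan_blocks and progress_steps are hypothetical, the block size is assumed to be 4 * 1024 * 1024 bytes, and the packing strategy (the tail of a big file may share a block with small files) is an assumption rather than a description of what Plastic does.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # assumed size of the transfer unit, in bytes


def plan_blocks(files, block_size=BLOCK_SIZE):
    """Split (name, size) pairs into blocks of at most block_size bytes.

    Each block is a list of (name, offset, length) slices.
    Empty files carry no data and are skipped in this sketch.
    """
    blocks = []
    current, room = [], block_size
    for name, size in files:
        offset = 0
        while offset < size:
            take = min(room, size - offset)
            current.append((name, offset, take))
            offset += take
            room -= take
            if room == 0:  # block is full: close it and start a new one
                blocks.append(current)
                current, room = [], block_size
    if current:
        blocks.append(current)
    return blocks


def progress_steps(files, block_size=BLOCK_SIZE):
    """Percentage shown after each transferred block."""
    total = len(plan_blocks(files, block_size))
    return [round(100 * (done + 1) / total) for done in range(total)]
```

Because progress moves once per block, a single 1 GB file and a thousand 1 MB files both produce frequent, evenly spaced updates instead of one jump per file.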

Availability

This feature is only available in version 5.4 (5.4.13.578 - Jun 26th 2014 and higher). 5.0 won’t get this one merged since the code is built on top of some other changes and improvements developed for 5.4 only.



Improved checkin progress

September 15, 2014

We’ve improved the way in which the checkin progress is handled in the GUI so that it shows more details when the data is transferred through a slow network.

The default checkin scale uses megabytes, but if the network is too slow (as sometimes happens when checking in data through a VPN) it is equivalent to not having progress at all.

What we’ve done is add a secondary progress bar that shows up only when the transfer is too slow. The secondary bar shows the progress of each 4 MB block being transferred. Remember that Plastic splits checkins into 4 MB chunks. The chunks can be parts of a large file or groups of small files.

The following screencast shows how it works on a network with changing speed (we use WANem to modify the network bandwidth and hence the overall speed). The example is not realistic but it is helpful to explain the new feature:

Availability

The improved checkin progress has been available for a while, since:

  • 5.0.44.577 - Jun 18th 2014
  • 5.4.11.568 - May 19th 2014


SyncView revisited – improved performance

September 13, 2014

We’ve improved the performance and usability of the SyncView: it is now able to exclude branches making the sync process much faster.

As you know, the SyncView is the view in the GUI that you can use to preview what needs to be replicated between different servers and then run the replicas.

Excluded branches have been added to improve sync performance

We have added “excluded branches”: branches that you don’t want to sync between your repo and the remote one.

In my case I run a Plastic server on my laptop (using a SQLite backend handling about 18Gb) but I don’t have full replicas of the central repositories. I just pull the branches I need (to develop, code review or manually test tasks before getting them released). It means there are a few thousand branches on the remote server that I’ll never pull. Some of them are already years old.

Since the SyncView calculates all the changesets that need to be pushed or pulled in order to let you preview them, it started to get slow with thousands of branches.

That’s why we added the “excluded branches” feature.

You can select the branches you won’t be syncing and just “exclude them”. The result is that the sync view will be much, much faster, saving precious time on each loop.

We also added an option to show the “excluded branches” so that you can include them again in the calculation in case you need them later on (expanding the excluded branches is much faster in 5.4 than in 5.0 since we implemented a new server API call in 5.4 to speed up the calculation, while the 5.0 API is frozen and can’t take advantage of it).

New behavior in the “refresh” button

Previously the refresh button in the “sync view details” lower panel just affected the expanded repositories. While it wasn’t an issue when you worked with small lists, it wasn’t effective when dealing with long lists like this:

So from now on the “refresh” will trigger the calculation of the entire list of syncs instead of just the expanded ones, while you can still refresh them individually using the context menu.

New “push visible” and “pull visible” buttons

Especially when you’re using Xlinks it is very useful to use the filter to push (or pull) all the branches with a given name, in different repositories.

We’ve added two new buttons: “push visible” and “pull visible” to launch the pull or push of all the branches selected by the filter.

Underlying format change

All the SyncView configuration is stored in a file named syncviews.conf. We’ve modified the file format to make it human readable and better structured than it was before. It will be automatically upgraded by the new 5.0 and 5.4 releases so no user action is required.

Availability

This feature has been available since:

  • 5.0.44.592 (August 6th 2014) and later.
  • 5.4.14.593 (Aug 8th 2014) and later.


How to handle big files with Plastic

September 9, 2014

So it's basically done like any other file!



Plastic is not affected by the file size like other systems out there.

Keep coding while we take care of your files.





600 releases of Plastic and counting

September 4, 2014

Today we've released Plastic 5.0.14.600. Yes, 600. Six hundred official releases since our first one back in 2005, more than one year before Plastic was launched.

Plastic has changed *a lot* during these years in all imaginable aspects: from performance to GUI, features, security, ease of use... almost everything version 1.0 included has been redesigned and incredibly improved a few times since.

I still remember reading Spolsky's great post saying 'Great software takes 10 years' and thinking "hey, can't be true! 10 years! When did he go crazy?", and I have to admit that as we get closer to 10 years in business I think I understand better than ever what it takes to develop good software.

Plastic *had to* change and evolve because our challenges turned out to be much bigger: our first paying customer was located in Northern Spain and had fewer than 10 developers. Things are bigger and more demanding now, like an Asian company with more than 1000 developers working concurrently on a huge codebase and taking advantage of everything from distributed to high scalability and Xlinks.

We used to perform checkins of the Quake source code as a benchmark. The first checkin ever took 11 hours to complete! 11 hours (we threw away that code!). Nowadays we don't use this code for testing anymore since it is too small (a checkin takes about 4 secs or so, and not even a 300Gb checkin takes that long).

Some of the core ideas in the first Plastic version persist, though. We wanted to turn "branch per task" into a mainstream practice to help developers on a daily basis, telling a story checkin after checkin and being able to use the version control not just as a delivery mechanism (that used to be more feared than loved) but as a real productivity tool you can't live without, even if you're working alone. You checkin 'almost' as often as you hit CTRL-S (unless you're in vim :P). And this is something that is still there, and it is easier to explain today than it was 600 releases ago.

Looking back...

If you go back to the early posts on this blog you'll be able to go through the evolution of the initial Plastic releases, even before the official launch of 1.0.

From some really early GUI screenshots to my beloved and now defunct 3D version tree (which I expect will return soon and I hope will be much appreciated by game developers moving to Plastic).

Visit this album of 'historic' screenshots at our Facebook.

Back in 2005 we expected to fight the SVNs of the world and when we started visiting companies and explaining why using branches was good, we faced tons of skepticism.

Things got better after the Git revolution which evangelized the whole community teaching how things could be done in a different way.

Branching and merging were not evil anymore but a powerful tool to take advantage of.

As a result, nowadays instead of having to change developers' minds we basically get teams knocking at our door asking for good branching and merging, distributed AND centralized combined, visualization and good GUIs. If you put these three blocks together only Plastic stays as a valid choice, which is why, despite the distributed version control revolution led by open source Git, there are still companies willing to pay for the software we develop.

We don't fight SVNs anymore (although we keep replacing them, but not really fighting :P) but we face big challenges since we focus only on teams that need really advanced features not available anywhere else. A real challenge which fortunately means we have to design new features and work on so many different areas that we can never say our job got boring after all these years.

Congratulations to the entire Plastic SCM team!!

Thanks to all the customers who make this adventure possible!


