Directory Notifications to find changes

Wednesday, October 22, 2014, Pablo Santos

Pending Changes is now much faster because it no longer needs to traverse the workspace on every refresh. We have implemented a new mechanism based on Windows Directory Notifications to detect workspace changes.

It is available only on Windows but we’ll eventually implement it for Linux and Mac (based on their corresponding notification mechanisms).

What does it mean for you? Well, as soon as you install 5.0.44.608 or 5.4.15.604 (or higher), Pending Changes will be faster. You’ll clearly notice the speedup with really large workspaces (in number of files) and with slow disks. The slower the disk is, the clearer the speedup will be.

How does Pending Changes work? (without directory notifications)

Whenever you click “refresh” on Pending Changes, Plastic triggers a search to find the files that have been modified in your workspace.

The diagram below describes the process in detail:

  • The process checks the “pending changes options” first: if only checkouts are requested, then there’s nothing to look for, just print the list. That’s why working with checkouts makes sense for huge workspaces (>400k files).
  • If the options to find changed files on disk are set, the directory walk will start.
  • For each directory, starting at the root of the workspace, Plastic will try to find changed files. It compares the timestamp on disk with the timestamp stored in the wktree file: the plastic.wktree file (inside .plastic) stores the metadata of each file, so it knows “how it was” after the last update or checkin. If neither the timestamp nor the size matches, the file was changed. If the timestamp doesn’t match but the size does, Plastic hashes the file; it is slower, but it makes sure the file is really different. Alternatively, there’s an option to force Plastic to always find changes based on file contents (ignoring the timestamp), which is definitely slower but required in some scenarios.
  • At the end of the disk walk, Plastic has a list of all the modified files on your workspace.
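The walk described in the list above can be sketched roughly as follows. This is only an illustration, not Plastic’s actual code: StoredEntry, hash_file, is_changed and find_changed are hypothetical names, a plain dictionary stands in for plastic.wktree, SHA-1 is an arbitrary choice of hash, and treating a matching timestamp as “unchanged” is an assumption implied by the description rather than stated in it.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class StoredEntry:
    """Hypothetical stand-in for the per-file metadata kept in plastic.wktree."""
    mtime_ns: int
    size: int
    sha1: str


def hash_file(path):
    """Hash the contents in chunks so large binaries are not loaded whole."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_changed(path, stored, always_hash=False):
    """Apply the timestamp/size/hash rules from the list above to one file."""
    st = os.stat(path)
    if not always_hash:
        if st.st_mtime_ns == stored.mtime_ns:
            return False  # assumption: same timestamp means unchanged
        if st.st_size != stored.size:
            return True   # timestamp and size both differ: changed
    # Timestamp differs but size matches (or content check is forced):
    # only the contents can tell.
    return hash_file(path) != stored.sha1


def find_changed(root, stored_tree, always_hash=False):
    """Walk the workspace and return relative paths of changed controlled files."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the metadata directory itself.
        dirnames[:] = [d for d in dirnames if d != ".plastic"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            stored = stored_tree.get(rel)
            # Files unknown to the stored tree (private/added) are ignored here.
            if stored is not None and is_changed(path, stored, always_hash):
                changed.append(rel)
    return sorted(changed)
```

Note that the cheap checks run first and the hash is computed only when the timestamp differs but the size does not, which is why the forced content-based mode is the slowest one.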

The diagram doesn’t show the last step (if the option is set): find “moved and renamed files” by matching potential added and deleted files.
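As a rough idea of what matching added and deleted files could look like, here is a minimal sketch. The post doesn’t say how Plastic pairs the candidates, so matching on an identical content hash is purely an assumption made for illustration, and match_moves is a hypothetical helper, not part of Plastic.

```python
def match_moves(deleted, added):
    """Pair deleted and added paths that share the same content hash.

    Both arguments map a workspace path to a content hash.
    Returns (moves, still_deleted, still_added).
    """
    # Index the added files by hash so each deleted file is looked up once.
    added_by_hash = {}
    for path, content_hash in sorted(added.items()):
        added_by_hash.setdefault(content_hash, []).append(path)

    moves = []
    still_deleted = []
    for path, content_hash in sorted(deleted.items()):
        candidates = added_by_hash.get(content_hash)
        if candidates:
            # Consume the candidate so it is not matched twice.
            moves.append((path, candidates.pop(0)))
        else:
            still_deleted.append(path)

    still_added = sorted(p for paths in added_by_hash.values() for p in paths)
    return moves, still_deleted, still_added


moves, gone, new = match_moves(
    deleted={"src/a.c": "h1", "src/b.c": "h2"},
    added={"lib/a.c": "h1", "new.c": "h3"},
)
```

An exact-hash match only finds files that were moved without being edited; detecting files that were moved and then modified would need some kind of similarity measure, which is left out here.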

What I want to explain with the diagram is that there is at least one IO operation per directory. If you keep pressing “refresh” on the Pending Changes view, chances are the list will be filled quickly: after the first traversal the workspace will be loaded in the file system cache, so the next reads will be blazing fast. But if your disk is not very fast, or your computer is performing a lot of IO or using a lot of RAM, chances are that your workspace won’t be entirely loaded in the file system cache, and then walking it will take longer.

You have probably noticed it when, after some coding, you go back to Plastic, click refresh, and it takes longer than usual. This is exactly what we wanted to improve with this feature.

How does Pending Changes work with Directory Notifications?

It is rather simple: we use Windows directory notifications to listen to events on the workspace directory. Each time a file is written, deleted, added, moved or renamed inside the workspace, Plastic gets a notification.

So, while we perform an initial directory traversal the first time the Pending Changes view loads, no other full directory walk will be needed later, greatly speeding up the operation.

What we do is the following: after the first traversal we keep a tree with the metadata of what is on disk, and we invalidate parts of it (on a directory basis) each time a change happens inside it. This way Pending Changes only has to reload parts of the tree instead of walking the workspace entirely. It saves precious time while still being a robust solution.
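To make the idea concrete, here is a minimal sketch of a cached tree with per-directory invalidation. It is not Plastic’s implementation: WorkspaceCache, on_notification and refresh are hypothetical names, the event source (for example the Windows directory change notifications) is left out and replaced by explicit calls, and the reads counter exists only to show how many directories are actually touched.

```python
import os


class WorkspaceCache:
    """Caches per-directory file metadata and reloads only invalidated parts."""

    def __init__(self, root):
        self.root = os.path.abspath(root)
        self.listings = {}  # directory path -> {file name: (mtime_ns, size)}
        self.dirty = set()  # paths invalidated since the last refresh
        self.reads = 0      # directories read from disk, for illustration
        self._load(self.root, recursive=True)  # the initial full traversal

    def _load(self, dirpath, recursive):
        """Read one directory; descend into children if asked or not cached yet."""
        self.reads += 1
        files, subdirs = {}, []
        with os.scandir(dirpath) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                else:
                    st = entry.stat(follow_symlinks=False)
                    files[entry.name] = (st.st_mtime_ns, st.st_size)
        self.listings[dirpath] = files
        for sub in subdirs:
            if recursive or sub not in self.listings:
                self._load(sub, recursive)

    def on_notification(self, changed_path):
        """Invalidate the changed path and its parent; no attempt to pair moves."""
        path = os.path.abspath(changed_path)
        self.dirty.add(path)
        self.dirty.add(os.path.dirname(path))

    def refresh(self):
        """Reload only the invalidated directories instead of walking everything."""
        for path in sorted(self.dirty):
            if os.path.isdir(path):
                self._load(path, recursive=False)
            else:
                # A file, or a directory that no longer exists: drop any
                # cached listings at or below this path.
                prefix = path + os.sep
                stale = [d for d in self.listings
                         if d == path or d.startswith(prefix)]
                for d in stale:
                    del self.listings[d]
        self.dirty.clear()
        return self.listings
```

With this shape, a change to a single file costs one directory read on the next refresh, regardless of how many directories the workspace contains. The trade-off is that a notification only says “something changed here”; working out what changed is still left to the regular comparison code.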

One of the issues with Directory Notifications on Windows is that they can’t really notify file or directory moves, so you have to match pairs of added/deleted items. Instead of trying to pair the notifications, we just invalidate parts of the tree and let the regular Pending Changes code do the rest.

So, there’s still room for improvement, but our initial tests proved that the extra complexity of more precise event tracking didn’t pay off compared to just invalidating parts of the tree.

Availability

Chances are you are already using it :-)

If you’re using 5.0.44.608 or higher, or 5.4.15.604 or higher, you’re already enjoying the Directory Notifications-powered Pending Changes view.

Pablo Santos
I'm the CTO and Founder at Códice.
I've been leading Plastic SCM since 2005. My passion is helping teams work better through version control.
I have had the opportunity to see teams from many different industries at work while helping them improve their version control practices.
I really enjoy teaching (I've been a University professor for 6+ years) and sharing my experience in talks and articles.
And I love simple code. You can reach me at @psluaces.