Partial replica

Thursday, June 06, 2013 Pablo Santos

I’m working on a big repo and want to keep working on my laptop, maybe while traveling, but I’d like to avoid a full clone of the entire repo because of its size or the time it takes.

I’d just like to have a “working copy” that still lets me check in, branch from it, and eventually replicate the parts I didn’t get when I set up the repo.

This is the reason why we implemented “partial replica” back in 2010 when we jumped from Plastic 3 to 4.

(Note: we changed from “item level” merge-tracking to “changeset-level” merge-tracking which greatly simplified some aspects and opened new doors).

Replica goes branch by branch

In Plastic, when you replicate a branch you replicate just the branch you selected, so every replica is “partial” unless you select all the branches in the original repo.

Look at the following example, where I’ll be replicating from the “London” server to the “Stockholm” one. The original server has 3 branches, but in Stockholm today I’m only interested in “branch2”. What can I do?

You can simply replicate “branch2” to the newly created repo at “Stockholm” and you’ll get a working repo with only 2 changesets but with all the files required to download the source tree. The branch is perfectly functional and you can do more checkins or even branch from it.

And you can push your changes back and they’ll go to the right places on the “London” repo.
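To make the idea concrete, here is a minimal sketch of branch-by-branch replication. It is a toy model, not Plastic's actual data structures: the `Changeset` class, the `replicate_branch` function, the branch names other than “branch2” and the file revisions are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Changeset:
    number: int
    branch: str
    tree: dict  # path -> revision id loaded by this changeset


# Hypothetical "London" repo with three branches.
LONDON = [
    Changeset(0, "main", {"/a.c": "a#1"}),
    Changeset(1, "main", {"/a.c": "a#1", "/b.c": "b#1"}),
    Changeset(2, "branch3", {"/a.c": "a#1", "/b.c": "b#2"}),
    Changeset(3, "branch2", {"/a.c": "a#2", "/b.c": "b#1"}),
    Changeset(4, "branch2", {"/a.c": "a#2", "/b.c": "b#1", "/c.c": "c#1"}),
]


def replicate_branch(source, branch):
    """Copy only the changesets of `branch`, plus every revision their trees load."""
    csets = [c for c in source if c.branch == branch]
    revisions = set()
    for cset in csets:
        revisions.update(cset.tree.values())
    return csets, revisions


csets, revisions = replicate_branch(LONDON, "branch2")
print([c.number for c in csets])  # changesets that travel to "Stockholm"
print(sorted(revisions))          # revisions needed to load their trees
```

In this model only the two changesets of “branch2” travel, yet revision `b#1`, which was created on “main”, comes along because the branch's tree loads it. That is why the partial replica is still a complete, usable source tree.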

A trick to replicate just from a single changeset

Suppose you need to replicate just from “changeset 6” but you don’t need to pull all the previous changesets on the “main” branch.

If that is the case there is a small trick you can use today: create a branch from “changeset 6” and replicate the new empty branch. Look at the figure below:

The “branch4” will be perfectly functional on the “pablolaptop” machine (it will replicate the entire tree loaded by changeset 6 @ “London”) and you can checkin new changes, branch from it and so on.
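Why does an empty branch work? A rough way to picture it, under the assumption that a branch always knows the tree of the changeset it starts from, is the toy model below. The `Branch` class, `head_tree`, `checkin` and the file revisions are hypothetical names used only for this sketch, not Plastic's API.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    name: str
    base_tree: dict  # tree loaded by the changeset the branch starts from
    changesets: list = field(default_factory=list)  # trees checked in on the branch


def head_tree(branch):
    """Tree a replica of this branch must be able to load."""
    if branch.changesets:
        return branch.changesets[-1]
    return branch.base_tree  # empty branch: fall back to its starting point


def checkin(branch, changes):
    """Create a new changeset on the branch from the current head tree."""
    new_tree = {**head_tree(branch), **changes}
    branch.changesets.append(new_tree)
    return new_tree


# Hypothetical tree loaded by changeset 6 at "London".
CSET_6_TREE = {"/a.c": "a#3", "/b.c": "b#2"}

branch4 = Branch("branch4", base_tree=CSET_6_TREE)
replicated = head_tree(branch4)          # what "pablolaptop" receives
after = checkin(branch4, {"/c.c": "c#1"})  # work continues on the laptop
```

Even with zero changesets of its own, the branch resolves to the full tree of changeset 6, so there is something complete to replicate and to check in on top of.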

Note: you have to use the “empty branch” trick because we haven’t implemented “replicate from cset 6” so far… Yep, we have tons of things to do and we haven’t scheduled the task! :-D

Beware of the merge history

If you replicate only part of a repo’s history, you have to be careful when running merges: you do not have the entire merge history, so the merge can find a different common ancestor and produce a different result. If you need to be sure you can merge safely, replicate all the branches.

Look at the scenario below: if you replicate “main” and “branch2” only, then the merge between cset “10” and “11” won’t detect “7” as ancestor but “3”, turning the merge into something more complex or even wrong.
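The effect can be reproduced with a small common-ancestor calculation. The graph below is an assumption chosen to match the scenario described (the branch name “branch3” and the changesets other than 3, 7, 10 and 11 are invented), and the sketch assumes that links to changesets missing from the replica are simply not available. It is not how Plastic computes ancestors internally.

```python
# cset -> parents (first parent is the branch history, extra ones are merge links)
PARENTS = {3: [], 4: [3], 6: [3], 7: [6], 8: [3], 10: [8, 7], 11: [4, 7]}
BRANCH_OF = {3: "main", 8: "main", 10: "main",
             4: "branch2", 11: "branch2",
             6: "branch3", 7: "branch3"}


def ancestors(parents, cset):
    """Return cset plus everything reachable through parent and merge links."""
    seen, stack = set(), [cset]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c in ancestors(parents, other) - {other} for other in common)
    )


def partial_replica(parents, branch_of, branches):
    """Keep only the changesets of the given branches and the links between them."""
    kept = {c for c, name in branch_of.items() if name in branches}
    return {c: [p for p in parents[c] if p in kept] for c in kept}


full = merge_bases(PARENTS, 10, 11)
partial = merge_bases(partial_replica(PARENTS, BRANCH_OF, {"main", "branch2"}), 10, 11)
print(full, partial)
```

With the whole graph the base is 7. With only “main” and “branch2” the merge links through 7 are gone and the base falls back to 3, so the merge has to reconcile changes that were in fact already merged.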

Of course, it is perfectly safe to run merges when you know it is a simple branch hierarchy, like the following:

And the same holds true for any branch where the entire merge hierarchy is present on the repo.

We’re considering a change here because, while this is perfectly fine for very advanced users, it is sometimes confusing for the rest of us :P. We’re considering two options. Either we ask the original repo for the missing merge tree information (when available), which sounds pretty cool in certain cases (especially for collocated teams with developers working in DVCS mode but with full access to the central repo). Or we always replicate the entire changeset hierarchy (aka merge tree info) but not the associated data, so it would still achieve the goal of partial replica while also detecting when intermediate csets are missing for a merge.
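As a rough illustration of the second option, the sketch below keeps the whole changeset graph locally but marks which changesets actually have their data. The graph, the `HAS_DATA` set and the `check_merge` function are hypothetical; this is only a way to picture the idea, not a description of an implemented feature.

```python
# Full changeset hierarchy, always replicated (cset -> parents and merge links).
GRAPH = {3: [], 4: [3], 6: [3], 7: [6], 8: [3], 10: [8, 7], 11: [4, 7]}
# Changesets whose data was actually replicated (assumed: "main" and "branch2").
HAS_DATA = {3, 4, 8, 10, 11}


def ancestors(parents, cset):
    """Return cset plus everything reachable through parent and merge links."""
    seen, stack = set(), [cset]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c in ancestors(parents, other) - {other} for other in common)
    )


def check_merge(parents, has_data, a, b):
    """Return the merge bases and the ones whose data is not available locally."""
    bases = merge_bases(parents, a, b)
    missing = [c for c in bases if c not in has_data]
    return bases, missing


bases, missing = check_merge(GRAPH, HAS_DATA, 10, 11)
if missing:
    print("merge needs data for csets", missing, "- replicate them first")
```

Because the hierarchy is complete, the right ancestor is still found, and the tool can tell that its data is missing instead of silently merging from an older one.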

Conclusion

Partial replica is a very strong feature for both distributed developers and servers in different locations, since it can save tons of space and replication time. But right now you need to understand how it works to make sure you’re setting up a correct scenario.

Pablo Santos
I'm the CTO and Founder at Códice.
I've been leading Plastic SCM since 2005. My passion is helping teams work better through version control.
I had the opportunity to see teams from many different industries at work while I helped them improve their version control practices.
I really enjoy teaching (I've been a University professor for 6+ years) and sharing my experience in talks and articles.
And I love simple code. You can reach me at @psluaces.
