The Plastic SCM blog

RepliKate



After accidentally cloning a sexy repository, two college grad students decide to educate the newly created script and turn it into the perfect Plastic SCM replication command. (Source: http://en.wikipedia.org/wiki/Repli-Kate)

CmdRunner background

Using the CmdRunner library we have created a small application for replicating a complete PlasticSCM repository. You can find the RepliKate code here.

What RepliKate does

You can use repliKate for two purposes:
1) Replicate a complete repository from one server to another.
2) Replicate all the changes made inside a repository during a period of time.

Replicate a repository from zero

You can use the following command to migrate all the source repository content into a new one. All your branches, changesets, permissions, labels and so on will be replicated to the destination repository.

"replikate srcrepos@srcserver dstrepos@dstserver"

Remember that if you are a GUI guy you can always use the "Sync view" to achieve the same result.

Keeping your repositories synced

Here is where RepliKate can help you the most. Let's work with an example.

I have my central code repository on my "OmicronPersei8" server and I want to periodically push my new changes to my "Amphibios9" server, in order to keep the "Amphibios9" server as a mirror/backup server.

First I need to replicate all the "OmicronPersei8" code repository to the "Amphibios9" server using repliKate, just like we did up there.

Now that I have the repository synced, I need to replicate every day all the new changes made in "OmicronPersei8" into the "Amphibios9" repository. It's easy using the "Sync view", but it's also too manual. With repliKate you can use a Windows scheduled task (or Linux cron) to issue the following command every night:

"replikate srcrepos@srcserver dstrepos@dstserver --syncdate=yesterday"

As you can imagine, this command will replicate all the branches with changesets created since yesterday. Running this command every day will keep my mirror server updated.

Instead of the "yesterday" keyword you can use a specific initial date, for example:

"replikate srcrepos@srcserver dstrepos@dstserver --syncdate=initialdate(Month/Day/Year)"

Run "repliKate.exe" to get the full help.
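If you would rather call repliKate from a script than type the command into the scheduler, a small wrapper can build the command line for you. This is only a sketch: it assumes the executable is on the PATH as "replikate" and that a non-zero exit code means the replication failed (the post does not say so, check the built-in help). The function names are made up for this example.

```python
import subprocess


def build_replikate_command(src, dst, syncdate=None, executable="replikate"):
    """Build the argument list for one replikate run.

    src and dst use the repository@server form shown above.
    syncdate is "yesterday" or a Month/Day/Year date; None means a full replication.
    """
    command = [executable, src, dst]
    if syncdate is not None:
        command.append(f"--syncdate={syncdate}")
    return command


def run_nightly_sync(src, dst):
    """Run the incremental replication and return True when the process exits with 0."""
    # Assumption: a non-zero exit code signals a failed replication.
    result = subprocess.run(build_replikate_command(src, dst, "yesterday"))
    return result.returncode == 0
```

A scheduled task or cron entry would then call run_nightly_sync("srcrepos@srcserver", "dstrepos@dstserver") once per night.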

RepliKate log

You can use this log4net configuration file to get the log info from the repliKate application. You need to place it in the same location as the replikate.exe file, and it will generate a replicate.txt log file.

In a future blog post I will explain how to configure it to receive email notifications when repliKate is not able to push the new changes.

Enjoy and customize it!

The definitive guide to "find"

Motivation

We designed “cm find” as an SCM query language able to solve almost any query in an SQL-like format while avoiding direct queries to the underlying database system, and hence avoiding long and difficult joins.
It has been there for years, and years, and years... but only the advanced plastikers knew about it... now is the time to open Pandora's box! :)
This post will help you learn the possibilities of the “find” functionality, which is available through the “cm find” command and also in the GUI through the “advanced” panel on most of the views.

What can you use “find” for?

“find” is great for scripting your own tools, but it is also great for customizing your UI with simple but very powerful queries.

The very basics

There are a few things to keep in mind when dealing with “find”:
  • Where are the fields? You can find all the “searchable” fields by running “cm showfindcommands”
  • How can I format the output? If you’re on the CLI, you can format your output with --format="{field} {field2}"

    Useful “finds” to start with

    Let’s go through some useful queries:
  • All the branches I created: cm find branch where owner='pablo'
  • The first “trick”: cm find branch where owner='me'. “me” will be replaced by the user invoking the query, which is quite useful as a default configuration for some “views” in the GUI. You can try it from the “advanced” panel in the branches view as: find branch where owner='me'
  • Another trick, making it more “English like”: find branches where owner='me'
  • Branches created since the beginning of 2010: find branches where date >= '2010/01/01'
  • Finding all the branches you created with changes (important: WITH changes) since the beginning of Feb 2012: find branches where owner='me' and changesets >= '2012/02/01'. The field “changesets” does the trick. It actually hides a “join” (for the SQL guys out there) with the “changeset” table.
  • Let’s try again with “all the branches I created with changes THIS month”: find branches where owner='me' and changesets >= 'this month'

    Ok, now that you’re curious about what you can type on the “date” fields, here’s the entire list of available keywords: "today", "yesterday", "thisweek", "this week", "thismonth", "this month", "thisyear", "this year", "onedayago", "one day ago", "oneweekago", "one week ago", "onemonthago", "one month ago"

    And now the ones to be used with a number, as in find branches where changesets >= '10 days ago': "daysago", "days ago", "monthsago", "months ago", "yearsago", "years ago"
    Enough for today! :)
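If you script your own tools around “find”, it helps to compose the query text in one place. The helper below is a minimal sketch that only builds the query string in the form used in this post; it does not run anything, and the function names are invented for the example. Paste the result into the “advanced” panel or prefix it with "cm" on the command line.

```python
def find_query(obj, *conditions):
    """Compose a query such as: find branches where owner='me' and date >= '2010/01/01'."""
    query = f"find {obj}"
    if conditions:
        query += " where " + " and ".join(conditions)
    return query


def days_ago(days):
    """Quote a relative date using the "days ago" keyword listed above."""
    return f"'{days} days ago'"


# Branches I created that received changes during the last 10 days.
recent = find_query("branches", "owner='me'", f"changesets >= {days_ago(10)}")
print(recent)
```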

    Setting up a Bamboo server to test task branches (a.k.a. feature branches)

    Plastic SCM integrates with Atlassian's Bamboo for Continuous Integration. We implemented the Bamboo plugin a long time ago (3.0 supported it, about a year and a half ago if I recall correctly).

    We use "branch per task" internally (more info about the task driven development here) which can blend perfectly with a slightly modified CI process (which in fact is considered by many as "the future of Continuous Integration").

    As Codice's build master, I decided to adapt the Bamboo plugin for internal use, making it ready for "branch per task". My plan (combined with product strategy) is to start with Bamboo and then loop internally through Zutubi's Pulse, TeamCity from JetBrains and later Jenkins (aka Hudson).

    How CI servers work

    The way in which a CI server works is:
  • It monitors the "main" branch (or master or trunk depending on your SCM jargon)
  • It launches a new build each time it detects a new changeset on the "main" branch

    All CI systems are normally designed to monitor a single branch, so when you try to use them with "branch per task"... well, you need some changes. Long ago we wrote a blog post about how to set it up with Cruise Control, but now it is time to bring it to the new systems :)

    What a branch-ready CI needs to do

    Basically it needs to switch branches before each build, to get ready to compile the latest "finished" branch. The dynamics are quite similar but the "branch switching" is required.

    And that's basically what we added to the Bamboo plugin.

  • Detect a branch ready to be tested
  • Switch to the branch
  • Build the code
  • Pass all tests (in our case about 3 hours of automated testing per task: unit tests, GUI tests, command line based tests)
  • Mark the branch as "tested" if everything goes fine or "failed" if something fails (from build to testing)
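The steps above can be sketched as a simple loop. This is not the plugin's code: every callable passed in (switch_to, build, run_tests, mark) is a placeholder you would implement with your own tooling, and the sketch only shows the order of the steps and how the final mark is decided.

```python
def process_ready_branches(branches, switch_to, build, run_tests, mark):
    """Build and test each ready branch in turn.

    Returns a dict mapping each branch to "tested" or "failed".
    """
    results = {}
    for branch in branches:
        switch_to(branch)
        # A failed build skips the test run; either failure marks the branch as failed.
        ok = build(branch) and run_tests(branch)
        status = "tested" if ok else "failed"
        mark(branch, status)
        results[branch] = status
    return results
```

Keeping the steps as injected callables means the same loop can be tried out with stubs before wiring it to a real CI server.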

    Benefits of branch per task

    Each branch in "branch per task" resolves a unique issue (task). Then what I do is integrate these branches in a separate branch and when all the tests have passed we merge up to our main branch.

    This pattern is very flexible and allows us to separate the work of every task and integrate each task in several branches, deciding what to integrate and what to discard at every moment.

    As an integrator I really love this advantage.

    How to set up the Bamboo plugin to work in "branch per task"

    The only thing you have to do is go to 'Configure plan', then click on the 'Source repository' tab and fill the selector field with something like this:
      repository "codice"
        path "/"
          smartbranch "BRANCH_NAME"
    

    Find how it looks like here:

    The BRANCH_NAME keyword will be replaced by the specific branch to test.
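To make the substitution concrete, here is a minimal sketch that fills the selector template for one branch. How the plugin performs the replacement internally is not shown in this post, so the plain string replace below is an assumption, and the branch name "/main/task001" is just an example.

```python
SELECTOR_TEMPLATE = '''repository "codice"
  path "/"
    smartbranch "BRANCH_NAME"
'''


def render_selector(branch_name, template=SELECTOR_TEMPLATE):
    """Return the selector text with the BRANCH_NAME keyword replaced."""
    return template.replace("BRANCH_NAME", branch_name)


print(render_selector("/main/task001"))
```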

    How Bamboo knows about the branches to be built

    Now, we need to tell Bamboo which branches to test:
  • To do this, we create an attribute in Plastic SCM called 'status'.
  • Bamboo will test the branches that have the status attribute set to 'RESOLVED'. (this is basically what we've modified on the plugin)

    Using scripting and the Bamboo CLI API you can check whether a build finished ok or not and change the attribute accordingly.

    Internally we use the following statuses:

  • RESOLVED: The branch is ready to be tested.
  • PASSED: The branch passed all the tests and it's ready to be integrated.
  • FAILED: The changes done in the branch broke some tests and cannot be integrated.
  • TESTING: The branch is being tested right now.

    The build master will only integrate PASSED branches, and every developer will fix their FAILED branches and set the attribute back to RESOLVED so that the system tests the branch again.
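The status cycle described above can be written down as a small transition table. The allowed transitions are inferred from the descriptions in this post rather than taken from the plugin, and the function names are invented for the example.

```python
# Transitions inferred from the status descriptions; PASSED is final until integration.
ALLOWED_TRANSITIONS = {
    "RESOLVED": {"TESTING"},
    "TESTING": {"PASSED", "FAILED"},
    "FAILED": {"RESOLVED"},
    "PASSED": set(),
}


def change_status(current, new):
    """Return the new status, or raise ValueError if the transition is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot go from {current} to {new}")
    return new


def branches_to_test(statuses):
    """Return the branches whose status attribute is RESOLVED, sorted by name."""
    return sorted(branch for branch, status in statuses.items() if status == "RESOLVED")
```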

    This way we get all our branches tested and the release cycle is much easier and much faster. The developers also don't have to run the long test suites themselves (well, unit tests run very fast, so there's no excuse not to run them), but the important thing is that with very little effort they get results very quickly.

    Remove repository trigger


    We have been told that removing PlasticSCM repositories is very easy. Indeed, it is.

    But, come on!! Who wants to remove their own production repositories?! Ok, ok, it can be a mistake... let's prevent it. Since the "rm" permission is too generic (rm label, rm branch, rm changeset, rm item), we are creating a new set of permissions for PlasticSCM 4. But until it's released we can use our lovely triggers!!

    First create the "rmrep" trigger; you can find the example trigger here. Create it as a "before-rmrep" trigger in order to deny all the "rmrep" operations.

    Now try to remove the repository...

    You can't!! And you will receive an emergency alert to your mail!


    Make sure you create the trigger as a "before-rmrep" trigger and that the return value of the program is not zero.
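The only hard requirement stated here is the non-zero return value, so the core of such a trigger program can be tiny. The sketch below is not the linked example trigger: it leaves out the email alert and does not read any information from Plastic SCM, it only shows the deny-by-exit-code part.

```python
import sys

# Any non-zero value makes the "before-rmrep" trigger deny the operation.
DENY_EXIT_CODE = 1


def deny_rmrep(stream=None):
    """Explain why the removal is blocked and return the exit code to use."""
    stream = sys.stderr if stream is None else stream
    print("Repository removal is disabled on this server.", file=stream)
    return DENY_EXIT_CODE


# In the real trigger script the last line would be: sys.exit(deny_rmrep())
```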

    Enjoy!






    Working on a single branch- update-merge explained

    While our recommended pattern is definitely “branch per task”, sometimes you’ll find yourself sharing a branch with another developer or even the whole team. This post will explain the two common situations and how Plastic SCM solves them.

    Scenario 1- Working on the same branch without conflicts

    Suppose you’re working on your branch, you check the branch explorer and you see your situation is like this:

    You actually have some changes pending to be committed and you just continue working on them. Your changes are:

    Meanwhile, another developer finishes his changes, which don’t collide with yours, and later decides to checkin:

    Now you go and check your branch explorer and see that now you’re “behind the head” of the branch:

    If you now try to checkin, Plastic will warn you that you might have conflicts:

    You can go and check the changes of this last cset and since you see they do not conflict with yours, you can just go to the root of your workspace and run an “update”.

    Plastic SCM will detect that your changes do not collide with the new branch head and will be able to move you to the new head, downloading the new changes. You’ll be again synced with the head and ready to checkin without merge:

    Scenario 2 – Synchronizing colliding changes

    Let’s go back to the beginning: you’re working on your branch and modified the following files:

    At the same time your colleague modified the following:

    Ok, the two first files are colliding!

    Your colleague is faster than you this time and checks in his changes first.

    So now your workspace is behind.

    You try to checkin and see again the “merge needed” dialog:

    Please note that the merge-needed dialog brings you the option to start a new branch, which is really convenient!!

    You can press “Ok” and merge, or you can try to update your workspace as you did before. Now, if you try to update the workspace, since there are colliding changes, the system will warn you:

    And the following merge dialog will show up:

    You finish your merge (hopefully you do not have tough conflicts to solve :P) and then go back to the branch explorer to check what happened:

    You see there’s a “merge in progress” pointing from the head of the branch to the changeset you’re working on (calm down directed-acyclic-graph purists... this is just a way to show the merge in progress and most likely we will modify it to render the “in progress” changeset too!!)

    Now you’re ready to checkin and once you do it you’ll get:

    So you’ve created a “sub-branch” (branch within a branch) without even noticing it! :)
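The difference between the two scenarios comes down to whether both developers touched the same files. The sketch below is a deliberate simplification at the file path level and not how Plastic SCM decides internally (it ignores moves, deletes and so on); the function names and file names are hypothetical.

```python
def colliding_files(my_changes, their_changes):
    """Return the paths modified on both sides, sorted."""
    return sorted(set(my_changes) & set(their_changes))


def update_outcome(my_changes, their_changes):
    """Say whether an update can simply move the workspace to the new head."""
    if colliding_files(my_changes, their_changes):
        return "merge needed"
    return "plain update"


# Scenario 1: no file in common. Scenario 2: two files in common.
print(update_outcome(["a.c", "b.c"], ["c.c"]))
print(update_outcome(["a.c", "b.c", "d.c"], ["a.c", "b.c", "e.c"]))
```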

    Shelving (stashing) introduced

    We had automatic shelving on versions 1 to 3.0 but it was removed in 4.0 due to the major underlying re-design and also time constraints.

    Now we’re working to release 4.1 and it includes, among others, the new shelve system that blends the good things of the traditional shelving mechanism and the ability to “merge” shelves like some DVCS do (consider git stash).

    This is the list of features of the new “shelve mechanism”:

    Feature | Plastic SCM 4.1 | TFS | Git | Perforce
    Store temporary work in progress | Yes | Yes (shelve) | Yes (stash) | Yes (p4 shelve)
    Apply temporary work to a different branch | Yes | No (can't merge, just copy) | Yes (stash) | Partial (you can later apply a "resolve")
    Share shelves among developers | Yes | Yes | No | Yes
    Share shelves among different servers (DVCS way) | Yes (using replica) | No (is not a DVCS) | No (can't share stash) | No (not a DVCS)

    When is shelving useful?

    Check this great Stack Overflow post for more info.
  • Context Switching: saving your work on your current task so you can switch to another high priority task. Say you're working on a new feature, minding your own business, when your boss runs in and says "Ahhh! Bug Bug Bug!" and you have to drop your current changes on the feature, and go fix the bug. You can shelve your work on the feature, fix the bug, then come back and unshelve to work on your changes later. Alternatively, in Plastic SCM with “branch per task” you can simply checkin and switch. But shelving definitely adds flexibility on top of branch per task.
  • Sharing Changesets: if you want to share a changeset with another developer without checking it in, you can shelve it and make it available on the server side. This could be used when you are passing an incomplete task to someone else (although here ‘branch per task’ solves the scenario in a cleaner way) or if you have some sort of testing code you would never EVER check in that someone else needed to run.

    The answer highlights one point I wouldn’t use “shelves” for:

  • Saving your progress: while you're working on a complex feature, you may find yourself at a 'good point' where you would like to save your progress, this is an ideal time to shelve your code. Say you are hacking up some CSS / HTML to fix rendering bugs, usually you bang on it iterating every possible kludge you can think up until it looks right. However, once it looks right you may want to try and go back in to cleanup your markup so that someone else may be able to understand what you did before you check it in. In this case, you can shelve the code when everything renders right, then you are free to go and refactor your markup knowing that if you accidentally break it again, you can always go back and get your changeset.

    I would never use shelves for this. Branch per task is simply a much better pattern. You checkin when you want to create a new “checkpoint”, that’s it, easy and simple, no extra operations needed.

    TFS needs to use “shelving” for this because “branch per task” is not doable with high number of branches (I mean, it works in “hello world” projects but not in real conditions).

    How does it work?

    Suppose you’ve some debug code that you don’t want to checkin but want to keep the same debug statements when switching to a new branch.

    You go to Plastic’s “pending changes” view but instead of “checking in” you decide to “shelve”.

    Now you decide to switch to a different branch (for instance due to a context change, priority change or whatever).

    Then you look for the “shelves view” in Plastic:

    And show the available “shelves”:

    From here you can simply select “apply shelve” on the shelve context menu and a “merge will happen”.

    The important change from 3.0 is that a shelve is not just a copy that you can “restore”, you can make some changes and apply them to a different branch using the same underlying merge mechanism you’re used to.

    When will it be available?

    Our plan is to make 4.1 available in March 2012, before the Game Developers Conference. We plan to create a “labs” area under downloads to make the new version available for the ones who can’t wait! :P

    DVCS myths & facts

    Unless you’re not in the software industry or you’ve been on Mars for the last 5 years, you’ve heard about DVCS. It stands for Distributed Version Control System and, well, it is simply “the way to go on planet Earth for all programming related stuff”.

    Admittedly I was more than happy today when I went to the SD Times front page and read: “Branching and merging: the road ahead for SCM”, which is cool considering “branching & merging” is the heart of our daily work.

    This is our mantra!!

    We always say “branching and merging is good”, we designed Plastic SCM from the ground up back in 2005 to help all the small and medium teams in companies out there moving towards parallel and later distributed development.

    And now, certainly thanks to DVCS major players like Git and software heroes like Linus Torvalds, it became a major industry trend.

    Version control used to be a commodity circa 2004, but 2005 rocked the SCM world and a new wave of tools tossed the foundations of the configuration management sector: big tools like PVCS and ClearCase and mass-oriented ones like Subversion started to vanish away.

    And a brave new world emerged shaped to the form of the new development trends: agile, faster, more dynamic and fully distributed.

    But, as usual, there’s confusion:

  • centralized minds making bold statements about what DVCS can and can’t do.

    I’m trying to come up with a list in the purest Chuck Norris facts style about what’s true and what’s not true about DVCS. But, to start with, I’ll simply check whether the DVCS myths stated in the SD Times article are true.

    Myth one - DVCS has a steep learning curve

    Replace “dvcs” by “git” and you’ll get the original statement.

    Well, it is simply incorrect: there’s much more to DVCS than Git. Give a try to Hg, check Plastic SCM or simply read a little bit more about Git and you’ll find that learning DVCS is not harder than learning Subversion or Perforce.

    Myth two- DVCS GUIs aren’t there yet

    “There is no effective GUI yet, so it appeals more to power users than enterprise developers.” (sic Perforce’s Randy DeFauw)

    As a Plastic SCM developer this is fun to read. I know he is talking about Git, but still :P Our system is all about cool GUI and advanced DVCS. Compare Plastic SCM with others and check our GUI effectiveness.

    Myth three- DVCS requires a full replica on each developer machine, compromising security

    Today I feel like we were doing our homework down here at Codice: while this can be true for Git or Hg, the good thing with Plastic SCM is that it is all about choices: you want it centralized, fine, you want it distributed, fine too, you want a full replica, right, you just need to replicate a single commit (changeset) to your laptop, work on it and push back to the main server later on… fine!

    Regarding security: ACL based sec, triggers, up to 7 standard database backends, LDAP support built-in… what do you mean by “compromised”?

    Myth four - DVCS can’t deal with big files and it is only designed for text

    Down here we test Plastic SCM 4.0 with a checkin of 1 TB in a single file.

    You read that correctly: 1 terabyte inside a single file.

    And yes, we run circles around Subversion while doing it.

    SO: maybe Git or Hg have big trouble dealing with big files, but Plastic SCM is distributed and can deal with files as big as you can handle, that’s why, for instance, some reputed game developers already use Plastic SCM.

    Myth five - There is no “master” file or canonical source

    Tell that to Mr. Torvalds and his “dictator/lieutenant” controlled integration model.

    No master file? Being distributed doesn’t mean there’s no canonical source. You can have more than one repository but still know which one is the master copy.

    Unable to do a replica doesn’t mean you’re better organized (SVN, Perforce), it only means you’re unable to do distributed work.

    An arrow story

    Well, a few weeks ago someone posted some comments from the Mercurial folks about our (and Git's) recursive merge strategy. I covered it a few days ago here.

    Interestingly, some people argued that the way in which we render the branches in the branch explorer (merge links pointing from source to destination) is not correct. I really don't care about how correct it is since I don't feel like teaching graph theory on a daily basis, but we do always listen :P.

    This is basically how the Branch Explorer renders branches and commits (changesets):

    And well, in Plastic SCM 4.1 (it is ready down here and soon will be published under "labs" for the impatient among you) there's a hacker hidden option:
    edit branchexplorer.cfg
    
    and add:
    
    display.options.dag_mergelinks=true
    
    and then you'll see:

    Which, ok, well, whatever, I guess it has its own audience! :P

    Vertical

    4.1 features the long awaited, begged, proposed and requested "2D version tree feature" that I'm not unveiling today... :P And together with it comes the possibility to render your branch explorer... vertically... up means newer.

    All credits to Mr. Daniel Peñalba! :P

    Enjoy!

    Migrating from SVN to PlasticSCM

    Thanks to the script made by Ignacio Calvo (ignaciocalvo.com) you can migrate your Subversion repository into PlasticSCM very easily.


    There are a lot of ways to do the same and with different tools, but this one is pretty and quite simple.

    You can find the source code here, and a lot more info in our public forum http://www.plasticscm.net


