Saturday, January 06, 2007

Bug Fixing

Bug fixing is a necessary evil of software development. How often have you been working on a large, complex, probably legacy software system and tried to fix a bug? The changes, while appearing only minor, have unknown side effects. It's hard to isolate and test only the changes you make. The other developers are also working on the same file(s) as you. No one really knows which patches fix which bugs. Sound familiar? For anyone who has worked on a large, complex software system with little knowledge of the software architecture, this is an everyday fact of life. But with a little organisation and a powerful revision control system (such as Plastic), we can bring order to this mess. Here's how.

By using parallel branches and isolating each bug fix to its own branch, we can begin to bring order to what may appear to be a chaotic software jungle. Parallel branching enables all the developers to work concurrently on the source code. Any modifications are made in isolation, away from the main branch or head revision. The aim is to keep the main branch in a known stable and working state - it always compiles and passes the test cases. Some may prefer to call this the 'release' branch but we'll simply call it 'main'.

The workflow is thus: the developer identifies the file or files that require modification. These files are branched on to a new branch and the appropriate modifications are made. Testing is carried out. When the tests succeed the modified files are merged back on to main.

Given some file named main.c on the /main branch, we start by creating the bug fix branch using the Plastic command line:

> cm makebranch br:/main/pr-1234

We will then branch the working file on to this new bug fixing branch like so.

1. First update the current workspace selector with the command:

> cm setselector

And change it to:

repository "default"
path "/main/pr-1234"
branch "/main/pr-1234"
path "/"
branch "/main" checkout "/main/pr-1234"


2. Checkout the file main.c ready for editing:

> cm checkout main.c

We are now working on the parallel bug-fixing branch away from any other changes that may be done to this file by our colleagues.
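
The setup steps above are the same for every bug, so they can be written down once. The Python below is only an illustrative sketch, not part of Plastic: the function name start_fix_commands is hypothetical, and the commands are returned as strings to be read or pasted rather than executed. The selector still has to be edited by hand as described in step 1.

```python
def start_fix_commands(bug_branch, files):
    """Return the cm commands from the walkthrough above, in order.

    bug_branch is the full branch path, e.g. "/main/pr-1234".
    Nothing is run; the caller decides what to do with the strings.
    """
    commands = [
        f"cm makebranch br:{bug_branch}",  # create the bug fix branch
        "cm setselector",                  # then edit the selector by hand
    ]
    # One checkout per file that needs modifying.
    commands += [f"cm checkout {name}" for name in files]
    return commands


for line in start_fix_commands("/main/pr-1234", ["main.c"]):
    print(">", line)
```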

The file is edited and checked-in as normal. Using the graphical tree view in Plastic, we can confirm the revision history of the file on the bug-fixing branch.

> cm tree main.c



When the changes are complete, we merge the file from the /main/pr-1234 branch back on to the /main branch, like so:

1. First change the workspace selector to view only the /main branch:

repository "default"
path "/"
branch "/main" checkout "/main"


2. Then we do the merge. We'll do this in two stages: first a "findmerge" to determine which files need merging, then the actual merge.

To do the findmerge, we run:

> cm merge br:/main/pr-1234
Merge needed on item c:\workspace\main.c from br:/main/pr-1234 to br:/main#1 base br:/main#1


This shows that our file needs to be merged, so we do the merge:

> cm merge br:/main/pr-1234 --merge --graphical
Merge needed on item c:\workspace\main.c from br:/main/pr-1234#1 to br:/main#1 base br:/main#1
Going to merge c:\workspace\main.c
Merge done


This leaves the file in a checked-out state, so we first review the result of the merge before checking in.



If the merge failed for some reason, simply do:

> cm undocheckout main.c

If we're done, we check-in:

> cm checkin main.c

Then view the final revision tree:

> cm tree main.c



And that's it.
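
As a recap, the merge-back steps can be laid out as a small decision: preview, merge, then either check in or undo the checkout depending on the review. The sketch below is an illustration only. The function name merge_back_commands and the merge_ok flag are hypothetical, the commands are the ones shown above, and nothing is executed. It assumes the workspace selector has already been switched back to /main.

```python
def merge_back_commands(bug_branch, files, merge_ok):
    """Return the merge-back commands from the walkthrough as strings.

    merge_ok is the outcome of the manual review of the merged files:
    True leads to a check-in, False to undoing the checkout.
    """
    commands = [
        f"cm merge br:{bug_branch}",                      # findmerge: list what needs merging
        f"cm merge br:{bug_branch} --merge --graphical",  # perform the merge
    ]
    final = "cm checkin" if merge_ok else "cm undocheckout"
    commands += [f"{final} {name}" for name in files]
    return commands


for line in merge_back_commands("/main/pr-1234", ["main.c"], merge_ok=True):
    print(">", line)
```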

Note that we are referring to individual files in the source tree. We are not creating a copy of the entire source tree, just the necessary files within it. The revision control system must be able to support this. It must also support renaming and deleting files, i.e. versioning the directory contents themselves. In other words, every change and operation we make to the source tree must be recorded, whether it is an edit to the code or the binary data files, or the renaming, deleting or moving of files and directories. Without these features, the revision control system is not adequate for use in a production development environment. Why? Because the version history must be complete, so that you can go back to any point in time and recreate the source tree as it was. If you can't do that with your current revision control system, replace it.

Now we make one simple observation. By naming the developer's work branch so that it can be identified from the bug tracking system, we are making a traceable, isolated change set directly in the revision control system. Do this for all bugs and we can look at a bug in the bug tracking system, identify the branch in the revision control system, run a diff and immediately see what was changed, by whom and when. With the aid of the graphical version tree we can instantly see where those changes were merged.
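
The naming convention is what makes the link between the two systems mechanical. Here is a minimal sketch of that mapping, assuming (from the example branch /main/pr-1234) that a bug fix branch is always a child of /main named "pr-" followed by the numeric ID from the bug tracker. The function names are hypothetical and not part of Plastic.

```python
import re

# Assumed convention, inferred from the example branch /main/pr-1234.
BRANCH_PATTERN = re.compile(r"^/main/pr-(\d+)$")


def branch_for_bug(bug_id):
    """Branch path on which the fix for the given bug ID is made."""
    return f"/main/pr-{bug_id}"


def bug_for_branch(branch):
    """Bug ID encoded in a branch path, or None if it is not a bug fix branch."""
    match = BRANCH_PATTERN.match(branch)
    return int(match.group(1)) if match else None


print(branch_for_bug(1234))
print(bug_for_branch("/main/pr-1234"))
print(bug_for_branch("/main"))
```

Going from bug to branch gives the place to run the diff; going from branch to bug lets a branch list be checked against the tracker.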

What was a chaotic collection of code changes has now become an organised, traceable collection of modifications. We can even begin gathering metrics on individual bug fixes, such as how many lines of code were changed and which files were affected. Add a sprinkling of peer reviews and we start to mitigate the risks of changing complex software.
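
As an example of such a metric, the sketch below counts the lines added and removed between two versions of a file using Python's standard difflib module. It is an illustration under assumptions: the two versions are passed in as plain strings (fetching them from the revision control system is left out), and the function name count_changed_lines is hypothetical.

```python
import difflib


def count_changed_lines(before, after):
    """Return (added, removed) line counts between two versions of a text."""
    old, new = before.splitlines(), after.splitlines()
    added = removed = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" both removes old lines and adds new ones.
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, removed


# Hypothetical contents of main.c on /main and on the bug fix branch.
on_main = "int main(void)\n{\n    return 1;\n}\n"
on_fix = 'int main(void)\n{\n    puts("fixed");\n    return 0;\n}\n'

print(count_changed_lines(on_main, on_fix))
```

Run per file on a bug fix branch, the counts give lines changed, and the number of files with a non-zero count gives files affected.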

By the way, this approach also works earlier in the software development life cycle. Replace the concept of 'bug' with 'change request', 'feature implementation', 'cross platform version' and so on.
