Who we are

We are the developers of Plastic SCM, a full version control stack (not a Git variant). We work on the strongest branching and merging you can find, and a core that doesn't choke on huge binaries and repos. We also develop the GUIs, merge tools and everything needed to give you the full version control stack.

If you want to give it a try, download it from here.

We also code SemanticMerge, and the gmaster Git client.

Troubles in .NET remoting when IP changes

Sunday, October 26, 2008 Pablo Santos, 3 Comments

I’m going to talk about an issue I ran into with remoting a few days ago while handling a specific client/server scenario.

You know remoting is no longer the newest and coolest technology out there, since WCF has been around for a while, but AFAIK we don’t have WCF on Mono yet, which means it simply doesn’t exist for me and all you multi-platform people. So that’s why I’m still diving into the deep remoting implementation looking for answers.

Here’s the problem: I have a simple scenario like the one in the following picture:



A client and a server working together, nothing special. The server publishes an IRemoteCall interface using remoting and the following simple code (note that I’m marshalling an existing object with infinite lifetime, which is not the usual way of doing things unless you’re an old COM cowboy... :-P)


using System;
using System.Threading;
using System.Runtime.Remoting;
using System.IO;

using RemotingTest;

public class Remote : MarshalByRefObject, RemotingTest.IRemoteCall
{
    // Returning null gives the marshalled object an infinite lifetime.
    public override object InitializeLifetimeService()
    {
        return null;
    }

    public void Send(string data)
    {
        // Simulate a long-running call: ten one-second waits.
        for (int i = 0; i < 10; ++i)
        {
            Thread.Sleep(1000);
            Console.WriteLine("server is waiting...");
        }
        Console.WriteLine("server is done");
    }
}

public class Server
{
    public static void Main(string[] args)
    {
        RemotingConfiguration.Configure("remoting.conf");
        Remote rem = new Remote();
        // Publish the existing instance under the "remote" URI.
        ObjRef objservice = RemotingServices.Marshal(rem, "remote");
        System.Console.WriteLine("Hit <Enter> to exit...");
        System.Console.ReadLine();
        RemotingServices.Disconnect(rem);
    }
}



Ok, you see there’s nothing special so far.

Just a few lines of code to implement the shared interface and the server code to start up the service and publish it.
The client is even simpler:


using System;
using System.Runtime.Remoting;
using RemotingTest;

public class Client
{
    public static void Main(string[] args)
    {
        RemotingConfiguration.Configure("remoting.conf");

        Client me = new Client();

        // The remote object URL, passed on the command line.
        string server = args[0];

        me.Run(server);

        Console.WriteLine("Waiting for next call");
        Console.ReadLine();
        Console.WriteLine("Running again");
        me.Run(server);
        Console.ReadLine();
    }

    private void Run(string server)
    {
        // Get a transparent proxy to the published object and call it.
        IRemoteCall remoteCall =
            (IRemoteCall)Activator.GetObject(typeof(IRemoteCall), server);

        try
        {
            remoteCall.Send("hello");
        }
        catch (Exception ex)
        {
            Console.WriteLine("Exception: " + ex.Message);
        }
    }
}


Simple, right?

It simply accesses the remote object (specified on the command line; in my case something like tcp://beardtongue:6060/remote), makes a call, waits for a key press, and then runs a second call.
The next listing is the remoting.conf file used on the client:

<configuration>
  <system.runtime.remoting>
    <application>
      <channels>
        <channel ref="tcp" />
      </channels>
    </application>
    <customErrors mode="Off" />
  </system.runtime.remoting>
</configuration>


And the one for the server:


<configuration>
  <system.runtime.remoting>
    <application>
      <channels>
        <channel ref="tcp" port="8080">
          <serverProviders>
            <formatter ref="binary" typeFilterLevel="Full" />
          </serverProviders>
        </channel>
      </channels>
    </application>
    <customErrors mode="Off" />
  </system.runtime.remoting>
</configuration>


I’m using TCP remoting channels in binary mode.

Well, needless to say, this was not my original scenario, but it is a simple example I wrote to study in detail a problem I discovered on a real deployment. During normal operation both the client and the server work smoothly, but I ran into problems when the machine running the server experienced an IP change. How did this happen? Well, I was traveling back and forth between the office and home, and I experienced problems with clients and servers running on the same laptop. They worked at the office, but I had to restart the client to continue working at home. What was going on? The issue seemed to be related to the IP change the laptop underwent on each network when connecting to a different DHCP server. The scenario is better depicted in the next figure:



Let’s see: I have a server up and running, then my client makes a first successful call. My laptop is suspended and awakened with a new IP address. Then the client tries a new call and it fails.

In order to reproduce it without having to reconfigure my DHCP server or travel 20 km each time, I used the following command on my XP laptop:


>netsh interface ip set address name="Conexión de área local" static 192.168.1.253 255.255.255.0 192.168.1.1 1

I switched between .253 and .245 each time I needed to simulate an IP change.

Why doesn’t it work if I’m using the server name instead of the IP? Shouldn’t it work?

Then I ran the test using the Mono runtime and... it worked!! In fact, I was playing with a custom TCP channel I had written deriving from the Mono TCP channel (implementing SSL and some other tuning) and it was working too... So the problem with the .NET implementation is somewhere in the TCP channel, not the remoting stack.
After some study I found the following: inside the Mono TCP channel implementation there’s a small class named ReusableTCPChannel which implements a property named IsAlive. The IsAlive property makes a call to:

return !Client.Poll (0, SelectMode.SelectRead);


which basically checks whether the underlying socket is still valid each time it is retrieved from the TCP channel’s internal socket cache. When the server’s IP changes and the client tries to run the next call, the channel detects that the socket is no longer usable and creates a new one.
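To make the check concrete, here is a minimal sketch of that liveness test as a standalone helper. Mono’s code is just the negated Poll call quoted above; the Available check is an extra defensive touch I’m adding (a healthy socket with pending data is also readable), so take this as an illustration rather than the exact Mono implementation:

using System.Net.Sockets;

// Sketch of the liveness test applied to pooled connections.
// Poll(0, SelectMode.SelectRead) returning true on a socket with no
// pending data means the peer closed or reset the connection.
public static class SocketLiveness
{
    public static bool IsAlive(Socket socket)
    {
        if (socket == null || !socket.Connected)
            return false;

        // Zero-microsecond timeout: a non-blocking peek at the socket state.
        bool readable = socket.Poll(0, SelectMode.SelectRead);

        // Readable but nothing to read => the connection is dead.
        return !(readable && socket.Available == 0);
    }
}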

This is not happening on the .NET stack (I’m talking about 1.1; I didn’t check whether it is solved on .NET 2.0 or 3.5): the initial socket is reused to issue the next call, which simply raises an exception after a few seconds when it can’t reach the server anymore. The problem could probably be fixed somewhere inside the remoting library, maybe in the SocketCache.GetSocket method, where the RemoteConnection is retrieved from a Hashtable and the underlying socket is used but never checked.
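In case it helps, here’s a hypothetical sketch of what such a fix could look like: validate the cached socket before handing it out, and reconnect (re-resolving the host name, so a new IP is picked up) when it’s dead. None of these names come from the actual .NET remoting sources; it reuses the IsAlive helper from the sketch above:

using System.Collections;
using System.Net.Sockets;

// Hypothetical cache that checks a pooled socket before reusing it,
// the way the real SocketCache.GetSocket apparently does not (in 1.1).
public class ValidatingSocketCache
{
    private readonly Hashtable cache = new Hashtable(); // "host:port" -> Socket

    public Socket GetSocket(string host, int port)
    {
        string key = host + ":" + port;
        Socket cached = (Socket)cache[key];

        if (cached != null && SocketLiveness.IsAlive(cached))
            return cached; // still usable, reuse it

        // Stale entry (e.g. the server's IP changed): drop it and reconnect.
        if (cached != null)
        {
            cached.Close();
            cache.Remove(key);
        }

        Socket fresh = new Socket(AddressFamily.InterNetwork,
            SocketType.Stream, ProtocolType.Tcp);
        fresh.Connect(host, port); // name resolution happens again here
        cache[key] = fresh;
        return fresh;
    }
}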

Fortunately it’s not an extremely common scenario for server applications, but if it happens to you, try grabbing another TCP channel (use the one from Mono) to get it fixed.
Pablo Santos
I'm the CTO and Founder at Códice.
I've been leading Plastic SCM since 2005. My passion is helping teams work better through version control.
I had the opportunity to see teams from many different industries at work while I helped them improve their version control practices.
I really enjoy teaching (I've been a University professor for 6+ years) and sharing my experience in talks and articles.
And I love simple code. You can reach me at @psluaces.



New blog posts

Tuesday, October 21, 2008 Pablo Santos, 0 Comments

I started posting at DDJ's blog a couple of weeks ago, and so far I've written about a couple of topics.

In the first one I focus on the new Mono 2.0 release, and what I believe it means to developers out there. You can find the full text here.


And today I've just posted about what I ended up calling "multi-experience" software design, which covers some of the issues discussed in the latest Pragmatic Programmers' great book, Pragmatic Thinking and Learning.

Enjoy


Agile Scrum, CMMi, Testing and Continuous Integration

Friday, October 17, 2008 Pat Burma, 0 Comments

At Codice Software we use a combination of software development practices and organizational strategies. Specifically, we use CMMi (we are Level 2 certified) and Scrum. An unlikely pair, as one of them aims to tighten up the development environment with strictly defined rules and processes, while the other aims to loosen the development environment and allow change to flow more freely and quickly.

While these two methodologies may seem diametrically opposed, they actually work together really well. Pablo Santos, the CEO, spiritual leader and in-house motorcycle aficionado, has talked about the marriage of Scrum and CMMi in our internal processes before. Recently he authored a white paper that describes the testing environment at Codice and how CMMi and Scrum have shaped the way we test.

CMMi, summarized in a few short words, is about setting up a well-defined, well-documented process that is rigorously enforced. Agile, to summarize, is about being lightweight and adaptable. A marriage of the two spawns a process that is lightweight and adaptable but still rigorously enforced.

It used to be, back in the olden times of a couple of years ago, that coders would code and testers would test. The onus was on the quality assurance department to discover bugs before they went out the door. In the modern era there has been a paradigm shift that puts the onus on coders to find bugs before they even get to the testers. The testers are still responsible for putting the final stamp of approval on things, doing integration testing and regression testing, but coders are now being asked to commit only bug-free code, or as bug-free as reasonably possible.

As a result, automated testing tools have become a booming industry. Codice uses a couple of tools for managing testing: a modified NUnit (PNUnit), TestComplete, and an internal work item system. It is important to use a wide range of test methods even in an agile environment; see Edward Correia's first of 10 Myths of Agile Testing. White-box, unit and load testing are accomplished with PNUnit. Black-box, functional and behavioral testing are performed manually or with automation using TestComplete.
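For readers who haven't seen the NUnit side of this, a white-box test is just an attributed fixture; PNUnit then launches fixtures like this one in parallel across machines. This is a generic illustration with made-up names, not a test from the Codice suite:

using NUnit.Framework;

// A minimal NUnit fixture of the kind PNUnit distributes across agents.
[TestFixture]
public class CheckinTests
{
    [Test]
    public void EmptyCommentIsRejected()
    {
        bool accepted = CheckinValidator.IsValidComment("");
        Assert.IsFalse(accepted, "empty checkin comments should be rejected");
    }
}

// Hypothetical code under test, included so the example is self-contained.
public static class CheckinValidator
{
    public static bool IsValidComment(string comment)
    {
        return comment != null && comment.Trim().Length > 0;
    }
}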

A few words on TestComplete. I have been working with and around automated testing tools since 2001. I am familiar with a whole lot of testing tools, and the better ones came from IBM and Mercury (now HP). TestComplete may very well be the best tool on the market. It appeals to software developers as well as QA testers. It's part of the next generation of automated testing tools that use intelligent object recognition for test playback. This is huge in an agile environment, where change occurs rapidly and users don't want to spend all their time maintaining automated test scripts.

Pairwise testing is also important to the Codice testing environment. Agile is about speed, and it's not very practical to think you can run 10 hours of automated tests with every build; it also makes continuous integration nearly impossible. Pairwise testing is a way to break up testing responsibilities using a matrix, so that you get good coverage of all the major aspects of a product without having to cover every single combination. A simple example: Plastic runs on Windows and Linux with MySQL, MS SQL Server and Firebird. Testing every version of Windows and Linux against every database supported by Plastic would lead to an enormous number of test combinations. The pairwise technique simplifies this so that we test every system once and every database once, but not every system and database combination, as the sketch below illustrates.
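Here's a toy sketch of that coverage idea, with illustrative platform lists (real pairwise tools handle many parameters at once; this just shows the reduction):

using System;

// Every OS and every database appears at least once, without running
// the full OS x DB cartesian product.
public class PairwiseSketch
{
    public static void Main()
    {
        string[] systems = { "Windows XP", "Windows 2003", "Linux" };
        string[] databases = { "MySQL", "SQL Server", "Firebird" };

        int runs = Math.Max(systems.Length, databases.Length);
        for (int i = 0; i < runs; i++)
        {
            // Round-robin pairing: 3 runs instead of 9 combinations.
            Console.WriteLine("test run: " + systems[i % systems.Length] +
                " + " + databases[i % databases.Length]);
        }
    }
}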

The result is a rich testing environment that is broad in scope but not too time-consuming. A robust testing environment takes advantage of many different disciplines: pairwise, unit, functional and manual testing. To add a minor wrinkle: the most important features can be tested during continuous integration within sprints; it's not necessary to run the entire suite of tests.

If you want to learn more about this read our white paper about the Codice testing environment.

To try this for yourself, I recommend these four tools:

1.) Plastic SCM - for version control
2.) TestComplete - for automated functional testing
3.) VersionOne - for Agile project management
4.) CruiseControl - for Continuous Integration

Plastic, TestComplete and VersionOne are all products developed by companies that know each other, integrate where possible, and specialize in the products they make. It's an ideal combination for an Agile shop looking for robust testing. Plastic offers integration with VersionOne and CruiseControl.
