February 28, 2004

TDD and Genetic Programming

TADAIMA!

It's been some time since my last entry (feels like the start of a confession), but I've been in Japan for the past 2 weeks. For various reasons I really needed to get away from computers for a bit (no pun intended), and what better way than to immerse myself in my other passion?

As I was sitting in the plane on the 9 hour leg from Sydney to Tokyo, I was thinking about genetic programming (GP) and good old-fashioned artificial intelligence (AI). I remember being fascinated by AI when I was a kid, only to discover that in essence it was all about fitting a curve to a line. In more recent years I stumbled on GP, which sparked my interest once again, especially when I read about people taking out patents on the basis of genetic algorithms (GA) they had developed specifically for the purpose of producing patentable designs.

So anyway, I started to think back to simple (and not so simple) neural nets that I had worked with. The kind where you create a back-propagating net, feed it the expected inputs and outputs and watch it reconfigure the weights, etc. accordingly.

This got me to thinking about my old friend test driven development (TDD). Neural nets "evolve" by reconfiguring themselves to satisfy the old requirements AND the new requirements. TDD encourages a similar behaviour.

One of the criticisms of neural nets is that they can be quite brittle. That is, they may very well solve the problems they have been trained for but can often fail dismally on problems that to us seem almost exactly the same yet obviously differ enough to "confuse" the net. Usually it's just a matter of re-training the net with all the old data AND the new data (just like running your tests!). The trick is to understand the problem sufficiently to create useful training data.

One of the criticisms of TDD has been that it creates brittle applications. Now, my personal experience is that this is just rubbish. If anything, it encourages designs that are more flexible/extensible in the long run. But it is true to say that a purely TDD application may not cope with a scenario that hasn't been tested for. In fact I argue that anything that hasn't been tested for is by definition not a feature of the system. Not only is it not a feature of the system now, it is definitely not guaranteed to work in any future release even if it does work by coincidence now.

But the real thing I liked about comparing TDD with AI is that it's evolutionary. The system is continually adapted to suit the changing needs. This naturally led me to think about GP and to wonder whether it was possible or even practical (let alone useful) to write a GA that could take a suite of failing JUnit tests and generate an application that would make the tests pass. Then we would simply add some more tests and re-run the GA to produce a new system adapted to the new requirements.

As I've already stated, I'm not sure how practical this is, let alone useful, even if it is possible. But it was the last thought on computers I had until I arrived back in Melbourne.

Now to find a good Japanese restaurant...

February 13, 2004

Dead code elimination

I recently added jCoverage to the build of a "mostly" TDD project. The bits that weren't TDD are classes that we either "hacked" together for a quick-and-dirty prototype (and that will ultimately be removed) or used for domain modelling.

I like test coverage results, mainly because they give me a warm fuzzy feeling. I can see how well or how poorly the developers are going WRT test coverage. But I've often wondered if they have much benefit over and above the warm fuzzies.

It has occurred to me many times that in many cases (especially on a TDD project) the coverage analysis really shows me potential areas of dead code. That is, code that almost never (if at all) gets executed. Especially on large projects with lots of re-factoring going on, it can be hard to keep track of which methods are no longer used.

IDEs such as IntelliJ IDEA and Eclipse and even Checkstyle and I think PMD can show you visually which private members aren't being used. The IDEs can also show you which public and protected methods aren't being used but you have to run the search manually. Granted, it is also possible to write some code that loads all your classes and builds dependency graphs to do the same. But why bother when I already have a tool that does it for me?

And so it came to pass that this morning I was looking over a coverage report and found a few areas where the coverage was a big fat zero. Intrigued, I opened the code in my IDE and did a search for all usages. What do you know. None. Zero. Nada. Bupkis. Zilch. You get the idea :-). I did this on the next few untested methods and I'd say that around 25% of the untested code was actually dead code!

Just for curiosity, I'd really like to put this into a live system and see which parts of the code aren't used. I'm sure some of it will turn out to be old code that handles scenarios that never eventuate anymore. Who knows? Worth a try at least?

February 09, 2004

Testing thread safety - updated

Not much this time except to say that I took the previous examples and made them a bit more generic. The example provided shows the simplest method of using the classes but it can easily be extended for more complex requirements. In fact, I've so far used these classes to successfully test some in-memory database code I'd been writing, so it definitely works for more than trivial examples.

Continue reading "Testing thread safety - updated" »

February 07, 2004

Adding a comment feed

I found an article detailing how to add a comment feed to a Movable Type blog. I made a few changes (as one does) and now you can subscribe to the comments on this blog as well as the main feed. So for anyone who's interested, here's the template:

Continue reading "Adding a comment feed" »

Testing thread safety revisited

At a loose end today I turned my attention back to a recent blog of mine on testing for thread safety. Spurred on by your feedback and possibly just to prove a point ;-), I decided I'd spend a few hours and see what I could come up with.

Before delving into the code, I'll set out the scope or terms of reference for the exercise:

  • I wanted to test a very simple class for thread-safety;
  • The class should be written (designed) with thread-safety in mind, i.e. with synchronization in place;
  • I will only deal with synchronization at the method declaration level;
  • The tests should prove that the class works correctly with synchronization in place;
  • The tests should also prove that the class fails when synchronization is removed; and finally;
  • I want it to be automated so I need a way to have a synchronized and unsynchronized version of the class.

The last point here could be handled by using an interface much like the synchronized collection wrappers but I felt this was kind of cheating. Instead I decided to use the ASM byte-code library to do some magic on the class files.
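For comparison, the interface-based alternative mentioned above might look something like the sketch below. This is not the code from the full post and it does not use ASM; `Counter`, `PlainCounter`, `SynchronizedCounter` and `CounterHarness` are hypothetical names, and the sketch assumes Java 8 or later for the lambda. The harness holds every thread behind a `CountDownLatch` so they all start incrementing at roughly the same moment.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical interface for the class under test. */
interface Counter {
    void increment();
    int value();
}

/** Plain implementation with no synchronization at all. */
class PlainCounter implements Counter {
    private int value;
    public void increment() { value++; }   // read-modify-write, not atomic
    public int value() { return value; }
}

/** Wrapper that adds synchronization, in the style of the synchronized collection wrappers. */
class SynchronizedCounter implements Counter {
    private final Counter delegate;
    SynchronizedCounter(Counter delegate) { this.delegate = delegate; }
    public synchronized void increment() { delegate.increment(); }
    public synchronized int value() { return delegate.value(); }
}

class CounterHarness {
    /** Starts the given number of threads, each calling increment() perThread times. */
    static int hammer(Counter counter, int threads, int perThread) {
        CountDownLatch startGate = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    startGate.await();              // hold each thread until the gate opens
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int n = 0; n < perThread; n++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        startGate.countDown();                      // release all threads at once
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.value();
    }

    public static void main(String[] args) {
        int safe = hammer(new SynchronizedCounter(new PlainCounter()), 8, 10_000);
        System.out.println("synchronized:   " + safe);
        int unsafe = hammer(new PlainCounter(), 8, 10_000);
        System.out.println("unsynchronized: " + unsafe + " (often less than 80000, but not guaranteed)");
    }
}
```

The synchronized run always totals exactly `threads * perThread`. The unsynchronized run only tends to lose updates, so a test that demands failure there can pass or fail depending on the scheduler.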

Now onto the code and some brief explanation of the classes. You should be able to copy and paste the code into your favourite IDE, compile and run. You'll obviously need JUnit and the ASM byte-code library.

First the class under test which I based on some code I'd seen when researching threading:

Continue reading "Testing thread safety revisited" »

February 05, 2004

Don't touch my privates

I was giving a talk today on design and testing, refactoring, tools etc. and a question regarding the testing of private methods came up.

We were discussing reducing cyclomatic complexity by encapsulating conditional statements in private methods. I showed a block of fictional code that included a test that the customer was at least 18 years old and contained a number of conditionals including the 2 or 3 checks required to test the age.

So we re-factored the code to extract the age check and demonstrated that the code was now much more readable and understandable. The newly created isAtLeast18YearsOld(Date dob) method made it clear what the calling method was trying to achieve.
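As a rough illustration of that extraction (not the code shown in the talk), the result might look like this. `AccountOpening` and `canOpenAccount` are made-up names, and the sketch assumes Java 8's `java.time.LocalDate` in place of the `Date` parameter mentioned above.

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical class; the names are illustrative only. */
class AccountOpening {
    private final LocalDate today;

    AccountOpening(LocalDate today) {
        this.today = today;
    }

    /** The calling method states its intent instead of spelling out date arithmetic. */
    boolean canOpenAccount(String name, LocalDate dob) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        return isAtLeast18YearsOld(dob);
    }

    /** Extracted age check: whole years elapsed between dob and today. */
    private boolean isAtLeast18YearsOld(LocalDate dob) {
        return Period.between(dob, today).getYears() >= 18;
    }
}
```

The conditional logic has not gone away; it has a name, and the caller's complexity drops by however many branches the age check used to contribute.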

The next question was "should we write a test for this?" and if so, "how are we going to do that if it's private?"

Now, I almost never test private methods. I say almost, just in case I've either done it once before and forgotten or I need to change my mind at some point in the future. But as far as I'm aware, I never test private methods.

Private methods are private for a reason. They're implementation detail. Yet another reason I dislike Java Beans so much - they simply expose the innards of classes such that the private instance fields might as well have been marked as public. "Guns don't kill people, people kill people." True enough but give a developer a getter and it's death to good design.

Naturally, my first reaction is that I do believe there is enough logic in there to warrant a test but that it's private and I don't test private methods. Instead, maybe the method deserves to be public and therefore testable and if so where does it belong?

"How about we put it into the Person class?" someone asks. "What a sensational idea!" I replied. No more passing around a Date.

Sometimes it's more natural to place the logic in, say, a strategy, making it pluggable. In this case you may choose to make the method more general, such as meetsAgeRequirements(). Then, once it's pluggable, you could move the real implementation into a rules engine. Whatever you do, resist the temptation to put it into a static helper! IMHO statics are the last resort of the scoundrel programmer ;-).
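A minimal sketch of both options, assuming Java 8 (`java.time` and lambdas). The `AgeRequirement` interface, the `ageOn` method and the parameters on `meetsAgeRequirements` are assumptions made for illustration; only the method names `isAtLeast18YearsOld` and `meetsAgeRequirements` come from the discussion above.

```java
import java.time.LocalDate;
import java.time.Period;

/** Pluggable rule; the interface name is hypothetical. */
interface AgeRequirement {
    boolean isMetBy(int ageInYears);
}

class Person {
    private final LocalDate dateOfBirth;

    Person(LocalDate dateOfBirth) {
        this.dateOfBirth = dateOfBirth;
    }

    /** Age in whole years on the given date; the date of birth itself never leaves the class. */
    int ageOn(LocalDate date) {
        return Period.between(dateOfBirth, date).getYears();
    }

    /** Non-private, so it can be tested directly. */
    boolean isAtLeast18YearsOld(LocalDate asOf) {
        return ageOn(asOf) >= 18;
    }

    /** More general form: the rule is supplied by the caller rather than hard-coded. */
    boolean meetsAgeRequirements(AgeRequirement requirement, LocalDate asOf) {
        return requirement.isMetBy(ageOn(asOf));
    }
}
```

A caller could then write `person.meetsAgeRequirements(age -> age >= 21, today)`, and the lambda could later be replaced by something backed by a rules engine without touching `Person`.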

In one simple example we'd managed to:

  • greatly simplify our code;
  • extract and make obvious some business logic;
  • put that logic back into the class where it's closest to the data on which it operates;
  • justify making the method publicly accessible and therefore testable; and;
  • remove (from the class we're implementing) a dependency on data (ie date of birth) contained within another class.

We've given our classes behaviour!

Private methods exist primarily to reduce the complexity of other methods and/or to remove code duplication. Either way, they are incidental to the implementation detail.

If you feel you need to test a private method (maybe because it's complex or contains some kind of business logic), rather than subverting Java's access protection mechanisms or perverting your code, have a think about what the method really does and where it belongs. Chances are, you've missed an important abstraction or concept.

February 03, 2004

Lots of little classes

I remember having a heated discussion many years ago over the use of Hungarian notation. Their argument went something like:

...If I don't call it pLAmount, how do I know it's a long?

To which I replied:

You don't have to know if you create a Money class!

Whenever I see variables with type information embedded in their name or logically grouped by some prefix or suffix I immediately think there must be a missing class. An abstraction or simple concept that we haven't expressed in code yet.
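To make the point concrete, here is a minimal sketch of what such a Money class could look like. It is an illustration only: holding the amount as a `long` count of minor units (cents, say) is an assumption, and currency handling is left out entirely.

```java
/** Hypothetical value object; the amount is held in minor units (e.g. cents). */
final class Money {
    private final long minorUnits;

    Money(long minorUnits) {
        this.minorUnits = minorUnits;
    }

    /** Returns a new Money; overflow raises ArithmeticException instead of wrapping silently. */
    Money plus(Money other) {
        return new Money(Math.addExact(minorUnits, other.minorUnits));
    }

    boolean isGreaterThan(Money other) {
        return minorUnits > other.minorUnits;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Money && ((Money) o).minorUnits == minorUnits;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(minorUnits);
    }

    @Override
    public String toString() {
        return "Money[" + minorUnits + "]";
    }
}
```

Callers write `Money amount` and never need to know or care that a long sits underneath, which is exactly the information the `pL` prefix was trying to carry.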

As an example, we're in the process of building a booking system and we obviously needed a way to represent date ranges for various things such as, you guessed it, Bookings.

The first thing that emerged was a Booking containing a fromDate and toDate. This had two aspects (no, not AOP-speak) that annoyed me. One was the fact that the variable names all had suffixes of Date. The other, anytime we need to pass these values around, we need two, count 'em, 2 parameters for every date range!

Oh one more thing. I loathe and detest the java.util.Date classes. Many people have commented on the problems with them so I'll say no more.

Instead we created a TimePeriod holding the start and end of the period represented as milliseconds GMT.

First the tests:

Continue reading "Lots of little classes" »
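A minimal sketch of the idea, which is not the code from the full post: an immutable class wrapping two millisecond values. The method names, the simplified `Booking` and the choice to treat the period as half-open (start included, end excluded) are all assumptions.

```java
/** Hypothetical sketch: an immutable period between two instants in milliseconds GMT. */
final class TimePeriod {
    private final long startMillis;   // inclusive
    private final long endMillis;     // exclusive (an assumption of this sketch)

    TimePeriod(long startMillis, long endMillis) {
        if (endMillis < startMillis) {
            throw new IllegalArgumentException("end is before start");
        }
        this.startMillis = startMillis;
        this.endMillis = endMillis;
    }

    long durationMillis() {
        return endMillis - startMillis;
    }

    boolean contains(long instantMillis) {
        return instantMillis >= startMillis && instantMillis < endMillis;
    }

    /** Two half-open periods overlap when each starts before the other ends. */
    boolean overlaps(TimePeriod other) {
        return startMillis < other.endMillis && other.startMillis < endMillis;
    }
}

/** A booking now carries one period instead of a fromDate/toDate pair. */
final class Booking {
    private final TimePeriod period;

    Booking(TimePeriod period) {
        this.period = period;
    }

    boolean clashesWith(Booking other) {
        return period.overlaps(other.period);
    }
}
```

Anything that used to take two date parameters takes one `TimePeriod`, and range logic such as overlap checks lives in one place.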

February 01, 2004

Ignorance was once bliss

It's always nice to find something about which I know bugger all. So today, instead of my usual rhetoric, magic answers and dubious words of advice, I roll on my back and display my soft underbelly.

You see, I'm trying to make a class thread-safe. Almost all of my classes are inherently thread-safe by virtue of the fact they are immutable. This class however updates internal data structures such as Maps and Sets.

Although Maps and Sets can be made internally thread-safe (by calling Collections.synchronizedX(anX)), that doesn't really help when performing multiple calls (add(), remove(), etc.) on multiple structures (including instance variables) and treating them as one atomic-ish operation.
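A small sketch of the kind of compound operation in question. `SessionRegistry` and its methods are invented for illustration. Wrapping the two collections with `Collections.synchronizedMap` and `Collections.synchronizedSet` would make each individual call safe but would still let another thread slip in between the check and the two updates, so here the whole method is synchronized on the owning object instead.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical class that must keep a Map and a Set consistent with each other. */
class SessionRegistry {
    private final Map<String, String> userBySession = new HashMap<>();
    private final Set<String> activeUsers = new HashSet<>();

    /** The check and both updates happen under one lock, so they act as a single step. */
    synchronized boolean login(String sessionId, String user) {
        if (activeUsers.contains(user)) {
            return false;                     // already logged in elsewhere
        }
        activeUsers.add(user);
        userBySession.put(sessionId, user);
        return true;
    }

    /** Removes the session and its user together, under the same lock as login(). */
    synchronized void logout(String sessionId) {
        String user = userBySession.remove(sessionId);
        if (user != null) {
            activeUsers.remove(user);
        }
    }

    synchronized boolean isActive(String user) {
        return activeUsers.contains(user);
    }
}
```

Because every access goes through the same lock, the plain `HashMap` and `HashSet` need no synchronization of their own.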

Now, I don't want to just put synchronization blocks everywhere and hope for the best. I want to validate that the class is actually thread-safe. But I soon realised I have very little idea how to unit test for concurrency and thread-safety.

A quick search of the 'Net turns up a few interesting links:

But these only address the problems associated with running multiple threads within a test. Unfortunately, simply running multiple threads doesn't actually prove that we have achieved thread-safety.

Because of the non-deterministic nature of thread scheduling, it is very difficult to ensure that we have had multiple threads accessing a single object concurrently, save adding hooks into the object itself to make it sleep or give up control at strategic points in the code.
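To show what such a hook could look like, here is a sketch with invented names throughout (`HookedCounter`, `LostUpdateDemo`), assuming Java 8. The counter calls a `Runnable` between its read and its write, and the test uses two latches to park thread A at that point until thread B has completed a whole increment. That forces the bad interleaving every time instead of leaving it to the scheduler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical unsynchronized counter with a test hook between its read and its write. */
class HookedCounter {
    private int value;
    private final Runnable afterRead;

    HookedCounter(Runnable afterRead) {
        this.afterRead = afterRead;
    }

    void increment() {
        int read = value;     // 1. read
        afterRead.run();      // 2. hook: a test can pause the thread here
        value = read + 1;     // 3. write, possibly based on a stale read
    }

    int value() {
        return value;
    }
}

class LostUpdateDemo {
    /** Forces thread B to run a whole increment() between thread A's read and write. */
    static int forceLostUpdate() {
        CountDownLatch aHasRead = new CountDownLatch(1);
        CountDownLatch bHasFinished = new CountDownLatch(1);
        AtomicBoolean firstCall = new AtomicBoolean(true);

        HookedCounter counter = new HookedCounter(() -> {
            if (firstCall.compareAndSet(true, false)) {   // only the first caller (A) pauses
                aHasRead.countDown();
                awaitQuietly(bHasFinished);
            }
        });

        Thread a = new Thread(counter::increment);
        Thread b = new Thread(() -> {
            awaitQuietly(aHasRead);       // wait until A has read the old value
            counter.increment();          // complete increment while A is parked
            bHasFinished.countDown();     // let A write its stale result
        });
        a.start();
        b.start();
        joinQuietly(a);
        joinQuietly(b);
        return counter.value();           // 1, not 2: B's update has been overwritten
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("value after two increments: " + forceLostUpdate());
    }
}
```

One caveat with this particular hook: if `increment()` were declared `synchronized`, thread A would park while holding the lock and thread B could never get in, so a test of the synchronized version would need a timeout or a differently placed hook.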

Immediately, my brain kicks into golden hammer mode and decides that this problem looks like a BCEL or ASM nail. But, being the naturally lazy git that I am, writing code or having to deal with AspectJ doesn't really appeal to me right now. Nor can I come up with any heuristics to determine where I would inject code anyway.

As you all know by now, I scoff at the idea that something is too hard to test. That said, I'm still left pondering how the hell I'm going to validate that my code is thread-safe? More to the point, how am I going to (re)design my class so that it's easy to test?