Ignorance was once bliss
Listen to this article
It's always nice to find something about which I know bugger all. So today, instead of my usual rhetoric, magic answers and dubious words of advice, I roll on my back and display my soft under belly.
You see, I'm trying to make a class thread-safe. Almost all of my classes are inherintly thread-safe by virtue of the fact they are immutable. This class however updates internal data structures such as Maps and Sets.
Although Maps and Sets can be made internally thread-safe (by calling Collections.synchronizedX(anX)), that doesn't really help when performing multiple calls (add(), remove(), etc.) on multiple structures (including instance variables) and treating them as one atomic-ish operation.
Now, I don't want to just put synchronization blocks everywhere and hope for the best. I want to validate that the class is actually thread-safe. But I soon realised I have very little idea how to unit test for concurrency and thread-safety.
A quick search of the 'Net turns up a few interesting links:
But these only address the problems associated with running multiple threads within a test. Unfortunately, simply running multiple threads doesn't actually prove that we have achieved thread-safety.
Because of the non-deterministic nature of thread scheduling, it is very difficult to ensure that we have had multiple threads accessing a single object concurrently, save adding hooks into the object itself to make it sleep or give up control at strategic points in the code.
Immediately, my brain kicks into golden hammer mode and decides that this problem looks like a BCEL or ASM nail. But, being the naturally lazy git that I am, writing code or having to deal with AspectJ doesn't really appeal to me right now. Nor can I come up with any heuristics to determine where I would inject code anyway.
As you all know by now, I scoff at the idea that something is too hard to test. That said, I'm still left pondering how the hell I'm going to validate that my code is thread-safe? More to the point, how am I going to (re)design my class so that it's easy to test?
Comments
Simon,
You are far wiser than I in such things... but it does occur to me that you could force synchronicity into your Sets at the very points of vulnerability by, say, passing in a custom set class for testing.
Then, instead of creating a new HashSet(), call SetFactory.getSet(), which will return--in the case of testing--a new VulnerableSet().
A VulnerableSet() does whatever it takes to make a thread collision more likely--sleeps, or wait(), or something.
I don't know how too make it deterministic, though.
Posted by: Danyel Fisher | February 1, 2004 08:40 PM
Concurrency and synchronisation is a system level design decision. It cannot easily be retrofitted into classes that have not been designed with concurrency in mind. That approach leads to subtle deadlock, livelock and race condition bugs.
Separate your program into modules. Define how threads are assigned to modules. Create interfaces between threaded modules with monitors (e.g. java objects with synchronized methods).
Posted by: Anonymous | February 1, 2004 11:34 PM
"Anonymous",
Thanks for the feedback
Maybe my post wasn't clear enough so for that I apologise.
The issue of how to make a class thread-safe is not the problem.
The problem is how to PROVE that the class is thread-safe by means of an automated test.
Cheers,
Simon
Posted by: Simon Harris | February 2, 2004 12:33 AM
Simon,
"Proving" that a class is thread-safe is a fairly trivial exercise in theory: all you need is your own Thread implementation. :) This would allow you to control the order of execution.
More pragmatically, you can take a statistical approach: create a dozen or so threads that attempt to interact with the object under test, set them up to repeat often enough that you get a large sample size (at least a hundred iterations, say), and let them loose. Couple this with code inside the object to detect when synchronisation is violated (say, a counter that is incremented on entry and decremented on exit, with an assert that it is either 0 or 1), and you should soon have a statistical proof that it is thread-safe.
Not quite as good as the real thing, but it should be solid enough.
Posted by: Robert Watkins | February 2, 2004 07:00 AM
In his TDD book, Kent states that there are two things which are difficult to unit test - security and thread safety.
I might be getting that quote wrong (its off the top of my head). It occurs to me now that it might just be hard to /drive/ the development of those things by unit tests.
In any case, you could look it up and see what Kent has to say on the matter. I know Jon Eaves was quite pleased when he found this, because he had similar pain in testing some crypto code for bouncy castle.
Posted by: Marty Andrews | February 2, 2004 07:13 AM
I think it's an inherently hard problem due to the number of combinations that the thread scheduler could throw at you.
Suppose you had two possibly interfering methods A and B each with four intructions each. You'd have to test all of the following possibile execution sequences (where A1 means instruction 1 of method A):
A1, A2, A3, A4, B1, B2, B3, B4
A1, B1, A2, A3, A4, B2, B3, B4
A1, B1, B2, A2, A3, A4, B3, B4
... and so on... maybe somebody who remembers more of their school maths could help me out with the formula for the number of combinations, my guess it's going to have a "!" in it. You can certainly see how quickly this would get out of control with real code.
Posted by: Neil Bartlett | February 3, 2004 08:02 AM
Simon,
I've only tried to do this once before, and don't claim to have PROVED thread safety, but I did manage to use TDD principles to get a test to fail every time, and the only way to get it to green was with strategic synchronization. I'd love to try this out on multiple platforms, but I never did. It strikes me you could write two tests; one that tests an unsynchronized version of the class and fails if no race condition is detected, and a second one against the correct implementation that is robust. In this way you could prove that your platform is actually stressing the code and then prove that you have dealt with it.
Posted by: James Ross | February 3, 2004 09:35 PM
I've managed to do this to some extent before, but you do pay a price in code readability. You're completely correct that you cannot prove a class thread-safe through running multiple threads through it and hoping there are no collisions.
The secret is to pass in a LockFactory (which can, of course, be mocked up for testing) and to synchronize on locks provided by the factory rather than on the object monitor itself.
Although I can see that there is an argument that you just push the testability out of your class into the Factory!
Posted by: Ben | February 16, 2004 02:10 PM