« JRules - The story so far... | Main | Extreme Smart Arses »

Collections gotchas #472

Listen to this articleListen to this article

I found an interesting (to me anyway) implementation detail in Suns JDK LinkedList and HashSet code.

Compare these two. The first is deep in the bowels of LinkedList.indexOf(Object) which is used by the implementation of contains(Object):

if (o.equals(e.element))
    return index;

The next is from HashMap.containsKey(Object) which is used by HashSet to implement contains(Object):

return x == y || x.equals(y);

Notice anything interesting? No?

Well I know it's not much but HashSet which one would think probably only honours the equals(Object)+hashCode() contract, also checks for object identity. On the other hand, LinkedList, which I would have thought probably only checks for object identity, actually only honours the equals(Object)+hashCode() contract.

Why is this a problem? Well I found a class (in a 3rd-party library) that never returned true for equals(Object). I only found it because I was writing a test and wondered why when I added instances of the class to a LinkedList, contains(Object) always returned false even though when adding them to a HashSet, contains(object) always returned true.

I had expected List implementations to at least check object identity. Unfortunately it behaves exactly as the documented, says so I'll just have to deal with the real problem and shoot the developer who wrote the broken code instead. But saying so upfront would have spoiled a good rant :-)

And the moral of the story is? Tests are good; Assumptions are bad; Test your assumptions; and; Sometimes the documentation is worth reading but usually only after the fact when all the subtleties hidden therein become apparent.

Comments

To me, I think that the LinkedList implementation is correct, and the HashSet implementation is wrong, but that's just me ;)

The libraries should not have to work around broken code in this fashion.

Another lesson, less pragmatic and more theoretical, is that when you violate a contract, ANYTHING can happen. In this case the contract is the postcondition for equals(), which states that "x.equals(x) should return true" for non-null x.

Writing a good equals() method, one that meets all of the contract requirements, is a bit tricky. Angelika Langer & Klaus Kreft had a good article on it called "Secrets of Equals". Unfortunately, it seems that their Web site has been taken down. The text is still available via Google's cache; archive.org might have the whole thing. Wait, here's a copy at the C++ User's Journal, of all places: http://www.cuj.com/documents/s=8467/cujjsup2004langer/

Thanks for the comments guys. Unfortunately the problem is that it's not my code that was broken. It was a class in a third-party library that I was testing to make sure it behaved the way I assumed it would and in the process I found what I consider a "quirk" of the collections spec.

Post a comment