
January 28, 2004

Exceptionally Challenged


I've finally finished my foray into C# and I suppose it would be obvious to all that I was rather less than impressed by some "decisions" that were made by the C#/.Net development team(s). Most notably, the lack of checked exceptions.

So it was with much joy that I stumbled upon this article, on the Artima web site. An interview with Anders Hejlsberg, the lead C# architect and a distinguished engineer at Microsoft, on "The Trouble with Checked Exceptions".

Fantastic I thought! At last I'll get some sensible, logical, coherent and rational explanation for some of the stuff that feels so uncomfortable to a Java weenie such as myself.

If only.

Thinking that maybe it was just me, I forwarded the link on to a good friend of mine, James Ross, who knows quite a lot about the Microsoft world. But alas, he drew similar conclusions. (Portions of our conversation have been included here.)

No, it is sad to say but Mr. Hejlsberg shows his true colours and in the process makes me even less impressed with .Net.

I'm usually not fond of taking an argument and dissecting it line-by-line. It's often too easy to take stuff out of context and often leaves one open for a counter argument in a similar vein ultimately leading to a flame war. But on the basis that Mr. Hejlsberg wouldn't know me from a bar of soap let alone read my blog, why not.

So without further ado:

Continue reading "Exceptionally Challenged" »

January 25, 2004

Simian bake-off


Well you asked for it and here it is. Results from running the native, C#, flavour of Simian versus the Java flavour.

As I mentioned earlier, I had originally run the comparison on my linux machine using mono. As many people had pointed out, this was far from a "fair" comparison. Some people even suggested that purely porting the code would result in poor performance. To this I reply fooey.

The test (and I use the term loosely) was performed on a DELL 2.0 GHz Inspiron 4150 with 512MB RAM running Microsoft Windows XP Pro against the JDK 1.4.1_01 source.

And the winner is...I'll let you be the judge:

The java version using the Sun JDK 1.4.2_03 ran in 64MB:

> java -jar simian.jar -recurse=*.java > java.txt
Similarity Analyser 2.1.0 - http://www.redhillconsulting.com.au/products/simian/index.html
Copyright (c) 2003-04 RedHill Consulting, Pty. Ltd.  All rights reserved.
Simian is not free unless used solely for non-commercial or evaluation purposes.
Loading (recursively) *.java from C:\jdk1.4.1_01\src
{ignoreCurlyBraces=true, ignoreModifiers=true, ignoreStringCase=true, threshold=9}
...
Found 40880 duplicate lines in 2339 blocks in 872 files
Processed a total of 369957 significant (1187603 raw source) lines in 3889 files
Processing time: 18.337sec

The C# version ran natively in 61MB:

> simian.exe -recurse=*.java > csharp.txt
Similarity Analyser 2.1.0 - http://www.redhillconsulting.com.au/products/simian/index.html
Copyright (c) 2003-04 RedHill Consulting, Pty. Ltd.  All rights reserved.
Simian is not free unless used solely for non-commercial or evaluation purposes.
Loading (recursively) *.java from C:\jdk1.4.1_01\src
{ignoreCurlyBraces=True, ignoreModifiers=True, ignoreStringCase=True, threshold=9}
...
Found 40880 duplicate lines in 2339 blocks in 872 files
Processed a total of 369957 significant (1187603 raw source) lines in 3889 files
Processing time: 12.628sec

Running with -server gains us about an 8% improvement in performance for the Java version but certainly still nothing like the nearly 30% needed to catch up to the native .Net version.
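For anyone wanting to check the gap for themselves, here is a minimal sketch of the arithmetic, using only the two processing times printed above. The class and method names are made up for illustration. On those figures the native run comes out a little over 30% quicker than the default Java run.

```java
// Hypothetical helper: how much quicker is one run than another, as a percentage?
class SpeedupDemo {
    static double percentFaster(double slowSeconds, double fastSeconds) {
        // Time saved, expressed as a share of the slower run.
        return (slowSeconds - fastSeconds) / slowSeconds * 100.0;
    }

    public static void main(String[] args) {
        // 18.337 sec (Java) and 12.628 sec (C#) are the times reported above.
        System.out.println(percentFaster(18.337, 12.628));
    }
}
```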

Surprisingly, running under BEA JRockit 1.4.2_03 used 235MB and took around 35 seconds using all default settings. The disk seemed to be thrashing but we made no attempt to tune the performance using JRockit options.

The C# version running under mono on the same hardware ran in 250MB and took around 78 seconds. Unfortunately we couldn't get any of the optimize features to work on the windows version of mono. Besides, we figure this comparison is rather moot. Rather it is better to compare Java versus Mono on linux.

So here are the results on a DELL 1.8GHz Inspiron 8200 with 1GB RAM running Gentoo Linux (2.4 kernel) against the JDK 1.4.2_03 source.

The java version using the Sun JDK 1.4.2_03 ran in around 60MB and took 25 seconds.

The C# version under mono (with -O=all) ran in around 90MB and took 34 seconds.

Amusingly, nay astoundingly, the .Net version runs natively faster under windows+VMWare+linux than the mono on straight windows or straight linux!! Go figure?

Interesting to say the least. I wait with bated breath for the ensuing storm of abuse from the Java community this entry generates. Hehehe. Though it'll make a change from receiving a serve from the .Net community.

Now the task is to see if we can pin-point what accounts for the difference. Unfortunately, I fear that because it's a direct port (i.e. line by line), any improvements I make to the Java version will likely carry forward into the .Net version.

Performance aside .Net is still not my bag baby. It still feels a little clumsy. But then I've years of practice getting my Java up to scratch.

January 20, 2004

My foray into C# - Part III


Something I forgot to mention that was very cool about all this was the fact that I developed everything under Linux and the binaries run ASIS, under windows. That, to me, speaks volumes. It's a credit to the guys on the mono project.

I also feel compelled to answer some of the suggestions that I'm biased. Me? Never! Nor am I opinionated, loud, prone to ranting....hehehehe

I too found the process to be very smooth and very easy. In no way did I try and suggest otherwise. At the end of both blogs I clearly stated how easy it was.

I was really trying to give an accurate account of what it was like to start from knowing nothing, along with all the annoyances and frustrations that come with it. Some comparisons may not be valid from the perspective of a born and bred C#/Microsoft developer but you have to remember that it's only natural to make comparisons. That's how we learn. We try and compare it with something we already know to give it some context, some meaning.

I hear from a mate that C# version 2 will have generics and Iterators and a few other things.

Regarding string versus String, you can't (at least using mono) use any of the System classes without "using" the System namespace. This was the thing that seemed ridiculous to me. Unfortunately String (big S) lives in the System namespace. I only found this out after I had converted lots of code so I kept on with the string (little s) convention. This was how I had seen examples written up on the 'net that I was using as a guide.

As for performance, again, in no way did I try and make out that .Net was necessarily slower. All I reported was what I had found running under mono on gentoo linux. As stated in the blog, hardly a reasonable comparison hehehe.

The only thing I really can't get used to are the libraries. I really do find them awkward. I used to be a C++ developer before I moved to Java and I really don't see the C# libraries as a step forwards. Oh and unchecked exceptions.

In no way do I despise C#. Okay so the quip about compsci students may have been a little harsh... I got over all the extra keywords, maybe you can forgive me in return hehehe. C# and .Net may not be my cup of tea but they no longer appear too different and too scary to at least try out.

Thank-you linesmen. Thank-you ball boys. It's back to watching the Australian Open...

My foray into C# - Part II


Well it's just gone 7pm and I've pretty much finished the conversion of Simian to C#.

All in all it took around 8 hours to convert 2000 lines of real source (around 6,500 raw source lines) of Java code to C#.

I haven't quite finished due to the blatantly stupid file/directory handling in .Net. But I'm almost there. I've hard-coded it to run from the current directory for testing purposes. But 99% of it has been completed. Even the output is identical which is nice to see.

Performance? Well again, I stress I'm running under mono on gentoo so that's probably not the best test but it seems to take around 50% longer to run and consume around 50% more memory than the Java version. I'll have to run some tests on windows and see how that performs.

Some, hopefully, interesting bits to add to my last blog...

Case statements don't allow you to fall through to the next one if you have defined any code for it. Instead you have to explicitly say you want to goto the particular case you need. Hmmm...not sure about that.
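As a reminder of what the Java side looks like, here is a minimal sketch of fall-through. The class and the weights are invented purely for illustration. In C#, the first case below would not compile as written; it would need an explicit goto case 'b'; in place of the missing break.

```java
// Hypothetical example: only the missing 'break' in the first case matters.
class FallThroughDemo {
    static int weight(char c) {
        int weight = 0;
        switch (c) {
            case 'a':
                weight += 1;    // no break here, so execution carries on into case 'b'
            case 'b':
                weight += 10;
                break;
            default:
                weight = -1;
        }
        return weight;
    }
}
```

Calling weight('a') runs both the 'a' and the 'b' code, while weight('b') runs only the second.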

C# doesn't allow you to put array specifiers [] after the variable name. It only allows them to go after the type. This is good. Only because it's the way I code anyway but I did find some code that somehow managed to slip through unnoticed.

What's up with structs? Sheesh! How are they in any way different to an Object except in the underlying implementation (ie they're probably allocated on the stack or something)? More warm and fuzzies for the C fraternity? Get rid of them I say.

Marking something as readonly doesn't actually seem to mean you must have assigned it a value. This may be a quirk of the mono compiler. I don't know. But if not, that's just wrong. The thing I love about final in Java is that it stops me from accidentally forgetting to assign a value to something. Took me 10 minutes to track down a NullReferenceException because I had accidentally deleted a line (an assignment) from a constructor. Using const is sometimes an option but it has a few caveats: const can only be used if you want a static variable and, more importantly, the value has to be known at compile time! This pretty much means it's only useful for simple, primitive values assigned in the declaration. I can't use const for say a collection or an instance of any object that requires new. Nor can I use it for values assigned in constructors. Which is exactly the problem I had.
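The Java behaviour being missed here can be sketched in a few lines. The class name is hypothetical. The point is that javac's definite-assignment check refuses to compile a constructor that leaves a final field unset, whereas a C# readonly field that is never assigned simply keeps its default value (null for a reference type).

```java
// Hypothetical class: the constructor is the only place 'name' can be set.
class FinalFieldDemo {
    private final String name;   // final: must be assigned exactly once per constructor

    FinalFieldDemo(String name) {
        // Delete this assignment and javac rejects the class, complaining that
        // the variable 'name' might not have been initialized.
        this.name = name;
    }

    String name() {
        return name;
    }
}
```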

instanceof becomes is. How nice of them to simplify something I hardly ever use and generally consider poor practice anyway LOL.

I started off thinking that having to specify override for things was a good thing. Now I'm not convinced. Basically, if I haven't thought that someone might want to override a method I'm pretty much screwed. Well they are. Luckily this is my own code base so I can refactor all I like. Shame about all those libraries and frameworks out there. If in doubt, you could always mark the methods virtual. Now that's all very well and good but if the majority of the time I want stuff marked as virtual, wouldn't it make sense to have that be the default? I think I'll need a macro in my IDE for all these keywords: v[TAB]; ro[TAB]; etc. hehehe. But again, maybe it's just what I'm used to?

Don't get me started on the FileInfo, DirectoryInfo, File, Directory, yada yada yada. Even the regular expression libraries seem to have all these extra, essentially static, classes. The .Net libraries look like they were designed (and I use the term loosely) by high-school students. First year CompSci students at best.

Readers become TextReaders, Writers become TextWriters. In and out are reserved words! Probably should have named my parameters something more meaningful anyway?

Auto-boxing? Didn't use it. Well not that I know of. That's the problem. I have no idea. Scares the hell out of me. Let's see, we want a language with lots of optimizations to keep C/C++ developers happy so they can feel like they're writing on to the metal, but for everyone else, let's make it really easy to not know you're creating bazillions of objects all over the place. Hmmm...

All in all it was really easy and relatively painless. I started from nothing and probably still know nothing but it all seems to work. I can't help but think that even in the early days of Java, the libraries may have been less functional, but they weren't all over the place. More to the point, remember that C# came from Java (oh go on tell me it didn't) so in my opinion there are no excuses. What's more, since C# started, a myriad of languages have popped up that are at least as functional (probably more so) and far easier to use.

So, whilst it's an interesting thing that we'll all have to get used to, I'm happy in my Java land for the foreseeable future.

Time for a G&T to top off a good afternoon.

January 18, 2004

My foray into C#


Well after a relaxing 10 days riding my motorbike around Tasmania I return to Melbourne, chilled (literally), relaxed and raring to go. I think it's about time to convert Simian to .net as had been suggested by many users. So here is a little blog of my experiences.

To start with, I've never written a line of C#. EVER. So I'm totally ignorant about all language constructs (except that having read the CLR book it looks like Java with uppercase method names) including, importantly, the libraries.

I'm also a linux weenie so I'm using mono under Gentoo. The combination is pretty good. Mono comes with mcs, the compiler; monodoc, a purpose-built browser for the framework libraries; and mono, for running your assemblies. At least I think that's the newfangled name for executables. Anyone?

I had first investigated writing an automated tool. The process seemed mechanical enough to automate yet difficult enough to end up doing some curly stuff. I also looked at some off the shelf converters but in the end decided doing it myself by hand was probably a good way to learn the ins and outs of the C# language and at the same time indulge my years of Java bias :-D.

My approach is simple: Open my project in IntelliJ (my editor of choice) and one by one, copy the .java files to corresponding .cs files, run mcs and hack away until it compiles cleanly. I have to also admit that I'm not going to turn my code into C# style stuff during this conversion. All my method names, variable names, class names, interfaces are remaining untouched. I'm not adding an I to my interfaces nor am I giving all my method names an uppercase first letter. Stuff that LOL.

So here we go...

The first thing is the package declaration. This becomes a namespace. Only, some crack smoking monkey decided that it should be a proper block which means I have to have open and closing curly braces around my entire source file. Not so bad but that means everything gets indented one more level. YUK! So I choose to just leave the indenting ASIS. Not so bad. Still seems unnecessarily verbose to me.

Next. String becomes string. What's with that? Method names get uppercase but String (a class) gets a lowercase? I got over it. Find+Replace is my friend.

Once again, Find+Replace for boolean. It's a bool in C#. I'm guessing a hang-over from C++. Again, I can live with this.

What's the problem with declaring a class as final? Ahhh. A quick search of the 'net reveals that it's sealed. Ok. Another keyword to remember. Not too bad I guess. On we go...

Crap. How do I declare a constant? In java it's static final. In C# it's, quite logically I guess, const. Bulk Find+Replace for that. Done.

Ok. Now I want an immutable (readonly) instance field. In Java I just mark something as final. In C# it turns out, I have to use readonly. Egads! Another keyword to remember.

It's interesting. When I first started using Java I thought it odd that a few keywords, such as final, seemed to be used for slightly different meaning but it soon became apparent that really the meaning was the same: It's final; Unchangeable; Immutable; Not modifiable. C# seems to want me to use a different keyword for every little thing. This is beginning to irritate me but it's still not difficult to do.
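To make the one-keyword point concrete, here is a small sketch of final doing all of those jobs in a single Java class. Every name in it is hypothetical. The comments note the C# keyword that the posts above say takes over in each position.

```java
// Hypothetical names throughout; only the placement of 'final' matters.
final class Threshold {                     // final class: cannot be extended (C#: sealed)
    static final int DEFAULT_LINES = 9;     // static final: a constant (C#: const)

    private final int lines;                // final field: set once, then fixed (C#: readonly)

    Threshold(final int lines) {            // final parameter: cannot be reassigned in the body
        this.lines = lines;
    }

    int lines() {
        return lines;
    }
}
```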

Starting to lose enthusiasm for this. I see all this syntactic sugar entering the Java language, obviously driven by C# and I wonder if my already less than perfect Java will become a quagmire of lexical sludge.

On I go...

Syntax Error. Hmmm... Maybe it's not called throws in C#? Quick IM to Mike Melia. C# has no checked exceptions. I knew this. But what I didn't know is that you can't even declare that a method throws an exception. Any exception. Hmmm. Oh well. Believe it or not, it's my custom Assert class (starting simple) and it's only IllegalStateException anyway so strictly speaking it doesn't have to be declared. I'll just delete it.

Ok what's the problem now? class IllegalStateException not found. What's an equivalent in C# I wonder. Trudging through the documentation I give up. This doco is really hard to follow. JavaDoc seems much easier to read. Maybe it's just a matter of what I'm used to. I'll just throw ApplicationException instead.

Still giving me grief. I prefix the class name with System.. That does the trick. Hmmm. That's a bit stupid. Having to import the System namespace? Go figure. I choose to add a using for it. I don't like putting package/namespace names into my code.

I discover that C# doesn't allow importing classes. Only namespaces. This is ok by me. I always figure that once you have a dependency on a class in a package, you really have a dependency on all classes in the package in a way. What I don't like is that now I don't have any idea, just by looking at the top of the class, what other classes it depends on.

Woohoo. A clean compile. It's complaining that there is no entry point but I can live with that. I haven't specified a main anywhere. At least we got a compile.

It's now a little after 12.30am. A bit of success has brought back my enthusiasm. I'll keep going.

Time for an interface. Bleh. Can't have public keyword on interfaces. Yeah Yeah I know. They're all public anyway. But as in Java, it annoys me that a missing visibility modifier means one thing for a class and another for an interface. At least in Java I can safely put the public there and it doesn't complain. Find+Replace. Done.

Now time to do one of the classes that implements the interface.

Ok. This is ridiculous. If I have an abstract class that implements an interface but doesn't implement all the methods, I still have to define the methods in the abstract class as abstract. I've defined the class as abstract. Can't the compiler work this out? Sheesh. Talk about needless typing.
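For comparison, a minimal sketch of the Java arrangement being ported. The interface and class names are invented for illustration. The abstract class implements only part of the interface and says nothing at all about the rest, which javac accepts because the class is abstract.

```java
// Hypothetical interface with two methods.
interface LineFilter {
    boolean accepts(String line);
    String name();
}

// Abstract, so it may leave accepts() unmentioned; Java does not ask for a stub.
abstract class NamedLineFilter implements LineFilter {
    private final String name;

    NamedLineFilter(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }
}

// The first concrete subclass is where accepts() finally has to appear.
class NonBlankLineFilter extends NamedLineFilter {
    NonBlankLineFilter() {
        super("non-blank");
    }

    public boolean accepts(String line) {
        return line.trim().length() > 0;
    }
}
```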

Uh. Now I discover I have to explicitly say virtual on all my methods to allow them to be overridden. I have no problem with saying override when I actually do override something, that's kinda cool, but I'm guessing the virtual bit is a hang-over from C++ and/or it makes it easier to optimize the output of the compiler because I, the developer, have to tell the compiler that there is no need to create jump-vector-tables for the method. Whatever the reason, combined with the previous stuff to do with interfaces, it's really starting to drive me nuts.
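The Java default being assumed here looks roughly like the sketch below, with made-up class names. Every non-private instance method can be overridden unless it is marked final, so the opt-out is the keyword rather than the opt-in. The @Override annotation is optional and only arrived with Java 5.

```java
// Hypothetical classes: format() is overridable by default, report() is locked down.
class PlainReporter {
    String format(int duplicateLines) {
        return "Found " + duplicateLines + " duplicate lines";
    }

    // 'final' is Java's opt-out; without it any subclass could replace this too.
    final String report(int duplicateLines) {
        return "[" + format(duplicateLines) + "]";
    }
}

class StarredReporter extends PlainReporter {
    @Override   // optional, but lets the compiler confirm this really overrides something
    String format(int duplicateLines) {
        return "** " + super.format(duplicateLines) + " **";
    }
}
```

Because dispatch is dynamic, report() called through a PlainReporter reference still picks up the subclass's format().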

Ok. It's now 2:45am and I'm honking along. Thankfully the compiler catches all the bits I forget to change. Sometimes the messages aren't particularly helpful but that may just be the mono compiler. Nothing to do with C# per-se.

I've done all the simple classes I could find. All the ones with few dependencies. Now it's time for some of the meatier ones. I think I'll keep going until 3.30am and then call it a night.

I have a bunch of decorators (I love decorators) but boy is it a pain in the butt to implement in C#. It's like alphabet soup after I've added all the necessary keywords: virtual, override, sealed, etc. Do Microsoft developers get paid by keystrokes? Anyway I'm getting there. It really is becoming quite mechanical now.

C# uses C++ style class extension. So instead of saying extends you use a colon :. You also call the super-class constructor using the C++ style colon after the constructor name instead of on the first line of the constructor. That's not too bad. Pretty much the same thing anyway. Oh and just to be different, super becomes base.

I've just spent the last 20 minutes learning about the collections classes. I'm beginning to think the .Net libraries were an R&D project that made it into the wild a little too early. 57 (I exaggerate a little hehe) different collection classes and none seem to do what I want.

Aha! There it is. StringCollection. Basically a Set implementation for strings. Ok now to iterate over them...

Grrrr. No Iterators. IEnumerator? Gimme a break. Instead of hasNext() followed by next() they use MoveNext() which returns true if it was successful and a Current property that is null if there isn't a current value. Ok. But I can change my for loops into foreach which is kinda cool and surely makes up for it. As I mentioned before, that's one of the features I'm looking forward to in J2SDK 1.5.
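The two Java loop styles in question can be sketched side by side. This is an illustration with invented names, and it uses generics and the enhanced for loop, so it needs Java 5 or later rather than the 1.4 JDK used at the time.

```java
import java.util.Collection;
import java.util.Iterator;

// Hypothetical helper: the same sum computed with both loop styles.
class IterationDemo {
    // Explicit iterator: hasNext() asks, next() fetches and advances.
    static int totalLengthWithIterator(Collection<String> words) {
        int total = 0;
        for (Iterator<String> it = words.iterator(); it.hasNext();) {
            total += it.next().length();
        }
        return total;
    }

    // Enhanced for loop: the compiler writes the iterator calls for us.
    static int totalLengthWithForEach(Collection<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }
}
```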

I've just converted some code that parses numbers and some that performs file I/O. Do you think that I could work out what exceptions might be thrown? I'm going head-first into the unchecked exceptions debate here and state outright that it is plain broken and wrong that the only way I can find out what exceptions can be thrown is to pray and hope that they were documented. My god! Not only that, but the I/O exceptions don't seem to extend any sensible base class. So now instead of hoping I've caught all the necessary exceptions, I'm forced to catch Exception.
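What is being missed can be sketched from the Java side, using hypothetical names. The throws clause is part of the method signature, so a caller has to catch IOException or declare it, and the compiler enforces that. One caveat worth stating: number parsing in Java (Integer.parseInt) throws the unchecked NumberFormatException, so that half of the complaint relies on documentation in Java as well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: counts lines from any Reader.
class CheckedExceptionDemo {
    // 'throws IOException' tells every caller exactly what can go wrong here.
    static int countLines(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        int lines = 0;
        while (reader.readLine() != null) {
            lines++;
        }
        return lines;
    }

    // Without this try/catch (or its own throws clause) the method would not compile.
    static int countLinesOrMinusOne(Reader source) {
        try {
            return countLines(source);
        } catch (IOException e) {
            return -1;
        }
    }
}
```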

I'm very glad I'm converting from fully tested, well designed (if I do say so myself, which I do hehehe) Java code because I truly believe at this stage that C# and the .Net libraries are woeful.

I admit, I've now around 3 1/2 hours of C# experience which clearly makes me an expert, NOT, but I struggle to see how you could take Java, and make it worse and less mature. They managed it though.

Ok, well now I'm screwed. IEnumerator doesn't allow you to remove from a collection whilst iterating. In fact it expressly says this isn't allowed. I've written a bunch of custom collection classes (for performance reasons) that I was hoping I could just ignore for now but looks like I'm going to have to convert them as well just to get the behaviour I need.
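The Java idiom that has no direct IEnumerator equivalent is Iterator.remove(). Here is a minimal sketch with invented names; it is not the custom collection code mentioned above, just the standard java.util behaviour, written with Java 7 syntax for brevity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: strips blank entries while walking the list.
class RemoveWhileIteratingDemo {
    static List<String> dropBlank(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);   // leave the caller's list untouched
        for (Iterator<String> it = copy.iterator(); it.hasNext();) {
            if (it.next().trim().isEmpty()) {
                it.remove();   // removes the element just returned by next()
            }
        }
        return copy;
    }
}
```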

It's now 3.45am. Time for bed. My brain hurts. Back into it tomorrow me thinks. I've done around 45% of the code base in a couple of hours. Not bad. It's pretty easy. I wonder what the Microsoft conversion tool is like :-)

It's not too bad so far. A few quirky things here and there. It surely looks like they've tried to make all their existing developers happy by keeping lots of language constructs the same as those in Microsoft flavours of C/C++, VB, etc. I can understand that. In fact I think in some ways it's remarkable that they think that way about their developers. But still it's a bit of a Heinz 57 varieties in places.