Thursday Apr 01, 2010

How to Write a Memory Leak Unit Test


Unit tests are great for ensuring that your functionality is correct. But how do you make sure you don't have leaks in your code, leaks that
eventually cause your application to crash when it runs out of memory? Unit tests don't typically don't run into memory problems directly because they tend to start up, run a little bit of code, and shut down -- getting a nice fresh memory environment on each test start.



One great way to help protect yourself from uptime problems is to write leak tests. These are unit tests where you first perform some operations, then do normal cleanup, and finally you assert that the objects you were using during the operation have been cleaned up completely. In other words, that there are no references left anywhere on the heap.



Testing for leaks isn't very hard; the basic trick is to create a weak reference to your to-be-cleaned object (by constructing a WeakReference wrapping your object), then you delete all your local references to the object, then you perform garbage collection, and finally you see whether the weak reference still contains your object. This works because the weak reference is handled specially by the garbage collection system.



It turns out things aren't quite as easy as that, because garbage collectors try to be smart, and simply calling System.gc() doesn't mean it's going to actually perform a complete and final garbage collection. This means that your unit test could incorrectly conclude you have a leak, since the reference is still held.



The NetBeans team has built a really good test infrastructure for this. When I wrote unit tests for NetBeans plugins, I could simply call their assertGC method to assert that the given weak reference should be garbage collected, and it would handle the rest. (Internally, they do things like actually allocating a bunch of memory chunks to really force the garbage collection to run.)



Here's an example:


// The following unit test fragment wants to ensure that a graphics object
// which shows the current selection in the authoring tool, is cleared up after
// the user clears the selection.
view.selectAll();
Node firstHandle = /\* Node showing selection highlight, lookup code here \*/;
assertNotNull(firstHandle);

// Now clear view selection -- then assert that all the handles have disappeared
view.selectNone();

WeakReference<Node> ref = new WeakReference<Node>(firstHandle);
firstHandle = null;
assertGC("Selection handle leaked", ref);




When you write these tests you also have to make sure you null out any local variable references you are holding right there in the test.
The above test will pass if and only if the target object, firstHandle, is properly garbage collected.



But wait -- that's not the best part. Let's say you've written a leak test, and it fails. Now what? What do you do -- make the process pause and attach a profiler and try to hunt it down?



This is where their unit test support really shines. When the unit test discovers that the reference is still held somewhere, it uses its own heap walking library to hunt down the offending reference, and dumps it out as part of the test assertion failure message!!!



Here's the output of a real leak test failure:


Testcase: testMemoryLeak(SelectionHandlesTest): FAILED
Selection handle leaked:
public static java.util.List javafx.scene.Scene@dc8a29-$dirtyCSSNodes->
java.util.ArrayList@4ed14172-elementData->
[Ljava.lang.Object;@3c22de9e-[41]->
javafx.scene.shape.Rectangle@612e4cd

In other words, we see that there is a static list in Scene named dirtyCSSNodes which is holding on to our target Rectangle. The syntax here is that you first see the class name for the object (where a prefix of [L means it's an array-of), then its system id, then the field name - and if it's an array, the array index.



When the unit test fails, it takes a while -- 20 seconds or so for the above test -- to actually produce the above trace. But the important part is that this is only slow when the test fails. It only has to work hard when you have a leak, and you don't want to have leaks!



I've been wanting to write leak tests for the authoring tool (which is written in JavaFX), since uptime matters a great deal in a tool which deals with potentially large objects (such as multimedia assets). And I realized that there is absolutely nothing NetBeans specific about the NetBeans leak unit test support. So I went and pulled out the relevant code into a separate library. The library basically contains two parts: a .jar file which contains the INSANE heap walking library, and a second jar which contains the assertGC() unit test assertion method and supporting infrastructure.



I have extracted this into standalone pieces (outside the NetBeans infrastructure) so you can get the bits easily - download leaktests.zip, then add the two jars in there on your test classpath and call assertGc() and assertSize() from your tests as described above. Here's the basic skeleton for all leak tests:


// (1) Create your objects
// (2) Look up the object you want to ensure gets cleaned up later:
Foo foo = // code to get foo
assertNotNull(foo);

// (3) Call your cleanup code which is supposed to free everything

// (4) Create a weak reference to your object reference, and null out
// your reference
WeakReference<Foo> ref = new WeakReference<Foo>(foo);
foo = null;
assertGC("Foo leaked", ref);




All I've done is extract NetBeans code written by others so I've kept the licenses the same as for NetBeans. All credit goes to the author of INSANE, Petr Nejedly -- and to the authors of the memory assertion stuff in NbTestCase.java. All the source code for INSANE and NbTestCase are in the NetBeans mercurial repository.



In addition to assertGC, there is also assertSize(). This method can be used to ensure that the transitive size of an object graph is below a certain threshold! This can be good for writing tests to not only ensure that you don't have leaks, but that your data structures are of the rough expected size so you don't need excessive amounts of memory. There are more details on this on the INSANE home page.



One special note on JavaFX: The above leak isn't actually a leak; it is a deliberate optimization of the scenegraph, and the reference will be cleared up after the next scene pulse completes. Therefore, for unit leak tests, in addition to actually nulling out the weak references, you also want to run through a scene pulse as well. One really hacky, implementation-dependent and unsupported way to do that is to call scene.$scenePulseListener.pulse();. You probably want to isolate that in a utility method such that you can update it in one place when it needs to change...



Finally, note that I built this on JDK6. If there is interest perhaps we could create a wrapper project for this on Kenai or java.net, where people can also create say a Maven binary for this, a JDK 5 version (there is nothing JDK6 specific so it just needs a recompile, but I don't have JDK5 on this Snow Leopard Mac), etc. Hope you find this all as useful as I have!

Friday Dec 11, 2009

How to determine the JUnit 4 current test name


In my unit tests I often want to know the name of the test that is executing. For example, I often want to have the golden file (expected test output) computed automatically from the testname. As another example in my JavaFX testing, I often generate screenshots inside failing tests and it's useful to name these screenshots by the failing tests.



In JUnit 3 this was simple, since your testcases would extend a builtin JUnit class which had a method you could call to return the current test. However, with JUnit 4 that's no longer possible. I've googled and found the "correct" way to do it - using a special @RunWith to run the test class - but I find that solution unsatisfying. My utility methods which are invoked to read golden files and screenshot etc are in one place and I now have to decorate all my tests. Besides, I already have a @RunWith annotation on my tests because I want to run them on the event dispatch thread so I have a special test runner for that.



So, I've found a better way to do it. Better for me I mean - this may have problems and limitations I'm not aware of, but for all of my tests this worked wonderfully, and doesn't have the @RunWith requirements (though note that I don't do multithreading in my tests, other than invoke them on the event dispatch thread, so if you try to call this from a thread that didn't invoke the test it probably won't work):


public static String getTestName() {
// Try to find a method on the stack which is annotated with @Test -- if so, that's the one
StackTraceElement[] elements = new Throwable().fillInStackTrace().getStackTrace();
for (int i = 1; i < elements.length; i++) {
StackTraceElement element = elements[i];
try {
Class clz = Class.forName(element.getClassName());
Method method = clz.getMethod(element.getMethodName(), new Class[0]);
for (Annotation annotation : method.getAnnotations()) {
if (annotation.annotationType() == org.junit.Test.class) {
return element.getMethodName();
}
}
} catch (NoSuchMethodException ex) {
} catch (SecurityException ex) {
} catch (ClassNotFoundException classNotFoundException) {
}
}

// Just assuming it's the calling method
return elements[1].getMethodName();
}


As with most of my test utilities, it's a public static method living in a class called TestUtils, which I statically import from my test cases such that I can simply reference the test name getter like I would in the JUnit 3 days:

import static org.junit.Assert.\*;
import static my.package.name.TestUtils.\*;

/\* ... \*/

screenshot(scene, getTestName());


By the way if you're using JavaFX you might be interested in the screenshot utility method. It's really simple:

public static File screenshot(Scene scene, String fileName) throws Exception {
BufferedImage image = (BufferedImage) scene.renderToImage(null);
if (!fileName.endsWith(".png")) {
fileName = fileName + ".png";
}
File file = new File(getScreenshotDir(), fileName);
file.createNewFile();
ImageIO.write(image, "png", file);
return file;
}

(where obviously getScreenshotDir() returns a File folder where you want your screenshots generated. A decent default implementation is return new File(System.getProperty("java.io.tmpdir")); ...)

Thursday May 14, 2009

Run Tests - Without Focus Loss!


I like unit tests - but running them can be painful. Commit-validation tests
which bring up UI are obviously annoying, but even simple unit tests that
get in the way. Does this look familiar?







The above menubar should look familiar to anyone on a Mac who's run unit tests for client side
Java code. Not necessarily GUI tests, just any test where GUI libraries are loaded.



For every testcase, the test runner fires up a new process, which tells
OSX that "I'm graphical, give me focus!", and this steals the focus from the user.
The test finishes quickly thereafter, the process quits - and then the test runner
goes to the next test and the whole cycle starts over.



This is really painful because I like to have lots of tests. The Python editor support
for Netbeans had 600+ tests;
the counts for the JavaScript support was higher, and
the Ruby support even higher than that. Whenever I run tests, I basically have to
fight with my computer to get focus. Forget trying to write anything - every second
or so my keystrokes get stolen as the next test grabs focus - so I've gotten in
the habit of using the time for browsing, since I'm mostly reading, and I can handle
clicking a link a second time if the first click got lost. But every now and then
somebody will ping me on instant messaging - and it's maddening trying to respond while
this is going on.



If this is sounding painfully familiar to you, I have good news. I've finally figured
out a setup where this is no longer a problem!



The key discovery was that I can run my tests from another account on the system.
With OSX' fast user switching, I can switch to the alternate account, launch the
unit tests, and return back to my regular account where the tests won't interfere
with display focus. In order to let me run my tests from that other account, I
just open a terminal there and su -l tor in the shell to run all the
commands (or NetBeans) as myself.






This was a huge improvement since it removes the 10-30 minutes testrun downtime.
But it had some disadvantages - first, I don't like running tests from a shell, and second,
it's hard to know when things finish - and switching back and forth to check is annoying
since I always have passwords on my accounts so the machine isn't open.



So the second step was to set up Hudson (a
continuous integration server that is trivial to setup, and has a huge number of
plugins which makes
graphing code coverage, unit tests, findbugs results, integrating with version control
systems etc trivial. And it's
not just for Java developers
.)



Instead of logging into the other account
for each test run, I log into the other account once, and start up Glassfish
with Hudson running. From now on I can access, configure and launch builds right
from my own browser in my primary account. The key step here is that Glassfish was
started from the secondary account, so its primary display is associated with the builder account.
When my build in Glassfish gets to the test stage, it's actually doing the display
connection just as before, and if I log into the secondary account, I get the annoying
focus flashing just as before. Look - the tested process is a user visible application in the dock:






Another improvement which really helped is the "File System" version control plugin for Hudson.
I want my Hudson builds to build my current working copy. I don't want to
check my code into Mercurial (even a local repository) just so that Hudson can
grab the code and build it. I want Hudson to build my current changes - my current
edits. After all, I'm trying to test them before checking in! And I discovered that
there is a plugin which will let me do that - it's just
a "file system" version control wrapper - which means you just point Hudson to your local
directory, and off it goes. When the build starts, it makes a quick disk-copy of the
source tree. Even though Mercurial cloning is pretty fast, this is even faster.
The disk copy also lets me specify a filter of files to exclude, so I had it ignore
\*.class files. The diskcopy only takes 10 seconds or so before the build kicks
off, and it's building a snapshot of my current in-progress, edited working copy!
(It can also just update the copy based on changed filestamps - that's even faster,
but it didn't seem to correctly delete removed files, so I let it start from scratch
each time.)







(Note - this plugin isn't in the catalog that you can browse right from within the
Manage Plugins page within Hudson; I downloaded the .hpi file from
http://wiki.hudson-ci.org/display/HUDSON/File+System+SCM
and installed it in the Advanced tab.)



The final ingredient is the new Hudson support in NetBeans 6.7. I don't even have
to go to the browser to kick off build jobs. I just point NetBeans to my Hudson
server once, and from then on I have full Hudson integration. When I want to
run my tests I just select Launch job:







I get notified if there's a problem:







I can look at failures and logs:







I can see build logs etc. directly in the output window, and hyperlinks warp to directly to files - to the files as they were in the build, not the current state:






So to recap - with this setup, as I'm editing my code and I want to check the tests,
I just right click on a node and say "Start Job" - and off it goes without bothering
me at all - no more focus interruptions, and no more GUI windows popping up from
interactive tests. It's trivial to check the results. And it's even added one more
level of convenience: I have multiple projects, each with unit tests, and from the
IDE I couldn't have them all run with a single gesture. My build job does that.



I'm really stoked! I was at one point able to do this when I was working on Linux
and Solaris by setting my $DISPLAY variable and doing tricks with VNC. But that
still required my tests to run in a console - which made interpreting the results sucky.



If you haven't played with Hudson, try it - it's unbelievably easy to set up. Just download
the Glassfish appserver and install it, download the Hudson .war file, and drop the hudson.war
into the autodeploy directory of the appserver, and browse to localhost:8080/hudson.
Once you're there you can install
plugins (under Manage hudson), point to your local installations of the JDK, ant,
etc., and configure your build jobs by running scripts, launching maven scripts, writing
ruby scripts, or obviously running ant scripts.



Some final miscellaneous tips:


  1. I don't want Time Machine to back up my builds trees, or Spotlight to
    index data in these directories, so I went to the TimeMachine preferences and had it
    exclude the ~/.hudson/jobs/ directory.



  2. I did the same thing for Spotlight - but unlike
    the Time Machine preferences, there was no checkbox to "Display Invisible Files" (e.g.
    files that start with a dot, such as .hudson) in its file chooser. Here's a tip I didn't
    learn until recently: When a Mac filechooser has focus, you can press slash (/) - and this
    will open a text field where you can directly type the path it should jump to. I typed
    /Users/tor/.hudson and from there I was able to select the jobs directory to exclude.



  3. You might be tempted to skip the "filesystem version control" plugin and just have
    your build symlink to your working copy. Be careful; if Hudson is configured to delete
    older builds you might find yourself without your source code. I'm not saying it will
    follow your symlinks - Java has support for symlinks now - but I haven't tried it, and
    I have been bitten by ant in the past where it decided to follow symlinks in its
    zeal to delete recursively!




  4. I recently discovered that you can reorder the build steps in a Hudson job. The little
    graphic to the left of a build task is a handle you can just drag to reorder!






Thursday Dec 18, 2008

JavaOne Submissions Due


The deadline for submitting talks for JavaOne 2009 is almost here. I was on one of the review committees last year, and saw many common patterns for rejected talks. I wrote these up in a blog entry last year - "Why Your JavaOne Submission Was Rejected. You might want to skim through it again while reviewing your submissions one final time!





Also, there's a new book about develping Ruby and Rails code with NetBeans, written by Chris Kutler and Brian Leonard. I reviewed some of the content and it looked very good. There are a lot of books on Ruby and Rails, but this book really shows you how to do things effectively with NetBeans. Mastering your tools is one of the best gifts you can give yourself as a developer (or as a craftsman of any craft, I think). I gave this advice a while back when I was interviewed for the JavaOne daily newspaper. Highlights from these interviews were recently compiled in this article, containing advice for students from a variety of developers. When this article was discussed on DZone I was surprised to see some people still talk about IDEs as something to be avoided until you know what you're doing. I think precisely when you are new to Java should you use an IDE to help you explore IDEs, to pinpoint errors are you're typing code rather when you try to run javac from the command line etc. The same goes for Ruby, Python, PHP, and so on. The IDEs let you explore APIs using code completion and go to declaration, they present important operations in project and execution menus, quickfixes and semantic warnings pinpoint errors in your code, and so on. If you're still writing code with Notepad, you're missing out.


Sunday May 11, 2008

JavaOne 2008


JavaOne 2008 is over. As usual, it was a great event, but with the stress and hard work leading up to the conference, it's a huge relief that it's all over. For my part, it was another extremely busy JavaOne, with two keynote demos, a technical session, a panel discussion, a BOF, as well as three presentations at CommunityOne the day before. If you add to that the prep time for these (keynote software setup and rehearsals, slide planning etc.) there was barely time for anything else, so despite my best intentions I didn't get to meet up with a lot of the out-of-towners visiting JavaOne that I had planned to. I only made it to three technical sessions - and in all three I learned something. Hopefully the rest of the sessions that I missed had the same high level of quality.



I was interviewed by the JavaOne paper on Wednesday; the online interview is here. If you read my blog you might find it interesting. However, in the paper version of the interview, something went horribly wrong... Some of the questions and answers attributed to me were from a previous interview! In particular, I'm found endorsing the Flex SDK, as well as talking about my "math background". Those parts were from the previous day's interview with Chet Haase!
If you're wondering why I'm referred to as a "Rock Star" in the interview, that's a JavaOne thing. The top 20 highest rated talks each year (as determined by the speaker survey forms collected for all talks) earn their speakers a lifetime "Rockstar" title at JavaOne. And speaking of Rockstars, I've got independent verification that I am one (wink, wink) since Ed Burns also interviewed me (and the other JavaPosse guys) for his Secrets of Rock Star Programmers book. The interview was conducted a year ago, but the book is out now. I've read some of the chapters already and really enjoyed it.



It's always exciting to be part of the keynote demos. It's a huge production with over 10,000 people in the keynote hall. A lot of work goes into it. There's a control room in the back, NASA ground-control style. Arun Gupta snapped a few pictures during rehearsals this year and posted them here. The 10th picture gives a sense of the size of the hall. On the left is a drawing one of my kids drew of how she imagined my keynote demo - it would be nice if we actually had chairs to sit on. I guess when you're a kid you wouldn't imagine standing up while doing things on a computer! Anyway, you can see the webcasts from the keynotes - they are all available here. In particular, you can see our Tic Tac Toe demo here (about 2:20 into it), and the JavaScript editing demo here (about 19:45 into it).



The Java Posse BOF was another highlight for me. We had a short appearance at CommunityOne, but with just 20 minutes we didn't quite get into the groove. Thursday night for our BOF however, and with beer, we had a great atmosphere. We had feared a really low turnout since our BOF was scheduled smack in the middle of the Smashmouth concert - but that turned out not to be a problem, either because it was cold outside, or because a lot of people hadn't heard of Smashmouth (think the movie soundtrack from Shrek), or perhaps because we have really loyal listeners! If so, thank you!! A contingent of Norwegian listeners came up and handed us a lot of Norwegian chocolate! I didn't catch your names - but thank you very much! My local gym also thanks you...



My brother (who also works at Sun and has his own blog) has been staying with us for the last month - first for the MySQL conference in April, then JavaOne. It's been great having him here. We tried to get him on the air in one of our podcasts, but he refused - so instead we embarassed him during our BOF with a dedicated slide and tribute! He headed back to Norway this morning; Trond, it's been great to have you here and welcome back.



I'm taking a few days off now to catch up on sleep and chores!

Tuesday Jul 24, 2007

Code Advice #15: Name getters properly!


(See intro for a background and caveats on these coding advice blog entries.)


It's been a while since I've posted Java-coding opinions. But I couldn't resist on this one. The tip itself is straightforward:


A getter method's name should not sound like an action method.

Getter methods and setter methods should not have the same name.



I'm speaking from bitter experience since I just fixed a bug where my code had accidentally been calling a getter method, when I thought I was calling a setter method.
The method in question is java.lang.ProcessBuilder#redirectErrorStream. If you're launching a process, and you want to read its output, you may want to redirect its error output to its standard output such that you only have to read the input from a single source (assuming you don't care to distinguish between output and error.)



Well, my code was initializing the ProcessBuilder:


ProcessBuilder pb = new ProcessBuilder(args);
pb.directory(pwd);
pb.redirectErrorStream();



The problem here is that redirectErrorStream() does NOT redirect the error stream! It just returns false to tell me that it's not yet planning to redirect the error stream. The correct way to do this is


ProcessBuilder pb = new ProcessBuilder(args);
pb.directory(pwd);
pb.redirectErrorStream(true);

That's right - the setter and the getter are overloaded! Ewwwww!!!



This is the kind of thing a bug detection tool like findbugs could detect. It already warns if it sees you doing something similar, like calling String.trim() without storing the return value. Unfortunately, it didn't warn me about this case - so I filed a request. But this was a great reminder to run findbugs on my code again, which turned up some other interesting things to look at...

Monday May 14, 2007

It's Over!


JavaOne 2007 is over. I haven't been blogging lately, and I can truly blame that on JavaOne. It turned out really busy for me: Two keynote demos, one technical session, two Java Posse appearances (our BOF, and NetBeans Day), and a half hour demo at NetBeans day. In addition, the Java Posse got full press treatment this time around, so we had tons of interviews lined up. The picture below was taken right after the Friday keynote when we chatted with James Gosling - in full Posse regalia.






If that doesn't sound all that busy, remember that keynote demos have rehearsals as well as machine setup times, the technical session needed slide edits, etc. I haven't posted any Ruby feature updates recently, and that's partially because I've been busy preparing for JavaOne. That doesn't mean the time has been wasted, though - in actually using the Ruby support as opposed to just building it and testing simple scenarios to see whether a new feature works or not, I discovered lots of issues that have been fixed. Thus, the current trunk build is in much better state than the Milestone 9 bits. Somebody asked us after the Ruby Tooling talk whether the bits we were using are available. They are; get NetBeans 6 milestone 9, and then install the trunk Ruby bits. There are some new features to use as well - I will document those in a separate blog entry.



First however I need to go through my inbox and catch up on all those e-mails I've been ignoring lately. If you're one of them, I sincerely apologize.



Finally, let me end by saying that it was really fun meeting so many of you at JavaOne. To those of you who stopped by and just said thanks for either the Ruby support, or the Java Podcast podcast, I truly appreciate it. Knowing that people find it fun or useful really inspires me to keep going. And don't forget, the Java Posse is doing a live recording tomorrow night at the Silicon Valley Java Users Group, hosted by Verisign. There will be free pizza and beer - and we'll be summing up our impressions from JavaOne. Hope to see some of you there!

Tuesday Jan 02, 2007

Code Tip: Statically Check Multiple Interfaces

You may think generics is all about avoiding casts in collections. However, I found a really good use for them the other day that's pretty useful.

Let's say you have a method which takes a List and does something with it.

    void doSomething(List list) {
        ...
    }
(In my code, it's actually List<String>, but I'm trying to keep other generics out of this to motivate those of you still needing a good reason to learn them.)

In my case, I needed to iterate over the list in reverse order. Uh oh, that's going to be pretty expensive for some types of lists, and if I switch my implementation to for example a LinkedList in the future, I'd like to revisit this. Obviously, I could make my method the following:

    void doSomething(ArrayList list) {
        ...
    }

However, using specific implementation classes rather than interfaces is frowned upon, and besides, this needlessly prevents the method from being used with other random access list. On the other hand, I would like to state in some way that this method really wants to be able to access the list elements in random order.

java.util.RandomAccess to the rescue. Lists that provide random access implement this interface. But what do I put in my method signature? RandomAccess is not itself a List. I really want to specify multiple constraints on my parameter.

This is precisely what generic methods allow! The following method declaration will accomplish what I want:

    <T extends List & RandomAccess> void doSomething(T list) {
        ...
    }

Here we're saying that my parameter T is a generic type, which satisfies multiple constraints: it both extends List and it implements RandomAccess! You can pass an ArrayList to the above method, but not a LinkedList!

Think hard about the constraints you add. In some cases, adding a RandomAccess constraint on a method is leaking implementation through to the client - it reveals something about the implementation behind the method signature that may change later. I find this most appropriate within implementation code.

Sunday Nov 12, 2006

Java Open Sourced - Podcast Special

It seems I'm late to join the blogging party; just about everybody has covered the news already. In the unlikely case that you haven't read about GPL'ed Java, read just about any blog on blogs.sun.com (and many on java.net). The main resource page is http://www.sun.com/software/opensource/java/.

We've got a special episode dedicated to this on the Java Posse. It's an hour interview with Mark Reinhold, Rich Sands and Eric Chu discussing just about everything related to this. Please check it out (and digg it).

Sunday Oct 15, 2006

Code Advice #14: Don't initialize fields to default values

(See intro for a background and caveats on these coding advice blog entries.)

If you've been coding in C, you've probably picked up the habit of initializing all your fields:

    char \*foo = NULL;
    int bar = 0;

This is necessary, because in C, memory can be left uninitialized, so there's no telling what value foo will have before it is assigned something.

In Java, however, the language specification clearly defines default values, so virtual machines will for example always initialize reference fields to null.

Specifically, this means that code like the following is redundant:

    private int foo = 0;
    private Bar bar = null; 
    private boolean baz = false;

The alternative form of initializing the fields explicitly in the constructor, is the same:

    private int foo;
    private Bar bar; 
    private boolean baz;

    public Foo() {
        foo = 0;
        Bar bar = null;
        baz = false;
    }

You can leave out the above initializations, and the program will behave the same way. Carl Quinn once convinced me that this was more readable, so I picked up the habit, and I now swear by it. On the one hand, you can argue that leaving the explicit initializations in is more readable because you're making it really clear what it is you are intending. On the other hand, Java programmers quickly learn what the default values are and understand that an uninitialized field is the same as a nulled out field.

It turns out that the two forms are not exactly identical! If you disassemble the above code, you'll find the following bytecode is generated:

   4:   aload_0
   5:   iconst_0
   6:   putfield        #2; //Field foo:I
   9:   aload_0
   10:  aconst_null
   11:  putfield        #3; //Field bar:LBar;
   14:  aload_0
   15:  iconst_0
   16:  putfield        #4; //Field baz:Z

If you leave out the initializations, none of the above bytecode is generated. In a simple microbenchmark there was a measurable time difference (about 10%) for the case where a handful of fields were initialized versus leaving them uninitialized, so it's clear that Hotspot doesn't completely eradicate the differences here. (<speculation>Perhaps it just zeroes out the whole object memory block when allocating a new object, relying on the default values to all be represented as zeroes natively, and this is done even when all fields are later initialized?</speculation>)

Obviously, speed considerations is not what should be the deciding factor here. This was a microbenchmark doing nothing other than construct objects with and without initialization - it's unlikely that you'll find a measurable difference on a real program. What really matters is readability. I should also point out that Findbugs will treat these two scenarios differently. If you print out a field value in the constructor, it will warn about an uninitialized field if the field was not explicitly initialized in the field declaration.

I can see arguments both for and against explicit field initialization, but I think this is one of those cases where convention wins. I personally find code cleaner and more readable when you leave your fields with implicit rather than explicit initialization.

P.S. Remember that null fields are often a bad idea and should be initialized to null-objects!

Wednesday Sep 20, 2006

Working with classes that (unfortunately) call overridden methods from their constructors

I was struggling with a problem today, and I found a pretty decent workaround. Since initial googling hadn't turned anything useful up, I figured this writeup might help others who search for it in the future.

In essence, the problem boils down to "How can you initialize your own object before the superclass constructor has run?"

Let's express this in code - I've changed the example to a simple domain to make the intent obvious:

/\*\* Represents a building, such as a house, a garage, etc. \*/
public class Building {
    public Building() {
        // Do something with the result of getColor(), such as
        reservePaint(getColor());
    }

    protected abstract Color getColor();
}

The above class is the "framework" class I'm trying to extend. I do not have the option of changing it or extending something else; this is the integration point.

Note the error above: the constructor is calling an overridden method. This is a well documented problem. Previous uses of the above class had probably not run into it, because most usages would be something like the following:

public class Skyscraper extends Building {
    protected Color getColor() {
        return Color.LIGHT_GRAY;
    }
}

Since the overridden method typically returns a constant that is independent of the subclass object being properly constructed, calls from the superclass constructor did not cause problems. The problem was that in my case, I'm dynamically constructing many different types of Buildings, so I want to parameterize the getColor() method:

public class PaintedGarage extends Building {
    private Color color;

    public PaintedGarage(Color color) {
        this.color = color;
    }

    public Color getColor() {
        return color;
    }

Unsurprisingly, this fails at runtime. (When you construct the PaintedGarage object, the superclass constructor, Building(), will be called before the PaintedGarage constructor is called, and when it calls the getColor() method, it will read the color field which has not yet been initialized.)

With that long motivation: What can we do about this? Following the Effective Java practice of not calling overridden methods from the constructor is not an option. Remember, that is done in the superclass, and that is a class we have no control over - it's part of the framework we are plugging into.

In an earlier version of this, I was already computing classes on the fly (using a byte code generator library), so it was simple for me to simply inline the literal return value in an overridden version of getColor.

However, I no longer need to do that for other purposes, so is there a way for me to ensure that getColor gets initialized by the time the super class constructor calls it?

Yes!

The trick is to use an inner class. The process is as follows. First, file a bug against the framework you are using to get the constructor behavior changed. Second, try something like this:

public class PaintedGarageFactory {
    private Color color; // Moved out of PaintedGarage and into enclosing (not super!) class

    private PaintedGarageFactory(Color color) { // Use factory method below
        this.color = color;
    }

    class PaintedGarage extends Building {
        public PaintedGarage() {
        }

        public Color getColor() {
            return color;
        }
    }

    public static PaintedGarage create(Color color) {
        PaintedGarageFactory factory = new PaintedGarageFactory(color);
        // Notice how the outer class is fully constructed before the
        // inner class (and its superclass constructor) is built.
        // Each factory instance constructs PaintedGarages of a particular
        // color (and could be reused etc.)
        return factory.create();
    }

    private PaintedGarage create() {
        return PaintedGarage();
    }

}

What this has done is move the parameters that need to be initialized before the superclass constructor is run, out to the outer class. These are obviously initialized before the innerclass is constructed.

This is not a good way to write code, but it solves the immediate need: You've provided an implementation of the super class that is properly constructed as far as the superclass constructor is concerned.

Don't forget to get the framework fixed.

Thursday Sep 07, 2006

Welcome JRuby

Sun has hired the two primary JRuby developers, Charles Nutter and Thomas Enebo. This is great news for JRuby, because it will now be their full time job to work on it. (JRuby is an implementation of the Ruby language that runs on top of the Java platform.)

This obviously fits well with our strategy to support multiple languages on the JVM, and in particular, dynamic and scripting languages. In fact, Charles and Thomas is joining the group I'm in, so hopefully a lot of the tool support we have built to support BASIC will also benefit JRuby. Going forward there will probably be a lot of cross pollination.

Welcome to Sun, Charles and Thomas! Charles has already blogged about joining Sun.

Thursday Aug 24, 2006

More on Nullness

The other day I wrote about using annotations to state your intent regarding null for fields, parameters and return values. This can help static checkers like findbugs find flaws in your code, where you fail to check return values for null where that is known to be risky. (In fact, findbugs does not only base knowledge of nullness on your use of annotations; it also has a builtin database of system API null behaviors, since these classes have not been "instrumented" with nullness annotations yet.)

I did however mention one big problem: These annotations are not (yet) standard - so you'll need to choose to use them from somewhere - and add third party imports to all your source files, add a foreign jar to your build, etc. Findbugs' set of annotations is probably as good a candidate as any. But while the annotations are licensed under LGPL, which should be fine for most of us, some companies (or at least their lawyers) are weary of mixing GPL anything with proprietary code.

Our lawyers are generally fine with LGPL usage, but I was still wondering if there was a way I could "isolate" myself by using an annotation from my own project, and then somehow tie that to a standard annotation in a single place. If annotations were like classes this would be easy - I could simply extend the findbugs one and I'd be done. But annotations don't work that way. So I thought I'd take a look at findbugs internals and see what's going on.

And I was surprised, and happily surprised at that, to discover that findbugs is very tolerant in how it deals with nullability annotations. It is not favoring its own annotations. In fact, all it checks for is whether the annotation base name matches the name findbugs has chosen. This was probably done deliberately, to ensure that findbugs works with code annotated for other tools, such as IntelliJ. This is really a great idea - and it doesn't seem risky to me; I can't think of many other uses for an annotation named "CheckForNull" or "NonNull". Of course, if a problem ever arises they can always update findbugs to be more discriminating for users who want to use home grown annotations where those names mean something else.

So, you can simply define your own project wide nullability annotations (they are simple marker annotations with no members), and then findbugs will happily do its magic. Worked like a charm for me:

Null Handling: Much ado about nothing, literally.

P.S. For more about the built-in set of null knowledge of Java standard APIs, see the class NullnessAnnotationDatabase in the findbugs source distribution (which handily comes distributed as a NetBeans ant-based project - thanks!)

Tuesday Jul 25, 2006

Code Advice #11: Initialize fields using the property name

(See intro for a background and caveats on these coding advice blog entries.)

How do you initialize fields in a constructor? And how do you initialize fields in a JavaBean setter method?

Here's the, in my opinion, correct way to do it:

    void setTemperature(int temperature) {
        this.temperature = temperature;
    }
The same scheme is used in constructors.

There are various other ways to do this. Dave Yost argues that you should use the following pattern:

    void setTemperature(int newTemperature) {
        temperature = newTemperature;
    }
Another coding style guide recommends that you "consider using the prefix a or an with parameter names. This helps make the parameter distinguishable from local and instance variables":
    void setTemperature(int aTemperature) {
        temperature = aTemperature;
    }
A third scheme I've seen quite frequently is choosing a parameter name that is an abbreviation for the property name being set:
    void setTemperature(int temp) {
        temperature = temp;
    }
A fourth suggestion is to use underscores as suffixes on the field names themselves:
    void setTemperature(int temperature) {
        temperature_ = temperature;
    }

So, why do I think my way is the right way to do it, and not the other alternatives shown? As always, the first reason is that the this.x=x pattern is the most common way to do it - and as a consequence more developers will find your code clear and pleasant to read.

However, there are logical reasons to do it this way too. The primary reason is that the signature of your method is part of its API! If this is a public method, the parameter should be included in the javadoc with a @param tag. With the first three alternatives, the parameter names are newTemperature, aTemperature and temp. I think temperature is a better parameter description. Yes, it's true that for a setter method like setTemperature, it's not that important what the parameter name is since it's obvious what it's for. But when you're dealing with a constructor, you do need to be descriptive. And newTemperature is not a proper name for an initial state. You could switch to using initial instead of new as a prefix, but in addition to being wordy you now have different schemes for constructors and setters. Having one approach to both is beneficial since it keeps things simple.

The last alternative does let you use a good parameter name - the same one that I'm proposing, temperature. However, it has another disadvantage in that it's using an underscore_ as a suffix for the field name. This to me smacks of Hungarian notation, although rather than describing the object type with a prefix it describes the symbol class with a suffix. I don't like it since it makes the code less readable, but I suppose that's a topic for a whole other discussion. In short though, most Java programmers avoid underscores in all symbols but constants (such as public static final int WATER_BOILING = 100;).

The proposed style does however have one problem. It relies on "polluting" the namespace in the constructor or setter with a parameter which hides the field name. In the following code, temperature refers to the parameter, not the field:

    void setTemperature(int temperature) {
        this.temperature = temperature;
    }
What happens if you accidentally write the wrong parameter name - either by a typo, or by writing a similar name and not noticing that the parameter name is a different variation of the field name?
    void setTemperature(int temp) {
        this.temperature = temperature;
    }
The above code will compile just fine. However, it is probably not what is intended. If you call this method, the temperature field will not be updated - you are simply assigning the current value of the field to itself.

Luckily, your programming tools will find these errors for you. Obviously, a tool could warn you that the temp parameter is unused, and that would clue you in. But most people do not want warnings on unused parameters, since they occur frequently in a lot of code - especially when you implement interfaces.

However, the code above has a "self assignment" - the left hand side of the assignment is identical to the right hand side. This code is obviously redundant and can be removed - but more importantly, it usually points to an error of the above form. Therefore, if you run findbugs (NetBeans plugin here), it will clue you right in to the problem:

Jackpot can find self assignments as well. The following code is courtesy of Tom Ball and will be part of the Examples shipped with Jackpot (and hopefully be built in as well).

import org.netbeans.jackpot.query.Query;
import javax.lang.model.element.Element;
import com.sun.source.tree.AssignmentTree;

public class SelfAssignmentQuery extends Query {

   public Void visitAssignment(AssignmentTree tree, Object p) {
      Element lhs = model.getElement(tree.getVariable());
      Element rhs = model.getElement(tree.getExpression());
      if (lhs == rhs) { // Elements are unique so == is okay
         addResult(getCurrentElement(), tree, "self-assignment found");
      }
      return super.visitAssignment(tree, p);
   }
} 

Armed with the right tools, the this.x=x pattern works because you get to use the "right" parameter name, and the tools will catch your mistakes.

Monday Jul 17, 2006

Code Advice #10: Place brackets with the declaration type, not the name

(See intro for a background and caveats on these coding advice blog entries.)

The most important code style rule is that when you modify somebody else's code, you preserve the coding style in that file. Even when it hurts. The only thing worse than an "incorrectly" styled source file is an inconsistently styled one!

One of my coworkers writes Java code in a very C-inspired way. One of the habits I have difficulty with is the bracket placement for arrays. I thought I would dedicate a blog entry to this, because it occurs in many other code bases. A quick scan revealed a couple thousand matches in the JDK source code (which, granted, is a tiny fraction of the overall codebase).

Briefly stated, the correct way (as far as I'm concerned) to place brackets in an array declaration is with the type, not with the variable name. Therefore, this is okay:

     int[] scores;
and this is not:
     int scores[];

Here's a real-world example, from String.java:

    @Deprecated
    public String(byte ascii[], int hibyte, int offset, int count) {
        checkBounds(ascii, offset, count);
        char value[] = new char[count];
    ...
As you can see, there are two violations here - both in the parameter usage, and in the declaration of local variable "value".

Most of you will probably just nod at this and move on, since it's what you're already doing. But if there's one thing I've noticed, it's that coding style blog posts always generate controversy - and that especially those of you doing it the Other Way are reluctant to change.

So, let me spend a couple of paragraphs arguing why you should place the brackets with the type name, not the variable name. The most obvious answer is that brackets-with-type is what by far most Java developers are using, so you should simply do it to make your code consistent with the accepted practice.

However, it does have some logical reasons too.

First, let's take a look at a method declaration that returns an array - this is java.util.Arrays.copyOf:

    public static int[] copyOf(int[] original, int newLength) {
Notice how the brackets appear with the type declaration. You do not have the luxury of moving them to the end of the line, or even just past the variable name:
    public static int copyOf[](int[] original, int newLength) { // WRONG!

Note that if you have code like this:

    int[] a, b;
Here both a and b will be integer arrays. This is different than in C where the similar
    int\* a, b;
would leave only a an array, and b a scalar!

Let me anticipate the objections. I can think of two advantages for placing brackets with variable names. First, it's your only option if you want to mix and match plain and array declarations of a type in the same statement:

    int a, b[], c, d[];
However, we can dispense with this quickly: you should not compress declarations this way. Use a separate statement for each.

The second objection, which I have some sympathy for, is that placing brackets with variable names is consistent with the way array instances are used:

    int foo[];
    ...
    foo[i] = -1;
However, between declaration and usage we have the initialization, and the initialization uses the typename with brackets:
    int[] foo = new int[50];
    foo[i] = -1;

An easy way to find violations of this coding style is to use Checkstyle. Note that while javac will happily compile both declaration styles, the error message from checkstyle is:

Array brackets at illegal position.
I guess that's stretching the definition of "illegal" a bit (since the Java Language Specification defines what is legal and what is not, and both forms are allowed). However, in my opinion it is good practice to place the brackets where most Java developers place them - with the typename in declarations.

About

Tor Norbye

Search

Archives
« July 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
  
       
Today