Sunday, January 14, 2007

Heisenbugs

I'm experimenting with writing a blog on ASiddur development. We'll see if this works as a method to communicate what's been going on.

I finally tracked down a bug in the latest development version that I had put off looking at. All that I saw was a NullPointerException with a short stack trace in the output when I ran in the simulator, and everything still appeared to work correctly, so I figured that I could wait until I was closer to releasing version 0.3. Then I got reports that people were unable to run the latest version at all, so I decided to look at it more closely.

I fired up the debugger, setting it to catch any NullPointerException. After wading through the ones that were thrown and caught by design (e.g. for the font images that are not supplied), I eventually got to the point where the main form came up, but no NullPointerException was being thrown. Given that all the code that did throw the exception was either platform code or code that's been around for a long time, I didn't think that was the problem. I'd found ASiddur's first Heisenbug. After adding some print statements and commenting out large blocks of code, I finally traced the occurrence of the bug to the following line:

getDisplay().setCurrent(get_MainForm());


This line runs at the end of the initialization sequence, and switches the displayed item from the splash screen/progress indication to the main form. Oddly enough, commenting out the line had no effect - I'd expected the main form to not show up at all, but it did anyway. Reading the documentation further, it appears that a newly created Alert (the type of the splash screen) has a default timeout. So the code that I added to read in and begin to process a binary-format tefilla file made initialization take long enough that the default timeout was triggered, and the splash screen had already given way to the main form by the time this line ran. Even so, from reading the documentation of the setCurrent() method, a null argument (which this wasn't) is allowed (although it would have meant that ASiddur requested to be paused). And you're also allowed to call setCurrent() with the currently displayed item (which would mean a request to come out of the background). So I have no idea why the call caused the NullPointerException, and much as I'm loath in general to blame the platform, I think that it was a bug in kvm. In any case, the workaround (to explicitly set the timeout of the splash screen Alert to FOREVER) is certainly correct code, so I'll leave it in and hope that it fixes the problem everywhere.
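
To make the timing concrete, here is a minimal plain-Java model of the race. It is only a sketch, not ASiddur's code and not the MIDP API: the class SplashModel, the 2000 ms default, and the 5000 ms initialization time are all made up for illustration (the real default timeout is device-specific). In the MIDlet itself the workaround is a call to setTimeout(Alert.FOREVER) on the splash screen Alert before it is shown.

```java
// Plain-Java model of the splash-screen timing. NOT the MIDP API:
// SplashModel and all millisecond figures here are hypothetical.
public class SplashModel {
    // Same value as MIDP's Alert.FOREVER constant.
    static final int FOREVER = -2;
    // Assumed default; on a real device the default timeout is platform-specific.
    static final int DEFAULT_TIMEOUT_MS = 2000;

    private int timeoutMs = DEFAULT_TIMEOUT_MS;

    // Stand-in for Alert.setTimeout(int).
    void setTimeout(int ms) {
        timeoutMs = ms;
    }

    // True if the splash screen would dismiss itself before an
    // initialization sequence lasting initMs milliseconds has finished.
    boolean dismissedBeforeInitEnds(int initMs) {
        return timeoutMs != FOREVER && initMs > timeoutMs;
    }

    public static void main(String[] args) {
        int slowInitMs = 5000; // hypothetical: init now includes parsing the tefilla file

        SplashModel splash = new SplashModel();
        System.out.println("default timeout, dismissed early: "
                + splash.dismissedBeforeInitEnds(slowInitMs));

        // The workaround: the splash stays up until setCurrent() replaces it.
        splash.setTimeout(FOREVER);
        System.out.println("FOREVER timeout, dismissed early: "
                + splash.dismissedBeforeInitEnds(slowInitMs));
    }
}
```

With the default timeout, a slow enough initialization means the splash screen is already gone when the final setCurrent() call runs, which is the situation that triggered the exception for me. With FOREVER, that call is always the thing that replaces the splash screen.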

1 comment:

Sara said...

It is all good as long as it doesn't take you all night :-)