By kto on Dec 07, 2009
Everything was a little fuzzy in Livermore, California this morning:
A very rare snowfall for this area, we are at 500 feet above sea level, snow was falling as low as 200 feet early this morning. But that's just my "Fuzz" picture intro...
I read an article on Secure Software Testing and it talked about Fuzz Testing, and now I'm wondering if I've become a Fuzz Tester, and what exactly does that mean? :( Should I see a doctor? "Fuzz", what a funny word, back in the 70's that was one of the more polite slang words for the Police.
Years ago, when I worked on C compilers, one of the first things my fellow Fortran compiler developer did to test my C compiler was to feed it a Fortran source file as input. If my compiler core dumped instead of generating an error message, it would have failed the "garbage input" test. I consequently would cd into his home directory and run "f77 \*", but I could never get his damn Fortran compiler to crash, maybe it just liked eating garbage. ;\^) Anyway, it appears this kind of "garbage input" testing is a form of "Fuzz Testing", feed your app garbage input and see if you can get it to misbehave.
So back to my primary topic at hand, OpenJDK testing. Lately I've been trying to get to a point where running the jdk tests is predictable. I've done that by weeding out tests that are known failures or unpredictable, and running and re-running the same tests over and over and over, on different systems and with slightly different configurations (-server, -client, -d64).
After reading that Fuzz article, I've come to the conclusion that I've been doing a type of Fuzz Testing, by varying one of the inputs in random ways, the system itself. But that's a silly thought, that would make us all Fuzz Testers, because who runs tests with the exact same system state every time? Unless you saved a private virtual machine image and restarted the system every time, how could you ever be sure the system state was always the same? And even then, depending on how fast the various system services run, even a freshly started virtual machine could have a different state if you started a test 5 seconds later than the last time. And what about the network? There is no way to do that, well I'm sure someone will post a comment telling me how you can do that. And even if you could, what would be the point? If everything is exactly the same, of course it will do the same thing, right? So there is always Fuzz, and you always want some Fuzz, who really wants to be completely Fuzz-less? My brain is now getting Fuzzy. :\^(
When looking at the OpenJDK testing, I just want to be able to run a set of robust tests, tests that are immune to a little "system fuzz". In particular system fuzz created by the test itself or it's related tests, "self induced fuzz failures" seems to be a common problem. Sounds like some kind of disease, H1Fuzz1, keep that hand sanitizer ready. Is it reasonable to expect tests to be free of "self induced fuzz failures"? Or should testing guarantee a fuzz free environment and only run a test one at time?
I'm determined to avoid tests with H1Fuzz1, or find a cure. So I recently pushed some changes into the jdk7/tl forest jdk repository:
With more improvements on the way, and the same basic change is planned for OpenJDK6. This should make it easier to validate your jdk build by running only the jdk regression/unit tests in the repository that are free of H1Fuzz1. They also should run as quickly as possible. You will need a built jdk7 image and also the jtreg tool installed. To download and install the latest jtreg tool do the following:
Build the complete jdk forest, or just the jdk repository:
cd jdk/make && gmake ALT_JDK_IMPORT_PATH=previously_built_jdk7_home all images
Then run all the H1Fuzz1-free tests:
cd jdk/test && gmake -k jdk_all [PRODUCT_HOME=jdk7_home]
There are various batches of tests, jdk_all runs all of them and if your machine can handle it, use gnumake -j 4 jdk_all to run up to 4 of the batches in parallel. Some batches are run with jtreg -samevm for faster results. The tests in the ProblemList.txt file are not run, and hopefully efforts are underway to reduce the size of this list by curing H1Fuzz1.
As to the accuracy of the ProblemList.txt file, it could be wrong, and I may have slandered some perfectly good tests by accusing them of having H1Fuzz1. My apologies in advance, let me know if any perfectly passing tests made the list and I will correct it. It is also very possible that I missed some tests, so if you run into tests that you suspect might have H1Fuzz1, we can add them to the list. On the other hand, curing H1Fuzz1 is a better answer. ;\^)
That's enough Fuzz Buzz on H1Fuzz1.