How reliable are Mozilla's performance measurements? · 2011-04-07 15:38 by Wladimir Palant
The good news: they are not entirely wrong. However, there are obvious issues:
- The latest test run (on April 2nd) only has two results for Adblock Plus: Windows 7 (20% slowdown) and Windows XP (23% slowdown). The other two tests (on Mac OS X and Fedora Linux) “crashed” and weren’t considered in the results at all. The “crash” turned out to be a general issue with the testing framework: anything that prevents the browser from being closed will cause the test to be ignored. Why isn’t incomplete test data mentioned on the “blame list”? Beats me.
- The test run before it (on March 26th) gave Adblock Plus perfect marks: the timing results were indistinguishable from the reference values (that’s why I didn’t see Adblock Plus on the list when I first looked at it). The important difference was apparently that Adblock Plus 1.3.3 was measured, because Adblock Plus 1.3.5 hadn’t been reviewed on addons.mozilla.org yet. So, did I introduce a huge performance regression in Adblock Plus 1.3.5? I had to re-run the test to understand it: the testing happened with the add-on disabled! Any add-on that is marked as compatible with Firefox 4 only on addons.mozilla.org but not in the extension itself will get a perfect score (like Read It Later, found at the bottom end of the list with a supposed 4% slowdown despite being disabled during the test).
- There are many more “crashes” in the test results. Some add-ons like Cooliris haven’t been tested at all. Apparently, the testing framework fails to download them because these add-ons have different download packages for different operating systems.
Add to this the two issues I already mentioned last time (results being skewed on Windows for some extensions that aren’t being extracted correctly, and extensions being tested uninitialized) and you get the idea. There is probably a large number of Top 100 extensions that either didn’t make the list or got a better score due to testing framework bugs. And some other extensions probably got a worse score because they were run unpacked, which in reality never happens in Firefox 4.
Now to the actual data. Since I couldn’t compare Adblock Plus results, I picked two extensions on the list whose current version was released before March 26th, so both test runs tested the same extension version. I didn’t consider earlier test runs because they were performed with Firefox 3.6. FlashGot (50% slowdown) was the first extension from the top of the list to meet the criteria, and Download Statusbar (14% slowdown) is located near the middle of the list.
| Test run | Reference time (no extensions) | FlashGot 188.8.131.52: time | FlashGot 188.8.131.52: slowdown | Download Statusbar 0.9.8: time | Download Statusbar 0.9.8: slowdown |
|---|---|---|---|---|---|
| Windows 7 on March 26th | 548.89 | 617.89 | +12.6% | 625.0 | +13.9% |
| Windows 7 on April 2nd | 541.89 | 617.63 | +14.0% | 625.63 | +15.4% |
| Windows XP on March 26th | 399.79 | 473.11 | +18.3% | 482.47 | +20.7% |
| Windows XP on April 2nd | 401.21 | 471.32 | +17.5% | 489.05 | +21.9% |
| Mac OS X on March 26th | 694.79 | 1677.47 | +141.4% | 706.21 | +1.6% |
| Mac OS X on April 2nd | 699.58 | 1706.16 | +143.9% | 722.05 | +3.2% |
| Fedora Linux on March 26th | 498.37 | 642.53 | +28.9% | 593.89 | +19.2% |
| Fedora Linux on April 2nd | 495.95 | 621.21 | +25.3% | 588.89 | +18.7% |
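For anyone who wants to check the percentages, here is a minimal sketch of how they follow from the raw timings. The numbers are copied from the March 26th rows of the table; the dictionary layout and the function name `slowdown_percent` are my own choices for illustration and have nothing to do with how the testing framework computes its figures.

```python
# Timings copied from the March 26th rows of the table above.
# Tuple layout (an assumption of this sketch):
# (reference time, FlashGot time, Download Statusbar time)
MARCH_26 = {
    "Windows 7": (548.89, 617.89, 625.0),
    "Windows XP": (399.79, 473.11, 482.47),
    "Mac OS X": (694.79, 1677.47, 706.21),
    "Fedora Linux": (498.37, 642.53, 593.89),
}


def slowdown_percent(reference, measured):
    """Relative slowdown of `measured` against `reference`, in percent."""
    return (measured - reference) / reference * 100.0


for platform, (ref, flashgot, statusbar) in MARCH_26.items():
    print(
        f"{platform}: FlashGot {slowdown_percent(ref, flashgot):+.1f}%, "
        f"Download Statusbar {slowdown_percent(ref, statusbar):+.1f}%"
    )
```

The percentage is simply the extra time divided by the reference time, so the same absolute delay turns into a larger percentage on a platform that starts faster.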
The most interesting part here is of course the Mac OS X results. I’m not sure where the horrible FlashGot performance on OS X comes from. The test log contains a bunch of error messages that indicate a bug in the extension, probably triggered by the unusual configuration of the test machines (if I read it correctly, FlashGot is trying to write to the temp directory, which is probably forbidden). It would have been useful if AMO provided extension developers with links to the test logs; finding and fixing such bugs is otherwise a very difficult task.
Download Statusbar results on OS X are also interesting: they are unrealistically low. Either Firefox on OS X does some things radically differently from the Windows/Linux version, or the extension is simply broken and doesn’t do anything. Or maybe that’s another case where an extension is being tested while disabled. No idea.
If you take OS X out of the equation, however, it is notable that the slowdowns caused by extensions don’t seem to be proportional to Firefox start-up times at all; in absolute terms they are actually almost the same on all platforms. Consequently, you get higher percentages on platforms where the reference start-up time is lower. This even stays true on different hardware: when I tested Adblock Plus on my laptop, the slowdown introduced by Adblock Plus was the same as on the Talos machines (even somewhat lower) while Firefox start-up time almost doubled. Under these conditions, does it even make sense to set these slowdowns in proportion to Firefox start-up times? Wouldn’t it make more sense to give users an idea of the absolute scale?
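To make the "absolute scale" concrete, the sketch below subtracts the reference time from each extension's time for the April 2nd rows of the table, leaving OS X out. The data layout and the name `absolute_slowdown` are assumptions of this example, and the results are in whatever time unit the table uses.

```python
# Timings copied from the April 2nd rows of the table, Mac OS X left out.
# Tuple layout (an assumption of this sketch):
# (reference time, FlashGot time, Download Statusbar time)
APRIL_2 = {
    "Windows 7": (541.89, 617.63, 625.63),
    "Windows XP": (401.21, 471.32, 489.05),
    "Fedora Linux": (495.95, 621.21, 588.89),
}


def absolute_slowdown(reference, measured):
    """Extra start-up time added on top of the reference run."""
    return measured - reference


for platform, (ref, flashgot, statusbar) in APRIL_2.items():
    print(
        f"{platform}: FlashGot +{absolute_slowdown(ref, flashgot):.2f}, "
        f"Download Statusbar +{absolute_slowdown(ref, statusbar):.2f}"
    )
```

On these numbers the two Windows rows land within a few units of each other for each extension even though their reference times differ by roughly 140, while Fedora comes out higher, more so for FlashGot than for Download Statusbar.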
All that said, the results of the two runs are remarkably similar. So once all the issues are fixed, the numbers on the “blame list” shouldn’t be off by more than 2%. But it would have been nice if the obvious issues had been fixed before going public with the results.