Sunday, 27 November 2011

Performance and Concurrency Testing - The Poor Cousins

Rob Fletcher has written an interesting post on the challenges associated with functional testing. I agree with everything he says, but recently my appetite for testing has led me elsewhere, and it's probably about time I wrote something about it. Here goes...

Unit, integration and functional testing have a good deal of momentum within the development community. There is a wealth of decent open source software and a tonne of support material readily available, but there are two areas of testing that aren't well supported, and it has bothered me for some time. I'm talking about performance and concurrency testing. Performance testing encompasses load testing, soak testing, benchmarking, etc. (all of which are likely to require some degree of concurrency). By concurrency testing I mean specifically trying to verify thread safety, check for deadlocks, and so on.

Project teams frequently defer this type of testing to near the end of development and treat it as a short, one-off activity. Doing so is a mistake, since if (as Rob points out) test data is cumbersome to set up for a functional test, it can be an order of magnitude more difficult for a load test. Instead of hard-coding the test data you usually need to generate it, typically in accordance with some complex validation rules. Just generating the required quantity of test data can take hours or days. Even worse is the cycle time or "cadence" problem that Rob writes about. If a soak test takes 24 hours to run, but falls over with out-of-memory errors after 12, and the fix takes three attempts, you'll have wasted the best part of a week without actually completing a single test run. If you do manage to complete a successful test run, can you trust the results? Was the throughput limited by the test client, by the infrastructure or by the system under test? How would the results have changed if you'd tuned the infrastructure differently? Were your test data and test operations an accurate representation of live? More often than not, performance and concurrency testing raises more questions than it answers.
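To make the test data point concrete, here's a hypothetical sketch of what bulk generation can look like in Java. The Customer entity, its validation rule and the id format are all invented for illustration; the shape of the problem — generate, validate, repeat until you have enough — is the part that eats the hours when the real rules are complex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class TestDataGenerator {

    // Invented example entity for illustration.
    static class Customer {
        final String id;
        final int age;
        Customer(String id, int age) { this.id = id; this.age = age; }
    }

    // A made-up validation rule of the kind real systems impose.
    static boolean isValid(Customer c) {
        return c.id.matches("CUST-\\d{6}") && c.age >= 18 && c.age <= 120;
    }

    // Generate n valid customers, deterministically seeded so that a
    // failing load test run can be reproduced with the same data.
    static List<Customer> generate(int n, long seed) {
        Random random = new Random(seed);
        List<Customer> customers = new ArrayList<>(n);
        while (customers.size() < n) {
            Customer c = new Customer(
                    String.format("CUST-%06d", random.nextInt(1_000_000)),
                    18 + random.nextInt(103));
            if (isValid(c)) {          // reject anything the rules disallow
                customers.add(c);
            }
        }
        return customers;
    }

    public static void main(String[] args) {
        List<Customer> data = generate(10_000, 42L);
        System.out.println("generated " + data.size() + " valid customers");
    }
}
```

Seeding the generator is the one detail worth stealing even from a toy like this: if a run falls over half way through, you want to be able to regenerate exactly the same data.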

The reasoning behind running these types of tests once and only once is that they are painful and expensive, but if we've learned anything from continuous integration it's that when something causes you pain, you should do it more frequently, not less. We've also learned from unit and functional testing that it's better to get feedback early. Both are good reasons to automate your expensive performance and concurrency tests and run them as often as feasible. I like the idea of a nightly build which runs the load tests and trends the results. If the system suddenly starts performing badly, it should be easy to narrow the problem down to a small number of commits. The downside of such a system would be the maintenance, although I suspect it wouldn't be as high as for functional tests.
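A minimal sketch of that trending idea, with an invented history-file format and an arbitrary 20% regression threshold: each nightly run appends its throughput figure to a file, and the check fails fast if the number drops sharply against the previous run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class ThroughputTrend {

    static final double MAX_REGRESSION = 0.20; // fail on a >20% drop (arbitrary)

    // Record tonight's throughput and compare it against the last
    // recorded run. Returns true if the result is within tolerance.
    static boolean recordAndCheck(Path history, double opsPerSec) throws IOException {
        boolean ok = true;
        if (Files.exists(history)) {
            List<String> lines = Files.readAllLines(history);
            if (!lines.isEmpty()) {
                double previous = Double.parseDouble(lines.get(lines.size() - 1));
                ok = opsPerSec >= previous * (1.0 - MAX_REGRESSION);
            }
        }
        Files.writeString(history, opsPerSec + System.lineSeparator(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return ok;
    }

    public static void main(String[] args) throws IOException {
        Path history = Files.createTempFile("throughput", ".log");
        recordAndCheck(history, 1000.0);             // night one: baseline
        boolean ok = recordAndCheck(history, 950.0); // night two: 5% slower, fine
        System.out.println("within tolerance: " + ok);
    }
}
```

A flat file is obviously a stand-in for the database mentioned below; the point is that the comparison is cheap once the numbers are being recorded at all.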

Strangely, the development community has not embraced performance and concurrency testing in the same way it has unit, integration and functional testing. Commercial products such as LoadRunner exist, and there are open source alternatives such as JMeter. My primary objection (I have many more) to these tools is that they are not designed for developers, yet because of the technical difficulty of generating representative test data, it is typically developers who end up writing these kinds of tests. Grinder looks closer to what I'm after, but it seems so complicated. I want a tool that lets me write tests in Java, has access to my project classpath, lets me leverage any other Java library I choose, runs from my preferred IDE, integrates with my build and can record results over time. In other words: JUnit + [performance / concurrency toolkit] + database.
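To illustrate what I mean — and this is a sketch of the idea, not Julez's actual API — here's roughly the kind of test I want to be able to write in plain Java: drive an operation from several threads using standard java.util.concurrent plumbing, assert that nothing was lost, and emit a throughput figure that a nightly build could record and trend. The AtomicInteger is a stand-in for a real system under test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadTestSketch {

    // Drive the (stand-in) operation under test from several threads and
    // return the final count, so a test can assert no updates were lost.
    static int run(int threads, int callsPerThread) {
        AtomicInteger counter = new AtomicInteger(); // stand-in system under test
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    counter.incrementAndGet();       // the operation under test
                }
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return counter.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int ops = run(8, 10_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Concurrency assertion: no lost updates.
        if (ops != 8 * 10_000) throw new AssertionError("lost updates: " + ops);
        // A throughput figure a nightly build could record and trend.
        System.out.println("ops=" + ops + " elapsedMs=" + elapsedMs);
    }
}
```

Everything here is plain JDK code running on the project classpath, which is exactly why this style appeals to me: no scripting DSL, no separate tool to learn, and it runs from the IDE like any other test.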

So I've started to write one - Julez. It's early days, but it has been lots of fun so far. I should also thank Frank Carver for applying his Jedi mind tricks to keep the project small and lightweight.