Friday, 31 May 2013

Do You Like Your BDD Full Fat, Skimmed or Semi-Skimmed?

I've been having lots of fun writing microservices in Node.js and testing them with Yadda - my own BDD library. Before I started Yadda, I did an assessment of the other JavaScript BDD tools. It quickly became apparent there's a great deal of confusion about what BDD is, so I thought a blog post was in order to set things straight.

BDD stands for Behaviour Driven Development and is a term coined by Dan North. The story goes that Dan was running a TDD workshop, and noticed that the participants were getting distracted by the JUnit syntax, so he started expressing tests in terms of Given, When and Then steps.
Given 100 bottles of beer are sitting on the wall
When 1 falls
Then there should be 99 bottles of beer
This led him to write JBehave, a tool for mapping ordinary language steps to executable functions. Since then a host of other BDD tools such as Cucumber, RSpec, Concordion, SpecFlow and Twist have sprung up.

There has been some argument that BDD is not a real thing, that it's just TDD done well. The counter argument is that the difference is the audience. If your tests are written in code, then the audience is developers. If your tests are written in an ordinary language, your audience can be anybody. I've been writing BDD tests for about five years now, but I've only once found a non-developer with enough time, patience and discipline to co-author BDD tests. It was the most effective project I've ever worked on. So I think there is a difference, but it's rare that you'll be in a position to capitalise on it.

The BDD Sweet Spot

However, there is another compelling reason for using BDD tools. They soothe the pain of functional testing. Web testing especially. When you write web tests you are likely to encounter the following problems...
  1. The tests take a long time to write. This is due to what Rob Fletcher calls cadence. The cadence for functional tests will be higher than unit tests because the tests run much more slowly. Things quickly get much worse if you have to rebuild / restart your application whenever you change production or test code. It's usually hard to debug / step through functional tests too.
  2. The test suite takes a long time to run. Sometimes so long it's only run at night. This means the tests are usually broken in the morning and it's some poor soul's job to fix all the problems that were introduced the day before. After a while the team gives up on writing functional tests.
  3. The tests are incomprehensible. They contain XPath expressions like '//table/tr[3]/td[4]' or CSS expressions like 'table#report span'. Because the test data takes so long to set up, developers keep appending tests to each other.
  4. The tests are brittle. Any time you change your page markup you're likely to break numerous tests. This can limit people's enthusiasm for refactoring HTML and CSS.
  5. The tests all break at once. Most functional test tools don't provide an abstraction layer, developers often don't think of writing one until it's too late, there's an "it's only test code" mentality, and when testing is done last it's seen as a chore. As a result, it's common for those brittle XPath expressions to be copied and pasted like there's no tomorrow.
  6. The tests aren't run in a real browser. Because test suites are so slow (and even slower in IE) it's common to only run them in one browser. When that becomes slow, people look to running them in fake browsers. Thanks to improvements in JavaScript libraries this is not as bad an idea as it used to be, but it is still a concern.

Since true BDD tools are written in an ordinary language, not only are they (hopefully) comprehensible - solving problem #3, but they also provide an effective abstraction layer, solving problem #5 and thereby greatly reducing the impact of problem #4. Once you've built up a respectable library of test steps, your abstraction layer will enable a high degree of reuse and you'll also partially mitigate problem #1.

This still leaves the problems of a slow development time for new test steps, slow test suites, and authenticity. This is where JavaScript comes in. Since these are functional tests, you can write them in any language you want. JavaScript is interpreted not compiled, which chips away at problem #1. Headless WebKit browser bots like ZombieJS or CasperJS run like lightning compared to WebDriver. Goodbye problem #2. For a while you could even feel more confident because WebKit is the base of both Safari and Chrome. However, Google recently announced they were forking WebKit and developing a new rendering engine called Blink. I still think 5 out of 6 is pretty good.

So what's all this about Full Fat, Skimmed or Semi-Skimmed?

Well, the problem is that in BDD terms the JavaScript community has got rather confused. The most popular tools, Jasmine, Mocha, Vows, Chai and should.js, are proclaimed as BDD, but they're not. Jasmine, Mocha and Vows are excellent test frameworks, but from a BDD perspective they merely describe('some test function') with text. Chai and should.js don't even do this; they are fluent assertion APIs, which attempt to simulate an ordinary language. None of these tools pass the audience test. Worse still, they don't provide the abstraction layer, meaning you've still got problems #1, #3, #4 and #5 to deal with.

The only true BDD tool I found during my search was CucumberJS. The problem with CucumberJS is that the Gherkin syntax is restrictive. I don't want to be limited to starting my sentences with Given, When, Then, And and But. I want to express myself naturally. I also want a test runner that makes good decisions about which steps to run, instead of picking the first matching one it comes to. I want a tool that doesn't fail silently. I want a tool that I can use synchronously or asynchronously. I want a tool that lets me plug in different step libraries so that I can test multiple interfaces (e.g. REST and HTML) with the same scenarios. I even want a tool that has nothing to do with testing and just maps ordinary sentences to functions so I could use it in a rules engine or build script. That's why I wrote Yadda.