Tuesday, August 10, 2010

Software Testing (part three, thanks to you)

This discussion about software testing keeps getting more interesting because of the information you guys are contributing. This is from Matthew Montgomery:

I worked at Microsoft Games as a Software Test Engineer, a contractor. Now I work at a major software company in a similar job (just not games), and I work with automated and manual testing quite regularly. After seeing Matt's commentary, I thought I'd throw in my two cents. Warning: lots of inside baseball! I just thought you might be interested since you liked Matt's analysis.

As Matt points out, tens of thousands of hours of testing isn't, by itself, all that meaningful. My experience with bug bashing at Microsoft Games was kind of like that: everyone in the building tests the game, and presto, X hours on the game times Y people equals Z hours of testing! Except the bug reports are filed by people who are probably not familiar with your game. Thus, they tend to be dupes of existing bugs; feature requests that are impractical this close to ship; bugs that happened once and never again; or low-priority bugs that'd be nice to fix but won't stop ship.

What Matt says about Excell or Volt testers is pretty much the case, as well. In my experience, you have some full-time testers who do the more advanced testing, test planning, writing automation, etc. (That's what I do now.) You might have some contractors on site who do the less glamorous stuff but whom you still trust to have some judgment (that's what I did at MGS), or you might not. Matt is probably thinking of the VMS folks when he talks about hundreds of testers. These are the people hired to do the repetitive, boring testing that the full-timers have planned for them: click this, expect X behavior; click that, expect to see Y on the screen. That's more than likely where the majority of those thousands of hours comes from. It's little to no guarantee of quality; if the test planning is crappy, then the quality of testing is going to suffer as well.

Aside from that, there are some great reasons to automate testing for a game. Let me say up front that Matt is correct: you're unlikely to find any deep bugs with automated tests. The fancier you try to get with automation, the more fragile it tends to be. You're working with a program, obviously, which lacks human judgment, will only test what you specifically ask it to, and can't think on the fly. Thus, in my opinion (and I mean this as no slam against Matt), the best way to think about automated tests is not as a panacea for the pain of manual testing. Rather, automated tests are good for catching regressions (i.e., a bug that was fixed but has crept back in) and for ensuring a minimum level of quality for each build, quickly and without human intervention.
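To make the regression idea concrete, here's a rough sketch of what one of these tests can look like, written for pytest. The decide_winner function is a made-up stand-in for whatever piece of game logic once had a bug; nothing here comes from an actual project, it's just the shape of the thing.

# regression_test.py -- a rough sketch of an automated regression test,
# written for pytest. decide_winner is a made-up stand-in for whatever
# game logic once contained the bug; the point is the shape of the test.

def decide_winner(home_score, away_score):
    """Toy game logic. Imagine an old bug where ties were reported as a
    home win; the fix is pinned in place by the tests below."""
    if home_score == away_score:
        return "tie"
    return "home" if home_score > away_score else "away"

def test_tie_is_not_reported_as_home_win():
    # This exact case was broken once (hypothetically); keep it fixed.
    assert decide_winner(21, 21) == "tie"

def test_higher_score_wins():
    assert decide_winner(24, 21) == "home"
    assert decide_winner(17, 20) == "away"

The value isn't in how clever the test is; it's that once a bug is fixed, this runs on every build forever and the bug can't quietly come back.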

Take some trivial automated tests and couple them with a continuous integration system, that is, a setup where a script compiles the latest code into a build whenever a developer checks in a change. Then your fancy script runs your automated tests against that build and reports the results. If the tests pass, you can be reasonably confident that the build is usable, to the extent that you trust your tests and they cover the functionality you consider important. If the tests fail, you can say the build is not usable (again, with the previous qualifier). Coupled with unit tests, you can get feedback on the stability of the build within minutes of a developer making a change. And, as I said, this is all without a human being involved except to keep an eye on the status of the build.
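Here's a bare-bones sketch of that loop, just to show its shape. Real continuous integration systems do far more, and the build and test commands below ("make build" and pytest) are assumptions about how a particular project builds and tests itself, not anyone's actual setup.

# ci_check.py -- a bare-bones sketch of the continuous integration loop
# described above: build the latest code, run the automated tests, report.
# The commands ("make build", pytest) are assumptions for illustration.
import subprocess
import sys

def run(cmd):
    print("running: " + " ".join(cmd))
    return subprocess.call(cmd)

def main():
    # Step 1: compile the latest checked-in code into a build.
    if run(["make", "build"]) != 0:
        print("BUILD FAILED -- tell whoever checked in last")
        return 1
    # Step 2: run the automated tests against that build.
    if run([sys.executable, "-m", "pytest", "tests/smoke"]) != 0:
        print("SMOKE TESTS FAILED -- this build is not usable")
        return 1
    print("Build passed the smoke tests; testers can pick it up.")
    return 0

if __name__ == "__main__":
    sys.exit(main())

In practice this would be kicked off automatically by whatever watches your source control, and the "report" step would email the team or update a dashboard instead of just printing.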

This is a big win. The sooner the developer finds out that they broke something, the sooner they can fix it or undo it. The less time your testers spend on trivial, repetitive "does this work" testing (what you'd call smoke tests or basic acceptance tests or sanity tests or ...), the more time they have to spend on everything else. Ultimately you can save a lot of time and, by extension, money.

Another useful scenario to explore is some basic happy-path stuff. Matt's example is very close to what I mean: run a simulation of the game. The trick is that you don't care as much about the outcome as you do that you can load a game and play it to completion. Once again, if you run this test against every build, you can ensure that, at a minimum, you can load, play, and finish a game.

 There are huge caveats, obviously. You won't catch graphical glitches, and you won't catch UI bugs unless you specifically try. This would, however, find bugs like "game never loads" or "the team with the higher score didn't win" or "game doesn't finish after N minutes." That's time a tester did not spend downloading and installing the build, loading it up, starting a game, narrowing down the bug, filing a report, and so on.
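To show what that load-play-finish check might look like, here's a sketch. The Game class below is a toy simulation so the example actually runs; a real test would drive the actual game through whatever scripting hooks or headless mode it exposes, and every name here is made up.

# happy_path_test.py -- a sketch of the "load, play, finish" check.
# Game is a toy simulation standing in for a real headless game session;
# every name here is invented for illustration.
import random
import time

class Game:
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.scores = {"home": 0, "away": 0}
        self.finished = False

    def simulate_to_completion(self):
        # Play out 100 "possessions" and mark the game finished.
        for _ in range(100):
            team = self.rng.choice(["home", "away"])
            self.scores[team] += self.rng.choice([2, 3])
        self.finished = True

    def winner(self):
        home, away = self.scores["home"], self.scores["away"]
        if home == away:
            return "tie"
        return "home" if home > away else "away"

def test_game_loads_plays_and_finishes():
    start = time.time()
    game = Game(seed=42)                    # catches "game never loads"
    game.simulate_to_completion()
    assert game.finished                    # catches "game doesn't finish"
    assert time.time() - start < 60         # ...within N minutes (here, 1)
    # catches "the team with the higher score didn't win"
    top_scorer = max(game.scores, key=game.scores.get)
    assert game.winner() in (top_scorer, "tie")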

To reiterate, you shouldn't stop there and say the game works, although I'm sure that wouldn't stop a dysfunctional team. Nor is the number of tests correlated with the quality of the product. If you have tests that never find bugs before humans see them, what's the point? A more useful metric would be: how many bugs did your automation uncover?
