Wednesday, August 04, 2010

Software Testing

I received an interesting e-mail from Matt Perrin about software testing, prompted by the NCAA 11 post I made last week.
I wanted to comment on your post "Fail" regarding the testing of NCAA. I work professionally as a QA Automation Engineer testing software, and I develop indie games in my free time, so I have some experience in both areas. I think the smoking gun is this statement from the blog:
Over the past year we logged tens-of-thousands of hours of QA on the game in addition to tens-of-thousands of hours more in scripted game testing through networks of automated game consoles here at the studio...

Out of the three possibilities you mention, I am willing to bet they didn't notice it. There are a lot of ugly little secrets about test automation and how software vendors actually use it versus how it should be used. A lot of teams never look at the end-result data a run generates, instead trusting a script's Pass/Fail reporting to verify that a test condition was met. Also, with the constant drive to automate new software requirements, older automation scripts that are always passing are usually never reviewed. The scripts can give teams a false sense of security that everything is working as intended, but it's the end-of-run review process where the system really falls apart. The scripts that always pass are the ones you should be investigating for hidden bugs that aren't being caught.
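
To make that trap concrete, here is a minimal sketch in Python; every name in it (run_play_script, the results file, the yardage field) is hypothetical and exists only to illustrate the difference between trusting a Pass/Fail flag and actually reviewing the end-of-run data.

    import json
    import os
    import tempfile

    def run_play_script(play_id):
        # Stand-in for an automated console run: it finishes "successfully"
        # and dumps end-of-run telemetry to a JSON file.
        results = {"play_id": play_id, "yards_gained": 0, "fumbles": 0}  # suspicious data
        path = os.path.join(tempfile.gettempdir(), f"results_{play_id}.json")
        with open(path, "w") as f:
            json.dump(results, f)
        return {"status": "PASS", "results_file": path}

    def weak_check(play_id):
        # What an "always green" script amounts to: the flag says PASS,
        # so nobody looks any further.
        outcome = run_play_script(play_id)
        assert outcome["status"] == "PASS"

    def end_of_run_review(play_id, expected_yard_range=(1, 30)):
        # The review step that catches hidden bugs: open the generated data
        # and check that it is plausible, not just that the script finished.
        outcome = run_play_script(play_id)
        assert outcome["status"] == "PASS"
        with open(outcome["results_file"]) as f:
            results = json.load(f)
        yards = results["yards_gained"]
        assert expected_yard_range[0] <= yards <= expected_yard_range[1], (
            f"play {play_id} gained {yards} yards, outside the expected range"
        )

    if __name__ == "__main__":
        weak_check("hb_dive")              # passes and tells us nothing
        try:
            end_of_run_review("hb_dive")
        except AssertionError as e:
            print("hidden bug surfaced:", e)

The weak check stays green forever, which is exactly why it is the one worth auditing.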

Also, I would be curious to know how exactly they are getting any major benefit from automation for something as complex as a football game. It's one thing to automate the testing of a business application that behaves the same way every time I open it. For games, I could see automated testing working well for static UIs or specific fail states. In the scenario the blog mentions, with "networks of automated game consoles," I am assuming they are either testing their leaderboards and other connected features, or they are using a central server to push an automation script that selects a specific play to a group of machines in a pre-defined state, then watching the results and checking that they match the probability of that play succeeding. In other words, a very static and repeatable scenario, not the chaos of a typical game.
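
A sketch of what that kind of static, repeatable check could look like, again in Python and again with every name (run_scripted_play, the 60% success rate, the tolerance band) invented purely for illustration:

    import math
    import random

    def run_scripted_play(console_id, play_id):
        # Stand-in for pushing the automation script to one console in a
        # pre-defined game state; here it simply simulates a play that
        # succeeds about 60% of the time.
        rng = random.Random(console_id)
        return rng.random() < 0.60

    def batch_matches_expectation(play_id, expected_p, num_consoles=500, z=3.0):
        # Run the same play on every console and accept the batch if the
        # observed success rate falls within z standard errors of the
        # expected probability for that play.
        successes = sum(run_scripted_play(c, play_id) for c in range(num_consoles))
        observed_p = successes / num_consoles
        std_err = math.sqrt(expected_p * (1 - expected_p) / num_consoles)
        print(f"{play_id}: observed {observed_p:.3f}, "
              f"expected {expected_p:.3f} +/- {z * std_err:.3f}")
        return abs(observed_p - expected_p) <= z * std_err

    if __name__ == "__main__":
        # A very static, repeatable scenario -- nothing like the chaos of a real game.
        ok = batch_matches_expectation("pa_boot_left", expected_p=0.60)
        print("batch accepted" if ok else "batch flagged for review")

The acceptance band here is just a rough normal-approximation tolerance; the check only makes sense because the scenario is frozen, which is exactly the limitation described above.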

Lastly, "tens-of-thousands of hours of QA" in the game industry is a bit of a joke when you have QA companies like Excell and Volt throwing hundreds of testers at a time at a game. And from some of the stories I've heard, these companies are basically running 24/7 testing centers, cranking out "testing hours" with whichever gamers have some aptitude for it but maybe not the precision skills of a traditional QA professional. They can play the game and find the big bugs, but can they analyze the data properly? Probably not.

Thanks to Matt for a very interesting analysis. That may not all apply directly to NCAA 11, but it's excellent information regardless.
