Thursday, September 04, 2008

A Curious Inflation, Part Two: All The Women Are Strong, All The Men Are Good Looking, And All The Games Are Above Average

After the post about 1UP and their "curious" inflation of review scores for sports games compared to their peers, Douglas Gould sent this:
You realize that your entire analysis is absolutely dependent on your scale?

That's absolutely true, of course. I believed I was doing a straight conversion based on the standard A-F grading scale, but Douglas's comment is true not only for my converted 1UP scores, but the review scores from the other websites as well.

This made me curious to do a comparison based on actual review criteria, instead of just looking at the review score, and since I mentioned Gamespot as being closest to the Metacritic average, let's use them as the test case.

First off, 1UP:
We rate games on a scale of A+ through F. Anything we score in the A+ through A- range is considered excellent, B+ through B- is good, C+ through C- is average, D+ through D- is bad, and F is terrible.

That all seems reasonable and somewhat balanced--two letters good, two letters bad, with "average" in the middle. Now let's take a look at their last 100 reviews (360/PS3/Wii/PC), sorted by major letter grade:
A--18
B--45
C--25
D--12
F--0

So 63 out of 100 reviews were either "excellent" or "good." Only 12 were below "average."

Wow! Apparently, the gaming industry has had a massive quality improvement while I was completely asleep at the switch.

Now let's look at Gamespot's rating categories and criteria:
10.0: Prime
This exceedingly rare score refers to a game that is as perfect as a game can aspire to be at its time of release. Obviously, the constantly changing standards for technology and gameplay will probably make this game obsolete some day, but at its time of release, a game earning this score could not have been improved upon in any meaningful way.

9.0-9.5: Superb
We absolutely recommend any game in this range, especially to fans of that particular genre. However, games that score in the 9 range are also typically well suited to new players. Games that earn 9s are naturally uncommon, and earn GameSpot's Editors' Choice Award for their outstanding quality.

8.0-8.5: Great
This range refers to great games that are excellent in almost every way and whose few setbacks probably aren't too important. We highly recommend games in the upper half of this range, since they tend to be good enough to provide an enjoyable experience to fans of the particular genre and to new players alike.

7.0-7.5: Good
A game within this range is good overall, and likely worth playing by fans of the particular genre or by those otherwise interested. While its strengths outweigh its weaknesses, a game that falls in this range tends to have noticeable faults.

6.0-6.5: Fair
Games that earn 6-range ratings have certain good qualities but significant problems as well. These games may well be worth playing, but you should approach them with caution.

5.0-5.5: Mediocre
A 5-range score refers to a game that's "merely average" in the negative sense. These games tend to have enough major weaknesses to considerably outweigh their strengths. There's probably a substantially better, similar game out there for you.

4.0-4.5: Poor
Games that just don't work right and maybe didn't spend enough time in production tend to fall in to this category. They simply lack the cohesion and quality that make other games fun.

3.0-3.5: Bad
You probably shouldn't get too close to a game in this range. Any of its positive qualities most likely serve only to make the rest of it seem even more disappointing.

2.0-2.5: Terrible
Beware, for a game in this range is almost entirely devoid of any remotely decent or fully functional features.

1.0-1.5: Abysmal
Ouch. The rare game that falls in this lowest-of-the-low range has no redeeming qualities whatsoever. Don't play this game.


Interestingly, anything from 8 to 10 is considered "great" or better, while anything below a "6" is considered "mediocre" or worse. Now let's look at the distribution for the last 100 360/PS3/Wii/PC reviews:
10--0
9.0-9.5--4
8.0-8.5--22
7.0-7.5--30
6.0-6.5--15
5.0-5.5--9
4.0-4.5--10
3.0-3.5--8
2.0-2.5--1
1.0-1.5--0

On the face of it, Gamespot's distribution curve seems more balanced, if still top-heavy. However, by Gamespot's own criteria, everything from 7.0 to 10.0 is "good" or better, and 56 of 100 reviews were in that range. Not quite as happy puppy loving as 1UP's 63, but surprisingly close. And if you equate 1UP's "average" (C) with Gamespot's "mediocre" (5.0-5.5, which Gamespot itself describes as "merely average"), Gamespot had 19 out of 100 games below that threshold, compared to 1UP's 12.
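
If you want to check my arithmetic, here is a minimal sketch in Python that adds up the two tallies above. The dictionary and function names are my own invention for illustration, and I've labeled each Gamespot band by its whole-number score (so "7" means 7.0-7.5).

```python
# Review counts copied from the two tallies in this post.
ONEUP = {"A": 18, "B": 45, "C": 25, "D": 12, "F": 0}
GAMESPOT = {
    "10": 0, "9": 4, "8": 22, "7": 30, "6": 15,
    "5": 9, "4": 10, "3": 8, "2": 1, "1": 0,
}


def tally(counts, bands):
    """Sum the review counts for the given score bands."""
    return sum(counts[band] for band in bands)


# "Good" or better: A and B at 1UP, 7.0 and up at Gamespot.
oneup_good = tally(ONEUP, ["A", "B"])
gamespot_good = tally(GAMESPOT, ["10", "9", "8", "7"])

# Below the middle band: D and F at 1UP, anything under 5.0 at Gamespot.
oneup_below = tally(ONEUP, ["D", "F"])
gamespot_below = tally(GAMESPOT, ["4", "3", "2", "1"])

print(f"1UP: {oneup_good} good or better, {oneup_below} below average")
print(f"Gamespot: {gamespot_good} good or better, {gamespot_below} below the 5.0-5.5 band")
```

This prints 63 and 12 for 1UP, and 56 and 19 for Gamespot. One caveat: the Gamespot counts as listed add up to 99, not 100, so one review is missing from that tally.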

In other words, there are very few below average games. The gaming industry, apparently, is a beacon of quality.

Or something.

Would these numbers indicate that the review system is broken? Well, that depends on what you expected in the first place. Game previews are straight out of the Entertainment Weekly tradition, and I think the vast majority of reviews (and reviewers) fall into that category as well.

Having said that, I think we can all find reviewers and sites that are a decent match for our particular tastes. I tend to like Eurogamer, particularly the reviews of Kieron Gillen and Kristan Reed, because they seem to contain more detail about the reviewer's experience playing the game. For sports games, Bill Abner is money--not because I always agree with him, but because I know how methodically he plays through a game before he begins writing.

Some reviewers are very solid and reliable in certain genres but not in others, and I'm a good example (even though I'm not really a "reviewer"). If the word "teabagging" can be used in reference to anything in the game, I'm the wrong guy to cover that (pun intended). And I'm really quite tired of WWII FPS games (or anything closer to the present). So my impressions of those genres are certainly going to be tinted by my discontent with those kinds of games in general.

Are some reviewers, at times, influenced by free trips/events given by publishers or pressured by their editors because of the potential loss of advertising revenue? Sure, and a bunch of you sent me a link to Dan Hsu discussing some of that here. I don't think there's any question that the Chinese Wall is either inadequate or nonexistent at some sites, but that's just another reason to focus on individuals who write very specific, thorough reviews.

I think the quality of reviews is much like the quality of the games themselves--absolutely brilliant at the top, with a deep swamp below.
