Tuesday, February 11, 2014

The Danger of Following Bad Stats


“Some individuals use statistics as a drunk man uses lamp-posts — for support rather than for illumination.”

It’s an annual rite of passage for NBA pundits to generate countless columns bemoaning the fact that certain players were undeservedly selected to the All-Star Game while other, more deserving candidates were robbed of the honor (1). I typically ignore this cacophony of kvetching (2), given that I’ve come to view All-Star selections as a reflection of popularity and Q Score rather than production. Plus, I’ve been advised by my doctor/father to cut down on the snarky tweets.

There was, however, one such post that piqued my interest. Said article aimed to determine the “25 Worst All-Stars” of the last ten years according to advanced metrics. Unfortunately, the advanced metric employed was none other than ESPN’s Player Efficiency Rating (PER), the ubiquitous algorithm developed by the renowned John Hollinger.

In scrutinizing this list of allegedly undeserving All-Stars, I was surprised by the number of egregious false positives: truly great and sometimes elite players that PER deemed unimpressive. Rather than angrily tweeting at the author (doctor’s orders), I opted to use the flawed list to illustrate yet another truism that flies over the heads of most fans, even the better-informed ones: not all advanced stats are good, and many can even be (gasp!) misleading!

To illustrate this point, I’ve taken the players PER deemed the worst All-Stars of the last ten years and compared their PER that year to two other advanced metrics that are more comprehensive and more highly correlated with wins: Wins Produced per 48 minutes (WP48) and Win Shares per 48 minutes (WS/48) (3). What you’ll see is that while there are some players that all the metrics agree were merely average (or even below average) there are others on whom the metrics vehemently disagree:

*2012 WP48 not available

What’s important to note about the players PER undervalues is that each one marvelously reflects a flaw in PER’s model. By overvaluing scoring totals and undervaluing the worth of a possession, PER rewards players who use many possessions at the expense of more productive players who shoot less frequently but boast high True Shooting Percentages (Allen) or absurd rebounding and shot-block rates (Wallace), as well as guards who do just about everything else at an elite level (Rondo, Kidd).
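
For readers who want to see the efficiency side of that argument in numbers, here is a minimal sketch of True Shooting Percentage using the commonly cited formula, points divided by twice the sum of field goal attempts and 0.44 times free throw attempts. The function name and both stat lines are hypothetical, invented purely for illustration; they are not taken from any player on the list, and this is not how PER, WP48, or WS/48 is computed.

```python
def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """True Shooting Percentage: points / (2 * (FGA + 0.44 * FTA)).

    The 0.44 factor is the conventional estimate of how many free
    throw attempts amount to one scoring possession.
    """
    shooting_possessions = fga + 0.44 * fta
    if shooting_possessions == 0:
        raise ValueError("player has no shot attempts")
    return points / (2 * shooting_possessions)


# Hypothetical per-game stat lines (made up for illustration only).
volume_scorer = {"points": 20, "fga": 20, "fta": 4}
efficient_shooter = {"points": 15, "fga": 10, "fta": 2}

for label, line in [("volume scorer", volume_scorer),
                    ("efficient shooter", efficient_shooter)]:
    ts = true_shooting_pct(line["points"], line["fga"], line["fta"])
    print(f"{label}: {line['points']} points, TS% = {ts:.3f}")
```

The made-up volume scorer puts up more points but needs roughly twice as many shooting possessions to get them, which is exactly the kind of player a metric that leans on scoring totals will flatter.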

So what lesson should we take from this experiment, endearing reader? When it comes to advanced metrics, there is no democracy: certain measures are better than others and deserve greater weight.
So the next time you read some pundit (looking at you, ESPN) point out that Player X has a surprisingly good PER or +/-, kindly explain that they are using statistics the way a drunk uses lamp-posts: for support, not illumination.

--------------------------------------------------------------------------------------------------------------------------------------------------------

1. And yet I can’t help myself: How the coaches selected the wildly overrated Iso-Joe Johnson over the far more deserving Kyle Lowry or Lance Stephenson AND missed out on both DeAndre Jordan (the league leader in rebounding and FG% on a great team!) and Anthony Davis (until injuries forced the latter into action) is literally mind-boggling to me. Seriously, I had to take off work for a week, my mind was so boggled.

2. Great name for a Jewish rock group.

3. For the math behind why PER is an inferior metric, check out this FAQ.
