djsunyc
What Can Stats Do For You? (Part I) by KnickerBlogger
Life is made up of a lot of tough decisions. Let’s say you decide to rent a car for the weekend, and the person behind the counter offers you two different options at the same price. The first is for the car to come equipped with an OnStar GPS, while the second comes with a road map. Similarly, if you were a juror, would you prefer a video of the incident or an eyewitness? If you hired an accountant to do your taxes, would you rather he use a modern computer or just a calculator?
In each case it’s obvious that the former choice is better than the latter, because you would be getting more information. While a road map might be ample for travelling on major interstates and highways, it wouldn’t be able to tell you every little alleyway and back road. Everyone would rather have more information in every aspect of life, from buying a house to playing a video game. So why is it that some people shun having more information when it comes to sports?
Just a few years ago, you would have been hard pressed to find anything other than per game stats for the NBA. Today, thanks to sites like 82games.com, basketball-reference.com, and my own stat page, you can find a host of new stats like +/-, rebounding rate, PER, and per 40 minute stats. You would think that these extra bits of information would be gold for those who like to talk hoops on basketball message boards. However, usually the opposite occurs. Take for example these quotes:
“Hollinger’s biggest problem as an analyst is that he’s too stat-oriented. as we all know, stats don’t always tell the complete story.”
“He can’t keep using statistics like this.”
“I think the 40 per isn’t always the best stat. Isn’t there a reason that Jackie Butler only plays 2 minutes a game and Iverson plays 45?”
I didn’t have to surf far for these, as they were all from the same thread. Reading these and many other comments like them, it seems that the problem with statistics comes from misunderstanding their meaning and usage. So to fulfill my public service requirements for the New York State Criminal Court, I’d like to talk about a few hangups that are common with modern NBA stats.
ARGUMENT #1: Per minute stats are misleading. Or “would Jackie Butler average 80 pts a game?”
“Hey I just flipped a penny 10 times & got 8 heads, so you should always take heads since there’s an 80% chance that you’ll win.” Sounds pretty absurd, doesn’t it? Everyone knows that a coin will land on heads half the time, and that if I flipped it another 10 times it’s more likely to be closer to 5 heads than 8. The problem is that I didn’t use enough attempts to make a proper judgement on the probability of coin flips. There are many names for this kind of faulty thinking, including the clustering illusion and hasty generalization. Simply put, this is a case of using too small a sample size.
Jackie Butler, in case you never heard of him, played a grand total of 5 minutes for the New York Knicks in 2005. In those 5 minutes, Butler hit all 4 of his shots and both his free throws to score 10 points. Judging by his per minute stats, Butler would average 80 points for every 40 minutes played! But the fact that Jackie’s per minute stats are supernatural doesn’t discredit all per minute stats. Butler also had a 100% FG%, but no one is saying that FG% is worthless.
There are plenty of examples of small sample sizes. Last year the Clippers started out 1-0, but their 1.000 win percentage didn’t mean they were going 82-0. In 2005, Zydrunas Ilgauskas had 18 rebounds on opening day, but he only averaged 8.6 on the year. One game isn’t much to base a season on, and neither is 5 minutes. When using a statistic it’s important to make sure that the sample size is large enough that it’s not just an aberration. That Butler performed ludicrously well in the short time allotted to him doesn’t disprove per minute stats, just like 8 heads in 10 flips doesn’t disprove probability theory.
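To put a rough number on the coin flip example, here is a minimal Python sketch that computes the exact binomial probability of a fair coin showing at least 80% heads. The function name `prob_at_least` and the choice of 100 flips for the larger sample are my own illustrative assumptions, not anything from the original discussion.

```python
from math import comb

def prob_at_least(heads: int, flips: int, p: float = 0.5) -> float:
    """Exact binomial probability of at least `heads` heads in `flips` flips.

    `p` is the chance of heads on a single flip (0.5 for a fair coin).
    """
    return sum(
        comb(flips, k) * p**k * (1 - p) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# 80% heads or better in a tiny sample versus a larger (assumed) sample.
small_sample = prob_at_least(8, 10)
large_sample = prob_at_least(80, 100)

print(f"10 flips, 8+ heads:   {small_sample:.4f}")
print(f"100 flips, 80+ heads: {large_sample:.2e}")
```

With only 10 flips, a fair coin lands on 8 or more heads a bit more than 5% of the time, so the "80% heads" result is not that strange. Over 100 flips the same 80% rate becomes vanishingly unlikely, which is the whole point about sample size.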
ARGUMENT #2: Per game stats are better than per minute stats. Or “is Dirk Nowitzki a better rebounder than Reggie Evans?”
As I mentioned earlier, the previous de facto standard of the NBA was per game stats. The inherent problem with per game stats is that not everyone plays the same number of minutes per game. Last year Dirk Nowitzki averaged 9.7 rebounds a game while Reggie Evans pulled down 9.3. By those numbers alone you might think that Dirk was the better rebounder, but consider this: Nowitzki averaged nearly 39 minutes a night, while Evans was just under 24. It’s reasonable to assume that if Evans were given 15 more minutes a game he would have pulled down a couple more boards, and it’d be nice to be able to say this statistically.
Everyone is familiar with the baseball stat earned run average (ERA), but imagine if we measured pitchers by runs allowed per game instead of runs per inning. Randy Johnson’s 2.79 runs per game would look awful compared to Tanyon Sturtze’s 0.64. Luckily someone had the brilliant idea of measuring pitchers across innings pitched. By using ERA, it’s clear that Sturtze’s 4.73 ERA is almost a point higher than Johnson’s 3.79.
Per minute stats in basketball do the same thing: they account for a disparity in playing time. Instead of dividing rebounds by games played, we can divide them by minutes played. On average Reggie Evans had 0.39 rebounds per minute, which was much higher than Dirk’s 0.25. By those numbers two things are obvious. First, Evans is a much better rebounder than Dirk (he’s over 50% more efficient on the glass). The second is that per minute stats are a much better way to compare players than per game stats.
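The arithmetic behind that comparison can be sketched in a few lines of Python. The minutes below are approximations I am assuming from the text ("nearly 39" and "just under 24"), so the results are ballpark figures rather than official numbers, and `per_minute` is just an illustrative helper name.

```python
def per_minute(per_game: float, minutes_per_game: float) -> float:
    """Convert a per game average into a per minute rate."""
    if minutes_per_game <= 0:
        raise ValueError("minutes_per_game must be positive")
    return per_game / minutes_per_game

# Rebounds per game are from the article; minutes are rounded assumptions.
dirk = per_minute(9.7, 39.0)
evans = per_minute(9.3, 24.0)

print(f"Nowitzki: {dirk:.4f} rebounds per minute")
print(f"Evans:    {evans:.4f} rebounds per minute")
print(f"Evans / Nowitzki ratio: {evans / dirk:.2f}")
```

Under these assumed minutes the ratio comes out a little above 1.5, which is why the per game numbers and the per minute numbers rank the two players in opposite order.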
What Can Stats Do For You? (Part II) by KnickerBlogger
If you think this article begins abruptly, it is because you haven’t read Part I first.
--------------------------------------------------------------------------------
ARGUMENT #3: Per 40 minute stats are useless because hardly anyone plays 40 minutes. Or “there’s a reason Reggie Evans doesn’t play 40 minutes”.
In 2005 only 5 players averaged more than 40 minutes a game, so why would we choose such an unusually high minute total for our per minute stats? Let’s take a look back at Part I’s example where I showed per minute stats to be more useful than per game stats:
On average Reggie Evans had 0.39 rebounds per minute, which was much higher than Dirk’s 0.25.
The problem with per minute stats is that they are decimals that are hard to envision. What the heck is 0.39 of a rebound anyway? It’s just easier to comprehend that Evans averaged 15.7 rebounds for every 40 minutes played. Here’s another example. Guess what Ben Wallace’s average was in 2005:
A. 0.038 Blocks/Min
B. 0.066 Blocks/Min
C. 0.132 Blocks/Min
Dirk Nowitzki is player A. Ben Wallace is player B. Player C is double Ben Wallace’s average, which might be Andrei Kirilenko standing on Shaq’s shoulders or Theo Ratliff on a pogo stick. Unless you’re used to dealing in per minute stats, 0.066 blocks would be meaningless to you. Back in Part I (you did read Part I first, right?) we discussed the baseball stat ERA, which is just runs per inning multiplied by 9. Just as it’s easier to visualise Randy Johnson giving up 3.79 runs per 9 innings than 0.42 runs per inning, it makes more sense to say that Big Ben averages about 2.6 blocks per 40 minutes than to talk in fractions.
The truth is we could use any multiplier for per minute stats: 10, 12, 40, 48, pi, the square root of 2, or the national debt. The number 40 was taken because in today’s NBA the best players average about that many minutes. What’s important to remember is that per 40 minute stats do not advocate that the coach should give that player 40 minutes a night. When someone mentions that Mariano Rivera had a 1.38 ERA, they’re not saying that Mo should go out and throw all 9 innings. Nor are they saying that if Rivera pitched 9 innings, the opposition would score fewer than a run and a half per game. There are many reasons that someone might have good per minute stats in one area and not get a ton of minutes, from poor defense to bad conditioning. In Reggie Evans’ case, he doesn’t give you much else other than rebounds and put backs. Last year sometimes the Sonics wanted a shot blocker (like Jerome James), sometimes they wanted a scorer (Radmanovic), sometimes they wanted a little of both (Collison), and sometimes they wanted the same exact thing as Evans except with a different name on the uniform (Fortson).
By using 40 minutes, no one is advocating more playing time. It’s just a fair way of comparing two players that play different minutes.
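As a small illustration of why the multiplier is only a matter of convenience, the Python sketch below scales the same per minute rates by several multipliers and checks who comes out ahead each time. The helper names `per_n` and `leader_for` are hypothetical, and the rates are the rounded figures quoted above.

```python
def per_n(total: float, minutes: float, n: float = 40.0) -> float:
    """Scale a raw total over `minutes` played to a rate per `n` minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * n

# Jackie Butler's line from Part I: 10 points in 5 minutes.
butler_per_40 = per_n(10, 5)

# Rounded per minute rebounding rates quoted in the article.
RATES = {"Evans": 0.39, "Nowitzki": 0.25}

def leader_for(n: float) -> str:
    """Return the player with the higher rate once scaled to `n` minutes."""
    scaled = {name: rate * n for name, rate in RATES.items()}
    return max(scaled, key=scaled.get)

print(f"Butler per 40 minutes: {butler_per_40}")
for n in (10, 12, 40, 48):
    print(f"per {n} minutes, leader: {leader_for(n)}")
```

Multiplying every rate by the same positive number never changes the ordering, so 40 is simply the multiplier that produces the most familiar looking numbers.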
ARGUMENT #4: Statistics don’t accurately reflect what happens on the court. Or “don’t let Kyle Korver’s FG% fool you.”
Ever been forced to try to unscrew a Phillips screw with a flat head screwdriver? Have you tried to follow a recipe without having all the ingredients? Ever use duct tape as a quick fix? No, this isn’t one of those “Real Men of Genius” commercials. I’m attempting to evoke that feeling of trying to do a task without having the proper tools. If you’re unable to remove that screw with the wrong type of screwdriver, it doesn’t mean the screwdriver is broken. Nor does it mean that screwdrivers in general are useless.
So why is it when someone uses the wrong stat in basketball, there are a few people that are quick to denounce all stats? Take for example the only stat presented to the public for measuring shooting accuracy, field goal percentage (FG%). When FG% places sharp shooter Kyle Korver in the bottom half of the league, it’s understandable for people to think that statistics don’t adequately describe the game. If you can’t express something as simple as shooting ability with statistics, then what hope do you have for the other complexities of basketball?
The problem lies with the mainstream media ignoring any statistical advancement since the days of George Mikan. There are a host of stats that can be easily presented to the public which are improvements over the currently used ones. Field goal percentage was useful in the early days of the NBA, but since the league adopted the three pointer in 1979 it has become obsolete. The three point shot is a high risk/high reward proposition. A player who hits 33% from behind the arc scores as many points per shot as someone who hits 50% on two pointers, but FG% doesn’t show this. Effective field goal percentage (eFG%) compensates for this difference by giving players a proportional bonus for three pointers, and puts long range bombers back on the same plane as inside bangers.
Despite having 26 years to introduce this improved stat to their readers, the media has collectively failed in its duty to inform the public. They still use stats like FG% which show Melvin Ely to be more accurate than Kyle Korver. Ely had a 43% FG% last year, while Korver shot 42%. However when compensating for the extra points awarded on a three point shot, Korver’s 57% eFG% shows him to be far superior. The media still uses rebounds per game, which shows Dirk Nowitzki to be a better rebounder than Reggie Evans (see above). They still talk about points per game, which shows the 13 win Atlanta Hawks as a better defensive team than the 62 win Phoenix Suns (102.5 PPG vs. 103.3 PPG), despite points per 100 possessions (pPTS) showing the Suns to have been the superior defensive team (111.3 pPTS vs. 107.0 pPTS).
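For anyone who wants to see the bonus at work, here is a short Python sketch using the standard eFG% formula, (FGM + 0.5 × 3PM) / FGA. The two shooters and their shot totals are hypothetical, chosen only so that one makes a third of his threes and the other makes half of his twos.

```python
def fg_pct(fgm: int, fga: int) -> float:
    """Plain field goal percentage: makes divided by attempts."""
    return fgm / fga

def efg_pct(fgm: int, three_pm: int, fga: int) -> float:
    """Effective field goal percentage: each made three counts as 1.5 makes."""
    return (fgm + 0.5 * three_pm) / fga

# Hypothetical long range bomber: 300 attempts, all threes, 100 made.
bomber_fg = fg_pct(100, 300)
bomber_efg = efg_pct(100, 100, 300)
bomber_points_per_shot = 100 * 3 / 300

# Hypothetical inside banger: 300 attempts, all twos, 150 made.
banger_fg = fg_pct(150, 300)
banger_efg = efg_pct(150, 0, 300)
banger_points_per_shot = 150 * 2 / 300

print(f"Bomber: FG% {bomber_fg:.3f}, eFG% {bomber_efg:.3f}, points/shot {bomber_points_per_shot}")
print(f"Banger: FG% {banger_fg:.3f}, eFG% {banger_efg:.3f}, points/shot {banger_points_per_shot}")
```

Both made-up players produce exactly one point per shot and the same eFG%, even though plain FG% makes the bomber look far worse.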
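The Hawks and Suns comparison can be unpacked with a quick Python sketch. Points per 100 possessions is just points divided by possessions, times 100, so the two sets of figures quoted above imply how many possessions per game each team’s opponents had. The function names are hypothetical, and the possession counts are values I am backing out from the quoted numbers, not official pace figures.

```python
def points_per_100(points: float, possessions: float) -> float:
    """Points scored (or allowed) per 100 possessions."""
    return 100.0 * points / possessions

def implied_possessions(ppg: float, pts_per_100: float) -> float:
    """Possessions per game implied by a PPG figure and a per 100 figure."""
    return 100.0 * ppg / pts_per_100

# Points allowed per game and per 100 possessions, as quoted in the article.
hawks_poss = implied_possessions(102.5, 111.3)
suns_poss = implied_possessions(103.3, 107.0)

print(f"Hawks opponents: about {hawks_poss:.1f} possessions per game")
print(f"Suns opponents:  about {suns_poss:.1f} possessions per game")
```

On these numbers the Suns’ opponents got several more possessions a night than the Hawks’ opponents, which would explain how Phoenix could allow more points per game while allowing fewer points per possession.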
Instead of putting people on television that use these accurate tools to talk about players, they put on their circus show of performers. At halftime of a nationally televised game they’d rather spend time showing amateurishly doctored images of their commentators that would get booed off of a Fark Photoshop contest. Sports shows feature those that can scream their point the loudest, not the ones that make the most sense (right, Mr. Cuban?). The print media is slowly catching on, but is still behind the times. Only a handful of stat savvy writers have been hired, such as John Hollinger of ESPN.com, Kevin Pelton of SuperSonics.com, and Martin Johnson of the New York Sun.
The media has perpetuated the myth of stats being occasionally useful but largely lacking. I want to throw my TV out the window when I hear something like “Kyle Korver only shot 42% last year, but don’t let that stat fool you.” Ironically it’s not the stat fooling the public, but the person using it.