BasketballStatistics.com  Innovative Stats and Analysis  




by Scott Sereday

The 2008-2009 NBA season preview for ESPN the Magazine compared the analysis of players, coaches, general managers, statheads and fans as each defended why they “know best.” This gave me the idea of attempting to use some of these sources to “fill in the gaps” that statistics don’t account for. As statistical measurement and analysis progress, these gaps can be explained in a less biased form. (Of course, a huge benefit of statistical analysis is discovering and confirming what visual observation does not easily reveal, but it is also important to realize what is and isn’t being measured in any evaluation.)

In this article, I will first develop statistical offensive and defensive plus/minus models using advanced statistics, almost all of which can be found on 82games.com. Then I will compare them to estimated ratings for coaches, scouts and the media. I will discuss the results of my studies in the following 4 part series:

Part 1: Development of Offensive Adjusted Plus/Minus estimates using advanced statistics

Part 1: Development of Offensive Adjusted Plus/Minus estimates using advanced statistics.

A few years ago, Dan Rosenbaum used statistical models to estimate adjusted plus/minus. These results and relevant discussion can be viewed at the following sites:

Dan’s second model will be my primary source of comparison. He used data from the 2002-2003 through the 2004-2005 seasons in this model. I used data from the 2005-2006 through the 2007-2008 seasons in most of my models. I obtained this data from basketballvalue.com and made slight adjustments when it was inconsistent with outside sources. I used some data from databasebasketball.com to check for reasonableness. Since the data on this site for the 2008-2009 season is not currently available, I have excluded that season from most of my estimates. However, I have included some results at the end for reference purposes.
Analysis:

In my first model I used the independent variables that Dan used, with a couple of modifications. I eliminated several insignificant variables and used statistics per 100 possessions rather than per 40 minutes. I also didn’t use minutes per game as a variable, since it can be viewed as a product of how coaches see the game. Finally, I didn’t weight by player minutes, player possessions or any clutch/garbage-time aspects.

Model 1 – Offensive Plus/Minus estimate using baseline statistics
*True Shot Attempts is defined as 0.44 × Free Throw Attempts + Field Goal Attempts. Height is measured in inches.

To clarify the meaning of this table, each (estimated) coefficient is the number of additional points a team is expected to score per 100 offensive possessions for each additional unit of the corresponding statistic. For example, suppose that Player A and his Team A are identical to Player B and his Team B in every way except that Player A accumulates one more offensive rebound per 100 offensive possessions than Player B. This model implies that when Player A is on the floor, Team A is expected to score 0.778 more points per 100 possessions than Team B is expected to score when Player B is on the floor.

The P-value is approximately the probability of obtaining a coefficient at least as extreme as the value observed if the true coefficient being estimated in the model is equal to zero. Thus, in the above model, if steals truly had no effect, an estimate this far from zero would still arise about 25.6% of the time, so the coefficient for steals cannot be reliably distinguished from zero.

Notice that these coefficients are typically close to Dan’s coefficients, except for free throw attempts and steals. Could these differences stem from the impact of these two statistics on team pace, given that I used points per possession and Dan used points per minute? The large standard error of the coefficient for steals could also play a role in this difference. The average value of some of these statistics could also differ from year to year. Some additional analysis of the meaning of these coefficients can be found in the apbrmetrics thread provided earlier in the article.

For my next step I grouped shots and assists into close shots, 2 point jumpers and 3 pointers, each split into assisted and unassisted. Although they may not exactly reflect the values I used, most advanced statistics can be found on 82games.com. I defined close shots as layups, dunks, finger rolls, reverses, running shots, hook shots and bank shots.
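The definitions used for Model 1 (True Shot Attempts, rates per 100 possessions, and the P-value of a coefficient) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the season totals are hypothetical, and the P-value uses a two-sided normal approximation, which may not match the exact test behind the reported 25.6%.

```python
import math

def true_shot_attempts(fga, fta):
    """True Shot Attempts as defined in the text: 0.44 x FTA + FGA."""
    return 0.44 * fta + fga

def per_100_possessions(stat_total, possessions):
    """Rescale a raw total to a rate per 100 possessions."""
    return 100.0 * stat_total / possessions

def two_sided_p_value(coefficient, standard_error):
    """Two-sided P-value under a normal approximation (an assumption here)."""
    z = abs(coefficient / standard_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical season totals for one player (not taken from the article's data).
fga, fta, offensive_rebounds, possessions = 900, 400, 150, 4000

tsa_rate = per_100_possessions(true_shot_attempts(fga, fta), possessions)
oreb_rate = per_100_possessions(offensive_rebounds, possessions)

# Model 1's coefficient for offensive rebounds, quoted in the text.
OREB_COEFFICIENT = 0.778
# Expected team scoring difference if this player had one more offensive
# rebound per 100 possessions, everything else held equal.
extra_points = OREB_COEFFICIENT * ((oreb_rate + 1.0) - oreb_rate)

print(f"TSA per 100 possessions: {tsa_rate:.2f}")
print(f"OREB per 100 possessions: {oreb_rate:.2f}")
print(f"Extra points per 100 possessions: {extra_points:.3f}")
```

Under this approximation, a coefficient roughly 1.14 standard errors from zero gives a P-value near 0.256, which is the kind of estimate the steals coefficient represents.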
I combined all missed field goals into one variable and all assists on jump shots into another, since there was not a substantial difference in coefficients between the variables in each group.

Model 2 – Offensive Plus/Minus estimate using advanced statistics
The first thing that jumped out at me is that getting to the free throw line is so valuable that a missed free throw may actually increase expected scoring about as much as it limits scoring. Obviously, missing an “and 1” only adds value above that of not attempting the free throw at all. Perhaps more importantly, the tendency to get to the line may limit defensive aggressiveness.

It is important to realize, however, that much of the value of drawing a foul is not captured, for several reasons. Getting to the free throw line can increase the number of bonus attempts, many of which might be attempted while the player is not on the floor, thus not contributing to his plus/minus and creating a negative bias in the measured coefficient. Perhaps more importantly, free throw attempts often do not account for situational value. Many times, missing free throws can increase the expected scoring output but lower the expected win probability, as in end of game situations.

Another observation is that an assist is much more valuable on close shots and an assisted shot is much more valuable if it is from long range. This is likely explained by the difference in how much an assist-type pass raises scoring expectation for a 3 point shooter versus a layup shooter. A pass to a direct layup attempt greatly improves scoring chances over the expectation before that pass was completed. However, a pass directly leading to a 3 point shot only moderately increases scoring expectations, so much of the weight should be attributed to the 3 point scorer.

Interestingly, the coefficient for an unassisted 3 pointer actually exceeds 3 points. This could reflect players who create shots in difficult “clutch” situations. In many instances, an unassisted 3 pointer is not taken unless it needs to be (end of the shot clock, etc.), so a player who takes such shots will often miss shots that carry a lower negative impact on the team.
Unassisted 2 point jumpers seem to be the least valuable shots after including the value of the assist. Perhaps they allow the defense to collapse and defend more areas of the half court.

Although I mention that the above statistics have “value,” it is important to realize that the coefficients indicate the value of a player who accumulates the relevant statistic rather than the value of the statistic itself. Additionally, a player’s value changes from situation to situation. Most plus/minus figures estimate the average of these values.

Another key factor could be the relationship between the statistical variables themselves. For example, a player who collects more offensive rebounds can also be expected to have more unassisted close shots or putbacks. Thus the value attributed to offensive rebounds in Model 1 was at least partially absorbed by the value of unassisted close shots in Model 2.
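The way one correlated variable can absorb the coefficient of another can be illustrated with a small regression on synthetic data. This is a minimal sketch, not the article's model: the data are randomly generated, the "true" effects (0.3 and 0.9) are invented for the illustration, and the helper names `solve_linear` and `ols` are hypothetical.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

random.seed(0)
n = 2000
# Synthetic per-100-possession rates: close shots rise with offensive rebounds.
oreb = [random.gauss(3.0, 1.0) for _ in range(n)]
close = [0.8 * o + random.gauss(0.0, 0.5) for o in oreb]
# Invented "true" effects: 0.3 per offensive rebound, 0.9 per close shot.
y = [0.3 * o + 0.9 * c + random.gauss(0.0, 1.0) for o, c in zip(oreb, close)]

coef_without = ols([oreb], y)        # close shots omitted, as in a baseline model
coef_with = ols([oreb, close], y)    # close shots included

print(f"OREB coefficient without close shots: {coef_without[1]:.2f}")
print(f"OREB coefficient with close shots:    {coef_with[1]:.2f}")
```

When close shots are left out, the offensive rebound coefficient picks up part of their effect; once they are included, it falls back toward its own contribution. The same mechanism would explain a lower offensive rebound coefficient after unassisted close shots enter the model.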
For the next step, I attempted to add 2 new variables to my regression model. First, I added Dan’s versatility index of (assists × points × offensive rebounds)^(1/3). I then added a variable for the interaction between height (in inches above 60) and true shot attempts. I reasoned that a tall player who takes many shots would distract shot blockers from disrupting the offense.

Model 3 – Offensive Plus/Minus estimate using advanced statistics and interactions
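As a sketch of the two variables added for Model 3, the following Python computes the versatility index and the height interaction exactly as they are described in the text. The function names are hypothetical, and it is an assumption that the inputs are the same per-100-possession rates used for the other variables.

```python
def versatility_index(assists, points, offensive_rebounds):
    """Geometric mean of the three statistics: (AST x PTS x OREB)^(1/3)."""
    return (assists * points * offensive_rebounds) ** (1.0 / 3.0)

def height_shot_interaction(height_inches, true_shot_attempts):
    """Interaction of height (inches above 60) with true shot attempts."""
    return (height_inches - 60) * true_shot_attempts

# Hypothetical per-100-possession rates for two players.
balanced = versatility_index(assists=8.0, points=27.0, offensive_rebounds=1.0)
one_dimensional = versatility_index(assists=10.0, points=25.0, offensive_rebounds=0.0)

# A 7-foot (84 inch) player taking 20 true shot attempts per 100 possessions.
tall_shooter = height_shot_interaction(height_inches=84, true_shot_attempts=20.0)

print(f"balanced: {balanced:.2f}, one-dimensional: {one_dimensional:.2f}")
print(f"height x TSA interaction: {tall_shooter:.1f}")
```

Because the index is a product, a player with a zero in any one category scores zero overall, which is what makes it a measure of versatility rather than of total production.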
This model indicates that, given a similar result, the taller a player is, the more his offensive involvement can help his team. Notice that the height interaction variable decreases the “value” attributed to offensive rebounds, while it increases the “impact” of steals and turnovers, among others. This is the result of the correlation between these statistics and height (and possibly shot frequency).

When I compared the impact of the versatility index, I noticed that adding this statistic did not have any real impact on the coefficients for unassisted 2 and 3 point jump shots made or for turnovers. It also had little impact on free throws, missed shots and steals. However, the impact on the values of assisted field goals made, unassisted close shots made, assists and offensive rebounds was reasonably large. This means that the value of someone creating an outside shot is not enhanced by the ability to distribute and/or grab offensive rebounds. However, the value of someone who creates unassisted close shots is enhanced if he can also distribute the ball and/or collect offensive rebounds. This could be an indicator of the importance of making (or creating) good decisions specific to the situation.

In the end, I decided that the interaction of assists, offensive rebounds and points was not significant enough to keep and that it seemed to add too much variability to the results. After eliminating some insignificant variables, I came up with my final model.

Model 4 – Final Offensive Plus/Minus estimate using advanced statistics
The following table summarizes a sample of the 2008 offensive plus/minus estimates for Model 1 (statistical), Model 4 (advanced statistical) and adjusted plus/minus based on the article by Barzilai/Ilardi at http://www.82games.com/ilardi2.htm.
Both the statistical estimates and the adjusted plus/minus have statistical error due to sampling and to the models themselves. Since the Barzilai/Ilardi model has a smaller error than the actual adjusted plus/minus model, I used it in my comparison of statistical and advanced statistical plus/minus. (It must be noted that the Barzilai/Ilardi model is less valid since it contains some weight from performance of prior years.) I grouped the players above into 4 groups based on the proximity of the Barzilai/Ilardi model to each statistical model.

Advanced statistical model is significantly further: JR Smith

Advanced statistical model is further: Tim Duncan, Dirk Nowitzki, Ron Artest, Caron Butler and Manu Ginobili

Advanced statistical model is closer: Kobe Bryant, Rashard Lewis, Paul Pierce, Dwight Howard and Chris Bosh

Advanced statistical model is significantly closer: Steve Nash, Hedo Turkoglu, Baron Davis, Amare Stoudemire and Rasheed Wallace

The first thing to notice is that teammates appear often on these lists. The estimates of the Spurs stars are closer in the regular statistical model than in the advanced statistical model. Ginobili’s value is greater than the Barzilai/Ilardi model estimate while Duncan’s value is less. (Tony Parker’s value is also greater.) The Magic’s advanced statistical estimates are closer to the Barzilai/Ilardi estimates than the regular statistical estimates. Lewis’ and Turkoglu’s values increased while Howard’s offensive value decreased. Similarly, Nash’s estimated value increased and Stoudemire’s value decreased to more closely match the Barzilai/Ilardi estimates. That means that half of the players on the above list (8 of 16) are teammates. Paul Pierce, Kobe Bryant and Rasheed Wallace also have teammates on this list. (Pierce was not on Garnett’s team in 2006 or 2007 and Bryant played much of 2008 without Bynum.) There are no teammates exclusively without significant differences between the models.
This could strongly suggest that there is a problem with the lineup interactions of some of these teams. Alternatively, it could suggest that there is an unusual team statistical bias on these teams.

For example, the regular statistical model suggests that Amare Stoudemire’s offensive value is 7.5 points per 100 offensive possessions compared to Steve Nash’s offensive value of 5.0. Amare has always had the better raw statistics, but somehow Nash has those 2 (almost 3) MVPs, almost purely based on offensive performance. Many suggested that Nash “made things go” and credited him with the turnaround the Suns experienced coinciding with his acquisition. Some felt his type of player was underappreciated (at the time). However, when confronted with the statistics and the “production,” most would say that Nash was not likely even the best player on his team, granting this honor to Amare. Many claimed that Marion was also better than Nash.

However, Nash’s plus/minus figures are always much better than his raw stats indicate. Furthermore, his advanced statistics show that he has a ton of close assists; his home team is among the stingiest at awarding assists (these could be correlated); and he creates an extremely high percentage of his own shots and shoots a very good percentage on them. Sure enough, after accounting for advanced statistics, the new estimates credit Nash with contributing 8.4 points above average and Amare with contributing 5.6 points above average. The adjusted plus/minus models have Nash even higher than this estimate and Amare even lower.

Finally, I have provided the Offensive Plus/Minus estimates for the 2008-2009 season.
If you have any questions or comments please feel free to contact me at ssereday@gmail.com.
Copyright © 2009 BasketballStatistics.com 
