Recently I started contributing to the New York Times’ NBA blog “Off the Dribble.” My first article, which compares Kobe Bryant’s and LeBron James’s abilities to “make their teammates better,” can be found here:
My apologies for the relative silence at Basketball-Statistics.com during the last few days. I’d like to blame it on the holidays, but that’s actually not the case. I’ve been working on a couple of big things recently, including populating a database of the play-by-play data for every NCAA Division I game played this year. Because of the large number of errors in the official data, this process has been painfully slow and has required a lot of tweaks. I’m just about done, though. The first thing I’d like to do with the data is calculate net plus-minus for every player.
There has been some wonderful work on statistical plus-minus, which estimates plus-minus using box score statistics. However, with the play-by-play data, we can calculate the real thing. Those stats should be released in the next few days (all that is left is to run the numbers). Once that is complete, I can work on creating adjusted plus-minus. So be on the lookout for NCAA plus-minus and other advanced research on college basketball!
The following is part of a weekly series at the Orlando Magic blog, Third Quarter Collapse.
Last week, I tracked the defense of the Magic. Thanks to the positive response from readers, I have decided to do it again. For a full description of each statistic I track and what it means, see last week’s article. Basically, what I did was watch the game last night and keep my own statistics (things that are not in the box score). Defensive statistics are often quite limited, and techniques such as manually charting and looking for certain things are often necessary to get a clearer picture.
Without further ado, here are the numbers from last night:
Again, if you don’t know what any of those things mean, please read last week’s article.
Recently I took a look at the diminishing returns of rebounds, assists, steals, and blocks. As you may or may not have noticed, one common type of statistic was missing: shooting. Today I’m going to fill in the blanks using the same approach as last time.
If you haven’t read the previous article, the premise is simple. For each lineup in the NBA last year that appeared in at least 400 plays, I project how they will do in each stat using the sum of their individual stats. For example, to predict a lineup’s offensive rebound rate, I simply add the offensive rebound rates of each of the five players in the lineup. I then compare this projection to the actual offensive rebounding rate of the lineup. These steps are followed for each lineup and for each statistic.
If there are diminishing returns (i.e. in a lineup of five good rebounders, each player ends up stealing a little bit from his teammates), the slope of the line relating the projected rates to the actual rates will be significantly lower than one. In other words, for each percentage point of rebounding rate a player has individually, he will only add a fraction of that to the lineup’s total because some of his rebounds will be taken away from teammates.
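As a rough sketch of the projection step, here is how a lineup's projected rate could be computed by summing individual rates. The player names, the rates, and the "actual" lineup rate are all made-up values for illustration, not numbers from my data.

```python
# Hypothetical individual offensive rebound rates (share of available
# offensive rebounds each player grabs while on the floor). Made-up values.
individual_orb_rate = {
    "Player A": 0.11,
    "Player B": 0.08,
    "Player C": 0.05,
    "Player D": 0.03,
    "Player E": 0.02,
}


def project_lineup_rate(lineup, rates):
    """Project a lineup's rate as the sum of its players' individual rates."""
    return sum(rates[player] for player in lineup)


lineup = ["Player A", "Player B", "Player C", "Player D", "Player E"]
projected = project_lineup_rate(lineup, individual_orb_rate)

# Hypothetical observed rate for the same lineup (made up for illustration).
actual = 0.26

# If the actual rate falls short of the projection across many lineups,
# that is the pattern diminishing returns would produce.
print(f"projected: {projected:.3f}  actual: {actual:.3f}")
```

A single lineup proves nothing on its own; the comparison only becomes meaningful once it is repeated across every qualifying lineup.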
If this still isn’t clear to you, be sure to check out the old article. Once you’ve done that, this article will make more sense.
Back to shooting. I’ve decided to take a look at the diminishing returns of eight aspects of shot selection/efficiency: three-point shooting percentage, three-point attempt percentage (the percentage of a player’s total attempts that are threes), close (dunks/layups) shooting percentage, close attempt percentage, midrange shooting percentage, midrange attempt percentage, free throw shooting percentage, and free throw attempt percentage.
To project a lineup’s percentage in one of those categories, I can’t simply add up the five individual percentages. For example, a lineup of five 30% three-point shooters is not going to shoot 150% from beyond the arc. Instead, I have to calculate a weighted average for the lineup. That is, each player’s three-point shooting percentage is weighted by the number of threes he took. The same approach can be taken with attempt percentages.
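The weighted average can be sketched as follows. The function name and the sample players are hypothetical; the only thing taken from the text is the idea of weighting each player's three-point percentage by his three-point attempts.

```python
def lineup_three_point_pct(players):
    """Attempt-weighted average of individual three-point percentages.

    `players` is a list of (three_point_pct, three_point_attempts) pairs.
    Returns None if nobody in the lineup attempted a three.
    """
    total_attempts = sum(attempts for _, attempts in players)
    if total_attempts == 0:
        return None
    weighted_sum = sum(pct * attempts for pct, attempts in players)
    return weighted_sum / total_attempts


# Five 30% shooters project to 30%, not 150%.
five_equal = [(0.30, 200)] * 5
print(f"{lineup_three_point_pct(five_equal):.3f}")

# A mixed lineup (made-up numbers): high-volume shooters dominate the average.
mixed = [(0.40, 400), (0.35, 300), (0.30, 100), (0.25, 50), (0.00, 0)]
print(f"{lineup_three_point_pct(mixed):.3f}")
```

Note that a player who never shoots threes (the last entry in the mixed lineup) carries zero weight, so his percentage does not drag the projection down.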
For some statistics, such as free throw percentage, we shouldn’t expect to see any diminishing returns. After all, adding a great free throw shooter to a lineup shouldn’t make the other players in the lineup shoot worse from the foul line. However, with other stats (especially attempt percentages), diminishing returns seem more possible.
To start, let’s take a look at the diminishing returns of three-point shooting percentage:
Here we see the slope is just about 1. However, the standard error for this slope is 0.21, so the results are pretty inconclusive.
How about three-point attempt percentage?
Again the slope is just about 1. This time, though, the standard error is just .04. Therefore, we can say with pretty good certainty that there are no diminishing returns for three-point attempt percentage. In other words, adding a player who likes to shoot threes to your lineup is going to add a proportional amount of three-point attempts to your lineup total.
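To see why the same slope can be inconclusive in one case and convincing in the other, it helps to look at a rough interval of plus or minus two standard errors around the slope. In the sketch below, the standard errors (0.21 and 0.04) come from the two results above, but the slope of exactly 1.0 is an assumption standing in for "just about 1", and the two-standard-error rule is only an approximation.

```python
def slope_vs_one(slope, std_error):
    """Return (distance from 1 in standard errors, rough 95% interval).

    The interval is slope +/- 2 standard errors, a common rule of thumb.
    """
    distance = (slope - 1.0) / std_error
    interval = (slope - 2 * std_error, slope + 2 * std_error)
    return distance, interval


# Slope of 1.0 is assumed for illustration; only the standard errors are real.
_, pct_interval = slope_vs_one(1.0, 0.21)      # three-point shooting percentage
_, attempt_interval = slope_vs_one(1.0, 0.04)  # three-point attempt percentage

print(f"shooting pct: {pct_interval[0]:.2f} to {pct_interval[1]:.2f}")
print(f"attempt pct:  {attempt_interval[0]:.2f} to {attempt_interval[1]:.2f}")
```

Under these assumptions the first interval is wide enough to include slopes that would indicate substantial diminishing returns, while the second leaves very little room for them.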
Up next we have close shooting percentage:
The slope is actually above 1 this time, although it’s less than one standard error away from 1. There definitely is no clear evidence of diminishing returns for close shooting percentage. Adding an efficient player around the basket to your lineup will probably not make your other players less efficient around the basket.
Close attempt percentage:
The standard error for this slope is just .05, so we may be seeing slight diminishing returns. But not much.
Midrange shooting percentage:
The standard error for this one is pretty large (0.15), but again there are no real signs of diminishing returns.
Midrange attempt percentage:
These results are pretty similar to those of close attempt percentage. The slope is less than 1 and the standard error is pretty small. Again, though, the diminishing returns effect appears to be quite small.
Free throw percentage:
As I mentioned in the beginning of the article, we shouldn’t expect to see diminishing returns on free throw percentage, and we don’t.
Free throw attempt percentage:
Just like the rest of the stats we looked at, we don’t really see a hint of diminishing returns for free throw attempt percentage.
Unlike statistics such as rebounds, assists, steals, and blocks, shooting (in all of its forms) doesn’t seem to have the problem of diminishing returns. A player’s shooting efficiency will have a proportional impact on a lineup’s shooting efficiency, and his shooting tendencies will have a proportional impact on a lineup’s shooting tendencies. There are other ways to attack this question, though, and in the future I plan on doing just that.
For my previous piece on the Magic, I charted hustle stats such as deflections, loose balls, missed blockouts, etc. While these things are all important, perhaps the area in which hustle is most important is defense. Although it takes more than just good hustle to be a good defender (as a certain Mr. Howard will show us later), effort is one of the keys to being a good defensive team. Therefore, I decided to track defensive plays in last night’s contest between the Magic and the Pacers (this time, I only kept track of the Magic’s stats). I imagine most (if not all) NBA teams track these on their own, as well as companies such as Synergy Sports.
To see the numbers, click the link below:
The rest of this article will explain what those numbers mean. I will also provide a few observations and notes about the contest.
The first column is “Forced Misses.” This is pretty self-explanatory, although I should explain a few things. First, forced misses don’t only occur on an individual’s man-to-man assignment. A help defender who forces a missed shot receives the credit. Second, I conservatively credited a few forced turnovers as forced misses. These were situations in which a player caused his opponent to turn the ball over (through traveling, bobbling the ball, etc.) by applying good pressure and staying in good position.
The second column is “Baskets Allowed.” This is also self-explanatory, with one caveat. If a player made a bad defensive play that eventually led to someone from the opposing team scoring, he was the one credited with an allowed basket. For example, on one play, Jason Williams allowed his man to penetrate into the lane with ease, forcing the Magic to help and rotate. The Pacers swung the ball and ended up with an easy three-pointer. Although Williams’ man did not receive any points or assists for the play, Williams was penalized for allowing the basket.
The third column is “Good Help D.” This occurred when a player came off his man to either help a beaten teammate or to make a good play such as causing a turnover. When Dwight Howard met the opponent in the lane and forced a miss, he was credited with a “Good Help D.” When Ryan Anderson strayed from his man for a second and reached in and knocked the ball loose from another player, he also received credit for good help defense. As you can see by now, many of these statistics are subjective (which is both a great thing and a bad thing).
The fourth column is titled “BB/MD.” This stands for blow-bys/middle drives. This occurred any time a Magic player allowed his man to drive right past him without the use of a screen (in certain cases, when there was a switch on a screen and the new defender allowed the opponent receiving the screen to drive by, a BB/MD was assessed). A BB/MD did not have to result in a made basket to be counted.
The fifth column is titled “Lost Man.” This was recorded every time a player failed to stay on his man, resulting in a score. This occurred most frequently in one of two ways: either a player simply wasn’t paying attention and allowed off-the-ball movement (such as a backdoor cut) for a score, or a player failed to chase his man quickly enough through screens.
The sixth and final column containing raw data is “Silly Fouls.” While obviously the most subjective of the six categories, it was generally pretty easy to determine. Fouls that occurred off the ball and away from the play were the biggest culprits.
The rest of the data is computed based on those six categories. I included each player’s minutes played to serve as a reference point. “FG% Allowed” was calculated as follows: Baskets Allowed / (Baskets Allowed + Forced Misses). This statistic does not mean the field goal percentage of the man-to-man assignment. Because Forced Misses and Baskets Allowed are not always credited on a man-to-man basis, FG% Allowed is a bit more complicated. Obviously, a lower percentage is a better percentage.
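For anyone who wants to reproduce the calculation, here is a minimal sketch of FG% Allowed. The function name and the sample counts are made up; the formula itself is the one given above.

```python
def fg_pct_allowed(baskets_allowed, forced_misses):
    """FG% Allowed = Baskets Allowed / (Baskets Allowed + Forced Misses).

    Returns None when a player was charged with neither a basket allowed
    nor a forced miss, since the percentage is undefined in that case.
    """
    shots_defended = baskets_allowed + forced_misses
    if shots_defended == 0:
        return None
    return baskets_allowed / shots_defended


# Made-up example: 4 baskets allowed and 6 forced misses.
print(f"{fg_pct_allowed(4, 6):.1%}")
```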
The final six columns are the six raw categories calculated on a per-minute basis. Like last time, I calculated them in the form of “minutes per stat” as opposed to “stat per minute.” This is to avoid presenting very small numbers. For positive stats such as Forced Misses and Good Help D, a lower number is better (in other words, a player achieves these stats more frequently and therefore in fewer minutes on average). For negative stats such as Baskets Allowed and BB/MD, a higher number (or no number at all) is better.
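The “minutes per stat” conversion is simple enough to sketch in a few lines. The function name and the sample numbers are hypothetical; a count of zero is returned as None to mirror the blank cells in the spreadsheet.

```python
def minutes_per_stat(minutes_played, stat_count):
    """Minutes played per occurrence of a stat; None (blank) if it never occurred."""
    if stat_count == 0:
        return None
    return minutes_played / stat_count


# Made-up example: 36 minutes with 4 forced misses and 0 silly fouls.
print(minutes_per_stat(36, 4))  # one forced miss every 9 minutes
print(minutes_per_stat(36, 0))  # blank: no silly fouls recorded
```

The same 36 minutes with 4 forced misses would read as 0.11 per minute in the “stat per minute” form, which is harder to compare at a glance.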
Finally, some observations:
I enjoyed tracking the individual defensive performances of the Magic, and I think this type of information sheds a great deal of light on what’s going on beyond the box score. Although I don’t know what Synergy Sports or the Magic track defensively about their players, I imagine data such as this is of interest to them simply because it’s so useful. I plan on doing this again soon.
Just some personal information that I thought should be out there.
A week ago I tracked the hustle plays in a game between the Los Angeles Clippers and the Memphis Grizzlies. Tracking hustle plays is presumably something most, if not all, NBA teams do. After all, box scores are pretty limited. Even if we use the play-by-play data to do thorough analysis, it still doesn’t include things such as diving for loose balls, deflections, missed blockouts, etc. But teams would like to know these things, so they must track them themselves.
I decided to track the hustle plays during last Saturday’s game between the Magic and the Golden State Warriors. During the game, I kept track of five things. First, I tracked players going for loose balls. In my experience with a college team, we only record plays where a player dove for a loose ball. But since this is the NBA, and effort is often lacking, I include all plays in which a player ends up with the ball, regardless of whether or not he dove. A second thing I track is drawn charges. You can somewhat glean this from the play-by-play data, but it is much easier to just record it yourself.
Thirdly, I kept track of good sprints. I define these as plays in which a player creates a play for himself or others by sprinting the floor and forcing the defense to adjust. For this game featuring the fast-paced Warriors, I had to be more selective in my criteria or else we’d have a lot of good sprints. A fourth thing I tracked for this game was deflections. This is relatively easy to define and track. Basically it includes any deflection that is not recorded as a steal, rebound, etc. Finally, I kept track of missed blockouts. These were most noticeable when they led to an easy offensive rebound, and they were much more rare in this game than in my first one.
Of course, these aren’t all the hustle plays that players can make. Traditional box score stats such as offensive rebounds and steals often reflect hustle plays. Defense is also largely a product of effort, but that is something I will track another time.
Below is a link to a spreadsheet that contains the hustle stats for the Magic-Warriors game. On the left side of each tab are the raw numbers. On the right side are the per-minute numbers. Instead of presenting them as “statistic per minute,” they are presented as “minutes per statistic.” I did this because the numbers are so low. As it turns out, this method is not too difficult to grasp conceptually. For positive statistics such as deflections, a lower number is better (a blank number means the player did not record any deflection at all, which obviously is bad). For negative statistics such as missed blockouts, blank numbers are the best and low numbers are the worst.
I have a few observations about the data:
The Magic did not win this game because they outhustled the Warriors. In terms of effort, both teams were solid and about even. The Magic won, obviously, because of a huge run late in the fourth quarter in which they hit their shots and the Warriors made silly plays.
Up next I’d like to track the defense of the Magic. With a few games in the data set, we may be able to rate the defense of Magic players in other ways besides Defensive Rating, plus-minus, etc.
One thing many people have wondered is whether or not there are diminishing returns for rebounds. Basically, what that would mean is that not all of a player’s rebounds would otherwise have been taken by the opponent; some would have been collected by teammates. Therefore, starting five league leaders in rebounds would probably be overkill because eventually they’d just steal them from each other. At some point, there are only so many rebounds a team can grab, and some are just bound to end up in the hands of the opponent.
This principle is very important to statisticians who wish to develop player ratings systems. These ratings often assign weights to different statistics (including offensive and defensive rebounds), so knowing that a defensive rebound collected by one player would most likely otherwise have been collected by a teammate makes that stat less “valuable” in terms of producing wins.
To test the effect of diminishing returns of rebounds, I decided to go through the play-by-play data (available at Basketball Geek) and compare each lineup’s projected rebounding rates (the sum of each player’s individual rebound rates for the season) to their actual rebounding rates (what percentage of rebounds that lineup grabbed while it was on the floor). After doing some research, I found out a very similar study was done by Eli Witus (formerly of CountTheBasket.com, currently of the Houston Rockets). Before you proceed with the rest of my article, you should read his. Although my method is slightly different, he provides a great explanation of why it’s useful to do the research this way and he also lists some advantages and disadvantages of this method.
Before I show you the results, I should explain the intricacies of my research and also some of the differences between Eli’s study and mine. The individual rebound rates I used were taken from the rebound rates I calculated myself using the play-by-play data. Because both the individual rates and the lineup rates were calculated from the same data, there’s less risk of error due to silly things such as differences in calculations or incomplete data. Also, to reduce the effects of small sample sizes due to lineups that didn’t receive a lot of minutes together, Eli chose to group lineups into bins based on their projected rebound rates. He then regressed each bin’s actual rebound rate on its projected rebound rate (a bin being a collection of different lineups with similar projected rebound rates).
When I was coming up with my idea, I chose to do things a little differently, although the purpose is the same. Instead of grouping the lineups into bins, I simply selected only the lineups that met a minimum qualification for plays. Only lineups that appeared in at least 400 plays were included in my study. This left me with a sample size of 475 lineups. Like Eli, I then regressed the actual rebounding rates on the projected rebounding rates. One final difference between the two studies is that his article was written in February of 2008, so I’m presuming he used data from the 2007-08 season. I’m using data from the 2008-09 season.
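The qualification step can be sketched like this. The Lineup record, its field names, and the sample lineups are all hypothetical; the only thing taken from the study is the 400-play minimum.

```python
from dataclasses import dataclass

MIN_PLAYS = 400  # minimum number of plays for a lineup to be included


@dataclass
class Lineup:
    """Hypothetical record for one five-man lineup."""
    players: tuple
    plays: int
    projected_rate: float
    actual_rate: float


def qualifying_lineups(lineups, min_plays=MIN_PLAYS):
    """Keep only lineups that appeared in at least `min_plays` plays."""
    return [lineup for lineup in lineups if lineup.plays >= min_plays]


# Made-up sample: only the first and third lineups meet the cutoff.
sample = [
    Lineup(("A", "B", "C", "D", "E"), 950, 0.29, 0.27),
    Lineup(("A", "B", "C", "D", "F"), 120, 0.31, 0.22),
    Lineup(("A", "B", "C", "G", "H"), 400, 0.25, 0.26),
]
kept = qualifying_lineups(sample)
print(len(kept))
```

The trade-off against binning is that a hard cutoff throws away low-minute lineups entirely instead of pooling them, which keeps each data point a real lineup at the cost of a smaller sample.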
Offensive Rebound Rate
The graph for Offensive Rebound Rate is below:
The key to understanding this graph is looking at the slope of the line. Here, it is 0.7462 (close to the 0.77 that Eli got). If there were no diminishing returns for offensive rebounds, the slope would be 1. This would mean that for each additional rebound a player could offer to his lineup, he would actually add one rebound to the lineup’s total. If the slope is less than one (such as in this case), it means that each additional offensive rebound by the player adds about 0.75 to the lineup’s total, because some of those would have been taken by his teammates anyway. The slope I have here is pretty high, though, indicating that the diminishing returns effect for offensive rebounds isn’t too strong.
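For readers who want to see where a slope like this comes from, here is a minimal least-squares sketch that regresses actual lineup rates on projected lineup rates. The four data points are made up and constructed so that the slope comes out to 0.75; they are not my data, and the function name is hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up lineups: actual = 0.08 + 0.75 * projected, with no noise.
projected = [0.20, 0.24, 0.28, 0.32]
actual = [0.23, 0.26, 0.29, 0.32]

slope, intercept = ols_slope_intercept(projected, actual)
print(f"slope: {slope:.4f}  intercept: {intercept:.4f}")
```

In this toy example, every extra percentage point of projected rebounding rate shows up as only three quarters of a point in the lineup's actual rate, which is the signature of diminishing returns.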
Defensive Rebound Rate
In his study, Eli found that the diminishing returns effect was much stronger for defensive rebounds. Can I replicate his results? Below is the graph for defensive rebounds:
Eli found a slope of 0.29. Mine was close, but slightly higher at 0.3331. Regardless of the minor difference, we both can come to the same conclusion: there is a much stronger diminishing returns effect at play with defensive rebounds than there is with offensive rebounds. While each offensive rebound adds 0.75 to the lineup’s total, each defensive rebound only adds 0.33, indicating that many defensive rebounds are taken away from teammates. Of course, individual cases can vary.
These results help explain why a lot of player rating systems make defensive rebounds “worth” less than offensive rebounds. Eli has a good explanation of it at the end of the article here. For example, in his PER system, John Hollinger assigns offensive rebounds a value more than double the value of defensive rebounds. This is partly due to the diminishing returns effect we found here today and originally in Eli’s work. As it turns out, my numbers indicate that offensive rebounds are in fact worth a little more than double the value of defensive boards (0.7462 / 0.3331 is about 2.24). So hats off to Hollinger and his many contemporaries who have managed to weight rebounds appropriately.
I could stop here, but I’d like to take this research a little further and see what other insights we can come up with. First, I’d like to break down the data by location (home and away).
One thing to note is that the projected rebounding rates for the lineups are based on overall individual ratings, not just those from home games. If rebounding usually favors the home team, the projected lineup rebounding rates will usually underestimate the actual rates in this case. However, since this would presumably happen for all lineups, we can still take a look at the effect of diminishing returns.
With that being said, how does the home data compare to the overall data? For offensive rebounds, the slope is flatter, indicating a stronger effect of diminishing returns. However, for defensive rebounds, the slope is slightly higher, indicating a lesser effect. The differences are minor, though.
We can also take a look at the away data:
As you would expect given what we now know about the home data, the effect of diminishing returns appears to be much weaker on the road for offensive rebounds. In fact, as we can see, the slope is getting close to 1. This indicates that there isn’t much in terms of diminishing returns for this type of rebound. Intuitively, this makes sense. If teams rebound the ball better at home, there are fewer offensive rebound opportunities for the visiting team. Therefore, it is more likely that an offensive rebound by a visiting player would otherwise have been grabbed by the opponent as opposed to one of his teammates, which in turn makes good offensive rebounders more valuable on the road. The same pattern doesn’t follow for defensive rebounds, though. In both cases, the difference isn’t gigantic, so we should be hesitant to draw any serious conclusions.
The one difference that is large and consistent is the difference in slopes between offensive and defensive rebounds, no matter the location. Confirming what Eli found in his original studies, this data says that the effect of diminishing returns is much stronger on defensive rebounds than it is on offensive ones. Therefore, offensive rebounding is a more “valuable” skill in terms of how you rate players, and some of the best player rating systems do take this into consideration.
So far, this whole article has been about the diminishing returns of rebounds. However, we can also use the same lineup-based approach to look at other statistics. Today I’ll also explore the diminishing returns of blocks, steals, and assists. Eli already used his method to take a crack at the usage vs. efficiency debate, and I recommend you read that article for some fascinating insight.
Block Rate, for a lineup, is defined as the percentage of shots by the opposing team that is blocked by one of the players in the lineup.
Blocks are an interesting statistic to examine. After all, there are only so many block opportunities around the basket and occasionally on the perimeter. When you also take into consideration the fact that teams often funnel players into the waiting arms of a dominant shot-blocker, it seems as though the diminishing return for blocks should be relatively strong. That is, if you add a shot blocker that normally blocks 4% of the opposing team’s shots to your lineup, you shouldn’t expect to block nearly that many more as a team because of diminishing returns. To see if this is true, I used the same methodology that I did for rebounding and came up with this graph:
As it turns out, the slope is at 0.6015. This puts Block Rate somewhere in the middle between Offensive Rebounds and Defensive Rebounds. A lineup full of good shot blockers will almost certainly block more shots than a weaker lineup, but the difference may not be as much as you might think due to effects of diminishing returns.
Up next we have Steal Rate. For an individual, it is defined as the percentage of opponent possessions that end with the given player stealing the ball. Therefore, for a lineup, it would be defined as the percentage of opponent possessions that end with a steal by anyone from that lineup. The graph for Steal Rate is below:
Here, we see the slope is nearly 1. This indicates that there is practically no diminishing returns effect on steals. If you add a player 2% better than average in terms of steals to your average lineup, you should expect to steal the ball almost 2% more than you currently do. Another way to put it is that usually, if a given player steals the ball, it’s not likely that someone else would have stolen the ball if he failed. Of course, like with every graph so far, the R^2 is still very low. This means that we can’t really predict how many steals a lineup will get simply by adding the Steal Rates of all of its players.
Finally, we have Assist Rate. For an individual, it is the percentage of field goals made by a player’s teammates that he assisted on. For a lineup, it means the percentage of made field goals that were set up by an assist. The graph is below:
Of any graph presented on this page so far, this one has by far the lowest slope. Normally this would indicate that there is a huge diminishing returns effect for assists. However, I’m not sold on this explanation just yet for various reasons, so for now I will just present the data as is.
I discussed a number of different issues today, so I think it’s good to recap what I’ve presented. First, using a method similar to the one Eli Witus used at CountTheBasket.com, I found that there is a large diminishing returns effect for defensive rebounds that is significantly larger than the effect for offensive rebounds. This confirms the common belief that offensive rebounds are “worth” more than defensive ones. When we split the data into home and away, it appears that individual offensive rebounding skill is particularly important on the road, indicated by a very high slope on the graph. Finally, I took a look at the diminishing returns of a few other advanced statistics and found the strongest effect on assists and a weaker but still significant effect on blocks.
If you have suggestions or comments about my work, please e-mail me at email@example.com. And again, much credit must go to Eli Witus, who originally thought of these ideas well before I did.
(Note: These stats are updated through November 27. Games from this past weekend aren’t included.)
For those who are unaware, every year I calculate a statistic called Composite Score (numbers are here and here) for each player. Composite Score is a rating system that combines six different advanced statistics, with three measuring offense and three measuring defense. The offensive statistics are Offensive Rating, Offensive Plus-Minus, and PER. The defensive statistics are Defensive Rating, Defensive Plus-Minus, and Counterpart PER (the estimated PER allowed on defense by a player). These numbers can be obtained from Basketball-Reference.com and 82games.com.
Although I can’t compute Composite Score for Magic players just yet (because of the way it’s calculated, I need the stats for every player in the league before I can calculate Composite Score), I can still present how every Magic player has fared in the individual components. I will break things down into offense and defense. Below is a table presenting every Magic player’s offensive performance so far, as measured by the three offensive statistics I mentioned earlier:
Dwight Howard has been excellent as usual. Jason Williams has been a pleasant surprise and has been particularly efficient, posting an Offensive Rating of 123. His Offensive Rating is second on the team to J.J. Redick’s. Believe it or not, the best newcomer offensively this year for the Magic has been Ryan Anderson. Of course, don’t go crazy about his offensive plus-minus just yet. That number can be volatile, and it is particularly unreliable this early in the year. I included it for the sake of completeness, but I rarely use it for reference this early in the season.
By his standards, Vince Carter has struggled offensively. His PER is still relatively good, but his Offensive Rating is below the league average. Before his injury, Jameer Nelson was also failing to meet expectations, but again, he wasn’t bad either. With the exception of Brandon Bass, many of the reserves have struggled somewhat on the offensive end, posting PERs and Offensive Ratings below league average. Following his return from suspension, Rashard Lewis has struggled perhaps as much as anyone else on the team. His efficiency has been well below his usual rates.
Next, let’s take a look at defense:
Again, I wouldn’t draw too many conclusions from the plus-minus numbers. As you can see from the table, if we were to take them at full value, we’d think the Magic is a team that is half defensive superstars and half defensive liabilities.
To start, Dwight Howard has been just as good on the defensive end as he’s been offensively. A Counterpart PER of 13.2 for a center is particularly impressive. Perhaps riding the coattails of players like Howard, Williams has posted good defensive numbers as well. He’s never had the reputation of being a lockdown defender, but his effort has been solid.
Looking down the list of defensive stats, we see nothing out of the ordinary except for a few things. Nelson’s CPER is very high, especially for a point guard. Carter’s, on the other hand, is very low (more on this later). Anthony Johnson’s plus-minus is comically bad, although that’s almost certainly the result of a small sample size. Finally, Matt Barnes looks great defensively according to Defensive Rating, but below average according to CPER. We’ll see how those numbers progress as the season goes on.
Back to Vince Carter’s defense. A couple of weeks ago, I wrote an article comparing his defense to Hedo Turkoglu’s. In the article, I said:
Here, Turkoglu strikes back. Carter looks below average in just about every category, and this supports his reputation. Turk, on the other hand, recorded numbers well above average in every category. The trickiest part about these comparisons is team context. It is something I’ve mentioned constantly when talking about my Composite Score numbers. Because of the way stats are tracked (at least publicly), it’s very difficult to separate a player’s individual contribution to his defense. How much of this is Hedo’s own doing, and how much of it is due to the fact that Orlando featured a very strong all-around defense? It’s hard to say, but I do think Turkoglu was probably a better defender than Carter.
Looking at Carter’s numbers in the early going, we can see that the team you’re on sure has a huge impact on your defensive numbers. He is better in every defensive category. How does he compare to Turkoglu now?
Turkoglu’s Defensive Rating has skyrocketed to 116, but his other defensive stats are still very impressive. This year, it’s hard to tell which player is better on defense. Carter has a low Defensive Rating and his plus-minus is very, very negative, but we don’t know how much that means yet. I think the lesson to take from this is that defensive statistics are pretty unreliable, especially this early in the season.
We’ll have to return to these numbers in about a month or so. The longer we wait, the clearer the picture becomes.