Something I’ve wanted to do for a while, and something I imagine every team does, is watch a game from start to finish and track all of the hustle plays made by both teams. In a league in which every player is not always giving 110%, sometimes a little bit of hustle and effort can go a long way.
With that in mind, the game I chose to track was Sunday afternoon’s contest between the Los Angeles Clippers and the Memphis Grizzlies. The Grizzlies have been playing well lately, including an impressive victory over Portland. Altogether, they came into the game having won five of their last seven contests. The Clippers were also relatively hot, having won three of their last four. As one would expect from a Grizzlies-Clippers game, the matchup was pretty lackluster through three quarters in terms of excitement and competitiveness. Then the fourth quarter arrived and the Clippers exploded, including a 22-0 run that won them the game. Los Angeles outscored Memphis 33-7 in the fourth quarter. When it was all said and done, the game ended up being a very memorable one for the Clippers and one the Grizzlies would like to forget soon.
Hustle, of course, played a large role. Below is a link to a spreadsheet which has the results of the statistics I tracked for the game. Those statistics included loose ball attempts, charges drawn, good sprints down the court (on either offense or defense), deflections, and missed blockouts.
A couple of non-hustle related notes:
As players get older, the belief is that they learn the tricks of the trade and get better at defense. During their first few years, they’re ill-equipped and unable to have a positive impact on defense, despite their superior athleticism and energy.
Do the numbers support these beliefs? We must turn to the always-useful Basketball-Reference.com. Using its Player Season Finder, I put together a spreadsheet containing every season from every player (minimum 500 minutes played) for the past five years. Using this data, we can see how Defensive Ratings change as players get older. Defensive Rating was developed by Dean Oliver, and it estimates the number of points a player allows per 100 possessions. Obviously, a lower number is better. To read more about it, check out the Basketball-Reference glossary. Let’s take a look at the chart:
I limited the age range from 19 to 36 to avoid outliers. On the x-axis, we have the age, and on the y-axis, the average Defensive Rating for that age. The results seem to confirm the common belief. Younger players tend to post higher (worse) Defensive Ratings than older players. Real life doesn’t work perfectly, so there are some fluctuations. However, the correlation is strong, indicated by the relatively large R^2 (explanation here). Therefore, there does appear to be something to the notion that players get better defensively as they get older.
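For readers who want to try this at home, here is a minimal sketch of the averaging and trend-line steps described above. The player-seasons are made up for illustration (they are not the Basketball-Reference data), and the helper names `average_by_age` and `fit_line` are my own placeholders rather than anything from the original spreadsheet.

```python
from collections import defaultdict

# Hypothetical (age, Defensive Rating) player-seasons; NOT real data.
seasons = [
    (21, 108.0), (21, 110.0), (25, 107.0), (25, 105.0),
    (30, 104.0), (30, 106.0), (35, 103.0), (35, 101.0),
]

def average_by_age(rows):
    """Return {age: mean Defensive Rating} for the given player-seasons."""
    grouped = defaultdict(list)
    for age, drtg in rows:
        grouped[age].append(drtg)
    return {age: sum(vals) / len(vals) for age, vals in sorted(grouped.items())}

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

averages = average_by_age(seasons)
slope, intercept, r_squared = fit_line(list(averages), list(averages.values()))
print(averages)
print(f"slope = {slope:.3f} rating points per year of age, R^2 = {r_squared:.3f}")
```

A negative slope here means the average Defensive Rating falls (improves) with age. Note that the line is fit to one averaged point per age, so each age counts equally no matter how many players are behind it.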
We can also produce a similar graph using Defensive Win Shares, a measure related to Defensive Rating (for more information, check the glossary again). Basically, DWS is the number of wins a player adds to his team through his defense. The chart is below:
The R^2 is slightly smaller, but the general idea is the same. Players get better defensively as they get older. Not considerably so, but statistically significantly so.
However, we must approach these results with caution. Let’s say, hypothetically, that big men generally have lower Defensive Ratings. Let’s also say, hypothetically, that big men stay in the league longer than their shorter counterparts. These two scenarios would combine to make it look like players get better defensively with age. What’s a simple way to account for complications such as this? Take a look at the data position by position.
To start, let’s look at centers:
The results appear to be clear as day here. The line is a little wavy, but centers sure seem to get better defensively as they get older. The average for 35-year-olds is over three points per 100 possessions lower than the averages for 19-, 20-, and 21-year-olds. Do power forwards react the same way to age?
Simply put, yes. These results tend to go with common logic. Many raw and young big men commit silly fouls, ignore help defense, go for the spectacular block too often, etc. However, we should not treat these results as gospel, as I will explain later.
How about small forwards?
Just like the previous two positions, it appears small forwards age well, at least on the defensive end. The magic number for this position appears to be 29. Small forwards that were at least 29 years of age during the last five seasons performed much better on the defensive end than their younger counterparts did. Let’s take a look at shooting guards:
We keep seeing the same results. No matter what position you look at, the story is the same. Players get better on defense as they get older. Finally, let’s take a look at the inevitable and see how point guards get better defensively with age:
Whoops. That trend line has an oh-so-slightly negative slope, but it’s not exactly a great fit for the data (the R^2 is practically 0). Clearly, then, point guards don’t follow the same path as other positions. Older is not better in this case. For a position that often relies so much on speed and quickness, this makes sense. However, even point guards in their prime (around the age of 27) don’t perform significantly better than the young ones.
To wrap this up, we can make the following statement based on the data: Except for point guards, players generally get better on the defensive end as they get older. However, there are a number of issues to address before we go too far and actually believe that bold statement I just made:
UPDATE: After doing some more research, we may have to re-think things. Thanks to suggestions by Ryan Parker and Mike G at the APBRmetrics board, I decided to plot the average change in Defensive Rating (the difference between the current year and the last) for each age. It is below:
Looking at the graph above, we notice a couple of things. First, over the last five years, players of all ages tend to get worse defensively on a year-by-year basis. Whether it’s because of improving offenses or declining defenses, scoring has increased during each of the last five years.
More importantly for this study, we see that older players are declining faster than younger players are. For example, during the last five years, a 26-year-old is likely to have a Defensive Rating 0.5 points higher than he did a year ago. On the other hand, a 35-year-old is likely to have a Defensive Rating 1.5 points higher than he did a year ago. The difference between old and young isn’t much, but we can probably say that old isn’t definitively better than young.
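The year-over-year approach is easy to sketch. The idea is to compare each player only with himself a season earlier, then average those changes by age. The rows below are invented (I picked values that mirror the 0.5 and 1.5 figures above), and the column layout is an assumption, not the layout of my actual spreadsheet.

```python
from collections import defaultdict

# Hypothetical rows: (player, season_end_year, age, Defensive Rating).
rows = [
    ("Player A", 2008, 25, 106.0), ("Player A", 2009, 26, 106.5),
    ("Player B", 2008, 34, 103.0), ("Player B", 2009, 35, 104.5),
    ("Player C", 2008, 25, 108.0), ("Player C", 2009, 26, 108.5),
]

def average_change_by_age(rows):
    """Average (this season - last season) Defensive Rating, keyed by current age."""
    by_player_season = {(p, yr): (age, drtg) for p, yr, age, drtg in rows}
    changes = defaultdict(list)
    for (player, year), (age, drtg) in by_player_season.items():
        previous = by_player_season.get((player, year - 1))
        if previous is not None:  # skip seasons with no prior year to compare to
            changes[age].append(drtg - previous[1])
    return {age: sum(v) / len(v) for age, v in sorted(changes.items())}

change_by_age = average_change_by_age(rows)
print(change_by_age)
```

Because every change is measured within a single player, a player who drops out of the league simply stops contributing data instead of dragging down the average for his age group.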
Like I said in my original post, selection bias may be a problem. After all, this most recent research doesn’t dispute the fact that as a whole, when you look at all the old players, they tend to be better defensively than the young players. But that’s not because they got better as they got older. The data shows this. What we may be able to say now is that aging doesn’t improve your defensive abilities, and if you want to stay in this league as a veteran, you’d better be good at defense, because teams will “selectively remove” you from the league if you’re not.
The following is part of a weekly series at the Orlando Magic blog, Third Quarter Collapse.
A couple of weeks ago, Eddy Rivera e-mailed me this:
“I was wondering if you could look at the progression of the Orlando Magic defense this year, in comparison to how the team progressed in its first month under Stan Van Gundy when he arrived in 2007. The reason why I ask is because that’s the first year the Magic were adjusting to SVG’s defensive scheme and eventually, they were ranked 6th in defensive efficiency. Given that this year is a new team of sorts, with so many new players, I wanted to see how the squad was adjusting on defense (SCHOENE projects them to finish 5th).”
It’s about time to take a look at this question. With 14 games (through Sunday) now under their belts, the Magic have developed at least a tiny bit of a sample to look at their defense.
Eddy’s question seems pretty straightforward at first. To find the answer, shouldn’t we just look at how many points the Magic are giving up each game this season? Well, we already know that’s not going to work because you have to factor in pace. Once you factor in pace, though, the study is still lacking. After all, if the Magic play a bunch of offensively inept teams in games 1-7 and a lot of great offensive teams in games 8-14, it’s going to look like their defense is getting worse no matter what. So we must also factor in the level of competition.
With that setup in mind, I took a look at the Magic’s defensive progression through 14 games for each of the last three seasons. For each year, I calculated the points scored per 100 possessions of each opponent and compared that to their season average. I called that difference (between the game total and the season average) “Defensive Score.” I then plotted, for each season, the game number versus the Defensive Score for the first 14 games. Let’s start by taking a look at 2007-08, Van Gundy’s first season with the Magic:
As you can see, the Magic were all over the place in their first 14 games, producing a wide range of Defensive Scores. They allowed some teams to score nearly 30 points per 100 possessions above their season average (game #13 against San Antonio) but also held some teams to more than 30 below their season average (game #10 against New Jersey). The fact that the two performances I just mentioned came in the same week shows how up and down the Magic were as they were adjusting to the defensive schemes of their new head coach. I included a trend line in the chart, but don’t pay too much attention to it because we can see from the line’s information (on the right side) that it is a terribly poor fit. In other words, there was no real progression (either good or bad) from the Magic in the first 14 games of 2007-08.
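If you want to compute a Defensive Score yourself, the calculation described above boils down to a few lines. This is only a sketch: the game numbers are invented, and I’m assuming you already have a possession count for each game (how you estimate possessions is a separate question).

```python
def points_per_100(points, possessions):
    """Convert raw points to points per 100 possessions (removes pace)."""
    return 100.0 * points / possessions

def defensive_score(opp_points, possessions, opp_season_avg):
    """Opponent's efficiency in this game minus its season average.

    Negative means the defense held the opponent below its usual output.
    """
    return points_per_100(opp_points, possessions) - opp_season_avg

# Hypothetical games: (game number, opponent points, possessions,
#                      opponent's season points per 100 possessions).
games = [(1, 102, 92, 105.0), (2, 88, 90, 108.0), (3, 110, 95, 104.0)]

scores = [(g, round(defensive_score(pts, poss, avg), 1)) for g, pts, poss, avg in games]
for game_number, score in scores:
    print(f"game {game_number}: Defensive Score {score:+.1f}")
```

Plotting `game_number` against `score` gives the kind of chart discussed here, with points below zero representing good defensive nights.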
How about 2008-09? Let’s take a look at the chart:
From the get-go, the Magic were dominating opponents on defense. Most of the points on the chart are below 0, meaning the Magic were almost always holding their opponents to lower than their season average. In addition, there weren’t any real stinkers. Now in his second year at the helm, Van Gundy had his defense at midseason form early in 2008-09.
Finally, let’s look at this year’s Magic team, a squad that has certainly had its struggles on defense:
True to form, the Magic were quite poor in their first seven games this year (with decent performances in the middle). However, things started to change in games 8-9, when the Magic at least held their opponents to close to their season averages. Lately, though, they’ve been playing great defense. In four of their last five games, Orlando has held their opponent below their season average. The one slipup was November 16, when the normally putrid Bobcats were able to put up a few points in Orlando. Overall, though, there appears to be a clear progression and a sign that the Magic’s defense is getting better. Unlike the last two trend lines, which had very poor line statistics, this line appears to be a pretty good fit. If you want the details as to why and are unfamiliar with R^2, click on this link, read about it, and check back here. Basically, there does appear to be something positive going on with the Magic defense.
I think these graphs are pretty enlightening. They show that this year’s Magic defense just needs time to get to its 2008-09 levels. I will check back in with these numbers in the future.
Last week, I calculated my own version of various advanced statistics, such as Rebound Rate, Assist Rate, and Usage Rate. The difference between my versions and the ones you normally see is that mine were based on actual play-by-play data, rather than estimates. Although my method isn’t perfect (partly because the play-by-play isn’t always reliable), I figured it was more accurate to base our stats on stuff that has actually happened as opposed to estimates of what happened.
Under that assumption, the question is how accurate are the numbers we’ve grown to know and love? Although they’re not too difficult to calculate, the play-by-play figures aren’t always available, so we need to know if we can count on the data that is most common. How far off are these estimations? Are there certain types of players for which these stats are usually inaccurate?
To recap, these are the stats in question:
Let’s start with a simple test. How well do the estimated numbers correlate with the play-by-play numbers? Below is a table that includes the R^2 (explanation) and standard error of each linear regression, as well as the average difference between the two types:
Thankfully, we see that all of the estimations appear to be pretty darn accurate. The R^2’s are all extremely high, and the standard errors are low. Of the seven stats I’m examining, Steal Rate appears to be the most inaccurate. It fares the worst in each of the three table columns. Overall Rebound Rate appears to be the most accurate. From this table, we are given no reason to doubt the validity of the box score estimations.
Although they may be accurate as a whole, perhaps these numbers are inaccurate just for certain players. Specifically, I was wondering if players that rate either really high or really low in a certain statistic are generally rated accurately by the box score estimation. To try to answer that question, I ran another regression. This time, the box score estimation was the independent variable, and the difference between the box score and play-by-play was the dependent variable. The results are in the table below:
There are some things to look out for. Although the adjusted R^2’s are all quite low, even negative sometimes, the slopes are all positive. This would indicate that as a given player gets better in a certain statistic, the box score data is more likely to overrate him in that category. The biggest problems occur with Assist Rate, which has a moderately sized R^2 value.
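To make the second regression concrete, here is a rough sketch with made-up numbers. The box score estimate is the x value, the estimate minus the play-by-play value is the y value, and a positive slope means the overrating grows as the estimate grows. The adjusted R^2 formula is the standard one for a regression with `p` predictors, and it shows how the adjusted figure can dip below zero when the fit is poor.

```python
# Hypothetical (box score estimate, play-by-play value) pairs for one stat.
pairs = [(10.0, 10.2), (20.0, 19.4), (30.0, 29.3), (40.0, 38.2), (50.0, 48.4)]

def error_regression(pairs):
    """Regress (estimate - play-by-play) on the estimate; return slope and R^2."""
    xs = [est for est, _ in pairs]
    ys = [est - pbp for est, pbp in pairs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # With a single predictor, R^2 equals the squared correlation.
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

def adjusted_r_squared(r_squared, n, p=1):
    """Penalize R^2 for the number of predictors; can be negative."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

slope, r_squared = error_regression(pairs)
adj = adjusted_r_squared(r_squared, len(pairs))
print(f"slope = {slope:.3f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adj:.3f}")
```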
If that table doesn’t seem intuitive, I’ve also decided to present the results graphically. In each chart below, the x-axis is the box score estimate’s value, and the y-axis is the difference between the estimate and the play-by-play calculation.
All three Rebound Rates look pretty accurate, although they become more unpredictable as the numbers get high, especially with respect to Defensive Rebound Rate. When the Rate is around 10, the errors are pretty closely scattered around 0. However, when you get to 17.5 or 20, the errors become larger.
As I mentioned before, Assist Rate seems to have some major issues. For low Assist Rates, the differences are pretty small. However, when you get to the top assist men, the differences can be quite large. For example, Chris Paul’s Assist Rate for last season, according to the box score data, was 54.5. However, the play-by-play data has it at 51.2. For someone like him, where the number is astronomically high no matter which method you choose, the difference might seem trivial. But it does appear that top assist men are overrated the most by Assist Rate.
There’s not much to gather from the Steal Rate chart, although it becomes clear that my play-by-play computations are generally lower than the box score estimates.
Like Rebound Rate, Block Rate becomes particularly difficult to estimate when the numbers get high. As a percentage of the Block Rate, though, the difference is actually pretty consistent.
Finally, we have Usage Rate. There aren’t any major issues except for one outlier at the bottom, which is the result of complications due to the weirdness of Luc Richard Mbah a Moute’s name (seriously).
In conclusion, my research has shown me that, despite some minor issues, the box score estimations of things such as available rebounds are actually pretty close. They aren’t always perfect, and they can be particularly unreliable when the numbers get large, but overall they do a good job. Hopefully this work will provoke discussion on how we can continue to perfect those stats.
Another quick update. I removed the 0.44 estimator I was using for Steal Rate and Usage Rate to calculate possessions. Instead, I totaled the possessions from the play-by-play data itself. The updated numbers are below.
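For anyone unfamiliar with the 0.44 estimator, the sketch below contrasts a common simplified box score possession estimate with a direct count. Both the simplified formula and the event format are assumptions for illustration: the list simply records which team had the ball on each logged event, which is not how real play-by-play files look, and the run-counting shortcut would miss cases where one team has back-to-back possessions (for example, across a quarter break).

```python
from itertools import groupby

def estimated_possessions(fga, fta, orb, tov, ft_weight=0.44):
    """Simplified box score estimate: shots + turnovers + weighted FTA - offensive boards."""
    return fga + ft_weight * fta + tov - orb

def counted_possessions(offense_by_event, team):
    """Count maximal runs of consecutive events with `team` on offense."""
    return sum(1 for offense, _ in groupby(offense_by_event) if offense == team)

# Hypothetical log: the team on offense for each recorded event, in order.
log = ["ORL", "ORL", "MEM", "ORL", "MEM", "MEM", "ORL"]

print(counted_possessions(log, "ORL"))
print(round(estimated_possessions(fga=80, fta=25, orb=10, tov=14), 1))
```

The 0.44 weight exists because not every free throw attempt ends a possession; counting possessions directly sidesteps that guess.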
Some of the best stats out there, ones that most fans familiar with advanced stats know about, are actually based on estimates using box score data. For example, when we calculate Marcin Gortat’s Offensive Rebound Rate, we’re trying to determine what percentage of available offensive rebounds he collected while he was on the court. However, we don’t really know how many rebounds were available. We have to estimate based on how things usually go for the Magic and their opponents, and assign a portion of that to Gortat.
Using box score data, that’s the best we can do. But we also have play-by-play data, and we don’t have to estimate. We (actually, a programming script) can go through the hundreds of thousands of recorded plays from the NBA 08-09 season, and find how many of those resulted in offensive rebound opportunities for Gortat. From there we just total how many offensive boards he had, and divide that by the number of available ones.
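Here is a small sketch of the two approaches side by side. The box score version follows the general shape of the formula in the Basketball-Reference glossary as I understand it (check the glossary for the authoritative definition), while the play-by-play version just divides rebounds by the chances that actually occurred with the player on the floor. All of the numbers are invented and are not Gortat’s.

```python
def orb_rate_estimate(orb, mp, team_mp, team_orb, opp_drb):
    """Box score estimate: scale team-wide rebound chances by the player's share of minutes."""
    return 100.0 * (orb * (team_mp / 5)) / (mp * (team_orb + opp_drb))

def orb_rate_play_by_play(orb, chances_on_court):
    """Direct version: offensive boards divided by chances while the player was on the court."""
    return 100.0 * orb / chances_on_court

# Hypothetical season line for one player.
estimate = orb_rate_estimate(orb=100, mp=1000, team_mp=20000, team_orb=900, opp_drb=2500)
actual = orb_rate_play_by_play(orb=100, chances_on_court=820)
print(f"estimated ORB% = {estimate:.1f}, play-by-play ORB% = {actual:.1f}")
```

The estimate assumes rebound chances are spread evenly across the team’s minutes. The two numbers diverge when a player’s time on the court sees more or fewer missed shots than the team’s average.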
This method removes some of the guessing game, and the results of this method on various stats for the Magic will be discussed today. For a full explanation of how everything works, I will refer you to the article I wrote over at Basketball-Statistics.com last Thursday, which is here. Let’s start by comparing the estimated rebound rates to the actual ones, as calculated from the play-by-play data:
We can see that the estimates are pretty darn close. Amazingly, though, Dwight Howard is an even better rebounder than we thought (by 0.3%). Gortat’s offensive rebounding may have been slightly overestimated, but his defensive rebounding was underestimated. The biggest differences were for Keith Bogans and Rafer Alston, who were actually not rebounding as well as we thought.
Now let’s move on to some stuff for the little guys. Here are the comparisons for assists and steals:
Jameer Nelson’s Assist Rate may have been inflated, while Anthony Johnson didn’t receive enough credit. When we use the play-by-play data instead of the estimates, the difference between the two shrinks from 10.9% to 7%. My play-by-play steal rates are slightly lower for every player, and that may have something to do with differences in the way I calculated possessions.
Finally, let’s look at blocks and usage rate:
Again, we see that each player’s PBP data is less than his estimated data. This is not a Magic-only thing. The reason for this difference is again due to different calculations. Block percentage is normally calculated as the percentage of opponents’ two-point attempts that were blocked by the player in question. My calculations counted three-point attempts as well. I feel that this way is more appropriate because, even though it’s rare, three-pointers do get blocked. With usage rates, we again see that the estimates were actually pretty close to the real thing.
Because the differences between the estimates and the play-by-play data are usually small, this information may seem trivial. In many ways, it is. However, it’s nice to get that warm fuzzy feeling when you know the numbers you’re looking at are thoroughly calculated instead of just estimations.
What, does nobody else get that feeling?
When I posted my recalculated stats using play-by-play data over at the APBRmetrics board, I learned that the Block Rates at Basketball-Reference are actually calculated using only opposing two-point attempts. In other words, a player’s Block Rate is the percentage of opposing two-point field goals that the player blocked.
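The effect of that denominator choice is easy to see with a toy example. The function below is a hypothetical helper of mine, not anyone’s official formula, and the totals are made up. It computes a play-by-play Block Rate either against all opposing field goal attempts or against two-point attempts only.

```python
def block_rate(blocks, opp_fga, opp_3pa, include_threes=False):
    """Blocks as a percentage of opposing attempts while the player was on the court."""
    attempts = opp_fga if include_threes else opp_fga - opp_3pa
    return 100.0 * blocks / attempts

# Hypothetical on-court totals for one player.
two_point_only = block_rate(blocks=50, opp_fga=2000, opp_3pa=400)
all_attempts = block_rate(blocks=50, opp_fga=2000, opp_3pa=400, include_threes=True)
print(two_point_only, all_attempts)  # 3.125 2.5
```

Including three-point attempts enlarges the denominator, so it always produces a lower rate for the same number of blocks.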
With that new piece of information, I recalculated the Block Rate for every player. The new figures, along with the rest of the recalculated stats, are posted below:
Recently at his web site, Basketball Geek, Ryan Parker used play-by-play data to calculate Dean Oliver’s offensive and defensive ratings. I’ve decided to use Ryan’s approach (and data!) to calculate some of the other advanced statistics out there, many of which were developed by John Hollinger.
Many of these statistics are usually calculated using estimates based on the data available in box scores. However, with the play-by-play data in hand, we can turn these estimates into actual numbers. To calculate the stats, I used the formulas available in the Basketball-Reference glossary. For today, the following numbers will be presented:
There are a number of different ways to calculate Assist Rate. I calculated my version based on the method used by people such as Ken Pomeroy and Ed Kupfer. Ryan defines his Assist Rate as the “percentage of possessions used that were assists.” There are subtle differences, I believe.
So what’s the difference between my calculations and the usual ones? The following changes:
The numbers for every player are available in the Google Docs spreadsheet below:
My next step is to calculate PER using these numbers, and I plan to get to that shortly. Much credit again must go to Ryan Parker for inspiring me to do this.
Making major changes to your team when you are already very, very good appears to be the thing to do in today’s NBA. The Lakers essentially swapped Trevor Ariza for Ron Artest, the Magic swapped Hedo Turkoglu for Vince Carter, and the Cavaliers added Shaquille O’Neal. Each of these teams was among the best in the league last year, and we’ll see how messing with a good thing turns out.
Of course, since this is a Magic blog, I will look at Orlando’s decision to let Hedo Turkoglu walk and trade for Vince Carter. I will be using a number of advanced statistics that, thankfully, I didn’t have to calculate myself. There is a wealth of basketball statistics available on the Internet these days, and everything I will discuss today is publicly available. The numbers I will be using were gathered from BasketballValue.com, my own Composite Score statistics, Basketball-Reference.com, 82games.com, and a new site called Hoopdata.com.
Overall Player Rating Statistics
Let’s start with a cursory glance at overall player ratings for Carter and Turkoglu. With these numbers, Turk fares better in adjusted plus-minus and Composite Score, while Carter has the upper hand in PER and Win Shares. The difference in Composite Score is the most dramatic, and that is mainly due to differences in their Defensive Composite Scores (which I will get into later). There is also a pretty substantial difference in PER, and I think that is a reflection of Carter’s overall production per minute being more high volume than Turkoglu’s production. The fact that Carter had more Win Shares than Turkoglu despite playing on a pretty bad team is quite impressive.
According to the numbers, this is Carter’s biggest advantage. In just about any offensive metric you use, Carter looks better. He is more efficient and produces more total offense than Turk. Similarly, he had a greater impact on his team’s offense in terms of plus-minus. Offensive Composite Score reflects all of these things.
Here, Turkoglu strikes back. Carter looks below average in just about every category, and this supports his reputation. Turk, on the other hand, recorded numbers well above average in every category. The trickiest part about these comparisons is team context. It is something I’ve mentioned constantly when talking about my Composite Score numbers. Because of the way stats are tracked (at least publicly), it’s very difficult to separate a player’s individual contribution to his defense. How much of this is Hedo’s own doing, and how much of it is due to the fact that Orlando featured a very strong all-around defense? It’s hard to say, but I do think Turkoglu was probably a better defender than Carter.
One of Turkoglu’s biggest benefits to the Magic, and something I thought they may miss, was his ability to create looks for others. This was magnified in the playoffs when the Magic dominated the Cavaliers behind the creativity of Turkoglu. Of course, Carter is no slouch in this area either, and the numbers above reflect this. His Assist Rate was actually higher than Turkoglu’s, and he was able to take better care of the ball in the process. Despite this, 82games.com gave Turk a better “Passing Rating,” although a worse “Hands Rating.” Regardless of the tiny differences on each side, I think it’s safe to say that Turkoglu’s playmaking abilities are no better than Carter’s.
These numbers, which are available at Hoopdata, show what types of shots the two players assisted on. They are pretty similar across the board. I think it’s interesting that Carter assisted on slightly more shots that were converted at the rim than Turkoglu did, despite the latter playing with one of the best (if not the best) finishers in the game in Dwight Howard. It’ll be interesting to see how these numbers look after this season.
Partly because of his success in the playoffs, Hedo Turkoglu developed the reputation of being a clutch scorer and player. Carter has been a go-to guy late in the game for much of his career, so how do the two compare? Last year, Carter was actually more productive and more efficient shooting-wise than Turkoglu. Both were great from the free throw line and reasonably good playmakers, but the difference in effective field goal percentage was pretty dramatic. Carter’s was above average, while Turkoglu’s was well below average. Most players find it more difficult to hit their shots in crunch time when defenses tighten up, so the fact that Carter actually became more efficient with the game on the line is quite impressive.
Ignoring all of the other players involved (although we definitely should not understate them), did the Magic make the right move by switching from Turkoglu to Carter? VC is better offensively, and two of Turk’s most famous skills, playmaking and clutch play, are performed as well or better by Carter. The only concern is defense, especially since the Magic lost Courtney Lee. However, we don’t know for sure how great of a defender Turkoglu is when he isn’t playing in front of Dwight Howard, so that aspect remains to be seen. All in all, considering Carter’s potential to put them over the top, the other players they acquired, and the amount of money Hedo was demanding, it appears to have been the right move for Orlando.
To wrap up my series of articles on the impacts of players on their teammates’ three-point shooting, I thought I’d take a look at perhaps the most important aspect: can we use the available data to evaluate players and predict the future? Predicting the interactions of players is nearly impossible, but how close can we get to modeling certain aspects of these interactions?
I’m going to take a few different approaches today. The first approach is to see if we can predict a player’s impacts on his teammates’ three-point shooting based on other advanced statistics. For example, if we know an individual’s Player Efficiency Rating, can we estimate what kind of impact he’s having on his teammates? I ran a series of simple linear regressions between 12 different advanced stats and the impacts on three-point shooting. The results are displayed below:
The correlations can be interpreted as follows: if Player A scores one more point per 40 minutes than Player B, then Player A increases his teammates’ three-point attempt percentage (how often they shoot threes) by 0.16% more than Player B. Because increases in three-point attempts and three-point percentage generally are good things, it makes sense that most of the correlations in the table are positive. Players that perform better in these advanced statistics have a more positive impact on their teammates, with the exception of Rebounding Rate. For the statistically inclined, all of these are significant at the .01 level, with the exception of Rebounding Rate.
So far, so good. The numbers seem to agree with common sense: better players help their teammates more. But how much can these statistics really tell us? In other words, if we only know Chris Paul’s Assist Rate, can we predict how much influence he has on his teammates’ three-point shooting? To answer these questions, I turned to the R-squared values (http://en.wikipedia.org/wiki/R-squared) of each correlation. R-squared values range from 0 to 1, and they essentially tell us how much of an outcome (in this case, impact on three-point shooting) is explained by an independent variable (in this case, PER or Assist Rate or any of the other stats). The results are in the table below:
With the exception of Minutes Per Game (which may just be a reflection of overall ability), the R-squared values are all very low. In a hypothetical and easier world, they’d all be higher. Unfortunately this is the real world, and basketball is much too complicated for us to be able to predict complex player interactions based on a simple stat or two.
Before I go any further, let’s recap what we know so far. There is a significant correlation between most advanced statistics and interaction effects on three-point shooting, so we know that these things aren’t random. But these stats explain only a very tiny part of the story, so we know that interactions are very complex.
The next step I took was to attempt to develop a model that would predict a player’s impacts on three-point shooting using a combination of the different statistics. After playing with the numbers, I was able to achieve an R-squared value of 0.26 for Impact 3PA and 0.36 for Impact 3PCT. These numbers were boosted to 0.45 and 0.51, respectively, if we took the other impact number into account (using Impact 3PCT in the Impact 3PA regression and vice versa). But this defeats the purpose of the study, so we’ll ignore those most recent numbers.
What does this next step tell us? Within the limits of linear regression, we can only explain about 26% or 36% of a player’s impact on his teammates’ three-point shooting using various available advanced stats. In the real world, those numbers aren’t horrific, but they’re far from being the keys we need to truly figure out the game of basketball.
Finally, let’s switch gears and examine how consistent these interaction effects are from one year to the next. After all, if they are totally random, there should be no correlation. Using the numbers from 07-08 and 08-09, I ran regressions for Impact 3PA and Impact 3PCT. Both regressions resulted in statistically significant, positive correlations. That’s good. But the R-squared values for the regressions were .05 and .1, respectively, which is not so good. In other words, knowing how a player affected his teammates’ three-point shooting last year will only tell you a little bit about how he’s going to do this year.
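A bare-bones version of the year-to-year check looks something like this. The impact values and player labels are invented. The sketch keeps only players who appear in both seasons, computes the Pearson correlation, and squares it (for a one-variable regression, that square is the R-squared).

```python
from math import sqrt

# Hypothetical Impact 3PCT values keyed by player for two seasons.
impact_0708 = {"A": 1.0, "B": -0.5, "C": 0.2, "D": 0.8, "E": -1.0}
impact_0809 = {"A": 0.3, "B": 0.4, "C": -0.6, "D": 0.9, "F": 0.1}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Only players with a value in both seasons can be compared.
common = sorted(impact_0708.keys() & impact_0809.keys())
r = pearson_r([impact_0708[p] for p in common], [impact_0809[p] for p in common])
print(f"players compared: {len(common)}, r = {r:.3f}, R^2 = {r * r:.3f}")
```

With these toy numbers the correlation is positive but the R-squared is small, which is the same pattern as the real results: last year’s impact points in the right direction without telling you much.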
Another interesting thing to look at is the impacts for players that switched teams. If players have similar impacts no matter what team they’re on, we may be on to something. When we limit the sample to these players, we again get statistically significant but not particularly informative results. Both regressions are significant at the .02 level but produce R-squared values under .1.
No matter which way you slice it, you get the same idea: good players make the players around them shoot threes more often and more efficiently, but that’s all we can say for sure. We can quantify what’s already happened, but we can’t predict the future.
If you’re still reading now, you’re undoubtedly more interested in this stuff than the average fan, and you may have some suggestions for me. If so, send me an e-mail at firstname.lastname@example.org.