Over the past several years, linear models have been introduced across the NBA as the “new” and “innovative” way to determine the “value” of a player on the court during the course of a game. Every methodology is an increment on the past: Plus-Minus to Adjusted Plus-Minus to Regularized Adjusted Plus-Minus to Box Plus-Minus to Real Plus-Minus to Player Impact Plus-Minus. Each level attempts to leverage play-by-play data as a linear model (APM), a Bayesian hierarchical model (RAPM), a conditional model (BPM and RPM), or a Bayesian hierarchical-conditional model (PIPM). The latter models (BPM, RPM, and PIPM) attempt to incorporate player actions in an effort to normalize the wonky results that pop out of models such as APM and RAPM. For instance, good luck trying to push Danny Green as one of the best players in the entire league, worthy of a max contract. These models typically suffer from swamping and sparsity: a high-correlation, high-noise environment that does not allow most linear models to identify proper signal above the noise floor when determining “points per 100 possessions” for a particular set of lineups.
To combat swamping, RAPM employs a regularization parameter that mechanically serves as a Bayesian filter on the parameters, mimicking Principal Component Analysis. While this drives down noise and dramatically improves analysis over APM, the model introduces slight biases; and due to the sparsity of the sampling frame, a rotational invariance problem emerges. A simple way to see this is to swap columns in the lineup matrix and randomize the seed in the L-BFGS optimizer. [Side note: If you use Newton-Raphson optimization for RAPM, you may need to take a moment to study local vs. global optimizers.] In mid-season single-season RAPM, we will actually see Danny Green and Fred VanVleet swap places with near-identical values.
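To make the regularization concrete, here is a minimal RAPM-style ridge sketch on made-up stint data (every number below is randomly generated; the stint matrix, player count, and penalty are illustrative assumptions, not real lineups). Solved in closed form, swapping two columns simply swaps the two coefficients; the seed-sensitivity described above comes from running iterative optimizers on near-collinear data.

```python
import numpy as np

# Toy single-season RAPM: rows are stints, columns are players
# (+1 on offense, -1 on defense, 0 off the floor); y is the stint
# outcome in points per possession. All data here is fabricated.
rng = np.random.default_rng(0)
n_stints, n_players = 200, 30
X = rng.choice([-1.0, 0.0, 1.0], size=(n_stints, n_players))
beta_true = rng.normal(0.0, 0.05, n_players)
y = X @ beta_true + rng.normal(0.0, 0.2, n_stints)

# The ridge penalty lam acts as a zero-mean Gaussian prior on each
# coefficient, shrinking noisy player estimates toward zero.
lam = 100.0
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)
```

Because the normal equations are solved directly, there is no optimizer seed to randomize here; any remaining instability comes from the data itself, not the solver.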
To combat swamping, BPM, RPM, and PIPM employ “additional” variables to help drive up the signal on high performing players. This better captures the true nature of the lineups as certain players have higher usages than others and other players are more efficient than their counterparts. While this is great, the linear models still fail to capture the lineup strategy within the game.
To this end, we propose action networks.
The Game Is a Novel. The Possession, Its Chapter.
A piece of advice I received from an Eastern Conference coach back in 2014 was to think of every game as a story that unfolds with multiple subplots. When that coach broke down the philosophy, his hope was that the other team is a shit writer and we can figure out the ending by chapter three. What the story framing really made me think of was that we could employ natural language processing to break down possessions. At the time, we had SportVU and Synergy data available. But before we delve deep into the data, let’s take a look at a classic story.
Beautiful Ball Movement: San Antonio Spurs
Since most of this methodology was completed back in 2014, the then presentation “Breaking Down NBA Plays Using Spatio-Temporal Constructions” contained several clips of the 2014 NBA Finals between the San Antonio Spurs and the Miami Heat. One of the clips focused on writing a story using actions.
Let’s break out the story. The first action is to set a dual pin-down on Tony Parker. This action occurs as two passes are made by the ensuing screeners: Manu Ginobili and Tiago Splitter. The second action is a rub by Matt Bonner to bring Ginobili to the strong side of the court. The third action is a hand-off from Kawhi Leonard to Ginobili, leading immediately into the fourth action: a pick-and-roll from Bonner onto Ginobili.
As the Heat go into BLUE, Bonner pops instead of rolling as Ginobili is double-teamed, leading into two swing passes back to Parker, followed by the fifth action: a pick. The ensuing drive (sixth action) creates a double team, forcing Parker to pass back out to Bonner. This allows for a seventh action / second drive by Bonner as the defender, Rashard Lewis, is slow in the closeout. As Chris Andersen steps out, Bonner dumps the pass off to Splitter, who makes the reverse layup.
In the linear model space, this is registered as a “1” where we see the offense and a “1” where we see the defense (or some multiple; it doesn’t really matter). In the box score models, we identify that Bonner gets an assist and Splitter gets a made field goal. But that’s about where it ends. Tony Parker gets no love for the drive that opened the lane. The staggered pin-down gets ignored even though it forced the Heat to guard the actual play with Andersen, Lewis, and a confused Norris Cole instead of LeBron James.
Turning the Action into Features: Possession to Vec
Armed with SportVU and Synergy, we could develop methods for building tracking features. We’ve already done some of this for identifying passes. From here, we can start to tackle different types of actions: handoffs, screens, cuts, passes, types of shots, dribbles, etc.
For instance, we can model a basic pick and roll play as follows:
Here, we have the classic elbow screen with roll. Of course, the defense is not from the School of Thibodeau and BLUE is not employed. So we can build the features Ball Screen, Pick and Roll, No BLUE, and Drive Left, and append other actions as necessary, such as the number of dribbles or switches by the defense.
Therefore, a possession story unfolds as a count of the plot devices that define the story. (Side Note: We do this because I’m lazy and I don’t want to worry about exchangeability issues.)
In my 2014 analysis, I had broken out 152 different types of actions. These actions would encapsulate a possession based on the actions taken during the course of the possession. Therefore, we would have a 152-long count vector counting the different actions that occurred within the possession, and this would effectively serve as our possession-to-vector quantity. A requirement for becoming an action was that it had to be performed a minimum of 5 times per game (on average) and it had to contribute to the network.
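A minimal sketch of the possession-to-vector step, using a small hypothetical action vocabulary (the real system used 152 actions; these names, and the tagging of the Spurs possession, are illustrative inventions):

```python
from collections import Counter

# Hypothetical action vocabulary; the real vocabulary held 152 actions.
VOCAB = ["pin_down", "rub", "handoff", "pick_and_roll", "pop", "swing_pass",
         "drive", "pass_out", "closeout_beat", "dump_off", "reverse_layup"]
INDEX = {a: i for i, a in enumerate(VOCAB)}

def possession_to_vec(actions):
    """Count each tagged action in a possession; unknown tags are ignored."""
    counts = Counter(a for a in actions if a in INDEX)
    return [counts.get(a, 0) for a in VOCAB]

# The Spurs possession described above, tagged as a bag of actions:
story = ["pin_down", "pin_down", "rub", "handoff", "pick_and_roll", "pop",
         "swing_pass", "swing_pass", "drive", "pass_out", "drive",
         "dump_off", "reverse_layup"]
vec = possession_to_vec(story)
```

Counting rather than ordering is exactly the “lazy” choice noted above: a bag of actions sidesteps exchangeability questions about where in the possession each action occurred.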
The Action Network
Finally, we arrive at the action network. This is exactly what you should expect: a multilayer perceptron neural network (fancy hierarchical regression) that attempts to learn the value of each action relative to the number of points scored within a possession. This gives a weight to each action, and we can use these weights as priors for every player in the league. The weights are then viewed as “contribution to the points scored within a possession.” This process gets a little weird, as values can and will go beyond 2-3 points, and there are negative actions such as turnovers. For instance, in the 2013-14 season, a turnover was worth -5.66 points, while a positive action such as a backcourt pass contributed +0.83 points. The sum of the moving parts then identifies the value of the possession.
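As a rough illustration only, here is a tiny from-scratch MLP on fabricated possession count vectors. The layer sizes, learning rate, and synthetic data are all assumptions, not the 2014 implementation; the per-action “value” is read off as the network’s sensitivity to one extra occurrence of each action.

```python
import numpy as np

rng = np.random.default_rng(1)
n_poss, n_actions = 500, 12
# Fabricated possession count vectors and whole-point outcomes (0-4).
X = rng.poisson(0.5, size=(n_poss, n_actions)).astype(float)
w_true = rng.normal(0.3, 0.4, n_actions)
pts = np.clip(np.round(X @ w_true), 0, 4)

def forward(X, W1, b1, W2, b2):
    z = np.maximum(X @ W1 + b1, 0.0)   # ReLU hidden layer
    return z, z @ W2 + b2

h, lr = 8, 1e-3
W1 = rng.normal(0, 0.1, (n_actions, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.1, h); b2 = 0.0

_, yhat = forward(X, W1, b1, W2, b2)
mse_start = np.mean((yhat - pts) ** 2)
for _ in range(2000):                   # plain gradient descent on MSE
    z, yhat = forward(X, W1, b1, W2, b2)
    g = 2.0 * (yhat - pts) / n_poss     # dMSE/dyhat
    gW2, gb2 = z.T @ g, g.sum()
    gz = np.outer(g, W2) * (z > 0)      # backprop through ReLU
    gW1, gb1 = X.T @ gz, gz.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

_, yhat = forward(X, W1, b1, W2, b2)
mse_end = np.mean((yhat - pts) ** 2)

# Per-action "value": sensitivity of predicted points to one more of
# each action, evaluated at the average possession vector.
z0 = np.maximum(X.mean(0) @ W1 + b1, 0.0)
action_value = (W1 * (z0 > 0)) @ W2
```

With a nonlinear network there is no single coefficient per action, which is why the value has to be read off as a local sensitivity; the real model’s -5.66 turnover and +0.83 backcourt pass are the analogous quantities.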
And to make this more menacing, when predicting the number of points scored, we enforced penalties on non-integer values to ensure we obtain whole points. The result? Using 10-fold cross validation, we were able to recover 72.83% of possessions correctly for the 2013-14 season. For the 2018-19 season, it’s at 68.27%.
So, if you’ve followed this site, you’ve seen a series of explanatory and critical articles about RAPM. This was my fix to RAPM several years ago. By using RAPM and SportVU, we could now place the action network coefficients as a Gaussian prior and start to better understand players based on their actions. The nice part of this is that many of the swamping errors went away. Unfortunately, there were still issues with lower level players due to low action counts; something that not even employing RAPM could help solve.
Using the Gaussian prior framework, we have to be careful with how we use “RAPM.” To this point, we’ve been using the term interchangeably with ridge regression on lineup data. To be clearer, we must stick to the same units of interest: in this case, points per possession instead of rating. That’s a big difference, as we potentially introduce much more error, despite predictive results improving over the ~60% in RAPM.
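Mechanically, placing the action-network output as a Gaussian prior amounts to ridge regression that shrinks toward a nonzero prior mean instead of toward zero. A minimal sketch on fabricated data (the prior values here are random stand-ins, not real action-network output):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 300, 20
X = rng.choice([-1.0, 0.0, 1.0], size=(n, p))          # toy lineup matrix
prior_mean = rng.normal(0.0, 0.05, p)                   # stand-in action-network values
y = X @ prior_mean + rng.normal(0.0, 0.3, n)            # points per possession

# Ridge toward the prior mean rather than toward zero:
#   minimize ||y - X b||^2 + lam * ||b - prior_mean||^2
lam = 50.0
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * prior_mean)
```

As lam grows, the estimate collapses onto the prior; as it shrinks, we recover plain least squares, so the action network only steers players with sparse lineup data, which is exactly where swamping was worst.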
Also, due to the action network, players are nearly zero-sum. This means that players who gain points force opponents to absorb negative points. We use the term nearly because actions can occur with no defenders; for instance, we can be victim to meaningless passes in unopposed transition.
Therefore we view weighted actions as a Bayesian filter into regularized linear regression.
But Tracking Data Only Goes So Far…
Unfortunately, SportVU only goes so far back. Similarly, SportVU (and its successor Second Spectrum) doesn’t cover all games in a season. Due to this, we perform imputation.
Within a season where we have tracking data, we can impute actions based on play-by-play data. For instance, using 80+ games we can build a regression model to predict the number of different types of actions that occurred. It’s not pretty (one season was as low as ~40% predictive capability) but we can’t afford to throw out data. (Side Note: This is a great place for research to occur… estimating the actions given only play-by-play and Synergy.)
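A minimal sketch of the within-season imputation step: fit an ordinary least-squares map from play-by-play-style features to an action count on the tracked games, then apply it to untracked ones. The feature set and counts below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
# Games with both tracking and play-by-play: learn pbp features -> action count.
n_tracked, n_feats = 80, 6
pbp = rng.poisson(20, size=(n_tracked, n_feats)).astype(float)  # e.g. FGA, AST, 3PA...
actions = pbp @ rng.uniform(0.0, 1.5, n_feats) + rng.normal(0.0, 5.0, n_tracked)

# Least squares with an intercept; lstsq is robust to rank deficiency.
A = np.column_stack([pbp, np.ones(n_tracked)])
coef, *_ = np.linalg.lstsq(A, actions, rcond=None)

def impute_actions(pbp_row):
    """Predict an action count for a game with play-by-play only."""
    return float(np.append(pbp_row, 1.0) @ coef)
```

One such regression per action type gives the imputed count vectors; as noted above, some seasons predict poorly (~40%), but discarding those games entirely would cost more than the noisy fill-in.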
The good news is that trends within season stay true. Teams play similar styles almost all season. It’s rare for a team to switch from being a mid-range dominant team to a three-point-bombing bunch. Also, shooting trends remain stable over the course of the year, but can become volatile from year-to-year.
For years prior to the 2011-12 arrival of SportVU data, we have to impute everything. And this is where things may get a little sketchy. The great news is that the three-point revolution and advanced offenses barely existed prior to 2011. While threes have been ever-increasing, the actions that unlock large quantities of three-point attempts are a thing of the recent past; say only the last 3-4 years.
Therefore, using the first two seasons of tracking, we are able to build a regression model to estimate the number of actions in each case. And, unfortunately, this only goes back to the beginning of play-by-play: 1997.
The good news, however, is that in the current NBA there are no players left from the pre-1997 years. So those players have no bearing on us now.
With that, let’s get to some results!
2017-18 NBA Season
Without jumping headfirst into this season, let’s take a look at previous seasons and see how well this analytic performs. Recall that our goal is to identify point contributions over the course of a game. Therefore, we normalize the values to points per 100 possessions contributed by that player.
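The normalization itself is just a rate conversion; a one-line sketch (the sample numbers are made up):

```python
def per_100(points_contributed, possessions):
    """Scale a raw point contribution to a per-100-possession rate."""
    return 100.0 * points_contributed / possessions

# e.g. a hypothetical 6.2 points contributed over 72 possessions:
rate = per_100(6.2, 72)
```

Working per 100 possessions puts high-minute starters and short-burst bench players on the same scale before ranking.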
Here, we see that we are missing two All-NBA Second Team players: Joel Embiid and DeMar DeRozan. DeRozan came in at 22nd on the list while Embiid landed at 37th. Similarly, Paul George (Third Team All-NBA) floundered down to 54th on the list. All-Defensive players Robert Covington and Draymond Green landed at 21st and 24th, respectively. 23rd was Goran Dragic.
Another notable item from this analysis is that Ben Simmons and Donovan Mitchell both rate high and were the two rookies in consideration for Rookie of the Year, an award which was ultimately won by Simmons.
1997-98 NBA Season
Using the out-of-season imputation method, we were able to identify scores for the first year of play-by-play. In this case, we obtain similar results as above. Here, we see the noise of the out-of-season sampling creep in, as the entire All-NBA Third Team is left off the top 20.
We do, however, come up with a hotly contested top spot in the league, which ultimately goes to Gary Payton. These were indeed the top three players in MVP voting that season. We once again nail the Rookie of the Year with Tim Duncan.
Now for this year…
2018-19 NBA Season
Finally, running the numbers today, we obtain this season’s “Top 20” points-contributed players. To recall, we use quotes because these numbers are distributed, meaning that there is variation attached. While the order may pass the “eye test,” we have to recognize that errors could be as large as 5-6 points. So this is merely a guide to a potential ordering.
This suggests that Giannis Antetokounmpo “should” be the MVP of the season. However, with such a tight score as compared to the 1997-98 season, we could make the identical argument for James Harden, the current MVP of the league.
For Rookie of the Year, Luka Doncic rests at 39th overall, with Trae Young actually tumbling down to 71st in the league. This would suggest Doncic “should” be the ROY this year.
There we have it: a brief introduction to tracking-based storytelling through hierarchical Bayesian models. This model has been in practice since 2014 and has several areas for improvement. What areas would you improve? Unfortunately, we have to wait to see the outcomes for the awards. But until then, sound off in the comments below!