With the 2015-16 NBA season getting underway tonight, we take a look at previous-year data and attempt to extrapolate an expected number of wins for each team this season. The method we propose is conservative in that it relies only on previous-season possession data and the opening day roster for each NBA team.
First, we identify team rosters and then build a model to estimate the scoring and defensive contribution of each player per possession. If we do not have previous-year possession data for a particular player, we find a similar-caliber player and assign that player’s score to the player in question. For instance, Metta World Peace’s score is recovered by looking at his own numbers from a season further in the past. Similarly, players coming off injuries, such as Kevin Durant, are scored using previous-year data. Finally, if the player is a rookie, we use a clustering algorithm based on the players’ collegiate per-game totals. For instance, Jahlil Okafor (Philadelphia 76ers) matches relatively well to Al Jefferson (Charlotte Hornets). This is purely an ad hoc approach, as we are interested only in a conservative model for predicting NBA team wins.
Step One: Identifying Team Rosters
First, we take note of the numerous roster changes for each team. There are eight key retirements heading into the 2015-16 NBA season: Elton Brand, Aaron Gray, Stephen Jackson, Andrei Kirilenko, Shawn Marion, Kenyon Martin, Jason Richardson, and Darius Songaila. There were also 36 trades between June 11th and July 30th, and 226 moves in free agency between July 2nd and October 26th, not including players re-signing with their own teams. Forty-three players moved overseas to play. Twenty-eight players were waived, some of whom were then claimed in the free agency moves. And finally, sixty players were selected in the 2015 NBA Draft. Overall, approximately 400 players moved between, onto, or away from teams.
In the end, each NBA team started today with a 15-man roster, for a total of 450 NBA players. Typically an NBA team will use 16-17 players over the course of a season. For the 2014-15 NBA season, a total of 492 players saw at least one possession. Once the rosters were set, we obtained a copy this morning and turned to building a crude model.
Step Two: Use the Players’ Pasts to Identify Ability of Players
Once we had rosters for each team, we set out to obtain a score for each player. Let’s walk through an example of the New Orleans Pelicans. As of this morning, the Pelicans roster is given as:
- PG: Norris Cole, Jrue Holiday, Nate Robinson, Ish Smith
- SG: Tyreke Evans, Eric Gordon
- SF: Luke Babbitt, Alonzo Gee, Quincy Pondexter
- PF: Ryan Anderson, Dante Cunningham, Anthony Davis
- C: Alexis Ajinca, Omer Asik, Kendrick Perkins
We then pull each player’s weight from the positive interactions model we fit on the previous year’s data. By doing this, we identify Tyreke Evans and Anthony Davis as the two top players on the team, with scores of 1.726342 and 1.478787, respectively. This means that when these two players are on the court, the Pelicans expect a total of roughly three extra positive actions per possession compared to all other players on the court. Keep in mind that if they are on the court opposite Kyle Korver (3.18), their effect is negated. So the scores do not mean that we expect those three actions to happen outright; they are measured relative to the other players on the court.
With that in place, we treat each game as a contest of possessions. The player scores carry variability. For instance, Kyle Korver’s 3.18 puts him at the top of the NBA, but his associated variance is .07, which is a standard deviation of roughly .26. Taking two standard deviations on either side of his score, he is prone to fall between 2.65 and 3.71 positive interactions for any given possession. So let’s consider a hypothetical comparison of player interactions.
Suppose Tyreke Evans and Anthony Davis play two-on-two against Kyle Korver and Paul Millsap, who have scores of 3.183463 and 2.049262, respectively. The associated variances for each player, taken from the variance-covariance matrix of the regression that produced these scores, are .04 (Davis), .07 (Evans), .07 (Korver), and .05 (Millsap). Playing by NBA rules, we assume that the teams alternate possessions (no make-it-take-it) and that a team wins a possession if it has the better team “on the court.”
Using other information such as team possession data and average time per possession, we build an expected model of possessions in this hypothetical two-on-two game. For each possession, we draw numbers from each distribution at random. For instance, we obtain the following draws:
- Evans (1.34858) and Davis (1.58690) give a score of 2.93548;
- Korver (2.94839) and Millsap (1.99843) give a score of 4.94682.
Thus Korver and Millsap win the possession. We repeat this process for the expected number of possessions and count which team won more of them. That team is then set to win the game. In this instance, Korver-Millsap is expected to defeat Evans-Davis by a score of 117-61. If we bring each entire team into the mix and re-run the simulation, we obtain a 92-86 Hawks win.
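The two-on-two contest above can be sketched in a few lines of Python. This is only a rough illustration under assumptions that are not spelled out in this post: each player’s draw is taken as an independent normal with the quoted score as its mean and the quoted variance, and the possession count of 178 is simply 117 + 61. The function names are hypothetical. Under these simplified assumptions the sketch is not expected to reproduce the 117-61 margin, which depends on details of the full model.

```python
import math
import random

# (score, variance) pairs quoted in the text
EVANS = (1.726342, 0.07)
DAVIS = (1.478787, 0.04)
KORVER = (3.183463, 0.07)
MILLSAP = (2.049262, 0.05)


def draw_team_score(players, rng):
    """Sum one random draw per player (assumed normal, assumed independent)."""
    return sum(rng.gauss(mean, math.sqrt(var)) for mean, var in players)


def simulate_game(team_a, team_b, n_possessions, rng):
    """Return (possessions won by team_a, possessions won by team_b).

    A possession goes to whichever team has the higher summed draw;
    an exact tie (practically impossible with floats) goes to team_b.
    """
    wins_a = 0
    for _ in range(n_possessions):
        if draw_team_score(team_a, rng) > draw_team_score(team_b, rng):
            wins_a += 1
    return wins_a, n_possessions - wins_a


rng = random.Random(2015)  # fixed seed so the run is repeatable
evans_davis, korver_millsap = simulate_game(
    [EVANS, DAVIS], [KORVER, MILLSAP], 178, rng
)
print("Evans-Davis:", evans_davis, "Korver-Millsap:", korver_millsap)
```

The same `simulate_game` call works for full teams by passing longer lists of (score, variance) pairs.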
Step Three: Identify the Game Matrix for the 2015-16 NBA Season
In a given season, each NBA team plays 82 games. Since we are discussing the New Orleans Pelicans, we note that they are a member of the Southwest Division in the Western Conference. They play each of the other four members of their division four times, which leads to 16 in-division games. Similarly, the Pelicans play each of the 15 teams in the Eastern Conference twice, leading to another 30 out-of-conference games.
The remaining 36 games are distributed somewhat evenly across the remaining 10 teams within their conference. Six non-division, same-conference teams (Clippers, Timberwolves, Suns, Trail Blazers, Kings, Jazz) are played four times each, and the remaining four teams (Nuggets, Warriors, Lakers, Thunder) are played three times each. This breakdown gives the Pelicans their remaining 36 out-of-division, in-conference games.
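As a quick check on the arithmetic above, here is a short Python tally of the Pelicans’ schedule. The group labels and the function name are made up for illustration; the opponent counts and games per opponent are the ones described in the text.

```python
# (number of opponents, games against each opponent) for the Pelicans
PELICANS_SCHEDULE = {
    "division": (4, 4),            # Southwest Division rivals
    "eastern_conference": (15, 2), # every Eastern Conference team
    "west_four_games": (6, 4),     # Clippers, Timberwolves, Suns, Trail Blazers, Kings, Jazz
    "west_three_games": (4, 3),    # Nuggets, Warriors, Lakers, Thunder
}


def games_by_group(schedule):
    """Multiply opponents by games-per-opponent for each group."""
    return {group: n_opp * n_games for group, (n_opp, n_games) in schedule.items()}


totals = games_by_group(PELICANS_SCHEDULE)
print(totals)                # 16, 30, 24, and 12 games for the four groups
print(sum(totals.values()))  # 82
```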
Step Four: Apply the Simulation to the Game Matrix and Tally Wins
The final step is to run the simulation from step two for every game on the schedule. We run this season simulation 100,000 times and write out the resulting records for each team. We run this many simulations because any single simulated season can produce a low-probability outcome. That is, we might see the Los Angeles Lakers reel off 52 wins. While this is a possible event, it’s highly improbable. Therefore, we take many iterations and average the number of wins.
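To illustrate why we average over many simulated seasons, here is a simplified Python sketch. It swaps the possession-level simulation for a single win probability per game, which is an assumption made only to keep the example short. The 0.3 probability and the function names are hypothetical, and the sketch uses 10,000 seasons rather than 100,000 so that it runs quickly.

```python
import random


def simulate_season(win_probs, rng):
    """Return the number of wins in one simulated season."""
    return sum(1 for p in win_probs if rng.random() < p)


def summarize_seasons(win_probs, n_seasons, rng):
    """Simulate many seasons; return (average wins, fewest wins, most wins)."""
    seasons = [simulate_season(win_probs, rng) for _ in range(n_seasons)]
    return sum(seasons) / n_seasons, min(seasons), max(seasons)


# Hypothetical team that wins each of its 82 games with probability 0.3
rng = random.Random(0)  # fixed seed so the run is repeatable
mean_wins, fewest, most = summarize_seasons([0.3] * 82, 10_000, rng)
print(round(mean_wins, 1), fewest, most)
```

The average settles close to 82 × 0.3 = 24.6 wins, while individual simulated seasons land well above and below it. That spread is the reason a single run is not reported on its own.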
Step Five: Display Results
Applying our simulations, we don’t quite get to see a 65-win team this season. This is due to the conservative model that was built. For the same reason, we do not find any team under 20 wins using this model.
Step Six: Explain Randomness (Deflect Blame)
We close with a note on how randomness enters our model. As with any type of modeling process, particularly one using distributional theory for prediction, we can only predict based on a set of assumptions and how we transform random variables. In plain English, we predict the scores based on what we have seen in the past and conservatively estimate for the future. For example, if Serge Ibaka of the Oklahoma City Thunder only takes 20-40 three-point attempts in each of the first several years of his career, then why would we assign a high probability to Ibaka taking more than 200 attempts in the next season? (Oops. That happened last year.) Similarly, why would we expect a bench player with lowly stats to break out and have a career year? (Jeremy Lin’s breakout can’t be the norm, right?)
What we are trying to say is that we built a conservative model. Yes, we expect Ibaka to make a decent percentage of his three-point attempts. This would even bulk his positive action score up by an extra point. However, we cannot predict that this will happen unless we change the rules of the model to something much more sophisticated.
As a fan of the NBA, I would love to see the Oklahoma City Thunder and Miami Heat bounce back from injuries and missing the playoffs to make runs for the division crown. I would like to see Philadelphia make some noise in their conference, even if it’s just back to 35-win territory. And it may well be possible that Atlanta doesn’t recapture their shooting touch from last season and tumbles a couple of spots down to a near-.500 season. None of these are unreasonable expectations.
However, it is tough to argue that the Trail Blazers will have another 50+ win season after losing one of their most dominant players (LaMarcus Aldridge) to free agency. It’s also tough to say what will happen to Dallas after the departure of key players such as Tyson Chandler and Monta Ellis.
This will shape up to be an exciting season. And for the record, at the time of writing, two of the dominant teams from last year (Atlanta Hawks, Cleveland Cavaliers) just dropped their openers to fall into a last-place tie in the Eastern Conference. It’s only one game into the season. But it’s shaping up to be a fun one already.