Warning: Lots of Math Ahead…
With the introduction of Player Efficiency Rating (PER), John Hollinger constructed a methodology for comparing the relative accomplishments of players across leagues, as well as across years. Although PER is commonly viewed as complex and opaque, the idea is relatively simple: produce a value for each player that captures their personal influence on the game in terms of points per minute/possession. I emphasize minute/possession because the original intention of the metric is to capture the per-minute effects of a player; however, the model uses a per-possession framework.
Before we have any discussion on the components of the model, we should first introduce the actual model. For references, feel free to read Basketball Reference’s How-To-Calculate PER Guide or John Hollinger’s own high-level description of the model.
The PER model is computed in a multi-step fashion. First, a player’s unadjusted PER is calculated. This measurement calculates a player’s per possession, per minute, personal contribution to the game. The formula is given as
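For reference, here is the unadjusted formula as I read it from Basketball Reference’s guide, transcribed into LaTeX. This is a transcription under my own notation, not the original typesetting: v, d, and f are the league scaling factors discussed below, the subscripts tm and lg mark team and league totals, FT denotes made free throws, and the twelve terms appear in the order used in the rest of this walk-through.

```latex
\begin{aligned}
\mathrm{uPER} = \frac{1}{\mathrm{MIN}} \Big[\;
 & \mathrm{3PM}
   + \tfrac{2}{3}\,\mathrm{AST}
   + \Big(2 - f\,\tfrac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\Big)\mathrm{FGM}
   + \tfrac{1}{2}\,\mathrm{FT}\Big(1 + \Big(1 - \tfrac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\Big)
       + \tfrac{2}{3}\,\tfrac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\Big) \\
 & - v\,\mathrm{TOV}
   - v\,d\,(\mathrm{FGA} - \mathrm{FGM})
   - 0.44\,v\,(0.44 + 0.56\,d)\,(\mathrm{FTA} - \mathrm{FT}) \\
 & + v\,(1 - d)\,\mathrm{DREB}
   + v\,d\,\mathrm{OREB}
   + v\,\mathrm{STL}
   + v\,d\,\mathrm{BLK} \\
 & - \mathrm{PF}\Big(\tfrac{\mathrm{FT}_{lg}}{\mathrm{PF}_{lg}}
       - 0.44\,v\,\tfrac{\mathrm{FTA}_{lg}}{\mathrm{PF}_{lg}}\Big)
 \Big]
\end{aligned}
```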
This model indeed looks torturous; however, it is not all that bad. We will break down this model in a bit. The second part of the PER formulation is a pace adjustment, which allows for comparison between up-tempo and down-tempo teams in the league. This adjustment is given by
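In symbols, and assuming the ratio is taken in the direction listed in Basketball Reference’s guide (league over team, so that fast teams are scaled down):

```latex
\text{Pace Adjustment} = \frac{\text{League Pace}}{\text{Team Pace}}
```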
This is a simple adjustment and serves effectively as a stratified sampling correction. This correction is intended to normalize players with respect to their team pace. A value of “1” indicates that the team plays at the league average. To calculate league pace, we simply count the number of possessions for every team, divide the number of possessions by the number of minutes played, and multiply by 48 minutes. To calculate a team’s pace, we have a slightly different formulation. That is, we average the number of possessions for a team and the number of possessions for their opponent and divide by the number of minutes played. Multiplying by 48 yields the Team Pace. We do it this way because League Pace is a census of all possessions, while Team Pace is a sample from the population of all possessions.
The resulting PER is now called adjusted Player Efficiency Rating (aPER) and is given by
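Written out under the same assumption about the direction of the pace ratio (league pace over team pace):

```latex
\mathrm{aPER} = \text{Pace Adjustment} \times \mathrm{uPER}
             = \frac{\text{League Pace}}{\text{Team Pace}} \times \mathrm{uPER}
```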
Finally, the third component is a league adjustment that allows comparison across years. The league adjustment forces the league mean to be “15”. This is given by
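A compact way to write this rescaling, assuming the league mean is the minutes-weighted League aPER:

```latex
\mathrm{PER} = \mathrm{aPER} \times \frac{15}{\text{League aPER}}
```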
This forces us to calculate one more value, called League aPER. This value is the average aPER across the entire league. This is a weighted average, where the weights are given by the number of minutes played by the player. This is given by
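As a sketch of that weighted average, with i running over players and MIN taken to be the sum of all individual minutes (my notation):

```latex
\text{League aPER} = \frac{1}{\mathrm{MIN}} \sum_{i} \mathrm{MIN}_i \cdot \mathrm{aPER}_i,
\qquad
\mathrm{MIN} = \sum_{i} \mathrm{MIN}_i
```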
Here, we need to make clear that the index, i, just indexes the players in the league and that MIN is the total number of minutes played by all players in the league; effectively ten times the number of game minutes played, as there are ten players on the court at any given time.
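To make the second and third steps concrete, here is a minimal Python sketch of the pace and league adjustments. The function names and the toy numbers are mine and purely illustrative; which minutes total to pass in for pace is left to the caller.

```python
def pace(possessions, minutes):
    """Possessions per 48 minutes."""
    return 48.0 * possessions / minutes


def team_pace(team_poss, opp_poss, minutes):
    """Average a team's and its opponents' possessions, then scale to 48 minutes."""
    return pace(0.5 * (team_poss + opp_poss), minutes)


def adjusted_per(uper, league_pace, tm_pace):
    """aPER: scale uPER by league pace over team pace."""
    return (league_pace / tm_pace) * uper


def league_aper(apers, minutes):
    """Minutes-weighted average of aPER across players."""
    total_minutes = sum(minutes)
    return sum(a * m for a, m in zip(apers, minutes)) / total_minutes


def per(aper, lg_aper):
    """Rescale so that the minutes-weighted league mean is 15."""
    return aper * 15.0 / lg_aper


# Toy league of two players (made-up values, not real data).
apers = [0.3, 0.5]
minutes = [1000, 3000]
lg = league_aper(apers, minutes)
pers = [per(a, lg) for a in apers]
weighted_mean = sum(p * m for p, m in zip(pers, minutes)) / sum(minutes)
simple_mean = sum(pers) / len(pers)
```

In this toy league the minutes-weighted mean of PER is 15, while the simple mean is not, which is why an unweighted average of published PER values will not land on 15.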
What this means is that all the magic in Hollinger’s PER is effectively at the uPER step in the process. So let’s take a look at that.
Breaking Down uPER
Taking a look at the equation for uPER, we have twelve terms and three scaling factors. Our goal is then to attempt to deconstruct the different factors and understand their contribution to the model. First, let’s tackle the scaling factors!
Scaling Factors: Two Make Sense…
Average Number of Points per Possession, v
The factor, v, is the average number of points per possession. The denominator is the basic form of estimating possessions from box score data. Here, we have the number of field goals attempted minus the number of offensive rebounds. This indicates the total number of made baskets and missed attempts that resulted in defensive rebounds. We also add in the number of turnovers and the number of free throws that result in an end of possession. Here, the multiplier of 0.44, or 22/50, is an estimated value indicating the percentage of free throws that terminate a possession. It should be noted that this function is a poor estimator of possessions and that the 0.44 value has been too high for more than a decade (credit: Matt Femrite, Nylon Calculus).
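Putting that description into symbols, with the subscript lg marking league totals (a reconstruction from the text above and Basketball Reference’s guide):

```latex
v = \frac{\mathrm{PTS}_{lg}}{\mathrm{FGA}_{lg} - \mathrm{OREB}_{lg} + \mathrm{TOV}_{lg} + 0.44\,\mathrm{FTA}_{lg}}
```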
Despite this primitive estimate, we still obtain a form of “points per possession.” It should be clear that this is a global average, and therefore all teams are valued the same, even though neither team quality nor the schedule is actually uniform.
This can be explicitly seen from the 2017 NBA season where, according to stats.nba.com, the Golden State Warriors averaged 1.132 points per possession while the Philadelphia 76ers averaged 1.007 points per possession. What was the league average? 1.062. This means that the metric will over-estimate the average number of points for an action from (or against) the Philadelphia 76ers while under-estimating the Warriors.
Defensive Rebound Percentage, d
The factor, d, is the defensive rebounding percentage. This is straightforward as it merely calculates the number of defensive rebounds and divides by the number of total rebounds. Again, this suffers the same problems as the average number of points as this factor assumes all teams gather defensive rebounds at the same rate.
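In symbols, again using lg for league totals (my notation):

```latex
d = \frac{\mathrm{DREB}_{lg}}{\mathrm{TREB}_{lg}}
  = \frac{\mathrm{TREB}_{lg} - \mathrm{OREB}_{lg}}{\mathrm{TREB}_{lg}}
```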
What the Heck Factor, f
This final factor is a bit puzzling. First the two-thirds constant actually makes the most sense of this factor, but we will touch on that in a moment. The second term causes fits as it is one-fourth the league average of assists-to-field-goals-made times the league average of free-throws-to-field-goals-made. If anyone understands this relationship, please feel free to send me comments.
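Written out from the description above, the factor takes the following form. Basketball Reference’s guide lists the second term as 0.5 times the assist ratio, divided by 2 times the ratio of field goals made to free throws made, which collapses to the same one-fourth product:

```latex
f = \frac{2}{3} - \frac{1}{4}\cdot\frac{\mathrm{AST}_{lg}}{\mathrm{FGM}_{lg}}\cdot\frac{\mathrm{FT}_{lg}}{\mathrm{FGM}_{lg}}
```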
Points Scored Is Hidden in Here…
Now that we have a basic understanding of the scaling factors, we can start to pick apart the twelve terms of PER. First, rearranging terms, we immediately find points scored. To see this, let’s take a look at uPER again.
The first term is 3PM, the number of three-pointers made. The third term, involving FGM, expands out to yield two terms. The first one is 2FGM. The fourth term, involving FT, expands out into two terms as well. The first of these is FT. Adding these three terms, we obtain 2FGM + 3PM + FT, which is POINTS SCORED: every made field goal counts for two points, and each made three-pointer, already counted once in FGM, picks up its third point through 3PM. This indicates that Hollinger’s PER is merely a measurement of points per possession per minute.
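A quick sanity check of that identity in Python, using a made-up stat line (the function name and the numbers are illustrative only):

```python
def points_from_uper_terms(fgm, three_pm, ft):
    """Sum of the three 'points scored' pieces pulled out of uPER."""
    return three_pm + 2 * fgm + ft


# Hypothetical line: 4 made twos, 2 made threes, 3 made free throws.
made_twos, made_threes, made_fts = 4, 2, 3
fgm = made_twos + made_threes            # FGM counts both twos and threes
box_score_points = 2 * made_twos + 3 * made_threes + made_fts
uper_points = points_from_uper_terms(fgm, made_threes, made_fts)
```

Both routes give the same total, since the three-pointers are counted once in FGM at two points and once more in 3PM for the extra point.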
Losing Points By Not Scoring
Similarly, there are negative terms that subtract from a player scoring points. These terms are the fifth term, involving turnovers, the sixth term, involving missed field goal attempts, and the seventh term, involving missed free throw attempts. Let’s walk through the meaning of each of these three terms.
Turnovers Lose Points
The fifth term of uPER involves turnovers. Taking the number of turnovers a player makes indicates the number of lost possessions due to turnover. By multiplying by the league points per possession factor, we obtain an expected number of points lost per possession that would have otherwise been expected to be scored.
Missing Field Goal Attempts to Opponents Loses Points
Similarly, every miss that results in a defensive rebound loses a possession for a team. In the sixth term, we first calculate the number of missed field goals for a player, FGA – FGM. Since only some misses are rebounded by the defense, we obtain an expected number of lost possessions by multiplying by the league average rate of defensive rebounds. Again multiplying by the league points per possession yields the expected number of points lost by a player missing a field goal attempt.
Missing Free Throws Loses Points
In the seventh term, we see a complicated looking formula. Let’s start at the tail. Here, we calculate the number of missed free throws, FTA – FTM. Next we have a deceitful term of 0.44 + 0.56d. Recalling that d is the defensive rebound percentage, we can rewrite this as 0.44*(1- d) + d. The second term is the expected percentage of defensive rebounds on missed free throws that terminate possessions. We multiply by the extra 0.44 to ensure the expected terminated possession. The first term is the expected percentage of free throws that are offensively rebounded. There is an extra 0.44 term. The reason for this is due to the possession continuing for the same offensive team. In this case, if a field goal is attempted, the associated value is absorbed in another term. Hence, the free-throw only contributions are multiplied by a second 0.44 factor. Multiply this term by the league average points per possession and we obtain the expected number of points lost due to missed free throws.
The eighth and ninth terms of uPER focus on rebounding. In these situations, for every defensive rebound obtained, a possession is taken away from an opponent, thus taking away an expected number of points. Similarly, picking up an offensive rebound or giving up an offensive rebound prolongs a possession. The eighth and ninth terms attempt to assign an expected number of points added or lost due to rebounding.
As we have already picked off seven terms, we may not have the formula handy. Here it is once again…
Defensive Rebounds… Down-weighted.
The eighth term can be split up into two components. The first component is exactly that… defensive rebounds! We obtain defensive rebounds multiplied by the expected number of points per possession. This quickly attributes the expected number of points a player takes away from an opponent. This leaves us with the second component, which is average number of points per possession times the number of defensive rebounds times the league defensive rebounding percentage. If we rearrange this term, we obtain an interesting quantity… the expectation of a Binomial distribution with population size DREB and probability d. Think of this as a down-weighting relative to the probability of the league obtaining a rebound.
Assume we have player j, who obtains a certain number of rebounds. Then we can, using the subscript j to keep track of our player, write
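Dropping the common multiplier v, the rearrangement described above can be sketched as:

```latex
(1 - d)\,\mathrm{DREB}_j = \mathrm{DREB}_j - d\,\mathrm{DREB}_j
```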
The right hand side is indeed the number of defensive rebounds by player j minus the expected number of defensive rebounds for player j.
To test this out, let’s consider the 2017 NBA season. There were a total of 107,046 rebounds, of which 82,109 were defensive rebounds, for d = 0.767. This means that since 0.767 of league rebounds were defensive rebounds, we extract the 0.767 from each player.
Offensive Rebounds… Down-weighted as well.
Offensive rebounds follow the exact same suit as defensive rebounds. In this case, we must use the definition of d directly to see that this is also a down-weighting computation. Again consider player j and their rebound totals with respect to the league.
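Again dropping v and substituting the definition of d (league totals carry the subscript lg in my notation), the offensive rebounding term can be rewritten as:

```latex
d\,\mathrm{OREB}_j
  = \Big(1 - \frac{\mathrm{OREB}_{lg}}{\mathrm{TREB}_{lg}}\Big)\mathrm{OREB}_j
  = \mathrm{OREB}_j - (1 - d)\,\mathrm{OREB}_j
```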
Once again we indeed obtain the binomial down-weighting as we did in the defensive rebounding case. For the 2017 NBA season, this means that every player needs to eliminate 23.3% of their offensive rebounds.
Once we down-weight each set of rebounds, we multiply by the league points per possession to obtain the expected number of points due to rebounding.
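The rebounding terms can be sketched in a few lines of Python. The league totals are the 2017 figures quoted above; the function name and the sample player line are mine.

```python
# 2017 league totals quoted in the text.
LEAGUE_TOTAL_REB = 107_046
LEAGUE_DEF_REB = 82_109
d_2017 = LEAGUE_DEF_REB / LEAGUE_TOTAL_REB   # league defensive rebound share


def rebound_points(dreb, oreb, v, d):
    """Expected points credited for rebounding (eighth and ninth terms).

    Defensive rebounds keep a (1 - d) share, offensive rebounds keep a d share,
    and both are valued at v points per possession.
    """
    kept_dreb = dreb - d * dreb              # same as (1 - d) * dreb
    kept_oreb = oreb - (1.0 - d) * oreb      # same as d * oreb
    return v * (kept_dreb + kept_oreb)


# Hypothetical player with round numbers so the arithmetic is easy to follow.
example = rebound_points(dreb=100, oreb=40, v=1.0, d=0.75)
```

With d = 0.75, the hypothetical player keeps 25 of 100 defensive rebounds and 30 of 40 offensive rebounds, for 55 expected points at one point per possession.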
Steals and Blocks
Steals and blocks are also incorporated into uPER in the tenth and eleventh terms. These two values are incredibly straightforward. First, steals are simply taking away possessions from an opponent, therefore we merely multiply steals with league average points per possession.
For blocks, there’s slightly more work to do. For a block to take away a possession, we must incorporate a defensive rebound. Hence, we multiply blocks with defensive rebound percentage, along with league average points per possession.
The personal foul term may look daunting, but it is rather simple as well. There are two components to personal fouls. The first component results in made free throws while the second results in possession ending free throw attempts.
The first component computes the league average of free throws made per personal foul. Multiply this by the number of personal fouls a player has, and this is the expected number of made free throws given up by that player; hence a negative value.
The second component is the possession-ending missed free throws. Since these free throws end a possession, we once again find that pesky 0.44 coefficient. Similarly, these are possessions ended without making a free throw, as the made free throws are already subtracted off in the first term. Thus we multiply these misses by the league average points per possession.
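For reference, the personal foul term as I read it from Basketball Reference’s guide is written below. Note that the published form uses league free throw attempts per foul in the second component; the subscript lg for league totals is my notation.

```latex
-\,\mathrm{PF}\,\Big(\frac{\mathrm{FT}_{lg}}{\mathrm{PF}_{lg}} - 0.44\,v\,\frac{\mathrm{FTA}_{lg}}{\mathrm{PF}_{lg}}\Big)
```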
Whew… where are we now? Assists.
So far, we have talked about all parts of Hollinger’s PER except for assists. To this point, everything calculated has a reason and results in an expected number of points. Assists are a different beast. Assists creep in at three terms: the second, third and fourth.
The second term is 2/3 times assists. No explanation to the weight other than “assists are two-thirds of a point.” If we pick off this second term and a portion of the third term, we obtain the following
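Reconstructed from the formula in Basketball Reference’s guide, with tm marking team totals, the combined quantity is:

```latex
\frac{2}{3}\,\mathrm{AST} - \frac{2}{3}\cdot\frac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\,\mathrm{FGM}
  = \frac{2}{3}\Big(\mathrm{AST} - \frac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\,\mathrm{FGM}\Big)
```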
This value is the number of assists made by a player minus the expected number of assists from his teammates for field goals made by that player. Let’s do a quick check.
For the 2017 NBA season, Jabari Parker of the Milwaukee Bucks picked up 142 assists and 399 made field goals. The Bucks, however, finished with 3182 field goals and 1984 assists. Performing the computation above, Parker finished with -71.1863 points added to the Milwaukee offense. Hence Parker is punished for shooting more than he passes. Compare this to his teammate Giannis Antetokounmpo, who had 434 assists and 656 made field goals, and we see that Giannis scores a mighty 16.6528. In fact, this section rewards passers who never score.
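The quick check can be reproduced with a short Python sketch. It assumes the quantity is two-thirds of a player’s assists minus two-thirds of the team assist rate times his made field goals; the function name is mine, and the inputs are the season totals quoted above.

```python
def assist_credit(ast, fgm, team_ast, team_fgm):
    """(2/3) * (AST - (team AST / team FGM) * FGM)."""
    team_rate = team_ast / team_fgm
    return (2.0 / 3.0) * (ast - team_rate * fgm)


BUCKS_AST, BUCKS_FGM = 1984, 3182

parker = assist_credit(ast=142, fgm=399, team_ast=BUCKS_AST, team_fgm=BUCKS_FGM)
giannis = assist_credit(ast=434, fgm=656, team_ast=BUCKS_AST, team_fgm=BUCKS_FGM)
pure_passer = assist_credit(ast=300, fgm=0, team_ast=BUCKS_AST, team_fgm=BUCKS_FGM)

print(round(parker, 4), round(giannis, 4), round(pure_passer, 4))
```

The hypothetical pure passer, with 300 assists and no made field goals, keeps the full two-thirds credit for every assist, which is the reward for never scoring noted above.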
Maybe the remaining component would rectify this problem. The remaining portion of assists is given by
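Expanding the third and fourth terms of the formula in Basketball Reference’s guide and removing the pieces already used, the leftover assist portion works out to the following (a reconstruction, with tm and lg marking team and league totals):

```latex
\frac{1}{4}\cdot\frac{\mathrm{AST}_{lg}}{\mathrm{FGM}_{lg}}\cdot\frac{\mathrm{FT}_{lg}}{\mathrm{FGM}_{lg}}\cdot\frac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\,\mathrm{FGM}
\;-\; \frac{1}{6}\cdot\frac{\mathrm{AST}_{tm}}{\mathrm{FGM}_{tm}}\,\mathrm{FT}
```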
Remember the introduction to the What the Heck Factor? Well, here it rears its ugly head. This is a daunting quantity to decompose, as the equation is not readily identifiable. Here, we are introduced to two new constants, 1/4 and 1/6. The assumption is that we are looking at assists that lead to extra free throws; however, this value is detrimental to improving an offense, as indicated by its negative sign. The other term looks to have three rate corrections, which may be incorporating correlations between free throws and field goals made relative to the league. I can’t tell for sure, as this function actually doesn’t state that at all.
If you happen to understand the genesis of this quantity, please feel free to shoot me a message.
Now that we have completed the hard part of walking through uPER, we turn our attention to pace adjustment and rate adjustment. In uPER we effectively calculated, with exception of assists, the expected return of points for each type of personal action on the court. This was a per possession calculation that in the end is divided by the number of minutes played.
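To pull the twelve terms together, here is a minimal Python sketch of uPER following the form listed in Basketball Reference’s guide. The class, field, and parameter names are mine, and the league factors v, d, and f are passed in rather than computed.

```python
from dataclasses import dataclass


@dataclass
class BoxLine:
    """Season totals for one player (field names are illustrative)."""
    minutes: float
    fgm: float = 0
    fga: float = 0
    three_pm: float = 0
    ftm: float = 0
    fta: float = 0
    ast: float = 0
    tov: float = 0
    dreb: float = 0
    oreb: float = 0
    stl: float = 0
    blk: float = 0
    pf: float = 0


def uper(p, v, d, f, team_ast_per_fgm, lg_ft_per_pf, lg_fta_per_pf):
    """Unadjusted PER: the twelve terms summed, then divided by minutes."""
    r = team_ast_per_fgm
    total = (
        p.three_pm                                            # 1: extra point per three
        + (2.0 / 3.0) * p.ast                                 # 2: assists
        + (2.0 - f * r) * p.fgm                               # 3: made field goals
        + 0.5 * p.ftm * (1.0 + (1.0 - r) + (2.0 / 3.0) * r)   # 4: made free throws
        - v * p.tov                                           # 5: turnovers
        - v * d * (p.fga - p.fgm)                             # 6: missed field goals
        - v * 0.44 * (0.44 + 0.56 * d) * (p.fta - p.ftm)      # 7: missed free throws
        + v * (1.0 - d) * p.dreb                              # 8: defensive rebounds
        + v * d * p.oreb                                      # 9: offensive rebounds
        + v * p.stl                                           # 10: steals
        + v * d * p.blk                                       # 11: blocks
        - p.pf * (lg_ft_per_pf - 0.44 * lg_fta_per_pf * v)    # 12: personal fouls
    )
    return total / p.minutes


# Hypothetical shooter on a team with no assists: 3 twos, 2 threes, 3 free throws,
# no misses. The sum inside the brackets should equal points scored (15).
shooter = BoxLine(minutes=30, fgm=5, fga=5, three_pm=2, ftm=3, fta=3)
shooter_uper = uper(shooter, v=1.062, d=0.767, f=0.5,
                    team_ast_per_fgm=0.0, lg_ft_per_pf=0.8, lg_fta_per_pf=1.0)
```

With the team assist rate set to zero, the assist corrections vanish and the hypothetical shooter’s uPER reduces to points per minute, which matches the points-scored reading of the first, third, and fourth terms.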
For pace adjustment and rate adjustment, we apply the final two multipliers to adjust for the number of possessions each team has per 48 minutes, as well as the adjustment to ensure that the league average is 15.
Note that simply taking the average of PERs will not give us 15. Instead, taking the minutes-played-weighted average of PER will yield 15. As an example, for the top 355 NBA players in the 2017 NBA season, we obtain 15.2478. This is not an error; rather, ESPN reported only the top 355 players on its main PER page, not the 486 NBA players who logged at least one minute last season. Taking the extra couple of moments to obtain all players, the average is 15.0490, with rounding primarily accounting for the 0.0490 discrepancy.
Due to the adjustments, we are able to compare players from year to year. For instance, we take a look at the previous 7 years (2011 through 2017) of distributions of PER.
With the exception of some extra zeros in 2011 and 2013, the distributions for PER remain relatively stable.
PER is also well-known for passing the “eye test.” That is, if the list presents the “best players” in the league, and they are indeed arguably the best players in the league based on qualitative arguments, then the analytic is typically viewed as justified. One simple way to test this is to identify the location of the Most Valuable Player on the PER list. There is one caveat to this… players must meet a minimum number of minutes to be eligible. To qualify, a player must have played 500 minutes. What this insinuates is that PER is not a good measure for small samples. Adhering to this rule, we can identify the positions of MVPs since 2011.
- 2011: Derrick Rose (9th)
- 2012: LeBron James (1st)
- 2013: LeBron James (1st)
- 2014: Kevin Durant (1st)
- 2015: Steph Curry (3rd)
- 2016: Steph Curry (1st)
- 2017: Russell Westbrook (1st)
PER is also well known for having drawbacks. The primary drawback is that top-tier defensive players are not well-represented. The golden child example is Bruce Bowen from the San Antonio Spurs.
The key to PER is that it is based on personal actions of a player. What PER does not include is defensive capability or ability to spread a court. To correct this, we can perform an experimental design to identify the non-personal contributions of a player.
For instance, we can look at groups on the court and treat these as random effects, with the response being the grouping’s bivariate PER. This would partition out how two units interact with each other, as well as against each other. If two players play well together, we expect their group’s PER to increase. If a defender plays well but does not steal the ball much or pick up many blocks, it will appear in their opponents’ lower PER.
Similarly, PER is designed to be a box score calculation. In the era of play-by-play data we can easily count the number of possessions as opposed to estimating the number of possessions using a crude measuring stick.
Third, PER assumes that all teams are equal. This is indicated at every location a “league” statistic pops up. We gave a clear example that playing the Warriors is not the same as playing the Philadelphia 76ers. To address this, the sampling design of the schedule can be used to weight interactions. That is, instead of just calculating points per possession, we can calculate points per possession relative to opponent. This will help offset the schedule.
End of the Road
So what do you think of PER? Even more confused than before? It’s been a reliable analytic for over a decade, despite its obvious pitfalls: hard-coded weights, ignorance of defensive factors, and the assumption that all teams are equal. The nice thing about PER is that it can be used as a tool for further analysis and does not have to be an end-all of player capability.