On September 28th, the NBA Board of Governors approved changes to the NBA draft lottery system. These changes were designed to discourage tanking, where a team loses on purpose to maximize its probability of obtaining a high draft pick. In part, this is not a bad effort, as we have seen that great players do not necessarily go first in the draft. However, we’d like to see how much these changes actually affect teams. To do that, we take a look at the probabilistic structure of the NBA Draft Lottery.
The Draft is a Plackett-Luce Model
Anyone familiar with ranking algorithms is fully aware of the Plackett-Luce model. In case you’re not, here’s a brief synopsis.
Given a set of N items, suppose each item has a probability of being selected. Let’s call this probability p_i, where i is the index of the item (one through N). An item is selected at random according to these probabilities. Say item j is selected. We then select an item from the remaining N-1 items, with item j removed. This methodology is called sampling without replacement. It is the familiar “rank your favorite [sodas/TV shows/Presidential candidates] in order from most favorite to least favorite” exercise.
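The selection process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name sample_ordering and the three example weights are hypothetical, and the standard library's random.choices is used to draw one item in proportion to the remaining weights.

```python
import random

def sample_ordering(probs, rng):
    """Draw one full ordering by sampling without replacement."""
    remaining = list(range(len(probs)))
    order = []
    while remaining:
        # Weights of the items still available; random.choices
        # normalizes them internally, which is the renormalization step.
        weights = [probs[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(pick)
        remaining.remove(pick)
    return order

rng = random.Random(0)  # fixed seed so the draw is repeatable
order = sample_ordering([0.5, 0.3, 0.2], rng)
print(order)
```

Each call returns one ranking of the items, with higher-weight items more likely to appear early.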
Obtaining Probabilities of Orders
In terms of probabilities, let’s quickly write them out so we understand the process and can apply it to the NBA draft lottery. To begin, we must select a first team. This is given by
P(Team_i first) = p_i
Here we assumed that Team_i was selected first. This is simply their own probability of selection. Next, we must select the second team. Since Team_i was already selected, there is zero probability of them being selected a second time. However, the total probability of a team being selected must be one. This means we renormalize the remaining probabilities by removing the first team’s probability via
P(Team_j second | Team_i first) = p_j / (1 - p_i)
However, this is not our probability of seeing Team_j second! It is the probability of seeing Team_j second only when Team_i is selected first. We will get to this in a moment.
For now, we have obtained the probability of seeing Team_i first and Team_j second. This is given by
P(Team_i first, Team_j second) = p_i * p_j / (1 - p_i)
We can continue this process to obtain the first four items of interest. To save space, let’s write ijkl to be Team_i selected first, Team_j selected second, Team_k selected third, and Team_l selected fourth. Then, the Plackett-Luce model is written as
P(ijkl) = p_i * [p_j / (1 - p_i)] * [p_k / (1 - p_i - p_j)] * [p_l / (1 - p_i - p_j - p_k)]
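As a small illustration of the ordering probability, the sketch below multiplies each team's weight by the inverse of the probability mass still in play. The function name ordering_probability is hypothetical; the weights are the lottery odds quoted later in this article.

```python
def ordering_probability(probs, order):
    """Plackett-Luce probability of drawing `order` in exactly that sequence."""
    remaining_mass = 1.0
    result = 1.0
    for team in order:
        # Conditional probability of this team given the earlier draws
        result *= probs[team] / remaining_mass
        remaining_mass -= probs[team]
    return result

initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

# Probability that the four worst teams are drawn in order of record
print(ordering_probability(initialProbs, [0, 1, 2, 3]))
```

Teams are indexed from zero here, so index 0 is the team with the worst record.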
This model yields the probability of seeing a certain ordering of items. However, this is not the probability we are usually interested in when it comes to the NBA Draft Lottery. Instead, we focus on the probability of obtaining a certain ranking.
Obtaining Probabilities of Ranks
To obtain a probability of ranks, we must solve the combinatorial problem of enumerating all possible orderings. While this problem is simple for small samples such as NBA Draft Lotteries, it is considered the hardest problem in ranking, as exhaustive enumeration is not feasible for more than 1,000 items! For 14 items, this is easy.
To obtain the probability of being first, we simply look at the sampling probabilities that are given to each item. For the NBA, this is
0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075, 0.06, 0.045, 0.030, 0.020, 0.015, 0.01, 0.005
with respect to teams one through 14. If we perform a quick check, these probabilities indeed sum to one. Therefore these are the probabilities of being selected for the first pick in the draft.
Now, what about being selected second in the draft? This problem is a little tougher. In this case, we must iterate through all possibilities of being selected second. For example, Team_j can be selected second only if Team_j is not selected first. In the case of the NBA Draft Lottery, there are only 13 possible scenarios of this happening. To calculate this probability for every team, however, we must perform 182 probabilistic computations. This is much larger than the 14 computations in finding the probability of being first for every team.
Now, to find that probability, we simply calculate
P(Team_j second) = p_j * sum_{i != j} p_i / (1 - p_i)
Let’s break this equation down. Here, p_j is the probability of selecting Team_j first! But wait… Team_j is selected second… This is true. We played a little manipulation game here. Let’s break this down completely.
The probability of selecting Team_j second given Team_i is selected first is p_j / (1 - p_i). Weighting this by the probability p_i that Team_i is in fact selected first gives
p_i * p_j / (1 - p_i)
All we do is add these up for all teams that are not Team_j. This is where that sum with the i not equal to j term pops up. Since we are multiplying, we can do the a*b = b*a thing and swap p_i and p_j. Finally, since the sum only counts the teams selected first, and Team_j is not selected first, Team_j’s probability can be pulled out of the sum. Voila!
If we want to compute this for being selected third or fourth, we can just continue the same process. For identifying the probability of being selected third for every team, we now must perform 2,184 probabilistic computations! The equation for Team_k being selected third is given by
P(Team_k third) = p_k * sum_{i != k} sum_{j != i, k} [p_i / (1 - p_i)] * [p_j / (1 - p_i - p_j)]
You can see how this gets ugly pretty fast…
Coding Plackett-Luce in Python
Instead of trying to write out the probabilities by hand, we can write a simple algorithm that iterates through the combinations at a fast rate. There are many ways to go, such as building a function and iterating, iterating through every case, or building a recursive algorithm. Depending on the goal in mind, we select the method to do the job we wish to perform. For example, if we simply want to build a basic probability distribution, recursion works best. If we wish to incorporate trades, writing the entire structure out may be preferable.
First, we start with our initial probability vector. These are our sampling weights, expressed as the probabilities of being selected first in the draft.
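A minimal way to set this up, assuming a plain Python list is acceptable for the vector, is shown here. The name initialProbs follows the name used later in this article; the sanity check is an addition.

```python
import math

# Lottery odds for teams one through 14, worst record first
initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

# Quick check that the weights form a probability distribution
assert math.isclose(sum(initialProbs), 1.0)
```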
We already know how to select the first team of the draft. Instead, we start with the second team in the draft.
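The listing itself is not reproduced in this text, so the following is a minimal sketch of the second-pick calculation under stated assumptions: plain Python lists stand in for whatever array type was used, the function name second_pick_probs is hypothetical, and the variable names firstRemoved and secondProbs are borrowed from the surrounding description.

```python
initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

def second_pick_probs(probs):
    """Probability that each team receives the second pick."""
    secondProbs = [0.0] * len(probs)
    for first, pFirst in enumerate(probs):
        # Remove the first selection, then renormalize by 1 - p_i
        firstRemoved = list(probs)
        firstRemoved[first] = 0.0
        conditional = [p / (1.0 - pFirst) for p in firstRemoved]
        # Weight by the chance of this first selection and accumulate
        secondProbs = [total + pFirst * c
                       for total, c in zip(secondProbs, conditional)]
    return secondProbs

secondProbs = second_pick_probs(initialProbs)
print([round(p, 3) for p in secondProbs])
```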
Here we iterate through all possible teams as they are selected first. We remove their sampling probability, calling this firstRemoved, and normalize the probabilities. This normalization is the division by 1-p_i that we need to compute. We simply generate the conditional probabilities of being selected second and add them up over all possible first-pick selections. This is performed using the zip command. Output of this function is given as
Yuck. If you can’t read that, it’s simply:
0.134, 0.134, 0.134, 0.122, 0.105, 0.092, 0.078, 0.063, 0.048, 0.033, 0.022, 0.017, 0.011, 0.006
These are indeed the probabilities of being selected second! Now we continue this iterative process to identify the third pick:
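One way the third-pick step could look is sketched below, assuming explicit nested loops over the first and second selections. The function name third_pick_probs is hypothetical, and this is a reconstruction rather than the original listing.

```python
initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

def third_pick_probs(probs):
    """Probability that each team receives the third pick."""
    n = len(probs)
    thirdProbs = [0.0] * n
    for first in range(n):
        pFirst = probs[first]
        for second in range(n):
            if second == first:
                continue
            # P(first, second) = p_i * p_j / (1 - p_i)
            pPair = pFirst * probs[second] / (1.0 - pFirst)
            remainingMass = 1.0 - pFirst - probs[second]
            # The third-pick loop sits inside the second-pick loop
            for third in range(n):
                if third in (first, second):
                    continue
                thirdProbs[third] += pPair * probs[third] / remainingMass
    return thirdProbs

thirdProbs = third_pick_probs(initialProbs)
print([round(p, 3) for p in thirdProbs])
```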
Notice that it is contained within the second pick’s block. This is because we must hold on to the second and first team when selecting the third team. Output for this is given by
Cleaning this up a bit, we have:
0.127, 0.127, 0.127, 0.119, 0.106, 0.094, 0.081, 0.067, 0.052, 0.036, 0.024, 0.019, 0.012, 0.006
These are indeed the probabilities for being selected third. Continuing this process, we obtain the probabilities for every team being able to obtain picks one through four:
Picks 5 through 14 are Deterministic
Once we select the first four teams, the remainder of the picks are deterministic. This means we know who, with certainty, selects fifth given knowledge of the first four teams. In the NBA Draft Lottery, this is the remaining team with the worst record. Recursively, the next team selected is the worst team of the remaining teams not selected.
What this means for our model is that we merely have to sum the instances of the first four picks and iterate through the teams remaining. There is no shuffling of probabilities as there are no more random draws. In Python, we can build a coding block that iterates this process.
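A minimal sketch of such a block follows, under the assumption that teams already selected are marked by a zero weight and that a lower index means a worse record. The function name fill_remaining is hypothetical; the argument names probs, remainingProbs and restProb follow the description in this section.

```python
def fill_remaining(probs, remainingProbs, restProb):
    """Assign picks 5 through 14 by record once the first four are known.

    probs: 14 sampling weights with the four selected teams set to zero.
    remainingProbs: 10-by-14 list of lists, updated in place.
    restProb: probability of this particular first-four outcome.
    """
    probs = list(probs)  # work on a copy so the caller's list is untouched
    for row in range(len(remainingProbs)):
        # Teams still waiting are the ones with nonzero weight
        remaining = [team for team, p in enumerate(probs) if p > 0]
        worst = min(remaining)  # lowest index = worst record
        remainingProbs[row][worst] += restProb
        probs[worst] = 0.0  # remove them before the next pick
    return remainingProbs

# Example: teams 0-3 were drawn, so the rest pick in order of record
probs = [0, 0, 0, 0, 0.105, 0.09, 0.075, 0.06, 0.045,
         0.03, 0.02, 0.015, 0.01, 0.005]
remainingProbs = [[0.0] * 14 for _ in range(10)]
fill_remaining(probs, remainingProbs, 1.0)
```

In the example, restProb is set to 1.0 purely for illustration; in the full calculation it would be the Plackett-Luce probability of the four drawn teams.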
Here, we enter in the probabilities for the remaining teams, called probs. The data structure remainingProbs is a 10-by-14 matrix where each row represents picks 5 through 14 in the draft and each column represents the team. The quantity restProb is the probability of selecting the first four teams in the draft.
What we are doing here is taking the probability of seeing teams ijkl being selected for the first four picks and setting this as restProb. We then iterate through who is left, in order from worst team to best team. The nonzeros line identifies the teams remaining and the min() function finds the worst team remaining. At step i, we give that team the probability of having the ith pick, remove them for the next picks, and repeat for the next worst team.
Within the Python code block, this function is called within the final code block after all four first teams are selected:
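Putting the pieces together, one possible end-to-end sketch is given below. It assumes itertools.permutations is an acceptable substitute for the nested loops, and the names lottery_distribution and fill_remaining are hypothetical. The result is a 14-by-14 table where row r holds the probabilities of pick r+1 for every team.

```python
from itertools import permutations

initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

def fill_remaining(probs, remainingProbs, restProb):
    """Give the non-lottery picks to the remaining teams, worst record first."""
    probs = list(probs)
    for row in range(len(remainingProbs)):
        worst = min(team for team, p in enumerate(probs) if p > 0)
        remainingProbs[row][worst] += restProb
        probs[worst] = 0.0

def lottery_distribution(probs, drawn=4):
    """Pick-by-team probability table for a lottery with `drawn` random picks."""
    n = len(probs)
    topProbs = [[0.0] * n for _ in range(drawn)]
    remainingProbs = [[0.0] * n for _ in range(n - drawn)]
    for firstFour in permutations(range(n), drawn):
        # Plackett-Luce probability of this exact drawn sequence
        restProb, mass = 1.0, 1.0
        left = list(probs)
        for team in firstFour:
            restProb *= probs[team] / mass
            mass -= probs[team]
            left[team] = 0.0  # mark the team as already selected
        for pick, team in enumerate(firstFour):
            topProbs[pick][team] += restProb
        # Called only after all four lottery teams are known
        fill_remaining(left, remainingProbs, restProb)
    return topProbs + remainingProbs

dist = lottery_distribution(initialProbs)
# Worst team's chances at picks one through five
print([round(dist[pick][0], 5) for pick in range(5)])
```

With 14 teams and four drawn picks this enumerates 24,024 sequences, which runs quickly in plain Python.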
This gives us the overall structure of the NBA Draft Lottery.
And there we have the entire NBA Draft Lottery odds for every team. Moreover, we have code to produce lottery odds whenever the lottery odds change. For instance, a tied team? Simply adjust the initialProbs vector and you’re done. A trade? Well, that’s a little more complicated.
Trades Can Be Accounted For…
If we recall the Sacramento Kings – Philadelphia 76ers pick swap from before, we can construct a new function in Python that performs a swap check. This is simple. We merely check for any time the Kings were selected. Recall that every team is marked by their column. In the previous year, the Kings were the 8th team while the 76ers were the fourth team. This means the Kings are indexed as Team_7 while the 76ers are indexed as Team_3. This is due to indexing starting at zero. Every time a team is selected, we merely check if index 7 is deleted when index 3 exists. This would enforce a pick swap. We then swap the probabilities in secondProbs, thirdProbs, fourthProbs, or remainingProbs, depending on where the swap occurs.
There is also the case where a team not in the lottery obtains a pick. Consider last year, when the Chicago Bulls held the rights to the Sacramento Kings pick if it fell below tenth in the Draft. In these cases, we have to make a slight adjustment to the probability vectors and include a fifteenth team with zero probability. We condition on the pick dropping below tenth and add that probability into the fifteenth slot, which is held for the Bulls in this case.
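The swap check could be sketched as follows. This is an illustration under assumptions, not the function described above: it works on a completed draft order rather than on the probability tables, the name apply_pick_swap is hypothetical, and the swap is assumed to be exercised whenever index 7 lands ahead of index 3.

```python
def apply_pick_swap(order, holder=3, other=7):
    """Return the draft order after an optional pick swap.

    order[pick] is the team index holding that pick. `holder` owns the
    swap right and uses it whenever `other` has landed the earlier pick.
    """
    swapped = list(order)
    posHolder = swapped.index(holder)
    posOther = swapped.index(other)
    if posOther < posHolder:
        # The holder takes the earlier pick; the other team moves back
        swapped[posHolder], swapped[posOther] = other, holder
    return swapped

# Index 7 wins the first pick, so index 3 swaps into that spot
order = [7, 0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13]
print(apply_pick_swap(order))
```

Accumulating probabilities over the swapped order, instead of the drawn order, would then give the post-swap odds for both teams.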
Expected Draft Pick
Now that we have the distribution, we can compute the expected draft pick for each team. This is a simple process as we have the entire probability structure for every team in the Draft. The expected draft pick is the mean of the distribution for a team. We are used to the old “add up the values and divide by N” routine as kids. However, every probability is not the same and instead of multiplying by 1/N, we multiply by the probability of selection. In math terms, this is
E[pick for Team_t] = sum_{r=1}^{14} r * P(Team_t receives pick r)
For the worst team in the draft, this is given by
1*0.140 + 2*0.13417 + 3*0.12749 + 4*0.11972 + 5*0.47862 = 3.66279
This means that the team with the worst record is expected to obtain the 3.66279th pick in the draft. Continuing for every team in the draft, we obtain all the expected picks in the draft.
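The same arithmetic can be written as a short helper. The function name expected_pick is hypothetical, and the distribution is the worst team's set of probabilities quoted above.

```python
# Worst team's pick distribution as quoted in the text (picks 1 through 5)
worstTeam = {1: 0.140, 2: 0.13417, 3: 0.12749, 4: 0.11972, 5: 0.47862}

def expected_pick(distribution):
    """Mean of a pick distribution given as {pick number: probability}."""
    return sum(pick * prob for pick, prob in distribution.items())

print(round(expected_pick(worstTeam), 5))
```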
If we compare this to the previous version of the NBA Draft Lottery, we see that the lottery becomes more competitive for all teams.
Differences Between Now and 2019
The current draft lottery (pre-2019) discourages teams from finishing with the worst records by making their spots in the draft “up for grabs” by all teams in the draft lottery. Taking a look at the display of the expected draft positions, the blue line represents where a team should fall within the draft. This is the case where no lottery is held and teams pick based on order of worst to first.
Using a lottery, we deviate from this blue line. Anything above the blue line indicates a team’s chance of losing their spot in the draft, while anything below the blue line indicates that team having a better chance at advancing in the draft. The greater the distance from the blue line, the worse or better the team’s odds of losing their spot or gaining a spot, respectively.
In the current draft, only the worst three teams can lose their spot in the draft. The blue line and the red line intersect at the fourth spot in the draft. After that, the draft process benefits the other 10 teams. As the red line identifies the current expected values in the draft, there is not much deviation from the blue line. Let’s inject pick variation on each of these expected picks.
Here we see that only the first team in the draft is beaten up by the lottery process. Every other team falls relatively close to the No Lottery process. This indicates that the lottery does not really alleviate tanking as it’s effectively statistically equivalent to No Lottery.
What happens under the 2019 version?
In this case, we do negate some form of tanking as we see two teams pull off the blue line. Here, the fifth team is the cross-over point. This is due to the fourth draft position becoming a lottery draw.
While this does alleviate tanking to some degree, this new proposed methodology is, once again, not statistically different from the No Lottery version. What this effectively enforces is that teams should not tank into the bottom four spots, but rather the bottom five spots.
So how do we alleviate tanking? Make the entire lottery Plackett-Luce. Using the proposed probabilities, we should see both ends of the tail move away from the blue line. In this case, we will obtain odds that are significantly different from No Lottery while keeping a higher probability for worse teams to obtain that much needed draft pick.
However, doing this carries a large possibility of hurting smaller-market teams that get caught within picks 6-10 and may never find that diamond in the rough to pull them out of low-playoff/no-playoff limbo. How to quantify that, though, is a different story.
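Enumerating all 14! orderings is impractical, so one way to explore a fully Plackett-Luce lottery is simulation. The sketch below is an assumption-laden illustration: the function name simulate_expected_picks, the trial count and the seed are all hypothetical choices, and the output is a Monte Carlo estimate rather than an exact result.

```python
import random

initialProbs = [0.14, 0.14, 0.14, 0.125, 0.105, 0.09, 0.075,
                0.06, 0.045, 0.03, 0.02, 0.015, 0.01, 0.005]

def simulate_expected_picks(probs, trials, seed=0):
    """Estimate each team's expected pick when all picks are drawn."""
    rng = random.Random(seed)
    n = len(probs)
    totals = [0] * n
    for _ in range(trials):
        remaining = list(range(n))
        for pick in range(1, n + 1):
            # Draw the next team in proportion to its remaining weight
            weights = [probs[t] for t in remaining]
            team = rng.choices(remaining, weights=weights, k=1)[0]
            remaining.remove(team)
            totals[team] += pick
    return [total / trials for total in totals]

expected = simulate_expected_picks(initialProbs, trials=2000)
print([round(e, 2) for e in expected])
```

Comparing these estimates with the no-lottery positions one through 14 gives a rough sense of how far each team would move under a full lottery.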