Analyzing Steals in the 2016-17 NBA Season

Over the 2016-17 NBA regular season there were a total of 18,950 steals. This amounted to roughly 15.4 steals per game, or 7.7 steals per team per game. Across the 236,547 possessions in the NBA last season, 8.0111% of all possessions ended in a steal. Think about that for a second. Roughly once every 12 and a half possessions, there was a steal.
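As a quick sanity check, the arithmetic behind those figures can be reproduced in a few lines. The steal and possession totals are the ones quoted above; the game count is an assumption on my part (a standard 30-team, 82-game schedule), since the text does not state it.

```python
# Season totals quoted in the text.
total_steals = 18950
total_possessions = 236547

# Assumption (not stated in the text): 30 teams playing 82 games each,
# which gives 30 * 82 / 2 = 1230 regular-season games.
games = 30 * 82 // 2

steals_per_game = total_steals / games           # both teams combined
steals_per_team_game = steals_per_game / 2       # one team's share
steal_rate = total_steals / total_possessions    # share of possessions ending in a steal
possessions_per_steal = total_possessions / total_steals

print(f"{steals_per_game:.1f} steals per game")
print(f"{steals_per_team_game:.1f} steals per team per game")
print(f"{steal_rate:.4%} of possessions end in a steal")
print(f"one steal every {possessions_per_steal:.1f} possessions")
```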

The request of a coach is simple: protect the ball. At the same time, the logic that you can’t lose the ball more times than you have the ball holds fairly well. This indicates that players with a high volume of touches have many more chances to lose the ball than players with far fewer touches. Makes sense, right? So the question is… how do we compare two players’ ability to avoid turning over a live ball to an opponent? That is, how do we compare how well a player protects the ball in non-stoppage situations?

Types of Steals

First off, let’s look at the different types of steals. There are two primary methods for obtaining a steal: take the ball from the ball handler, called an On-Ball Steal, or shoot a passing lane to nab an errant pass, called an Off-Ball Steal. According to the NBA, there are a few other definitions. One such definition is a Possession Lost Ball. These are credited to a defender if a change of possession occurs due to a non-jump-ball situation. There are some other bizarre situations, such as an offensive kick ball (which happened once in the 2016-17 season) and a Turnover Turnover, which was listed six times.

In the 2016-17 season, there were 12,154 Off-Ball steals, 6,586 On-Ball steals, and 203 Possession Lost Ball steals. The easiest way to grep these scenarios is to query play-by-play data. A simple pandas command partitions out all steals:

steals = df.loc[df['steal'].notna()]  # keep only rows where a steal was credited

At that point, we can either further partition, or simply iterate over the partitioned rows. After an initialization for each player, we can simply update using the zip command:

# stealIndex[player] holds counts as [on-ball, off-ball, poss lost ball, other]
if row['reason'] == 'bad pass ':
    stealIndex[row['steal']] = [x+y for x,y in zip(stealIndex[row['steal']], [0,1,0,0])]
elif row['reason'] == 'lost ball ':
    stealIndex[row['steal']] = [x+y for x,y in zip(stealIndex[row['steal']], [1,0,0,0])]
elif row['reason'] == 'poss lost ball':
    stealIndex[row['steal']] = [x+y for x,y in zip(stealIndex[row['steal']], [0,0,1,0])]
else:
    stealIndex[row['steal']] = [x+y for x,y in zip(stealIndex[row['steal']], [0,0,0,1])]
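For anyone who prefers to avoid the row-by-row loop, the same tally can be sketched with a crosstab. This is only an illustration on made-up rows: the 'steal' and 'reason' column names follow the snippets above, while the player names, the sample data, and the output column labels are my own placeholders. It also strips whitespace from the reason strings, which is an assumption about how the raw data is padded.

```python
import numpy as np
import pandas as pd

# Toy play-by-play rows, made up for illustration only.
df = pd.DataFrame({
    "steal":  ["Player A", np.nan, "Player B", "Player A", "Player A", "Player B"],
    "reason": ["bad pass ", "traveling", "lost ball ", "bad pass ",
               "poss lost ball", "kick ball"],
})

# Keep only the rows where a steal was credited.
steals = df.loc[df["steal"].notna()]

# Map raw reasons to steal types; anything unrecognized falls into "other".
steal_type = steals["reason"].str.strip().map({
    "lost ball": "on_ball",
    "bad pass": "off_ball",
    "poss lost ball": "poss_lost_ball",
}).fillna("other")

# One row per stealer, one column per steal type.
counts = pd.crosstab(steals["steal"], steal_type).reindex(
    columns=["on_ball", "off_ball", "poss_lost_ball", "other"], fill_value=0
)
print(counts)
```

The column order matches the four-slot list used in the loop, so the two approaches should agree on real data if the reason strings are the same.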

Who Stole the Ball the Most?

Despite there being nearly 600 NBA players this past season, there were only 462 players who gave up a steal. The 35 players with the most steals:

[Table (screenshot): the 35 players with the most steals]

Slight error: Thaddeus Young played for the Indiana Pacers in 2016-17, not Minnesota.

We also have a breakdown of how the steals were made. For instance, in a report I prepared for a Western Conference team multiple years ago, I wrote about the need to attack Stephen Curry on defense. While Curry is one of the premier steals leaders, not only does he give up a high number of points in the paint (see Russell Westbrook vs. Stephen Curry), but his steals typically come from shooting the passing lane off the ball. We see this with 97 of his 143 steals coming off the passing lane, which works out to roughly 68%. Curry’s steals also came against 95 different ball handlers.

[Table (screenshot): ball handlers Stephen Curry stole the ball from]

Note that there are a significant number of bigs on this list: Blake Griffin x5, Andre Drummond x4, Amir Johnson x2, DeMarcus Cousins x2. This is due to Curry’s active hands in the post, helping off a weak-side defender, or in transition. I encourage you to take a gander at many of his steals on YouTube with the sound off: Steph Curry steals.

In fact, we find that the top passing lane thieves are actually Trevor Ariza (HOU), T.J. McConnell (PHI), and Giannis Antetokounmpo (MIL), who each have over 100 off-ball steals!

The top On-Ball defenders, steal-wise, are John Wall (WAS), Draymond Green (GSW), Ricky Rubio (MIN), and Robert Covington (PHI). These players all ripped 58 or more steals directly from the ball handler this past season.

Steals Do Not Equal Best Defender…

Remember that steals do not indicate the overall quality of a defender, but rather are merely a cog in the equation. For example, Steph Curry does not get his off-ball steals without premier defenders such as Klay Thompson, Andre Iguodala, and Draymond Green guarding perimeter players and forcing them into bad passes. Similarly, Draymond Green does not get his on-ball steals without suffocating defensive pressure or an up-tempo pace forcing ball handlers to lapse in judgement or hesitate and find themselves in a turnover position.

That said, steals do help indicate both the court vision of a defender (off-ball steals) and the relative quickness of a defender (on-ball steals). However, these are all relative. To gain further insight, we should really look at the player-to-player interaction on a steal. For example, who is better at steals: Tony Allen (MEM) or Thaddeus Young (IND)?

One way is to measure how many chances the player has at a steal attempt. This can be done through spatial-temporal data, in which we can measure who is guarding whom at every moment of the game. If a defender is guarding the ball, that’s a chance. If a defender is guarding a player who receives a pass, that’s another chance. Here, we adopt the NBA standard that a player is “guarding” another player if they are within six feet of that player… regardless of direction.
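If tracking data were on hand, the six-foot rule could be sketched roughly as follows. Everything about the data layout here is hypothetical: I assume the tracking feed has already been cut into events (a ball-handling moment or a pass reception), each with a 'target' offensive player and a dictionary of (x, y) court positions in feet. The names and coordinates are made up.

```python
import math

GUARD_RADIUS_FT = 6.0  # the six-foot "guarding" rule described above


def is_guarding(defender_xy, offense_xy, radius=GUARD_RADIUS_FT):
    """True if the defender is within `radius` feet, regardless of direction."""
    dx = defender_xy[0] - offense_xy[0]
    dy = defender_xy[1] - offense_xy[1]
    return math.hypot(dx, dy) <= radius


def count_chances(events, defender):
    """Count events where `defender` is guarding the ball handler or pass receiver."""
    chances = 0
    for event in events:
        positions = event["positions"]
        if defender not in positions:
            continue  # defender was not on the floor for this event
        if is_guarding(positions[defender], positions[event["target"]]):
            chances += 1
    return chances


# Hypothetical events: positions are (x, y) in feet.
events = [
    {"target": "Guard X", "positions": {"Guard X": (10, 10), "Defender A": (13, 14)}},
    {"target": "Wing Y", "positions": {"Wing Y": (30, 20), "Defender A": (20, 20)}},
    {"target": "Wing Y", "positions": {"Wing Y": (30, 20), "Defender A": (30, 26)}},
]

print(count_chances(events, "Defender A"))
```

Dividing a defender’s steals by this chance count would give a per-opportunity steal rate, which is the comparison the paragraph above is after.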

Unfortunately, I do not have spatial-temporal data for the 2016-17 NBA season. So onto another method for comparing two players: Who they stole the ball from.

Tony Allen vs. Thaddeus Young

Tony Allen is considered one of the best on-ball defenders in the league. This can be measured through the effective field goal percentage of opponents, number of steals, and even through the spatial distribution of scoring by players when they are guarded by Allen vs. when they are not guarded by Allen. However, Allen has only two more steals than Thaddeus Young while playing in only three fewer games. There’s not much discrepancy.

Allen stole the ball from 78 different players last season; the biggest gluttons for punishment being James Harden (HOU) x6, Russell Westbrook (OKC) x4, Damian Lillard (POR) x4, Mason Plumlee (POR/DEN) x4, Shaun Livingston (GSW) x4, Devin Booker (PHX) x3, and Draymond Green (GSW) x3. That’s quite the list. In fact, the remainder of the list is a who’s who of players in the NBA.

Thaddeus Young shares a similar resume, with thefts from 77 different players, but that’s where the similarities end. The players Young picked up the most steals from were Nikola Vucevic (ORL) x5, Kristaps Porzingis (NYK) x4, Carmelo Anthony (NYK) x3, Khris Middleton (MIL) x3, Rajon Rondo (CHI) x3, and Timofey Mozgov (LAL) x3. It becomes very apparent that Young picks up his steals from weaker ball handlers.
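Counts like “78 different players” and “James Harden x6” fall out of a simple groupby on the steal rows. Below is a minimal sketch on made-up data. The 'steal' column follows the earlier snippets, but 'ball_handler' is a hypothetical name for whatever column identifies the player who lost the ball.

```python
import pandas as pd

# Toy steal rows, made up for illustration only.
steals = pd.DataFrame({
    "steal":        ["Defender A", "Defender A", "Defender A", "Defender B", "Defender B"],
    "ball_handler": ["Guard X", "Guard X", "Center Y", "Center Y", "Center Z"],
})

# Number of distinct ball handlers each defender stole from.
distinct_victims = steals.groupby("steal")["ball_handler"].nunique()

# Steals per (defender, ball handler) pair, largest first.
pair_counts = (
    steals.groupby(["steal", "ball_handler"]).size().sort_values(ascending=False)
)

print(distinct_victims)
print(pair_counts)
```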

Weak Ball-Handlers

So how do we measure a weak ball-handler? First off, we can look at turnovers compared to touches. This is an exceptionally simple measure. Non-steal turnovers include errant passes out of bounds, traveling, double-dribbling, and charges. There are other ways to create a turnover, but we start to stray into the bizarre, like an “offensive kick ball steal.”

We can focus on steals directly, as a steal means the ball handler either showed bad court vision on an errant pass or had weak handles and gave up the basketball to a defender. In this case, the players who committed the most turnovers via steals are these top 35 players:

[Table (screenshot): the 35 players with the most turnovers via steals]

This shows that James Harden, Russell Westbrook, John Wall, LeBron James, and Dennis Schroder are the “worst” ball handlers in the league. However, this couldn’t be further from the truth. As noted above, the totals do not indicate the worst but are rather cogs in the equation. Here, to measure ability, we must be able to count the number of touches.

Number of Touches

If we look at the number of touches, we immediately see that all these “terrible” ball handlers have the most touches. In fact, the top 5 players in having the ball stolen from them are all in the top 10 for having the ball in the first place.

[Table (screenshot): players with the most touches]

What should be disconcerting from this list is that Devin Booker (PHX) is nowhere to be found on the ball handler list despite being in the top 10 for having the ball stolen from him. So what we can do is look at rates of turning the ball over.

Turnover Via Steals Rate

If we change to rates, we obtain a completely underwhelming list of players.

[Table (screenshot): turnover-via-steal rates for all players]

Here, we suffer from small sample sizes. This is typically the case when looking at rates. NBA front offices typically like to bracket players, so we can do that too… Let’s look at all players with at least 820 touches on the season. This is at least 10 touches a game for a full 82-game season. Then we find this list of turnover-by-steal rates:
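The bracketing step is easy to sketch. The example below uses made-up season totals, and the column names 'touches' and 'steals_against' are my own placeholders, not names from the original data. It shows how a tiny-sample player tops the raw rate list and then drops out once the 820-touch minimum is applied.

```python
import pandas as pd

# Toy season totals, made up for illustration only.
players = pd.DataFrame({
    "player": ["High Volume", "Role Player", "Deep Bench"],
    "touches": [6000, 1500, 40],
    "steals_against": [150, 30, 4],
}).set_index("player")

MIN_TOUCHES = 82 * 10  # at least 10 touches a game over an 82-game season

# Rate of touches that end with the ball being stolen.
players["steal_to_rate"] = players["steals_against"] / players["touches"]

# Without a minimum, the tiny-sample player has the worst rate.
worst_unfiltered = players["steal_to_rate"].idxmax()

# With the bracket applied, only players with enough touches remain.
qualified = players[players["touches"] >= MIN_TOUCHES].sort_values(
    "steal_to_rate", ascending=False
)

print(worst_unfiltered)
print(qualified)
```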

[Table (screenshot): turnover-via-steal rates for players with at least 820 touches]

This list has more major players, or players who have the ball quite often. However, the question remains… does having a higher turnover rate mean that the player is more likely to turn over the basketball? Again, this is a sampling problem. So let’s fit a naive model to identify rates of steals, and then perform a nonparametric analysis to compare players adequately.

Regression Modeling: Linear vs. Nonparametric

The most basic model to apply is the linear regression model. That is, we assume that the number of turnovers due to steals is a linear function of touches, has constant variance conditioned on the number of touches, and is Gaussian distributed at each and every number of touches. By the way, these assumptions fail miserably. But we’ll plot anyway!

A nonparametric method is to fit a local polynomial. In fact, cubic splines don’t perform too poorly here. They capture the trend of turnovers per touch fairly well as players with fewer touches have a higher rate of turnovers, verified by the data above, and then seemingly calm back down for players with a moderately high amount of touches. We then see the rates increase again for players with high volume of touches. This, however, is primarily due to the “hero-ball” movement played by Russell Westbrook and James Harden. Recall, they had a triple double MVP fight the entire season. Their turnover numbers and number of touches are clearly indicative of this.
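To make the contrast concrete, here is a minimal sketch of a global linear fit next to a kernel-weighted local polynomial fit. It is not the exact procedure behind the plot: the data is synthetic (an arbitrary curved trend plus noise), the Gaussian kernel and the bandwidth of 500 touches are my own choices, and `local_poly_fit` is a hand-rolled helper rather than a library routine.

```python
import numpy as np


def local_poly_fit(x, y, x_eval, bandwidth, degree=1):
    """Kernel-weighted local polynomial regression, evaluated at each x_eval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    fitted = np.empty(x_eval.size)
    for i, x0 in enumerate(x_eval):
        # Gaussian kernel: points near x0 count the most.
        w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        # Design matrix in powers of (x - x0), so the intercept is the fit at x0.
        X = np.vander(x - x0, N=degree + 1, increasing=True)
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted


# Synthetic data, made up purely for illustration (not the season's numbers).
rng = np.random.default_rng(0)
touches = np.linspace(100, 7000, 150)
steals_against = (
    20 + 0.02 * touches + 40 * np.sin(touches / 1000) + rng.normal(0, 5, touches.size)
)

# Global linear regression: one slope and one intercept for every player.
slope, intercept = np.polyfit(touches, steals_against, 1)
linear_hat = slope * touches + intercept

# Local polynomial regression: a separate weighted line around each point.
local_hat = local_poly_fit(touches, steals_against, touches, bandwidth=500.0)

rss_linear = float(np.sum((steals_against - linear_hat) ** 2))
rss_local = float(np.sum((steals_against - local_hat) ** 2))
print(f"RSS linear: {rss_linear:.0f}, RSS local: {rss_local:.0f}")
```

The bandwidth controls the trade-off: a very wide bandwidth collapses back toward the single straight line, while a very narrow one chases the noise.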


Distribution of Turnovers Due to Steals over the Number of Touches for the Ball Handler. The Linear Regression (green) and Nonparametric Local Polynomial Regression (orange) are plotted as well. Being above the lines is bad.

Here we not only see the non-constant variance rear its ugly head as the number of touches increases, but we also gain insight as to how to compare players. For instance, we can simply measure the distributional difference above and below the respective regression lines to identify how far from “normal” or “expected” a player is when handling a basketball.
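One simple way to turn that idea into a ranking is to take each player’s actual turnovers via steals minus the fitted value at his touch count. The sketch below uses made-up numbers, and the 'expected' column is a stand-in for the fitted value from whichever regression line you trust. Positive residuals sit above the line (worse than expected), negative ones below.

```python
import pandas as pd

# Toy values, made up for illustration only.
players = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C"],
    "steals_against": [120, 45, 80],
    "expected": [100.0, 60.0, 80.5],  # hypothetical fitted values
}).set_index("player")

# Distance from the regression line: above the line is bad.
players["residual"] = players["steals_against"] - players["expected"]

# Best ball protection first (most negative residual).
ranked = players.sort_values("residual")
print(ranked)
```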

Using the nonparametric polynomial fit, we see that while Harden and Westbrook (they are the far right points) are above the line, they really aren’t as detrimental as Devin Booker, who has the highest difference compared to all other players.

We also find that Marvin Williams (CHA) is the best at not turning over the ball to another player. This is particularly helpful, as he is a power forward and tends to be in traffic quite a bit. Remember Thaddeus Young’s best customer for steals? That’s right, Nikola Vucevic is the second best ball handler. In fact, ten percent of the steals against him came at the hands of Young.

Rounding out the top five ball handlers: third is T.J. McConnell, who happens to be one of the top players in steals in the league; fourth is Kemba Walker (CHA); and fifth is Giannis Antetokounmpo, who (like McConnell) is also a premier off-ball defender.


Here, we saw how to compare different players in steals and ball-handling capability from play-by-play data. We were also able to discern the difference between an on-ball defender and an off-ball defender and their impact on steals. Most importantly, we showed how regression assumptions are typically poor (by the way, Poisson regression fails massively here as well) and how applying nonparametrics gives us a little more flexibility to fit the data well.

So the real question is… did you realize before today that T.J. McConnell is one of the best two-way ball-handler/stealers in the league?

