I tried again using data from a different source (https://github.com/903124/NBA_oneway_stint), and I got the exact same RAPM values for the top 25. The problem with the variances being about twice as high persists, though, and I think I’ve found the reason why. To calculate R^T W R, I was using R = ratings2 - stints*beta and WR = ratings - weights*beta. This should be correct, because ratings is WY and weights is WX (from what I can tell, this is the transformation you mentioned). However, if I instead use R^T R (which indeed is about half the size), then I match your variances. Is there some other mistake I’m making here? It seems like my method of calculating R^T W R should work.
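To see whether the two residual products really differ, here is a small numeric sketch. The variable names (stints, ratings, w) and sizes are placeholders standing in for the RAPM inputs discussed above, not the blog's actual data; the algebra of r^T W r versus r^T r is the point.

```python
import numpy as np

# Hypothetical toy data standing in for the RAPM inputs; names and sizes
# are placeholders, not the blog's actual variables.
rng = np.random.default_rng(0)
n, p = 200, 10
stints = rng.integers(-1, 2, size=(n, p)).astype(float)  # X
ratings = rng.normal(size=n)                             # Y
w = rng.integers(1, 8, size=n).astype(float)             # possessions per stint
lam = 50.0

# Weighted ridge estimate: beta = (X^T W X + lam*I)^-1 X^T W Y
beta = np.linalg.solve(stints.T @ (w[:, None] * stints) + lam * np.eye(p),
                       stints.T @ (w * ratings))

# r^T W r computed two equivalent ways:
r = ratings - stints @ beta
rss_w1 = r @ (w * r)  # using W's diagonal directly
# same quantity as (Y - X*beta)^T (WY - WX*beta), the form described above:
rss_w2 = r @ (w * ratings - (w[:, None] * stints) @ beta)
assert np.isclose(rss_w1, rss_w2)

# The unweighted r^T r is a genuinely different quantity:
rss_plain = r @ r
```

So if the weights are nontrivial, r^T W r and r^T r should not agree, which is consistent with the factor-of-two discrepancy described above being real rather than a rounding issue.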


Interesting. I know Positive Residual’s method of using MCMC to estimate variances came up with values within 5% of the theoretical variance estimates, which is common for MCMC methods.

I wouldn’t know why something would be twice as high, other than maybe the correlation was accidentally doubled or squared? But that would be approximately a 1.5x increase.


Yeah, my values are (unsurprisingly) about the same as well, but for some reason all the variances I can check against yours are about twice as high. I assume this is just the result of a coding error on my part, but I’m stumped about what it could be. That’s why I was wondering whether it could instead stem from different inputs. Thanks again!


For the post, I used whatever Ryan’s latest dump was (at that time), since that is publicly available. His parser has similar errors to my own parser (some stints had something like 70 possessions) due to wonkiness in the pbp. Also, we tend to count free throws differently. For instance, offensive possessions can score negative points if they create a technical foul and the defense makes the free throw. I believe if the offense loses you points, they should be charged. Some (read that as MANY) folks will “throw out” technical fouls. The ones I throw out are the ones due to coaches and players on the bench.

At that time, from the data Ryan sent me, we had almost identical values for RAPM across the board. After this post, PositiveResidual also obtained very close values (he didn’t know the tuning parameters we both used).


In both the theory and the code it’s R^T*W*R. I think it looks a little off to you because I make a transformation that allows the code to run fast. Processing the weight matrix by itself dramatically increases the computation time of the algorithm (thanks to inverses on other data frames).
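One standard transformation with this effect is to fold the weights into the data by scaling each row by sqrt(w_i): weighted ridge on (X, Y) becomes ordinary ridge on (sqrt(W)X, sqrt(W)Y), and R^T W R becomes a plain sum of squares. Whether this is the exact transformation used in the blog's code is an assumption; the algebra itself is standard.

```python
import numpy as np

# Sketch of the sqrt-weight transformation; data here is arbitrary.
rng = np.random.default_rng(1)
n, p = 150, 8
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
w = rng.uniform(0.5, 5.0, size=n)
lam = 10.0

sw = np.sqrt(w)
Xt = X * sw[:, None]   # sqrt(W) X
yt = y * sw            # sqrt(W) Y

# Ordinary ridge on the transformed data ...
beta_t = np.linalg.solve(Xt.T @ Xt + lam * np.eye(p), Xt.T @ yt)
# ... matches weighted ridge on the original data:
beta_w = np.linalg.solve(X.T @ (w[:, None] * X) + lam * np.eye(p),
                         X.T @ (w * y))
assert np.allclose(beta_t, beta_w)

# And the transformed residuals recover R^T W R without ever forming W:
r = y - X @ beta_w
rt = yt - Xt @ beta_t
assert np.isclose(rt @ rt, r @ (w * r))
```

The payoff is that after the transformation no n-by-n weight matrix ever has to be built or multiplied, which is where the speedup comes from.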


You’re missing the transpose. The matrix is not NxN, but rather (2P+1)x(2P+1).
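A quick shape check makes the point concrete. With N stints and P players per side plus one extra column (hence the 2P+1 above), the design matrix X is N x (2P+1); the sizes below are arbitrary placeholders.

```python
import numpy as np

# Without the transpose the product doesn't even conform;
# X.T @ W @ X is (2P+1) x (2P+1), not N x N.
N, P = 500, 30
X = np.zeros((N, 2 * P + 1))
w = np.ones(N)                  # diagonal of W
A = X.T @ (w[:, None] * X)      # X^T W X
assert A.shape == (2 * P + 1, 2 * P + 1)
```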


Thank you for doing this. Maybe I’ll try my hand at computing confidence intervals for related stats like PIPM or BPM, unless their creators already have?

