Evaluating the Reliability of WNBA Player Stats
One of the challenges with evaluating player performance in any sport is determining an adequate sample size to represent a player’s true skill level. It’s pretty obvious (to most, at least) that taking a one-game sample does not indicate how good of a player someone is. But we also don’t want to have to wait an entire season to determine a player’s strengths and weaknesses from a statistical standpoint. So, at what point can we become more confident that someone has played enough to have her talent shine through in box scores?
Today, I’ll be reviewing several player-level stats related to both efficiency and usage in the WNBA. For each stat, I aim to evaluate how taking more shots* impacts the reliability of different stats for players. In the context of this article, reliability is an attempt at measuring the ratio of a player’s true talent to her observed performance in the form of a box score stat.
*For this post, referencing more shots taken (or shots as a sample type) may also refer to possessions played, depending on the stat being discussed.
Methodology
This section will focus mostly on the nerdy stuff, so feel free to move on to the following section if you’re just looking for the takeaways. The methodology for measuring reliability is based on previous research done by FanGraphs for various MLB stats. As they lay out, there are several ways in which reliability could be measured. In that research and mine, Cronbach’s formula was used to measure a stat’s alpha (a measure of reliability). More specifically, the alpha value is intended to calculate the ratio of the variance of a player’s true talent level to the variance of the observed stat. Below, I adapted an excerpt from FanGraphs’ math supplement article explaining the alpha calculation to my specific problem:
I calculated alpha in 5-shot increments and 10-possession increments for K. To calculate it, I set up N x K matrices including each shot attempt in my sample, where K represents the number of shots and N represents the number of players. Each matrix looks like the following table:
There are many different ways to formulate Cronbach’s alpha. For the specific application of this project the following formula is used:
Where i represents each shot attempt 1 through K, t represents the Player Totals for each player 1 through N, and K is defined as above.
This formula uses the ratio of the variance of each shot attempt among all players (s_i^2) and inter-player variance (s_t^2). The variance within the shot can be thought of as the noise within the particular stat. The inter-player variance is the variance of the Player Totals column in the example matrix. (It should be noted this variance term is the variance of the sum of the binomial outcomes for each player and not of the sum divided by K.) Subtracting this ratio from 1 yields the relationship for alpha: the ratio of true variance to the observed stat variance.
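For readers who want to experiment, here is a minimal sketch of the calculation described above. It assumes the standard textbook form of Cronbach’s alpha, alpha = K / (K - 1) * (1 - sum of s_i^2 / s_t^2), which may differ in detail from the exact formulation used for this article. The function name and the tiny example matrices are hypothetical and only for illustration.

```python
from statistics import pvariance


def cronbach_alpha(matrix):
    """Cronbach's alpha for an N x K matrix of outcomes.

    matrix: list of N rows (one per player), each a list of K
    outcomes (1 = make, 0 = miss). Assumes the textbook formula;
    the article's exact formulation is not reproduced here.
    """
    n_players = len(matrix)
    k = len(matrix[0])
    if n_players < 2 or k < 2:
        raise ValueError("need at least 2 players and 2 attempts")

    # s_i^2: variance of each attempt column across all players.
    item_variances = [
        pvariance([row[i] for row in matrix]) for i in range(k)
    ]
    # s_t^2: variance of the Player Totals column (sum, not mean).
    total_variance = pvariance([sum(row) for row in matrix])
    if total_variance == 0:
        raise ValueError("player totals have no variance")

    # Population variance is used for both terms, so the N divisor
    # cancels in the ratio (sample variance would give the same alpha).
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: every attempt agrees within a player.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
# Hypothetical data: attempts are unrelated across players.
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]

print(cronbach_alpha(consistent))
print(cronbach_alpha(unrelated))
```

The first matrix is all signal and the second is all noise, so they sit at the two ends of the reliability scale. Real shot data lands somewhere in between.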
In terms of sampling technique, I used a random sampling method of shots for each player-season that had at least as many attempts as the sample size I was evaluating. For example, when measuring the alpha of free throw percentage with a sample of 50 attempts, I took all the player-seasons with at least 50 free throw attempts in a season. I randomly chose 50 free throw attempts from within that player-season to include in the calculation. The data I used is from the regular season of the 2007 through 2023 seasons.
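The sampling step can be sketched as follows. This is an illustration under stated assumptions, not the exact code behind the article: it assumes attempts are drawn without replacement, and the function name, the dictionary layout, and the made-up player-seasons are all hypothetical.

```python
import random


def build_sample_matrix(player_seasons, k, seed=0):
    """Build the N x K matrix for a sample size of k attempts.

    player_seasons: dict mapping a (player, season) key to a list
    of 0/1 outcomes for every attempt in that player-season.
    Player-seasons with fewer than k attempts are skipped.
    Assumption: attempts are sampled without replacement.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    matrix = []
    for key in sorted(player_seasons):
        attempts = player_seasons[key]
        if len(attempts) >= k:
            matrix.append(rng.sample(attempts, k))
    return matrix


# Made-up player-seasons, for illustration only.
example = {
    ("Player A", 2023): [1] * 60,      # 60 attempts, qualifies
    ("Player B", 2023): [0] * 40,      # 40 attempts, excluded at k=50
    ("Player C", 2022): [1, 0] * 30,   # 60 attempts, qualifies
}

sample = build_sample_matrix(example, k=50)
print(len(sample), [len(row) for row in sample])
```

Each row of the result is one player-season, ready to feed into an alpha calculation at that sample size.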
With that, let’s get into the analysis and takeaways!
Shot Efficiency
I’ll kick things off with the only stat that isn’t efficiency-based that’s on the chart above but is measured on a per-attempt basis, which is the three-point rate (i.e., the percent of a player’s field goal attempts that are three-pointers). As you can see, this stat stabilizes almost immediately, reaching an alpha value of over 0.9 after fewer than 50 field goal attempts. So, from the outset of a season, it only takes a few games for players’ observed three-point rate to be indicative of their true three-point rate.
However, luck plays a bigger role in determining three-point percentage than other shot types. This, combined with a smaller sample size, makes the reliability of three-point percentage much lower within a given season. Just two players, Arike Ogunbowale and Jewell Loyd, attempted over 300 three-point shots last year. Among the 42 players who attempted at least 100 three-point shots in 2023, three-point field goal percentage ranged from as low as 29% to as high as 45%. While it’s just a one-season example, the data portrays the stat’s randomness well.
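To get a feel for how much of that spread could be luck alone, here is a rough sketch using the standard error of a binomial proportion. It assumes every attempt is independent with the same make probability, which real shooting does not strictly follow, and the 35% true percentage is a hypothetical value chosen for illustration rather than a figure from my data.

```python
from math import sqrt


def binomial_se(true_pct, attempts):
    """Standard error of an observed make percentage.

    Assumes independent attempts with a fixed make probability.
    """
    return sqrt(true_pct * (1 - true_pct) / attempts)


# Hypothetical 35% true shooter at two attempt counts.
for attempts in (100, 300):
    print(attempts, round(binomial_se(0.35, attempts), 3))
```

Under those assumptions, a shooter with 100 attempts carries a standard error of nearly five percentage points, so a sizable share of a 29% to 45% range could come from noise rather than skill.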
Another stat often noted as being more luck-driven is free throw percentage. That notion, though, is typically within the bounds of a single game and our samples extend beyond that. That said, we once again struggle with a small number of players with many free throws. But, the ones that do attempt many show stabilization around the 100-shot mark. Overall, free throw percentage reaches the second-highest reliability behind three-point rate.
Field goal percentage is obviously a combination of two-point and three-point attempts, but it shows some solid stability around the 250-shot mark. In this day and age, field goal percent is a rather flawed stat in terms of evaluating a player’s true efficiency as it doesn’t account for the type of shot being taken. Still, we can be more confident about it measuring a player’s true skill for field goal percentage than other stats.
The rest of the stats (two-point percentage, true shooting percentage, and effective field goal percentage) all drift towards being more reliable as more shots are taken but never truly hit an apex of reliability.
Shot Efficiency Averages and Standard Deviations
One other way that we can evaluate these stats is to see how the sample’s average and standard deviation (dotted line) change as the sample size increases:
If we were to regress player stats, this chart provides the average to which we would regress at each shot attempt count. The standard deviation would then give us a way to form confidence bands around that regressed stat. For example:
Player A has taken 300 field goal attempts and has a field goal percentage of 53.9%. At 300 field goals, the alpha value for field goal percentage is approximately 0.76, while the sample average is approximately 43%.
We then multiply the observed 53.9% by 0.76 and the sample 43% by 0.24 (1 – 0.76), add the two together, and get a regressed field goal percentage of 51.3%. This, in theory, should be a better representation of the player’s true skill in the context of field goal percentage.
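The Player A arithmetic can be written as a small helper. This is only a sketch of the weighted average described above; the function name is hypothetical, and the inputs are the example values from the text.

```python
def regress_to_mean(observed, alpha, sample_mean):
    """Weight the observed stat by alpha and the sample average by 1 - alpha."""
    return alpha * observed + (1 - alpha) * sample_mean


# Player A: 53.9% observed, alpha of 0.76, sample average of 43%.
regressed = regress_to_mean(observed=0.539, alpha=0.76, sample_mean=0.43)
print(f"{regressed:.1%}")  # 51.3%
```

The closer alpha is to 1, the less the observed number moves toward the sample average.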
As you can see, each of the stats (aside from three-point rate) shows a very similar trend in that the averages slowly increase and the standard deviation drops off almost immediately before stabilizing. Therefore, the confidence bands for these stats are going to be much smaller as we get more shot attempts taken by the player.
Player Usage
In this next section, I’ll focus on various usage stats that are all measured on a per-possession basis. Due to the nature of my analysis and the data I am working with, I’m defining a “possession” as any team field goal attempt or team turnover. As a reference, teams averaged about 90 possessions per game in the sample I used.
Turnover and Steal Percentages
Here, I’ll start from the bottom up with turnover and steal percentage. It shouldn’t be too surprising that these are among the least reliable stats, given the low counts at which they accumulate. Additionally, neither of them really hits a plateau until a massive number of possessions is reached, so more is better in the case of these two stats.
Usage Rates
What I find most valuable is how quickly the usage rate gets to a highly reliable level. Simply put, player usage is known almost immediately from the outset. With a WNBA game having about 90 possessions, as I mentioned above, it would take just about 10 games for the alpha value of usage rate to reach a level where we can be confident that a player’s season-to-date usage reflects how much usage she will have going forward.
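As a quick sketch of how a possession threshold translates into games, the helpers below use the possession definition from this section (team field goal attempts plus team turnovers) and the roughly 90 possessions per game from my sample. The function names and the box-score numbers are hypothetical, and the 900-possession target is simply 10 games times 90 possessions rather than a value read off the chart.

```python
from math import ceil


def team_possessions(field_goal_attempts, turnovers):
    """Possessions as defined in this article: team FGA plus team turnovers."""
    return field_goal_attempts + turnovers


def games_to_reach(target_possessions, possessions_per_game=90.0):
    """Whole games needed to accumulate a target number of possessions."""
    return ceil(target_possessions / possessions_per_game)


# Made-up single-game team box score: 75 FGA and 15 turnovers.
print(team_possessions(75, 15))

# Inferred target: about 10 games at about 90 possessions each.
print(games_to_reach(900))
```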
Rebound and Assist Percentages
Finally, rebound percentage and assist percentage follow a nearly identical curve to usage percentage. Assist percentage starts slightly below the other two, slowly closes the gap in alpha values as more possessions are added, and still offers strong reliability early on. But, what shocked me the most was rebound percentage having the strongest reliability of all these usage stats. Typically, lower rebound counts for individual players in a single game can make them feel more random. That said, my hypothesis would be that converting it into a percentage offers added benefits that aren’t captured in raw rebound averages.
As I did above, I’ll now look at the averages and standard deviations for each of the different stats:
The only major takeaway here is related to usage percentage. Intuitively, it makes sense that the usage rate for players would gradually rise as more possessions are measured, as the players with the highest number of possessions are often the ones being used the most. Additionally, it’s worth noting that, while already small, the standard deviations stay at nearly the same level regardless of the sample size.
Conclusion
Hopefully, what I’ve provided offers some utility for those creating projections by showing which stat categories might introduce more variance on a game-to-game basis. What I’ve shared is, by no means, conclusive evidence for when the measured stats are good to use, but hopefully, it at least provides some directional accuracy in terms of which stats may be more stable than others. Furthermore, I would also acknowledge that there are alternate methods I haven’t explored that may offer different (or better) results. That said, in a future article, I’ll discuss how we can apply these reliability values to in-season player stats to get a better representation of a player’s true talent.
From a betting perspective (this is BettingPros, after all), when attacking the player prop market, I would lean into longer odds on stats with more variance (e.g., three-pointers made, turnovers). For example, both Kayla McBride and Bridget Carleton have been three-point mavens this year as they’re each shooting over 43% from beyond the arc, averaging 6.7 and 4.7 three-point attempts per game, respectively. We can be pretty confident they’ll both shoot a decent number of threes, but if my projection leans towards the over, I would prefer betting on an alternate over at longer odds.