Illuminating the Black Box: Using Statistics to Understand Women's College Basketball

"Then there is the man who drowned crossing a stream with an average depth of six inches." 
~W.I.E. Gates

"Do not put your faith in what statistics say until you have carefully considered what they do not say." 
~ William W. Watt

"People use statistics in the same way a drunk man uses a lamp-post. For support rather than illumination."
~ G. K. Chesterton

* * *

With the year (and decade) drawing to a close, people are flooding the web with "best of…" and "year in review" posts for everything from top sporting moments to commercials targeting women.

Among the more interesting lists that emerged out of this annual ritual was Stephen Dodson's post about the top ten articles of the year. Two of Dodson's top articles were sports articles: Bill Simmons’ interview with Malcolm Gladwell and Michael Lewis' article about Shane Battier.

As I re-read the Lewis article, a line about the role of statistics in professional basketball caught my eye.

The No-Stats All-Star -
...the big challenge on any basketball court is to measure the right things. The five players on any basketball team are far more than the sum of their parts; the Rockets devote a lot of energy to untangling subtle interactions among the team’s elements. To get at this they need something that basketball hasn’t historically supplied: meaningful statistics.

For most of its history basketball has measured not so much what is important as what is easy to measure — points, rebounds, assists, steals, blocked shots — and these measurements have warped perceptions of the game. ("Someone created the box score," Morey says, "and he should be shot.")

Personally, I like statistics because they help me to pinpoint reasons a team won or lost, see patterns that might have seemed like disconnected moments, and organize my thinking about the game.

However, while the use of statistics has definitely been growing among basketball enthusiasts, there are people who lament their use for good reason: they capture neither the full story of the game nor the nuance of the type of "good basketball" exemplified by the subject of Lewis' piece, Houston Rockets forward Shane Battier.

I'm sure we can all think of examples where statistics were used to support something outlandish.

For example, a student of mine tried to explain to the class that Denver Nuggets forward Carmelo Anthony was better than Cleveland Cavaliers forward LeBron James on the basis of Anthony's scoring average. Although it might be possible to present evidence to support the claim that Anthony is the best NBA player right now, my student's argument amounts to a logical fallacy almost as egregious as the one that sank the man in W.I.E. Gates' quote, who drowned crossing a stream -- scoring average is but one slice of a much larger story of what makes a great player, not the basis of an argument about what makes the best player.

However, the problem is not so much the statistics themselves as people not carefully considering what they wish to say with the statistics. As recently noted by Atlanta Hawks GM Rick Sund, statistics are but one tool to assess basketball performance, not the final authoritative answer by any means.

For Woodson, numbers just part of the equation
Sund said he considers the various numbers when evaluating players for acquisition. For instance, he said guard Jamal Crawford's statistics in the final five minutes of games helped sway him to trade for him over the summer. But he calls the data "just one other tool that goes in the hopper" to judge talent and is not the final arbiter.

I really made the turn to statistics when I started following women's basketball simply as a guide to figure out who the major players were and what things I should be paying attention to. Statistics have simply been a way to acclimate myself to women's basketball. To claim that I wouldn't need statistics to learn the game because watching games is sufficient would implicitly assume a) that I'm not obsessed with basketball and b) that my own subjective observation of a game is inherently more accurate than "cold, hard" numbers, quite an arrogant claim for a novice.

The key, however, is to keep it all in perspective: statistics should be used to complement, not replace, the subjective beauty of games like basketball.

An Introduction to Advanced Basketball Statistics | Empty the Bench
As much as I love the aesthetic beauty of the game that first attracted me to basketball, I try to find and use meaningful statistics whenever I have a real conversation about what makes a team or a player good or bad. The basic stats do paint a fuzzy picture of what’s happening, but it’s some more advanced stats that add the details we should be looking for.

So the question is: what details should we be looking to illuminate with the numbers?

What statistics do well is provide us with insight into patterns among the fine details that we might have some fuzzy perception of during one game, but struggle to capture and identify well over the course of 30, 34, or 82 games. On a grander scale, statistics help us put things in perspective by allowing us to compare and contrast single-game, season, and career performances within the context of a much broader narrative. Therefore, over time, statistics provide us with the ability to illuminate patterns and trends about the games we love that would otherwise be impossible to see.

Using Statistics to Understand Women's College Basketball

I would argue that any serious women’s college basketball fan should actually be more interested in statistics simply because it is not humanly possible to see even every tournament team play every game, much less all 346 teams that compete in NCAA Division I.

Furthermore, while it’s hard enough to follow all of men’s basketball, it’s even harder to follow women’s basketball given that so few games are televised and even fewer are given in-depth coverage. If nothing else, the statistics can tell us some kernel of a story about what happened in a given game or over the course of a given season that we would otherwise not have.

Some might argue that statistics are useless in college basketball because the strength of competition, schedules, and even talent level within teams vary so greatly. And that’s a good point.

However, what is useful about statistics for a sport as broad as college basketball is that if we carefully consider what they say and do not say, they can be used to approximate elements of the game that add to our understanding of the action.

Most often, people have sought statistics that predict future outcomes, not only for gambling purposes, but for the sake of fans who want to play "what if" games (among the most exciting things about watching sports is imagining bounded possibilities and watching them play out). However, to truly understand the landscape of the game -- not just consume information -- one needs more than outcome data.

In other words, what’s needed in order to best follow women's college basketball is a set of statistics that accurately describe as much as we can about the character of the game in addition to explaining why the outcome came about.

For example, things such as ball movement, whether a team played together well, or whether a team actually played well and lost or played terribly and won are difficult for the casual fan to see just by looking at final scores and basic statistics.

Of course, this is a fine distinction: it’s generally hard to separate even qualitative description from analysis, especially when trying to add qualitative nuance to a description with quantitative data.

Moreover, there is no way to possibly explain in numbers the things that to my mind create the aesthetic beauty of basketball: a point guard who consistently drives through the lane and draws the defense to create plays for others, a defense in which all five players appear to make a unified response to the offense, or how a team identifies and exploits a mismatch.

Perhaps another way to think about this is as trying to illuminate a black box: simply knowing which players were on the court (inputs) and knowing the final score (output) is not what any sport is really about. What's more interesting is how those players responded to each other and their opponents in the series of moments that led to the outcome.

No -- the beauty of those moments cannot be recounted by narrative description or a well-developed statistical analysis. So I suppose that underlying this desire to find more nuanced means of capturing college games is a willingness to satisfice: to get a good result that is good enough although not necessarily the best, as described by Herbert A. Simon in 1957.

I just want to find better ways to understand this game that I cannot possibly follow in its entirety.

So here's the question that I've been struggling with and have come up with some tentative answers for after watching as many pre-season games as I could: how can we give the most accurate account of what’s happening in women’s college basketball – even in one conference – if it’s essentially impossible to watch every game due to fundamental barriers such as accessibility, time, and money? What might statistics say and not say in response to that question?

Thankfully, although I’m not a statistician, I believe that most of the statistical tools necessary to accomplish this goal already exist. And since I'm on vacation for the holidays and will need something to do while taking breaks from family, I thought I would share my latest quantitative thinking and then actually get ready to use a few things during conference play.

Even though none of the tools I favor were developed for women’s basketball (thus further adding to a laundry list of imperfections in an imperfect process) I would argue that they work well enough across all levels of basketball – men’s or women’s, college or professional – that they are useful in approximating an account of what happens in a women’s college basketball game despite the fact that they might not explain outcomes.

Click for Part 2: Putting Games in Perspective With Model-Estimated Value