Putting Games in Perspective with Model-Estimated Value: UConn Sets a Standard for Excellence

Mental Toughness: Beyond the L's and W's | Women Talk Sports
We, as athletes and coaches, really need to evaluate, to self-evaluate our performance not by the L’s and W’s column, but by how we compare against our past, by how we performed game to game and challenge to challenge. It’s unfortunate that so many coaching jobs are decided based on that L’s and W’s column and statistically what a “winning season” is. That final column tells an inaccurate tale of a team.

I will never forget my dad’s words after my final high school athletics awards ceremony.

My coach had announced that I had been selected second team all-league, a sign of steady improvement in my game over four years and an honor I was quite proud of.

“Well that was a surprise,” he said matter-of-factly as we walked out of a brand new facility that I would never have a chance to play in due to graduation. “I guess it was because of that one game in the playoffs. But you never really shot enough.”


I mean, seriously – my basketball career was essentially over, I had just gotten a nice award, and we had been recognized as the most successful boys team in school history having won the city championship and made the state tournament (and lost first round to a team with a kid who was later found to be ineligible). The awards aren’t even determined based upon playoff performance. A “good job, son” would have sufficed.

However, his comment on my modest achievement is sort of representative of the type of constant critique I got throughout my development as a basketball player – his concern as my first coach was never winning and losing, but constantly improving my performance. After a game he would question me on my decision making, help me think through mistakes, and make suggestions on what I could do better. It never felt mean spirited or judgmental, just honest. Even when my career was coming to an end -- we talked about walking on in college but he reminded me that 5'10" off guards don't do so well in the pros -- the emphasis was still on getting better, independent of accolades.

When I listen to college coaches talk about their teams after games, I am often reminded of my dad’s approach to my development as a basketball player – the emphasis was always the process that led to desired outcomes, rather than the outcomes themselves. And although my dad never won a championship beyond my brother's 6th grade rec league, his thinking is actually quite similar to the pursuit of perfection described by University of Connecticut coach Geno Auriemma.

UConn and the Best of the America East Conference: Striving for Perfection - Swish Appeal
Hartford's Rizzotti was shaped by four years as a player at UConn where she was a national player of the year and won a national championship in 1995 as the point guard of Geno Auriemma's first undefeated team. In the post game press conference, Rizzotti shared that she told her Hartford team, "You wonder why I'm a crazy person. This is where I come from. These guys are up 35 and I'm sure (Auriemma) went in the locker room and had something to complain about. And that's where I come from. Whatever you do, we're going to want more. Yes, I want you to be perfect. Do I expect you to be perfect? No, but we're going to strive to be. What's the most perfect version of Hartford that we can be?"

As fans, I think it’s easy to forget that good coaches and athletes know that the “final column tells an inaccurate tale of a team.”

So what I want are tools to tell a more accurate tale of how close to "excellence" a team is, and UConn is a good place to start. I think statistics are one such tool. First, I'll lay out my thinking; then I'll explain the metric that I find useful for the task; and finally I'll use it as a lens to evaluate UConn's impressive win over Stanford yesterday and compare that performance to other games to build a framework for analysis.

Sure, a team’s record tells you that one team was better than another team for a given period of time on one day. The final score might give you a vague idea of how much better the winner was in that time interval. However, those things tell us very little about how a team has actually performed “game to game and challenge to challenge” as players and coaches think about it – coaches tend to see things within a longer trajectory of development based on practices, games, and where they would like to go.

Just as Auriemma pushed Rizzotti (or, to a lesser extent, as my dad pushed me), the question that is often difficult to answer just by looking at wins and final scores is how well a team played on a given night within the broader context of previous performances.

How close was a team to that standard of perfection they pursue, independent of the final score? How well did a team play within the broader trajectory of their development within a given season?

Evaluating the path to perfection

Anybody who has watched basketball has probably witnessed a game in which they leave thinking, well, they won that game, but they didn’t deserve it. Or witnessed a blowout where we’ve literally left the arena and said, "Well that didn’t mean anything -- the opponent just played terribly."

Conversely, we’ve also probably seen a team lose a game that they "should have" won if not for some bad luck or a late run…or a game that they "should have" lost if not for good luck or a late run.

Similarly, an extremely talented team can win an extremely sloppy game simply as a function of being bigger, faster, or more skilled whereas a scrappy team with less heralded talent can play really well as a team and still lose a well-executed game.

Often times, it’s these subjective assessments of how well a team played despite the outcome that make following a team exciting…or agonizing (e.g. being a Warriors fan, pre-, post-, or during the “We Believe” run).

As someone obsessively interested in learning the women’s college game, these qualitative assessments add to the story of the game and tell me a little more about the state of a team beyond winning and losing, especially when comparing the performances in and out of conference play in which the quality of competition (and in some cases, how well a team has come together) varies widely. That subjective quality of the game is part of what’s lost in the numbers and part of what we want to know from the accounts of others.

What numbers can illuminate

In yesterday’s post, I tried to argue that statistics were necessary to understanding women’s college basketball with any nuance because it’s absolutely impossible to watch every game – the statistics help us capture patterns in a team’s performance beyond wins and losses that we would otherwise have no way of evaluating. So the question was as follows: what details should we be looking to illuminate with the numbers?

The question is especially pertinent to arguments in March about who deserves what seed and who might make a deep run given that even the most well-informed fan cannot realistically watch every game. Sometimes it's helpful to know how well a team is playing as a unit, which is distinct from, but connected to, whether a team is winning or will win the next game. Yet it's something that simply does not show up in the score or a team’s record.

Over the course of this past WNBA season and the college pre-season, I’ve been tracking performance quality using David Sparks’ Model-Estimated Value (MEV) metric for games I’ve watched in person. While even Sparks admits that it is not well correlated to winning, I’ve found that it’s a very reliable descriptor of how well a team played.

So using examples of college games, I’ll lay out a framework (not necessarily statistical significance) for evaluating the quality of a team’s performance. Regardless of the fact that watching a game will always be better than number gazing, I think MEV is a helpful starting point for insight into college games.

Why Model-Estimated Value?

There are plenty of statistics out there with which to analyze basketball games, so why use Sparks’ metrics as opposed to anyone else’s?

First, I appreciate that the aim of his work was not necessarily to predict wins, but to assess the value of individual and team performance in order to avoid the cheap talk that has become prevalent in sports talk.

Hardwood Paroxysm: The Arbitrarian Manifesto
The end of æsthetics?

Perhaps this diatribe comes across as unappreciative of the beauty and elegance of sport--the aesthetic appeal. Personally, nothing could be further from the truth. The fundamental draw of athletic competition is the spectacle, and the uncertainty over what might happen next. Few human endeavors are more impressive than Allen Iverson, at a relatively advanced age, sprinting down the court, darting through a forest of men much taller than he, and somehow finding a way to put the ball in the basket while taking physical abuse that would cripple any ordinary man.

Rarely do we get to witness, and even participate in as fans, such emotional catharsis as comes with capturing a championship--the resolution of the hopes, dreams and efforts of not only a small collection of players, but a contagion of support staff, a supportive city, and even many sympathetic casual observers around the world. Analysis is not the opposite of beauty, methodological rigor is not the opposite of casual observation--rather each is a necessary part of a whole. For a fuller sense of enjoyment and understanding of any game, we look to statistics to confirm the impressions we have from just watching; just as sometimes, we look to the court to confirm what the numbers seem to be telling us. There is no right or wrong way to approach the appreciation or assessment of sports, and arguably, a perspective that ignored some aspect--be it gut reaction or regression analysis--would be substantially incomplete. All I know is that there is no dearth of subjective opinion available for your consumption, and all I can do is offer something a little different, and a little less arbitrary.

Second, and perhaps most important, he provides us with a coherent set of statistical tools with which to assess individual, team, and historical performance that work well together and build upon one another to give us a pretty strong grounding for putting a given outcome in perspective even if we haven’t watched the game.

None of this is to say that Sparks’ metrics are inherently "better" than those of others. Instead, I suggest that the coherence of all his work gives us a better snapshot of a game than most other work out there and generally complements observation well (i.e. some statistics fail to pass the “laugh test” when actually going back and looking at what they claim to describe).

Sparks might have squeezed more out of box score statistics than anyone out there and the result is a set of metrics that often do a very, very good job of constructing a detailed and nuanced account of what occurred.

At the same time, his metrics often give us insight into the value of things that occur on the court beyond scoring. That is not at all to diminish the value of scoring, but to say there are other things that a player does throughout the course of a game that are sometimes overlooked in favor of what happened in the "total points" column.

Hardwood Paroxysm » Blog Archive » The Arbitrarian: Marginal productivity of box score statistics
I will suggest that value is a function of productivity. In "counting stat" terms, basketball productivity can be seen as the accumulation of points, rebounds, steals, personal fouls, and so on, by a player or group of players. However, each of these possible production items is worth something different: a player who contributes 5 fouls in a game is certainly affecting the final score in a different way than a player who contributes 5 points in a game, ceteris paribus. Offensive and defensive rebounds might be differentially productive, as might be missed free throws and missed field goals. It should be fairly obvious to most observers of the game that merely "adding the good and subtracting the bad" is not an appropriate way to estimate productivity (See "Efficiency"), though it may be better than focusing heavily on scoring numbers alone.


At any rate, I plan to construct a productivity metric based on a linear-weighting system not too dissimilar from that of Berri and Hollinger, although it differs in the exact weights, and makes fewer "adjustments." Such linear systems are often criticized, but as I have outlined above, they are one of only a few options open to those with an interest in assessing the players of the past. Further, my value metric (as opposed to my productivity metric, if you’re still with me… there is a difference) incorporates more than just the linear-weighting system, as you will see next week. The key contribution I’m making today is to put forward what I believe to be highly significant, verisimilar linear regression results that help us find "true" weightings.

Sparks’ metrics sort of give us a sense of the broad landscape of a game that serves as a guide with which to describe the quality of individual and team performances. Yes, there’s nothing better than being there, but in the absence of being there – or even while being there – I’ve found that Sparks’ statistics give as accurate an account of what I saw as any.

For the current question, I have found his Model-Estimated Value metric to be surprisingly useful at both the individual and team level for capturing the quality of a given game.

What is MEV?

To summarize, the metric captures the value of a player’s performance based upon the total weighted value of each of their measurable actions during a game (points, missed shots, assists, rebounds, etc.). Similarly, the value of a team’s performance can be assessed by weighting the measurable actions of the team (total shots, total assists, total rebounds, etc.). (Please see the "Marginal productivity of box score statistics" link above for a detailed description of his "metric-determining methodology".)
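To make the "total weighted value" idea concrete, here is a minimal sketch of a linear-weights calculation. The function name, the stat names, the weights, and the box score line are all placeholders I made up for illustration; they are not Sparks' actual weights or formula, which are described in his post.

```python
from typing import Mapping


def linear_weighted_value(box_score: Mapping[str, float],
                          weights: Mapping[str, float]) -> float:
    """Sum each counted stat multiplied by its weight.

    Stats that have no entry in `weights` are ignored.
    """
    return sum(weights[stat] * count
               for stat, count in box_score.items()
               if stat in weights)


# PLACEHOLDER weights for illustration only; these are NOT Sparks' values.
placeholder_weights = {
    "points": 1.0,
    "rebounds": 0.5,
    "assists": 0.5,
    "missed_shots": -0.5,
}

# A made-up box score line, not taken from any game discussed here.
made_up_line = {"points": 10, "rebounds": 4, "assists": 2, "missed_shots": 6}

value = linear_weighted_value(made_up_line, placeholder_weights)
print(value)
```

The same function works for a team if you pass in team totals instead of one player's line. The real work in a metric like this is in choosing the weights, which is exactly what the linked post is about.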

There are two caveats to the way in which I'm applying this here. First, team MEV does not correlate well to winning (hence the use of a number of other individual-as-team-member metrics that do correlate with winning). However, for my purposes what it does do better than merely looking at the final score is describe the quality of both individual and team performances.

Second, MEV was not weighted for women's college basketball so the numbers are almost certainly imperfect.

Nevertheless, over the course of this college basketball season, I have been keeping track of individual and team MEVs and often times it has said something meaningful about the quality of games. To demonstrate that, I will take a look at five games through the lens of MEV.

This is a small set of cases that I use to illustrate the utility of team MEV as an indicator of the quality of team performance. Each game was decided by 12 points or less, but the MEV tells us a little bit more about what happened.

Just to reiterate Sparks' point, MEV is not measuring productivity or efficiency but the value of player contributions. As such, when I say that Team MEV is measuring "quality", it's quality as defined by the collective value of a team's performance as a unit.

Obviously, this is not yet perfect and I'll have to continue looking at this, but the following examples will demonstrate two things: both the range of performance quality as defined by MEV and the differential between teams.

University of Connecticut vs. Stanford University 80-68 (MEV: 80.63-41.39)

After UConn’s win over Stanford yesterday, Auriemma commented in a post-game interview with ESPN reporter Rebecca Lobo that their performance represented what it means to strive for perfection. Given that the game involved the consensus #1 team in the nation playing the consensus #2 team, I think it’s fair to use this game as a starting point to establish a standard for good performance.

By now, you’re probably familiar with the general storyline of the game – the first half was an amazingly competitive game and then UConn just turned up the intensity and Stanford just fell apart.

No. 2 Stanford falls to No. 1 Connecticut - San Jose Mercury News
"The last time we played them, we got blasted right at the beginning," said Stanford coach Tara VanDerveer, referring to last season's 83-64 loss in a Final Four semifinal. "I think for people's confidence, the first half was good. Then reality set in." UConn took over in the second half, going on a 34-10 run in the first 14:17. "They didn't change anything in their defense except the intensity," said Nneka Ogwumike, who led Stanford with 20 points, 16 in the first half.

However, that’s not an entirely accurate description.

As described by color commentator Doris Burke and Lobo during the game and Auriemma moments afterwards, what actually occurred is that UConn shifted their defensive strategy, actively seeking to take Stanford out of their game with 3/4 court pressure. Stanford’s guards – who Michelle Smith writes must improve – struggled with the pressure, sometimes letting 15-20 seconds tick off the shot clock before getting the team into their offense.

“Soft pressure, that's forcing them to use clock,” said Burke, just before the 13:40 mark in the second half when Stanford guard Jeannette Pohlen took a three pointer from the left wing that missed the rim entirely and bounced off the right side of the backboard. “So why are they doing this? Because Stanford is a disciplined patterned offensive team. They like their rhythm. So they're just trying to throw different things and make them think and disrupt what they're trying to do.”

So Stanford’s decline and UConn’s increase in performance are in fact connected: 6 of Stanford’s 7 second half turnovers were on UConn steals. Stanford did get the ball into WNBA prospect and center Jayne Appel more often in the second half – she went 3-3 in the first half, 2-9 in the second half – but the UConn defense smothered her any time she got the ball forcing her to take very difficult contested shots over double or triple teams (Appel’s two field goals in the second half came once the game was essentially over).

It wasn’t just Stanford suddenly playing worse – UConn was responsible for a large part of it.

But what does this have to do with MEV?

As a result of UConn’s defensive pressure, Stanford’s overall performance in the second half dropped dramatically:

1st half MEV: Stanford 31.61, UConn 26.46
2nd half MEV: Stanford 9.78, UConn 54.17

Does the 50 point swing in MEV tell us more than the 14 point swing in points or the knowledge that UConn went on a 32-8 run? That’s debatable. However, it does demonstrate that UConn had a truly dominant all-around second half, not just a hot shooting second half.
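For anyone who wants to check the arithmetic behind that "50 point swing", here is a quick sketch using only the half-by-half MEVs listed above. The variable and function names are my own.

```python
# Half-by-half team MEVs for UConn vs. Stanford, as listed above.
halves = {
    "1st": {"Stanford": 31.61, "UConn": 26.46},
    "2nd": {"Stanford": 9.78, "UConn": 54.17},
}


def uconn_edge(half: dict) -> float:
    """UConn's MEV minus Stanford's MEV for one half."""
    return round(half["UConn"] - half["Stanford"], 2)


first_edge = uconn_edge(halves["1st"])       # Stanford was ahead by MEV
second_edge = uconn_edge(halves["2nd"])      # UConn was far ahead by MEV
swing = round(second_edge - first_edge, 2)   # change in UConn's edge

# The two halves should add up to the full-game MEVs in the heading.
totals = {
    team: round(sum(half[team] for half in halves.values()), 2)
    for team in ("Stanford", "UConn")
}

print(first_edge, second_edge, swing)  # -5.15 44.39 49.54
print(totals)
```

So the swing is 49.54, which I have rounded to 50, and the halves sum to the full-game figures of 41.39 for Stanford and 80.63 for UConn.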

What MEV gives us is a general idea of just how well UConn played against a very good Stanford team. What it doesn’t tell us is exactly why Stanford struggled, what UConn did well, that they went on a 32-8 run, or that the game ended up a 12 point game only because their intensity decreased once the game was in hand.

MEV enables us to evaluate the overall performance of a team and then compare the performance -- not just the point total -- to performances in other games, if we were to look across their schedule. For now, I’m going to use this game to establish a standard by which to put other games in perspective.

Seattle University vs Concordia College 47-50 (MEV: 32.56-33.85)

Although the following examples are in chronological order, the SeattleU-Concordia game near the beginning of the pre-season serves as a good way to demonstrate the range of quality across Division I.

Consider this: Stanford’s MEV for the first half against a dominant UConn team was 31.61, about the same MEV SeattleU and Concordia put up for their entire game.

Aside from the fact that MEV shows that this game was not near the quality of UConn-Stanford, it also shows a competitive game between two teams at the bottom of the pecking order. While the result wasn’t at all what SeattleU wanted against an NAIA team, the MEV ratings are about what one might expect from a tight game that came down to a three point shot as the clock ran out that would have tied it. The Redhawks had a chance to win the game down to the buzzer and that shows in the MEV ratings.

The difference in the game was 6’4” Concordia center Ann Snodderly who really hurt the undersized Redhawks on both ends of the floor. However, neither team played well -- while SeattleU shot 22.4% for the game, Concordia had a turnover percentage of 36.82 and only shot 37%.

What MEV gives us in this case is a snapshot statistic that allows us to quickly compare the game to the quality of the UConn game without having to sift through a lot of numbers. Really it just reinforces common sense – UConn and Stanford are much better than SeattleU and Concordia. Great.

Seattle University vs. College of William & Mary 58-69 (MEV: 35.3-59.71)

This outcome is a little more interesting – despite an 11 point margin, the College of William & Mary was seemingly dominant by MEV standards.

The strong and aggressive play of guard Taysha Pye was simply too much for SeattleU to handle. The game didn’t even feel as close as the final margin: despite an almost 8 minute 13-2 run in the middle of the second half, SeattleU never had an answer for the physical play of Pye. In this case, the game was not as close as the score made it seem.

However, W&M coach Debbie Taylor might help us put MEV production in perspective:

“It wasn’t pretty but I thought for playing back to back games against a team who hadn’t played back to back games our effort was good,” said Taylor after the game.

In other words, I think it's safe to say that even a W&M team that is currently ranked 214 in RPI is playing ugly basketball with a MEV of 59.71. They essentially won on the strength of a very well played first half in which they had a MEV of 41.49.

We also see that MEV is not quite tied directly to scoring output – W&M scored 69 points on a MEV of 59.71 against SeattleU while Stanford scored 68 points on a MEV of 41.39. What it demonstrates is not that W&M is objectively better than Stanford, which would be an absurd claim. It does demonstrate that W&M had a better performance against objectively weaker competition on a particular day.

In addition, it starts to establish a standard for what an average performance is as demonstrated by the games below.

University of Washington vs Sacramento State 71-74 (MEV: 64.14-62.56)

In one of the bigger upsets of the young college basketball season -- in which UW had a chance to tie the game with two three point attempts as time ran out -- UW may actually have outplayed Sacramento State overall judging by MEV.

Despite holding a 55-54 second half lead with 12:00 left, Washington stopped moving the ball and settled for 14 threes, of which they made 3. Their field goal percentage fell from 48.5% to 30% and they found themselves down 10 with 3:27 left in the second half.

What these MEV ratings say is that UW played well enough to win the game but made mistakes that prevented that from occurring. In this case, it was a matter of getting caught up in Sacramento State’s high paced attack and settling for shots that did not play to their strengths. A bit more patience on those threes or a better response to Sacramento State’s press might have been the difference in this game as much as their overall performance.

It was an especially disappointing outcome for the Huskies because it was a winnable game against a team that had never won a Pac-10 game.

SeattleU vs Washington 53-58 (MEV: 43.36-57.12)

Here is yet another example of a game in which the winner had a much better MEV than the loser but only won by a narrow margin.
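To see how loosely the final margin tracks the MEV gap in the games covered so far, here is a small sketch that lines the two up from the winner's point of view. The scores and MEVs are the ones in the game headings above; the `Game` type and function name are my own.

```python
from typing import NamedTuple


class Game(NamedTuple):
    team_a: str
    pts_a: int
    mev_a: float
    team_b: str
    pts_b: int
    mev_b: float


# Scores and team MEVs as given in the game headings of this post.
games = [
    Game("UConn", 80, 80.63, "Stanford", 68, 41.39),
    Game("SeattleU", 47, 32.56, "Concordia", 50, 33.85),
    Game("SeattleU", 58, 35.3, "William & Mary", 69, 59.71),
    Game("Washington", 71, 64.14, "Sacramento State", 74, 62.56),
    Game("SeattleU", 53, 43.36, "Washington", 58, 57.12),
]


def winner_edges(game: Game) -> tuple:
    """Return (point margin, MEV differential), both from the winner's side."""
    margin = game.pts_a - game.pts_b
    mev_diff = game.mev_a - game.mev_b
    if margin < 0:  # team_b won, so flip both signs
        margin, mev_diff = -margin, -mev_diff
    return margin, round(mev_diff, 2)


for game in games:
    margin, mev_diff = winner_edges(game)
    print(f"{game.team_a} vs {game.team_b}: won by {margin}, MEV edge {mev_diff:+.2f}")
```

A negative MEV edge means the winner was outplayed by MEV, which is what happened in the Sacramento State game. The W&M and Washington wins over SeattleU show the opposite pattern: a large MEV edge that the final margin understates.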

In this game, UW was ultimately the bigger more talented team, but SeattleU stayed in the game with a hard-fought and efficient effort. The problem was that they stopped moving the ball as much in the second half and UW guard Kristi Kingma got hot down the stretch to put the game away.

Once again though, UW had a game in which they played well enough to beat their opposition handily and didn’t. Again, the game came down to mental errors more so than a bad performance – they struggled to get the ball into center Regina Rogers and were frustrated by SeattleU’s defensive schemes. Again, they weren’t playing poorly, but they didn’t play to their strengths and it resulted in what could have been another huge upset loss against a team they’re “supposed to beat.”

SeattleU vs St. Mary’s 56-63 (MEV: 47.06-57.31)

The most interesting thing about this outcome as evidenced by the MEV is the fact that despite losses, SeattleU is playing increasingly better basketball overall. And as coach Joan Bonvicini said after the game, they’re starting to play more consistent basketball – better basketball against tougher competition. Even in scoring fewer points than they did against W&M, they are coming out with higher MEVs against arguably better teams in Washington and St. Mary’s.

"I’m seeing us improve, I’m seeing us get better," said Bonvicini after the game. "We’re playing good teams."

This also reinforces the fact that there is little correlation between MEV and points, despite the fact that points are included in the MEV formula. What the MEV does is tell us something about the quality of how a team played.

As such, the value of MEV is probably best described by SeattleU’s progression of games – it demonstrates comparative shifts in performance independent of competition or scoring output. From watching UW games, I would also argue that their inability to win games convincingly despite MEV differentials is indicative of mental lapses, something that is not immediately evident from the numbers. And of course, MEV gave us a very clear sense of how much better UConn was than Stanford yesterday.
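SeattleU's progression can be laid out as a short sketch using the scores and MEVs from the four SeattleU games above, in the order they appear in this post. The variable names are my own, and four games is far too few to treat this as anything more than an illustration.

```python
# (opponent, SeattleU points, SeattleU team MEV), in the order covered above.
seattle_u = [
    ("Concordia", 47, 32.56),
    ("William & Mary", 58, 35.3),
    ("Washington", 53, 43.36),
    ("St. Mary's", 56, 47.06),
]

# Compare each game with the one before it.
pairs = list(zip(seattle_u, seattle_u[1:]))
mev_rising = all(later[2] > earlier[2] for earlier, later in pairs)
points_rising = all(later[1] > earlier[1] for earlier, later in pairs)

print("MEV rose every game:", mev_rising)        # True
print("Points rose every game:", points_rising)  # False
```

Team MEV climbs in every game even though scoring dips after the W&M game, which is the sense in which MEV says something about the quality of play that points alone do not.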

Just as a preliminary framework looking at games I’ve seen, MEVs under 35 indicate a bad game, MEVs in the 60-70 range indicate an average game, and MEVs in the 80 range – two extremely good halves in the 40 range – are outstanding.
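As a rough sketch, those bands could be written as a small lookup. The function name and labels are my own, and two things are assumptions on my part: I treat anything at 80 or above as "outstanding", and I leave values between the bands unlabeled, since the framework above does not say anything about them.

```python
def describe_team_mev(mev: float) -> str:
    """Label a full-game team MEV using the preliminary bands above.

    Assumptions: 80 and above counts as "the 80 range", and values that
    fall between the stated bands are left unlabeled.
    """
    if mev < 35:
        return "bad"
    if 60 <= mev <= 70:
        return "average"
    if mev >= 80:
        return "outstanding"
    return "between bands"


# Full-game team MEVs taken from the game headings in this post.
for team, mev in [("UConn", 80.63), ("Washington", 64.14),
                  ("Stanford", 41.39), ("SeattleU vs Concordia", 32.56)]:
    print(team, mev, describe_team_mev(mev))
```

Stanford's 41.39 lands between bands, which is a reminder that this framework is built from a handful of games and leaves large gaps.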

When trying to figure out whether a team is getting better, worse, or plateauing, MEV can be useful as a metric with which to assess a team’s performance.

However, what we still don’t know is each team’s style of play, which might also have influenced scoring output and might help better describe the character of the game.

Next post: Style of play – synergy, rhythm, and continuity.

Previous: Illuminating the Black Box: Using Statistics to Understand Women's College Basketball