Connecticut vs. Stanford - Again and Again and Again....

Last year's Connecticut Huskies team might have been one of the most dominant squads ever, going undefeated and sweeping the 2010 NCAA Women's Basketball Tournament.  They concluded their title run with a 39-0 record, beating Stanford 53-47 in the National Championship.

However, the NCAA Tournament has a very different format than, say, the march to the WNBA Finals.  The NCAA tournament is a single-elimination tournament - one bad day, and your shot at a national championship is lost permanently.  You have to go 6-0 in the tournament to call yourself champion.  In the WNBA Playoffs, by contrast, the current format is 3-5-5.  A team could win eight games and lose five and still be crowned champion at the end of the year.
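The eight-and-five arithmetic can be checked with a quick sketch.  It assumes that "3-5-5" means one best-of-three round followed by two best-of-five rounds, and it treats each of the six NCAA rounds as a best-of-one series; the function name is made up for illustration.

```python
def worst_case_title_record(series_lengths):
    """Return (wins, losses) for a champion who loses as often as allowed.

    Each entry is the maximum number of games in a best-of-N series.
    A single-elimination round is treated as a best-of-1 series.
    """
    wins = sum(n // 2 + 1 for n in series_lengths)  # games needed to clinch each round
    losses = sum(n // 2 for n in series_lengths)    # most losses a series winner can absorb
    return wins, losses

wnba = worst_case_title_record([3, 5, 5])  # assumed reading of "3-5-5"
ncaa = worst_case_title_record([1] * 6)    # six single-elimination rounds
print("WNBA:", wnba)
print("NCAA:", ncaa)
```

Under those assumptions the WNBA champion can finish 8-5, while the NCAA champion has no room for a loss at all.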

The Huskies were routinely superior to their competition.  They blew out Southern by 56 points, Temple by 54, Iowa State by 38, Florida State by 40 and Baylor by 20.  However...was Connecticut really better than Stanford?  At Storrs, Connecticut they beat the Cardinal 80-68 in front of a home crowd...but home court advantage has to count for something.  In crunch time, they only beat Stanford by six.  Is there a possibility that Connecticut could have had a bad day on April 6, 2010?  That Stanford could have ended the Huskies' perfect season?  And if so, how great was that possibility?


I've recently read a paper by Ryan Bach where he discusses the modeling of basketball games by estimating the probability of points per possession.  In short, given a possession, what is the likelihood that a team will score no points?  What is the likelihood that it will score just one point?  Two points?  Three points?  From there, one could build a more complex probability formula based on home court advantage and the offensive and defensive ratings of the teams involved.

Unfortunately, I don't have the skill to do multivariate statistical analysis (not yet, anyway).  I also don't have the software package that can take 2000+ men's basketball box scores and grind out the coefficients using that analysis, nor do I really understand how to use the results.  What I can do is take Bach's results and, ignoring the gender difference, apply the values to women's basketball.  In short, given a possession by either Connecticut or Stanford, I could calculate (by random simulation) how many points would be scored in that possession.  (If you're really interested, I'll reprint the formulas in the reply thread.)

The next step would be to calculate the number of possessions by both teams in a game.  Dean Oliver and company have a formula that estimates possessions per game, which helps because counting possessions directly from 77 games' worth of play-by-play data for both Connecticut and Stanford would be monstrously hard.  Once again, there's a data issue - Oliver's formula is based on the NBA, and this is women's college basketball.  We'll blink our eyes, accept the problem and move forward.  From the cumulative season stats, we can estimate how many possessions either the Huskies or the Cardinal have in a game.
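The post doesn't reproduce the possession formula, so here is a minimal sketch of one commonly quoted simplified form of the estimate:  field goal attempts, minus offensive rebounds, plus turnovers, plus a fraction of free throw attempts.  The 0.44 free throw weight is an assumption (other variants use slightly different weights and extra rebounding terms), and the season totals below are made-up round numbers, not Connecticut's or Stanford's actual stats.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.44):
    """Rough possession count from box-score totals.

    ft_weight approximates the share of free throw attempts that end a
    possession; 0.44 is a commonly quoted value and an assumption here.
    """
    return fga - orb + tov + ft_weight * fta

# Hypothetical season totals, for illustration only (not real team data).
games = 39
season_possessions = estimate_possessions(fga=2400, orb=500, tov=550, fta=600)
print(round(season_possessions / games, 2))  # estimated possessions per game
```

Dividing a season-long estimate by games played is how a per-game figure like the ones used later would fall out of cumulative stats.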

Home court advantage?  Well, this is the NCAA tournament so theoretically, there isn't one.  Hooray!

Offensive and defensive rating?  Bach's paper requires a ton of finagling to get those, so I'm going to use the APBA method.  APBA is a card-based baseball game where the batter/pitcher dichotomy is handled by random chance - 50 percent of the time the results are read off the batter's card and 50 percent of the time off the pitcher's card.  We'll do the same.  If we can calculate how likely it is for a team to score 0, 1, 2 or 3 points on a possession, we can look at the cumulative opponent data and determine the same probabilities for what that team's defense allows.

We'll start with Connecticut.  Using the formulas provided by Bach, here's how a typical Huskies possession ends:

45.8 percent - no points scored
4.4 percent - one point scored
40.2 percent - two points scored
9.6 percent - three points scored

However, we're going to assume that half the time, the Stanford defense will dictate how the possession turns out, and not the Connecticut offense.  Here's how a typical Stanford opponent did with one of its possessions facing the Cardinal defense:

62.3 percent - no points scored
3.7 percent - one point scored
27.3 percent - two points scored
6.7 percent - three points scored

For every Connecticut possession, fifty percent of the time we'll use the first set of probabilities:  the Connecticut offense determined the results of the possession.  For the other fifty percent of random chance, we'll use the "Typical Stanford Opponent" offense - this is where the Stanford defense determined the final result.

For Stanford, we'll do the reverse:  half the time the Stanford offense determines the outcome, the other half of the time we use a table corresponding to the typical Connecticut opponent.
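Because each table is used half the time, a Connecticut possession is effectively drawn from the average of the two tables printed above.  The sketch below works that out and the points per possession it implies.  Only the Connecticut side is shown, since those are the only two tables listed here; the function names are my own.

```python
# Per-possession outcome probabilities from the two tables above,
# indexed by points scored (0, 1, 2, 3).
UCONN_OFFENSE = [0.458, 0.044, 0.402, 0.096]
STANFORD_OPPONENTS = [0.623, 0.037, 0.273, 0.067]

def blend(offense, defense, offense_weight=0.5):
    """Mix two outcome tables; offense_weight is the chance the offense's table is used."""
    return [offense_weight * o + (1 - offense_weight) * d
            for o, d in zip(offense, defense)]

def expected_points(table):
    """Average points per possession implied by an outcome table."""
    return sum(points * p for points, p in enumerate(table))

uconn_vs_stanford = blend(UCONN_OFFENSE, STANFORD_OPPONENTS)
per_possession = expected_points(uconn_vs_stanford)
print([round(p, 4) for p in uconn_vs_stanford])
print(round(per_possession, 3))
```

The blended table comes out to roughly 54 percent empty possessions, and about 0.96 points per Connecticut possession against the Cardinal defense.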

The next step is to give each team a roughly equal number of possessions and let them have at it.  Connecticut averages 69.06 possessions per game; Stanford averages 67.37.  Taking the midpoint, we'll give each team 68 possessions.  Each possession will be assigned a final result of 0, 1, 2 or 3 points based on the offense/defense choice and the associated probability tables.  We'll sum up the points for each team, compare the sums and that will be our final score.
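Here is a minimal sketch of that procedure for one team's side of the scoreboard.  It is not the exact program I ran; it just follows the steps described, using Python's standard random module.  It simulates Connecticut's point total only, because Stanford's offense table and the typical-Connecticut-opponent table aren't reproduced in this post.  The seed and function name are arbitrary choices for the example.

```python
import random

# Outcome tables from the post, indexed by points scored (0, 1, 2, 3).
UCONN_OFFENSE = [0.458, 0.044, 0.402, 0.096]
STANFORD_OPPONENTS = [0.623, 0.037, 0.273, 0.067]
POINTS = [0, 1, 2, 3]

def simulate_team_points(offense, opp_defense, possessions=68, rng=random):
    """Total points for one team in one simulated game."""
    total = 0
    for _ in range(possessions):
        # Coin flip: read the result off the offense's table
        # or off the opposing defense's table.
        table = offense if rng.random() < 0.5 else opp_defense
        total += rng.choices(POINTS, weights=table, k=1)[0]
    return total

rng = random.Random(2010)  # arbitrary seed so the run is repeatable
uconn_scores = [simulate_team_points(UCONN_OFFENSE, STANFORD_OPPONENTS, rng=rng)
                for _ in range(25)]
print(uconn_scores)
```

Stanford's total would come from the same function with its own pair of tables, and comparing the two totals gives the final score of one simulated game.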

I performed this simulation 25 times.  Here were the results:

Connecticut    61    Stanford    49
Connecticut    74    Stanford    58
Connecticut    63    Stanford    61
Connecticut    74    Stanford    73
Connecticut    59    Stanford    58
Connecticut    71    Stanford    65
Connecticut    67    Stanford    68
Connecticut    71    Stanford    72
Connecticut    71    Stanford    77
Connecticut    58    Stanford    66
Connecticut    61    Stanford    55
Connecticut    76    Stanford    66
Connecticut    71    Stanford    67
Connecticut    54    Stanford    56
Connecticut    62    Stanford    52
Connecticut    67    Stanford    67
Connecticut    68    Stanford    59
Connecticut    55    Stanford    51
Connecticut    68    Stanford    60
Connecticut    56    Stanford    52
Connecticut    62    Stanford    50
Connecticut    64    Stanford    58
Connecticut    55    Stanford    51
Connecticut    68    Stanford    56
Connecticut    70    Stanford    62

Record of Connecticut Huskies:  19 1/2 - 5 1/2 (78 percent winning percentage)
Biggest Huskies win:  16 points (Connecticut 74, Stanford 58)
Biggest Cardinal win:  8 points (Stanford 66, Connecticut 58)
Average game result:  Connecticut by 4 1/2
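For anyone who wants to check the summary, here is a short sketch that recomputes it from the 25 scores listed above.  It assumes the tied game counts as half a win and half a loss, which is how the 19 1/2 - 5 1/2 record reads.

```python
# (Connecticut, Stanford) scores from the 25 simulated games listed above.
results = [
    (61, 49), (74, 58), (63, 61), (74, 73), (59, 58),
    (71, 65), (67, 68), (71, 72), (71, 77), (58, 66),
    (61, 55), (76, 66), (71, 67), (54, 56), (62, 52),
    (67, 67), (68, 59), (55, 51), (68, 60), (56, 52),
    (62, 50), (64, 58), (55, 51), (68, 56), (70, 62),
]

margins = [uconn - stanford for uconn, stanford in results]
wins = sum(m > 0 for m in margins)
losses = sum(m < 0 for m in margins)
ties = sum(m == 0 for m in margins)

# Assumption: a tie is worth half a win and half a loss.
uconn_record = wins + 0.5 * ties
win_pct = uconn_record / len(results)
avg_margin = sum(margins) / len(margins)

print(f"Wins {wins}, losses {losses}, ties {ties} ({win_pct:.0%})")
print(f"Biggest Huskies win: {max(margins)}, biggest Cardinal win: {-min(margins)}")
print(f"Average margin: {avg_margin:.2f}")
```

That gives 19 wins, 5 losses and 1 tie, with an average margin of 4.68 points, in line with the rounded figures above.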

In short, replaying this game over and over gives the Huskies the win nearly four out of five times.  Granted, the model has a lot of flaws.  Some of the assumptions are based on men's college basketball.  Some are based on men's pro basketball.  How quality of schedule affects the results isn't well understood.  But at least the results look realistic - no one is scoring 100 points in a game or winning by forty.

If Connecticut and Stanford are theoretically the two best teams in women's basketball, winning 78 percent of your predicted match-ups on a neutral court is the very definition of dominance.  Whatever happened on April 6, 2010 was definitely not a fluke.