Quantifying Comebacks and Excitement With Win Probability

Italian translation at settesei.it

As promised the other day, there’s a lot we can do with point-by-point and win probability stats for over 600 grand slam matches.

I’ve beefed up those pages a bit by borrowing some ideas from Brian Burke at Advanced NFL Stats.  He invented a couple of simple metrics using win probability stats to compare degrees of comebacks and the excitement level of (American) football games.

The concepts transfer to tennis quite nicely.  Comeback Factor identifies the odds against the winner at his lowest point.  I’ve defined it the same way Burke does for football: CF is the inverse of the winning player’s lowest win probability.  In the US Open Federer/Djokovic semifinal, Djokovic’s win probability was as low as 1.3%, or 0.013.  Thus, his comeback factor is 1/0.013, or about 77.  That’s about as high a comeback factor as you’ll ever see.

On the other end, comeback factor cannot go below 2.0 — that’s the factor if the winning player’s WP never fell below 50%.  Matches in which the winner dominated are often very close to 2.0, as in the Murray/Nadal semifinal.  In that match, Nadal’s low point was facing a single break point at 2-3 in the first set; the comeback factor is 2.3.

A good way to think about comeback factor is this: “At his lowest point, the winning player faced odds of 1 in [CF].”
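As a minimal sketch of that definition, the calculation is just the reciprocal of the lowest point on the winner's win probability curve. The function name and the sample series here are hypothetical, not taken from a real match log.

```python
def comeback_factor(winner_wp):
    """Return 1 / (the eventual winner's lowest win probability).

    winner_wp: the winner's win probability (between 0 and 1) after each
    point, including the 0.5 starting value.
    """
    lowest = min(winner_wp)
    if lowest <= 0:
        raise ValueError("win probabilities must be greater than zero")
    return 1.0 / lowest


# Hypothetical series: the winner bottoms out at a 5% chance.
sample = [0.5, 0.41, 0.22, 0.05, 0.30, 0.75, 1.0]
print(comeback_factor(sample))  # 20.0

# A winner who never trails stays at or above 0.5, so CF bottoms out at 2.
print(comeback_factor([0.5, 0.6, 0.8, 1.0]))  # 2.0
```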

Excitement Index (EI) summarizes how much was at stake, on average, on each point of a match.  “Volatility” measures the importance of each individual point; EI is the average volatility over the course of a match.

(Burke sums the volatilities, reasoning that in football, a fast-paced game with many plays is itself exciting.  Since there is no clock in tennis [not exactly, anyway], it seems appropriate to average the volatilities.  Win probability already considers the excitement and importance of a deciding final set.)

At the moment, I’m calculating EI by multiplying the average volatility by 1000.  The Murray/Nadal match is 35 (not very exciting, though Murray fought back), the Djokovic/Federer match is 47 (more on that in a minute), while the 2nd rounder between Donald Young and Stanislas Wawrinka is 64.  I haven’t looked at all the matches yet, but EI should generally fall between 10 and 100, possibly exceeding 100 in rare instances like the Isner/Mahut marathon.
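A rough sketch of that calculation, assuming the per-point volatilities are already available as fractions (0.10 for 10 percent). The function name and the sample values are made up for illustration.

```python
def excitement_index(volatilities, scale=1000):
    """Average the per-point volatilities and multiply by `scale`.

    volatilities: one value per point, each a fraction between 0 and 1.
    """
    if not volatilities:
        raise ValueError("need at least one point")
    return scale * sum(volatilities) / len(volatilities)


# Hypothetical volatilities for four points (2%, 5%, 10%, 3%).
sample = [0.02, 0.05, 0.10, 0.03]
print(round(excitement_index(sample), 1))  # 50.0
```

Replacing the average with a plain sum would give the football-style version described above, where a longer match scores higher simply for having more points.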

It seems like Djok/Fed should be higher, perhaps because we remember the excitement of the final set.  (And it may be that the final set should be weighted accordingly.)  But looking at the match log, there were an awful lot of quick games, which translate to relatively low volatility.  By contrast, Donald/Stan was more topsy-turvy throughout, as the players traded sets, then sent volatility through the roof with a pair of breaks midway through the final set.

Both EI’s scaling and its exact definition are works in progress.  When I get a chance, I’ll do a survey of matches for which I have point-by-point data to further investigate both of these new (to tennis) metrics.

Win Probability Graphs and Stats

Win probability graphs and stats are now available for over 600 grand slam matches from 2011.  Thanks to IBM Pointstream from this year’s slams, there is a wealth of data available like never before.

Here’s the main menu.

Here’s a sample match: The US Open semifinal between Federer and Djokovic.

When I first started publishing tennis research, win probability was one of my focuses.  You can find earlier work here, which links to specific tables for games, sets, and tiebreaks.  I’ve also published much of the relevant code, which is written in Python.

Win probability represents the odds of each player winning after every point of the match, based on the score up to that point and which player is serving. It makes no assumptions about the specific skill levels of each player, but does assume that the server has an advantage, which varies based on surface and gender.  With every point, each player’s win probability goes up or down, and the degree to which it rises or falls is dependent on the importance of the point–at 4-1, 40-0, winning the point is nice, but losing the point just delays the inevitable; at 5-6 in a tiebreak, the potential change in win probability is huge.

To quantify that in the graphs, I show another metric: Volatility, which measures the importance of each point. It is equal to the difference in win probabilities between the server winning and losing the following point. 10 percent is exciting, 20 percent is crucial, and 30 percent is edge-of-your-seat stuff.
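To make the definition concrete, here is a simplified sketch that works at the level of a single game rather than a whole match. That is a deliberate simplification: the numbers in the graphs are match-level, so the values below are much larger than the 10 to 30 percent range just described. The function names are hypothetical, and the only inputs are the point score and the server's chance of winning a point.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def game_win_prob(server_pts, returner_pts, p):
    """Probability that the server wins the game from this point score.

    Points are counted 0, 1, 2, 3, ... (3 means 40); p is the server's
    chance of winning any single point, assumed constant.
    """
    if server_pts >= 4 and server_pts - returner_pts >= 2:
        return 1.0
    if returner_pts >= 4 and returner_pts - server_pts >= 2:
        return 0.0
    if server_pts >= 3 and server_pts == returner_pts:
        # Deuce: closed form, so the recursion always terminates.
        return p * p / (p * p + (1 - p) * (1 - p))
    return (p * game_win_prob(server_pts + 1, returner_pts, p)
            + (1 - p) * game_win_prob(server_pts, returner_pts + 1, p))


def game_volatility(server_pts, returner_pts, p):
    """Difference in game win probability between the server winning
    and losing the next point."""
    return (game_win_prob(server_pts + 1, returner_pts, p)
            - game_win_prob(server_pts, returner_pts + 1, p))


p = 0.64  # the men's US Open figure used in this post
print(game_volatility(3, 0, p))  # 40-0: little is at stake
print(game_volatility(2, 3, p))  # 30-40: break point, a lot is at stake
```

A match-level version would apply the same subtraction to match win probabilities, which also depend on the game and set score.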

Assumptions

To produce these numbers, I needed to make several simplifying assumptions.  Some are more important than others; here are the big two:

  • The players are equal.
  • Each player’s ability does not vary from point to point.

The first of these is almost always false, and the second is probably false as well.  The first, however, makes things more interesting.  In most matches Novak Djokovic plays these days, he goes in with an 80-percent-or-better chance of winning.  If we graphed one of his matches starting at 85 percent, we’d usually get a very slowly ascending line.  Instead, by starting at 50 percent, we can see where he and his opponent had their biggest openings, and who took advantage.

(In this long-ago post, I showed a sample graph with an assumption similar to the 85 percent for Djokovic, and you can see some of what I mean.)

Assuming that the players are equal also sidesteps the messy question of how to quantify each player’s skill level on that day, on that surface, against that opponent.

The second big assumption ignores possible real-world attributes like clutch performance and streakiness, along with more pedestrian considerations like some players’ stronger serving in the deuce or ad court.

Another long-ago article of mine suggests that servers are not absolutely consistent, possibly because of natural rises and falls in performance, also possibly because of risk-taking (or lack of concentration) in low-pressure situations.  One of the most interesting directions for research with these stats is into this inconsistency: We need to figure out whether some players are more consistent than others, whether “clutch” exists in tennis, and much more.

One more set of assumptions regards the server’s advantage.  Since these graphs only encompass the four grand slams, I set the server’s win percentage for each tournament.  The numbers I used for men are: 63% in Australia, 61% at the French, 66% at Wimbledon, and 64% at the U.S. Open.  I used percentages two points lower for women at each event.
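For a sense of what those percentages imply, the sketch below stores them in a lookup table and converts a point-level serve percentage into the chance of holding serve. The names are hypothetical, and the hold formula is the standard closed form for a game when every point is won independently with the same probability, which matches the constant-ability assumption above.

```python
# Server's point-win percentage for men at each slam, as listed above.
MEN_SERVE_WIN = {
    "Australian Open": 0.63,
    "French Open": 0.61,
    "Wimbledon": 0.66,
    "US Open": 0.64,
}


def serve_win_rate(tournament, women=False):
    """Server's point-win rate; women are two percentage points lower."""
    rate = MEN_SERVE_WIN[tournament]
    return round(rate - 0.02, 2) if women else rate


def hold_probability(p):
    """Chance of winning a service game when each point is won with
    probability p, independently of every other point."""
    q = 1 - p
    # Win to love, to 15, or to 30, plus reaching deuce and winning from it.
    before_deuce = p ** 4 * (1 + 4 * q + 10 * q ** 2)
    via_deuce = 20 * p ** 3 * q ** 3 * (p * p / (p * p + q * q))
    return before_deuce + via_deuce


for name in MEN_SERVE_WIN:
    print(name, round(hold_probability(serve_win_rate(name)), 3))
```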

More on Win Probability

There’s very little out there on win probability and volatility in tennis.  I wasn’t the first person to work out the probability of winning a game, a set, or a match from a given score, but as far as I know, I’m the only person publishing graphs like this.  Much of the problem is the limited availability of play-by-play descriptions for professional tennis.

That problem doesn’t apply to baseball, where win probability has thrived for years.  Here’s a good intro to win probability stats in baseball, and fangraphs.com is known for its single-game graphs–for instance, here’s tonight’s Brewers game.  In many ways, win probability is more interesting in baseball than in tennis.  In tennis, there are only two possible outcomes of each point, while in baseball, there are several possible outcomes of each at-bat.

Enjoy the graphs and stats!

The Speed of Every Surface

Italian translation at settesei.it

Last week, I wrote an article for the Wall Street Journal noting the relatively slow speed of this year’s U.S. Open.  It’s not clear whether the surface itself is the cause, or whether the main factor is the humidity from Hurricane Irene and Tropical Storm Lee.  For whatever reason, aces were lower than usual, creating an environment more favorable to, say, Novak Djokovic than someone like Andy Roddick.

The limited space in the Journal prevented me from going into much detail about the methodology or showing results from tournaments other than the slams.  There’s no word limit here at Heavy Topspin, so here goes…

Aces and Server’s Winning Percentage

Surface speed is tricky to measure–as I’ve already mentioned, “surface speed” is really a jumble of many factors, including the court surface, but also heavily influenced by the atmosphere and altitude.  (And, possibly, different types of balls.)  If you were able to physically move the clay courts in Madrid to the venue of the Rome Masters, you would get different results.  But teasing out the different environmental influences is little more than semantics–we’re interested in how the ball bounces off the court, and how that affects the style of play.

So then, what stats best reflect surface speed?  Rally length would be useful, as would winner counts–shorter rallies and more winners would imply a faster court.  But we don’t have those for more than a few tournaments.  Instead, I stuck with the basics: aces, and the percentage of points won by the server.

Any analysis of this sort needs to control for the players at each tournament.  The players who show up for a lower-rung clay tournament are more likely to be clay specialists, and the men who get through qualifying are more likely to be comfortable on clay.  Also, the players who reach the later rounds are more likely to be better on the tournament’s surface.  Thus, the number of aces at, say, the French Open is partially influenced by surface, and partially influenced by who plays, and how much each player plays.

Thus, instead of looking at raw numbers (e.g. 5% of points at Monte Carlo were aces), I took each server in each match, and compared his ace rate to his season-long ace rate.  Then I aggregated those comparisons for all matches in the tournament.  This allows us to measure each tournament’s ace rate against a neutral, average-speed surface.
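One way to sketch that adjustment is shown below. The aggregation step is an assumption: this version weights each server by the number of service points he played, dividing the aces actually hit by the aces expected at season-long rates. The function name, input layout, and sample numbers are all hypothetical.

```python
def relative_ace_rate(server_matches, season_ace_rate):
    """Tournament ace rate relative to a neutral surface (1.0 = average).

    server_matches: (server, service_points, aces) tuples, one per server
        per match at the tournament.
    season_ace_rate: dict mapping server -> season-long aces per service point.
    """
    actual = 0
    expected = 0.0
    for server, service_points, aces in server_matches:
        actual += aces
        # Aces this server would hit on these points at his season-long rate.
        expected += service_points * season_ace_rate[server]
    if expected == 0:
        raise ValueError("no expected aces; check the inputs")
    return actual / expected


# Hypothetical data: two servers, each hitting fewer aces than usual.
season = {"Player A": 0.10, "Player B": 0.10}
matches = [("Player A", 100, 5), ("Player B", 100, 10)]
print(f"{relative_ace_rate(matches, season):.1%}")  # 75.0%
```

Swapping aces for service points won, and season ace rate for season serve-points-won rate, gives the same kind of player-adjusted figure for the server's winning percentage.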

The Path to Blandness

The ace rate numbers varied widely.  While the Australian Open and this year’s US Open were close to a hypothetical neutral surface speed, other tourneys feature barely half the average number of aces, and still others have nearly half-again the number of aces of a neutral surface.   I’ve included a long list of tournaments and their ace rates below; you won’t be surprised to see the indoor and grass tournaments on the high end and clay events at the other extreme.

But there’s a surprise waiting.  I also calculated the percentage of points won by the server, and like ace rate, I controlled for the mix of players in every event.  While ace rate varies from 53% of average to 145% of average, the percentage of points won by the server never falls below 90% of average, rarely drops below 95%, and never exceeds 105%.  53 of the 67 tournaments listed below fall between 97% and 103%–suggesting that surface influences the outcome of only a handful of points per match.

That may defy intuition, but think back to the mix of players at each tournament.  Big-serving Americans don’t show up at Monte Carlo, while South Americans generally skip every non-mandatory event in North America.  The nominal rate at which servers win points varies quite a bit, but that’s because of the players in the mix.

Also, this finding suggests that, as a stat, aces are overrated.  They may be a useful proxy for server dominance–if a player hits 15 aces in a match, he’s probably a pretty good server–but they come nowhere near telling the whole story.  Aces on grass turn into service winners on hard courts, and then become weak returns and third-shot winners on clay.  The end result is usually the same, but Milos Raonic is a lot scarier when the serves bounce over your head.

Finally, it would be a mistake to say that a variance of 3-5% in serve points won is meaningless.  It may be less than expected, but especially between good servers, 3-5% can be the difference.  Move Saturday’s Federer/Djokovic semifinal to a surface like Wimbledon’s, and we’d be looking at a different champion.

All the Numbers

Here is the breakdown of ace rate and serve points won, compared to season average, for nearly every current ATP event.

Since I am using each season’s average, you may wonder whether the averages themselves have changed from year to year.  I’ve read that courts are getting slower, but in the five-year span I’ve studied here, the ace rate has actually crept up a tiny bit.  Each tournament varies quite a bit–probably due to weather–but generally ends up at the same numbers.

Below, find the 2011 ace rate and percentage of serve points won, as well as the average back to 2007.   Again, these are controlled for the mix of players (including how much each guy played), and the numbers are all relative to season average.

The little letter next to the tournament name is surface: c = clay, h = hard, g = grass, and i = indoor.

Tournament          2011Ace  2011Sv%    AvgAce  AvgSv%  
Estoril          c    57.5%    96.6%     53.3%   94.3%  
Monte Carlo      c    52.0%    92.1%     53.9%   91.2%  
Umag             c    58.6%    95.2%     58.7%   94.3%  
Serbia           c    54.2%    93.5%     61.0%   94.8%  
Rome             c    62.5%    95.9%     62.9%   94.4%  
Buenos Aires     c    61.9%    99.0%     62.9%   98.6%  
Houston          c    64.9%    97.2%     66.6%   96.8%  
Valencia         i                       68.0%   96.4%  
Barcelona        c    55.7%    94.3%     68.0%   96.2%  
Dusseldorf       c    45.7%    96.5%     72.8%   97.2%  

Hamburg          c    78.0%    96.6%     74.3%   96.4%  
Bastad           c    63.8%    94.5%     76.8%   97.7%  
Roland Garros    c    78.0%    98.4%     77.1%   97.5%  
Santiago         c    84.5%    98.5%     81.5%   99.4%  
Costa do Sauipe  c    83.4%   101.7%     84.2%   98.9%  
Nice             c    88.5%    97.4%     84.3%   98.1%  
Casablanca       c    79.1%    99.0%     84.9%   98.2%  
Acapulco         c    70.9%    95.6%     86.0%   98.7%  
Madrid           c    77.0%    98.5%     86.1%   98.0%  
Munich           c    87.9%   100.1%     86.5%  100.0%  

Beijing          h                       86.7%   97.3%  
Los Angeles      h    84.7%    97.2%     87.7%   97.3%  
Kitzbuhel        c    95.8%    97.9%     89.0%   98.6%  
Toronto          h                       89.6%   98.3%  
Chennai          h    82.3%    98.0%     89.6%   98.7%  
Stuttgart        c    77.0%    95.8%     89.7%   98.1%  
Indian Wells     h    88.9%    99.0%     90.9%   98.0%  
Doha             h   125.5%   101.9%     91.2%   97.6%  
Auckland         h   103.1%   102.0%     93.9%   98.7%  
Miami            h    94.5%    97.9%     94.4%   98.0%  

Shanghai         h                       94.6%   98.1%  
Australian Open  h    97.6%    97.3%     96.5%   96.9%  
Kuala Lumpur     h                       97.1%   97.3%  
Sydney           h   105.8%   100.0%     97.4%   99.1%  
St. Petersburg   i                       97.8%  101.7%  
Montreal         h    91.3%    98.4%     98.1%   98.2%  
Delray Beach     h   106.2%    99.9%     99.1%   98.6%  
Gstaad           c   104.5%   100.1%    101.2%  101.4%  
Dubai            h   102.7%    96.5%    103.2%   98.2%  
US Open          h   101.3%    97.4%    104.0%   98.7%  

Vienna           i                      105.8%  101.4%  
Johannesburg     h   110.0%   102.7%    106.0%  101.0%  
Washington DC    h    97.5%   100.1%    106.8%   99.8%  
Newport          g    93.3%    99.0%    107.5%  101.7%  
Winston-Salem    h   108.1%    99.6%    108.1%   99.6%  
Atlanta          h   110.0%   100.9%    108.4%   99.0%  
Bangkok          h                      110.5%  101.6%  
Cincinnati       h    96.2%    98.9%    111.7%  100.5%  
Zagreb           i   107.0%    99.2%    112.3%  102.3%  
Moscow           i                      113.0%  101.3%  

Brisbane         h   130.6%   100.3%    113.4%  100.0%  
Eastbourne       g   111.2%   101.8%    114.1%  102.9%  
Paris Indoors    i                      115.4%   99.6%  
Rotterdam        i   123.8%   103.7%    115.9%  101.0%  
Basel            i                      117.7%  101.3%  
San Jose         i   108.6%   103.0%    120.0%  102.7%  
Wimbledon        g   119.4%   102.8%    120.7%  103.0%  
Queen's Club     g   113.3%   101.8%    121.5%  103.2%  
Halle            g   122.9%   104.7%    123.2%  102.5%  
Marseille        i   127.4%   102.8%    124.2%  102.2%  

Stockholm        i                      124.4%   99.8%  
Metz             i                      124.6%  101.7%  
Tokyo            h                      124.7%  100.5%  
s-Hertogenbosch  g   110.9%   102.1%    126.3%  104.0%  
Memphis          i   117.1%   101.2%    129.1%  102.0%  
Montpellier      i                      145.4%  104.5%