How To Keep Round Robin Matches Interesting

Italian translation at settesei.it

Round robins–such as the formats used by the ATP and WTA Tour Finals–have a lot going for them. Fans are guaranteed at least three matches for every player, and competitors can recover from one (or even two) bad outings. Best of all, when compared to a knockout-style draw, it’s twice as much tennis.

On the other hand, round robins have one major drawback: They can result in meaningless matches. It’s fairly common that, after two matches, a player is guaranteed a spot in the semifinals (sometimes even a specific seed) or eliminated from contention altogether. At a high-profile event such as the Tour Finals, with sky-high ticket prices, do we really want to run the risk of dead rubbers?

I don’t claim to have the answer to that question. However, we can take a closer look at the round robin format to answer several relevant questions. What is the probability that the final day of a four-player group will include at least one dead rubber? What about the final match? And most importantly, before the event begins, can we set the schedule in such a way as to minimize the likelihood of dead rubbers?

The range of possibilities

As a first step, let’s determine all of the possible outcomes of the first four matches in a four-player round robin group. For convenience, I’ll refer to the players as A, B, C, and D. The first day features two matches, A vs B and C vs D. The second day is A vs C and B vs D, leaving us with a final day of A vs D and B vs C.

Each match has four possible outcomes: the first player wins in two sets, the first player wins in three, the second wins in two, or the second wins in three. (Sets won are important because they are used as a tiebreaker when, for instance, three players win two matches apiece.) Thus, there are 4 x 4 x 4 x 4 = 256 possible arrangements of the group standings entering the final day of round robin play.
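As a quick check on that count, here is a minimal Python sketch that enumerates every combination of results for the first four matches. The outcome labels and variable names are illustrative assumptions, not part of the original analysis.

```python
from itertools import product

# Four possible results for any single match (labels are illustrative).
OUTCOMES = ("first in 2", "first in 3", "second in 2", "second in 3")

# The matches played on days one and two of the group.
FIRST_FOUR_MATCHES = (("A", "B"), ("C", "D"), ("A", "C"), ("B", "D"))

# Every way the first four matches can turn out.
scenarios = list(product(OUTCOMES, repeat=len(FIRST_FOUR_MATCHES)))

print(len(scenarios))                  # 4 ** 4 = 256
print(len(scenarios) * len(OUTCOMES))  # with a fifth match added: 1024
```

Classifying each scenario as live or dead would additionally require the group's tiebreak rules, which this sketch does not attempt.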

Of those 256 permutations, 32 of them (12.5%) include one dead rubber on the final day. In those cases, the other match is played only to decide semifinal seeding between the players who will advance. Another 32 of the 256 permutations involve one “almost-dead” match, between a player who has been eliminated and a player who is competing only to determine semifinal seeding.

In other words, one out of every four possible outcomes of the first two days results in a day three match that is either entirely or mostly meaningless. Later on, we’ll dig into the probability that these outcomes occur, which depends on the relative skill levels of the four players in the group.

Before we do that, let’s take a little detour to define our terms. Because of the importance of semifinal seeding, some dead rubbers are less dead than others. Further, it is frequently the case that one player in a match still has a shot at the semifinals and the other doesn’t. Altogether, from “live” to “dead,” there are six gradations:

  1. live/live — both players are competing to determine whether they survive
  2. live/seed — one player could advance or not; the other will advance, and is playing to try to earn the #1 group seed
  3. live/dead — one player is trying to survive; the other is eliminated
  4. seed/seed — both players will advance; the winner gets the #1 group seed
  5. seed/dead — one player is in the running for the #1 seed; the other is eliminated
  6. dead/dead — both players are eliminated

All else equal, the closer a match sits to the top of that list, the more engaging its implications for the tournament. For the remainder of this article, I’ll refer only to the “dead/dead” category as “dead rubbers,” though I will occasionally discuss the likelihood of “dead/seed” matches (the fifth category above) as well. I’ll assume that the #1 seed is always more desirable than #2 and ignore the fascinating but far-too-complex ramifications of situations in which a player might prefer the #2 spot.

The sixth match

As we’ve seen, there are many sequences of wins and losses that result in a dead rubber on day three. Once the fifth match is played, it is even more likely that the seedings have been determined, making the sixth match meaningless.

After five matches, there are 1,024 possible group standings. (256 permutations after the first four matches, multiplied by the four possible outcomes of the fifth match.) Of those, 145 (14.1%) result in a dead sixth rubber, and another 120 (11.7%) give us a “dead/seed” sixth match.

We haven’t yet determined how likely it is that we’ll arrive at the specific standings that result in dead sixth rubbers. So far, the important point is that dead rubbers on day three aren’t just flukes. In a four-player round robin, they are always a real possibility, and if there is a way to minimize their likelihood, we should jump at the chance.

Real scenarios, really dead rubbers

To figure out the likelihood of dead rubbers in practical situations, like the ATP and WTA Tour Finals, I used a hypothetical group of four players with Elo ratings spread over a 200-point range.

Why 200? This year’s Singapore field was very tightly packed, within a little bit more than 100 points, implying that the best player, Angelique Kerber, had about a 65% chance of beating the weakest, Svetlana Kuznetsova. By contrast, the ATP finalists in London are likely to be spread out over a 400-point range, giving the strongest competitor, Novak Djokovic, at least a 90% edge over the weakest.

I’ve given our hypothetical best player a rating of 2200, followed by a field of one player at 2130, one at 2060, and one at 2000. Thus, our favorite has a 60% chance of beating the #2 seed, a 69% chance of defeating the #3 seed, and a 76% chance of besting the #4 seed.
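Those percentages follow from the standard Elo expectation formula. A minimal sketch, assuming the usual 400-point logistic scale (the function name and the dictionary of seeds are illustrative):

```python
def elo_win_probability(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

# The hypothetical group described above.
ratings = {"#1": 2200, "#2": 2130, "#3": 2060, "#4": 2000}

for seed in ("#2", "#3", "#4"):
    p = elo_win_probability(ratings["#1"], ratings[seed])
    print(f"#1 beats {seed}: {p:.0%}")  # 60%, 69%, 76%
```

The same function gives roughly 64% for a 100-point gap and about 91% for a 400-point gap, in line with the Singapore and London comparisons above.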

For any random arrangement of the schedule, after the first two days of play, this group has a 17% chance of giving us a dead rubber on day three, plus a 23% chance of a “dead/seed” match on day three.

After the fifth match is contested, there is a 16% chance that the sixth match is meaningless, with an additional 12% chance that the sixth match falls into the “dead/seed” category.

The wider the range of skill levels, the higher the probability of dead rubbers. This is intuitive: The bigger the gap between the top and bottom, the more likely that the best player will win their first two matches–and the more likely those wins will come in straight sets. Similarly, the chances are higher that the weakest player will lose theirs. The higher the probability that players go into day three with 2-0 or 0-2 records, the less likely that day three matches have an impact on the outcome of the group.

How to schedule a round robin group

A 17% chance of a dead rubber on day three is rather sad. But there is a bright spot in my analysis: By rearranging the schedule, you can raise that probability as high as 24.7% … or drop it as low as 10.7%.

Remember that our schedule looks like this:

Day one: A vs B, C vs D

Day two: A vs C, B vs D

Day three: A vs D, B vs C

We get the lowest possible chance of a day three dead rubber if we put the players on the schedule in order from weakest to strongest: A is #4, B is #3, and so on:

Day one: #4 vs #3, #2 vs #1

Day two: #4 vs #2, #3 vs #1

Day three: #4 vs #1, #3 vs #2
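To make the slot assignment concrete, here is a small Python sketch that maps any ordering of the four players onto slots A, B, C, and D and prints the resulting schedule. The function name `build_schedule` is a hypothetical helper introduced for illustration.

```python
def build_schedule(slots):
    """Assign players to slots A, B, C, D and return the three days of matches."""
    a, b, c, d = slots
    return [
        [(a, b), (c, d)],  # day one
        [(a, c), (b, d)],  # day two
        [(a, d), (b, c)],  # day three
    ]

# Weakest to strongest: A is #4, B is #3, C is #2, D is #1.
schedule = build_schedule(["#4", "#3", "#2", "#1"])
for day, matches in enumerate(schedule, start=1):
    print(f"Day {day}: " + ", ".join(f"{x} vs {y}" for x, y in matches))
```

Feeding each of the 24 orderings from `itertools.permutations` through the same function is one way to list every arrangement a scheduler could compare.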

There is a small drawback to our optimal arrangement: It increases the odds of a “dead/seed” match. It turns out that you can only optimize so much: No matter what the arrangement of the competitors, the probability of a “dead/dead” or “dead/seed” match on day three stays about the same, between 39.7% and 41.7%. While neither type of match is desirable, we’re stuck with a certain likelihood of one or the other, and it seems safe to assume that a “dead/seed” rubber is better than a totally meaningless one.

Given how much is at stake, I hope that tournament organizers heed this advice and schedule round robin groups in order to minimize the chances of dead rubbers. The math gets a bit hairy, but the conclusions are straightforward and dramatic enough to make it clear that scheduling can make a difference. Over the course of the season, almost every tennis match matters–it would be nice if every match at the Tour Finals did, too.

(I wrote more about this, which you can read here.)

The Pointlessness of Playing the Lets

Italian translation at settesei.it

Some people always want tennis matches to be shorter. Among the many recurring proposals to accomplish that, one that has been implemented in some places is eliminating service lets. In other words, serves are treated the same way as any other shot: If the serve clips the net and lands in the box, it’s in play.

“No-let” rules have been adopted by World Team Tennis and American university tennis. In the latter case, eliminating lets has more to do with ensuring fair play in the absence of an umpire. In 2013, the ATP experimented with no lets on the Challenger tour for the first three months of the year.

With an umpire on every professional court and machines that detect service lets at tour-level events, fairness (or avoiding cheating) is not the issue here. The reason we’re talking about this is that service lets take time, and apparently time is the enemy.

How much time?

The Match Charting Project has tracked lets in most of the 2,500-plus matches it has logged. Thus, we have some real-life data on the frequency of service lets. For today, I’ve limited our view to matches since 2010, which still gives us more than 2,000 matches to work with.

The average men’s match in the database, which consists of 151 total points, had six first-serve lets and fewer than one (0.875) second-serve let. Women’s matches are similar: Of the typical 139 points, there were 4.5 first-serve lets and 0.8 second-serve lets.

Let’s estimate the extra time all those lets are taking. After a first-serve let, most players restart their preparations, so let’s say a first-serve let is an extra 20 seconds. When the second serve is a let, most players are quicker to try again, so call that 10 seconds.

For the average men’s match in the database, that’s an extra 128 seconds–just over two minutes. For women, that’s 99 extra seconds per match. In both cases, the time consumed by service lets is less than one second per point.  Just about any other rule change aimed at speeding up the game would be more effective than that.
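The arithmetic behind those estimates is simple enough to sketch in a few lines of Python. It uses the rounded per-match averages quoted above, so its totals can differ by a second or so from figures computed on unrounded data; the constant and function names are illustrative.

```python
FIRST_SERVE_LET_SECONDS = 20   # assumed cost of a first-serve let
SECOND_SERVE_LET_SECONDS = 10  # assumed cost of a second-serve let

def let_seconds(first_serve_lets, second_serve_lets):
    """Estimated time consumed by service lets, in seconds."""
    return (first_serve_lets * FIRST_SERVE_LET_SECONDS
            + second_serve_lets * SECOND_SERVE_LET_SECONDS)

men = let_seconds(6, 0.875)    # average men's match, 151 points
women = let_seconds(4.5, 0.8)  # average women's match, 139 points

print(f"men: {men:.0f} s per match, {men / 151:.2f} s per point")
print(f"women: {women:.0f} s per match, {women / 139:.2f} s per point")
```

Changing the two constants shows how sensitive the conclusion is to the guessed cost per let: even doubling both leaves the total under five minutes for the average match.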

Even at the extremes, it’s tough to argue that service lets are taking too much time. Of all the matches in the charting database, none had more than 24 service lets, and that was in the 2012 London Olympics marathon between Roger Federer and Juan Martin Del Potro. Using the estimates I gave above, those 20 first-serve and four second-serve lets accounted for just over seven minutes of the total match time of 4:26.

Only one of the 1,000 women’s matches in the database featured more than 17 service lets or more than five let-attributable minutes: Petra Cetkovska’s three-set upset of Angelique Kerber at the 2014 Italian Open. That outlier included 22 lets, which we would estimate at a cost of just under seven minutes.

Playing service lets wouldn’t destroy the very fabric of tennis as we know it, but it also wouldn’t substantially shorten matches. By changing the let rule, tennis executives would needlessly annoy players and fans for no noticeable benefit.

What Would Happen If the WTA Switched to Super-Tiebreaks?

Italian translation at settesei.it

It’s in the news again: Some tennis execs think that matches are too long, fans’ attention spans are too short, and the traditional format of tennis matches needs to change. Since ATP and WTA doubles have already swapped a full third set for a 10-point super-tiebreak, something similar would make for a logical proposal to cap singles match length.

Let’s dig into the numbers and see just how much time would be saved if the WTA switched from a third set to a super-tiebreak. It is tempting to use match times from doubles, but there are two problems. First, match data on doubles is woefully sparse. Second, the factors that influence match length, such as average point length and time between points, are different in doubles and singles.

Using only WTA singles data, here’s what we need to do:

  1. Determine how many matches would be affected by the switch
  2. Figure out how much time is consumed by existing third sets
  3. Estimate the length of singles super-tiebreaks
  4. Calculate the impact (measured in time saved) of the change

The issue: three-setters

Through last week’s tournaments on the WTA tour this year, I have length (in minutes) for 1,915 completed singles matches.  I’ve excluded Grand Slam events, since third sets at three of the four Slams can extend beyond 6-6, skewing the length of a “typical” third set.

The average length of a WTA singles match is about 97 minutes, with a range from 40 minutes up to 225 minutes. Here is a look at the distribution of match times this year:

[Figure: histogram of WTA singles match lengths this season]

The most common lengths are between 70 and 90 minutes. Some executives may wish to shorten all matches–switching to no-ad games (which I’ve considered here) or a more radically different format such as Fast4–but for now, I think it’s fair to assume that those 90-minute matches are safe from tinkering.

If there is a “problem” with long matches–both for fan engagement and scheduling–it arises mostly with three-setters. About one-third of WTA matches go to a third set, and these account for nearly all of the contests that last longer than two hours. 460 matches have passed the two-hour mark this season. Of those, all but 24 required a third set.

Here is the distribution of match lengths for WTA three-setters this season:

[Figure: histogram of match lengths for WTA three-setters this season]

If we simply removed all third sets, nearly all matches would finish within two hours. Of course, if we did that, we’d be left with an awful lot of ties. Instead, we’re talking about replacing third sets with something shorter.

Goodbye, third set

Third sets are a tiny bit shorter than the first and second sets in three-setters. If we count sets that go to tiebreaks as 14 games, the average number of games in a third set is 9.5, while the typical number of games in the first and second sets of a three-setter is 9.7.

Those counts are close enough that we can estimate the length of each set very simply, as one-third the length of the match. There are other considerations, such as the frequency of toilet breaks before third sets and the number of medical timeouts in different sets, but even if we did want to explore those minor issues, there is very little available data to guide us in those areas.

The length of a super-tiebreak

The typical WTA three-setter involves about 189 individual points, so we can roughly estimate that forgoing the third set saves about 63 points. How many points are added back by playing a super-tiebreak?

The math gets rather involved here, so I’ll spare you most of the details. Using the typical rate of service and return points won by each player in three-setters (58% on serve and 46% on return for the better player that day), we can use my tiebreak probability model to determine the distribution of possible outcomes, such as a final score 10-7 or 12-10.

Long story short, the average super-tiebreak would require about 19 points, less than one-third the number needed by the average third set.
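For readers who want to experiment, here is a rough Monte Carlo stand-in for that calculation. It is not the tiebreak probability model mentioned above: it assumes every point is independent, that the better player wins 58% of serve points and 46% of return points throughout, and that the better player serves first. The function names are hypothetical, and because of those simplifications its average will not necessarily match the estimate quoted above.

```python
import random

def server_is_a(points_played):
    """Player A serves point 0, then the serve alternates every two points."""
    return ((points_played + 1) // 2) % 2 == 0

def simulate_super_tiebreak(p_a_serve, p_a_return, rng, target=10):
    """Play one first-to-`target`, win-by-two tiebreak; return points played."""
    a = b = 0
    while not (max(a, b) >= target and abs(a - b) >= 2):
        # A's chance on this point depends on who is serving.
        p_a_wins = p_a_serve if server_is_a(a + b) else p_a_return
        if rng.random() < p_a_wins:
            a += 1
        else:
            b += 1
    return a + b

rng = random.Random(2016)  # fixed seed so the run is repeatable
lengths = [simulate_super_tiebreak(0.58, 0.46, rng) for _ in range(20_000)]
print(f"average points per super-tiebreak: {sum(lengths) / len(lengths):.1f}")
```

Collecting `lengths` in a `collections.Counter` would give the full distribution of tiebreak lengths rather than just the mean.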

That still doesn’t quite answer our question, though. We’re interested in time savings, not point reduction. The typical WTA third set takes about 44 minutes, or about 42 seconds per point. Would a super-tiebreak be played at the same pace?

Tiebreak speed

While 10-point breakers are largely uncharted territory in singles, 7-point tiebreaks are not, and we have plenty of data on the latter. It seems reasonable to extend conclusions about 7-pointers to their 10-point cousins, as they are played with similar rules–switch servers every two points, switch ends every six–and under comparable levels of increased pressure.

Using IBM’s point-by-point data from this year’s Grand Slam women’s draws, we have timestamps on about 700 points from tiebreaks. Even though the 42-seconds-per-point estimate for full sets includes changeovers, tiebreaks are played even more slowly. Including mini-changeovers within tiebreaks, points take about 54 seconds each, almost 30% longer than the traditional-set average.

The bottom line impact of third-set super-tiebreaks

As we’ve seen, the average third set takes about 44 minutes. A 19-point super-tiebreak, at 54 seconds per point, comes in at about 17 minutes, chopping more than 60% off the length of the typical third set, or about 20% off the length of the entire match.
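The time savings can be reproduced from the figures already given. A minimal sketch, taking each set as one-third of the match as assumed earlier (the variable names are illustrative):

```python
THIRD_SET_MINUTES = 44
SUPER_TIEBREAK_POINTS = 19
SECONDS_PER_TIEBREAK_POINT = 54

tiebreak_minutes = SUPER_TIEBREAK_POINTS * SECONDS_PER_TIEBREAK_POINT / 60
minutes_saved = THIRD_SET_MINUTES - tiebreak_minutes
match_minutes = 3 * THIRD_SET_MINUTES  # each set taken as one-third of the match

print(f"super-tiebreak length: {tiebreak_minutes:.1f} minutes")          # 17.1
print(f"cut from the third set: {minutes_saved / THIRD_SET_MINUTES:.0%}")  # 61%
print(f"cut from the whole match: {minutes_saved / match_minutes:.0%}")    # 20%
```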

If we alter this year’s WTA singles match times accordingly, reducing the length of all three-setters by one-fifth, we get some results that certain tennis executives will love. The average match time falls from 97 minutes to 89 minutes, and more importantly, far fewer matches cross the two-hour threshold.

Of the 460 matches this season over two hours in length, we would expect third-set super-tiebreaks to eliminate more than two-thirds of them, knocking the total down to 147. Here is the revised match length distribution, based on the assumptions I’ve laid out in this post:

[Figure: histogram of revised WTA match lengths under third-set super-tiebreaks]

The biggest benefit to switching to a third-set super-tiebreak is probably related to scheduling. By massively cutting down the number of marathon matches, it’s less likely that players and fans will have to wait around for an 11:00 PM start.

Of the various proposals floating around to shorten matches–third-set super-tiebreaks, no-ad scoring, playing service lets, and Fast4–changing the third-set format strikes the best balance of shortening the longest matches without massively changing the nature of the sport.

Personally, I hope none of these changes are ever seen on a WTA or ATP singles court. After all, I like tennis and tend to rankle at proposals that result in less tennis. If something must be done, I’d prefer it involve finding new executives to replace the ones who can’t stop tinkering with the sport. But if some rule needs to be changed to shorten matches and make scheduling more TV-friendly, this is likely the easiest one to stomach.

Elina Svitolina and Multiple #1 Upsets

Last week in Beijing, Elina Svitolina beat new WTA #1 Angelique Kerber. It was the first time the Ukrainian defeated Kerber this season, but it wasn’t her first 2016 triumph over a player ranked #1. At the Rio Olympics in August, Svitolina upset then-top-ranked Serena Williams.

It’s unusual for a player to face two (or more) different #1-ranked opponents in the same season. Since 1985, it has happened 136 times on the WTA tour and 148 times on the ATP tour. That’s fewer than five times per season per tour.

Of course, it’s much less common to upset multiple #1-ranked opponents, as Svitolina did. This was only the 16th time a woman did so (again, since 1985), while it has happened on the men’s side 18 times.

Here is a full list of WTA player-seasons that featured defeats of more than one top-ranked player:

Year  Player               Upsets                      
2016  Elina Svitolina      Kerber; Serena              
2010  Samantha Stosur      Serena; Wozniacki           
2009  Venus Williams       Serena; Safina              
2008  Dinara Safina        Henin; Sharapova; Jankovic  
2006  Justine Henin        Davenport; Mauresmo         
2003  Justine Henin        Serena; Clijsters           
2002  Kim Clijsters        Serena; Venus               
2002  Serena Williams      Capriati; Venus             
2001  Lindsay Davenport    Capriati; Hingis            
1999  Amelie Mauresmo      Hingis; Davenport           
1999  Venus Williams       Davenport; Hingis           
1997  Amanda Coetzer       Hingis; Graf                
1996  Jana Novotna         Graf; Seles                 
1996  Kimiko Date Krumm    Graf; Seles                 
1991  Martina Navratilova  Graf; Seles                 
1991  Gabriela Sabatini    Graf; Seles

It’s quite an accomplished list. As we might expect, there’s a lot of overlap between the players who achieved these upsets and past and future #1-ranked players. The real standouts here are Justine Henin and Venus Williams, who managed the feat twice, and Dinara Safina, who faced three different #1s in 2008, going undefeated against them.

Here are the men who beat multiple #1s in the same season:

Year  Player                 Upsets             
2013  Juan Martin Del Potro  Nadal; Djokovic    
2012  Andy Murray            Federer; Djokovic  
2011  David Ferrer           Nadal; Djokovic    
2011  Jo Wilfried Tsonga     Nadal; Djokovic    
2010  Marcos Baghdatis       Nadal; Federer     
2009  Juan Martin Del Potro  Nadal; Federer     
2008  Andy Murray            Nadal; Federer     
2008  Gilles Simon           Nadal; Federer     
2003  Rainer Schuettler      Roddick; Agassi    
2003  Fernando Gonzalez      Hewitt; Agassi     
2001  Greg Rusedski          Safin; Kuerten     
2001  Max Mirnyi             Safin; Kuerten     
1995  Michael Chang          Agassi; Sampras    
1992  Richard Krajicek       Courier; Edberg    
1991  Guy Forget             Edberg; Becker     
1991  Andrei Cherkasov       Edberg; Becker     
1990  Boris Becker           Lendl; Edberg      
1988  Boris Becker           Wilander; Lendl

This list isn’t quite as impressive, though it does capture several very good players at their best.  It also highlights the world-beating potential of Max Mirnyi, who–despite never reaching the top 15 himself–finished the 2001 season with a 3-1 record against ATP #1s.

The rarity of facing multiple #1s in the same season–let alone beating them–stops us from drawing any meaningful conclusions about what Svitolina’s feat indicates for her future. At the very least, however, it reminds us of the Ukrainian’s potential as a future star, and puts her among some very good historical company.

Christina McHale’s Tokyo Marathon

At the Japan Open in Tokyo last week, Christina McHale won her first career title. It didn’t come easy. She played three sets in every one of her five matches, going all the way to third-set tiebreaks in her first two rounds. Altogether, she spent over 13 hours on court.

We need some context to appreciate just what an outlier that is. Of 50 tour-level WTA tournaments this year, no other titlist has spent more than about 11 hours and 35 minutes on court–and that includes Grand Slam winners, who play two more matches than McHale did! Before Christina’s marathon effort last week, the champion who spent the most time on court in a 32-draw event was Dominika Cibulkova, who needed “only” 9 hours and 20 minutes to win in Eastbourne.

There’s no complete source for historical WTA match-time data, so we can’t determine just how rare 13-hour efforts were in years past. We can, however, hunt for tournaments in which the winner needed to play so many sets.

Going back to 1991–encompassing almost 1,500 events–McHale’s effort marks only the second time a player has won a tournament while playing 15 sets in five matches. The only previous instance was Anastasia Pavlyuchenkova’s Paris title run in 2014. Serena Williams played five three-setters en route to the Roland Garros title last year, but of course, she played two other matches as well. Three other players–none since 2003–received first-round byes and then won tournaments by playing three sets in each of their four matches.

In general, we might expect a player who goes the distance in every round to struggle in the final. First of all, we would expect her to be tired–especially if, as is almost always the case, her opponent hasn’t spent as much time on court. Second, we might deduce that, if a player needed three sets to win early rounds, she’s in relatively weak form, compared to the typical tour-level finalist.

Sure enough, the last 25 years of WTA history give us 16 players who reached a final by playing three sets in every round. Of the 16, only four–McHale, Pavlyuchenkova, and two others who didn’t require three sets in the final–won the title. The other 12 couldn’t retain their three-set magic and lost in the final.

While 16 players don’t make up much of a sample, we get a similar result if we broaden our view to those who played three-setters in exactly three of their four matches before the final. Excluding those who faced opponents who also played so many three-setters, we’re left with 134 players, only 48 (35.8%) of whom won the title match. A simple ranking-based forecast indicates that 58 (43.3%) of those players should have won, suggesting that while these players are indeed weaker than their more-dominant opponents, their underperformance may be due partly to fatigue.

McHale spent over 10 hours on court simply reaching the Tokyo final, far more than the six-plus hours required by her opponent, Katerina Siniakova. Even when a player doesn’t spend the record-setting amount of time on court that the American did last week, competitors tend to underperform after playing so many three-setters. The fact that McHale didn’t, and that she triumphed in yet another marathon match, makes her achievement all the more impressive.

Andrey Kuznetsov and Career Highs of ATP Non-Semifinalists

Seeing Andrey Kuznetsov in the quarterfinals of this week’s ATP 250 tournament in Winston-Salem raised a question: Will he finally reach the first ATP semifinal of his career? As shown here, Andrey – with a ranking of 42 – is currently (by far) the best-ranked player who has not reached an ATP SF. It looks as if he will stay atop this list for a while longer, after losing to Pablo Carreno Busta 4-6 3-6 on Wednesday.

With a record of 0-10 in ATP quarterfinals, he is still well short of Teymuraz Gabashvili’s 0-16 streak. Despite playing (and losing) six more quarterfinals before finally winning one this January against a retiring Bernard Tomic, Teymuraz climbed only as high as #50. Still, we could argue that Teymuraz’s QF losing streak is not really over, since that win came against a possibly injured player.

Running the numbers can answer questions such as “Who could climb up highest in the rankings without having won an ATP quarterfinal?” Doing so will put Andrey’s number 42 into perspective and will possibly reveal some other statistical trivia.

Player                Rank            Date   On
Andrei Chesnokov        30      1986.11.03    1
Yen Hsun Lu             33      2010.11.01    1
Nick Kyrgios            34      2015.04.06    1
Adrian Voinea           36      1996.04.15    1
Paul Haarhuis           36      1990.07.09    1
Jaime Yzaga             40      1986.03.03    1
Antonio Zugarelli       41      1973.08.23    1
Bernard Tomic           41      2011.11.07    1
Omar Camporese          41      1989.10.09    1
Wayne Ferreira          41      1991.12.02    1
Andrey Kuznetsov        42      2016.08.22    0
David Goffin            42      2012.10.29    1
Mischa Zverev           45      2009.06.08    1
Alexandr Dolgopolov     46      2010.06.07    1
Andrew Sznajder         46      1989.09.25    1
Lukas Rosol             46      2013.04.08    1
Ulf Stenlund            46      1986.07.07    1
Dominic Thiem           47      2014.07.21    1
Janko Tipsarevic        47      2007.07.16    1
Paul Annacone           47      1985.04.08    1
Renzo Furlan            47      1991.06.17    1
Mike Fishbach           47      1978.01.16    0
Oscar Hernandez         48      2007.10.08    1
Ronald Agenor           48      1985.11.25    1
Gary Donnelly           48      1986.11.10    0
Francisco Gonzalez      49      1978.07.12    1
Paolo Lorenzi           49      2013.03.04    1
Boris Becker            50      1985.05.06    1
Brett Steven            50      1993.02.15    1
Dominik Hrbaty          50      1997.05.19    1
Mike Leach              50      1985.02.18    1
Patrik Kuhnen           50      1988.08.01    1
Teymuraz Gabashvili     50      2015.07.20    1
Blaine Willenborg       50      1984.09.10    0

The table shows career-high rankings (inside the top 50) reached by players before they won their first ATP QF. A 0 in the last column indicates that the player has not yet won a QF, so he can still climb higher in this table. Some retired players are also marked with a 0, because they never got past a QF during their careers.
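The lookup behind a table like this can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used to build the table: the function name, the data layout, and the ranking history below are all invented for the example.

```python
from datetime import date

def best_rank_before(ranking_history, first_qf_win):
    """Best (lowest) ranking held strictly before the first QF win.

    ranking_history: list of (date, rank) tuples.
    first_qf_win: date of the first quarterfinal win, or None if the
    player has never won one, in which case the whole history counts.
    """
    eligible = [rank for day, rank in ranking_history
                if first_qf_win is None or day < first_qf_win]
    return min(eligible) if eligible else None

# Invented toy history for a fictional player, not real ranking data.
history = [(date(2015, 1, 5), 88), (date(2015, 6, 1), 47), (date(2016, 2, 1), 31)]

print(best_rank_before(history, date(2015, 9, 1)))  # 47
print(best_rank_before(history, None))              # 31
```

A player with no first QF win corresponds to a 0 in the last column of the table: his figure can still improve as new ranking weeks are added.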

I wonder who had Andrei Chesnokov on their radar for this. Before winning his first ATP QF, he pushed his ranking as high as #30, and he later reached a career high of #9. Nick Kyrgios also improved his ranking quickly without needing to reach an ATP SF. His Wimbledon 2014 QF, Roland Garros 2015 R32, and Australian Open 2015 QF runs helped him climb to #34 without a single win in an ATP QF. I would also like to highlight Alexandr Dolgopolov, who reached #46 before having played even a single QF.

Looking only at players who are still active and able to up their ranking without an ATP SF we get the following picture:

Player                 Rank            Date
Andrey Kuznetsov         42      2016.08.22
Rui Machado              59      2011.10.03
Tatsuma Ito              60      2012.10.22
Matthew Ebden            61      2012.10.01
Kenny De Schepper        62      2014.04.07
Pere Riba                65      2011.05.16
Tim Smyczek              68      2015.04.06
Blaz Kavcic              68      2012.08.06
Alejandro Gonzalez       70      2014.06.09

Andrey stands relatively alone: Rui Machado, second on the list, reached his highest ranking about five years ago. Skimming the rest of the table, we would be surprised if anyone came close to Andrey’s 42 soon, though an unexpected run by an up-and-coming player could always change that.

So what practical implications does this have for analyzing tennis? Hardly any, I am afraid. Still, we can infer that it is possible to get well inside the top 50 without winning more than two matches at a single tournament, even over a player’s whole career. Of course, it would be interesting to see how long such players can hold these rankings, which guarantee direct acceptance into ATP tournaments and, hence, a more or less regular income from R32, R16, and QF prize money. Moreover, as the case of 2015-era Nick Kyrgios shows, the question arises of how a player’s ranking points are composed: Performing well on the big stage of Masters or Grand Slams can be enough for a decent ranking despite poor results at ATP 250s. On the other hand, are there players whose ATP points breakdown reveals that they chase easier points at ATP 250s while never making deep runs at Masters or Grand Slams? These are questions I would like to answer in a future post.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria. I would like to thank Jeff for being open-minded and allowing me to post these surface-scratching lines here.

Searching For Meaning in Distance Run Stats

Italian translation at settesei.it

For the last couple of years, some tennis broadcasts have featured “distance run” stats, tracking how far each player travels over the course of a point or a match. It’s a natural byproduct of all the cameras pointed at tennis courts. Especially in long rallies, it’s something that fans have wondered about for years.

As is often the case with new metrics, no one seems to be asking whether these new stats mean anything. Thanks to IBM (you never thought I’d write that, did you?), we have more than merely anecdotal data to play with, and we can start to answer that question.

At Roland Garros and Wimbledon this year, distance run during each point was tracked for players on several main courts. From those two Slams, we have point-by-point distance numbers for 103 of the 254 men’s singles matches. A substantial group of women’s matches is available as well, and I’ll look at those in a future post.

Let’s start by getting a feel for the range of these numbers. Of the available non-retirement matches, the shortest distance run was in Rafael Nadal’s first-round match in Paris against Sam Groth. Nadal ran 960 meters against Groth’s 923–the only match in the dataset with a total distance run under two kilometers.

At the other extreme, Novak Djokovic ran 4.3 km in his fourth-round Roland Garros match against Roberto Bautista Agut, who himself tallied a whopping 4.6 km. Novak’s French Open final against Andy Murray is also near the top of the list. The two players totaled 6.7 km, with Djokovic’s 3.4 km edging out Murray’s 3.3 km. Murray is a familiar face in these marathon matches, figuring in four of the top ten. (Thanks to his recent success, he’s also wildly overrepresented in our sample, appearing 14 times.)

Between these extremes, the average match features a combined 4.4 km of running, or just over 20 meters per point. If we limit our view to points of five shots or longer (a very approximate way of separating rallies from points in which the serve largely determines the outcome), the average distance per point is 42 meters.

Naturally, on the Paris clay, points are longer and players do more running. In the average Roland Garros match, the competitors combined for 4.8 km, compared to 4.1 km at Wimbledon. (The dataset consists of about twice as many Wimbledon matches, so the overall numbers are skewed in that direction.) Measured by the point, that’s 47 meters on clay and 37 meters on grass.

Not a key to the match

All that running may be necessary, but covering more distance than your opponent doesn’t seem to have anything to do with winning the match. Of the 104 matches, almost exactly half (53) were won by the player who ran farther.

It’s possible that running more or less is a benefit for certain players. Surprisingly, Murray ran less than his opponent in 10 of his 14 matches, including his French Open contests against Ivo Karlovic and John Isner. (Big servers, immobile as they tend to be, may induce even less running in their opponents, since so many of their shots are all-or-nothing. On the other hand, Murray outran another big server, Nick Kyrgios, at Wimbledon.)

We think of physical players like Murray and Djokovic as the ones covering the entire court, and by doing so, they simultaneously force their opponents to do the same–or more. In Novak’s ten Roland Garros and Wimbledon matches, he ran farther than his opponent only twice–in the Paris final against Murray, and in the second round of Wimbledon against Adrian Mannarino. In general, running fewer meters doesn’t appear to be a leading indicator of victory, but for certain players in the Murray-Djokovic mold, it may be.

In the same vein, combined distance run may turn out to be a worthwhile metric. For men who earn their money in long, physical rallies, total distance run could serve as a proxy for their success in forcing a certain kind of match.

It’s also possible that aggregate numbers will never be more than curiosities. In the average match, there was only a 125-meter difference between the distances covered by the two players. In percentage terms, that means one player outran the other by only 5.5%. And as we’ll see in a moment, a difference of that magnitude could happen simply because one player racked up more points on serve.

Point-level characteristics

In the majority of points, the returner does a lot more running than the server does. The server usually forces his opponent to start running first, and in today’s men’s game, the server rarely needs to scramble too much to hit his next shot.

On average, the returner must run just over 10% farther than the server. When the first serve is put in play, that difference jumps to 12%. On second-serve points, it drops to 7%.

By extension, we would expect that the player who runs farther would, more often than not, lose the point. That’s not because running more is necessarily bad, but because of the inherent server’s advantage, which has the side effect of showing up in the distance run stats as well. That hypothesis turns out to be correct: The player who runs farther in a single point loses the point 56% of the time.

When we narrow our view to only those points with five shots or more, we see that running more is still associated with losing. In these longer rallies, the player who covered more distance loses 58% of the points.
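For readers who want to run a similar tally on their own data, here is a minimal sketch. It assumes a hypothetical per-point record (the field names and sample values below are invented for illustration and are not IBM’s data format), and it simply skips points where both players covered the same distance.

```python
from typing import NamedTuple

class Point(NamedTuple):
    # Hypothetical record layout; the real tracking data may look different.
    server_dist: float    # meters run by the server
    returner_dist: float  # meters run by the returner
    server_won: bool      # True if the server won the point
    shots: int            # rally length in shots

def farther_runner_loss_rate(points, min_shots=0):
    """Share of points lost by whichever player ran farther."""
    lost = counted = 0
    for p in points:
        if p.shots < min_shots or p.server_dist == p.returner_dist:
            continue  # skip short rallies and exact ties
        counted += 1
        server_ran_farther = p.server_dist > p.returner_dist
        # The farther runner lost if he served and the server lost,
        # or if he returned and the server won.
        if server_ran_farther != p.server_won:
            lost += 1
    return lost / counted if counted else float("nan")

# Made-up sample points, only to show the call.
sample = [
    Point(5.0, 9.0, True, 2),
    Point(12.0, 10.0, True, 6),
    Point(20.0, 25.0, True, 8),
    Point(30.0, 28.0, False, 11),
]
all_points_rate = farther_runner_loss_rate(sample)
rally_rate = farther_runner_loss_rate(sample, min_shots=5)
```

Setting `min_shots=5` applies the same rough rally filter used above, and raising it to 10 would isolate the very long exchanges.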

Some of the “extra” running in shorter points can be attributed to returning serve–and thus, we can assume that players are losing points because of the disadvantage of returning, not necessarily because they ran so much. But even in very long rallies of 10 shots or more, the player who runs farther is more likely to lose the point. Even at the level of a single point, my suggestion above, that physical players succeed by forcing opponents to work even harder than they do, seems valid.

With barely 100 matches of data–and a somewhat biased sample, no less–there are only so many conclusions we can draw about distance run stats. Two Grand Slams’ worth of show court matches is just enough to give us a general context for understanding these numbers and to hint at some interesting findings about the best players. Let’s hope that IBM continues to collect these stats, and that the ATP and WTA follow suit.

Shot-by-Shot Stats for 261 Grand Slam Finals (and More?)

One of my favorite subsets of the Match Charting Project is the ongoing effort–in huge part thanks to Edo–to chart all Grand Slam finals, men’s and women’s, back to 1980. We’re getting really close, with a total of 261 Slam finals charted, including:

  • every men’s Wimbledon and US Open final all the way back to 1980;
  • every men’s Slam final since 1989 Wimbledon;
  • every women’s Slam final back to 2001, with a single exception.

The Match Charting Project gathers and standardizes data that, for many of these matches, simply didn’t exist before. These recaps give us shot-by-shot breakdowns of historically important matches, allowing us to quantify how the game has changed–at least at the very highest level–over the last 35 years. A couple of months ago, I did one small project using this data to approximate surface speed changes–that’s just the tip of the iceberg in terms of what you can do with this data. (The dataset is also publicly available, so have fun!)

We’ve got about 30 Slam finals left to chart, and you might be able to help. As always, we are actively looking for new contributors to the project to chart matches (here’s how to get started, and why you should, and you don’t have to chart Slam finals!), but right now, I have a different request.

We’ve scoured the internet, from YouTube to Youku to torrent trackers, to find video for all of these matches. While I don’t expect any of you to have the 1980 Teacher-Warwick Australian Open final sitting around on your hard drive, I’ve got higher hopes for some of the more recent matches we’re missing.

If you have full (or nearly full) video for any of these matches, or you know of a (preferably free) source where we can find them, please–please, please!–drop me a line. Once we have the video, Edo or I will do the rest, and the project will become even more valuable.

There are several more finals from the 1980s that we’re still looking for. Here’s the complete list.

Thanks for your help!

Measuring the Clutchness of Everything

Italian translation at settesei.it

Matches are often won or lost by a player’s performance on “big points.” With a few clutch aces or un-clutch errors, it’s easy to gain a reputation as a mental giant or a choker.

Aside from the traditional break point stats, which have plenty of limitations, we don’t have a good way to measure clutch performance in tennis. There’s a lot more to this issue than counting break points won and lost, and it turns out that a lot of the work necessary to quantify clutchness is already done.

I’ve written many times about win probability in tennis. At any given point score, we can calculate the likelihood that each player will go on to win the match. Back in 2010, I borrowed a page from baseball analysts and introduced the concept of volatility, as well. (Click the link to see a visual representation of both metrics for an entire match.) Volatility, or leverage, measures the importance of each point–the difference in win probability between a player winning it or losing it.

To put it simply, the higher the leverage of a point, the more valuable it is to win. “High leverage point” is just a more technical way of saying “big point.” To be considered clutch, a player should win a higher share of high-leverage points than of low-leverage points. You don’t have to win a disproportionate number of high-leverage points to be a very good player–Roger Federer’s break point record is proof of that–but high-leverage points are key to being a clutch player.

(I’m not the only person to think about these issues. Stephanie wrote about this topic in December and calculated a full-year clutch metric for the 2015 ATP season.)

To make this more concrete, I calculated win probability and leverage (LEV) for every point in the Wimbledon semifinal between Federer and Milos Raonic. For the first point of the match, LEV = 2.2%. Raonic could boost his match odds to 50.7% by winning it or drop to 48.5% by losing it. The highest leverage in the match was a whopping 32.8%, when Federer (twice) had game point at 1-2 in the fifth set. The lowest leverage of the match was a mere 0.03%, when Raonic served at 40-0, down a break in the third set. The average LEV in the match was 5.7%, a rather high figure befitting such a tight match.
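As a rough illustration of the definition, leverage is just the gap between two win probabilities. The sketch below applies it to the first-point figures quoted above, then to a toy model of a single service game in which the server wins every point independently with the same probability `p`. That toy model is an assumption made for brevity: the figures in this post are match-level, and this sketch does not reproduce the full match model behind them.

```python
def leverage(wp_if_won, wp_if_lost):
    """Swing in win probability riding on a single point."""
    return wp_if_won - wp_if_lost

# First point of the match, using the figures quoted above.
first_point = leverage(0.507, 0.485)  # about 0.022, i.e. 2.2%

def game_win_prob(p, a=0, b=0):
    """Chance the server wins the game from a points to b.

    Toy assumption: the server wins each point independently with
    probability p. Points are counted 0, 1, 2, 3 (= 40), and so on.
    """
    if a >= 4 and a - b >= 2:
        return 1.0
    if b >= 4 and b - a >= 2:
        return 0.0
    if a == b and a >= 3:
        # Deuce: closed form for winning two points in a row first.
        q = 1.0 - p
        return p * p / (p * p + q * q)
    return p * game_win_prob(p, a + 1, b) + (1.0 - p) * game_win_prob(p, a, b + 1)

def game_leverage(p, a, b):
    """In-game leverage of the next point at score a-b (server first)."""
    return leverage(game_win_prob(p, a + 1, b), game_win_prob(p, a, b + 1))

lev_40_0 = game_leverage(0.65, 3, 0)   # server far ahead
lev_30_40 = game_leverage(0.65, 2, 3)  # break point
```

Even in this simplified setting, the break point carries far more leverage than the point at 40-0, which mirrors the pattern in the match-level numbers.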

On average, the 166 points that Raonic won were slightly more important, with LEV = 5.85%, than Federer’s 160, at LEV = 5.62%. Without doing a lot more work with match-level leverage figures, I don’t know whether that’s a terribly meaningful difference. What is clear, though, is that certain parts of Federer’s game fell apart when he needed them most.

By Wimbledon’s official count, Federer committed nine unforced errors, not counting his five double faults, which we’ll get to in a minute. (The Match Charting Project log says Fed had 15, but that’s a discussion for another day.) There were 180 points in the match where the return was put in play, with an average LEV = 6.0%. Federer’s unforced errors, by contrast, had an average LEV nearly twice as high, at 11.0%! The typical leverage of Raonic’s unforced errors was a much less noteworthy 6.8%.

Fed’s double fault timing was even worse. Those of us who watched the fourth set don’t need a fancy metric to tell us that, but I’ll do it anyway. His five double faults had an average LEV of 13.7%. Raonic double faulted more than twice as often, but the average LEV of those points, 4.0%, means that his 11 doubles had less of an impact on the outcome of the match than Roger’s five.

Even the famous Federer forehand looks like less of a weapon when we add leverage to the mix. Fed hit 26 forehand winners, in points with average LEV = 5.1%. Raonic’s 23 forehand winners occurred during points with average LEV = 7.0%.
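The per-category averages in this section amount to a filtered mean over the points of a match. Here is a minimal sketch, assuming each point is stored as a hypothetical (leverage, ending, player) record. The labels and values are invented for illustration and are not taken from the actual match log.

```python
from statistics import mean

# Hypothetical point records: (leverage, how the point ended, who hit the last shot).
# These values are made up; they are not from the Federer-Raonic match.
points = [
    (0.022, "forehand_winner", "Federer"),
    (0.150, "unforced_error", "Federer"),
    (0.031, "ace", "Raonic"),
    (0.070, "unforced_error", "Federer"),
    (0.045, "double_fault", "Raonic"),
]

def average_leverage(points, ending=None, player=None):
    """Mean leverage of the points matching the given ending and/or player."""
    selected = [
        lev for lev, end, who in points
        if (ending is None or end == ending) and (player is None or who == player)
    ]
    return mean(selected) if selected else float("nan")

baseline = average_leverage(points)
fed_errors = average_leverage(points, ending="unforced_error", player="Federer")
```

Comparing a category average such as `fed_errors` against `baseline` is the same kind of comparison made above between Federer’s unforced errors and the average point with the return in play.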

Taking these three stats together, it seems like Federer saved his greatness for the points that didn’t matter as much.

The bigger picture

When we look at a handful of stats from a single match, we’re not improving much on a commentator who vaguely summarizes a performance by saying that a player didn’t win enough of the big points. While it’s nice to attach concrete numbers to these things, the numbers are only worth so much without more context.

In order to gain a more meaningful understanding of this (or any) performance with leverage stats, there are many, many more questions we should be able to answer. Were Federer’s high-leverage performances typical? Does Milos often double fault on less important points? Do higher-leverage points usually result in more returns in play? How much can leverage explain the outcome of very close matches?

These questions (and dozens, if not hundreds more) signal to me that this is a fruitful field for further study. The smaller-scale numbers, like the average leverage of points ending with unforced errors, seem to have particular potential. For instance, it may be that Federer is less likely to go for a big forehand on a high-leverage point.

Despite the dangers of small samples, these metrics allow us to pinpoint what, exactly, players did at more crucial moments. Unlike some of the more simplistic stats that tennis fans are forced to rely on, leverage numbers could help us understand the situational tendencies of every player on tour, leading to a better grasp of each match as it happens.

How Elo Solves the Olympics Ranking Points Conundrum

Italian translation at settesei.it

Last week’s Olympic tennis tournament had superstars, it had drama, and it had tears, but it didn’t have ranking points. Surprise medalists Monica Puig and Juan Martin del Potro scored huge triumphs for themselves and their countries, yet they still languish at 35th and 141st in their respective tour’s rankings.

The official ATP and WTA rankings have always represented a collection of compromises, as they try to accomplish dual goals of rewarding certain behaviors (like showing up for high-profile events) and identifying the best players for entry in upcoming tournaments. Stripping the Olympics of ranking points altogether was an even weirder compromise than usual. Four years ago in London, some points were awarded and almost all the top players on both tours showed up, even though many of them could’ve won more points playing elsewhere.

For most players, the chance at Olympic gold was enough. The level of competition was quite high, so while the ATP and WTA tours treat the tournament in Rio as a mere exhibition, those of us who want to measure player ability and make forecasts must factor Olympics results into our calculations.

Elo, a rating system originally designed for chess that I’ve been using for tennis for the past year, is an excellent tool for integrating Rio results with the rest of this season’s wins and losses. Broadly speaking, it awards points to match winners and subtracts points from losers. Beating a top player is worth many more points than beating a lower-rated one. There is no penalty for not playing–for example, Stan Wawrinka’s and Simona Halep’s ratings are unchanged from a week ago.

Unlike the ATP and WTA ranking systems, which award points based on the level of tournament and round, Elo is context-neutral. Del Potro’s Elo rating improved quite a bit thanks to his first-round upset of Novak Djokovic–the same amount it would have increased if he had beaten Djokovic in, say, the Toronto final.
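For the curious, the textbook Elo update can be sketched in a few lines. The K-factor of 32 and the ratings below are placeholders chosen for illustration, not the values used for the ratings on this site. Note that nothing about the tournament or round enters the calculation.

```python
def expected_score(rating, opp_rating):
    """Standard Elo win expectancy for a player against an opponent."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))

def elo_update(winner, loser, k=32.0):
    """Return the new (winner, loser) ratings after one match.

    k is an assumed constant here; real implementations often vary it.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain

# Hypothetical ratings: an upset is worth more than an expected win.
upset_winner, upset_loser = elo_update(2000.0, 2400.0)
routine_winner, routine_loser = elo_update(2400.0, 2000.0)
```

The winner’s gain equals the loser’s loss, and the size of the swing depends only on the gap between the two ratings before the match.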

Many fans object to this, on the reasonable assumption that context matters. It certainly seems like the Wimbledon final should count for more than, say, a Monte Carlo quarterfinal, even if the same player defeats the same opponent in both matches.

However, results matter for ranking systems, too. A good rating system will do two things: predict winners correctly more often than other systems, and give more accurate degrees of confidence for those predictions. (For example, in a sample of 100 matches in which the system gives one player a 70% chance of winning, the favorite should win 70 times.) Elo, with its ignorance of context, predicts more winners and gives more accurate forecast certainties than any other system I’m aware of.
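Both criteria can be checked with a few lines of code. The sketch below groups forecasts by the stated probability and compares each group’s observed win rate with it, then computes a Brier score, which is one common accuracy measure and an assumption on my part rather than the specific metric behind the claims here. The forecast data is made up to mirror the 70% example.

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Group (probability, favorite_won) pairs by probability rounded to 0.1.

    Returns {bucket: (matches, mean forecast, observed win rate)}.
    """
    buckets = defaultdict(list)
    for prob, won in forecasts:
        buckets[round(prob, 1)].append((prob, won))
    table = {}
    for key, rows in buckets.items():
        n = len(rows)
        mean_forecast = sum(prob for prob, _ in rows) / n
        win_rate = sum(1 for _, won in rows if won) / n
        table[key] = (n, mean_forecast, win_rate)
    return table

def brier_score(forecasts):
    """Mean squared gap between forecast and result (lower is better)."""
    return sum((prob - (1.0 if won else 0.0)) ** 2 for prob, won in forecasts) / len(forecasts)

# Invented sample: ten matches forecast at 70%, of which the favorite won seven.
sample = [(0.7, True)] * 7 + [(0.7, False)] * 3
table = calibration_table(sample)
score = brier_score(sample)
```

A well-calibrated system shows observed win rates close to the mean forecast in every bucket, and with real data each bucket needs a large enough sample for that comparison to mean anything.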

For one thing, it wipes the floor with the official rankings. While it’s possible that tweaking Elo with context-aware details would better the results even more, the improvement would likely be minor compared to the massive difference between Elo’s accuracy and that of the ATP and WTA algorithms.

Relying on a context-neutral system is perfect for tennis. Instead of altering the ranking system with every change in tournament format, we can always rate players the same way, using only their wins, losses, and opponents. In the case of the Olympics, it doesn’t matter which players participate, or what anyone thinks about the overall level of play. If you defeat a trio of top players, as Puig did, your rating skyrockets. Simple as that.

Two weeks ago, Puig was ranked 49th among WTA players by Elo–several places lower than her WTA ranking of 37. After beating Garbine Muguruza, Petra Kvitova, and Angelique Kerber, her Elo ranking jumped to 22nd. While it’s tough, intuitively, to know just how much weight to assign to such an outlier of a result, her Elo rating just outside the top 20 seems much more plausible than Puig’s effectively unchanged WTA ranking in the mid-30s.

Del Potro is another interesting test case, as his injury-riddled career presents difficulties for any rating system. According to the ATP algorithm, he is still outside the top 100 in the world–a common predicament for once-elite players who don’t immediately return to winning ways.

Elo has the opposite problem with players who miss a lot of time due to injury. When a player doesn’t compete, Elo assumes his level doesn’t change. That’s clearly wrong, and it has cast a lot of doubt over del Potro’s place in the Elo rankings this season. The more matches he plays, the more his rating will reflect his current ability, but his #10 position in the pre-Olympics Elo rankings seemed overly influenced by his former greatness.

(A more sophisticated Elo-based system, Glicko, was created in part to improve ratings for competitors with few recent results. I’ve tinkered with Glicko quite a bit in hopes of more accurately measuring the current levels of players like Delpo, but so far, the system as a whole hasn’t come close to matching Elo’s accuracy while also addressing the problem of long layoffs. For what it’s worth, Glicko ranked del Potro around #16 before the Olympics.)

Del Potro’s success in Rio boosted him three places in the Elo rankings, up to #7. While that still owes something to the lingering influence of his pre-injury results, it’s the first time his post-injury Elo rating comes close to passing the smell test.

You can see the full current lists elsewhere on the site: here are ATP Elo ratings and WTA Elo ratings.

Any rating system is only as good as the assumptions and data that go into it. The official ATP and WTA ranking systems have long suffered from improvised assumptions and conflicting goals. When an important event like the Olympics is excluded altogether, the data is incomplete as well. Now as much as ever, Elo shines as an alternative method. In addition to a more predictive algorithm, Elo can give Rio results the weight they deserve.