Podcast Episode 29: A New Davis Cup and a Career Golden Masters

In Episode 29 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, we spend so much time talking about the Davis Cup reforms that if ITF head David Haggerty were to listen, he’d immediately try to change our format to something shorter. While we recorded the episode before yesterday’s finals in Cincinnati, we also touch on several other topics: the impressiveness of Novak Djokovic’s career record at all the Masters, the players who, like Sloane Stephens, have powerful groundstrokes that they save for special occasions, and Simona Halep’s aggressive scheduling.

Thanks for listening!

(Note: this week’s episode is about 66 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Update: Episode index with links, thanks to FBITennis:

Davis Cup Reforms Generally 1:49
Team and mixed gender tennis trend 6:15
Effect of DC reforms on smaller tennis federations 8:11
Will big stars play more often in the new DC format? 12:17
Davis Cup analogy to World Baseball Classic 18:39
Which countries benefit most from the new DC format? 22:09
Cincinnati F (men’s): Djokovic v Federer 26:05
US Open favorites (men’s) 29:45
How close is Stan Wawrinka in getting back to form? 33:12
Variation in ground stroke speed (e.g., Wawrinka, Stephens) 37:10
Simona Halep’s crazy schedule 47:05
Discussion of New Haven event 50:26
US Open qualifying matches (how to choose?) 57:00

Measuring a Season’s Worth of Luck

In Toronto last week, Stefanos Tsitsipas was either very clutch, very lucky, or both. Against Alexander Zverev in Friday’s quarter-final, he won fewer than half of all points, claiming only 56.7% of his service points, compared to Zverev’s 61.2%. The next day, beating Kevin Anderson in the semi-final in a third-set tiebreak, he again failed to win half of total points, holding 69.9% of his service points against Anderson’s 75.5%.

Whether the Greek prospect played his best on the big points or benefited from a hefty dose of fortune, this isn’t sustainable. Running those serve- and return-points-won (SPW and RPW) numbers through my win probability model, we find that–if you take luck and clutch performance out of the mix–Tsitsipas had a 27.8% chance of beating Zverev and a 26.5% chance of beating Anderson. These two contests–perhaps the two days that have defined the youngster’s career up to this point–are the very definition of “lottery matches.” They could’ve gone either way, and over a long enough period of time, they’ll probably even out.

Or will they? Are some players more likely to come out on top in these tight matches? Are they consistently–dare I say it–clutch? Using this relatively simple approach of converting single-match SPW and RPW rates into win probabilities, we can determine which players are winning more or less often than they “should,” and whether it’s a skill that some players consistently display.
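The win probability model itself isn't spelled out here, so what follows is only a sketch of one common way to turn SPW and RPW into a match win probability. It assumes that every point is independent, that each player wins serve points at a constant rate strictly between 0 and 1, and that every set ends in a tiebreak at 6-6. The function names are illustrative, and the output need not match the figures quoted above.

```python
from functools import lru_cache
from math import comb


def game_win_prob(p):
    """Chance the server wins a game, given per-point win probability p."""
    q = 1.0 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)  # wins to love, 15, or 30
    reach_deuce = 20 * p**3 * q**3                 # 3-3 in points
    from_deuce = p**2 / (1 - 2 * p * q)            # win two in a row, eventually
    return before_deuce + reach_deuce * from_deuce


def tiebreak_win_prob(pa, pb):
    """Chance player A wins a first-to-seven tiebreak.

    pa: A's chance of winning a point on A's serve (A's SPW).
    pb: A's chance of winning a point on B's serve (A's RPW).
    """
    @lru_cache(maxsize=None)
    def state(a, b):
        if a >= 7 and a - b >= 2:
            return 1.0
        if b >= 7 and b - a >= 2:
            return 0.0
        if a == 6 and b == 6:
            # From 6-6, each pair of points has one point on each serve.
            return pa * pb / (pa * pb + (1 - pa) * (1 - pb))
        # A serves the first point, then the serve changes every two points.
        a_serving = ((a + b + 1) // 2) % 2 == 0
        w = pa if a_serving else pb
        return w * state(a + 1, b) + (1 - w) * state(a, b + 1)

    return state(0, 0)


def set_win_prob(pa, pb):
    """Chance player A wins a tiebreak set, with A serving the first game."""
    hold = game_win_prob(pa)               # A wins a game on own serve
    brk = 1.0 - game_win_prob(1.0 - pb)    # A wins a game on B's serve
    tiebreak = tiebreak_win_prob(pa, pb)

    @lru_cache(maxsize=None)
    def state(ga, gb):
        if ga >= 6 and ga - gb >= 2:
            return 1.0
        if gb >= 6 and gb - ga >= 2:
            return 0.0
        if ga == 6 and gb == 6:
            return tiebreak
        w = hold if (ga + gb) % 2 == 0 else brk
        return w * state(ga + 1, gb) + (1 - w) * state(ga, gb + 1)

    return state(0, 0)


def match_win_prob(spw, rpw, best_of=3):
    """Chance of winning a match from single-match SPW and RPW rates."""
    s = set_win_prob(spw, rpw)
    need = best_of // 2 + 1
    # Win the required sets while losing k sets along the way.
    return sum(comb(need - 1 + k, k) * s**need * (1 - s)**k for k in range(need))


# Tsitsipas's rates against Zverev in Toronto, taken from the text.
print(round(match_win_prob(0.567, 0.388), 3))
```

Because the sketch strips out everything except the two rates, a player who wins a smaller share of serve points than his opponent always comes out below 50%, which is the sense in which luck and clutch play are "taken out of the mix."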

Odds in the lottery

Let’s start with some examples. When one player wins more than 55% of points, he is virtually guaranteed to win the match. Even at 53%, his chances are extremely good. Still, a lot of matches, particularly best-of-threes on fast surfaces, end up in the range between 50% and 53%, and that’s what is most interesting from this perspective.

Here are Tsitsipas’s last 16 matches, along with his SPW and RPW rates and the implied win probability for each:

Tournament  Round  Result  Opponent     SPW    RPW  WinProb  
Toronto     F      L       Nadal      62.9%  21.1%       3%  
Toronto     SF     W       Anderson   69.9%  24.5%      27%  
Toronto     QF     W       A Zverev   56.7%  38.8%      28%  
Toronto     R16    W       Djokovic   77.2%  32.0%      85%  
Toronto     R32    W       Thiem      83.3%  30.2%      93%  
Toronto     R64    W       Dzumhur    82.8%  35.0%      98%  
Washington  SF     L       A Zverev   54.7%  25.5%       1%  
Washington  QF     W       Goffin     71.2%  32.7%      67%  
Washington  R16    W       Duckworth  80.0%  37.5%      98%  
Washington  R32    W       Donaldson  59.5%  45.5%      74%  
Wimbledon   R16    L       Isner      72.5%  18.0%      10%  
Wimbledon   R32    W       Fabbiano   64.0%  55.9%     100%  
Wimbledon   R64    W       Donaldson  70.1%  40.9%      95%  
Wimbledon   R128   W       Barrere    71.5%  39.0%      94%  
Halle       R16    L       Kudla      59.7%  28.8%       8%  
Halle       R32    W       Pouille    78.3%  42.9%      99%

More than half of the matches are at least 90% or no more than 10%. But that leaves plenty of room for luck in the remaining matches. Thanks in large part to his last two victories, the win probability numbers add up to only 9.8 wins, compared to his actual record of 12-4. All four losses were rather one-sided, but in addition to the Toronto matches against Zverev and Anderson, his wins against David Goffin in Washington and, to a lesser extent, Novak Djokovic in Toronto, were far from sure things.
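The tally in the previous paragraph can be reproduced directly from the table. This short sketch sums the rounded WinProb column to get expected wins and counts the lopsided matches; the variable names are only for illustration.

```python
# WinProb column from the table above, most recent match first.
win_probs = [0.03, 0.27, 0.28, 0.85, 0.93, 0.98, 0.01, 0.67,
             0.98, 0.74, 0.10, 1.00, 0.95, 0.94, 0.08, 0.99]
# Result column from the same table, in the same order.
results = "LWWWWWLWWWLWWWLW"

expected_wins = sum(win_probs)          # each match contributes its win probability
actual_wins = results.count("W")
lopsided = sum(1 for p in win_probs if p >= 0.90 or p <= 0.10)

print(f"{expected_wins:.1f} expected wins, {actual_wins} actual wins, "
      f"{lopsided} of {len(win_probs)} lopsided")
# prints: 9.8 expected wins, 12 actual wins, 11 of 16 lopsided
```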

In the last two months, Stefanos has indeed been quite clutch, or quite lucky.

Season-wide views

When we expand our perspective to the entire 2018 season, however, the story changes a bit. In 48 tour-level matches through last week’s play (excluding retirements), Tsitsipas has gone 29-19. The same win probability algorithm indicates that he “should” have won 27.4 matches–a difference of 1.6 matches, or about five percent, which is less than the gap we saw in his last 16. In other words, for the first two-thirds of the season, his results were either unlucky or un-clutch, if only slightly. At the very least, the aggregate season numbers are less dramatic than his recent four-event run.

For two-thirds of a season, a five percent gap between actual wins and win-probability “expected” wins isn’t that big. For players with at least 30 completed tour-level matches this season, the magnitude of the clutch/luck effect extends from a 20% bonus (for Pierre Hugues Herbert) to a 20% penalty (for Sam Querrey, which he reduced a bit by beating John Isner in Cincinnati on Monday despite winning less than 49% of total points). Here are the ten extremes at each end, of the 59 ATPers who have reached the threshold so far in 2018:

Player                 Matches  Wins  Exp Wins  Ratio  
Pierre Hugues Herbert       30    16      13.2   1.22  
Nikoloz Basilashvili        34    17      14.0   1.21  
Frances Tiafoe              39    24      20.0   1.20  
Evgeny Donskoy              30    13      10.9   1.19  
Grigor Dimitrov             34    20      17.1   1.17  
Lucas Pouille               31    16      13.7   1.17  
Gael Monfils                34    21      18.3   1.15  
Daniil Medvedev             34    18      15.8   1.14  
Marco Cecchinato            33    19      16.7   1.14  
Maximilian Marterer         32    17      15.2   1.12  
…                                                      
Leonardo Mayer              37    19      20.1   0.95  
Guido Pella                 37    20      21.2   0.95  
Marin Cilic                 38    27      28.8   0.94  
Novak Djokovic              37    27      29.3   0.92  
Marton Fucsovics            30    16      17.5   0.92  
Joao Sousa                  36    18      19.8   0.91  
Dusan Lajovic               34    17      18.7   0.91  
Fernando Verdasco           43    22      24.5   0.90  
Mischa Zverev               39    18      20.7   0.87  
Sam Querrey                 30    15      18.8   0.80

A difference of three or four wins, as many of these players display between their actual and expected win totals, is more than enough to affect their standing in the rankings. The degree to which it matters depends enormously on which matches they win or lose, as Tsitsipas’s semi-final defeat of Anderson has a much greater impact on his point total than, say, Querrey’s narrow victory over Isner does for his. But in general, the guys at the top of this list are ones who have seen unexpected ranking boosts this season, while some of the guys at the bottom have gone the other way.

The last full season

Let’s take a look at an entire season’s worth of results. Last year, a few players–minimum 40 completed tour-level matches–managed at least a 20% luck/clutch bonus, but with the surprising exception of Daniil Medvedev, none of them have repeated the feat so far in 2018:

Player                 Matches  Wins  Exp Wins  Ratio  
Donald Young                43    21      16.2   1.30  
Fabio Fognini               58    35      28.5   1.23  
Jack Sock                   55    36      29.8   1.21  
Jiri Vesely                 45    22      19.3   1.14  
Daniil Medvedev             43    22      19.7   1.11  
John Isner                  57    36      32.3   1.11  
Damir Dzumhur               56    33      29.7   1.11  
Gilles Muller               48    30      27.1   1.11  
Alexander Zverev            74    53      48.1   1.10  
Juan Martin del Potro       53    37      33.6   1.10

A few of these players have had solid seasons, but posting a good luck/clutch number in 2017 is hardly a guarantee of repeating it in 2018, as the likes of Donald Young, Jack Sock, and Jiri Vesely can attest. Here is the same list, with 2018 luck/clutch ratios shown alongside last year’s figures:

Player                 2017 Ratio  2018 Ratio     
Donald Young                 1.30        0.89  *  
Fabio Fognini                1.23        1.10     
Jack Sock                    1.21        0.68  *  
Jiri Vesely                  1.14        1.08  *  
Daniil Medvedev              1.11        1.14     
John Isner                   1.11        0.96     
Damir Dzumhur                1.11        1.01     
Gilles Muller                1.11        0.84  *  
Alexander Zverev             1.10        1.06     
Juan Martin del Potro        1.10        1.07

* fewer than 30 completed tour-level matches

The average luck/clutch ratio of these ten players has fallen to a bit below 1.0.
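As a quick check on that statement, here is a sketch that averages the two columns of the table. It assumes a simple unweighted mean across the ten players, which may not be exactly how the figure was computed.

```python
from statistics import mean

# Ratios from the table above, in the same player order.
ratios_2017 = [1.30, 1.23, 1.21, 1.14, 1.11, 1.11, 1.11, 1.11, 1.10, 1.10]
ratios_2018 = [0.89, 1.10, 0.68, 1.08, 1.14, 0.96, 1.01, 0.84, 1.06, 1.07]

print(f"2017 mean: {mean(ratios_2017):.2f}")
print(f"2018 mean: {mean(ratios_2018):.2f}")
# prints 1.15 for 2017 and 0.98 for 2018
```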

Unsustainable luck

You can probably see where this is going. I generated full-season numbers for each year from 2008 to 2017, and identified those players who appeared in the lists for adjacent pairs of seasons. If luck/clutch ratio is a skill–that is, if it’s more clutch than luck–guys who post good numbers will tend to do so the following year, and those who post lower numbers will be more likely to remain low.

Across 325 pairs of player-seasons, that’s not what happened. There is almost no relationship between one year of luck/clutch ratio and the next. The r^2 value–a measure of correlation–is 0.07, meaning that the year-to-year numbers are close to random.
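For readers who want to run the same kind of check, here is a minimal sketch of the pairing and the r^2 calculation. The full 2008-2017 dataset isn't reproduced in this post, so the example uses only six 2017/2018 pairs from the table above as stand-in data; the dictionary layout and function name are assumptions, and the result from six hand-picked players says nothing about the 325-pair figure.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Illustrative luck/clutch ratios keyed by (player, season).
ratios = {
    ("Fognini", 2017): 1.23, ("Fognini", 2018): 1.10,
    ("Medvedev", 2017): 1.11, ("Medvedev", 2018): 1.14,
    ("Isner", 2017): 1.11, ("Isner", 2018): 0.96,
    ("Dzumhur", 2017): 1.11, ("Dzumhur", 2018): 1.01,
    ("A Zverev", 2017): 1.10, ("A Zverev", 2018): 1.06,
    ("del Potro", 2017): 1.10, ("del Potro", 2018): 1.07,
}

# Keep only players who appear in two adjacent seasons.
pairs = [(ratio, ratios[(player, year + 1)])
         for (player, year), ratio in ratios.items()
         if (player, year + 1) in ratios]

this_year = [a for a, _ in pairs]
next_year = [b for _, b in pairs]
print(len(pairs), "pairs, r^2 =", round(r_squared(this_year, next_year), 3))
```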

Across sports, analysts have found plenty of similar results, and they are often quick to pronounce that “clutch doesn’t exist,” which leads to predictable rejoinders from the laity that “of course it does,” and so on. It’s boring, and I’m not particularly interested in that debate. What this specific finding shows is:

This type of luck, defined as winning more matches than implied by a player’s SPW and RPW in each match, is not sustainable.

What Tsitsipas accomplished last weekend in Toronto was “clutch” by almost any definition. What this finding demonstrates is that a few such performances, or even a season’s worth of them, don’t make it any more likely that he’ll do the same next year. Another possibility is that the players who stick at the top level of professional tennis are all clutch in this sense, so while Tsitsipas might be quite mentally strong in key moments, he’ll often run up against players who have similar mental skills, and he won’t be able to consistently win these close matches.

If Stefanos is able to maintain a ranking in the top 20, which seems plausible, he’ll probably need to win more serve and return points than he has so far. Fortunately for him, he’s still almost eight years younger than his typical peer, so he has plenty of time to improve. The occasional lottery matches that tilt his way will need to be mere bonuses, not the linchpin of his strategy to reach the top.

Podcast Episode 28: Tsitsipas, Sloane, and the Serve Clock

Episode 28 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, is our triumphant (re-) return to podcasting. We cover all the highlights from last week’s Rogers Cup, especially the breakout week from Stefanos Tsitsipas, the strong showing from Sloane Stephens, and the quiet dominance of Rafael Nadal and Simona Halep.

We also cover tennis’s new serve clock, which has entered the game surprisingly smoothly … but without much effect on the pace of play. Thanks for listening!

(Note: this week’s episode is about 63 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Update: Episode index with links, thanks to FBITennis:

Tsitsipas in Toronto 1:17
Performance Byes [pp 45-46 in linked pdf] 8:05
Next Gen One-Handed Backhands 13:29
Sloane Stephens’ Ranking Profile 19:38
Effectiveness of Kamau Murray 27:40
Do the Canadian Results Affect US Open Favorites? 33:00
Dimitrov defending in Cincinnati 37:30
Federer returns in Cincinnati 40:55
Muguruza defending in Cincinnati 45:06
Average Age of ATP Top 50 Declining 48:30
Serve Clock 52:34

Maybe, Finally, The Next Generation is Here

Italian translation at settesei.it

Alexander Zverev is winning Masters titles. Stefanos Tsitsipas is beating top ten players. Denis Shapovalov, Frances Tiafoe, and even Alex De Minaur are making life more difficult for ATP veterans.

For most of the last decade, the story of men’s tennis has been the degree to which the game is getting older. Even now, thirty-somethings hold half of the places in the top ten. Wave after wave of hyped prospects have failed to take over the sport, settling in for a long fight to the top. On Monday, Juan Martin del Potro, once hailed as the man who would topple the Big Four, will reach a new career-best ranking of No. 3 … six weeks away from his 30th birthday.

At last, though, men’s tennis appears to be getting younger. Teenagers Shapovalov and De Minaur, along with 20-year-old Tiafoe, are rising just as some of the game’s crustiest vets are on their way out: 36-year-olds David Ferrer and Julien Benneteau are calling it quits this year, tumbling in the rankings alongside the likes of Feliciano Lopez and Ivo Karlovic.

The result is that the average age of the ATP top 50 is falling–something it hasn’t done for a really, really long time. The following graph shows the average age of the top 50 at the end of every season since 1983, plus–the rightmost data point–the mean age of the current top 50:

At the end of 2017, the average age was 29.0 years; it has since fallen to 27.75. That’s bigger than any single-year swing (up or down) in the last 35 years. As the graph shows, there were plenty of “down” years in the late 1990s and early 2000s, but none of them had even half the magnitude of the current drop.

There’s still an enormous gap between the current state of affairs and the days when men’s tennis was young. If we expand our view to the top 100, this year’s shift is less dramatic–with Ferrer, Benneteau, Lopez and others ranked between 51 and 100, that average still sits at 28.1 years, only about seven months younger than the corresponding number at the end of last season. But even that weaker evidence of a youth movement points in the same direction: 28.1 years is the youngest the top 100 has been since 2012.

Barring fundamental changes in rules or equipment, we’re unlikely to return to the teenage-driven game of the early 1990s. But after a decade of waiting, watching, and wondering, we can see some cracks in the greatest generation of men’s tennis. And finally, there’s a group of young players ready to take advantage.

The Cost of a Double Fault

We all know that double faults aren’t good, but it’s less clear just how bad they are. Over the course of an entire match, a single point here or there doesn’t seem to matter too much, especially when a double fault creeps in at a harmless moment, like 40-love. Yet many missed second serves are far more costly. Let’s try to quantify the impact of tennis’s most enervating outcome.

To do this, we need to think in terms of win probability. In each match, a player wins a certain percentage of service points and a certain percentage of return points. If those rates are sufficiently dominating–say, Mihaela Buzarnescu’s 65% of service points won and 59% of return points won in last week’s San Jose final–the player’s chance of winning the match is 100%. No matter how unlucky or unclutch she was, those percentages result in a win. But in a close contest, in which both players win about 50% of points (often referred to as “lottery matches”), the result is heavily influenced by clutch play and luck. In Buzarnescu’s tour de force, flipping the result of a single point would be meaningless. But in a tight match, like the Wimbledon semifinal between John Isner and Kevin Anderson, a single point could mean the difference between a spot in the championship match and an early flight home.

My aim, then, is to measure the average win probability impact of a double fault. To take another example, consider last week’s Washington quarter-final between Andrea Petkovic and Belinda Bencic. Bencic won nearly 51% of total points–59% of her service points and 42% on return–but lost in a third-set tiebreak. Those serve and return components were enough to give her a 56.3% chance of winning the match: claiming more than half of total points usually results in victory, but so close to 50%, there’s plenty of room for things to go the other way.

I refer to this match because double faults played a huge role. Bencic tallied 12 double faults in 105 service points, a rate of 11.4%, more than double the WTA tour average of 5.1%. Had she avoided those 12 double faults and won those points at the same rate as her other 93 service points, she would have ended up with a much more impressive service-points-won rate of 67%. Combined with her 42% rate of return points won, that implies an 87% chance of winning the match–more than 30 percentage points higher than her actual figure! Roughly speaking, each of her 12 double faults cost her a 2.5% chance (30% divided by 12) of winning the match.
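The arithmetic in that paragraph can be laid out step by step. This sketch uses only the rounded figures quoted in the text (59% of service points won, and the 56.3% and 87% win probabilities), so its results can differ slightly from numbers computed on exact point counts; the variable names are illustrative.

```python
service_points = 105
double_faults = 12
spw = 0.59                      # rounded service-points-won rate from the text

df_rate = double_faults / service_points
points_won = spw * service_points              # roughly 62 points
other_points = service_points - double_faults  # the 93 non-double-fault points

# Every double fault is a lost point, so all of her points won came from the
# other 93 service points. Replacing each double fault with a typical one of
# those points leaves her winning at that same rate overall.
spw_without_df = points_won / other_points

actual_wp = 0.563               # win probability with the double faults (from the text)
counterfactual_wp = 0.87        # win probability without them (from the text)
cost_per_df = (counterfactual_wp - actual_wp) / double_faults

print(f"double fault rate: {df_rate:.1%}")
print(f"SPW without double faults: {spw_without_df:.0%}")
print(f"cost per double fault: {cost_per_df:.1%} of a win")
```

With the unrounded gap of 30.7 percentage points, the per-double-fault cost comes out just above the 2.5% quoted above, which divides a rounded 30 by 12.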

A double fault rate above 10% is unusual, but a cost of 2.5% per offense is not. When we run this algorithm across the breadth of the ATP and WTA tours, we find that the cost of double faults adds up fast.

Tour averages

Using the method I’ve described above–replacing double faults with average non-double-fault service points–and taking the average of all tour-level matches in 2017 and 2018 through last week’s tournaments, we find that the average WTA double fault costs a player 1.83% of a win. Put another way, every 55 additional double faults subtracts one match from the win column and adds one to the loss column.

In the men’s game, the equivalent number is 1.99% of a win. The slightly bigger figure is due to the fact that men, on average, win more service points, so the difference between a double fault and a successful service offering is greater.

There is, however, an alternative way we could approach this. By comparing double faults to all other service points, we’re trading a lot of the double faults for first serve outcomes. We might be more interested in knowing how a player would fare if his or her second serve were bulletproof–still eliminating double faults, but replacing them specifically with second serves instead of a generic mix of service points.

In that case, the algorithm remains very similar. Instead of replacing double faults with non-double-fault serve points, we replace them with non-double-fault second serve points. Then the cost of a double fault is a little bit less, because second serve points result in fewer points won than service points overall. The second-serve numbers are 1.61% per double fault in the women’s game and 1.70% per double fault in the men’s game. For the remainder of this post, I’ll stick with the generic service points, but one approach is not necessarily better than the other; they simply measure different things.
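To put the four per-double-fault costs on the same footing as the "every 55 double faults" figure, the sketch below simply inverts each one. The dictionary labels are just for display.

```python
# Average cost of one double fault, as a fraction of a win (from the text).
cost_per_double_fault = {
    "WTA, vs. all service points": 0.0183,
    "ATP, vs. all service points": 0.0199,
    "WTA, vs. second serve points": 0.0161,
    "ATP, vs. second serve points": 0.0170,
}

# Number of extra double faults that add up to one full match lost.
double_faults_per_loss = {
    label: 1 / cost for label, cost in cost_per_double_fault.items()
}

for label, count in double_faults_per_loss.items():
    print(f"{label}: about {count:.0f} double faults per lost match")
```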

Building a player-specific stat

Odious as double faults are, they are not completely avoidable. Very few players are able to sustain a double fault rate below 2%, and tour averages are around twice that. Since the beginning of 2017, the ATP average has been about 3.9%, and the WTA average roughly 5.1%, as we saw above.

We can measure players by considering their match-by-match double fault rates compared to tour average. In Bencic’s unfortunate case, her 12 double faults were 6.7 more than a typical player would’ve committed in the same number of service points. In contrast, in the same match, Petkovic recorded only 3 double faults in 102 service points, 2.2 double faults fewer than an average player would have.
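Here is a small sketch of that comparison to tour average. It uses the rounded 5.1% WTA rate, so Bencic comes out at about 6.6 extra double faults rather than the 6.7 quoted above, which was presumably computed from an unrounded average. The function name is hypothetical.

```python
WTA_DF_RATE = 0.051   # rounded tour-average double fault rate from the text


def excess_double_faults(double_faults, service_points, tour_rate=WTA_DF_RATE):
    """Double faults above (positive) or below (negative) tour average."""
    return double_faults - tour_rate * service_points


bencic = excess_double_faults(12, 105)
petkovic = excess_double_faults(3, 102)
print(f"Bencic: {bencic:+.1f}, Petkovic: {petkovic:+.1f}")
```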

We know that each WTA double fault affects a player’s chances of winning the match by 1.83%, so compared to an average service performance, Bencic’s excessive service errors cost her about a 12% chance of winning (6.7 times 1.83%), while Petkovic’s stinginess increased her own odds by about 4% (2.2 times 1.83%).

Repeat the process for every one of a player’s matches, and you can assemble a longer-term statistic. Let’s start with the WTA players who, since the start of last season, have cost themselves the most matches (“DF Cost”–negative numbers are bad), along with those who have most improved their lot by avoiding double faults:

Player                   DF%  DF Cost  
Kristina Mladenovic     7.7%    -3.84  
Daria Gavrilova         7.9%    -3.77  
Jelena Ostapenko        7.7%    -3.58  
Petra Kvitova           8.1%    -3.01  
Camila Giorgi           8.3%    -2.63  
Oceane Dodin           10.2%    -2.51  
Donna Vekic             7.0%    -1.91  
Venus Williams          6.7%    -1.71  
Coco Vandeweghe         6.4%    -1.60  
Aliaksandra Sasnovich   6.7%    -1.55  
…                                      
Agnieszka Radwanska     2.3%     1.27  
Sloane Stephens         2.1%     1.43  
Caroline Wozniacki      3.2%     1.43  
Barbora Strycova        3.5%     1.47  
Elina Svitolina         3.9%     1.48  
Simona Halep            3.5%     1.53  
Qiang Wang              2.6%     1.54  
Anastasija Sevastova    3.1%     1.57  
Carla Suarez Navarro    2.1%     1.67  
Caroline Garcia         3.6%     1.82

And the same for the men:

Player                  DF%  DF Cost  
Benoit Paire           6.2%    -4.51  
Ivo Karlovic           5.8%    -3.63  
Fabio Fognini          5.0%    -2.38  
Denis Shapovalov       6.3%    -2.26  
Grigor Dimitrov        5.1%    -2.25  
Gael Monfils           5.0%    -2.22  
David Ferrer           5.2%    -2.06  
Jeremy Chardy          5.3%    -2.00  
Fernando Verdasco      4.8%    -1.94  
Jack Sock              4.8%    -1.73  
…                                     
Roger Federer          2.1%     0.88  
Tomas Berdych          2.9%     0.89  
Juan Martin del Potro  2.8%     0.93  
Albert Ramos           3.1%     0.97  
Pablo Carreno Busta    2.2%     1.07  
Richard Gasquet        2.6%     1.12  
John Isner             2.6%     1.23  
Dusan Lajovic          1.9%     1.23  
Denis Istomin          1.9%     1.23  
Philipp Kohlschreiber  2.5%     1.24

Situational double faults

These aggregate numbers have the potential to hide a lot of information. They consider only two things about each match: how many double faults a player committed, and how close the match was. This statistic would treat Bencic the same whether she hit nine of her double faults at 40-love, or nine of her double faults in the third-set tiebreak. Yet the latter would have a colossally greater impact.

While this is an important limitation to keep in mind, it appears that double faults are distributed relatively randomly. That is, most players do not hit a majority of their double faults in particularly high- or low-leverage situations. The player lists displayed above show both the most basic stat, double fault percentage, and my more complex measure. For players with at least 20 matches since the beginning of last season, double fault rate is very highly correlated with the match-denominated cost of double faults. (For men, r^2 = 0.752, and for women, r^2 = 0.789.) In other words, most of the variance in double fault cost can be explained by double fault rate, leaving little room for other factors, such as the importance of the situation when double faults are committed.

That said, there’s plenty of room for additional analysis of those specific situations. Instead of taking a match-level look at win probability, as I have here, one could identify the point score of every single one of a player’s double faults, and see how each event affected the win probability of that match. I suspect that, for most players, that would amount to a whole lot of extra complexity for not a lot of added insight, but perhaps there are some players who are uniquely able to land their second serve when it matters most, or particularly prone to double faults at key moments. This match-level look has made it clear how costly double faults can be, and it’s possible that for some players, missed serves are even more damaging than that.

How Servers Respond To Double Faults

Italian translation at settesei.it

In the professional game, double faults are quite rare. They sometimes reflect a momentary lapse in concentration, and can negatively impact a server’s confidence. Players are sometimes particularly careful after losing a point to a double fault, taking some speed off their next delivery, or aiming closer to the middle of the box.

Let’s dig into some data from last year’s grand slams to see what players do–and how it affects their results–immediately after double faults. IBM’s Slamtracker provided point-by-point data for most 2017 grand slam singles matches, including serve speed and direction, and the available matches give us about 5,000 double faults to work with. (I’ve organized the data and made it freely available here.)

For each server in each match, I’ve tallied their results on points immediately following double faults. (That means that we exclude after-double-fault points when the double fault ended the game.) Then, for each player, I compared those results with match-long averages. Because double faults are so unusual, and because we only have this data for the majors, the sample isn’t adequate to tell us much about individual players. But for tour-wide analyses, it’s more than enough.
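The tallying step described above might look something like the following sketch. The `Point` record and its field names are hypothetical, not the layout of the Slamtracker data, and the four games of sample points are invented purely to exercise the logic.

```python
from collections import namedtuple

# One record per point, in match order (hypothetical layout).
Point = namedtuple("Point", "server game double_fault server_won")


def after_double_fault_tally(points):
    """Return (points won, points played) on serves right after a double fault.

    A double fault that ends a game is skipped, because the next point
    belongs to a different game with a different server.
    """
    won = played = 0
    for prev, cur in zip(points, points[1:]):
        if prev.double_fault and cur.game == prev.game:
            played += 1
            won += cur.server_won
    return won, played


points = (
    # Game 1: A holds, with one double fault at 15-0.
    [Point("A", 1, False, True), Point("A", 1, True, False)]
    + [Point("A", 1, False, True)] * 3
    # Game 2: B is broken, with two double faults.
    + [Point("B", 2, True, False), Point("B", 2, False, False),
       Point("B", 2, False, True), Point("B", 2, True, False),
       Point("B", 2, False, False)]
    # Game 3: A is broken, and a double fault ends the game.
    + [Point("A", 3, False, False)] * 3 + [Point("A", 3, True, False)]
    # Game 4: B holds to love.
    + [Point("B", 4, False, True)] * 4
)

won, played = after_double_fault_tally(points)
overall = sum(p.server_won for p in points) / len(points)
print(f"after double faults: {won}/{played}; match-long rate: {overall:.0%}")
```

In a real analysis the two rates would be computed per server per match and then aggregated, as the paragraph above describes; the sample here lumps both servers together only to keep the example short.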

Serve points won: As we’ll see in a moment, men and women have different overall tendencies on the point following a double fault. But by the most important measure of simply winning the next point, gender plays little part. Men, who in this sample win 65.1% of service points, fall just over one percentage point to 64.0% on the point following a double fault. Women, who average 57.8% of service points won, drop even more, to 56.1% after a double.

First serve percentage: I expected that servers become more conservative immediately after a double fault. For women, that hypothesis is correct: In these matches, they land 63.3% of their first serves, while after a double fault, that number jumps to 65.4%. On the other hand, men don’t seem to change their approach very much. On average, they make 62.3% of their first offerings, a number that barely changes, to 62.5%, after double faults.

First serve points won: Here is additional evidence that women become more conservative after double faults, while men do not. In general, women win 63.7% of their first serve points, but just after a double fault, that number drops to 62.9%. For men, there is a decrease in first serve points won, but it is almost as small as their difference in first serve percentage: 72.7% overall, 72.4% after a double fault.

First serve speed: With serve speed, we run into a limitation of the Slamtracker data, which gives us speed only for those serves that go in. So when we look at the average speed of first serves, we’re excluding attempts that miss the box. Even with that caveat, the data keeps pointing in the same direction. Contrary to my “conservative” hypothesis, men serve a bit faster than usual after a double fault–183.3 km/h following doubles, versus 182.8 km/h in general. Women do seem to change their tactics, dropping from an average speed of 155.5 km/h to a post-double-fault pace of 152.2 km/h.

First serve direction: Slamtracker divides serve direction into five categories: wide, body-wide, body, body-center, and center. After a double fault, men are less likely than usual to hit a wide serve (24.1% to 25.8%), and those serves get split roughly evenly between the body and center categories. The difference in body serves is most striking: They account for only 3.5% of first serves overall, but 4.4% of post-double first serves. This may be the one way in which men opt for the conservative path, by maintaining speed but giving themselves a wider margin of error.

Women move many of their after-double-fault serves toward the middle of the box. On average, over 44% of serves are classified as either “wide” or “center,” but immediately after a double fault, that number drops below 41%. It’s not a huge difference, but like all of the other tendencies we’ve seen in the women’s game, it suggests that for many players, caution creeps in immediately after missing a second serve.

Tactics

As usual, it’s difficult to move from these sorts of findings to any sort of tactical advice. Even the first data point, that both men and women win fewer service points than usual right after they’ve double faulted, can be interpreted in multiple ways. By one reading, players may be serving too conservatively, missing out on the benefits of big first serves. On the other hand, if confidence is an issue, perhaps serving more aggressively would just result in more misses.

When in doubt, we have to trust that the players and coaches know what they’re doing–they’ve honed these tradeoffs through decades of experience and thousands of hours of match play. For fans, these numbers add to our understanding of the conclusions that players have reached. For the pros, perhaps a more detailed look at what happens after a double fault would help tweak their own strategies, both bouncing back from their own double faults and taking advantage of the lapses in concentration of their opponents.

Men’s Doubles Season Starts and the Case of Oliver Marach and Mate Pavic

This is a guest post by Peter Wetz.

In recent years, the steady decline of the holders of 116 doubles titles, Bob and Mike Bryan, has resulted in more variety at the very top of the game. The 16-time Grand Slam champions won their last major at the 2014 US Open. Since then, eight different teams have won their first title at the highest level of the sport.

Even though none of these debut winners emerged out of nowhere, the doubles team consisting of Oliver Marach and Mate Pavic, which formed in the middle of last season, has enjoyed an exceptional run at the start of this season. This prompted me to take a closer look at the performance of doubles teams per season.

The following table shows each team’s won-loss record through the French Open for each season since 2000. It’s sorted by number of wins up to that point, and the last column displays the won-loss record for the complete season. Only teams that won more than 30 matches through the French Open are listed.

Year	Team		W-L (%) Start	W-L (%) Full
2013	Bryan/Bryan	40-4  (91%)	71-11 (87%)
2002	Knowles/Nestor	38-7  (84%)	66-14 (82%)
2007	Bryan/Bryan	37-5  (88%)	73-10 (88%)
2008	Bryan/Bryan	37-9  (80%)	63-17 (79%)
2009	Bryan/Bryan	37-9  (80%)	68-18 (79%)
2014	Bryan/Bryan	36-6  (86%)	64-12 (84%)
2018	Marach/Pavic	36-7  (84%)	tbd
2010	Nestor/Zimonjic	35-7  (83%)	57-19 (75%)
2012	Mirnyi/Nestor	34-9  (79%)	43-18 (70%)
2003	Knowles/Nestor	34-9  (79%)	57-16 (78%)
2006	Bryan/Bryan	33-9  (79%)	65-15 (81%)
2004	Bryan/Bryan	32-8  (80%)	57-17 (77%)
2010	Bryan/Bryan	31-7  (82%)	67-13 (84%)
2011	Bryan/Bryan	31-7  (82%)	59-16 (79%)
2009	Nestor/Zimonjic	31-8  (79%)	57-17 (77%)
2014	Nestor/Zimonjic	31-8  (79%)	42-18 (70%)
2003	Bryan/Bryan	31-12 (72%)	54-20 (73%)

As we can see, Marach/Pavic come in seventh with a very healthy 36-7 won-loss record this year. Their first loss came in the Rotterdam final, their fourth tournament after collecting titles in Doha, Auckland, and at the Australian Open–a streak of 17 consecutive match wins. If we ignore the all-time greats, there hasn’t been a better start to a men’s doubles season in the past 16 years.

The fact that the Bryan twins show up ten out of seventeen times in the table underlines just how dominant they were. And even though they did not win a Grand Slam in the last three years, they still had the best season starts in 2015 and 2016 (just barely missing the table, because they did not reach 30 match wins).

The last column gives a clue of what to expect from Marach and Pavic for the rest of the year. Most of the time, the teams at the very top only slightly decline. Notably, in 2007 the Bryan brothers maintained a win percentage of 88%, which led to the best doubles season in the dataset, measured by won-loss record.
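One way to read that last column is to subtract the start-of-season record from the full-season record, which leaves each team’s record after the French Open. The Python sketch below does that for a handful of rows from the table. It assumes the full-season column includes the matches counted in the start column, and the helper names are made up for illustration.

```python
# Rows copied from the table above: (year, team, start W-L, full-season W-L).
# Marach/Pavic 2018 is left out because the full-season record is still "tbd".
RECORDS = [
    (2013, "Bryan/Bryan", (40, 4), (71, 11)),
    (2002, "Knowles/Nestor", (38, 7), (66, 14)),
    (2007, "Bryan/Bryan", (37, 5), (73, 10)),
    (2010, "Nestor/Zimonjic", (35, 7), (57, 19)),
    (2012, "Mirnyi/Nestor", (34, 9), (43, 18)),
    (2006, "Bryan/Bryan", (33, 9), (65, 15)),
]


def win_pct(wins, losses):
    """Win percentage on a 0-100 scale."""
    return 100.0 * wins / (wins + losses)


def rest_of_season(start, full):
    """Record after the French Open, assuming `full` includes `start`."""
    return (full[0] - start[0], full[1] - start[1])


def summarize(records):
    """Return (year, team, start %, rest-of-season %) for each row."""
    rows = []
    for year, team, start, full in records:
        rest = rest_of_season(start, full)
        rows.append((year, team, win_pct(*start), win_pct(*rest)))
    return rows


if __name__ == "__main__":
    for year, team, start_pct, rest_pct in summarize(RECORDS):
        print(f"{year} {team:<16} start {start_pct:5.1f}%  rest {rest_pct:5.1f}%")
```

Because the full-season figure is a blend of both parts of the year, the rest-of-season percentage always sits further from the start percentage than the last column does.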

Marach and Pavic lost their seventh match of the season in the 2018 French Open final to Herbert/Mahut, missing the chance to win the first two majors of the season–a feat achieved in the open era only by the Bryans in 2013. It will be interesting to see if they can sustain their level over a full season.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Everything You Always Wanted to Know About Marco Cecchinato’s Run to the Roland Garros Semifinal

This is a guest post by Peter Wetz.

When a 25-year-old Italian tennis player named Marco Cecchinato defeated Marius Copil in the first round of this year’s edition of Roland Garros, some people may have noticed that it was one of the longer first-round matches. With a duration of 3 hours and 41 minutes, the match was the fifth longest of the 64 opening-round matches. However, I am confident that no one suspected the winner of this encounter would go much farther in the draw. Little did we know.

After his unexpected four-set win in the quarterfinal against a hard-fighting Novak Djokovic–bookmakers were giving him about an 11 percent chance of winning–many tweets emphasized the uniqueness of this achievement. Since it is difficult to provide more context in a tweet, I was interested in just how often something like this happened in the past. So I looked into the data and came up with more complete lists of the tweeted facts, which are presented in the remainder of this post.

The first and obvious question is, when was the last time that a player ranked as low as Cecchinato reached a Grand Slam semifinal?

The following table shows players ranked outside of the top-70 that reached a Grand Slam semifinal. Rows denoting achievements at Roland Garros are bold.

Tourney Player		       Rank	Round
RG 18	Marco Cecchinato	 72	SF
W  08	Rainer Schuettler	 94	SF
W  08	Marat Safin		 75	SF
AO 04	Marat Safin		 86	F
W  01	Goran Ivanisevic	125	W
W  00	Vladimir Voltchkov	237	SF
RG 99	Andrei Medvedev		100	F
AO 99	Nicolas Lapentti	 91	SF
AO 98	Nicolas Escude		 81	SF
W  97	Michael Stich		 88	SF
RG 97	Filip Dewulf		122	SF
RG 92	Henri Leconte		200	SF
UO 91	Jimmy Connors		174	SF
AO 91	Patrick McEnroe		114	SF

As the tweet points out, the most recent comparable runs, by Rainer Schuettler and Marat Safin, happened after those players had already reached top-10 rankings. Hence, the most recent truly comparable run, one where the player had not yet reached his career-high ranking at the time of the tournament, is by Vladimir Voltchkov, who reached the semifinal at Wimbledon 2000.*

Another unique thing about Cecchinato’s run is that until last week he had not won a single match at a Grand Slam event.

The following table shows players that won their first match at a Grand Slam event and went on to win more matches. To prevent showing an extremely short table, I relaxed the condition on how far the player should have gone when winning his first Grand Slam match to reaching the quarterfinal. The last column, Attempts, denotes the number of main draw appearances until his first main draw win.

Tourney   Player	   Rank    Reached Attempts
RG 18	  Marco Cecchinato   72	   SF	   6
AO 18     Tennys Sandgren    97	   QF	   3
RG 03	  Martin Verkerk     46	   F	   3
W  00     Alexander Popp    114	   QF	   2
W  97	  Nicolas Kiefer     98	   QF	   3
RG 97	  Galo Blanco	    111	   QF	   4
W  96	  Alex Radulescu     91	   QF	   1
RG 95	  Albert Costa	     36	   QF	   4
RG 94     Hendrik Dreekmann  89	   QF	   2
AO 93	  Brett Steven	     71	   QF	   1

As the table shows, rarely has a player gotten past the quarterfinal after recording his debut win at a Grand Slam, with the notable exception of Martin Verkerk, who reached the final 15 years ago on his third attempt. Still–especially in the 1990s–there were a few players who won four consecutive matches. Not included in the table, but no less impressive, is the run by Mikael Pernfors. He had not won a single Grand Slam match, but had built himself a ranking of 26 by the time he reached the final of Roland Garros 1986, the tournament where he also won his first main draw match.

When looking at male Grand Slam competitors from Italy, not many names besides Fabio Fognini, Andreas Seppi, Simone Bolelli, and Paolo Lorenzi spring to mind. With 150 main draw appearances, the quartet shares a mere ten appearances in the round of 16 and one quarterfinal appearance (Fabio Fognini at Roland Garros 2011). Marco Cecchinato is the first Italian player in the semifinal of a Grand Slam in 40 years.

The following table shows all appearances of Italian players past the round of 16.

Tourney   Player	    	Reached
RG 18	  Marco Cecchinato  	SF
RG 11	  Fabio Fognini		QF
W  98	  Davide Sanguinetti 	QF
RG 95	  Renzo Furlan	     	QF
AO 91	  Cristiano Caratti  	QF
RG 80	  Corrado Barazzutti 	QF
W  79     Adriano Panatta	QF
RG 78	  Corrado Barazzutti	SF
UO 77	  Corrado Barazzutti	SF
RG 77	  Adriano Panatta	QF
RG 76	  Adriano Panatta	W
RG 75	  Adriano Panatta	SF
RG 73	  Paolo Bertolucci	QF
RG 73	  Adriano Panatta	SF
RG 72	  Adriano Panatta	QF

Despite the fact that male Italian players seem strongest on the dirt, none had reached a Grand Slam semifinal since 1978. Even Fabio Fognini’s quarterfinal appearance at Roland Garros 2011 was the first in 13 years. Marco Cecchinato is one win away from being the first Italian Grand Slam finalist since 1976.

Marco Cecchinato was not seeded. If we look at unseeded players who reached a Grand Slam semifinal, an interesting pattern appears.

Tourney Player  	    	Reached
RG 18	Marco Cecchinato  	SF
AO 18	Hyeon Chung		SF
AO 18	Kyle Edmund		SF
W  08	Rainer Schuettler	SF
W  08	Marat Safin		SF
RG 08	Gael Monfils		SF
AO 08	Jo Wilfried Tsonga	F
UO 06	Mikhail Youzhny		SF
W  06	Jonas Bjorkman		SF
AO 06	Marcos Baghdatis	F
UO 05	Robby Ginepri		SF
RG 05	Mariano Puerta		F
W  04	Mario Ancic		SF
RG 04	Gaston Gaudio		W
AO 04	Marat Safin		F
W  03	Mark Philippoussis	F
RG 03	Martin Verkerk		F
AO 03	Wayne Ferreira		SF
W  01	Goran Ivanisevic	W
UO 00	Todd Martin		SF
W  00	Vladimir Voltchkov	SF
RG 00	Franco Squillari	SF

Since 2008, this is only the third time that an unseeded player has reached a semifinal, and all three occurrences happened this year. It appears that we can again get used to seeing new faces deep into the second week of a Grand Slam tournament.

Finally, let’s take a look at Grand Slam semifinals between players using a one-handed backhand. The decreasing popularity of the one-hander has already been discussed here and with this in mind it seems even more remarkable that Dominic Thiem–the player Marco Cecchinato will face tomorrow in the semifinal–initially played a two-hander, but then changed to a one-hander.
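The gap is easy to see by tallying the table by year. The short Python sketch below does nothing more than count the rows listed above; the function names are made up for illustration, and the data is limited to what the table shows (seasons from 2000 onward).

```python
from collections import Counter

# (tournament, two-digit year) for every row of the table above.
UNSEEDED_SEMIFINALISTS = [
    ("RG", 18), ("AO", 18), ("AO", 18),
    ("W", 8), ("W", 8), ("RG", 8), ("AO", 8),
    ("UO", 6), ("W", 6), ("AO", 6),
    ("UO", 5), ("RG", 5),
    ("W", 4), ("RG", 4), ("AO", 4),
    ("W", 3), ("RG", 3), ("AO", 3),
    ("W", 1),
    ("UO", 0), ("W", 0), ("RG", 0),
]


def count_by_year(rows):
    """Count unseeded semifinalists per calendar year (all years are 20xx)."""
    return Counter(2000 + yy for _, yy in rows)


def empty_years(counts, first, last):
    """Years in [first, last] with no unseeded semifinalist in the table."""
    return [year for year in range(first, last + 1) if counts[year] == 0]


if __name__ == "__main__":
    counts = count_by_year(UNSEEDED_SEMIFINALISTS)
    for year in sorted(counts, reverse=True):
        print(year, counts[year])
    print("No unseeded semifinalists:", empty_years(counts, 2000, 2018))
```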

Tourney Player 1	    	Player 2
RG 18	Marco Cecchinato  	Dominic Thiem
AO 17	Roger Federer		Stanislas Wawrinka
UO 15	Roger Federer		Stanislas Wawrinka
W  09	Roger Federer		Tommy Haas
W  07	Roger Federer		Richard Gasquet
AO 07	Fernando Gonzalez	Tommy Haas
UO 04	Roger Federer		Tim Henman
UO 02	Pete Sampras		Sjeng Schalken
RG 02	Albert Costa		Alex Corretja
W  99	Pete Sampras		Tim Henman
UO 98	Patrick Rafter		Pete Sampras
W  98	Pete Sampras		Tim Henman

If we ignore Roger Federer and Stanislas Wawrinka, two players who brought the one-handed backhand back into discussion, the last Grand Slam semifinal between two one-handers was played between Fernando Gonzalez and Tommy Haas at the Australian Open 2007. Before that, Pete Sampras was involved in four of six such encounters. Without Roger and Pete the world of one-handed Grand Slam semifinals would look really thin.

Whatever the result of the semifinal between Marco Cecchinato and Dominic Thiem, we already know that Marco has achieved what only a few players have done before him, especially in recent years. Whether he will be able to repeat this feat at Wimbledon, where he will be seeded despite having never won a match on a grass court, is debatable. Still, placing a bet on his own first-round loss probably won’t be a good idea–at the very least, a lot more fans will be watching his opening match than ever before.

* A previous version of this article wrongly stated that the Wimbledon 2001 championship run by Goran Ivanisevic is more similar to Marco Cecchinato’s run. However, in 2001 Ivanisevic had already achieved his career high ranking, which is not the case for Cecchinato. Thanks to @rtwkr on Twitter for pointing this out.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Podcast Episode 27: Roland Garros Preview

Episode 27 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, will help get you ready for the 2018 French Open. We talk through the men’s and women’s draws, with a focus on the unpredictability of the women’s field, the towering presence of Rafael Nadal in the men’s, and the big-name floaters lurking in both brackets.

We also touch on Mike Bryan’s new doubles partner, Serena/Venus in the women’s doubles, and the long-delayed suspension of Nicolas Kicker for match fixing. Thanks for listening!

(Note: this week’s episode is about 48 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Update: Episode index with links, thanks to FBITennis:

Roland Garros Women’s Draw (and Serena v Sharapova) 1:08
How much do head-to-head records matter? 5:09
Top Quarter Analysis: Simona Halep as a weak favorite 8:39
Second Quarter Analysis: Muguruza, Sleeper? 12:21
Third Quarter Analysis: Potential Ostapenko v Svitolina Quarterfinal 13:51
Fourth Quarter Analysis: Kvitova favorite, but Jeff has a “Super Dark Horse” 15:36
Jeff’s and Carl’s Picks on the Women’s Side 17:40
Roland Garros Men’s Draw:  Nadal’s Tournament to Win 19:57
Do you take Kyle Edmund for your Fantasy Tennis Team? 24:56
Or Lucas Pouille? 27:44
Third Quarter Analysis: Djokovic’s quarter 29:22
Fourth Quarter Analysis:  Zverev over Thiem, and can Wawrinka or Nishikori surprise? 31:55
Draw Luck on the Men’s Side 34:58
Men’s Doubles:  Bryan/Bryan Streak Broken 37:19
Women’s Doubles:  Williams Sisters 39:17
Kicker Kicked to the Curb; Match Fixing 40:48
On Demand Video of Roland Garros Qualies 45:38

Unseeded Serena and the Roland Garros Draw

In a wide-open women’s field at this year’s French Open, it seems fitting that one of the most dangerous players in the draw isn’t even seeded. Serena Williams has played only four matches–none of them on clay–since returning to tour after giving birth. As such, her official WTA ranking is No. 453, and her current match-play level is anyone’s guess.

Because her ranking is low, she needed to use the ‘special ranking’ rule to enter the tournament, and the rule doesn’t apply to seedings. (I’m not going to dive further into the debate about how the rule should work–I’ve written a lot about the rule in the past.) As an unseeded player, she could have drawn anyone in the first round; in that sense, she was a bit lucky to end up opposite another unseeded player, Kristyna Pliskova. Her wider draw section is manageable as well, with a likely second-round match against 17th seed Ashleigh Barty and a possible third-rounder with 11th seed Julia Goerges. If she makes it to the round of 16, we’ll probably be treated to a big-hitting contest between Serena and Karolina Pliskova or Maria Sharapova.

According to my Elo-based forecast, a best guess about the level of post-pregnancy Serena is that she’s the 7th best overall player in the field, and 9th best on clay. That gives her about a 40% chance of winning her first three matches and reaching the second week, a 6.2% chance of making it to the final, and a 3.1% chance of adding yet another major title to her haul.
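To give a feel for where a number like the chance of winning three straight matches can come from, here is a minimal Python sketch using the standard Elo expected-score formula. It is not the forecast model behind the figures above: the ratings are hypothetical, and it assumes the three opponents are known in advance, whereas a full forecast has to weigh every possible opponent in each round.

```python
def elo_win_prob(rating_a, rating_b):
    """Standard Elo expected score: chance that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def prob_win_all(player_rating, opponent_ratings):
    """Chance of beating every listed opponent, treating matches as independent."""
    prob = 1.0
    for opponent in opponent_ratings:
        prob *= elo_win_prob(player_rating, opponent)
    return prob


if __name__ == "__main__":
    # Hypothetical ratings, for illustration only.
    player = 2000
    path = [1850, 1950, 1980]
    print(f"Chance of winning all three: {prob_win_all(player, path):.1%}")
```

Multiplying the match probabilities is what makes even a clear favorite in each round look less certain over three matches.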

What if she were seeded? Seeds are a clear advantage for players who receive them, as a seeding protects against facing other top contenders until later rounds. By simulating the tournament with Serena seeded, we can get a sense of how much the WTA’s rule (and the French Federation’s decision not to seed her) impacts her chances.

Seeded 7th: Let’s imagine a bizarre world in which my Elo ratings were used for tournament seedings. In that case, Serena would be seeded 7th, knocking Caroline Garcia down to 8th and sending current 32nd seed Alize Cornet into the unseeded pool. That would be a clear advantage: 50/50 odds of reaching the fourth round, a 9% chance of playing in the final, and a 4.4% shot at the title, compared to 3.1% in reality.

Seeded 1st: If seeds were assigned based on protected ranking, Serena would be the top seed. You can’t get much more of an advantage than that: The top seed is protected from playing either of the other top-four seeds until the semifinals, for instance. (It’s no insurance against a meeting with 28th seed Sharapova, but Serena, of all people, isn’t worried about that.) Moving from 7th to 1st would give her another boost, but it’s a modest one: As the top seed, her chances of sticking around for the second week would still be 50/50, with 10.1% and 4.7% odds of reaching the final and winning the title, respectively.

Here’s a summary of Serena’s chances in the various seeding scenarios. The final column is “expected points”–a weighted average of the number of WTA ranking points she is expected to collect, given her likelihood of reaching each round.

Scenario     R16  Final  Title  ExpPts  
Actual     39.8%   6.2%   3.1%     273  
Unseeded*  34.4%   6.2%   3.0%     259  
Seeded 7   50.3%   9.0%   4.4%     356  
Seeded 1   50.5%  10.1%   4.7%     371

* the ‘unseeded’ scenario represents Serena’s chances as an unseeded entrant, given a random draw. She got a little lucky, avoiding top players until the 4th round, though her chances of making the final end up the same.
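The “expected points” column can be reproduced in spirit with a short calculation: weight the points for each possible finish by the probability of finishing there. The Python sketch below assumes the Grand Slam points schedule used on the WTA tour in 2018 (2000 for the title down to 10 for a first-round loss; treat these values as an assumption). Since the forecast’s round-by-round probabilities aren’t listed here, it uses a purely hypothetical player who wins every match with probability one half.

```python
STAGES = ["R128", "R64", "R32", "R16", "QF", "SF", "F", "W"]

# Points for a run that ends at each stage ("W" = wins the title).
# Assumed WTA Grand Slam schedule; not taken from the article.
POINTS = {"R128": 10, "R64": 70, "R32": 130, "R16": 240,
          "QF": 430, "SF": 780, "F": 1300, "W": 2000}


def expected_points(reach, points=POINTS):
    """Weighted average of points, given P(reaching each stage).

    `reach[stage]` is the probability of getting at least that far;
    `reach["W"]` is the probability of winning the title.
    """
    total = 0.0
    for i, stage in enumerate(STAGES):
        if stage == "W":
            p_finish_here = reach[stage]
        else:
            # Reached this stage but not the next one.
            p_finish_here = reach[stage] - reach[STAGES[i + 1]]
        total += p_finish_here * points[stage]
    return total


if __name__ == "__main__":
    # Hypothetical coin-flip player: 50% to win each of seven matches.
    coin_flip = {stage: 0.5 ** i for i, stage in enumerate(STAGES)}
    print(round(expected_points(coin_flip), 1))
```

Plugging in a real forecast’s probability of reaching each round in place of the coin-flip values would give a figure comparable to the ExpPts column.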

Seeds matter, though there’s only so much they can do. If Serena really is at a barely-top-ten level, she’s a long shot for the title regardless of whether there’s a number next to her name. If my model grossly underestimates her and she’s back at previous form–let’s not forget, she made the final the last time she played here, and won the title the year before that–then the rest of the field will once again look like a bunch of flies for her to swat away, regardless of which numbers they have next to their names.