jsackmann – Page 66 – Heavy Topspin

Roger Federer, Lottery Winner

In today’s third-round match in Rome, Roger Federer posted a truly unusual stat line. He beat Borna Coric in three sets, 2-6 6-4 7-6(7), winning 95 points to Coric’s 107. That’s a total-points-won rate (TPW) of 47.0%, not unheard of for a match winner, but near the lower limit of what’s possible. By Dominance Ratio (DR)–the ratio of return points won to serve points lost–Fed comes out at 0.78, where 1.0 represents an evenly-split match. He has won only 24 times in his career with a DR below 1.0, and today was the first time since 2015. These types of decisions are often referred to as “lottery matches,” because there is more luck than usual involved in the result.

Not only did Federer win the match with a TPW below 50% and a DR below 1.0, all three of his individual sets were below those numbers. He won 23 of 55 points in the first set, 31 of 64 in the second, and 41 of 83 in the third. The low total in the first set is to be expected–he lost that set badly. But often, low numbers for an entire match stem from a bad performance in a single set, like the swoon in a 7-6 1-6 7-6 contest. That wasn’t the case today: Coric outplayed him–narrowly, at least–in all three sets.
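The set-by-set arithmetic is easy to verify. Here is a minimal sketch that recomputes TPW from the point counts quoted above. The function names are my own, and `dominance_ratio` reads the DR definition as a ratio of rates, which is an assumption; it is included only to show the calculation, since the serve/return split for this match isn't given here.

```python
def tpw(points_won, points_played):
    """Total points won, as a percentage of all points played."""
    return 100.0 * points_won / points_played


def dominance_ratio(return_won, return_played, serve_lost, serve_played):
    """DR: rate of return points won divided by rate of serve points lost.

    Assumption: the ratio is taken between rates rather than raw counts.
    """
    return (return_won / return_played) / (serve_lost / serve_played)


# (points won, points played) per set, as quoted above for Federer
federer_sets = [(23, 55), (31, 64), (41, 83)]

for number, (won, played) in enumerate(federer_sets, start=1):
    print(f"Set {number}: {tpw(won, played):.1f}%")

match_won = sum(won for won, _ in federer_sets)
match_played = sum(played for _, played in federer_sets)
print(f"Match: {match_won} of {match_played}, {tpw(match_won, match_played):.1f}%")
```

The three sets come out at 41.8%, 48.4%, and 49.4%, and the match total is 95 of 202, or 47.0%.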

You might suspect that this is extremely rare, and you’d be right. Only 4.5% of ATP tour-level matches end in favor of the player who won fewer points, and 7.2% go the direction of a player with a DR below 1.0. Those numbers usually overlap, but not always. Roughly 4.0% of matches are won by a player with a TPW below 50% and a DR below 1.0. Individual sets are even more likely to be awarded to the player who won more points. Just 2.4% of sets are won by the man who lost more points. The frequency of DR < 1.0 is 7.4%, about the same as at the match level.

It turns out that there is a precedent–exactly one!–for Fed’s feat, of winning a match with TPW < 50% and DR < 1.0 in each of three sets. That’s one previous occurrence in my dataset of point-by-point sequences for over 17,000 ATP tour-level matches since 2010. Inevitably, John Isner was involved. At Memphis in 2017, Isner lost his quarter-final match to Donald Young, 7-6 3-6 7-6. Young won only 46.9% of total points, and his DR was 0.66, both marks among the lowest you’ll ever see for a winner. Like Federer, Young came close in the sets he won, tallying 49.3% of all points in both the first and third set. By saving eight of nine break points and withstanding the Isner serve in the tiebreaks, Young managed to overcome a statistically superior opponent.

Federer’s victory today wasn’t particularly reliant on break point performance, though fans will be encouraged that he converted two of his four opportunities. Much has been written about Roger’s ineffectiveness in this sort of match–against his 24 wins with a sub-1.0 DR, he has 49 losses with a DR above 1.0–and break point futility is often to blame. While big servers tend to play a lot of close matches, Federer has managed to record plenty of wins without relying on the lucky ones.

With a guaranteed place in the prominent parts of the record book, Fed is making a move on the obscure pages in the back. Having repeatedly shown us that he can win matches by outplaying the guy on the other side of the net, he finally came up with a victory when the stats pointed in the other direction.

Podcast Episode 61: Reading Rafael Nadal’s Tea Leaves

Episode 61 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, struggles to draw conclusions from Rafael Nadal’s latest surprise loss in Madrid. The King of Clay has no titles in 2019–not even a clay-court final–but his longer-term track record still suggests he’s the favorite (or close) at Roland Garros.

We also cover the continued late blooming of Kiki Bertens, the surprise relevance of Roger Federer, the return of the always-dangerous Serena Williams, and the abysmal doubles record of Marco Cecchinato.

Thanks for listening!

(Note: this week’s episode is about 62 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Podcast Episode 60: Goodbye, David Ferrer. Hello, Cristian Garin

Episode 60 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, delves into the little-known group of dirt-ballers, including Garin, Matteo Berrettini, and Guido Pella, who are piling up the clay-court wins in 2019. We mull the vagaries of surface-specific Elo ratings, as well as the types of skills that might lead these guys to have crossover success on faster courts.

We also touch on the WTA results in Rabat and Prague, with a particular focus on the up-and-down career of Johanna Konta. Finally, we consider how David Ferrer stacks up against the best in the history of the sport, as he plays his last event in Madrid this week.

Thanks for listening!

(Note: this week’s episode is about 63 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Podcast Episode 59: More Surprises on Clay, and L’Affaire Gimelstob

Episode 59 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, re-evaluates Rafael Nadal’s status as clay-court favorite after his semi-final loss to Dominic Thiem in Barcelona. We also consider what Daniil Medvedev is doing right, even if it didn’t work against Thiem. We compare Medvedev’s accomplishments to those of another Russian, Karen Khachanov, and consider which set of skills is likely to lead to a better career. The same type of comparison is worth making for Istanbul finalist Marketa Vondrousova, whose counterpunching style differs from many of her teenage peers.

Finally, we dive into the muck of Justin Gimelstob’s assault case and tennis’s typically incoherent response.

Thanks for listening!

(Note: this week’s episode is about 72 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Podcast Episode 58: An Unexpected Introduction to the European Clay Season

Episode 58 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, recaps the Monte Carlo Masters and tries to make sense of Rafael Nadal’s semi-final loss to Fabio Fognini. We discuss how seriously to take the early exits of Nadal and Djokovic, as well as what the result tells us about Fognini. We also cover the Fed Cup final four and consider whether the women’s event should undergo a radical change next year to match the Davis Cup.

Thanks for listening!

(Note: this week’s episode is about 69 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Podcast Episode 57: Clay Court Specialists, Return Attackers, and Predictable Servers

Episode 57 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, reviews a week’s worth of ATP and WTA results, starting with Cristian Garin and Casper Ruud, the dirtballers who contested the Houston final. We consider the decline of clay-court specialization, and the more aggressive returning style favored by up-and-coming women’s stars such as Amanda Anisimova.

Finally, we express considerable befuddlement over my recent findings about Caroline Wozniacki’s extremely predictable serves.

Thanks for listening!

(Note: this week’s episode is about 62 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 8

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Boys Grand Slam Winners Developing as Pros, or “Geoffrey Blancaneaux, You’re On the Clock” (hiddengameoftennis.com)
Putting ELO on Your Radar (hiddengameoftennis.com)
What Can Match Stats Tell Us About Playing Styles? (on-the-t.com)
More Exploration on Using Match Stats to Classify Playing Styles (on-the-t.com)
Forecasting Future Felix With ATP Aging Patterns (tennisabstract.com/blog)
The Most Predictable Woman in Tennis (tennisabstract.com/blog)

Data

Match Charting Project: The dataset has grown by more than 75 matches in the last two weeks, from 5,439 to 5,517. We’ve added several more men’s and women’s major semi-finals from the 1990s, some vintage WTA Hilton Head and Berlin finals, along with the usual grab bag of recent matches.
Spinrate analysis from the Miami Masters (twitter.com/Vestige_du_jour)

Trivia

At the ITF Sunderland event, Tara Moore came back to win from a 0-6, 0-5, 30-40 deficit.
Amazingly, there’s an even longer-shot comeback in the WTA history books. In 1983 US Open qualifying, Barbie Bramblett was down 0-6, 0-5, 0-40 and stared down 18 match points before coming back to beat Ann Hulbert.
Compared to Moore’s comeback, most WTA oddities barely register, but here’s another: In Charleston, Kaia Kanepi came back from a 0-6 first set against Elise Mertens to win, 0-6 6-0 7-5, the first time since 2000 that any match (including ITFs) has been decided by that score.

Beyond the net

How Good Are FiveThirtyEight Forecasts? (fivethirtyeight.com)
The marathon has vanquished other long-distance running events (economist.com/gametheory)
Why the NHL’s best team in decades is unlikely to win the Stanley Cup (economist.com/gametheory)

Thanks to Peter for help with this week’s issue.

The Most Predictable Woman in Tennis

Italian translation at settesei.it

Caroline Wozniacki is set in her habits. In the eight service games of her first round match in Charleston against Laura Siegemund last week, she followed a strict pattern: wide serve on the first point, T serve on the second, T on the third, and wide on the fourth. Aside from two missed first serves that weren’t classified as “wide” or “T”, that’s 30 points. Wozniacki served in her preferred direction on all 30. From the fifth point in each game, her choices were closer to random.

This is nothing new for the Danish former No. 1. Against Monica Niculescu in the Miami third round, she had 11 service games. In the first four points of each, she followed the exact pattern: wide/T/T/wide. 44 service points, and zero deviations from the first-serve script. The Match Charting Project (MCP) has logged over 2,600 WTA matches, and no other player has ever gone an entire match without varying their first-four-point serve direction. Wozniacki has done so 17 times.
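For anyone who wants to run the same check on their own charting, here is a rough sketch of how one might count deviations from the wide/T/T/wide script. The input format (one list of first-serve directions per service game) and the function name are assumptions of mine, not the MCP's actual data layout, and the sample games are hypothetical.

```python
SCRIPT = ("wide", "T", "T", "wide")  # Wozniacki's pattern for points 1-4


def script_deviations(service_games, script=SCRIPT):
    """Count first serves on points 1-4 that follow or depart from the script.

    Each game is a list of first-serve directions in point order.
    Entries that are neither "wide" nor "T" (e.g. None for an
    unclassified serve) are skipped rather than counted.
    Returns a (followed, deviated) pair.
    """
    followed = deviated = 0
    for game in service_games:
        for actual, expected in zip(game[:len(script)], script):
            if actual not in ("wide", "T"):
                continue
            if actual == expected:
                followed += 1
            else:
                deviated += 1
    return followed, deviated


# Hypothetical input: 11 games that each open wide/T/T/wide, then vary.
games = [["wide", "T", "T", "wide", "T", "wide"] for _ in range(11)]
print(script_deviations(games))
```

With 11 on-script games, the function reports 44 serves that followed the pattern and none that deviated, which is the shape of the Niculescu match described above.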

Measuring serve predictability

Just how extreme is Caro’s reliability, and how much does she differ from the competition? Let’s take a look.

I classified each first serve as either “wide” or “T.” MCP coding provides for three categories (wide, body, and T), and where a serve is coded as “body,” I used the returner’s first shot as an indication of the serve direction. That’s not perfect, because some returners will run around a weak serve, but it gets us pretty close. I excluded unreturned body serves and body serve faults. Here is Caro’s percentage of wide serves for each point of over 1,000 charted service games:

Point  Wide%  
1st    82.8%  
2nd    17.4%  
3rd    16.7%  
4th    78.5%  
5th    52.3%  
6th    46.8%  
deuce  48.0%  
ad     50.6%
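The wide/T classification behind that table can be sketched roughly as follows. This is an illustration under stated assumptions, not the code I actually ran: the string labels, the function name, and the mapping from the returner's first shot to a direction are all hypothetical. In particular, it assumes a right-handed returner's forehand sits on the wide side in the deuce court and on the T side in the ad court, with a left-hander mirrored.

```python
def classify_first_serve(coded_direction, court, return_shot=None, returner_hand="R"):
    """Collapse a charted serve direction into "wide" or "T".

    Returns None when the serve can't be classified (a body serve with
    no return to go on, such as an unreturned serve or a fault).
    """
    if coded_direction in ("wide", "T"):
        return coded_direction
    if coded_direction != "body" or return_shot not in ("forehand", "backhand"):
        return None
    # Assumption: for a right-handed returner, the forehand is on the wide
    # side in the deuce court and on the T side in the ad court. A
    # left-handed returner mirrors this.
    forehand_is_wide = (court == "deuce") == (returner_hand == "R")
    if return_shot == "forehand":
        return "wide" if forehand_is_wide else "T"
    return "T" if forehand_is_wide else "wide"


# A body serve in the deuce court that a right-hander returns with a forehand
print(classify_first_serve("body", "deuce", "forehand"))
```

As noted above, this kind of rule misfires when a returner runs around a weak serve, so it should be treated as an approximation.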

Wozniacki only varies her first serve direction on the first four points about once every five deliveries. If we convert the first four rates (82.8%, 17.4%, 16.7%, and 78.5%) to the frequency with which she hit her favored serve (82.8%, 82.6%, 83.3%, 78.5%), we get an average–call it FSP, for First Serve Predictability–of 81.8%. Only two other women with at least ten charted matches, Kateryna Kozlova and Justine Henin, exceed 70%, and Henin’s repetition has more to do with her preference for the T serve in all situations.
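The FSP arithmetic fits in a few lines. In this sketch, the function name is my own, and it assumes the favored direction on each point is simply whichever of wide or T is hit more often, so its frequency is the larger of the wide rate and its complement.

```python
def first_serve_predictability(wide_rates):
    """Average frequency of the favored direction across the given points.

    Assumption: the favored direction on each point is whichever of
    wide or T is hit more often, so its frequency is max(w, 1 - w).
    """
    favored = [max(w, 1.0 - w) for w in wide_rates]
    return sum(favored) / len(favored)


# Wozniacki's wide-serve rates on points 1-4, from the table above
wozniacki = [0.828, 0.174, 0.167, 0.785]
print(f"FSP: {100 * first_serve_predictability(wozniacki):.1f}%")  # FSP: 81.8%
```

Under this definition, a server who splits every point 50/50 scores 50%, which is the floor of the scale.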

Amazingly, Caro’s overall numbers obscure just how often she uses the pattern these days. The MCP has 52 Wozniacki matches dating from the beginning of 2017, and that more recent subset gives us an FSP of 94.0%. I suspect that the more extreme number is a better representation of Woz’s tendencies, because the more recent data includes a broader selection of matches, including contests against weaker opponents. The MCP is not a random sample, and older matches tend to be more notable ones involving higher-quality opponents.

Wozniacki’s not-really-peers

Let’s take a look at some of the other women who are more predictable than average. The median WTAer with at least 10 charted matches in the MCP dataset has an FSP of about 58%, meaning that they might prefer one direction to the other, or that they often aim for a right-hander’s backhand, but that they vary the first serve delivery quite a bit.

Here are the 20 who change direction the least. For each player, the following table shows the frequency with which they hit a wide serve on each of the first four points, their FSP on the first four points–FSP(1-4)–and their FSP on points from the fifth onward, FSP(5+).

Player         1st  2nd  3rd  4th  FSP(1-4)  FSP(5+)  
Wozniacki      83%  17%  17%  79%       82%      52%  
Kozlova        60%  35%  10%  73%       72%      64%  
Henin          38%  11%  57%  25%       71%      66%  
Vikhlyantseva  92%  46%  38%  63%       68%      54%  
Petkovic       74%  72%  36%  38%       68%      58%  
Vondrousova    15%  63%  30%  54%       68%      68%  
Brengle        82%  67%  53%  68%       67%      56%  
Clijsters      86%  32%  61%  52%       67%      56%  
Stephens       76%  21%  53%  46%       65%      62%  
Voegele        71%  35%  59%  34%       65%      60%  
                                                      
Player         1st  2nd  3rd  4th  FSP(1-4)  FSP(5+)  
Dementieva     76%  54%  71%  60%       65%      60%  
Dodin          58%  14%  43%  43%       65%      64%  
Li Na          28%  33%  52%  33%       65%      56%  
Kerber         43%  78%  56%  67%       65%      64%  
Doi            21%  60%  64%  56%       65%      63%  
Vandeweghe     35%  35%  62%  66%       65%      55%  
A Beck         59%  24%  45%  33%       64%      61%  
Sanchez V      43%  77%  42%  65%       64%      64%  
Buzarnescu     19%  39%  58%  46%       64%      59%  
Sevastova      73%  58%  37%  60%       64%      55%

Only two servers, Kozlova and Natalia Vikhlyantseva, follow the general principle of Wozniacki’s wide/T/T/wide pattern. Many of these players, like Henin, prefer wide or T serves at all times, and others, including Andrea Petkovic and Coco Vandeweghe, often opt for one type of serve on the first two points and another on the next two. It’s tough to see much in the patterns among these players, especially since most of them are closer to the median level of predictability than they are to Wozniacki’s extreme consistency.

I included the final column, FSP(5+), to illustrate another aspect of Caro’s uniqueness. While she closely follows her script for the first four points, she reverts to almost 50/50 wide and T serves after that–even in the more extreme 2017-present subset of matches. Many of the other players on this list do not. Angelique Kerber, for instance, is a near Woz-level lock to go wide in the ad court late in games. She hits wide first serves more than 80% of the time at 40-30 or 30-40, and 73% of the time at AD-40 or 40-AD. Henin also stuck with her preferences on higher-leverage points.

Equilibrium

For whatever reason, Wozniacki is comfortable with this pattern, and is confident that it works. Or, at least, that it doesn’t work against her. It’s not a secret–the sequence came to my attention after Siegemund’s coach pointed it out during an on-court coaching visit in Charleston.

Tennis is full of decisions like this: when to follow a pattern, and how often to vary things to keep an opponent from getting too comfortable. On this week’s podcast, Carl and I speculated about how often a player would need to deploy an underarm serve in order to force a returner out of position. If Wozniacki’s tendencies are any indication, the answer is: not very often. The mere fact that Caro could serve the other direction was apparently enough to prevent Niculescu or Siegemund from pouncing on her first serves, even if Woz stuck to the script from the first game to the last.

I realize I’ve left a lot of questions unanswered. Does Caro win more first serve points when she varies her delivery more? Does she follow any similar patterns with her second serve? Does she use the results of the first four points to help decide the direction of the following points? Are there particular types of players who force her to mix things up–as Madison Keys did in the Charleston final, with her aggressive return tactics?

Keep an eye on this space–maybe I’ll be able to offer some answers. In the meantime, I hope you derive some extra enjoyment the next time you watch a Wozniacki match, knowing in advance where her next serve will go. Or, perhaps, you’ll witness one of the rare occasions when the most predictable woman in tennis goes off-script.

Thanks to Kees for charting the Siegemund match, passing along the on-court coaching conversation, and providing the impetus for this post.

Podcast Episode 56: Gender Differences in Surface Differences

Episode 56 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, considers the clay-court success of Charleston winner Madison Keys and how aggression seems to be a proven strategy at WTA clay events… though not for everyone.

We also do a deep dive on underarm serve strategy, and review some recent research into ATP aging patterns.

Thanks for listening!

(Note: this week’s episode is about 63 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Forecasting Future Felix With ATP Aging Patterns

Italian translation at settesei.it

It’s been an exceptional six weeks for Felix Auger-Aliassime. He broke into the top 100 with a runner-up performance on clay in Rio de Janeiro, won two matches each at Sao Paulo and Indian Wells (including an upset of Stefanos Tsitsipas), and raced to a semi-final at the Miami Masters, the youngest player ever to make the final four of that event. Four months away from his 19th birthday, his ranking is up to 33rd in the world, and he has few points to defend until June.

Felix is the youngest man in the top 100, and he’s reaching milestones early enough to draw comparisons with some of the best young players in the sport’s history. Will he follow in the footsteps of past wunderkinds such as Rafael Nadal and Lleyton Hewitt? To answer that question, let’s take a look at typical ATP aging patterns, what they say about when players hit their peaks, and what they can show us about the fate of the best 18 year olds.

The standard curve

Last week, I looked at WTA aging curves and found that women tend to peak around age 23 or 24, an age that has not changed even as the sport has gotten older. I also discovered that there is a surprisingly modest gap–about 70 Elo points–between 18-year-old performance and a woman’s peak level. The men’s results are different.

To calculate the average ATP aging curve, I found over 700 players who were born between 1960 and 1989 and played at least 20 tour-level, tour qualifying, or challenger-level matches in each of five seasons. Overall, peak age was 25, though the difference from age 24 to 27 is only a few Elo points, so small as to be negligible.

As the tour has gotten older, the men’s peak age has also increased. Of the nearly 300 players born between 1980 and 1989, peak age is 26-27, with ages 28 and 29 also within 10 Elo points of the age 26-27 peak. Plenty of players are peaking at older ages, and many of those who aren’t are remaining close to their best levels into their late twenties. The peak age could be even higher still–a few of the players in the 1980-89 cohort turn 30 this year, and could conceivably still improve on their career bests.

The following graph shows the trajectory of the average player (with peak year-end Elo set to 1,850) born in the 1960s and the pattern of the average player born in the 1980s:

It’s a long ascent from the performance level at age 18 to the typical peak, especially for more recent players. There’s even a hefty bit of selection bias that should inflate the level of 18 year olds, since only about 10% of the players in the overall sample qualified for a year-end Elo rating when they were 18. The ones who did were, in general, the best of the bunch.
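To make the idea of an average aging curve concrete, here is one way such a curve could be built. It is a sketch, not the exact procedure behind the numbers above: it assumes each player's ratings are shifted so that their own peak lands on 1,850 before averaging by age, and the two careers in it are invented toy data.

```python
from collections import defaultdict


def average_aging_curve(careers, peak_level=1850):
    """Average Elo by age, with every player's peak shifted to peak_level.

    careers maps a player name to {age: year_end_elo}.
    """
    shifted = defaultdict(list)
    for ratings in careers.values():
        offset = peak_level - max(ratings.values())
        for age, elo in ratings.items():
            shifted[age].append(elo + offset)
    return {age: sum(vals) / len(vals) for age, vals in sorted(shifted.items())}


# Hypothetical toy careers, invented purely for illustration
careers = {
    "Player A": {18: 1600, 22: 1800, 25: 1900, 29: 1850},
    "Player B": {22: 1700, 25: 1750, 29: 1700},
}
curve = average_aging_curve(careers)
peak_age = max(curve, key=curve.get)
print(curve)
print("Peak age:", peak_age)
```

Only Player A has a rating at 18 in this toy example, so the age-18 average rests on a single career. That is the selection effect described above in miniature: the players who show up at 18 are not a representative slice of the sample.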

Felix forward

Through the Miami semi-final, Auger-Aliassime’s Elo rating is 1,848. The average player in the entire dataset who played at least 20 matches in their age-18 season went on to add another 281 Elo points to their rating between the end of their age-18 season and their peak. In the narrower, more recent cohort of 1980-89 births, the players with year-end ratings as 18 year olds improved their Elos by a whopping 369 points before reaching their peaks.

Adding either of those numbers to Felix’s current rating gives us quite the rosy forecast:

Cohort   Current  Increase  Proj. Peak  
1960-89     1848       281        2129  
1980-89     1848       369        2217
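The projection is nothing more than addition, as this short sketch shows. The variable names are my own; the inputs are the rating and the cohort averages quoted above.

```python
CURRENT_ELO = 1848  # Auger-Aliassime through the Miami semi-final

# Average gain from age-18 year-end Elo to peak, by birth cohort
AVERAGE_GAIN = {"1960-89": 281, "1980-89": 369}

projected_peak = {
    cohort: CURRENT_ELO + gain for cohort, gain in AVERAGE_GAIN.items()
}
for cohort, peak in projected_peak.items():
    print(cohort, peak)
```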

There’s a bit of sleight of hand in how I’m doing this, since my study uses players’ year-end ratings, and I’m using Felix’s rating in April. However, there’s no natural law that says one artificial 12-month span is better than another, and Felix’s current age of 18.6 is roughly in the middle of the ages of the year-end 18-year-olds with whom I’m comparing him.

An Elo rating of 2,129 would be good enough for fourth place on the current list, behind only the big three. The rating of 2,217 is better than any of the big three can boast at the moment, and would be the fourth-best peak year-end rating among active players, again trailing only the big three. (And Andy Murray, if you consider him active.) Only 15 Open era players have managed year-end Elo peaks above 2,217.

No comparisons

It’s tough to say whether this method, of finding the typical difference between 18-year-old and peak Elo ratings, is adequate to handle the extremes. Some players peak earlier than average, and it stands to reason that the best young talents are more likely to do so. Boris Becker posted a whopping 2,212 Elo rating at the end of his age-18 season, which didn’t leave much room for improvement. He gained another 90 points before the end of his age-19 season, which was his career best.

Becker’s career path is not particularly helpful to our effort to forecast Felix’s, in part because the German was so unique, and also because his experience reflects such a different era. But even among less unique players, there are few useful comparables. No one born since 1987 managed a better age-18 Elo rating than Felix’s 1,848, and only a handful of active or recently-retired players even reached 1,750 by that age.

Lacking the data for a more precise approach, let’s repeat what I did for Bianca Andreescu last week, and see how the nearest 18-year-old comparisons fared. Of the players whose age-18 year-end Elos were closest to Felix’s 1,848, here are the 10 above him and the 10 below him on the list:

Player               BirthYr  18yo Elo  Incr  Peak Elo  
Stefan Edberg           1966      1916   350      2266  
John McEnroe            1959      1912   496      2408  
Guillermo Coria         1982      1909   145      2055  
Pat Cash                1965      1907   151      2058  
G. Perez Roldan         1969      1884    41      1925  
Andy Murray             1987      1878   465      2343  
Roger Federer           1981      1871   487      2359  
Thomas Enqvist          1974      1865   216      2081  
Rafael Nadal            1986      1862   452      2314  
Jim Courier             1970      1849   283      2132  
…                                                       
Jimmy Brown             1965      1834     0      1834  
Andy Roddick            1982      1815   291      2106  
Aaron Krickstein        1967      1812   246      2058  
Yannick Noah            1960      1812   299      2112  
Fabrice Santoro         1972      1805    85      1890  
Andreas Vinciguerra     1981      1803    16      1819  
Novak Djokovic          1987      1792   645      2436  
Sergi Bruguera          1971      1790   265      2055  
Thomas Muster           1967      1788   329      2117  
Dominik Hrbaty          1978      1779   133      1913

The average increase among this group is 270 Elo points, close to the overall average for players who qualified for a year-end Elo rating at age 18. The youngest members of this list are encouraging: the big four, Andy Roddick, and Andreas Vinciguerra. Most promising youngsters would happily take a two-in-three shot at having a career at the level of the big four.
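The group average can be reproduced from the Incr column of the table. The sketch below also reports the median and the range, which are my additions, to show how wide the spread is around that average.

```python
from statistics import median

# Elo gain from age-18 year-end rating to peak, copied from the table above
increases = {
    "Stefan Edberg": 350, "John McEnroe": 496, "Guillermo Coria": 145,
    "Pat Cash": 151, "G. Perez Roldan": 41, "Andy Murray": 465,
    "Roger Federer": 487, "Thomas Enqvist": 216, "Rafael Nadal": 452,
    "Jim Courier": 283, "Jimmy Brown": 0, "Andy Roddick": 291,
    "Aaron Krickstein": 246, "Yannick Noah": 299, "Fabrice Santoro": 85,
    "Andreas Vinciguerra": 16, "Novak Djokovic": 645, "Sergi Bruguera": 265,
    "Thomas Muster": 329, "Dominik Hrbaty": 133,
}

average = sum(increases.values()) / len(increases)
middle = median(increases.values())
print(f"Average: {average:.0f}, median: {middle:.0f}")
print(f"Range: {min(increases.values())} to {max(increases.values())}")
```

The 20 increases average 269.75, which rounds to the 270 quoted above, and they run from 0 for Jimmy Brown to 645 for Djokovic.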

Perhaps the best comparison for Felix is a player who didn’t quite make that list, Alexander Zverev. The 21-year-old German posted a year-end Elo of 1,768 as an 18 year old, and had already boosted that number by more than 300 points by the end of his 2018 campaign. Zverev is only an approximate comparison, he’s just a single data point, and we don’t know where he’ll end up, but his experience is a decade more recent than those of Novak Djokovic, Murray, and Nadal.

Forecasting the career performance of young tennis players is an inexact science, at best. Potential outcomes for Auger-Aliassime range from teenage flameout to double-digit major winner. Based on the limited information he’s given us so far, the latter seems within reach. What we know for sure is that he’s playing better tennis than any 18 year old we’ve seen in a decade. If that’s not reason for optimism, I don’t know what is.