Podcast Episode 55: Miami Titles for Barty and Federer

Episode 55 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, discusses the newly-minted Miami champions. We start with Ashleigh Barty, possibly now the best hard-court player in the game, and mull over how her throwback style will translate to other surfaces this season. The second half is for Roger Federer, who breezed through a weaker-than-usual draw, but did so in particularly dominating style.

We also consider whether Karolina Pliskova has peaked, whether Simona Halep will regain the No. 1 ranking, and the future for the pair of Canadian ATP prospects, Denis Shapovalov and Felix Auger-Aliassime. We wrap up with some thoughts about the gap on this week’s ATP calendar where Davis Cup used to be.

Thanks for listening!

(Note: this week’s episode is about 65 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Around the Net, Issue 7

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Multimedia

Data

Trivia

  • Roger Federer could finally become the first ATP player to win multiple titles this season, but the WTA remains a tour of unique winners. In Miami, Ashleigh Barty became the 14th champion in 14 tour-level events.
  • To reach the final, Federer needed to beat someone more than 15 years his junior. In fact, both Miami semi-finals involved age gaps of at least one and a half decades. That hasn’t happened at an ATP event since 1979. The closest since then was in Dubai last month, when Fed-Coric and Monfils-Tsitsipas both featured age gaps of at least 11.9 years.
  • Speaking of unusual semi-finals… The Bryans beat Kubot/Melo by a score of 7-6(7) 6-7(8) [14‑12], just about as long as a match can be within the constraints of the modern doubles format of no-ad with a third-set super-tiebreak. It lasted 187 points. While match stats are hard to come by for doubles, I do have a reasonably complete set for tour-level doubles since 2017. In that span, 187 points is the longest match under these rules. There was one other 187-pointer in 2018 and a 186-point marathon in 2017.
  • Thanks in part to his run in Miami, Felix Auger-Aliassime won his first five career matches against top 20 players, something that’s never been done before. Mario Ancic won his first three; Felix is the only guy with more. After the semi-final loss to Isner, FAA falls to 5-1, but still has a chance to set more records. No one has won more than 7 of their first 10 matches against the top 20, a feat accomplished by Gustavo Kuerten and Andrei Medvedev.

Beyond the net

Thanks to Peter for help with this week’s issue.

WTA Aging Patterns and Bianca Andreescu’s Future

Italian translation at settesei.it

Bianca Andreescu is really good, right now. Still a few months away from her 19th birthday, she has collected her first Premier Mandatory title, beaten a few top-ten players (including Angelique Kerber twice), and climbed to 7th in the Elo ratings. She is the only teenager in the WTA top 30 and one of only five in the top 100.

The burning question about Andreescu isn’t how good she is, it’s how good she could become. It’s easy to look at the best 18-year-old in the game and imagine her becoming the best 19-year-old, best 20-year-old, and so on, until she’s at her peak age and she’s the best player in the world, period. As the sport in general has gotten older, teenage champions have become rarer, so she seems all the more destined for success. But it isn’t that simple: Prospects get injured, opponents learn how to beat them, they peak early and fizzle out. Tennis history is littered with teen starlets who failed to reach their potential.

Building an aging curve

Let’s start with the basics. What is the trajectory of the typical WTA career? Answering that question requires a whole slew of assumptions, so keep in mind that this is approximate. I found every player born between 1960 and 1989* who played at least five full** seasons, a total of about 500 players. For each one, I calculated her year-end Elo for every full season she played, as well as the difference between that year’s Elo and her peak year-end Elo.

* I wish we knew more about players born in the 1990s, since their experience is most relevant to today’s teens, but many of them have yet to reach their peaks, whenever that will be.

** I’ve defined a full season very broadly, as 20 or more completed matches at the ITF $50K level or higher.
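The calculation described above can be sketched in a few lines. This is only a minimal illustration, assuming a hypothetical input of (player, age, year-end Elo) rows; the function name and the toy ratings are invented for the example and are not the data behind this post.

```python
from collections import defaultdict


def aging_curve(seasons, min_seasons=5):
    """Average gap to peak year-end Elo at each age.

    `seasons` is a list of (player, age, year_end_elo) tuples, one per
    full season. Players with fewer than `min_seasons` rows are skipped.
    """
    by_player = defaultdict(list)
    for player, age, elo in seasons:
        by_player[player].append((age, elo))

    gaps_by_age = defaultdict(list)
    for rows in by_player.values():
        if len(rows) < min_seasons:
            continue
        peak = max(elo for _, elo in rows)
        for age, elo in rows:
            # 0 in the peak season, negative in every other season
            gaps_by_age[age].append(elo - peak)

    return {age: sum(g) / len(g) for age, g in sorted(gaps_by_age.items())}


# Toy data (hypothetical ratings, for illustration only)
toy = [
    ("A", 18, 1750), ("A", 19, 1800), ("A", 20, 1850), ("A", 21, 1900), ("A", 22, 1880),
    ("B", 18, 1700), ("B", 19, 1720), ("B", 20, 1790), ("B", 21, 1800), ("B", 22, 1810),
    ("C", 18, 1900), ("C", 19, 1950),  # fewer than five seasons, so excluded
]
curve = aging_curve(toy)
print(curve)
```

Each value is the average distance from peak at that age, so the age with the value closest to zero is where the typical player in the sample is at her best.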

For every player, then, we have an idea of how they aged. To get our bearings, let’s look at a couple of players with unique aging trajectories, Martina Navratilova and Venus Williams:

(Martina’s peak was about 50 Elo points higher than Venus’s, but I set them equal to each other for the purpose of this graph.)

Venus peaked at age 21 and had her last all-time-great-level season at 23, while Martina’s peak came at age 30. There’s more than one way to amass a Hall of Fame career, and it’s important to keep in mind that “average” aging patterns hide a lot of more extreme possibilities.

The usual route

When we take Venus’s and Martina’s trajectories and average them with the other 500-or-so players in our dataset, here’s what we get:

The most common peak age is 24, with 23 a very close second. In the above graph, I set peak Elo at 1,820, the average peak Elo of the players I looked at, but the absolute number isn’t important. The typical player who completes a full season at age 18 is about 70 Elo points away from her peak. There isn’t much downward movement in the 20s; at age 30, those players who are still active are only 43 Elo points below their peak.

There’s a poison pill in that last sentence that is difficult to avoid when analyzing aging patterns–we only know what happens to those players who are still active. That’s even more troublesome for young players. Venus, for instance, improved 211 Elo points between her year-end finish as an 18-year-old and her best year-end rating. Kerber, on the other hand, wasn’t even good enough to show up in the ratings until she was 19. If we were able to estimate Kerber’s level at that age, it would probably be very low. Thus, forecasting an 18-year-old using this dataset may understate the degree to which a player can improve.

Changing times

Using the numbers above, we can make a baseline estimate. Those players who had year-end Elo ratings as 18-year-olds typically improved about 70 more points before hitting their peak. Through her Indian Wells title, Andreescu is rated at 2,017, giving us an estimated peak of 2,087. That’s good enough for 2nd place on the current list and just inside the top 50 of all time (as measured by the player’s best year-end Elo). Still, that seems a bit modest–it doesn’t represent much of an additional improvement for a player who has come so far in just a few months.

The forecast is slightly more optimistic if we narrow our view to players born in the 1980s. It seems like a reasonable thing to do, because Andreescu is facing an era with older competition, more like the last decade than, say, the one faced by players born in the 1960s. Our dataset shrinks to about 200 players, and those players do show a bigger gap between their 18-year-old Elo rating and their career peak. The difference is about 83 points, giving Bianca a revised estimated peak of 2,100–exactly even with Simona Halep, who currently tops the list, and around the 40th best of all time.
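The arithmetic behind both baselines is simple enough to write down. The sketch below is only an illustration: the function name is mine, and the inputs are the figures quoted above (a current rating of 2,017 and typical gains of 70 and 83 points).

```python
def projected_peak(current_elo, typical_gain):
    """Current rating plus the average gain from age 18 to peak."""
    return current_elo + typical_gain


andreescu_elo = 2017  # rating through Indian Wells, as quoted above

baseline = projected_peak(andreescu_elo, 70)    # all players born 1960-1989
born_1980s = projected_peak(andreescu_elo, 83)  # players born in the 1980s only

print(baseline, born_1980s)
```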

The biggest difference between the overall aging curve and the curve for players born in the 1980s isn’t the timing of the peak, it’s the duration. I looked at several age cohorts, and the typical WTA peak is always at 23 or 24 years old. But there’s more to it than that. Take a look at the trajectory of players born in the 1960s compared to those born in the 1980s:

For the more recent generation of players, there is little difference between age 23 and 28 or 29. Even into the early 30s, those players who stick around are competing almost as well as they did at their peak.

Bespoke for Bianca

Aging patterns in women’s tennis have changed, so it’s important to look at a relevant era when there’s enough data to do so. But what if that’s not the best way to narrow our view? As I’ve noted, the average peak Elo of the 500 players in our dataset is 1,820. Bianca is already 200 points higher than that. What if the best players are qualitatively different as well as quantitatively superior?

Here are 20 players whose year-end Elo at age 18 was similar to Andreescu’s current rating, the ten closest who were higher and the ten closest who were lower:

Player                     Birth Year  18yo Elo  Peak Elo  
Jelena Dokic                     1983      2110      2110  
Conchita Martinez                1972      2085      2191  
Arantxa Sanchez Vicario          1971      2084      2314  
Hana Mandlikova                  1962      2071      2160  
Iva Majoli                       1977      2067      2067  
Belinda Bencic                   1997      2066      2066  
Caroline Wozniacki               1990      2059      2194  
Lindsay Davenport                1976      2053      2353  
Nicole Vaidisova                 1989      2043      2121  
Manuela Maleeva Fragniere        1967      2035      2059  
---                                                        
Mary Pierce                      1975      2008      2161  
Ana Ivanovic                     1987      1994      2133  
Victoria Azarenka                1989      1986      2270  
Anke Huber                       1974      1980      2072  
Magdalena Maleeva                1975      1961      2024  
Agnieszka Radwanska              1989      1957      2116  
Mary Joe Fernandez               1971      1955      2110  
Anna Kournikova                  1981      1954      2020  
Kathy Rinaldi Stunkel            1967      1947      1947  
Justine Henin                    1982      1946      2411

Both halves of the list include some of the greatest of all time: Arantxa Sanchez Vicario, Lindsay Davenport, Victoria Azarenka, and Justine Henin. Yet several of these players failed to build on their early-career peaks, such as Jelena Dokic and (so far, at least) Belinda Bencic.

The average 18-year-old year-end Elo of these 20 players is 2,018, virtually the same as Andreescu’s post-Indian Wells level. The average peak year-end Elo of these 20 players is 2,145, an improvement of about 127 points and a more optimistic forecast than anything we’ve seen so far. That rating would put her a tick above Ana Ivanovic at her best, a bit below Hana Mandlikova at hers, and just inside the 30 greatest of all time.
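For anyone who wants to check the averages, here is a short sketch that recomputes them from the table of 20 comparable players. The ratings are copied from that table; the variable names are mine.

```python
# (player, year-end Elo at age 18, peak year-end Elo), copied from the table
comps = [
    ("Jelena Dokic", 2110, 2110),
    ("Conchita Martinez", 2085, 2191),
    ("Arantxa Sanchez Vicario", 2084, 2314),
    ("Hana Mandlikova", 2071, 2160),
    ("Iva Majoli", 2067, 2067),
    ("Belinda Bencic", 2066, 2066),
    ("Caroline Wozniacki", 2059, 2194),
    ("Lindsay Davenport", 2053, 2353),
    ("Nicole Vaidisova", 2043, 2121),
    ("Manuela Maleeva Fragniere", 2035, 2059),
    ("Mary Pierce", 2008, 2161),
    ("Ana Ivanovic", 1994, 2133),
    ("Victoria Azarenka", 1986, 2270),
    ("Anke Huber", 1980, 2072),
    ("Magdalena Maleeva", 1961, 2024),
    ("Agnieszka Radwanska", 1957, 2116),
    ("Mary Joe Fernandez", 1955, 2110),
    ("Anna Kournikova", 1954, 2020),
    ("Kathy Rinaldi Stunkel", 1947, 1947),
    ("Justine Henin", 1946, 2411),
]

avg_18 = sum(elo_18 for _, elo_18, _ in comps) / len(comps)
avg_peak = sum(peak for _, _, peak in comps) / len(comps)
avg_gain = avg_peak - avg_18  # typical improvement from age 18 to peak

print(round(avg_18), round(avg_peak), round(avg_gain))
```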

This is heady stuff for a teenager, but after watching her ascent this year, it’s tough to bet against her. And as long as Kerber is in the draw, apparently, we can expect Andreescu to keep winning.

Podcast Episode 54: Miami At the Half-Way Point

Episode 54 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, catches up on the Miami early rounds, beginning with some rocky starts for Roger Federer and Novak Djokovic. We also look at the slew of early upsets and the threats who remain in the draw. We talk about how to evaluate Bianca Andreescu’s feats at her young age, and the wacky wild cards that always find their way into the Miami main draw.

Thanks for listening!

(Note: this week’s episode is about 64 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 6

Articles

Multimedia

Data

Trivia

  • We’re still waiting for our first multiple-title winner of the 2019 season. On the ATP side, that’s 19 champions in 19 events, a new record.
  • The player to break that streak will not be Dominic Thiem, who lost his first match in Miami against Hubert Hurkacz. Thiem is the first Indian Wells titlist to fail to win a Miami match since 2010, when Ivan Ljubicic lost to Benjamin Becker. It’s not bad company for Thiem, though, as the other three IW champions to lose their first match in Miami are Novak Djokovic, Lleyton Hewitt, and Alex Corretja.
  • Conceivably, the man who breaks the unique-titlist streak could be Reilly Opelka, who beat Diego Schwartzman despite being out-aced by El Peque in the first set. Opelka didn’t record a single ace in the first set, and it was only his second tour-level match in which less than 10% of his service points went for aces. (The other was his 2017 first-round encounter with Tommy Haas in Houston, and his career rate is 22.3%.)
  • Kei Nishikori is king of deciding sets no more. After dropping a third set to Dusan Lajovic in his first outing in Miami, he loses the top spot on the deciding-set winning percentage leaderboard to Djokovic.
  • Yesterday, Naomi Osaka won the first set against Su-Wei Hsieh, but Hsieh came back to win the match. It’s the first time since 2016 that Osaka failed to convert a one-set advantage, a streak I wrote about a couple of months ago. She fell only 156 matches short of Chris Evert’s record.

Beyond the net

Thanks to Peter for help with this week’s issue.

Podcast Episode 53: Indian Wells in Review

Episode 53 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, focuses on the breakthrough hard-court title for Dominic Thiem, who claimed his first non-clay Masters trophy yesterday in Indian Wells. We talk a bit about his tactics, his draw, and–we just can’t help it, apparently–whether we’re in a weak era.

On the women’s side, we use the shock victory of 18-year-old wild card Bianca Andreescu to consider the strength of her generation, with names like Osaka, Sabalenka, and now Andreescu poised to sweep away their elders. Yet still in the mix, and representing our last topic, are Serena, Venus, and Vika, who remain a threat against just about anyone.

Thanks for listening!

(Note: this week’s episode is about 60 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 5

Articles

Multimedia

Data

  • Match Charting Project: The dataset has grown by 70 matches in the last week, from 5,256 to 5,326. We’ve added a slew of men’s and women’s matches from Indian Wells, several more 90’s Wimbledon semi-finals, and best of all, three long-sought Roland Garros women’s finals. We have now charted all men’s and women’s French Open finals back to 1980.

Trivia

  • Belinda Bencic continues to rack up top-ten wins, and despite her semi-final loss to Angelique Kerber on Friday, her record against the top ten is above .500, at 19-16. That’s something that few of her peers can claim, even many players we consider to be elites.
  • Sara Errani hit a whopping 57 double faults in her last four matches, including 22 in the Guadalajara first round against Irina Camelia Begu. And she won! 57 double faults is more than she hit in the entire 2017 or 2018 seasons.
  • The next generation of WTA teens is coming fast: 16-year-old Clara Tauson won this week’s ITF Shenzhen $60K title, and 15-year-old Dasha Lopatetskaya won her fifth pro title. At least two more teens won ITF titles this week, with three more playing finals today.
  • With his defeat of Novak Djokovic, Philipp Kohlschreiber became the 4th-oldest player to beat an ATP No. 1.
  • Ivo Karlovic turned 40 three weeks ago, and celebrated by winning three matches at Indian Wells, the first time since 2011 (also at Indian Wells) that he won three or more matches at a Masters event.

Thanks to Peter for help with this week’s issue.

Podcast Episode 52: The Unpredictable WTA of Osaka, Stephens, and Sasnovich

Episode 52 of the Tennis Abstract Podcast features guest co-host Jeff McFarland, the man behind Hidden Game of Tennis. We start with Jeff M’s origin story as a baseball analyst shifting to tennis, and then dive into a slew of WTA topics as we enter the second week of Indian Wells.

We start with Aliaksandra Sasnovich’s unusual proclivity for losing 6-0 sets and the serve-return balance in women’s tennis that results in more lopsided set scores. Next, we consider Sloane Stephens’s latest early-round ouster, dismissing some of the theories that are often thrown around after such upsets. We also talk Osaka, and finish up with a speed round on Serena, Vika, and Danielle Collins.

Thanks for listening!

(Note: this week’s episode is about 70 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 4

Articles

Multimedia

Data

  • Match Charting Project: The dataset has grown by more than 60 matches in the last week, from 5,194 to 5,256. We completed a run of Indian Wells women’s finals back to 2004, along with 1999 and 2000. We also added all of last week’s finals, Kyrgios’s last four matches in Acapulco, and another handful of Pete Sampras’s grand slam semi-finals.
  • MCP Most Wanted Video: We’re really close to completing some noteworthy subsets, but we’re missing video for several key matches. Please help!

Trivia

Beyond the Net

Dating a Kardashian had a statistically significant (at the 10% level) negative effect on athletic performance.

Thanks to Peter for help with this week’s issue.

Nick Kyrgios Really Is Different Under Pressure

Italian translation at settesei.it

Earlier this week, we looked at whether Nick Kyrgios is unusually inconsistent. That is, is he more likely to upset higher-ranked players and lose to lower-ranked players than his peers? The numbers say he isn’t.

But that isn’t all we mean when we talk about Kyrgios’s unreliability. He often undergoes dramatic shifts within matches. At times, he is visibly distracted; during his Delray Beach match against Radu Albot, he even shouted that he wanted to get off the court. Other times, he comes up with breathtaking serving and shotmaking at the most crucial moments. He seems motivated by both packed grandstands and on-court pressure. Unfortunately, both of those are missing from a lot of professional tennis.

We already have some evidence for the better-under-pressure hypothesis. In his five matches in Acapulco last week, he won a mere 50.4% of points, one of the lowest totals ever for a title-winner. In three of the five matches, he won return points at a lower rate than his opponent, resulting in Dominance Ratios (DRs) below 1.0. Winning a match with a sub-1.0 DR (or fewer than 50% of total points won) isn’t unheard of, but it’s not a reliable way to rise to the top of the sport. Such contests are called “lottery matches” for a reason–there’s a lot of luck involved in winning with such fine margins, and fortune tends to even out.

Yet Kyrgios’s “luck” keeps nudging his results in the same direction. He has played 15 career tour-level matches in which his DR is between 0.9 and 0.99–close matches in which he was slightly outplayed, at least in the points column. With stats like that, players tend to win about one-third of the time. Kyrgios, however, has won 11 of those 15 matches. His good fortune doesn’t cancel out when he narrowly edges out an opponent: In 13 matches with DRs between 1.0 and 1.1, he has lost only two. The Australian is doing something right.
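For reference, the Dominance Ratio compares a player’s rate of return points won with his opponent’s, so a value below 1.0 means he returned less effectively than the man across the net. A minimal sketch follows; the function name and the match totals are hypothetical, chosen only to land in the 0.9-to-0.99 band discussed above.

```python
def dominance_ratio(ret_won, ret_played, opp_ret_won, opp_ret_played):
    """Player's share of return points won divided by the opponent's share."""
    own_rate = ret_won / ret_played
    opp_rate = opp_ret_won / opp_ret_played
    return own_rate / opp_rate


# Hypothetical match: the player wins 24 of 80 return points (30.0%),
# while his opponent wins 26 of 78 (33.3%).
dr = dominance_ratio(24, 80, 26, 78)
print(round(dr, 2))
```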

Big points are big

You probably already know what’s going on here, even if you haven’t listened to commentators speculate during Nick’s matches. The key to such narrow victories is converting the “big” points–break points, deuces, tiebreaks, and so on. It doesn’t matter if you throw away a point or two when serving at 40-love. Other situations have considerably more leverage, and that’s when Kyrgios brings his best tennis.

I tallied up Kyrgios’s return points won over the course of his career, based on the point score of each one. (I don’t have the point-by-point sequence of every one of his tour-level matches, but most of them are included, more than enough to constitute a reliable sample.) Here are the five game scores at which he wins the most return points, starting with the most effective:

  • 0-40, 40-AD, 15-30, 30-40, 40-40

And the five scores, again in order, starting with least effective:

  • 30-0, 40-0, 40-15, 0-15, 0-0

In other words, when he has a chance to break, he’s great. In my sample of matches, he won 31.5% of return points; when the opposing server is facing him at 0-40, he wins the point 45.0% of the time. At 40-AD, it’s 41.9%. When his opponent serves with a 30-0 advantage, Kyrgios wins a mere 27.3% of return points.

Everybody does it (a little)

Astute readers will realize that I haven’t accounted for a key variable. In a data set of dozens of matches, scores that favor the returner will occur more often against weaker servers. Kyrgios didn’t get many 0-40 or even 40-AD chances against John Isner last week, but he can expect to get more against the likes of Albot. So to some extent, we should expect players to win more return points at these moments. In the last 52 weeks, ATPers have won 37.3% of return points, but 40.1% of break points.

Everybody does it, but Nick does it more. The following table shows the ratio of return points won at each game score to average return points won. The middle column shows Kyrgios’s ratios and the right-most column shows the 2018 ATP tour average:

Situation       NK   ATP  
0-40          1.43  1.14  
40-AD         1.33  1.09  
15-30         1.27  1.05  
30-40         1.26  1.06  
40-40         1.16  1.02  
15-40         1.13  1.06  
15-15         1.11  0.99  
15-0          1.11  0.98  
30-15         1.09  1.00  
0-30          1.07  1.06  
AD-40         1.06  1.02  
40-30         1.05  1.00  
30-30         1.03  1.01  
0-0           1.02  0.99  
0-15          1.01  1.05  
40-15         0.95  0.92  
40-0          0.91  0.87  
30-0          0.87  0.91

Most players take advantage in 0-40 situations, and to a lesser extent at break points, but Kyrgios is on another planet. The average player wins roughly 10% more return points in break situations; Kyrgios’s boost is roughly three times that size.
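The ratios in the table are just the return-point win rate at a given score divided by the overall rate. Here is a small sketch using the Kyrgios percentages quoted earlier (31.5% overall, 45.0% at 0-40, 41.9% at 40-AD, 27.3% at 30-0); the function name is mine.

```python
def situational_ratios(rpw_by_score, overall_rpw):
    """Return-point win rate at each score, relative to the overall rate."""
    return {score: rate / overall_rpw for score, rate in rpw_by_score.items()}


kyrgios = situational_ratios(
    {"0-40": 0.450, "40-AD": 0.419, "30-0": 0.273},
    overall_rpw=0.315,
)

for score, ratio in kyrgios.items():
    print(score, round(ratio, 2))
```

A ratio above 1.0 means the returner does better than usual at that score; below 1.0, worse.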

Leverage

We’ve taken a big step toward explaining Kyrgios’s pattern-breaking results and his in-match inconsistency. But even game scores don’t tell the whole story. A deuce point at 5-0 usually matters a great deal more than a break point when the returner is already up a set and a break.

To account for those differences, we’ll turn to the leverage metric. (You’ll also see it referred to as “volatility” or “importance.”) Here’s the idea: Given what we know about two players, we can calculate the probability that one of them will win the match, based on the current situation. If the server wins, that probability shifts in his favor. If the returner wins, it shifts in the opposite direction. Leverage is the sum of those two shifts: the amount of win probability that is at stake at any given point.

For today’s purposes, there are no specific numbers; you need only to understand the concept. The higher the leverage, the more the point matters. Players might disagree with some of the details that a purely math-based approach spits out, but for the most part, the equations capture our intuition about which points matter, and how much.
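To make the concept concrete, here is a simplified sketch that works at the level of a single game rather than a full match. It assumes the server wins every point independently with a fixed probability p, which is a textbook simplification and not the model behind the figures in this post; the function names are mine.

```python
def p_hold(s, r, p):
    """Probability the server wins the game, given s points for the server,
    r points for the returner, and a per-point server win probability p."""
    if s >= 4 and s - r >= 2:
        return 1.0  # server has already won the game
    if r >= 4 and r - s >= 2:
        return 0.0  # returner has already broken
    if s >= 3 and r >= 3:
        # Deuce and advantage scores: closed-form probability from deuce
        deuce = p * p / (p * p + (1 - p) * (1 - p))
        if s == r:
            return deuce
        return p + (1 - p) * deuce if s > r else p * deuce
    # Otherwise, average over the two possible outcomes of the next point
    return p * p_hold(s + 1, r, p) + (1 - p) * p_hold(s, r + 1, p)


def game_leverage(s, r, p):
    """Win probability at stake on the next point: the gap between the
    server's chances if he wins the point and if he loses it."""
    return p_hold(s + 1, r, p) - p_hold(s, r + 1, p)


p = 0.65  # hypothetical server who wins 65% of service points
print(round(game_leverage(3, 0, p), 3))  # 40-0
print(round(game_leverage(2, 3, p), 3))  # 30-40, a break point
```

Even in this stripped-down version, the break point carries far more leverage than the point at 40-0. The match-level metric applies the same idea to the probability of winning the match instead of the game.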

I calculated the leverage for every point of the 2018 ATP season and split the points into ten categories, from least important (1) to most important (10). The following graph shows the tour average rate of return points won (RPW) for each of those ten categories:

If we ignore the leftmost and rightmost data points, there’s something of a trend here. From the second-to-least-important category to the second-to-most-important, players increase their return points won from about 36.0% to 37.5%. Some of that shift can be explained by a phenomenon I’ve already mentioned: returners find themselves in crucial situations (such as break points) more often against weaker servers.
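The grouping step can be sketched as follows. This assumes the ten categories are equal-sized groups of points ranked by leverage, which is my assumption for illustration, as is the input format of (leverage, returner won) pairs; the toy data is invented.

```python
def rpw_by_leverage_bin(points, n_bins=10):
    """Split points into n_bins equal-sized groups by leverage, lowest first,
    and return the share of return points won in each group.

    `points` is a list of (leverage, returner_won) pairs and must contain
    at least n_bins entries.
    """
    ranked = sorted(points, key=lambda pt: pt[0])
    n = len(ranked)
    rates = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        rates.append(sum(won for _, won in chunk) / len(chunk))
    return rates


# Toy data (hypothetical): 20 points with rising leverage,
# the returner winning every third one
toy_points = [(i / 100, i % 3 == 0) for i in range(20)]
rates = rpw_by_leverage_bin(toy_points)
print(rates)
```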

Here’s the same graph, now with a second line showing Kyrgios’s RPW in the ten categories, from least important to most important. I’ve kept the ATP average trendline for comparison:

Remember that 36.0% to 37.5% increase I mentioned a minute ago? For Kyrgios, the same shift is 27.0% to 35.2%–eight percentage points instead of less than two. It appears that the Australian is extremely sensitive to what’s at stake throughout matches, and when the rewards are high enough, he turns into a credible returner.

Some of you are probably thinking, “of course, I knew that all along.” First of all, I hate it when people say that, because what they really mean is, “I suspected that all along,” and they didn’t really know. Some of the other things such people “know” are actually wrong.

Second, I need to underline just how unusual this is. I’ve been playing around with point-by-point data for a few years now, looking for in-match patterns, for specific players and for the sport overall. Such patterns exist: points and games aren’t entirely independent of each other. But usually they are minor–a percentage point or two, not the kind of thing you could spot even in a fortnight’s worth of matches. Kyrgios breaks the mold. When it comes to the mercurial Australian, the assumptions that are adequate to account for most of professional tennis simply fail.