Forecasting Future Felix With ATP Aging Patterns

Italian translation at settesei.it

It’s been an exceptional six weeks for Felix Auger-Aliassime. He broke into the top 100 with a runner-up performance on clay in Rio de Janeiro, won two matches each at Sao Paulo and Indian Wells (including an upset of Stefanos Tsitsipas), and raced to a semi-final at the Miami Masters, the youngest player ever to make the final four of that event. Four months away from his 19th birthday, his ranking is up to 33rd in the world, and he has few points to defend until June.

Felix is the youngest man in the top 100, and he’s reaching milestones early enough to draw comparisons with some of the best young players in the sport’s history. Will he follow in the footsteps of past wunderkinds such as Rafael Nadal and Lleyton Hewitt? To answer that question, let’s take a look at typical ATP aging patterns, what they say about when players hit their peaks, and what they can show us about the fate of the best 18 year olds.

The standard curve

Last week, I looked at WTA aging curves and found that women tend to peak around age 23 or 24, an age that has not changed even as the sport has gotten older. I also discovered that there is a surprisingly modest gap–about 70 Elo points–between 18-year-old performance and a woman’s peak level. The men’s results are different.

To calculate the average ATP aging curve, I found over 700 players who were born between 1960 and 1989 and played at least 20 tour-level, tour qualifying, or challenger-level matches in each of five seasons. Overall, peak age was 25, though the difference from age 24 to 27 is only a few Elo points, so small as to be negligible.
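For readers who want to see the mechanics, here is a minimal Python sketch of one way to build such a curve. It assumes each player's year-end Elos are stored by age, measures every season as a gap to that player's own best year-end rating, and then averages those gaps by age. The function name and the toy ratings are hypothetical; this is not the code or data behind the numbers above.

```python
from collections import defaultdict


def average_aging_curve(careers):
    """Average gap to personal peak, by age.

    careers: dict mapping player name -> {age: year-end Elo}.
    Returns {age: mean of (Elo at that age - player's peak year-end Elo)},
    so 0 means "at peak" and negative values are below peak.
    """
    gaps_by_age = defaultdict(list)
    for ratings in careers.values():
        peak = max(ratings.values())
        for age, elo in ratings.items():
            gaps_by_age[age].append(elo - peak)
    return {age: sum(gaps) / len(gaps) for age, gaps in sorted(gaps_by_age.items())}


# Hypothetical toy careers, for illustration only.
toy_careers = {
    "Player A": {18: 1700, 22: 1900, 25: 2000, 28: 1950},
    "Player B": {22: 1750, 25: 1800, 28: 1780},  # no qualifying age-18 season
}

curve = average_aging_curve(toy_careers)
peak_age = max(curve, key=curve.get)  # age with the smallest average gap to peak
print(curve, peak_age)
```

Note that Player B contributes nothing at age 18, so the age-18 average rests only on players who were already good enough to qualify that young.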

As the tour has gotten older, the men’s peak age has also increased. Of the nearly 300 players born between 1980 and 1989, peak age is 26-27, with ages 28 and 29 also within 10 Elo points of the age 26-27 peak. Plenty of players are peaking at older ages, and many of those who aren’t are remaining close to their best levels into their late twenties. The peak age could be even higher still–a few of the players in the 1980-89 cohort turn 30 this year, and could conceivably still improve on their career bests.

The following graph shows the trajectory of the average player (with peak year-end Elo set to 1,850) born in the 1960s and the pattern of the average player born in the 1980s:

It’s a long ascent from the performance level at age 18 to the typical peak, especially for more recent players. There’s even a hefty bit of selection bias that should inflate the level of 18 year olds, since only about 10% of the players in the overall sample qualified for a year-end Elo rating when they were 18. The ones who did were, in general, the best of the bunch.

Felix forward

Through the Miami semi-final, Auger-Aliassime’s Elo rating is 1,848. The average player in the entire dataset who played at least 20 matches in their age-18 season went on to add another 281 Elo points to their rating between the end of their age-18 season and their peak. In the narrower, more recent cohort of 1980-89 births, the players with year-end ratings as 18 year olds improved their Elos by a whopping 369 points before reaching their peaks.

Adding either of those numbers to Felix’s current rating gives us quite the rosy forecast:

Cohort   Current  Increase  Proj. Peak  
1960-89     1848       281        2129  
1980-89     1848       369        2217
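The arithmetic behind that table is simple enough to write down. A short Python sketch, using only the figures quoted above (the variable and function names are my own, for illustration):

```python
CURRENT_ELO = 1848  # Felix's rating through the Miami semi-final

# Average gain from age-18 year-end Elo to peak year-end Elo, by birth cohort.
AVERAGE_GAIN = {"1960-89": 281, "1980-89": 369}


def project_peak(current_elo, average_gain):
    """Naive forecast: add the cohort's average age-18-to-peak gain."""
    return current_elo + average_gain


projections = {
    cohort: project_peak(CURRENT_ELO, gain) for cohort, gain in AVERAGE_GAIN.items()
}
for cohort, peak in projections.items():
    print(f"{cohort}: {CURRENT_ELO} + {AVERAGE_GAIN[cohort]} = {peak}")
```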

There’s a bit of sleight of hand in how I’m doing this, since my study uses players’ year-end ratings, and I’m using Felix’s rating in April. However, there’s no natural law that says one artificial 12-month span is better than another, and Felix’s current age of 18.6 is roughly in the middle of the ages of the year-end 18-year-olds with whom I’m comparing him.

An Elo rating of 2,129 would be good enough for fourth place on the current list, behind only the big three. The rating of 2,217 is better than any of the big three can boast at the moment, and would be the fourth-best peak year-end rating among active players, again trailing only the big three. (And Andy Murray, if you consider him active.) Only 15 Open era players have managed year-end Elo peaks above 2,217.

No comparisons

It’s tough to say whether this method of finding the typical difference between 18-year-old and peak Elo ratings is adequate to handle the extremes. Some players peak earlier than average, and it stands to reason that the best young talents are more likely to do so. Boris Becker posted a whopping 2,212 Elo rating at the end of his age-18 season, which didn’t leave much room for improvement. He gained another 90 points before the end of his age-19 season, which was his career best.

Becker’s career path is not particularly helpful to our effort to forecast Felix’s, in part because the German was so unique, and also because his experience reflects such a different era. But even among less unique players, there are few useful comparables. No one born since 1987 managed a better age-18 Elo rating than Felix’s 1,848, and only a handful of active or recently-retired players even reached 1,750 by that age.

Lacking the data for a more precise approach, let’s repeat what I did for Bianca Andreescu last week, and see how the nearest 18-year-old comparisons fared. Of the players whose age-18 year-end Elos were closest to Felix’s 1,848, here are the 10 above him and the 10 below him on the list:

Player               BirthYr  18yo Elo  Incr  Peak Elo  
Stefan Edberg           1966      1916   350      2266  
John McEnroe            1959      1912   496      2408  
Guillermo Coria         1982      1909   145      2055  
Pat Cash                1965      1907   151      2058  
G. Perez Roldan         1969      1884    41      1925  
Andy Murray             1987      1878   465      2343  
Roger Federer           1981      1871   487      2359  
Thomas Enqvist          1974      1865   216      2081  
Rafael Nadal            1986      1862   452      2314  
Jim Courier             1970      1849   283      2132  
…                                                       
Jimmy Brown             1965      1834     0      1834  
Andy Roddick            1982      1815   291      2106  
Aaron Krickstein        1967      1812   246      2058  
Yannick Noah            1960      1812   299      2112  
Fabrice Santoro         1972      1805    85      1890  
Andreas Vinciguerra     1981      1803    16      1819  
Novak Djokovic          1987      1792   645      2436  
Sergi Bruguera          1971      1790   265      2055  
Thomas Muster           1967      1788   329      2117  
Dominik Hrbaty          1978      1779   133      1913

The average increase among this group is 270 Elo points, close to the overall average for players who qualified for a year-end Elo rating at age 18. The youngest members of this list are encouraging: the big four, Andy Roddick, Guillermo Coria, and Andreas Vinciguerra. Most promising youngsters would happily take a four-in-seven shot at having a career at the level of the big four.
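The group average can be checked directly from the table. The Python sketch below copies the 20 rows as printed and computes the mean increase; it also confirms that each listed peak matches the age-18 rating plus the increase to within a point, which I assume is rounding in the published figures. The variable names are illustrative.

```python
# (player, birth year, age-18 Elo, increase, peak Elo), copied from the table above.
COMPARABLES = [
    ("Stefan Edberg", 1966, 1916, 350, 2266),
    ("John McEnroe", 1959, 1912, 496, 2408),
    ("Guillermo Coria", 1982, 1909, 145, 2055),
    ("Pat Cash", 1965, 1907, 151, 2058),
    ("G. Perez Roldan", 1969, 1884, 41, 1925),
    ("Andy Murray", 1987, 1878, 465, 2343),
    ("Roger Federer", 1981, 1871, 487, 2359),
    ("Thomas Enqvist", 1974, 1865, 216, 2081),
    ("Rafael Nadal", 1986, 1862, 452, 2314),
    ("Jim Courier", 1970, 1849, 283, 2132),
    ("Jimmy Brown", 1965, 1834, 0, 1834),
    ("Andy Roddick", 1982, 1815, 291, 2106),
    ("Aaron Krickstein", 1967, 1812, 246, 2058),
    ("Yannick Noah", 1960, 1812, 299, 2112),
    ("Fabrice Santoro", 1972, 1805, 85, 1890),
    ("Andreas Vinciguerra", 1981, 1803, 16, 1819),
    ("Novak Djokovic", 1987, 1792, 645, 2436),
    ("Sergi Bruguera", 1971, 1790, 265, 2055),
    ("Thomas Muster", 1967, 1788, 329, 2117),
    ("Dominik Hrbaty", 1978, 1779, 133, 1913),
]

increases = [increase for _, _, _, increase, _ in COMPARABLES]
average_increase = sum(increases) / len(increases)

# Largest mismatch between the listed peak and (age-18 Elo + increase).
max_rounding_gap = max(
    abs(peak - (age18 + increase)) for _, _, age18, increase, peak in COMPARABLES
)

print(f"average increase: {average_increase:.2f}")
print(f"largest rounding gap: {max_rounding_gap}")
```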

Perhaps the best comparison for Felix is a player who didn’t quite make that list, Alexander Zverev. The 21-year-old German posted a year-end Elo of 1,768 as an 18 year old, and had already boosted that number by more than 300 points by the end of his 2018 campaign. Zverev is only an approximate comparison, he’s just a single data point, and we don’t know where he’ll end up, but his experience is a decade more recent than those of Novak Djokovic, Murray, and Nadal.

Forecasting the career performance of young tennis players is an inexact science, at best. Potential outcomes for Auger-Aliassime range from teenage flameout to double-digit major winner. Based on the limited information he’s given us so far, the latter seems within reach. What we know for sure is that he’s playing better tennis than any 18 year old we’ve seen in a decade. If that’s not reason for optimism, I don’t know what is.

Podcast Episode 55: Miami Titles for Barty and Federer

Episode 55 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, discusses the newly-minted Miami champions. We start with Ashleigh Barty, possibly now the best hard-court player in the game, and mull over how her throwback style will translate to other surfaces this season. The second half is for Roger Federer, who breezed through a weaker-than-usual draw, but did so in particularly dominating style.

We also consider whether Karolina Pliskova has peaked, if Simona Halep will regain the No. 1 ranking, and the future for the pair of Canadian ATP prospects, Denis Shapovalov and Felix Auger-Aliassime. We wrap up with some thoughts about the gap on this week’s ATP calendar where Davis Cup used to be.

Thanks for listening!

(Note: this week’s episode is about 65 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Around the Net, Issue 7

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Multimedia

Data

Trivia

  • Roger Federer could finally become the first ATP player to win multiple titles this season, but the WTA remains a tour of unique winners. In Miami, Ashleigh Barty became the 14th champion in 14 tour-level events.
  • To reach the final, Federer needed to beat someone more than 15 years his junior. In fact, both Miami semi-finals involved age gaps of at least one and a half decades. That hasn’t happened at an ATP event since 1979. The closest since then was in Dubai last month, when Fed-Coric and Monfils-Tsitsipas were both at least 11.9 year gaps.
  • Speaking of unusual semi-finals… The Bryans beat Kubot/Melo by a score of 7-6(7) 6-7(8) [14‑12], just about as long as a match can be within the constraints of the modern doubles format of no-ad with a third-set super-tiebreak. It lasted 187 points. While match stats are hard to come by for doubles, I do have a reasonably complete set for tour-level doubles since 2017. In that span, 187 points is the longest match under these rules. There was one other 187-pointer in 2018 and a 186-point marathon in 2017.
  • Thanks in part to his run in Miami, Felix Auger-Aliassime won his first five career matches against top 20 players, something that’s never been done before. Mario Ancic won his first three; Felix is the only guy with more. After the semi-final loss to Isner, FAA falls to 5-1, but still has a chance to set more records. No one has won more than 7 of their first 10 matches against the top 20, a feat accomplished by Gustavo Kuerten and Andrei Medvedev.

Beyond the net

Thanks to Peter for help with this week’s issue.

WTA Aging Patterns and Bianca Andreescu’s Future

Italian translation at settesei.it

Bianca Andreescu is really good, right now. Still a few months away from her 19th birthday, she has collected her first Premier Mandatory title, beaten a few top-ten players (including Angelique Kerber twice), and climbed to 7th in the Elo ratings. She is the only teenager in the WTA top 30 and one of only five in the top 100.

The burning question about Andreescu isn’t how good she is, it’s how good she could become. It’s easy to look at the best 18-year-old in the game and imagine her becoming the best 19-year-old, best 20-year-old, and so on, until she’s at her peak age and she’s the best player in the world, period. As the sport in general has gotten older, teenage champions have become rarer, so she seems all the more destined for success. But it isn’t that simple: Prospects get injured, opponents learn how to beat them, they peak early and fizzle out. Tennis history is littered with teen starlets who failed to reach their potential.

Building an aging curve

Let’s start with the basics. What is the trajectory of the typical WTA career? Answering that question requires a whole slew of assumptions, so keep in mind that this is approximate. I found every player born between 1960 and 1989* who played at least five full** seasons, a total of about 500 players. For each one, I calculated her year-end Elo for every full season she played, as well as the difference between that year’s Elo and her peak year-end Elo.

* I wish we knew more about players born in the 1990s, since their experience is most relevant to today’s teens, but many of them have yet to reach their peaks, whenever that will be.

** I’ve defined a full season very broadly, as 20 or more completed matches at the ITF $50K level or higher.
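The two footnoted rules amount to a filter on the player pool. As a rough Python sketch, assuming each player record carries a birth year and a count of completed qualifying-level matches per season (the `Player` class, field names, and sample records are all hypothetical):

```python
from dataclasses import dataclass

MIN_MATCHES_PER_SEASON = 20  # "full season" threshold from the footnote
MIN_FULL_SEASONS = 5
BIRTH_YEARS = range(1960, 1990)  # born 1960 through 1989


@dataclass
class Player:
    name: str
    birth_year: int
    matches_by_season: dict  # season year -> completed matches at qualifying levels


def full_seasons(player):
    """Seasons with enough completed matches to count as 'full'."""
    return [
        year
        for year, matches in player.matches_by_season.items()
        if matches >= MIN_MATCHES_PER_SEASON
    ]


def qualifies(player):
    """True if the player belongs in the aging-curve sample."""
    return (
        player.birth_year in BIRTH_YEARS
        and len(full_seasons(player)) >= MIN_FULL_SEASONS
    )


# Hypothetical records, for illustration only.
players = [
    Player("Qualifier", 1975, {1994: 25, 1995: 30, 1996: 22, 1997: 40, 1998: 20}),
    Player("Too few seasons", 1980, {2000: 30, 2001: 25, 2002: 19, 2003: 28, 2004: 21}),
    Player("Born too late", 1992, {2011: 30, 2012: 30, 2013: 30, 2014: 30, 2015: 30}),
]
eligible = [p.name for p in players if qualifies(p)]
print(eligible)
```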

For every player, then, we have an idea of how they aged. To get our bearings, let’s look at two players with unusual aging trajectories, Martina Navratilova and Venus Williams:

(Martina’s peak was about 50 Elo points higher than Venus’s, but I set them equal to each other for the purpose of this graph.)

Venus peaked at age 21 and had her last all-time-great-level season at 23, while Martina’s peak came at age 30. There’s more than one way to amass a Hall of Fame career, and it’s important to keep in mind that “average” aging patterns hide a lot of more extreme possibilities.

The usual route

When we take Venus’s and Martina’s trajectories and average them with the other 500-or-so players in our dataset, here’s what we get:

The most common peak age is 24, with 23 a very close second. In the above graph, I set peak Elo at 1,820, the average peak Elo of the players I looked at, but the absolute number isn’t important. The typical player who completes a full season at age 18 is about 70 Elo points away from her peak. There isn’t much downward movement in the 20s; at age 30, those players who are still active are only 43 Elo points below their peak.

There’s a poison pill in that last sentence that is difficult to avoid when analyzing aging patterns–we only know what happens to those players who are still active. That’s even more troublesome for young players. Venus, for instance, improved 211 Elo points between her year-end finish as an 18-year-old and her best year-end rating. Kerber, on the other hand, wasn’t even good enough to show up in the ratings until she was 19. If we were able to estimate Kerber’s level at that age, it would probably be very low. Thus, forecasting an 18-year-old using this dataset may understate the degree to which a player can improve.

Changing times

Using the numbers above, we can make a baseline estimate. Those players who had year-end Elo ratings as 18-year-olds typically improved about 70 more points before hitting their peak. Through her Indian Wells title, Andreescu is rated at 2,017, giving us an estimated peak of 2,087. That’s good enough for 2nd place on the current list and just inside the top 50 of all time (as measured by the player’s best year-end Elo). Still, that seems a bit modest–it doesn’t represent much of an additional improvement for a player who has come so far in just a few months.

The forecast is slightly more optimistic if we narrow our view to players born in the 1980s. It seems like a reasonable thing to do, because Andreescu is facing an era with older competition, more like the last decade than, say, the one faced by players born in the 1960s. Our dataset shrinks to about 200 players, and those players do show a bigger gap between their 18-year-old Elo rating and their career peak. The difference is about 83 points, giving Bianca a revised estimated peak of 2,100–exactly even with Simona Halep, who currently tops the list, and around the 40th best of all time.

The biggest difference between the overall aging curve and the curve for players born in the 1980s isn’t the timing of the peak, it’s the duration. I looked at several age cohorts, and the typical WTA peak is always at 23 or 24 years old. But there’s more to it than that. Take a look at the trajectory of players born in the 1960s compared to those born in the 1980s:

For the more recent generation of players, there is little difference between age 23 and 28 or 29. Even into the early 30s, those players who stick around are competing almost as well as they did at their peak.

Bespoke for Bianca

Aging patterns in women’s tennis have changed, so it’s important to look at a relevant era when there’s enough data to do so. But what if that’s not the best way to narrow our view? As I’ve noted, the average peak Elo of the 500 players in our dataset is 1,820. Bianca is already about 200 points higher than that. What if the best players are qualitatively different as well as quantitatively superior?

Here are 20 players whose year-end Elos at age 18 were similar to Andreescu’s current rating, the ten closest who were higher and the ten closest who were lower:

Player                     Birth Year  18yo Elo  Peak Elo  
Jelena Dokic                     1983      2110      2110  
Conchita Martinez                1972      2085      2191  
Arantxa Sanchez Vicario          1971      2084      2314  
Hana Mandlikova                  1962      2071      2160  
Iva Majoli                       1977      2067      2067  
Belinda Bencic                   1997      2066      2066  
Caroline Wozniacki               1990      2059      2194  
Lindsay Davenport                1976      2053      2353  
Nicole Vaidisova                 1989      2043      2121  
Manuela Maleeva Fragniere        1967      2035      2059  
---                                                        
Mary Pierce                      1975      2008      2161  
Ana Ivanovic                     1987      1994      2133  
Victoria Azarenka                1989      1986      2270  
Anke Huber                       1974      1980      2072  
Magdalena Maleeva                1975      1961      2024  
Agnieszka Radwanska              1989      1957      2116  
Mary Joe Fernandez               1971      1955      2110  
Anna Kournikova                  1981      1954      2020  
Kathy Rinaldi Stunkel            1967      1947      1947  
Justine Henin                    1982      1946      2411

Both halves of the list include some of the greatest of all time: Arantxa Sanchez Vicario, Lindsay Davenport, Victoria Azarenka, and Justine Henin. Yet several of these players failed to build on their early-career peaks, such as Jelena Dokic and (so far, at least) Belinda Bencic.

The average 18-year-old year-end Elo of these 20 players is 2,018, virtually the same as Andreescu’s post-Indian Wells level. The average peak year-end Elo of these 20 players is 2,145, a 127-point improvement and a more optimistic forecast than anything we’ve seen so far. That rating would put her a tick above Ana Ivanovic at her best, a bit below Hana Mandlikova at hers, and just inside the 30 greatest of all time.
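Since the table lists only the age-18 and peak ratings, a few lines of Python make the per-player gains explicit. This sketch copies the 20 rows as printed, averages both columns, and counts the players whose peak year-end Elo equals their age-18 mark. The names are illustrative, and the rounding is mine.

```python
# (player, age-18 year-end Elo, peak year-end Elo), copied from the table above.
WTA_COMPARABLES = [
    ("Jelena Dokic", 2110, 2110),
    ("Conchita Martinez", 2085, 2191),
    ("Arantxa Sanchez Vicario", 2084, 2314),
    ("Hana Mandlikova", 2071, 2160),
    ("Iva Majoli", 2067, 2067),
    ("Belinda Bencic", 2066, 2066),
    ("Caroline Wozniacki", 2059, 2194),
    ("Lindsay Davenport", 2053, 2353),
    ("Nicole Vaidisova", 2043, 2121),
    ("Manuela Maleeva Fragniere", 2035, 2059),
    ("Mary Pierce", 2008, 2161),
    ("Ana Ivanovic", 1994, 2133),
    ("Victoria Azarenka", 1986, 2270),
    ("Anke Huber", 1980, 2072),
    ("Magdalena Maleeva", 1961, 2024),
    ("Agnieszka Radwanska", 1957, 2116),
    ("Mary Joe Fernandez", 1955, 2110),
    ("Anna Kournikova", 1954, 2020),
    ("Kathy Rinaldi Stunkel", 1947, 1947),
    ("Justine Henin", 1982 - 36, 2411),  # age-18 Elo of 1946, written out below
]
# Keep the Henin row literal and readable: her age-18 Elo in the table is 1946.
WTA_COMPARABLES[-1] = ("Justine Henin", 1946, 2411)

n = len(WTA_COMPARABLES)
average_age18 = sum(age18 for _, age18, _ in WTA_COMPARABLES) / n
average_peak = sum(peak for _, _, peak in WTA_COMPARABLES) / n
average_gain = average_peak - average_age18

# Players whose best year-end rating was the one they posted at 18.
never_improved = [name for name, age18, peak in WTA_COMPARABLES if peak == age18]

print(round(average_age18), round(average_peak), round(average_gain))
print(never_improved)
```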

This is heady stuff for a teenager, but after watching her ascent this year, it’s tough to bet against her. And as long as Kerber is in the draw, apparently, we can expect Andreescu to keep winning.

Podcast Episode 54: Miami At the Half-Way Point

Episode 54 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, catches up on the Miami early rounds, beginning with some rocky starts for Roger Federer and Novak Djokovic. We also look at the slew of early upsets and the threats who remain in the draw. We talk about how to evaluate Bianca Andreescu’s feats at her young age, and the wacky wild cards that always find their way into the Miami main draw.

Thanks for listening!

(Note: this week’s episode is about 64 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 6

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Multimedia

Data

Trivia

  • We’re still waiting for our first multiple-title winner of the 2019 season. On the ATP side, that’s 19 champions in 19 events, a new record.
  • The player to break that streak will not be Dominic Thiem, who lost his first match in Miami against Hubert Hurkacz. Thiem is the first Indian Wells titlist to fail to win a Miami match since 2010, when Ivan Ljubicic lost to Benjamin Becker. It’s not bad company for Thiem, though, as the other three IW champions to lose their first match in Miami are Novak Djokovic, Lleyton Hewitt, and Alex Corretja.
  • Conceivably, the man who breaks the unique-titlist streak could be Reilly Opelka, who beat Diego Schwartzman despite being out-aced by El Peque in the first set. Opelka didn’t record a single ace in the first set, and it was only his second tour-level match in which less than 10% of his service points went for aces. (The other was his 2017 first-round encounter with Tommy Haas in Houston, and his career rate is 22.3%.)
  • Kei Nishikori is king of deciding sets no more. After dropping a third set to Dusan Lajovic in his first outing in Miami, he loses the top spot on the deciding-set winning percentage leaderboard, to Djokovic.
  • Yesterday, Naomi Osaka won the first set against Su-Wei Hsieh, but Hsieh came back to win the match. It’s the first time since 2016 that Osaka failed to convert a one-set advantage, a streak I wrote about a couple of months ago. She fell only 156 matches short of Chris Evert’s record.

Beyond the net

Thanks to Peter for help with this week’s issue.

Podcast Episode 53: Indian Wells in Review

Episode 53 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, focuses on the breakthrough hard-court title for Dominic Thiem, who claimed his first non-clay Masters trophy yesterday in Indian Wells. We talk a bit about his tactics, his draw, and–we just can’t help it, apparently–whether we’re in a weak era.

On the women’s side, we use the shock victory of 18-year-old wild card Bianca Andreescu to consider the strength of her generation, with names like Osaka, Sabalenka, and now Andreescu poised to sweep away their elders. Yet still in the mix, and representing our last topic, are Serena, Venus, and Vika, who remain a threat against just about anyone.

Thanks for listening!

(Note: this week’s episode is about 60 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 5

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Multimedia

Data

  • Match Charting Project: The dataset has grown by 70 matches in the last week, from 5,256 to 5,326. We’ve added a slew of men’s and women’s matches from Indian Wells, several more 1990s Wimbledon semi-finals, and best of all, three long-sought Roland Garros women’s finals. We have now charted all men’s and women’s French Open finals back to 1980.

Trivia

  • Belinda Bencic continues to rack up top-ten wins, and despite her semi-final loss to Angelique Kerber on Friday, her record against the top ten is above .500, at 19-16. That’s something that few of her peers can claim, even many players we consider to be elites.
  • Sara Errani hit a whopping 57 double faults in her last four matches, including 22 in the Guadalajara first round against Irina Camelia Begu. And she won! 57 double faults is more than she hit in the entire 2017 or 2018 seasons.
  • The next generation of WTA teens is coming fast: 16-year-old Clara Tauson won this week’s ITF Shenzhen $60K title, and 15-year-old Dasha Lopatetskaya won her fifth pro title. At least two more teens won ITF titles this week, with three more playing finals today.
  • With his defeat of Novak Djokovic, Philipp Kohlschreiber became the 4th-oldest player to beat an ATP No. 1.
  • Ivo Karlovic turned 40 three weeks ago, and celebrated by winning three matches at Indian Wells, the first time since 2011 (also at Indian Wells) that he won three or more matches at a Masters event.

Thanks to Peter for help with this week’s issue.

Podcast Episode 52: The Unpredictable WTA of Osaka, Stephens, and Sasnovich

Episode 52 of the Tennis Abstract Podcast features guest co-host Jeff McFarland, the man behind Hidden Game of Tennis. We start with Jeff M’s origin story as a baseball analyst shifting to tennis, and then dive into a slew of WTA topics as we enter the second week of Indian Wells.

We start with Aliaksandra Sasnovich’s unusual proclivity for losing 6-0 sets and the serve-return balance in women’s tennis that results in more lopsided set scores. Next, we consider Sloane Stephens’s latest early-round ouster, dismissing some of the theories that are often thrown around after such upsets. We also talk Osaka, and finish up with a speed round on Serena, Vika, and Danielle Collins.

Thanks for listening!

(Note: this week’s episode is about 70 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Around the Net, Issue 4

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles

Multimedia

Data

  • Match Charting Project: The dataset has grown by more than 60 matches in the last week, from 5,194 to 5,256. We completed a run of Indian Wells women’s finals back to 2004, along with 1999 and 2000. We also added all of last week’s finals, Kyrgios’s last four matches in Acapulco, and another handful of Pete Sampras’s grand slam semi-finals.
  • MCP Most Wanted Video: We’re really close to completing some noteworthy subsets, but we’re missing video for several key matches. Please help!

Trivia

Beyond the Net

Dating a Kardashian had a statistically significant (at the 10% level) negative effect on athletic performance.

Thanks to Peter for help with this week’s issue.