Elena Rybakina and the Value of Average

Also today: Ugo Humbert in the (Elo) top ten; South American Davis Cup hard courts

Elena Rybakina at the 2023 US Open. Credit: Hameltion

Never underestimate average. Establishing oneself on the top level of the pro tennis circuit is extraordinarily difficult; proving that any particular skill is average among one’s tour-level peers is even harder. Most players are better than the norm in some categories, worse in others. Anyone who can beat the middle of the pack in every department is virtually guaranteed to be a superstar.

Average is Elena Rybakina’s secret weapon. You probably didn’t know she needed one, because she has a very effective, very evident non-secret weapon: an unreadable bullet of a first serve. In the last year, over 43% of her first serves have gone unreturned. No one else on tour comes within three percentage points of that, and only five other women top 35%. On a good day, the serve can put a match out of reach nearly on its own. When she faced Aryna Sabalenka in Beijing last fall, 65% of her first serves didn’t come back. Most women barely manage to win that many first serve points, let alone decide them with one stroke.

I’ll come back to the serve in a moment, because it is so remarkable, and it would be strange to talk about Rybakina without discussing it. But what makes her a contender every week–not to mention a champion in Abu Dhabi yesterday–is the way that the rest of her game doesn’t hold her back. Among the other women who end points with more than 35% of their first serves, you’ll find a long list of weaknesses. Qinwen Zheng doesn’t put nearly enough of them in the box. Donna Vekic and Caroline Garcia struggle to break serve. Liudmila Samsonova doesn’t break much, either, and her mistakes come in excruciating, match-endangering bunches.

Lopsided player profiles make sense. Only a few people have the combination of natural gifts and discipline to develop a dominant serve. Tennis skills are correlated, but not perfectly so. Someone who serves like Vekic can often learn good-enough groundstrokes and secondary shots. But players with one standout skill are unlikely to be solid across the board. Just because someone is top ten in the world in one category, why would we expect them to rank in the top 100 by a different measure?

Rybakina has reached the top–or close, anyway–by coupling a world-class serve with a set of skills that lacks defects. (You can nitpick her footwork or technique, but none of that holds her back when it comes to winning enough points.) After we review the devastation wrought by her serve, we’ll see just how average she otherwise is, and why that wins her so many matches.

First serves first

I’ve already given you the headline number: Since this time last year, 43.4% of Rybakina’s first serves haven’t come back. That’s one percentage point better than Serena Williams’s career rate. Serena’s numbers are based on matches logged by the Match Charting Project, a non-random sample skewed toward high-profile contests against strong opponents, so I’m not ready to say outright that Rybakina is serving better than Serena. But I’m not not saying that–we’re within the margin of error.

Some back-of-the-envelope math shows what kind of gains a player can reap from the best first serve in the game. Rybakina makes about 60% of her first serves–lower than average, but probably worth the trade-off. (And improving–we’ll talk about that in a bit.) When the serve does come back, she wins about half of points, roughly typical for tour players. All told, 43% of her serve points are first-serve points won. Tack on about half of her second serve points–she wins 48% of those, better than average but not by a wide margin–and we end up with her win rate of 62.5% of serve points–fourth-best on tour.

Put another way: We combine one world-class number (unreturned first serves) with a below-average figure (first serves in), one average number (success rate when the serve comes back), and one more that is slightly better than average (second-serve points won). The result is an overall success rate that trails only those of Iga Swiatek, Sabalenka, and Garcia. That, in case you ever doubted the value of an untouchable first serve, is the impact of one very good number.
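For readers who want to rerun the back-of-the-envelope math, here is a minimal Python sketch. The function name and structure are illustrative, not taken from any Tennis Abstract code, and the inputs are the rounded figures quoted above (60% first serves in, 43.4% unreturned, roughly half of returned first serves won, 48% of second-serve points won).

```python
def serve_breakdown(first_in, unreturned, win_when_returned, second_won):
    """Split serve points won into first-serve and second-serve shares.

    All arguments are rates between 0 and 1. This is a simplified model:
    it assumes every serve point is either a first-serve point or a
    second-serve point, with double faults folded into second_won.
    """
    # A first-serve point is won outright if the serve is unreturned;
    # otherwise it is won at the "serve came back" rate.
    first_won = unreturned + (1 - unreturned) * win_when_returned
    first_share = first_in * first_won            # share of all serve points
    second_share = (1 - first_in) * second_won    # share of all serve points
    return {
        "first_won": first_won,
        "first_share": first_share,
        "second_share": second_share,
        "total": first_share + second_share,
    }


# Rounded inputs from the text (illustrative only).
rybakina = serve_breakdown(
    first_in=0.60, unreturned=0.434, win_when_returned=0.50, second_won=0.48
)
for name, value in rybakina.items():
    print(f"{name}: {value:.1%}")
```

With these rounded inputs, first-serve points won come out to about 43% of all serve points, matching the figure above, and the total lands near 62%. That is a shade under the quoted 62.5%, which is the kind of gap you'd expect from rounding "about 60%" and "about half."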

The key to Rybakina’s first serve–apart from blinding speed–is its unreadability. She must lead the tour in fewest returner steps per ace, a stat I dreamed up while watching the Abu Dhabi semi-final on Saturday. Samsonova seemed to stand bolted to the ground, watching one serve after another dart past her. After one business-as-usual ace out wide, Samsonova even offered a little racket-clap of appreciation, an unusual gesture for such a routine occurrence.

In addition to the deceptiveness of a nearly identical toss and service motion, Rybakina is effective in every direction. There’s no way for an opponent to cheat to one side, hoping to get an edge on a delivery in that corner of the box. Here are Elena’s rates of unreturned first serves and total points won in each corner of the two service boxes:

Direction   Unret%  Won%  
Deuce-Wide     36%   69%  
Deuce-T        45%   75%  
Ad-T           37%   70%  
Ad-Wide        42%   74%

The average player ends points with their first serve between 20% and 25% of the time and wins 60% of their first serve points. Rybakina obliterates those numbers in every direction. If there’s a strategy to be exploited, it’s that returners ought to lean toward their forehand, because if the serve comes to their backhand, they don’t have a chance anyway.

The scariest thing for the rest of the tour is that the 24-year-old’s biggest weapon may be getting even bigger. Her 43.4% rate of unreturned first serves in the last 52 weeks compares favorably to a career clip of 38.2%. Against Samsonova on Saturday, over 41% of all serves didn’t come back, better than Rybakina managed in any of their four previous meetings.

She may be getting savvier, too. One of the dangers of a game built around a single weapon is that certain players might be able to neutralize it. Daria Kasatkina, Elena’s opponent in yesterday’s final, is just such an opponent, a resourceful defender and a first-class mover. When the two women played a three-and-a-half-hour epic in Montreal last summer, Kasatkina put three-quarters of first serves back in play, something that few women on tour could manage and one of the main reasons the match stretched so long. Rybakina survived, but she was broken ten times.

Yesterday, Kasatkina was as pesky as ever, getting almost as many balls back as she did in Montreal. But Rybakina took fewer chances with her first strike, perhaps as much to counter the wind as to adjust for her opponent. Whatever the reason, Elena made three-quarters of her first serves. She had never landed more than 61% against Kasatkina.

The Abu Dhabi final was an exaggerated example of a longer-term trend. Somehow, Rybakina is making way more first serves than ever before, sacrificing no aces and only a fraction of first-serve points won. The overall results speak for themselves:

Year    1stIn%  1st W%   Ace%   SPW%  
2024     66.8%   70.9%  10.3%  64.8%  
2023     56.8%   73.6%  10.5%  62.8%  
Career   57.8%   71.1%   8.4%  62.0%

It’s not a perfect comparison, because the entire 2024 season so far has been on hard courts. Her season stats will probably come down. But a ten-percentage-point increase in first serves in? Nobody does that. Kasatkina won just five games yesterday, and she won’t be the last opponent to discover that whatever edge she once had against Rybakina is gone.

Average ballast

As Ivo Karlovic can tell you, the best service in the world can take you only so far. Some first serves will go astray, some serves will come back, and then there’s the whole return game to contend with. Women’s tennis rarely features characters quite as one-sided as Ivo, but Vekic and Garcia illustrate the point, struggling to string together victories because their serves alone are not enough.

Here’s a quick overview of how the rest of Rybakina’s game stacks up against the average top-50 player over the last 52 weeks:

Stat     Top-50  Elena  
2nd W%    46.7%  48.4%  
DF%        5.2%   3.9%  
RPW       44.4%  44.2%  
Break%    35.5%  36.9%  
BPConv%   46.6%  43.5%

She’s somewhat better than average behind her second serve, as you’d expect from someone with such a dominant first serve. It’s aided by fewer double faults than the norm. On return, we have two separate stories. Taking all return points as a whole, Rybakina is almost exactly average, matching the likes of Barbora Krejcikova and Marta Kostyuk. The only category where she trails the majority of the pack is in break point conversions–and by extension, breaks of serve.

The discrepancy between Rybakina’s results on break points and on return points in general may just be a temporary blip. Most players win break points at a higher rate than return points overall, because break points are more likely to arise against weaker servers. That hasn’t been the case for Elena in the last 52 weeks, and it wasn’t in 2022, either, when she won 41.9% of return points but converted only 40.5% of break opportunities.

Match Charting Project data indicates that she is slightly more effective returning in the deuce court than the ad court; since most break points are in the ad court, that could explain a bit of the gap. Charting data also suggests she is a bit more conservative on break point, scoring fewer winners and forced errors than her normal rate, though not fewer than the typical tour player. It may be that Rybakina will always modestly underperform on break opportunities, but it would be unusual for a player to sustain such a large gap.

In any case, she hasn’t struggled in that department in 2024. In 13 matches, she has won 46.9% of return points overall and 47.3% of break points. It’s dangerous to extrapolate too much from a small sample, especially on her preferred surface, but it may be that Rybakina’s single weak point is already back to the top-50 norm of her overall return performance.

The value of all this average is this: What Rybakina takes with her first serve, she doesn’t give back with the rest of her game. We’ve already seen how a standout rate of unreturned first serves–plus a bunch of average-level support from her second serve and ground game–translates into elite overall results on serve. A tour-average return game generates about four breaks per match. Elena has been closer to 3.5, but either way, that’s more than enough when coupled with such a steady performance on the other side of the ball.

I can’t help but think of Rybakina’s “other” skills as analogous to the supporting cast in team sports. Her first serve is an all-star quarterback or big-hitting shortstop; the rest of her game is equivalent to the roster around them. In baseball, a league-average player is worth eight figures a year. Though Elena’s return, for instance, doesn’t cash in to quite the same degree, it is critical in the same way. A superstar baseball player can easily end up on a losing team, just as Caroline Garcia can drop out of the top 50 despite her serve. Rybakina is at no risk of that.

A final striking attribute of Rybakina’s game is that her array of tour-average skills can neutralize such a range of opponents. Her weekend in Abu Dhabi was a perfect illustration, as she overcame Samsonova and Kasatkina, two very different opponents, each of whom has bedeviled her in the past. Elena is more aggressive than the average player, but she is considerably more careful than Samsonova; her Rally Aggression Score is equivalent to Swiatek’s. She was able to take advantage of the Russian’s rough patches without losing her own rhythm or coughing up too many errors of her own.

Against Kasatkina, she posted the most unexpected “average” stat of all. In a matchup of power against defense, defense should improve its odds as the rallies get longer. On Sunday, the two women played 15 points of ten strokes or more, and Rybakina won 8 of them. In her career, Elena has won 52% of those points–probably more by wearing down opponents with down-the-middle howitzers than any kind of clever point construction, but effective regardless of the means.

Rybakina won’t beat you at your own game. But she’ll play it pretty well. Combined with the best first serve in women’s tennis, drawing even on the rest is a near-guarantee of victory. Abu Dhabi marked her seventh tour-level title, and it will be far from her last.

* * *

Ugo Humbert, Elo top-tenner

You probably don’t think of Ugo Humbert as a top-ten player, if you think of him at all. The 25-year-old left-hander cracked the ATP top 20 only a few months ago, and his title last week in Marseille gave him a modest boost to #18.

Elo is much more positive about the Frenchman. Today’s new Elo rankings place him 9th overall, just behind Hubert Hurkacz, the man he defeated to reach the Marseille final. Humbert has always been dangerous against the best, with a 22-25 career record facing the top 20, and a 10-12 mark against the top ten.

Humbert’s place in the Elo top ten might feel like a fluke; there’s a tightly-packed group between Hurkacz at #8 and Holger Rune at #13, and an early loss in Rotterdam could knock the Frenchman back out of the club. But historically, if a player reaches the Elo top ten, a spot in the official ATP top ten is likely in the offing.

I wrote about this relationship back in 2018, after Daniil Medvedev won in Tokyo. As his ATP ranking rose to #22, he leapt to #8 on the Elo list. In retrospect, it’s odd to think that “Daniil Medvedev will one day crack the top ten” was a big call, and it wasn’t that far-fetched: Plenty of people would’ve concurred with Elo on that one. He made it, of course, officially joining the elite the following July.

In that post, I called Elo a “leading indicator,” since most players reach the Elo top ten before the ATP computer renders the same judgment. This makes sense: Elo attempts to measure a player’s level right now, while the ATP formula generates an average of performances over the last 52 weeks. That’s a better estimate of how the player was doing six months ago. Indeed, for those players who cracked both top tens, Elo got there, on average, 32 weeks sooner. In Medvedev’s case, it was 40 weeks.
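To see why an Elo-style rating reacts faster than a 52-week points total, it helps to look at the update rule. The sketch below is the textbook Elo formula, not necessarily the exact variant behind the rankings discussed here; the K-factor of 32 and the ratings in the example are assumptions chosen purely for illustration.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo win expectancy for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, a_won, k=32.0):
    """Return both players' new ratings after one match.

    k is a hypothetical constant; real tennis Elo systems often vary it
    (for example, by the number of matches a player has on record).
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta


# Made-up ratings: an underdog beats a favorite rated 200 points higher.
underdog, favorite = update(1900.0, 2100.0, a_won=True)
print(round(underdog, 1), round(favorite, 1))
```

Every result moves the rating immediately, and an upset moves it more than a routine win. A ranking built on points accumulated over 52 weeks has no equivalent mechanism, so it tends to lag a player whose level is changing.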

Most importantly for Humbert, Elo is almost always right. In October 2018, I identified just 19 players who had reached the Elo top ten but not the ATP top ten. Three of those–Medvedev, Stefanos Tsitsipas, and Roberto Bautista Agut–have since taken themselves off the list. One more has come along in the meantime: Sebastian Korda joined the Elo top ten in early 2023, but his ATP points total has yet to merit the same ranking.

Most of the Elo-but-not-ATP top-tenners had very brief stays among the Elo elite: Robby Ginepri qualified for just one week. The only exception is Nick Kyrgios, who spent more than a year in the Elo top ten, thanks to his handful of victories over the best players in the game. His upsets earned him plenty of notoriety, but his inability to consistently beat the rest of the field kept his points total deflated.

Humbert, in his much quieter way, fits the same profile. His serve means that he can keep things close against higher-ranked players, but he has struggled to string together enough routine wins to earn more of those chances. (Injuries haven’t helped.) Still, the odds are in his favor. In 32 weeks–give or take a lot of weeks–he could find himself in the ATP top ten.

* * *

Surfaces in South American Davis Cup

It dawned on me about halfway through the deciding rubber of the Chile-Peru Davis Cup qualifying tie: They were playing on a hard court! In South America! Against another South American side!

It made sense for Chile, with big hitters Nicolas Jarry and Alejandro Tabilo leading the team, and they did indeed vanquish the Peruvian visitors. But South America is known as a land of clay courts, the home of the “Golden Swing.” It seemed weird that an all-South American tie would be played on anything else.

As it turns out, it isn’t that unusual. Going back to the late 1950s, I found 252 Davis Cup ties between South American sides. I don’t have surface data for 37 of them, almost all from the 1970s. Presumably most of those were on clay, but since that’s the question I’m trying to answer, I’m not going to assume either way.

That leaves us with 215 known-surface ties, from 1961 to the Chile-Peru meeting last weekend. (I’m excluding the matchup between Argentina and Chile at the 2019 Davis Cup Finals, since neither side had any say in the surface.) To my surprise, 37 of those ties–about one in six–took place on something other than clay. That’s mostly hard courts, but five of them were played on indoor carpet as well.

The country most likely to bust the stereotype has been Venezuela, which preferred hard courts as early as the 1960s. Ecuador also opted to skip clay with some frequency; it accounted for the first appearance of carpet in an all-South American tie back in 1979.

Chile has generally stuck with clay, but not always. The last time they hosted a South American side on another surface was 2000, when they faced Argentina on an indoor hard court. The surface probably wouldn’t have mattered, as Marcelo Rios and Nicolas Massu were heavy favorites against a much weaker Argentinian side. Though they won, the home crowd was so disruptive that the visitors pulled out without playing the doubles. Chile was disqualified from the next round and barred from hosting again until 2002.

The crowd last weekend was typically rowdy, but Jarry and Tabilo advanced without controversy. For some South American sides, hosting on hard courts may finally become the rule, not the exception.

* * *


What Is Going Wrong For Novak Djokovic?

Also: Arina Rodionova (probably) in the top 100

Novak Djokovic practicing at the 2023 US Open. Credit: Amaury Laporte

Fifteen break points. A week has passed, a new champion has been crowned, and I still can’t stop thinking about it. In the first two sets of his Australian Open quarter-final match against Taylor Fritz, Novak Djokovic failed to convert fifteen straight break points.

It’s so far out of character as to defy belief. Djokovic has converted more than 40% of his break chances in the past year, even counting the 4-for-21 showing in the entire Fritz match. The American, one of the better servers on tour, typically saves only two-thirds of the break points he faces. The chances that Novak would come up short 15 times in a row are about one in seven million.

Even stranger, it wasn’t because Fritz served so well. He missed his first serve on 7 of the 15 break points. He hit two aces and another four didn’t come back, but that leaves nine rallies when–under pressure, in Australia–Taylor Fritz beat Novak Djokovic. Five of those lasted at least seven strokes, including a 25-shot gutbuster at 4-3 in the second set that was followed, two points later, by yet another Fritz winner on the 17th shot. All credit to the American, who walked a tightrope of down-the-line backhands and refused to give in to an opponent who, even in the first two sets, was outplaying him. But clearly this wasn’t a matter of Fritz intimidating or otherwise imposing himself on Novak.

There’s no shortage of explanations. Djokovic is recovering from a wrist injury that hampered him in his United Cup loss to Alex de Minaur. He apparently had the flu going into the Melbourne semi against Jannik Sinner. The whole Australian adventure might be nothing more than a health-marred aberration; in this interpretation, none of Jiri Lehecka, Dino Prizmic, Alexei Popyrin, or even Fritz would otherwise have taken a set from the all-time great.

But… the man is 36 years old. If other tennis players his age are any guide, he may never be fully healthy again. He will continue to get slower, if only marginally so. He personally raised the physical demands of the sport, and finally, a younger generation has accepted the challenge. Djokovic has defied the odds to stay on top for as long as he has, but eventually he will fade, even if that means only a gentle tumble out of the top three. After a month like this, we have to ask, is it the beginning of the end?

Rally intolerance

The two marathon break points that Fritz saved were not exceptions. 64 of the 269 points in the quarter-final reached a seventh shot, and the American won more than half of them. Even among double-digit rallies, the results were roughly even.

Here’s another data point: Djokovic fought out 53 points in his first-rounder against Prizmic that reached ten shots or more. The 18-year-old Croatian won 30 of them. Yeah, Prizmic is a rising star with mountains of potential, but he’s also ranked 169th in the world. This is not the Novak we’ve learned to expect: Even after retooling his game around a bigger serve and shorter points, he remained unshakeable from the baseline, his famous flexibility keeping him in position to put one more ball back in play.

Down Under, though, those skills went missing. Based on 278 charted matches since the start of 2015, the following table shows the percentage of points each year that he takes to seven shots or more, and his success rate in those rallies:

Year  7+ Freq  7+ Win%  
2015    23.3%    54.9%  
2016    26.7%    53.1%  
2017    29.1%    53.3%  
2018    24.4%    52.6%  
2019    25.0%    55.1%  
2020    26.0%    54.3%  
2021    23.8%    53.6%  
2022    23.2%    54.7%  
2023    23.4%    54.1%  
2024    26.0%    49.8%

By the standards of tennis’s small margins, that’s what it looks like to fall off a cliff. The situation probably isn’t quite so bad: The sample from 2024 is limited to only the matches against Lehecka, de Minaur, Prizmic, Fritz, and Sinner. On the other hand, matches charted in previous years also skew in favor of novelty, so upsets, close matches, and elite opponents are overrepresented there too.
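Both columns in that table are simple to derive from point-by-point data. The sketch below assumes a hypothetical input format, a list of (rally length, whether the player won the point) pairs, which is not the Match Charting Project's actual file layout; the sample points are invented.

```python
def long_rally_stats(points, threshold=7):
    """Return (frequency, win rate) for rallies of at least `threshold` shots.

    points: iterable of (rally_length, player_won) tuples.
    The win rate is None if no rally reaches the threshold.
    """
    points = list(points)
    long_results = [won for length, won in points if length >= threshold]
    frequency = len(long_results) / len(points)
    win_rate = sum(long_results) / len(long_results) if long_results else None
    return frequency, win_rate


# Invented sample: eight points, four of which reach a seventh shot.
sample = [
    (3, True), (8, True), (12, False), (7, True),
    (2, False), (9, False), (1, True), (5, True),
]
freq, win = long_rally_stats(sample)
print(f"7+ Freq: {freq:.1%}, 7+ Win%: {win:.1%}")
```

The same frequency calculation applied to the Fritz quarter-final, where 64 of 269 points reached a seventh shot, gives 23.8%.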

It is especially unusual for Djokovic to see such a decline on hard courts. Over the last decade, he has gone through spells when he loses more long rallies than he wins. But they typically come on clay. Carlos Alcaraz shut him down in last year’s Wimbledon final as well, winning 57% of points that reached the seventh shot and 63% of those with ten or more strokes. The only period when hard-court Novak consistently failed to win this category was late 2021, when Medvedev beat him for the US Open title (and then outscored him in long rallies in Paris), and Alexander Zverev won 62% of the seven-plusses (and 70% of ten-plusses!) to knock him out of the Tour Finals.

Protracted rallies are a young man’s game, and Djokovic’s results are starting to show it. Before dissecting Alcaraz in Turin last November, Novak had never won more than half of seven-plusses against Carlitos. He has barely held on against Sinner, winning 43% of those points in their Tour Finals round-robin match and 51% at the Davis Cup Finals. In 13 meetings since 2019, Medvedev has won more of these long rallies than Djokovic has. Zverev, too, has edged him out in this category since the end of 2018.

Against the rest of the pack, Djokovic manages just fine. He dominates seven-plusses against Casper Ruud and Stefanos Tsitsipas, for instance. But it’s one of the few chinks in his armor against the best, and if January represents anything more than the temporary struggles of an ailing star, more players are figuring out how to take advantage.

Avoiding danger

For players who lose a disproportionate number of long points, the best solution is to shorten them. Djokovic may never have thought in exactly those terms, but perhaps with an eye toward energy conservation, he has done exactly that.

Especially from 2017 to 2022, Novak drastically reduced the number of points that reached the seven-shot threshold:

In 2017, 29% of his points went that long; in 2022 and 2023, barely 23% did. It remains to be seen whether January 2024 is more than a blip. In his up-and-down month, Novak remained able to control his service points, but he was less successful avoiding the grind on return. As we’ve seen, that’s dangerous territory: Djokovic won a healthy majority of the short points against Fritz but was less successful in the long ones, especially following the American’s own serve.

Much rests on the direction of these trends. If the players Djokovic has faced so far this year can prevent him from finishing points early, how will he handle Medvedev or Zverev? If Novak can’t reliably outlast the likes of Fritz and Prizmic, what are his chances against Alcaraz?

Djokovic is well-positioned to hold on to his number one ranking until the French Open, when he’ll be 37 years old. By then, presumably, he’ll be clear of the ailments that held him back in Australia. Still, holding off the combination of Sinner, Alcaraz, Medvedev, Zverev, and Father Time will be increasingly difficult. The 24-time major champion will need to redouble the tactical effort to keep points short and somehow recover the magic that once made him so implacable in the longest rallies. Age is just a number, but few metrics are so ruthless in determining an athlete’s fate.

* * *

Arina Rodionova on the cusp of the top 100

In December, Australian veteran Arina Rodionova celebrated her 34th birthday. Now she’s competing at the tour-level event in Hua Hin this week, sporting a new career-best ranking of 101. With a first-round upset win over sixth-seed Yue Yuan, she’s up to 99th in the live rankings. Her exact position next Monday is still to be determined–a few other women could spoil the party with deep runs, or she could climb higher with more victories of her own–but a top-100 debut is likely.

Rodionova, assuming she makes it, will be the oldest woman ever* to crack the top 100 for the first time. The record is held by Tzipi Obziler, who was two months short of her own 34th birthday when she reached the ranking milestone in 2007. Rodionova will be just the fifth player to join the top-100 club after turning 30.

* I say “ever” with some caution: I don’t have weekly rankings before the mid-80s, so I checked back to 1987. Before then, the tour skewed even younger, so I doubt there were 30-somethings breaking into the top 100. But it’s possible.

Here is the list of oldest top-100 debuts since 1987:

Player                    Milestone  Age at debut  
Arina Rodionova*         2024-02-05          34.1  
Tzipi Obziler            2007-02-19          33.8  
Adriana Villagran Reami  1988-08-01          32.0 
Emina Bektas             2023-11-06          30.6  
Nuria Parrizas Diaz      2021-08-16          30.1  
Mihaela Buzarnescu       2017-10-16          29.5  
Julie Ditty              2007-11-05          28.8  
Eva Bes Ostariz          2001-07-16          28.5  
Maryna Zanevska          2021-11-01          28.2  
Ysaline Bonaventure      2022-10-31          28.2  
Mashona Washington       2004-07-19          28.1  
Laura Pigossi            2022-08-29          28.1  
Maureen Drake            1999-02-01          27.9  
Hana Sromova             2005-11-07          27.6  
Laura Siegemund          2015-09-14          27.5

* pending!

I extended the list to 15 places in order to include Laura Siegemund. She and Buzarnescu are the only two women to crack the top 100 after their 27th birthdays yet still ascend to the top 30. The odds are against Rodionova doing the same–the average peak of the players on the list is 67, and the majority of them achieved the milestone a half-decade earlier–but you never know.

A triumph of scheduling

Rodionova has truly sweated her way to the top. She played 105 matches last year, winning 78 of them, assembling a haul of seven titles and another three finals. When I highlighted the exploits of Emma Navarro a couple of weeks ago, I couldn’t help but draw attention to the Australian, who is one of only two women to win more matches than Navarro since the beginning of last year. Iga Swiatek is the other.

Most of the veteran’s recent triumphs–44 match wins and five of her seven 2023 titles–have come at the ITF W25 level. She didn’t beat a single top-200 player in those events, and she faced only five of them. In her long slog through the tennis world last year, Rodionova played just one match against a top-100 opponent, and that was a loss to 91st-ranked Dalma Galfi.

The point is, the Aussie earned her ranking with quantity, not quality. No shame in that: The WTA made the rules, and the Australian not only chose a schedule to maximize her chances of climbing the ranking table, she executed. Kudos to her.

What her ranking does not mean, however, is that she is one of the 100 best players in the world. Elo is a more reliable judge of that, and going into this week, the algorithm ranks her 207th. (She peaked in the 140s, back in 2017.) You can hack the WTA rankings with a punishing slate of ITFs, but it’s much harder to cheat Elo.

Here are the players in the official top 150 who Elo considers to be most overrated:

Player             Elo Rank  WTA Rank  Ratio  
Caroline Dolehide       124        41    3.0  
Peyton Stearns          145        54    2.7  
Arantxa Rus             103        43    2.4  
Tatjana Maria            94        44    2.1  
Arina Rodionova         207       101    2.0  
Laura Pigossi           221       114    1.9  
Elina Avanesyan         120        62    1.9  
Varvara Gracheva         89        46    1.9  
Nadia Podoroska         127        67    1.9  
Lucia Bronzetti         109        58    1.9  
Dayana Yastremska        54        29    1.9
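The Ratio column is just Elo rank divided by WTA rank, so a value of 2.0 means Elo places the player twice as far down the list as the official ranking does. A short Python sketch reproduces the top of the table from the ranks shown; the variable names are illustrative.

```python
# (player, Elo rank, WTA rank), copied from the first five table rows.
ranks = [
    ("Caroline Dolehide", 124, 41),
    ("Peyton Stearns", 145, 54),
    ("Arantxa Rus", 103, 43),
    ("Tatjana Maria", 94, 44),
    ("Arina Rodionova", 207, 101),
]

# Ratio above 1 means the official ranking is more generous than Elo.
ratios = [(player, elo / wta) for player, elo, wta in ranks]
ratios.sort(key=lambda row: row[1], reverse=True)

for player, ratio in ratios:
    print(f"{player:<20} {ratio:.1f}")
```

A ratio rewards large gaps near the top of the rankings more than the same gap further down: a player ranked 41st by the WTA and 124th by Elo scores higher than one ranked 101st and 207th, even though the second gap is bigger in absolute terms.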

Once you climb into the top 100, savvy scheduling is increasingly impractical. Instead, this kind of gap comes from a deep run or two combined with many other unimpressive losses. Caroline Dolehide reached the final in Guadalajara followed by a quarter-final exit at a WTA 125, then lost three of five matches in Australia. Arantxa Rus won the title in Hamburg and reached a W100 semi-final, then lost five of six. The WTA formula lets you keep all the points from a big win for 52 weeks; Elo takes them away if you don’t keep demonstrating that you belong at the new level.

The sub-200 Elo rank suggests that Rodionova will have a hard time sustaining her place on the WTA list once the ranking points from her W25 titles start to come off the board. Until then, she can continue to pad her total and–fingers crossed–enjoy the hard-earned reward of a double-digit ranking.

* * *


The Improbable Rise of Emma Navarro

Also today: New stat leaderboards

Emma Navarro at the 2023 US Open. Credit: Hameltion

When Emma Navarro beat Elise Mertens for her first WTA title in Hobart on Saturday, it was only part of a natural progression. For more than a year now, she has shown a knack for winning, regardless of level, surface, or just about anything else. While most fans still don’t know her name, she’s up to 26th in the official rankings and 22nd on the Elo list.

The former collegiate champion–winner of the national title as a Virginia Cavalier in 2021–started her 2023 campaign just inside the top 150. She arrived at the brink of the top 100 with back-to-back ITF titles on clay in April, then cracked the top 60 with a grass-court final in Ilkley. Her first top-ten win came in September on hard courts, against Maria Sakkari in San Diego, and after a busy fall that included another two ITF titles, she broke into the top 40. She’s 8-1 so far in 2024; the only blip is a loss to Coco Gauff.

Altogether, that’s 72 victories since the beginning of last year. Not many women can boast so much success at the W25 level or higher in that span:

Player                   2023-24 Wins  
Arina Rodionova                    79  
Iga Swiatek                        73  
Emma Navarro                       72  
Oceane Dodin                       64  
Jessica Pegula                     62  
Julia Riera                        59  
Aryna Sabalenka                    59  
Martina Capurro Taborda            59  
Yafan Wang                         58  
Carlota Martinez Cirez             57

The remarkable part of Navarro’s rise is not the sheer quantity of positive results; it’s that she rose through the rankings so fast at the age she did. She first cracked the top 100 last May just before her 22nd birthday–hardly old by any rational standards, but nearly geriatric on the youth-driven WTA tour. The 25 players standing in front of Navarro in this week’s rankings broke into the top 100, on average, before their 20th birthday: The median is Aryna Sabalenka’s arrival at 19 years, 5 months. Late developers like Jessica Pegula, Barbora Krejcikova, and Navarro are exceptions to a long-standing rule.

It’s not unusual for a player to finally achieve a double-digit ranking when they are 21 or older, but it’s rare for a future star to do so–and now that Navarro is a tour-level title-holder ensconced in the top 30, she deserves that label. Since 1990, there have been 207 players who finished their age-21 season ranked between 101 and 200 without a previous appearance in the top 100. Only 25 of them ranked inside the top 100 at the end of the following year; Navarro was only the fourth to crack the top 50.

Of those 200-plus players, only 35 of them ever achieved a top-40 ranking. (A few more, including Katie Boulter and Katie Volynets, could still join the group.) On average, it took them 1437 days–just short of four years–to do so. Navarro needed only 315 days, the second-fastest in the last 30-plus years. Here are the players who made the fastest move from the end of their age-21 season to the top 40:

Player                 Age 21  top 40 debut  Days  
Elise Mertens            2016    2017-08-28   245  
Emma Navarro             2022    2023-11-06   315  
Veronika Kudermetova     2018    2019-11-11   315  
Kurumi Nara              2012    2014-06-09   525  
Jamie Hampton            2011    2013-06-24   546  
Casey Dellacqua          2006    2008-07-28   581  
Tathiana Garbin          1998    2000-09-25   637  
Liudmila Samsonova       2019    2021-11-01   672  
Bethanie Mattek Sands    2006    2008-11-03   679  
Anne Kremer              1996    1999-04-12   833  
Jil Teichmann            2018    2021-04-26   847  
Zi Yan                   2005    2008-05-05   861  
Paula Badosa             2018    2021-05-24   875  
Yone Kamio               1992    1995-06-12   896  
Alison Riske Amritraj    2011    2014-06-09   896  
Johanna Konta            2012    2016-02-01  1127

It’s possible that Navarro could have been ready for the big time earlier had she not spent two years playing college tennis. Her sub-100 ranking at the end of 2022 was partly due to a limited schedule, as she played only a handful of tournaments before leaving school after the spring semester that year. But she wasn’t playing top-100 tennis when she did step on court: Elo ratings respond much more quickly to quality results (and do not reward quantity for its own sake), and her ranking by that algorithm, 148th, was virtually identical to her place on the official list.

Whatever the benefits and (temporary) costs of her stay at the University of Virginia, Navarro seemed to learn from the step up in competition–and quickly. She lost her first 11 matches against the top 50; in the last four months, she has won 5 of 6.

What works

The most memorable victory so far was Saturday’s triumph over Mertens for a debut WTA title. It was a grind, taking two hours, 50 minutes, and spanning 14 breaks of serve en route to a 6-1, 4-6, 7-5 finish. There was little first-strike tennis on display, as the average point ran to 5.5 strokes. 69 points required seven shots or more, and 37 reached double digits.

The battle for openings worked to Navarro’s advantage. In a sample of eleven previous matches logged by the Match Charting Project, she struggled in longer rallies, winning just 46% of points that reached a seventh shot compared to 49% overall. On Saturday, she reversed that trend in a big way, out-point-constructing her veteran opponent and winning a whopping 59% of the longer points. Of 84 charted Mertens matches, it was only the eighth time that she played at least 20 long points and won so few of them. Among the few players to beat her so soundly on rally tactics: Pegula and Simona Halep.

While Navarro’s results have steadily improved, her game plan is still recognizable from her days as a college champion. After defeating Miami’s Estrela Perez-Somarriba for the 2021 NCAA title, she described her approach: “I was able to dictate with my forehand and finish a lot of points with my backhand.” In Hobart, her backhand continued to populate the highlight reel, with seven clean down-the-line winners. But it was the forehand that opened the court in the first place.

She played, essentially, a clay-court match, using the forehand to create opportunities for the next ball. She hit winners with 7% of her forehand groundstrokes, slightly below tour average. But when she was able to hit a forehand, she won the point 62% of the time, an outstanding figure for a close match. One point serves as an illustration of the rest: At 2-all, 15-all in the third set, Navarro converted a return point with a down-the-line backhand winner on the 14th shot of the rally. After a deep forehand return, Navarro was forced to hit two backhands. When she was finally able to deploy the forehand on the 8th shot, she stabilized the point by going down the middle. The 10th shot took advantage of a let cord with a heavy crosscourt forehand, a weapon that worked in her favor on Saturday more than two-thirds of the time. Her next forehand went the other direction, creating the space for–finally–a backhand out of the Belgian’s reach.

While not every point was quite so tactical, point construction always lurked. Mertens frequently attempted a pattern where she would go the same direction with two consecutive groundstrokes then, having wrong-footed Navarro with the second of them, go for a winner. The sequence doesn’t work against a big swinger because the points don’t last long enough. That wasn’t a problem against the American, but Navarro’s resourcefulness nullified the tactic nonetheless. Unlike many players her age, Navarro is able to use slices off both wings to neutralize points, and she often did so on the second shot of Mertens’s would-be pattern. The Hobart champion hit 40 slices over the course of the match, ultimately winning the point on 20 of them. For a defensive shot, rescuing 50% of those situations counts as a victory.

There is little in Navarro’s game that advertises her as a world-beater: The weapons I’ve described work best as part of a carefully-managed package. She may prove to be most dangerous on clay, where aggressive opponents will have a harder time keeping points short. She might also develop yet another level. Twelve months ago, only a reckless forecaster would have predicted she could rise so high, so quickly. We still haven’t seen her peak.

* * *

Deep leaderboards

Among the cult favorites on the Tennis Abstract site are the tour leaderboard pages, which contain nearly 60 sortable stats for the top 50 players on each circuit. Many of those stats aren’t available anywhere else, including things like average opponent ranking and time per match. It’s also possible to filter the matches for each calculation to determine things like the best hold percentages on clay.

Last week I introduced three new pages that extend the same concept:

Here’s just one example of what’s possible, the best WTA players outside the top 50 by ace percentage:

These are a great way to identify standout skills of lesser-known players. All of the leaderboards update every Monday.

* * *


August 23, 1973: One Perfect Truth

Stan Smith, Ilie Năstase, and Tom Okker

Major tournament committees never had an easy job. Given a pile of national and regional rankings–sometimes many months out of date–and another pile of entry forms, they had to decide who could play their event. Then, with the field in place, they had to decide on the seedings.

It was an art, not a science. Rankings were published just once a year. Beyond the first ten, few lists compared players across national borders. In both ranking lists and entry decisions, there were biases, both acknowledged and obscured. Players complained of a “star system,” in which famous names were given priority over superior players. Insiders, especially members at clubs where tournaments were held, had an edge. Young players benefited from well-connected coaches.

So it had been for half a century. Tournament entries hadn’t always been an issue: There was usually enough room in the bracket for everyone. In the early days, draws were arranged at random. It took a run of disastrous bad luck for officials to decide to keep top players away from each other. At the US National Championships in 1921, the paths of the two best men players–Big Bill Tilden and Little Bill Johnston–intersected in the fourth round. The women’s draw was even worse: Visiting sensation Suzanne Lenglen drew home favorite Molla Mallory in the second round. It is no exaggeration to say that the latter quirk of fate–and Suzanne’s loss by retirement–altered the course of tennis history.

Within six months, USLTA tournament draws were seeded.

In 1973, the system underwent a change almost as significant as the adoption of seeding. On August 23rd, the new men’s players’ union, the ATP, released its first set of rankings.

There was no bias in the ATP’s calculation, aside from the tendencies of an imperfect algorithm. Players were given points for their performance at each tournament, then assigned an overall total based on their average over the past year.

The ATP’s list didn’t immediately rise to the top of the heap. The same week, the US Open announced its seeding lists, based on

the U.S. Lawn Tennis Association rankings, Commercial Union Grand Prix points, World Championship Tennis records, and–for the first time–a statistical approach consisting of a new computerized ranking system developed by the Association of Tennis Professionals.

Information overload, perhaps. Committee members couldn’t decide between Ilie Năstase and Stan Smith, so they awarded the two men co-No. 1 seeds. (The ATP ranked them first and third, respectively.) The committee also acknowledged surface preferences, something that the single-number ATP formula ignored. Dirtballer Manuel Orantes ranked second on the new computer, but he was seeded eighth on the grass at Forest Hills.

Quibbles about the ranking formula are as old as the system itself. The approach of averaging tournament results, in particular, incentivized players to stick to their best surface and skip smaller events; it was possible for someone to sit out a week and see his ranking go up!

The important thing, though, was that the imperfections were the same for everyone. An algorithm could be tweaked; a small group of entrenched bureaucrats could not. Bill Scanlon, then a 16-year-old beginning to gain attention as a promising junior in Texas, later called the ATP rankings “the one perfect truth.” They weren’t perfect, but that wasn’t the point. The formula provided objective targets free of favoritism.

The biggest winners were the deserving players on the fringes. Năstase and Smith would’ve been seeded anywhere regardless of the system. Most people could agree on the top ten, give or take a name or two. But what about an American teen who grew up playing in public parks, as Bobby Riggs had done in the 1930s? Or the rising number of challengers from Eastern Bloc nations without a long history on the international scene? Outsiders could now be judged more on their performance, less on their reputation and connections.

The players, in short, had gained even more control over the game. Within a few years, most tournament committees had given up on the job of determining entries and seeds themselves. Most fans probably didn’t notice the difference. But the rise of computer rankings set the stage for a more meritocratic, more inclusive sport.

* * *

This post is part of my series about the 1973 season, Battles, Boycotts, and Breakouts. Keep up with the project by checking the TennisAbstract.com front page, which shows an up-to-date Table of Contents after I post each installment.


Aslan Karatsev Isn’t Better Than Novak Djokovic, But…

What’s better, winning 15 of 17 matches, or going undefeated for 9?

Even if you know that the 15-2 guy is Aslan Karatsev in 2021, and the 9-0 guy is Novak Djokovic this year, there’s no obvious answer. Sure, Djokovic beat Karatsev easily, and Novak’s nine wins included a grand slam title. We know Djokovic is the better player–he’s got more than a decade of proof to support that claim–and no one in their right mind would take Karatsev’s last three months over Novak’s.

True as all of that is, it’s not the question I’m asking.

The player with the 15-2 record has two advantages over his 9-0 peer. First, he has more wins. (Mind-blowing stuff, I know.) Second and more importantly, he has more evidence of his current level, even if it includes two losses. The 9-0 guy could go undefeated for 17 matches… but he could also end up 11-6. His nine-match record simply doesn’t give us as much information.

Again, if you know which players I’m talking about, that doesn’t matter–we have 1,100 matches worth of information about Djokovic, most of which say that his 9-0 is business as usual. He might not win his next eight matches, but he’s certainly not going to lose more than a few of them.

The yElo light at the end of the tunnel

If you’ve been reading my last couple of posts, you know where I’m going with this.

Last week, I introduced the concept of yElo. The “y” stands for year, but it can be used for any unit of time shorter than an entire career. Instead of using every bit of available information, we look only at a designated time frame, such as the 2021 season. While maintaining our knowledge of other players (e.g. Andrey Rublev is a really tough opponent; Egor Gerasimov not so much), we treat each player as if we know nothing else about him.

So truly, we’re comparing Karatsev’s 15-2 with Djokovic’s 9-0, taking into account the quality of their competition.

Plug every ATPer’s 2021 season into the formula, and here are the yElo leaders, through last weekend’s finals in Dubai and Acapulco:

Rank  Player                  W-L  yElo  
1     Aslan Karatsev         15-2  2082  
2     Novak Djokovic          9-0  2081  
3     Daniil Medvedev        13-2  2061  
4     Andrey Rublev          15-3  2006  
5     Marton Fucsovics       14-4  2000  
6     Stefanos Tsitsipas     14-4  1983  
7     Alexander Zverev        9-4  1922  
8     Matteo Berrettini       8-2  1918  
9     Jeremy Chardy          13-6  1915  
10    Lloyd Harris           11-5  1878  
11    Jannik Sinner           9-4  1848  
12    Alexei Popyrin          9-3  1836  
13    Roberto Bautista Agut   8-7  1831  
14    Taylor Fritz            7-4  1830  
15    Sebastian Baez         14-1  1820  
16    Felix Auger Aliassime   8-4  1818  
17    Karen Khachanov         9-5  1810  
18    Mackenzie McDonald     11-5  1809  
19    Tomas Machac           10-3  1806  
20    Daniel Evans            6-3  1800

Yes, Karatsev really does outscore Djokovic. Barely.

We are accustomed to 52-week rankings and Elo ratings that carefully weigh an entire career’s worth of work. So this is a deeply weird list, with only a handful of players anywhere near where we’d expect. #15 and #19 are Challenger-level guys, for crying out loud!

Embrace the race

The official Race to Turin doesn’t look as bizarre as the yElo list, but imagine showing it to someone in December, with Karatsev 5th, Marton Fucsovics 7th, and Rafael Nadal outside the top 20. Both the Race and the yElo list are “wrong” in the traditional sense, but they tell us much more about the 2021 season than the old-fashioned rankings do.

Tennis’s relentless focus on the long view sucks some excitement out of the season. Think of virtually any team sport. A month into the season, some unheralded club has gotten off to a hot start, and at least in some quarters, that’s the story–can they keep it up? should we have seen this coming all along? Nobodies are cast in the role of front-runners, and established stars play the part of underdogs.

In tennis, nobodies are… well, nobodies who won a few matches lately. Superstars play the part of superstars who’ve been taking some time off. Sure, we know that Djokovic and Nadal are going to end up near the top of the rankings list in November, just like we know the Dodgers and Yankees will be in the playoffs. But that doesn’t mean we ought to take it as a foregone conclusion from day one. In baseball, as the saying goes, everybody’s in first place on Opening Day.

Embracing the race–focusing on which players are leading the pack at each point throughout the season–doesn’t have to mean throwing away longer-term rankings. The traditional calculations should still be used for tournament entries and (maybe) for seedings. Top players have earned as much, and tournament entry is a factor that isn’t present in the major team sports.

Everybody wants to know how the ATP will survive when the Big Three are out of the picture. Well, this is a start–pay attention to who’s winning in 2021. If we take yElo’s word for it, a virtual nobody emerged to overtake Djokovic for the #1 spot going into Miami! An Argentinian prospect is playing like a top-15 guy just by winning a bunch of Challengers! Jeremy Chardy is more than just a hitting partner for the other Frenchmen!

The stories are out there, just like they are every year. It’s a shame that they get buried by all the talk about players who won last year.

I’ve added men’s and women’s yElo ratings to the Tennis Abstract website, and they’ll be updated weekly.

The Best 22-Match yElo Streaks

Earlier this week I wrote about Garbine Muguruza’s outstanding start to the season, and I introduced a new method to quantify a player’s level in a relatively short time span. Instead of using traditional Elo, which takes into account everything we know about a player, my new metric, yElo, uses what we know about everyone else, but treats a player’s short-term performance as if it is all we know about her. The parameters for yElo, such as k-value, are the same as the ones I’ve arrived at to make “regular Elo” as predictive as possible.

In other words, we measure Muguruza’s 22 matches in 2021 as if she had never played a WTA event before. As we saw in my earlier post, this approach considers the strength of opponents each player faced, and it rates her 18-4 record as better than anyone else in 2021, including Naomi Osaka’s 10-0 start.*

* excluding walkovers, which I ignore for all versions of Elo and yElo.

Muguruza’s season start has been outstanding and it is definitely underrated by the official WTA rankings and maybe even by the race, but I don’t want to make too much of it–one title in five tournaments is hardly world-historical stuff. On the other hand, it’s a good way to get our feet wet with a new metric that I think will prove useful for a wide range of tennis comparisons.

Garbine vs Garbine

The Spaniard won majors in 2016 and 2017, and she briefly reached number one in the rankings in September of 2017. Those achievements belong on a Hall of Fame plaque over her recent Dubai title and Yarra River Classic final. But was she really playing better back then?

She was not! I ran the yElo formula for every 22-match sequence in Muguruza’s career. The best of the bunch–again, taken entirely out of context, as if we know nothing beyond those 22 matches–was a run late in 2015 when she reached the Wuhan final, won Beijing, then went undefeated in the WTA Finals round robin stage. Her yElo based on those 22 matches was 2172, narrowly better than her 2021 yElo of 2160.
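For anyone who wants to try this at home, scanning a career for its best stretch amounts to sliding a 22-match window across the results and scoring each window from scratch. The sketch below assumes a chronological list of (opponent rating, won) pairs; the function names and the fixed K value are illustrative assumptions, not the parameters behind the numbers in this post.

```python
def expected_score(rating, opp_rating):
    """Standard Elo win expectancy for `rating` against `opp_rating`."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def window_yelo(window, k=100.0, start=1500.0):
    """Score one window as if it were the player's entire career.

    `k` is a fixed, assumed update size chosen to keep the sketch simple.
    """
    rating = start
    for opp_rating, won in window:
        actual = 1.0 if won else 0.0
        rating += k * (actual - expected_score(rating, opp_rating))
    return rating


def best_window(results, size=22):
    """Return (yElo, start_index) for the highest-rated run of `size` matches."""
    scored = [
        (window_yelo(results[i:i + size]), i)
        for i in range(len(results) - size + 1)
    ]
    return max(scored) if scored else None
```

Because consecutive windows share all but one match, neighboring scores tend to be close together, which is why a full list of them makes for dull reading.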

The more memorable moments of her career don’t quite stack up:

yElo  W-L   Span                            
2172  17-5  2015 Wimb R16 - WTA Finals RR   
2160  18-4  2021 Abu Dhabi R64 - Dubai F    
2148  18-4  2017 Birmingham R32 - Cinci F   
2122  19-3  2017 Wimb R128 - USO R16 (#1)   
2084  17-5  2017 Miami R64 - Wimb F         
2076  16-6  2016 Doha QF - Roland Garros F 

I haven’t shown every 22-match sequence of her career, because that list is long and boring–the streaks heavily overlap with each other, and thus there are often tiny differences between them. But it is instructive to look at the time periods that ended at key moments.

The best of that bunch was the 22-match run ending with Muguruza’s 6-1 6-0 beatdown of Simona Halep at the 2017 Cincinnati final. That set the stage for her ascent to #1, though the ranking move didn’t happen until after the US Open. That streak is close to her current level. The 22 matches leading up to the official #1 takeover are a bit lower (she lost to Petra Kvitova at the US Open, which was less forgivable then than now), and the timespans ending with her two slam finals are still further down the list.

Don’t misunderstand–Muguruza was playing very well throughout all of these time periods. But when we crunch the numbers, we find that her current level is roughly on par with the best she’s ever played.

Garbine vs the world

Metrics are a lot more informative once we gain some context. Many of you probably have a good sense of what regular Elo ratings mean–2100+ is outstanding, 2000+ is top ten-ish, 1900+ is approximately the top 20, and so on. We can piggyback on that for yElo. Muguruza’s 22-match yElo of 2160 this season really does mean that the Elo formula, fed only that very limited set of results, puts her level close to that of the best player in the world.

Well… the best player in the world right now. There’s no truly dominant force in women’s tennis at the moment, so we’re not seeing players at the top end of the all-time Elo scale. In regular Elo, peak Martina Navratilova and peak Steffi Graf topped 2600, more than 400 points above Osaka’s current rating of 2189. It will not surprise you, then, to learn that Navratilova, Graf, Serena Williams, Chris Evert, and many others put together 22-match runs* that make Muguruza’s 2021 season look positively pedestrian.

* yes, I know how ridiculous it is that this whole article is based on the arbitrary 22-match time span. We could do the same stuff with the more natural-sounding 20-match span, but there wouldn’t be an intuitive way to fit Muguruza’s current run into the discussion. And let’s face it, 20 is just as arbitrary as 22.

Out of my entire database on women’s tennis results going back to 1950 or so, about 100 women have enjoyed a 22-match run that outscores Muguruza’s best. The top of the list is the end of Navratilova’s 1983 season, which is worth a yElo of 2445. Close behind is Monica Seles, who reached 2438 with a streak starting at the end of 1992 and extending into the 1993 season. Three more women topped 2400, another 27 exceeded 2300, and 46 more put together 22 consecutive matches worth at least 2200.

Here are the 15 active women who’ve played at least as well as Muguruza for their best 22-match spans:

yElo  Player                W-L   Year(s)  
2389  Serena Williams       21-1  2001-02  
2386  Venus Williams        22-0  2000     
2335  Kim Clijsters         20-2  2002-03  
2332  Victoria Azarenka     22-0  2012     
2234  Vera Zvonareva        18-4  2008     
2217  Svetlana Kuznetsova   19-3  2004     
2217  Naomi Osaka           20-2  2019-20  
2209  Samantha Stosur       20-2  2010     
2205  Petra Kvitova         19-3  2011-12  
2205  Simona Halep          20-2  2018     
2196  Caroline Garcia       18-4  2017     
2186  Ashleigh Barty        19-3  2019     
2180  Angelique Kerber      18-4  2015-16  
2174  Carla Suarez Navarro  18-4  2015     
2172  Garbine Muguruza      17-5  2015

With the caveat that I haven’t spent much of my life thinking about the best 22-match runs in women’s tennis history, this seems like a credible list. I particularly like how yElo manages to consider strength of opponent to the point that an 18-4 run*, like Zvonareva’s in 2008, can outrank so many 20-2s. (Vera even beats a few 22-0s from the amateur era.)

* the link shows a few extra matches–the 18-4 run starts in the QFs of Guangzhou and ends in the Tour Finals semi-final. Note again that yElo skips retirements.

I hope you find the new yElo metric as interesting as I do. I’ll definitely be doing more with it, since I suspect it has value even outside the narrow context of one player and a single timespan of arbitrary length.

Repurposing Elo for Streaks, Seasons, and Garbine Muguruza

Elo is a fantastic tool for its explicit purpose: estimating the skill level of players based on available information. For instance, my WTA ratings currently rank Ashleigh Barty second. That seems plausible enough–it may be correct to give her the edge in a head-to-head matchup with everyone on tour except for Naomi Osaka. But with women pursuing such different schedules this season, a rating is only so useful.

For all of Barty’s or Osaka’s skill, is it right to say either one of them has had a better 2021 season than Garbine Muguruza? Osaka won the Australian Open, so she has a valid claim. Barty’s argument is a lot more tenuous, based on only eight victories. The Spaniard’s case writes itself–only a handful of players are up to double digits in wins this year, and Muguruza already has 18. How could we decide? If Elo is the smart version of the official rankings, what’s the smart version of the official race?

Starting fresh

The Elo algorithm itself offers a solution. A big part of the reason Muguruza is rated 4th on my current Elo list–and not higher–is her career before 2021. We had hundreds of matches worth of data on Garbine before January 1st, and it would be silly to throw all that away. Her 18-4 start is fantastic, but it doesn’t supersede everything that came before. It just gives us reason to update our rating.

Here’s where the ranking/race analogy is useful. The official rankings use a time span of 52 weeks (or more). The race restarts on January 1st. We could do the exact same thing with Elo, throwing away all results from the previous year and starting over, but that would be wasteful–it wouldn’t allow us to take into account whether players had faced particularly easy or tough draws, for instance.

The solution is to set Elo ratings back to zero (or 1500, in Elo parlance) one player at a time.

Take Muguruza. Instead of starting the year with a rating of 1981 and a history of several hundred matches, we pretend to know nothing about her. We give her a newbie’s rating of 1500 and a history of zero matches. Then we run the Elo algorithm to update her rating over the course of her 22 matches. First she faces Kristina Mladenovic (with her actual rating at the time of 1817), and improves to 1605. Then she beats Aliaksandra Sasnovich (and her rating of 1805), and improves to 1692. Repeat for each of her 2021 results, and the end result is a rating of 2160–almost 100 points higher than her current “real Elo” rating and within shouting distance of Osaka’s 2189.
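As a rough sketch, the reset-and-replay procedure looks something like the Python below. The function names are hypothetical, and the K-factor schedule is an assumed placeholder rather than the tuned parameters behind the ratings in this post, so the intermediate numbers it prints will not exactly match the 1605 and 1692 quoted above.

```python
def expected_score(rating, opp_rating):
    """Standard Elo win expectancy for `rating` against `opp_rating`."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def k_factor(matches_played):
    # Assumed placeholder: K shrinks as more matches are logged.
    # The actual yElo parameters are not given in this post.
    return 250.0 / (matches_played + 5) ** 0.4


def yelo(results, start=1500.0):
    """Replay (opponent_rating, won) pairs for one player from a blank slate.

    Opponents keep their real, full-history ratings; only the player being
    measured is reset to `start` with zero matches played.
    """
    rating = start
    for n, (opp_rating, won) in enumerate(results):
        actual = 1.0 if won else 0.0
        rating += k_factor(n) * (actual - expected_score(rating, opp_rating))
    return rating


# Muguruza's first two 2021 matches, using the opponent ratings quoted above.
first_two = [(1817, True), (1805, True)]
print(round(yelo(first_two[:1])), round(yelo(first_two)))
```

The key design choice is that only one player is reset at a time: every opponent's rating is taken as given, so a win over an 1817-rated player still counts for more than a win over a 1500-rated one.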

To compare players, work through the same steps for everybody else, calculating their current-season rating as if they played their first career match in January.

It’s worth taking a moment to think about exactly what we’re measuring. That outstanding 2160 rating is what you get if a complete unknown shows up with zero match experience, then goes on the 22-match run that has been Muguruza’s season so far. The difference between real-Garbine and fake-newbie-Garbine is that the real one has an extensive track record that tells us she’s always been good–but that she probably isn’t quite this good.

I call it … yElo

This approach is “Elo for seasons” or “year Elo”–yElo*. It doesn’t have to be limited to calendar years, as the same approach would be useful for comparing, say, 20-match segments. It allows us to take advantage of the Elo algorithm–and the well-informed ratings of other players–to measure partial careers.

* you can pronounce it like the color “yellow,” but I prefer to say it like Phil Dunphy from Modern Family answering the phone.

Muguruza’s 2160 rating sure looks good, so how does it stack up against the rest of the tour? Here’s the 2021 top 20, considering players with at least five match wins through the Dubai and Guadalajara finals last weekend:

Rank  Player                W-L  yElo  
1     Garbine Muguruza     18-4  2160  
2     Naomi Osaka          10-0  2094  
3     Jessica Pegula       15-5  2002  
4     Serena Williams       8-1  1997  
5     Elise Mertens        11-2  1971  
6     Karolina Muchova      7-1  1953  
7     Aryna Sabalenka      11-4  1943  
8     Iga Swiatek          10-3  1941  
9     Daria Kasatkina      10-4  1910  
10    Barbora Krejcikova   10-5  1905  
11    Shelby Rogers         9-4  1902  
12    Jil Teichmann         9-5  1899  
13    Anett Kontaveit       9-4  1897  
14    Jennifer Brady        9-4  1892  
15    Cori Gauff           11-5  1885  
16    Danielle Collins      9-4  1883  
17    Ashleigh Barty        8-2  1878  
18    Sara Sorribes Tormo   9-2  1867  
19    Ann Li                5-1  1864  
20    Simona Halep          6-2  1854 

Like any Race list in March, this isn’t really reflective of skill. But when we consider the small amount of data it has to work with for each player, it’s … pretty good?

Again, you can quibble over whether Osaka or Muguruza has had the better season, but this approach weighs the better winning percentage and stronger average opponent against the much higher absolute win count and gives us a credible answer. Muguruza’s additional evidence of good tennis playing puts her ahead of Osaka’s evidence of short-term unbeatability.

While yElo is basically just a toy–it certainly doesn’t have the same predictive value as regular Elo–this initial look makes me like it. The possibilities are endless, from more sophisticated race tracking, to ranking the greatest seasons of all time, to comparing a player’s current hot streak to what she’s done in the past. Stay tuned, as I’m sure I’ll have more yElo results to report in the future.

So, About Those Stale Rankings

Both the ATP and WTA have adjusted their official rankings algorithms because of the pandemic. Because many events were cancelled last year (and at least a few more are getting canned this year), and because the tours don’t want to overly penalize players for limiting their travel, they have adopted what is essentially a two-year ranking system. For today’s purposes, the details don’t really matter–the point is that the rankings are based on a longer time frame than usual.

The adjustment is good for people like Roger Federer, who missed 14 months and is still ranked #6. Same for Ashleigh Barty, who didn’t play for 11 months yet returned to action in Australia as the top seed at a major. It’s bad for young players and others who have won a lot of matches lately. Their victories still result in rankings improvements, but they’re stuck behind a lot of players who haven’t done much lately.

The tweaked algorithms reflect the dual purposes of the ranking system. On the one hand, they aim to list the best players, in order. On the other hand, they try to maintain other kinds of “fairness” and serve the purposes of the tours and certain events. The ATP and WTA computers are pretty good at properly ranking players, even if other algorithms are better. Because the pandemic has forced a bunch of adjustments, it stands to reason that the formulas aren’t as good as they usually are at that fundamental task.

Hypothesis

We can test this!

Imagine that we have a definitive list, handed down from God (or Martina Navratilova), that ranks the top 100 players according to their ability right now. No “fairness,” no catering to what tournament owners want, and no debates–this list is the final word.

The more closely a ranking table matches this definitive list, the better, right? There are statistics for this kind of thing, and I’ll be using one called the Kendall rank correlation coefficient, or Kendall’s tau. (That’s the Greek letter τ, as in Τσιτσιπάς.) It compares lists of rankings, and if two lists are identical, tau = 1. If there is no correlation whatsoever, tau = 0. Higher tau, stronger relationship between the lists.

My hypothesis is that the official rankings have gotten worse, in the sense that the pandemic-related algorithm adjustments result in a list that is less closely related to that authoritative, handed-down-from-Martina list. In other words, tau has decreased.

We don’t have a definitive list, but we do have Elo. Elo ratings are designed for only one purpose, and my version of the algorithm does that job pretty well. For the most part, my Elo formula has not changed due to the pandemic*, so it serves as a constant reference point against which we can compare the official rankings.

* This isn’t quite true, because my algorithm usually has an injury/absence penalty that kicks in after a player is out of action for about two months. Because the pandemic caused all sorts of absences for all sorts of reasons, I’ve suspended that penalty until things are a bit more normal.

Tau meets the rankings

Here is the current ATP top ten, including Elo rankings:

Player       ATP  Elo  
Djokovic       1    1  
Nadal          2    2  
Medvedev       3    3  
Thiem          4    5  
Tsitsipas      5    6  
Federer        6    -  
Zverev         7    7  
Rublev         8    4  
Schwartzman    9   10  
Berrettini    10    8

I’m treating Federer as if he doesn’t have an Elo rating right now, because he hasn’t played for more than a year. If we take the ordering of the other nine players and plug them into the formula for Kendall’s tau, we get 0.778. The exact value doesn’t really tell you anything without context, but it gives you an idea of where we’re starting. While the two lists are fairly similar, with many players ranked identically, there are a couple of differences, like Elo’s higher estimate of Andrey Rublev and its swapping of Diego Schwartzman and Matteo Berrettini.
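For readers who want to check the arithmetic, here is a minimal sketch of the calculation on the nine players above. It assumes the simplest form of Kendall's tau (concordant pairs minus discordant pairs, divided by the total number of pairs), which is all that is needed when neither list has ties. The function name `kendall_tau` and the two rank lists are written out for illustration and are not taken from the code behind this post.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau for two equal-length rank lists, assuming no ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # A pair is concordant when both lists order the two players the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Ranks copied from the table above, with Federer dropped (no current Elo rating).
# Order: Djokovic, Nadal, Medvedev, Thiem, Tsitsipas, Zverev, Rublev,
# Schwartzman, Berrettini.
atp_rank = [1, 2, 3, 4, 5, 7, 8, 9, 10]
elo_rank = [1, 2, 3, 5, 6, 7, 4, 10, 8]

tau = kendall_tau(atp_rank, elo_rank)
print(round(tau, 3))  # 0.778
```

Of the 36 possible pairs, only four are ordered differently by the two lists: Rublev against each of Thiem, Tsitsipas, and Zverev, plus the Schwartzman/Berrettini swap. That gives (32 - 4) / 36.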

Let’s do the same exercise with a bigger group of players. I’ll take the top 100 players in the ATP rankings who met the modest playing time minimum to also have a current Elo rating. Plug those lists into the formula, and we get 0.705.

This is where my hypothesis falls apart. I ran the same numbers on year-end ATP rankings and year-end Elo ratings all the way back to 1990. The average tau over those 30-plus years is about 0.68. In other words, if we accept that Elo ratings are doing their job (and they are indeed about as predictive as usual), it looks like the pandemic-adjusted official rankings are better than usual, not worse.

Here are the year-by-year tau values, with a tau value based on current rankings as the right-most data point:

And the same for the WTA, to confirm that the result isn’t just a quirk of the makeup of the men’s tour:

The 30-year average for women’s rankings is 0.723, and the current tau value is 0.764.

What about…

You might wonder if the pandemic is wreaking some hidden havoc with the data set. Remember, I said that I’m only considering players who meet the playing time minimum to have an Elo rating. For this purpose, that’s 20 matches over 52 weeks, which excludes about one-third of top-100 ranked men and closer to half of top-100 women. The above calculations still consider 100 players for year-end 2020 and today, but I had to go deeper in the rankings to find them. Thus, the definition of “top 100” shifts a bit from year-end 2019 to year-end 2020 to the present.

We can’t entirely address this problem, because the pandemic has messed with things in many dimensions. It isn’t anything close to a true natural experiment. But we can look only at “true” top-100 players, even if the length of the list is smaller than usual for current rankings. So instead of taking the top 100 qualifying players (those who meet a playing time minimum and thus have an Elo ranking), we take a smaller number of players, all of whom have top-100 rankings on the official list.
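The difference between the two ways of picking players can be sketched in a few lines. This is an illustration only: the field names (`official_rank`, `matches_52wk`), the function names, and the toy data are assumptions, not the structure of the actual database.

```python
def qualifying_top_n(players, n=100, min_matches=20):
    """First approach: walk down the official rankings and keep the first n
    players who meet the playing time minimum, however deep that goes."""
    ranked = sorted(players, key=lambda p: p["official_rank"])
    return [p for p in ranked if p["matches_52wk"] >= min_matches][:n]

def true_top_n(players, n=100, min_matches=20):
    """Second approach: keep only players officially ranked inside the top n
    who also meet the minimum, so the list may be shorter than n."""
    ranked = sorted(players, key=lambda p: p["official_rank"])
    return [p for p in ranked
            if p["official_rank"] <= n and p["matches_52wk"] >= min_matches]

# Toy data (hypothetical): five players, with a "top 3" standing in for the top 100.
players = [
    {"name": "A", "official_rank": 1, "matches_52wk": 40},
    {"name": "B", "official_rank": 2, "matches_52wk": 5},   # mostly inactive
    {"name": "C", "official_rank": 3, "matches_52wk": 25},
    {"name": "D", "official_rank": 4, "matches_52wk": 30},
    {"name": "E", "official_rank": 5, "matches_52wk": 22},
]

deep = [p["name"] for p in qualifying_top_n(players, n=3)]
strict = [p["name"] for p in true_top_n(players, n=3)]
print(deep)    # ['A', 'C', 'D']
print(strict)  # ['A', 'C']
```

The first list reaches down to the fourth-ranked player to fill its quota, while the second stays inside the official top three and simply comes up one name short.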

The conclusion is the same. For men, the tau based on today’s rankings and today’s Elo ratings is 0.694 versus the historical average of 0.678. For women, it’s 0.721 versus 0.719.

Still, the rankings feel awfully stale. The key issue is one that Elo can’t help us solve. So far, we’ve been looking at players who are keeping active. But the really out-of-date names on the official lists are the ones who have stayed home. Should Federer still be #6? Heck if I know! In the past, if an elite player missed 14 months, Elo would knock him down a couple hundred points, and if that adjustment were applied to Fed now, it would push down tau. But there’s no straightforward answer for how the inactive (or mostly inactive) players should be rated.

What we’ve learned today

This is the part of the post where I’m supposed to explain why this finding makes sense and why we should have suspected it all along. I don’t think I can manage that.

A good way to think about this might be that there is a sort of tour-within-a-tour that is continuing to play regularly. Federer, Barty, and many others haven’t usually been part of it, while several dozen players are competing as often as they can. The relative rankings of that second group are pretty good.

It doesn’t seem quite fair that Clara Tauson is stuck just inside the top 100 while her Elo is already top-50, or that Rublev remains behind Federer despite an eye-popping six months of results while Roger sat at home. And for some historical considerations–say, weeks inside the top 50 for Tauson or the top 5 for Rublev–maybe it isn’t fair that they’re stuck behind peers who are choosing not to play, or who are resting on the laurels of 18-month-old wins.

But in other important ways, the absolute rankings often don’t matter. Rublev has been a top-five seed at every event he’s played since late September except for Roland Garros, the Tour Finals, and the Australian Open, despite never being ranked above #8. When the tour-within-a-tour plays, he is a top-five guy. The likes of Rublev and Tauson will continue to have the deck slightly stacked against them at the majors, but even that disadvantage will steadily erode if they continue to play at their current levels.

Believing in science as I do, I will take these findings to heart. That means I’ll continue to complain about the problems with the official rankings–but no more than I did before the pandemic.

Podcast Episode 86: A New Documentary on Guillermo Vilas and the No. 1 Ranking

Episode 86 of the Tennis Abstract Podcast features Jeff and co-host Carl Bialik, of the Thirty Love podcast, discussing the new Netflix doc Guillermo Vilas: Settling the Score.

The Argentine star was a multi-slam winner in the 1970s, yet he never reached the top of the official ATP ranking list. The film covers journalist Eduardo Puppo’s quest to prove that Vilas deserved to be #1. Over the course of the episode, we ponder the importance of the top ranking, the vagaries of the ATP ranking algorithm, how Elo rates Vilas’s peak years, and the ATP’s response to Vilas’s case for the top spot. We didn’t love the documentary, but the issues it raises are fun to debate.

Fans of the TA podcast will also want to check out Dangerous Exponents, the new Covid-19 podcast that Carl Bialik and I are doing. Episode 3 will be available later today.

Thanks for listening!

(Note: this week’s episode is about 48 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

There’s Always a Chance: Marie Bouzkova Edition

Last night in Toronto, 91st-ranked qualifier Marie Bouzkova won her quarter-final match against 4th-ranked Simona Halep. Halep retired with a leg injury after losing the first set, so there’s a caveat–even if we were prepared to read too much into a single match, we wouldn’t attribute a lot of meaning to this one. But it’s a big accomplishment for the 21-year-old Czech, who earned her second top-ten scalp of the week and will advance to her first Premier-level semi-final, against no less of an obstacle than Serena Williams.

Here’s the nutty thing: It was Bouzkova’s 62nd match of the 2019 season, her 61st against someone with a WTA ranking. She got the win against the highest-ranked foe–Halep–but just last week, she lost to 636th-ranked CoCo Vandeweghe, her lowest-ranked opponent of the year. Yeah, the caveats keep coming: Vandeweghe is coming back from injury and is surely better than a ranking outside the top 600, and the ITF Transition Tour hijinks mean that the ranking system didn’t work as usual in 2019. Some players who would normally have a very low ranking, like the Kazakh wild card whom Bouzkova crushed a couple of weeks ago, don’t count.

Still. 61 matches, with a win against the highest-ranked player and a loss against the lowest.

That sent me to my database, which had plenty more surprises in store. Going back less than a decade, to 2010, I found 127 players who recorded the same oddball combination of feats in a single season, minimum 30 matches. (To be consistent with the Halep result, I included retirements if at least one set was completed.) While many of the players won’t be of wide interest–last year, one of the exemplars was Mira Antonitsch, who didn’t play anyone ranked in the top 400–63 of the 127 player-seasons involved beating a top-100 opponent, 44 included the defeat of someone in the top 50, and 25 were highlighted by a top-ten upset.
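The filter itself is simple to express. What follows is a rough sketch rather than the actual query: the function name, the match fields (`opp_rank`, `won`), and the toy season are hypothetical, and it assumes that the 30-match minimum counts only matches against ranked opponents and that retirements have already been included or excluded upstream.

```python
def beat_best_lost_to_worst(matches, min_matches=30):
    """True if a player-season includes a win over the highest-ranked opponent
    faced and a loss to the lowest-ranked opponent faced."""
    # Unranked opponents (opp_rank is None) are ignored, as in the text.
    ranked = [m for m in matches if m["opp_rank"] is not None]
    if len(ranked) < min_matches:
        return False
    best = min(m["opp_rank"] for m in ranked)    # numerically smallest rank
    worst = max(m["opp_rank"] for m in ranked)   # numerically largest rank
    beat_best = any(m["won"] for m in ranked if m["opp_rank"] == best)
    lost_to_worst = any(not m["won"] for m in ranked if m["opp_rank"] == worst)
    return beat_best and lost_to_worst

# Toy season (hypothetical results apart from the two headline matches):
# a win over the No. 4, a loss to the No. 636, and 28 filler matches.
season = (
    [{"opp_rank": 4, "won": True}, {"opp_rank": 636, "won": False}]
    + [{"opp_rank": r, "won": r % 2 == 0} for r in range(100, 128)]
)

print(beat_best_lost_to_worst(season))  # True
```

One judgment call is how to treat repeat meetings: this version counts a season if there is at least one win over the top-ranked opponent and at least one loss to the bottom-ranked one.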

Three of them included Halep as the top-ten scalp! That makes Bouzkova the fourth player to beat Halep, not face anyone higher ranked, and also lose to her lowest-ranked opponent of the season. (Through eight months, anyway.) Halep shouldn’t feel too bad, though, as Angelique Kerber has been the extreme-ranked loser in five such cases, four of them in 2017. Ouch.

Here are the 25 player-seasons between 2010 and 2018 in which a WTAer beat her highest-ranked opponent and lost to her lowest:

Year  Player       High-Ranked  Rk  Low-Ranked  Rk       
2017  Kasatkina    Kerber       1   Kanepi      418      
2018  Hsieh        Halep        1   Gasparyan   410      
2010  Jankovic     Serena       1   Diyas       268      
2010  Clijsters    Wozniacki    1   G-Vidagany  258   *  
2014  Cornet       Serena       1   Townsend    205      
2010  Yakimova     Jankovic     2   Dellacqua   980      
2017  Bouchard     Kerber       2   Duval       896   *  
2017  Vesnina      Kerber       2   Azarenka    683      
2016  Bencic       Kerber       2   Boserup     225      
2014  Rybarikova   Halep        2   Eguchi      183      
2017  Mladenovic   Kerber       2   Andreescu   167   *  
2018  Goerges      Wozniacki    3   Serena      451      
2014  Tomljanovic  Radwanska    3   A Bogdan    308      
2015  Mladenovic   Halep        3   Savchuk     262      
2017  Kerber       Pliskova     4   Stephens    934      
2014  Pavlyu'ova   Radwanska    4   Wozniak     241      
2017  Dodin        Cibulkova    5   Rybarikova  453      
2017  Bellis       Radwanska    6   Azarenka    683      
2018  Buyukakcay   Ostapenko    6   Di Sarra    555      
2017  Sakkari      Wozniacki    6   Potapova    454      
2015  L Davis      Bouchard     7   E Bogdan    527      
2015  Ostapenko    S-Navarro    9   Dushevina   1100  *  
2016  KC Chang     Vinci        10  S Murray    862      
2018  Pera         Konta        10  Hlavackova  825      
2018  Danilovic    Goerges      10  Pegula      620

* also faced one unranked player

A quick glance is all it takes to establish that Vandeweghe isn’t the first lowest-ranked player to inspire a “yeah, but” reaction. The list of purportedly weak opponents is very strong for one made up of players with an average ranking outside of the top 500. We have stars such as Victoria Azarenka (twice) and Serena as well as a helping of prospects such as Bianca Andreescu and Victoria Duval.

Consider this today’s reminder of the limitations of the WTA computer rankings. They tell us who has won a lot of matches in the last 52 weeks, not necessarily who is playing well right now. These cases include many of the most extreme mismatches between official ranking and on-the-day ability. I don’t think it says anything meaningful about a player to show up on this list–though Kerber’s many appearances (as both player and scalp!) are a good summary of her disappointing 2017 campaign.

Bouzkova will remain on the list for at least a couple more days: Serena is currently ranked 10th and both of the other semi-finalists are ranked lower, so Halep will remain her “toughest” opponent. Despite the Czech’s breakout week, it would be understandable if she found herself overawed to face a 23-time slam champion across the net. But one thing is certain: Bouzkova couldn’t care less about the number next to the name.