
How Grigor Dimitrov Unbalanced Holger Rune in Brisbane

*Grigor Dimitrov. Credit: Bradley Kanaris / Getty*

Grigor Dimitrov was long known as “Baby Fed,” but yesterday, Holger Rune was the one trying to do a Roger Federer impression. Facing break point at 3-all in the second set, Rune kicked a second serve wide, got a cross-court slice reply, then ran around his backhand to smack an inside-in forehand: a high-risk, high-reward shot, especially if you aim for the line. Rune went big and he pulled it wide. That was the only break of the match.

The 20-year-old had already missed one of those in the same game: The first error dug him a 15-40 hole. Over the course of the match, he attempted seven inside-in forehands, a shot that usually wins him two out of three points. Against Dimitrov, he blew four of them.

The errors are a symptom of something that separates Rune from the top of the game. In his eagerness to maintain an aggressive position at the baseline–a willingness that defines his style and, in fairness, often pays off–he tries a bit too hard. He swings to end points in three shots that probably need to go five. He keeps a toe on the baseline when he ought to be one step further back.

This isn’t a secret, and Dimitrov exploited it. The Bulgarian landed 82% of his returns behind the service line, compared to a tour average of 70%. 39% of Dimitrov’s returns fell in the back quarter of the court, beating the 28% that players typically face. In rallies, the veteran kept pummeling Rune’s feet, prioritizing depth over direction.

The strategy worked. Take the other pivotal juncture of the match, early in the first-set tiebreak. Serving at 0-1, Rune pushed Dimitrov off the court with an inside-out forehand, which came back as a deep slice. Nothing special, but as Rune stepped back to accommodate it, he hit an equally indifferent reply. Dimitrov came back with another middle-deep backhand and Rune hit the tape with as pedestrian an error as you’ll ever see. At 0-2, Rune’s plus-one forehand forced Dimitrov deep and set up the point for an easy finish–or so he thought. Dimitrov managed to get his defensive forehand deep enough that Rune stepped in–his back foot on the baseline–and the result was another miss that would leave a club player berating himself.

On both points, a slightly more conservative court position, or a better last-minute adjustment step, would have let Rune continue the rally with his opponent on the run. Most players tread more carefully in tiebreaks. Rune did not: He missed twice and fell to 0-3. He got one point back but couldn’t close the entire gap and lost the first set, 7-6(5).

Middle-deep mediocrity

Yesterday wasn’t the first time that Rune misread a neutral opportunity as a chance to go big. His own-the-baseline strategy is a mixed bag, the best example of which is how he responds to service returns that land at his feet. The Match Charting Project codes every return by direction (cross-court, middle, or down-the-line) and by depth (shallow–in front of the service line, deep–behind it, or very deep–in the back quarter of the court). Dimitrov placed 13 of his returns in the middle-deep region, and Rune won just 5 of those points.

When a return lands middle-deep, the point is fully up for grabs. Counting both first- and second-serve points, the server wins roughly 49% of the time from that position. (Once a deep return is in play, any lingering effect of a big serve is mostly erased.) A top player should do better, but Rune does not. Here are the career outcomes of those points for the current ATP top four, plus the two Brisbane finalists:

Player             W/FE%   UFE%  PtsWon%  
Novak Djokovic      6.8%   7.1%    53.8%  
Jannik Sinner       5.7%   6.0%    51.6%  
Daniil Medvedev     5.3%   5.9%    50.6%  
Carlos Alcaraz      8.0%   6.2%    50.1%  
Grigor Dimitrov     9.6%   7.9%    49.6%  
--Average--         7.4%   8.7%    48.9%  
Holger Rune        11.5%  10.9%    48.0%

Rune is much more aggressive than his peers in these situations. It may feel like it pays off, since he ends more points with winners (or forced errors) than unforced errors. But the bottom line tells another story: He wins fewer points than average, and trails the best players in the game by a sizeable margin. As Djokovic, Sinner, and Medvedev can tell you, from a neutral position, immediate outcomes don’t matter as much as point construction.

It’s the same story later in the rally. Dimitrov won those two crucial tiebreak points by putting his second shot near the baseline. The serve return isn’t unique: Any stroke that lands in the middle-deep region turns the point into a 50-50 proposition. The above table showed how players fare from that position on the plus-one shot. Here are the numbers for everything after that:

Player           Winner%   UFE%  PtsWon%  
Carlos Alcaraz      8.2%  12.8%    55.3%  
Grigor Dimitrov     6.6%   6.3%    54.7%  
Novak Djokovic      6.2%   8.0%    54.6%  
Jannik Sinner       7.2%  10.5%    52.3%  
Daniil Medvedev     4.7%   6.8%    52.0%  
--Average--         7.1%  10.2%    49.3%  
Holger Rune         9.4%   9.7%    49.0%

The order changes, and Rune’s aggression doesn’t stand out like it does earlier in the rally. But the message is the same, only with a wider margin. Given the mix of players represented in the Match Charting Project, “average” is better than tour average, but it’s still a number Rune needs to surpass.

The second table, finally, brings us back to Dimitrov. If he hadn’t played yesterday, I wouldn’t have thought to include him on the list with the top four, but in this type of situation–one that demands both patience and tactical soundness–he rates with the best in the game.

Faced with an over-aggressive, slightly erratic opponent, the 32-year-old took advantage and turned in a workmanlike performance. That isn’t a dig: Dimitrov didn’t need fireworks, just steadiness. By my count, he racked up just 10 unforced errors to Rune’s 29, and just one of them–serving for 4-0 in the tiebreak–came at a critical moment. It’s nothing so flashy as the “Baby Fed” moniker once promised, but Dimitrov’s mature game has gotten him up to 7th place on the Elo list, and a return to the official top ten is not far away.

* * *


The Most Exclusive Clubs In Tennis

Tireless podcaster Alex Gruskin likes to talk about what he calls the “top-ten, top-15, top-20, and top-25 clubs.” He works out the membership of each one by consulting the Tennis Abstract ATP and WTA stats leaderboards, which display dozens of metrics for each of the top 50 ranked players on both tours.

To qualify for Alex’s “top ten club,” a player needs to be in the top ten in both hold percentage and break percentage–in other words, to be an elite server and returner. Even cracking the top 25 club is no easy task. In 2023, only 11 men were better than half of the top 50 on both sides of the ball. It’s more common to excel at one or the other. In 2022, the best returner (Diego Schwartzman) ranked 50th out of 50 on serve, and the best server (Nick Kyrgios) came in 40th on return.

The top-25 club is a high standard, and the top-ten club is a stratospheric one. This year, only three men–Novak Djokovic, Jannik Sinner, and Carlos Alcaraz–made the cut, and Alcaraz almost missed it, ranking 10th in hold percentage. Daniil Medvedev almost qualified, but he trailed Alcaraz by 0.7% in hold percentage and came in 11th in that category.

Three top-ten clubbers is, as it turns out, an unusual showing. In the 33 seasons for which we have the necessary stats to calculate hold and break percentage (back to 1991), only 13 men have ever managed the feat. Many of them did it several times, so there are a total of 49 player-seasons that qualify. For the two-plus decades between 1991 and 2011, there were only two seasons in which more than one player reached both top-ten thresholds. In 1992, the entire tour fell short.

By “club” standards (and most others), Djokovic’s 2023 season was particularly impressive. Alex usually classifies players into round-number clubs, occasionally giving credit to a near-miss who makes, for instance, the “top 26” club. We can extend the concept a bit further and place every season into its best possible club: If a player ranks in the top three by both hold and break percentage, he’s in the “top-three” club; if he ranks among the top four in both, he’s in the “top-four club,” and so on.
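The "best possible club" idea reduces to taking the worse of a player's two ranks. Here is a minimal Python sketch of that reading; the function name `club` and the dictionary layout are illustrative choices, not anything from the Tennis Abstract leaderboards, and the ranks are copied from the 2023 tables further down this post.

```python
def club(hold_rank: int, break_rank: int) -> int:
    """Smallest N such that the player is top-N in both hold% and break%."""
    return max(hold_rank, break_rank)


# (hold% rank, break% rank) for four men in 2023, as listed in this post.
ranks_2023 = {
    "Novak Djokovic": (1, 3),
    "Jannik Sinner": (5, 4),
    "Carlos Alcaraz": (10, 1),
    "Daniil Medvedev": (11, 2),
}

# Best possible club for each player.
clubs = {name: club(h, b) for name, (h, b) in ranks_2023.items()}

# Players who clear the top-ten threshold on both sides of the ball.
top_ten_club = sorted(name for name, c in clubs.items() if c <= 10)

print(clubs)
print(top_ten_club)
```

Under this definition Medvedev's 11th-place hold rank is what keeps him out of the top-ten club, regardless of how strong his return numbers are.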

In 2023, Novak led the tour in hold percentage and was bested by only Alcaraz and Medvedev in break percentage. Thus, he’s a member of the top-three club. More exclusive categories are hard to find. Here’s the complete list of top-three clubbers since 1991, along with their ranks in hold percentage (H% Rk) and break percentage (B% Rk):

Year  Player          H% Rk  B% Rk  CLUB  
2023  Novak Djokovic      1      3     3  
1999  Andre Agassi        3      1     3  
1995  Andre Agassi        3      3     3

That’s it.

Sinner’s 2023 campaign was also sneakily great. He finished a deceptive fourth on the official ATP points table, but by ranking fifth in hold percentage and fourth in break percentage, he joined an absurdly elite group of top-five clubbers whose only other members are Djokovic, Agassi, Rafael Nadal, and Roger Federer.

Here’s the full list of top-ten club seasons since 1991:

Year  Player            H% Rk  B% Rk  CLUB  
2023  Novak Djokovic        1      3     3  
1999  Andre Agassi          3      1     3  
1995  Andre Agassi          3      3     3  
2021  Novak Djokovic        4      3     4  
2013  Rafael Nadal          4      1     4  
2008  Rafael Nadal          4      1     4  
2002  Andre Agassi          4      3     4  
2023  Jannik Sinner         5      4     5  
2019  Rafael Nadal          5      1     5  
2017  Rafael Nadal          5      2     5  
2015  Novak Djokovic        5      1     5  
2014  Novak Djokovic        5      2     5  
2012  Rafael Nadal          5      1     5  
2007  Rafael Nadal          5      2     5  
2006  Roger Federer         2      5     5  
2003  Andre Agassi          5      3     5  
                                            
Year  Player            H% Rk  B% Rk  CLUB  
2022  Novak Djokovic        6      4     6  
2013  Novak Djokovic        6      2     6  
2021  Daniil Medvedev       7      4     7  
2020  Rafael Nadal          7      2     7  
2019  Novak Djokovic        7      2     7  
2012  Novak Djokovic        7      2     7  
2011  Novak Djokovic        7      1     7  
2010  Rafael Nadal          2      7     7  
2008  Novak Djokovic        7      4     7  
2004  Roger Federer         2      7     7  
2021  Alexander Zverev      8      7     8  
2020  Daniil Medvedev       8      8     8  
2018  Novak Djokovic        8      5     8  
2016  Novak Djokovic        8      2     8  
2015  Roger Federer         4      8     8  
2005  Roger Federer         2      8     8  
2001  Andre Agassi          8      3     8  
1998  Marcelo Rios          8      2     8  
1991  Stefan Edberg         4      8     8  
                                            
Year  Player            H% Rk  B% Rk  CLUB  
2022  Daniil Medvedev       8      9     9  
2020  Andrey Rublev         9      5     9  
2018  Rafael Nadal          9      1     9  
2017  Roger Federer         2      9     9  
2009  Andy Murray           9      2     9  
2007  Roger Federer         3      9     9  
2000  Andre Agassi          8      9     9  
2023  Carlos Alcaraz       10      1    10  
2020  Novak Djokovic       10      4    10  
2019  Roger Federer         3     10    10  
2013  Roger Federer         7     10    10  
1998  Andre Agassi         10      3    10  
1994  Andre Agassi         10      5    10  
1993  Thomas Muster        10      4    10

The list is heavily weighted toward the Big Three and the current era. Whether it’s surface speed convergence or something about the players themselves, it’s tougher to reach the top with a lopsided game these days. Stefan Edberg was a top-eight clubber in 1991 (and might have been as good for several seasons before that), but Pete Sampras didn’t get anywhere close. His best showing by this metric came in 1997, when he cracked the top-14 club. Andy Roddick never even cleared the top 30.

Finally, here are the 15 men who reached both top-30 thresholds in 2023:

Year  Player            H% Rk  B% Rk  CLUB  
2023  Novak Djokovic        1      3     3  
2023  Jannik Sinner         5      4     5  
2023  Carlos Alcaraz       10      1    10  
2023  Daniil Medvedev      11      2    11  
2023  Andrey Rublev        17     11    17  
2023  Karen Khachanov      18     16    18  
2023  Alexander Zverev     15     18    18  
2023  Grigor Dimitrov      19     15    19  
2023  Taylor Fritz          6     19    19  
2023  Casper Ruud          21     17    21  
2023  Holger Rune          20     21    21  
2023  Frances Tiafoe        9     26    26  
2023  Ugo Humbert          29     23    29  
2023  Roman Safiullin      30     24    30  
2023  Sebastian Korda      14     30    30

Women’s clubs

The WTA gets short shrift on topics like these, because much less historical data is available. I only have the necessary stats back to 2015, and even that season is incomplete.

Still, that doesn’t make some recent individual performances any less impressive. Iga Swiatek’s effort in 2023 predictably stands out: She came in third behind Aryna Sabalenka and Caroline Garcia in hold percentage, and she trailed only Sara Sorribes Tormo and Lesia Tsurenko in break percentage. By finishing third in both categories, she–like Djokovic–is a member of the top-three club.

Depending on how you define a full season, Iga might be the first woman ever to reach such a standard, at least in the nine-year span for which we can do the math. Here is the full list of top-ten clubbers back to 2015:

Year  Player             H% Rk  B% Rk  CLUB  
2016  Victoria Azarenka      2      1     2  
2023  Iga Swiatek            3      3     3  
2022  Iga Swiatek            5      1     5  
2019  Serena Williams        1      6     6  
2015  Serena Williams        1      7     7  
2016  Serena Williams        1      8     8  
2016  Angelique Kerber      10      6    10

Azarenka’s run in 2016 was really a partial season: She hurt her knee and didn’t play again after retiring from her first-round match at the French. Her first four months of tennis put her on the path toward a historic campaign, but we’ll never know how it would have turned out. Those 29 matches can’t really be measured with the same yardstick as Iga’s 75-plus in each of the last two years. Serena’s three entries on this table were almost as abbreviated, but again we’re reminded of the limited data. Surely the list would be much longer, with many more instances of the Williams name, if we had better data.

Anyway, all hail the great Iga. May her reign last until Sabalenka figures out how to become a top-ten returner.

At least this year, it was slightly harder to crack the top-25 and top-30 clubs in the women’s game than it was in the men’s. Here is the full 2023 women’s list down to the top-32 threshold, which allows us to include a few names of interest who missed out on the top 30:

Year  Player               H% Rk  B% Rk  CLUB  
2023  Iga Swiatek              3      3     3  
2023  Cori Gauff              13      8    13  
2023  Jessica Pegula          16      5    16  
2023  Madison Keys             6     16    16  
2023  Barbora Krejcikova      12     18    18  
2023  Victoria Azarenka       19     17    19  
2023  Aryna Sabalenka          1     20    20  
2023  Marketa Vondrousova     22      6    22  
2023  Karolina Muchova         8     22    22  
2023  Leylah Fernandez        20     27    27  
2023  Jelena Ostapenko        28     12    28  
2023  Marie Bouzkova          29     21    29  
2023  Caroline Dolehide       23     30    30  
2023  Elina Svitolina         31     24    31  
2023  Beatriz Haddad Maia     18     31    31  
2023  Ons Jabeur              32      9    32  
2023  Belinda Bencic           5     32    32

More than ever, a well-rounded game is a necessity for players who hope to reach the top. For fans, “clubs” like these are a useful way to think about which stars are getting the job done on both sides of the ball.

* * *

I’ll be writing more about analytics and present-day tennis in 2024.

Alexander Bublik and Return of Serve Futility

In Sunday’s Singapore final, Alexander Bublik won six return points. Not a typo. Out of Alexei Popyrin’s 52 serve points, that’s a win percentage of 11.6%. The technical term for this level of performance is… bad.

Yet somehow, Bublik concentrated four of those points in the fifth game and broke serve. (Popyrin helped–one of the four was a double fault.) Even more miraculously, it was the only break in the opening set, so the Russian won the set and got halfway to the title. Alas, he cranked the futility up another notch, winning only one return point the rest of the way, and it was Popyrin who came away with his maiden championship.

Freakish statistical feats tend to raise three questions: What are the odds? Has this ever happened before? And, can we learn anything from this nonsense?

What are the odds?

If Bublik had that exact 11.6% chance of winning each return point, his probability of breaking in any given game would be 0.26%, or about 1 in 384. In reality, it’s probably higher than that, because servers aren’t robots. Presumably Popyrin’s level dipped a bit. Still, if we take that 0.26% as the answer, Bublik’s likelihood of breaking serve at least once in the 22-game match was less than 3%.
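For anyone who wants to check the arithmetic, here is a short Python sketch of the standard calculation. It assumes every point is independent with the same win probability, and it assumes 11 return games (half of the 22 games mentioned above); the function name `p_win_game` is mine.

```python
def p_win_game(p: float) -> float:
    """Chance of winning a deuce/advantage game when each point is won
    independently with probability p."""
    q = 1.0 - p
    # Win before deuce: four points won, with 0, 1, or 2 points lost along
    # the way (1, 4, and 10 possible orderings respectively).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach 3-3 in C(6,3) = 20 ways, then win two straight points before
    # the opponent does.
    from_deuce = 20 * p**3 * q**3 * (p**2 / (1 - 2 * p * q))
    return before_deuce + from_deuce


p_break = p_win_game(0.116)            # returner wins 11.6% of points
return_games = 11                      # assumption: half of a 22-game match
p_at_least_one = 1 - (1 - p_break) ** return_games

print(f"break in a given game: {p_break:.2%}")
print(f"about 1 in {1 / p_break:.0f}")
print(f"at least one break in {return_games} games: {p_at_least_one:.1%}")
```

The independence assumption is the weak link, as noted above: a server whose level dips for a few points makes a break more likely than this model says.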

You probably don’t need the precise numbers to recognize that, if you win six return points in the whole match, your odds of breaking serve aren’t that great.

Has this ever happened before?

The answer depends on what you mean by “this.” In our 30 years of ATP tour matches with stats on things like return points won and breaks of serve, the Singapore final was the first time that a player broke serve and won a set despite winning six or fewer return points.

It’s fairly common for a player to have a very bad return day, or face an extremely hot server. On average, there are about 30 completed tour-level matches per year in which the loser manages six or fewer return points. But of those 900-plus matches, the official stats only show seven times that the loser managed to break serve. (I emphasize “official” here because the ATP’s stats do have errors, and extreme situations like these tend to bring them out of hiding. A simple data-entry error can easily make a routine match look like a record-breaker.)

The most recent instance of six-return-points-and-a-break was in 2010, when Lukasz Kubot concentrated his efforts in a single return game of a Bucharest first-round match against Filippo Volandri. Every match on the list was a first-rounder except for a 1995 quarter-final at the Tokyo Indoors, when Alexander Volkov managed to break Michael Chang despite winning only those few return points.

Every six-pointer was a straight set loss, at least until Bublik came along.

Except… it’s possible to win six or fewer return points and win a set without breaking serve. In fact, it’s theoretically possible to win an entire match with only two return points going your way, if you deploy them in tiebreaks and remain flawless on your own deal. Reilly Opelka did exactly that (well, he won six points, not two) in Basel two years ago against Cristian Garin. Garin won all but 6 of his 69 service points but lost, 7-6(5) 7-6(10).

Bublik’s feat in the Singapore final wasn’t quite that level of oddity, but as an accomplishment amid return futility, his break-and-a-set is a close second.

Can we learn anything from this nonsense?

Bublik is a talented player, but he’s not a very good returner. This was his third career ATP final (excluding a two-game retirement in January), and his rates of return points won in those matches are 26.7%, 18.9%, and now 11.6%. It’s no surprise that he’s still looking for his first title. It turns out that underarm serving doesn’t have any secret advantages for his return game.

He has won 35.6% of his return points over the last 52 weeks–an improvement over his 34.1% mark at tour level in 2019, but still only good for 42nd out of the current ATP top 50. If he continues to serve big, that’s good enough for an Isner-like career, possibly spending considerable time in the top 20, maybe even with a brief stop in the top ten.

But to reach the next level, the Russian will need to return a lot better. Several years ago, I looked at the “minimum viable return game” necessary for an elite player. At the time, I was interested in Nick Kyrgios’s chances at a spot near the top of the rankings despite his own brand of return futility. In the 25 years between 1991 and 2015, when I wrote that piece, only four players finished a season in the top five while winning less than 37% of their return points, and two of those were within a percentage point of the threshold.

Kyrgios wasn’t close to that level then, and he still isn’t. Bublik is closer, but he’s still on the wrong side of the line. Optimists can point to the Russian’s relative youth–he turns 24 in June–and trust he’ll improve. Of course he might, but history isn’t on his side there, either. Kyrgios’s lack of progress is typical of the breed. Mediocre returners may improve their skills and tactics, but as they do so, they face more difficult opponents, keeping their numbers down.

If there is a positive take-away from the Singapore final, it’s that Bublik did manage to bunch his return points. Kyrgios outplays his numbers by saving his heroics for bigger moments. (Another way of looking at “outplaying his numbers” is “underperforming given his skills.”) Bublik shows signs of doing the same, so when he does manage to win more than six return points, he may be able to eke disproportionate gains out of them.

That’s the theory, anyway.

Charting Aryna Sabalenka’s Win Streak

Aryna Sabalenka has won 3 titles and 14 matches in a row. Let’s dig into the data and see if we can identify any improvements that would account for her success.

For the Match Charting Project, I’ve logged every shot of each of the Belarussian’s tour-level matches. (There are a few exceptions where I haven’t found video.) We’ll look at hard-court matches only today. With that constraint, we have 140 Sabalenka matches, dating back to early 2017 (including the current streak), and another 1,121 women’s tour-level contests over the same time period for reference.

Big serving?

Aryna always brings a powerful serve, but it remains a work in progress, at least tactically. The key metric for pure serve dominance is unreturned serves–quite simply, serves that don’t come back. While some are aces, they don’t have to be, and the distinction doesn’t really matter.

This first graph has a lot going on, but as I’ll use the same basic template for several more figures, it’s worth taking a moment to understand what we’re looking at. The two dotted lines show tour average rates of unreturned serves (the lower average is for all players; the higher one is for match winners), the thin jagged line shows Sabalenka’s rate of unreturned serves for each individual match, and the thicker red line shows her five-match rolling average.
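For reference, a five-match rolling average can be computed along these lines. This is only a sketch: it assumes a trailing window (the current match plus the four before it), and the per-match rates are made-up numbers, not Sabalenka's.

```python
def rolling_mean(values, window=5):
    """Trailing mean over the last `window` values.

    Returns None for positions where fewer than `window` matches exist.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical per-match unreturned-serve rates, in chronological order.
rates = [0.28, 0.35, 0.31, 0.26, 0.40, 0.33]
smoothed = rolling_mean(rates)
print(smoothed)
```

The smoothing is what makes the thicker line readable: single-match rates swing with the opponent and the day, while the five-match window shows whether a level is being sustained.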

Her five-match rolling average has been above 30% for the entire win streak. It’s not an unprecedented level for her, though–she sustained similarly high levels at various points over the last three years. (We should also be a bit cautious ascribing serve effectiveness to a player when the Ostrava, Linz, and Abu Dhabi courts might have been faster than average.) Consistently powerful serving has certainly helped Sabalenka’s cause, but it probably isn’t the whole story.

We might gain from breaking down Aryna’s serve effectiveness into first and second serves. First, let’s look at something else:

Serve plus one

There are two ways we could look at “serve plus one” effectiveness, and we’ll do both. First, let’s count Sabalenka’s opportunities to hit a second shot behind her serve, and see what percentage she puts away. (As with aces and other unreturned serves, the “winner” concept is a distraction: I’m counting second-shot winners together with shots that force errors. If you end the point, it doesn’t matter much whether your opponent touches the ball.)

The second figure shows us that, on hard courts, when women are faced with a second shot behind their serve, they finish the point about 20% of the time. Sabalenka’s career average is 28%. She far exceeded that over a string of four matches to finish Ostrava and start Linz, maxing out at 42% against Jennifer Brady in the Ostrava semi-final. Since then, her rate returned to roughly her (impressive) career average.

This measure is something of a “key to the match” for Sabalenka. When she converts at least 30% of second-shot opportunities behind her serve, she wins 91% of her matches. When she doesn’t, she wins 62%. Of course, 62% is nothing to be ashamed of, and the dip visible in early 2020 coincides with her Doha title, the one time in her career that the five-match rolling average fell below 20%.

Serve plus serve plus one

These first two measures are related, of course. A big server should post good numbers in both. But a great “pure” serving day might mean a worse-looking serve-plus-one day, because fewer weak returns are coming back at all. The reverse holds as well: A strong server might not hit as many unreturned serves as usual because her opponent is managing to just barely put them back in play–easy sitters for second shots.

To identify the combined benefits of good serving and efficient serve-plus-one’ing, we simply count how often Sabalenka wins service points in two shots or less.
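To make the relationship between the three serve metrics concrete, here is a rough Python sketch. The outcome labels and the point counts are hypothetical, and the denominators are my assumptions: unreturned serves and two-shot wins are divided by all service points, while serve-plus-one putaways are divided by points where the return came back.

```python
from collections import Counter


def serve_summary(points):
    """Summarize a list of service-point outcome labels.

    Labels (hypothetical): "unreturned", "putaway" (server's second shot is
    a winner or forces an error), "double_fault", and "other".
    """
    n = Counter(points)
    total = len(points)
    # Second-shot opportunities: the serve went in and the return came back.
    opportunities = total - n["unreturned"] - n["double_fault"]
    return {
        "unreturned_rate": n["unreturned"] / total,
        "plus_one_rate": n["putaway"] / opportunities,
        "two_shot_rate": (n["unreturned"] + n["putaway"]) / total,
    }


# A made-up 60-point serving day.
points = (
    ["unreturned"] * 18
    + ["putaway"] * 12
    + ["double_fault"] * 2
    + ["other"] * 28
)
summary = serve_summary(points)
print(summary)
```

Because the combined rate uses all service points as its base, a great day of unreturned serves and a great day of putaways can both show up in it, even though each one shrinks the other's opportunities.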

We’ve already seen the two components of this, so there are no surprises here. The typical player wins about 40% of her service points this way, and Aryna has historically averaged 46% on hard courts. This number looks as good for her recent winning streak as we’d expect. But as with the previous graph, it suggests weakness during her 2020 Doha title, so the predictive power here is limited.

First and second serves

The combined metric of unreturned serves plus second-shot putaways gives us a good snapshot of when the offensive game is working. Let’s break down the previous graph into first- and second-serve specific numbers:

These track the overall numbers. Aryna has generally been good lately on both first and second serves, but with neither one has she been more successful or consistent than in previous hot streaks. Second serves are particularly hard to rate because the per-match sample size is so small–fewer than 30 second serve points per player per match, and some of those end up as double faults.

Before moving on to the return game, let’s look at one more indicator of service-point success:

Longer points on serve

As I said at the outset, Sabalenka has always been a good server. While her current momentum might owe a bit to fewer mental lapses on serve, it would be logical to look elsewhere for an explanation, simply because there was more room to improve in other areas.

We’ve seen how her serve and second shot rate. What about serve points that go deeper? This metric considers all points where the returner’s second shot comes back, and then counts how often the server goes on to win the point.

The average hard-court WTA match winner claims almost exactly half of her service points when the rally reaches five shots. Over her career, Sabalenka has won 48%, worse than the typical match winner but better than the overall tour average.

Aryna has done better lately. To cherry-pick a starting point, she has won 51% of these points in her last 24 matches, dating back to the Doha second round. Her average over the first five matches in Abu Dhabi was 55%, the best she has managed since her breakout run in late 2018, when she pushed Naomi Osaka to three sets at the US Open and hoisted the Wuhan trophy a few weeks later.

Return winners

We’ll walk through the dimensions of her return performance in a similar manner, starting with return winners (and point-ending non-winners), then on to “return-plus-one” putaways, followed by the combination of the two.

First, return winners. I use the number of point-ending return winners divided by in-play serves–that is, excluding double faults.

Veronika Kudermetova had a rough day last Wednesday, so Sabalenka’s current five-match rolling average is as high as it’s been since early 2018. Apart from that last-minute burst of return dominance, her recent return winner rates look a bit like the serve stats: consistently solid, if not spectacular.

Return plus one

How about when the serve return doesn’t finish the job? This “return plus one” metric counts opportunities when the server puts her second shot in play and measures how often the returner hits a winner or forces an error with her own second shot. The sample sizes are getting a bit small here (each player has 43 such opportunities in an average hard-court match), so the per-match rates are rather spiky:

The small single-match samples, combined with the relationship between return-plus-one and return winners–almost interchangeable ways to respond successfully to a mediocre serve–render conclusions a bit tough to come by. Sabalenka was average by this measure in Ostrava, great in Linz, and all over the place in Abu Dhabi.

Short return points won

Will things be clearer when we combine both methods of quickly winning a return point?

Aside from a weak return performance against Elena Rybakina in Abu Dhabi, Sabalenka has been comfortably above average in this metric in every match since she faced Victoria Azarenka in the Ostrava final.

Like “serve plus one,” this is a good indicator of overall success for the Belarussian. If we use this metric to split her 140 charted hard-court matches in half, the dividing line is 27.5% of return points won with a return winner or a return-plus-one putaway. Above that mark, she has won 62 matches, or 88.6%. Below it, she has won only 41, or 58.6%. She was above the line in nearly all of her matches in Linz and Abu Dhabi, and she sat at 25% or higher in every round of her 2020 Doha triumph, clearing 30% in three of five matches there.
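The split-in-half comparison works roughly like this. The sketch below uses six invented matches rather than the 140 charted ones, and it sends any match exactly at the median to the lower half, which is an arbitrary choice on my part.

```python
from statistics import median


def split_win_rates(matches):
    """matches: list of (metric_value, won) pairs.

    Returns (cutoff, win rate above the cutoff, win rate at or below it).
    """
    cutoff = median(value for value, _ in matches)
    above = [won for value, won in matches if value > cutoff]
    below = [won for value, won in matches if value <= cutoff]
    return cutoff, sum(above) / len(above), sum(below) / len(below)


# Hypothetical (short-return-points-won rate, match won) pairs.
matches = [
    (0.22, False),
    (0.25, True),
    (0.27, False),
    (0.28, True),
    (0.31, True),
    (0.35, True),
]
cutoff, rate_above, rate_below = split_win_rates(matches)
print(cutoff, rate_above, rate_below)

# The same division applied to the counts quoted above: 70 matches per half.
print(f"{62 / 70:.1%} above, {41 / 70:.1%} below")
```

Splitting at the median keeps the two groups the same size, so the gap in win rate isn't an artifact of comparing a handful of great days against everything else.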

First and second serve returns

Has she been particularly devastating against first or second serves? Let’s see:

Few women feast on second serves the way Sabalenka does, and she’s been particularly relentless of late. The typical tour player wins about 30% of second-serve return points with a first- or second-shot putaway, and over her last 15 matches, Aryna has won 41% that way. 41% is a respectable total percentage of return points won against many servers, and Sabalenka would be winning that many even if she refused to hit more than two shots per rally.

Granted, Sabalenka doesn’t hit that many fifth or sixth shots. How does she fare when her return points extend that far?

Long return points

You’ll be glad to know that the code for this final* graph didn’t throw any divide-by-zero errors–Aryna has played at least one “long” return point in each of her hard-court matches. This metric tallies up all return points in which the server puts her third shot in play, then calculates how often the returner won the point.

* Yes! It’ll be over soon!

This is another spiky mess, with an average of only 20 points per match. Still, if we’re looking for a category in which Sabalenka is newly excelling–not just thriving as usual–this could be our smoking gun.

Tour average for match winners on this stat is 46.7%. The server has an advantage by definition, because she has just put the ball back in play. The Belarussian’s career mark is 44.4%, only a bit better than the overall average. Yet in her last 15 matches, she has won 48.0% of these long return points, her best 15-match span since early in her career, when she faced a weaker mix of opponents.

I don’t want to overemphasize this: When there are only 20 points of this type per match, an improvement of 3.6 percentage points translates to a gain of less than one point per match. That doesn’t explain the magnitude of Sabalenka’s recent gains. But it does indicate that she is shoring up one of her few weaknesses, and in combination with her solid play on long serve points, it suggests that she no longer needs to rely on a one-two punch, even if her one-two punch is as dizzying as anyone’s.
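For anyone who wants to check that arithmetic, here is a minimal sketch. The function name is my own invention for illustration, and the inputs are simply the rounded figures quoted above (20 long return points per match, 44.4% career, 48.0% over the last 15 matches).

```python
def points_gained_per_match(points_per_match, old_rate_pct, new_rate_pct):
    """Extra points won per match when a win rate moves from old to new.

    Rates are percentages (e.g. 44.4), not fractions.
    """
    return points_per_match * (new_rate_pct - old_rate_pct) / 100.0


# Figures quoted in the text: about 20 long return points per match,
# a career mark of 44.4%, and 48.0% over the last 15 matches.
gain = points_gained_per_match(20, 44.4, 48.0)
print(f"{gain:.2f} extra points per match")
```

The result is well under one point per match, which is why the improvement cannot explain the whole surge on its own.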

Don’t make me say consistency

Tennis matches are decided by a handful of points: While Sabalenka has been dominant lately, she lost more points than she won against Coco Gauff in the Ostrava opening round. As such, improvements always look minor when we try to quantify them, if we can quantify them at all.

I’ve pointed out some areas where Sabalenka may be improving, others where a good statistical showing usually coincides with a W, and still others where an excellent performance doesn’t seem to matter much. All of these categories have one thing in common: She is putting up stellar numbers right now.

Remember, in the twelve graphs above (yes, twelve, sheesh), the dotted yellow lines indicate the average performance of match winners. In every single one of the categories, Aryna’s five-match rolling average is above that line. Every single one! In most cases, it has been above the line for some time.

It doesn’t take any statistical savvy to see that if a player is better than the average match winner in every category, she’ll be awfully tough to beat. The rest of the Australian Open field can only cross their fingers that Sabalenka’s current form won’t survive two weeks of quarantine.

Match Charting Project Return Stats: Glossary

I’m in the process of rolling out more stats based on Match Charting Project data across Tennis Abstract. This is one of several glossaries intended to explain those stats and point interested visitors to further reading.

At the moment, the following return stats can be seen at a variety of leaderboards.

RiP% – Return in play percentage. The percent of return points in which this player got the serve back in play.
RiP W% – Return in play winning percentage. Of points in which the returner got the serve back in play, the percentage that the returner won.
RetWnr% – Return winner percentage. The percentage of return points in which the return was a winner (or induced a forced error).
Wnr FH% – Return winner forehand percentage. Of return winners, the percentage that were forehands (topspin, chip/slice, or dropshot).
RDI – Return Depth Index, a stat recently introduced at Hidden Game of Tennis. The Match Charting Project records the depth of each return, coding each as a “7” (landing in the service box), an “8” (in the back half of the court, but closer to the service line than the baseline), or a “9” (in the backmost quarter of the court). In the original formulation, RDI weights those depths 1, 2, and 4, respectively, and then calculates the average. I’ve tweaked it a bit to reflect the effectiveness of various return depths. For men, the weights are 1, 2, and 3.5, and for women, the weights are 1, 2, and 3.7.
Slice% – Slice/chip percentage. Of returns put in play, the percent that are slices or chips, including dropshots.

The return stats leaderboards also show most of these stats for first-serve returns only, and for second-serve returns only.
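The RDI definition above boils down to a weighted average, and a rough sketch may make it concrete. This is not the code behind the leaderboards: the function name, the dictionary layout, and the choice to skip returns without a recorded depth are all assumptions of mine, and the ten-return sample is hypothetical.

```python
# Depth codes from the glossary: "7" = service box, "8" = back half but
# nearer the service line, "9" = backmost quarter of the court.
RDI_WEIGHTS = {
    "original": {"7": 1.0, "8": 2.0, "9": 4.0},
    "men": {"7": 1.0, "8": 2.0, "9": 3.5},
    "women": {"7": 1.0, "8": 2.0, "9": 3.7},
}


def return_depth_index(depth_codes, weights):
    """Average the depth weights over all returns that have a depth code.

    Assumption: returns with any other code (no depth recorded) are skipped.
    """
    coded = [code for code in depth_codes if code in weights]
    if not coded:
        raise ValueError("no returns with a recorded depth")
    return sum(weights[code] for code in coded) / len(coded)


# Hypothetical sample: three shallow, four medium, and three deep returns.
sample = ["7", "7", "7", "8", "8", "8", "8", "9", "9", "9"]
for tour in ("original", "men", "women"):
    print(tour, round(return_depth_index(sample, RDI_WEIGHTS[tour]), 2))
```

Because only the weight on the deepest returns changes between versions, the same set of returns scores a little lower under the men's weights than under the women's, and highest under the original 1-2-4 scheme.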

Frances Tiafoe’s Narrow Margins

Italian translation at settesei.it

Yesterday, Frances Tiafoe added another breakthrough to his young career with a fourth-round defeat of 20th seed Grigor Dimitrov at the Australian Open. The whole tournament has been a coming-out party for the just-turned 21-year-old, as Tiafoe only got this far thanks to an even more impressive upset of 5th seed Kevin Anderson in the second round. The American will see his ranking climb into the top 30 for the first time, and his marketability as a potential superstar will soar even higher.

The role of the statistical analyst is often to stand athwart an exciting trend yelling “Stop!,” and I’m afraid that’s my role today. Yes, Tiafoe is a compelling young player with a lot of potential. Throughout 2018 he repeatedly demonstrated he could hang with the best players in the world, something he further solidified with the win over Anderson last week. But the Dimitrov win, life-changing as it may be, was a bit of a fluke.

In fact, yesterday’s match was–by a couple of simple metrics–less impressive than a lot of his 2018 losses, including a defeat at the hands of Dimitrov in Toronto last year. Across 337 points against the Bulgarian on Sunday, Tiafoe lost more than half of them, winning only 34.7% of his return points compared to Dimitrov’s 39.5%. The resulting Dominance Ratio (DR) for the match is 0.88, a mark that almost never results in victory. (DR is the ratio of return points won to opponent return points won: 1.0 means that the players performed equally, and higher is better.) On the ATP tour last year, more than 92% of winners recorded a DR of 1.0 or better, and 97.4% of winners–that’s 39 out of every 40–won enough points to amass a DR of 0.9.
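Dominance Ratio is simple enough to compute by hand, but a short sketch shows exactly what goes in and what comes out. The function name is mine, and the inputs are the return-points-won percentages quoted above.

```python
def dominance_ratio(rpw_pct, opp_rpw_pct):
    """Ratio of a player's return-points-won rate to the opponent's.

    1.0 means the two players performed equally; higher is better.
    """
    if opp_rpw_pct <= 0:
        raise ValueError("opponent return-points-won rate must be positive")
    return rpw_pct / opp_rpw_pct


# Tiafoe won 34.7% of return points; Dimitrov won 39.5%.
dr = dominance_ratio(34.7, 39.5)
print(round(dr, 2))  # 0.88
```

Either percentages or raw fractions work as inputs, as long as both arguments use the same scale.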

As I’ve said, many of Tiafoe’s losses have seen him play better. Against Dimitrov in Toronto, his DR was 0.98; versus Anderson in Miami his DR was 0.99 in a straight-set defeat; and even in his routine, 6-4 6-4 loss to Joao Sousa in the Estoril final, his DR was almost as good as it was yesterday, at 0.87. In the range of close-but-outplayed matches–let’s say DRs from 0.85 to 0.99–Tiafoe won 4 of 18 last year, and all but one of the wins were closer than yesterday’s triumph.

The trick to winning a match while tallying fewer than half the total points and a lower rate of return points than your opponent is to play better in the big moments, like break points. The American certainly did that, converting 5 of 13 break opportunities while limiting Dimitrov to only 3 of 18. Execution in tiebreaks also helps, though it didn’t make a difference in yesterday’s upset, as the two men split a pair of breakers. To Tiafoe’s credit, he outplayed the Bulgarian when it mattered most. In that sense, he deserved the victory, no matter what the stats say.

But break point and tiebreak performance tends to even out. Just because the 21-year-old captured lightning in a bottle at a few key moments to win a high-profile match doesn’t mean he’ll be able to do it again. Just as there are almost no players who win tiebreaks any more often than their overall performance would suggest, players with excellent single-year break-point records quickly regress to the mean. It may not be correct to say that Tiafoe was lucky to win yesterday–he may well have kept his focus and maintained his level better than his opponent did–but whatever made the difference, it’s not something with predictive power. Next week, next major, or next year, he isn’t any more likely than the next guy to post a DR of 0.88 and come out on top.

Still, I’m not here just to throw cold water on a young player’s prospects. For one thing, had a couple of break points gone the other way yesterday and Dimitrov gotten through, a fourth-round result would still represent an encouraging step forward for the American. His upset of Anderson sported a particularly impressive DR of 1.29–35.1% of return points won compared to Kevin’s 27.2%–which was better than all but ten of Anderson’s matches last year. (Three of those ten came at the hands of Novak Djokovic, and seven of the ten were against top ten players.)

Tiafoe is getting better, and there are plenty of signs that indicate he’s the brightest young star in American men’s tennis. He’s accomplished a lot of things in Melbourne, but outplaying Dimitrov isn’t one of them.

Mackie McDonald’s Secret Weapon

Italian translation at settesei.it

In the first round on Monday, the 23-year-old American Mackenzie McDonald defeated young Russian Andrey Rublev in four sets, 6-4 6-4 2-6 6-4. While Rublev missed part of the 2018 season due to injury and carries a ranking just inside the top 100, the victory still qualifies as a bit of an upset for McDonald, who has never come close to Rublev’s peak of No. 31.

The handful of fans who kept tabs on Court 10 were treated to an unusual display. The American relentlessly attacked Rublev’s second serve, rushing the net behind his return almost two dozen times. Many players don’t hit return approach shots that often in an entire year. What’s more, the tactic worked. Without it, the already close match would have been a coin flip.

By my count, in the log I kept for the Match Charting Project, McDonald came in behind his second serve return 22 times. Approach shot counts are never precise, because when a player hits a winner or an error, he may lean forward as if to continue toward the net, but quickly stop when he realizes it’s unnecessary. To be precise, he came in at least 22 times, and perhaps one more return winner or a couple of return errors should also be added to the total. No matter, the conclusions are similar regardless of whether the number is 22 or 24.

Rublev hit 62 second serves, but 9 of those resulted in double faults, so we’re looking at 53 playable second serves. McDonald netrushed 22 of those, winning 10. Of the other 31, he won only 11. That’s a return winning percentage of 45% on return approaches compared to 35% on other returns. Had he won all of those points at the 35% rate, it would have cost him two, perhaps three points off his overall total. He barely outscored Rublev as it was, 124 points to 118, so every little bit helped.
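The counterfactual in that paragraph can be laid out step by step. The sketch below uses only the counts given above; the variable names are my own, and the "expected" figure assumes McDonald would have won the approach points at exactly his rate on the other returns.

```python
second_serves = 62
double_faults = 9
playable = second_serves - double_faults  # second serves McDonald could return

approach_points, approach_won = 22, 10    # returns followed to the net
other_points = playable - approach_points
other_won = 11

approach_rate = approach_won / approach_points
other_rate = other_won / other_points

# Assumption: without the tactic, the 22 approach points are won at the
# same rate as the other second-serve return points.
expected_without_tactic = approach_points * other_rate
points_gained = approach_won - expected_without_tactic

print(f"approach: {approach_rate:.1%}, other: {other_rate:.1%}")
print(f"points gained by approaching: {points_gained:.1f}")
```

With a final margin of 124 points to 118, a swing of two or three points is a meaningful share of the gap.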

A rarity in context

The Match Charting Project has shot-by-shot data for nearly 2,000 men’s matches from this decade, and Monday’s four-setter was the first one of those in which a player hit at least 20 second-serve return approaches. (Dustin Brown approached at a higher rate in multiple matches, including his 2015 Wimbledon upset of Rafael Nadal.) There are only ten other matches in the database in which one player hit at least ten such approaches, and Mischa Zverev accounts for three of them. More than three-quarters of the time, the total number of second-serve return approaches is zero.

McDonald is not alone in enjoying some success with the tactic: The 1500 or so second-serve return approaches in the dataset were about 14% more effective than non-approaches in the same matches. However, it’s hard to be sure what that number is telling us, since most players approach so rarely. Some of the attacks are probably on-the-fly decisions against particularly weak serves, not pre-planned plays like many of Mackie’s netrushes on Monday.

Thus, it’s difficult to know how much success most men would have with the tactic, were they to adopt it more often. The fact that they employ it so rarely might tell us all we need to know: If more players thought that attacking the net behind the second serve return would win them more points, they’d do it. But for McDonald, it doesn’t matter what his peers do; it only matters what works for him. These 22 return approaches represented a lot more aggression than he displayed in the four previous matches we’ve charted, and it paid off.

It wasn’t enough to get him a win today against Marin Cilic, but he did outperform expectations, taking a set against the 6th seed and defending finalist. Best of all, he won more than half of Cilic’s second-serve points–a better rate than he managed against Rublev, and several ticks above 46%, the fraction that the average opponent manages against Cilic. In a sport often criticized for its uniformity of tactics, McDonald is an up-and-comer worth watching.

The US Open Surface Speed Puzzle

Italian translation at settesei.it

Almost everyone agrees that the courts were slower at the US Open this year. The players thought so, the media concurred, and the tournament director confirmed that they had slightly changed the physical makeup of the surface in order to slow things down. Even clay-court wizard Dominic Thiem got within two points of the semi-finals, so clearly something changed.

I’m not going to argue with that. But when I set out to measure the change and get a sense of who might have benefited, I kept finding odd results. Almost nothing I tried revealed any clear-cut slowing of the surface, and by some metrics, the courts in Flushing played faster this year. Maybe it was just the heat and humidity–though the numbers don’t make that clear, either.

My usual starting point is my own surface-speed stat, which compares the ace rate at each tournament while controlling for the mix of servers and returners. While the dearth of advanced stats means it is limited to some basic inputs, it usually matches up quite well with our intuition and doesn’t differ too much from Court Pace Index (CPI), an infrequently-available metric based on direct physical measurements. Using my algorithm, the US Open surface was 5% faster than the average surface at an ATP event in the last 52 weeks, compared to last year, when it was 4% slower. Compared to courts at the average WTA tournament, New York was 5% slower this year, versus 19% slower in 2017. The slowest tour-level surfaces (for either gender) have about 50% fewer aces than average, and the fastest have about 50% more.

2017 wasn’t just a blip, either in real life or in my metric. It was similar to 2016, which also rated as considerably slower than this year’s surfaces. We’re left with a discrepancy that may stem from using an algorithm that relies too much on aces: Perhaps players were overwhelmed by the heat and tried more than usual to keep rallies short, or they simply didn’t bother trying to put their racket on first serves as often.

The evidence is clearer that players were more aggressive this year than in 2017. The average rally length, excluding double faults, on courts covered by Slamtracker (179 of the 254 main draw singles matches) fell from 4.28 shots last year to 4.17 this year, a drop of 2.6%. That could be affected by the changing mix of players in the draw (as well as those selected to play on higher-profile courts) so I isolated the 27 players with at least two matches worth of data from both 2017 and 2018. Those 27 saw their rally length drop a tiny bit more, about 3% from last year to this year.
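The year-over-year comparisons in this piece are relative changes, not differences in percentage points. A tiny sketch, with a helper name of my own choosing, reproduces the rally-length figure from the two averages quoted above.

```python
def pct_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Average rally length in Slamtracker matches: 4.28 shots in 2017, 4.17 in 2018.
print(round(pct_change(4.28, 4.17), 1))  # -2.6
```

Because the published averages are already rounded, recomputing a change from them can differ slightly from a figure derived from the underlying data.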

We have the beginnings of an explanation. If players were showing more aggression–perhaps because the heat encouraged them to adopt more first-strike tactics–that could cancel out the effect of a slower surface. We can drill down even further using the Aggression Score (AS) metric, which measures the rate of winners and unforced errors per shot. Across all matches, AS rose from 15.3% in 2017 to 16.1% this year, an increase of 5.7%. Using the 27 players with multiple matches from both years’ tournaments, the difference is more stark, rising by 8.7%.

It’s clear that we saw more aggressive tennis at the 2018 Open than the year before. If we take for granted that the courts played slower, the case is closed: Tactics, probably heat-induced, outweighed surface. But if we approached the problem without knowing what players, media, and tournament officials said, the same numbers would unequivocally point to an even simpler conclusion, that the courts played faster.

If tactics explain our discrepancy, one more place we might look is first serves. Maybe servers took more chances, increasing their ace rate at the expense of first-serve percentage. But the data doesn’t back us up: The overall first-serve percentage in Slamtracker matches fell by a mere 0.07%. Using year-to-year comparisons for our set of 27 players, the difference was larger, but still a measly 0.3%. If tactics are the answer, it must be on the return of serve, not the serve itself.

This is where the trail runs cold. Return tactics are tougher to quantify than serving strategy, and there’s a limit to how much we can do with the available data. We can tally return winners and induced forced errors (IFEs), points in which the returner ended things with a strong reply. If returners allowed more aces, it should be because they took a more aggressive approach, trading fewer opportunities for better odds of winning when they did make contact. Instead, the record shows that return winners and IFEs fell a whopping 7% from last year to this year. That number supports the theory of a slower surface, and it meets expectations for those players who adopted very conservative return positions, such as Rafael Nadal, whose return winner/IFE rate went down by 3%, and Thiem, whose rate decreased by 7%. But a slower surface and a lower return winner/IFE rate should add up to fewer aces, not more.

Compared to where we started, we have a lot more data but not many more answers. Some signs point to a faster surface, others to a slower; some indicate more aggressive tactics, others more conservative ones. Regardless of what we know about the physical makeup of the courts, there are many factors that influence what we refer to as “surface speed.” The hot, humid conditions in Flushing this year surely help complicate things–perhaps a study that took into account the heat index for each individual match would shed more light on these questions. We could also be seeing players adapt to the conditions–whether the heat or the slower surface–in different ways. Everyone may agree about how the courts played this year, but it’s much more difficult to pin down exactly what that means.

The Most Aggressive ATP Returners

In yesterday’s post, I outlined a new method to measure return aggression. Using Aggression Score (AS) as a starting point, I made some adjustments in order to treat return winners (and induced forced errors) and return errors separately. The resulting metric–Return Aggression Score (RAS)–gives equal weight to return winners and return errors. A positive RAS represents an aggressive return game, while a negative number indicates a more conservative one. The most aggressive single-match performances were nearly four standard deviations above the mean, while player averages varied between about one standard deviation above and below the mean.

We can now point the algorithm at the ATP, and calculate RAS for each player in the 1,500 or so 2010-present men’s matches logged by the Match Charting Project.

The difference between the frequency of return errors and return winners is even greater for men than it is for women. The WTA tour averages, as we saw yesterday, are 17.8% and 5.5%, respectively, and the men’s averages are 20.9% and 4.1%. Thus, treating the two categories separately is even more important when analyzing ATP matches.

The overall range in single-match RAS figures is about the same as it is for women. The most aggressive one-match returners are nearly four standard deviations above the mean (a RAS mark near 4.0), while the lowest are almost two standard deviations below (RAS marks near -2.0). What differs between genders is that the most aggressive men’s single-match performances are not clustered around one player, as Serena Williams dominates the women’s list. Of the top ten one-match men’s RAS marks, only one player appears twice, and that is partly an accident:

Year  Event         Returner      Opponent   RAS  
2015  Halle         Berdych       Karlovic  3.96  
2014  Halle         D Brown       Nadal     3.72  
2016  Stuttgart     Marchenko     Groth     3.49  
2014  Aus Open      Dolgopolov    Berankis  2.99  
2016  Dallas CH     Tiafoe        Groth     2.91  
2014  Bogota        J Wang        Karlovic  2.79  
2015  Fairfield CH  Tiafoe        D Brown   2.72  
2017  Montpellier   De Schepper   M Zverev  2.64  
2015  Madrid        Isner         Kyrgios   2.60  
2014  Halle         An Kuznetsov  D Brown   2.58

Two factors make it more likely a returner appears on this list: His opponent, and the surface. Facing a serve-and-volleyer means adopting a higher-risk return strategy, and playing on a faster surface has a similar effect. Four of the top ten matches here were played on grass, and seven of the ten returners faced opponents who often come in behind their serves. Frances Tiafoe is partly responsible for his double-appearance here, but I suspect it has more to do with his opponents.

Grass is, by far, the most extreme surface in its effect on return tactics. Here are the numbers for each court type, along with the RAS of the average match on that surface:

Surface  RetE%  RetW%    RAS  
Hard     21.4%   4.1%   0.04  
Grass    25.3%   5.6%   0.54  
Clay     18.5%   3.5%  -0.24  
Average  20.9%   4.1%   0.00

Even though the average clay court match isn’t as extreme as a grass court match in this regard, the ten least aggressive single-match return performances all took place on clay, five of them recorded by Rafael Nadal.

Player averages

The Match Charting Project has at least 10 matches (2010-present) for about 75 players. Here is the top quintile, the 15 most aggressive players of that group:

Player                 Matches  RetPts   RAS  
Dustin Brown                11     676  1.90  
Ivo Karlovic                16    1116  0.85  
John Isner                  30    2202  0.77  
Alexandr Dolgopolov         20    1417  0.76  
Philipp Kohlschreiber       18    1334  0.69  
Lukas Rosol                 11     841  0.67  
Vasek Pospisil              14     812  0.62  
Andrey Kuznetsov            11     585  0.54  
Benoit Paire                17    1198  0.54  
Jeremy Chardy               14     923  0.39  
Kevin Anderson              23    1681  0.39  
Kei Nishikori               47    3128  0.38  
Milos Raonic                42    3211  0.34  
Sam Querrey                 17    1219  0.31  
Fernando Verdasco           17    1109  0.30

There’s aggression, and then there’s Dustin Brown. No other player is one full standard deviation above average, and he is nearly two, more than twice as aggressive as the next-most tactically extreme ATPer.

We don’t see quite the same extremes in the other direction, just a bunch of clay-courters:

Player                  Matches  RetPts    RAS  
Jiri Vesely                  11     716  -0.76  
Marcel Granollers            12     746  -0.64  
Paolo Lorenzi                13     912  -0.58  
Inigo Cervantes Huegun       10     705  -0.58  
Tommy Robredo                10     622  -0.57  
Damir Dzumhur                11     688  -0.56  
Guido Pella                  11     749  -0.51  
Guillermo Garcia Lopez       10     734  -0.49  
Casper Ruud                  16    1000  -0.48  
Hyeon Chung                  10     621  -0.48  
Rafael Nadal                157   11773  -0.42  
Richard Gasquet              36    2180  -0.42  
Roberto Bautista Agut        25    1633  -0.42  
Diego Schwartzman            44    3289  -0.42  
Juan Martin Del Potro        42    2900  -0.40

These least-aggressive numbers are partly a reflection of playing styles, and partly the surface, as we’ve already seen.

Next, let’s look at how much players alter their style to the circumstances. Here are 16 players–top guys along with some others I found interesting–along with their average RAS numbers on the three major surfaces:

Player                   RAS   Hard   Clay  Grass  
John Isner              0.77   0.71   1.03   0.72  
Marin Cilic             0.28   0.09   0.02   1.38  
Jo Wilfried Tsonga      0.24   0.31  -0.22   0.38  
Gilles Muller           0.10   0.07  -0.74   1.13  
Roger Federer           0.08   0.04  -0.07   0.40  
Grigor Dimitrov         0.07   0.12  -0.30   0.28  
Novak Djokovic          0.02   0.03  -0.12   0.25  
Nick Kyrgios            0.02  -0.06   0.07   1.20  
Jack Sock              -0.08  -0.09   0.08         
Stanislas Wawrinka     -0.09  -0.11  -0.23   0.95  
Alexander Zverev       -0.13  -0.06  -0.33   0.18  
Andy Murray            -0.20  -0.25  -0.32   0.15  
Dominic Thiem          -0.24  -0.13  -0.40   0.25  
Juan Martin Del Potro  -0.40  -0.43  -0.58  -0.07  
Diego Schwartzman      -0.42  -0.34  -0.45         
Rafael Nadal           -0.42  -0.25  -0.76   0.57

The big servers have some surprises in store: John Isner is more aggressive on the return on clay than on other surfaces, and Jack Sock and Nick Kyrgios show the same, at least compared to hard courts. Marin Cilic is extremely aggressive on the grass court return, but his clay court tactics are similar to those on hard courts. In stark contrast is Gilles Muller, second only to Nadal as a conservative returner on clay, but quite aggressive on other surfaces.

One of the many underexplored topics in tennis analytics is the different ways players change (or choose not to change) their tactics on different surfaces. While comparing Return Aggression Score by surface is a tiny step in that direction, it does suggest just how much those strategies vary.

As always, a reminder that analyses like these are only possible with the volunteer-generated shot-by-shot logs of the Match Charting Project. I hope you’ll contribute.

Measuring Return Aggression

In the last couple of years, I’ve gotten a lot of mileage out of a metric called Aggression Score (AS), first outlined here by Lowell West. The stat is so useful due to its simplicity. The more aggressive a player is, the more she’ll rack up both winners and unforced errors. AS, then, is essentially the rate at which a player hits winners and unforced errors.

Yet one limitation lies in Aggression Score’s simplicity. It works best when winners and unforced errors move together, and when they are roughly similar. If someone is having a really bad day, her unforced errors might skyrocket, resulting in a higher AS, even if the root cause of the errors is poor play, not aggression. On the flip side, a locked-in player will see her AS increase by hitting more winners, even if those winners are more a reflection of good form than a high-risk tactic.

I’ve long wanted to extend the idea behind Aggression Score to return tactics, but when we narrow our view to the second shot of the rally, the simplicity of the metric becomes a handicap. On the return, the vast majority of “aggressive” shots are errors, so the results will be swamped by error rate, minimizing the role of return winners, which are a more reliable indicator. Using Match Charting Project data from 2010-present women’s tennis, returns result in errors 18% of the time, while they turn into winners (or they induce forced errors) less than one-third as often, 5.5% of the time. The appealingly simple Aggression Score formula, narrowed to consider only returns of serve, won’t do the job here.

Return aggression score

Let’s walk through a formula to measure return aggression, using last month’s Miami final between Sloane Stephens and Jelena Ostapenko as an example. Tallying up return points (excluding aces and service winners), along with return errors* and return winners** for both players from the match chart, we get the following:

Returner          RetPts  RetErr  RetWin  RetE%  RetW%  
Sloane Stephens       64       9       1  14.1%   1.6%  
Jelena Ostapenko      63      11       6  17.5%   9.5%

* “errors” are a combination of forced and unforced, because most return errors are scored as forced errors, and because the distinction between the two is so unreliable as to be meaningless. Some forced error returns are nearly impossible to make, so they don’t really belong in this analysis, but with the state of available data, it’ll have to do.

** throughout this post, I’ll use “winners” as short-hand for “winners plus induced forced errors” — that is, shots that were good enough to end the point.

These numbers make clear which of the two players is the aggressive one, and they confirm the obvious: Ostapenko plays much higher-risk tennis than Stephens does. In this case, Ostapenko’s rates are nearly equal to or above the tour averages of 17.8% and 5.5%, while both of Stephens’s are well below them.

The next step is to normalize the error and winner rates so that we can more easily see how they relate to each other. To do that, I simply divide each number by the tour average:

Returner          RetE%  RetW%  RetE+  RetW+  
Sloane Stephens   14.1%   1.6%   0.79   0.28  
Jelena Ostapenko  17.5%   9.5%   0.98   1.73

The last two columns show the normalized figures, which reflect how each rate compares to tour average, where 1.0 is average, greater than 1 means more aggressive, and less than 1 means less aggressive.
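The first two steps, raw rates and then rates relative to tour average, can be sketched in a few lines. The function and constant names are mine. The tour averages are the 17.8% and 5.5% quoted above, and the counts come from the Miami final table.

```python
TOUR_AVG_ERR = 0.178  # WTA return error rate quoted in the text
TOUR_AVG_WIN = 0.055  # WTA return winner rate quoted in the text


def normalized_rates(ret_pts, ret_err, ret_win):
    """Return (RetE+, RetW+): error and winner rates divided by tour average."""
    err_rate = ret_err / ret_pts
    win_rate = ret_win / ret_pts
    return err_rate / TOUR_AVG_ERR, win_rate / TOUR_AVG_WIN


miami_final = {
    "Sloane Stephens": (64, 9, 1),
    "Jelena Ostapenko": (63, 11, 6),
}
for player, counts in miami_final.items():
    ret_e_plus, ret_w_plus = normalized_rates(*counts)
    print(f"{player}: RetE+ {ret_e_plus:.2f}, RetW+ {ret_w_plus:.2f}")
```

Working from the raw counts avoids compounding rounding error. Dividing the rounded 1.6% by 5.5%, for instance, would give 0.29 instead of the 0.28 shown in the table.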

We’re not quite done yet, because, as Ostapenko and Stephens illustrate, return winner rates are much noisier than return error rates. That’s largely a function of how few there are. The gap between the two players’ normalized rates, 0.28 and 1.73, looks huge, but represents a difference of only five winners. If we leave return winner rates untouched, we’ll end up with a metric that varies largely due to movement in winner rates–the opposite problem from where we started.

To put winners and errors on a more equal footing, we can express both in terms of standard deviations. The standard deviation of the adjusted error ratio is 0.404, while the standard deviation of the adjusted winner ratio is 0.768, so when we measure each ratio’s distance from the tour average of 1.0 and divide by the corresponding standard deviation, we’re essentially reducing the variance in the winner number by half. The resulting numbers tell us how many standard deviations a certain statistic is above or below the mean, and these final results give us winner and error rates that are finally comparable to each other:

Returner          RetE+  RetW+  RetE-SD  RetW-SD  
Sloane Stephens    0.79   0.28    -0.52    -0.93  
Jelena Ostapenko   0.98   1.73    -0.05     0.95

(Math-oriented readers might notice that the last two steps don’t need to be separate; we could just as easily think of these last two numbers as standard deviations above or below the mean of the original winner and error rates. I included the intermediate step to–I hope–make the process a bit more intuitive.)

Our final stat, Return Aggression Score (RAS) is simply the average of those two rates measured in standard deviations:

Returner          RetE-SD  RetW-SD    RAS  
Sloane Stephens     -0.52    -0.93  -0.73  
Jelena Ostapenko    -0.05     0.95   0.45

Positive numbers represent more aggression than tour average; negative numbers less aggression. Ostapenko’s +0.45 figure is higher than about 75% of player-matches among the nearly 4,000 in the Match Charting Project dataset, though as we’ll see, it is far more conservative than her typical strategy. Stephens’s -0.73 mark is at the opposite position on the spectrum, higher than only one-quarter of player-matches. It is also lower than her own average, though it is higher than the -0.97 RAS she posted in the US Open final last fall.
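Putting the whole walkthrough together, here is a minimal sketch of the calculation from raw counts to RAS. It is an illustration rather than the code that produced the tables: the names are mine, and the constants are the WTA tour averages and standard deviations quoted in this post. It treats 1.0 as the mean of each normalized ratio, which matches the figures in the tables.

```python
TOUR_AVG_ERR, TOUR_AVG_WIN = 0.178, 0.055  # WTA tour-average rates
SD_ERR, SD_WIN = 0.404, 0.768              # SDs of the normalized ratios


def return_aggression_score(ret_pts, ret_err, ret_win):
    """Return (RetE-SD, RetW-SD, RAS) for one player-match.

    ret_pts excludes aces and service winners; ret_win includes
    induced forced errors, as in the post.
    """
    err_ratio = (ret_err / ret_pts) / TOUR_AVG_ERR
    win_ratio = (ret_win / ret_pts) / TOUR_AVG_WIN
    # Distance from the average ratio of 1.0, in standard deviations.
    err_sd = (err_ratio - 1.0) / SD_ERR
    win_sd = (win_ratio - 1.0) / SD_WIN
    return err_sd, win_sd, (err_sd + win_sd) / 2.0


for player, counts in {
    "Sloane Stephens": (64, 9, 1),
    "Jelena Ostapenko": (63, 11, 6),
}.items():
    err_sd, win_sd, ras = return_aggression_score(*counts)
    print(f"{player}: {err_sd:.2f} {win_sd:.2f} RAS {ras:.2f}")
```

Run on the Miami final counts, this reproduces the -0.73 and 0.45 in the table above.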

The extremes

The first test of any new metric is whether the results actually make sense, and we need look no further than the top ten most aggressive player-matches for confirmation. Five of the top ten most aggressive single-match return performances belong to Serena Williams, and the overall most aggressive match is Serena’s 2013 Roland Garros semifinal against Sara Errani, which rates at 3.63–well over three standard deviations above the mean. The other players represented in the top ten are Ostapenko, Oceane Dodin, Petra Kvitova, Madison Keys, and Julia Goerges–a who’s who of high-risk returning in women’s tennis.

The opposite end of the spectrum includes another group of predictable names, such as Simona Halep, Agnieszka Radwanska, Caroline Wozniacki, Annika Beck, and Errani. Two of Halep’s early matches are lowest and third-lowest, including the 2012 Brussels final against Radwanska, in which her return aggression was 1.6 standard deviations below the mean. It’s not as extreme a mark as Serena’s performances, but that’s the nature of the metric: Halep returned 46 of 48 non-ace serves, and none of the 46 returns went for winners. It’s tough to be less aggressive than that.
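The asymmetry between the two ends of the scale follows from the formula: error and winner rates cannot drop below zero, so there is a floor on how negative RAS can get, while there is no comparably tight ceiling. The sketch below is my own derivation from the constants given earlier, not something stated in the original analysis. It assumes the two serves Halep failed to return count as return errors.

```python
TOUR_AVG_ERR, TOUR_AVG_WIN = 0.178, 0.055  # WTA tour-average rates
SD_ERR, SD_WIN = 0.404, 0.768              # SDs of the normalized ratios


def ras(ret_pts, ret_err, ret_win):
    """Return Aggression Score from raw counts, using the WTA constants."""
    err_sd = ((ret_err / ret_pts) / TOUR_AVG_ERR - 1.0) / SD_ERR
    win_sd = ((ret_win / ret_pts) / TOUR_AVG_WIN - 1.0) / SD_WIN
    return (err_sd + win_sd) / 2.0


# Halep in the 2012 Brussels final: 48 return points, 46 returns in play
# (so 2 assumed errors), and no return winners.
halep = ras(48, 2, 0)

# Theoretical floor: every return in play and none of them a winner.
floor = ras(48, 0, 0)

print(round(halep, 1), round(floor, 2))
```

Under these assumptions Halep lands at about -1.6, not far from the lowest score the formula allows with these constants.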

The leaderboard

The Match Charting Project has shot-by-shot data on at least ten matches each for over 100 WTA players. Of those, here are the top ten, as ranked by RAS:

Player                    Matches  RetPts   RAS  
Oceane Dodin                   11     665  1.18  
Aryna Sabalenka                11     816  1.12  
Camila Giorgi                  19    1155  1.07  
Mirjana Lucic                  11     707  1.05  
Julia Goerges                  27    1715  0.94  
Petra Kvitova                  65    4142  0.90  
Serena Williams                91    5593  0.90  
Jelena Ostapenko               35    2522  0.88  
Anastasia Pavlyuchenkova       21    1180  0.78  
Lucie Safarova                 34    2294  0.77

We’ve already seen some of these names, in our discussion of the highest single-match marks. When we average across contests, a few more players turn up with RAS marks over one full standard deviation above the mean: Aryna Sabalenka, Camila Giorgi, and Mirjana Lucic-Baroni.
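
For anyone who wants to build a similar list, the aggregation can be sketched in a few lines of Python. This is only an outline under stated assumptions: it takes a simple unweighted mean of each player's single-match RAS figures (weighting by return points would be another reasonable choice), and the players and numbers in the toy data are hypothetical placeholders, not charted results.

```python
from collections import defaultdict
from statistics import mean


def leaderboard(match_scores, min_matches=10, top_n=10):
    """Rank players by average single-match RAS.

    match_scores: iterable of (player, ras) pairs, one per charted match.
    Players with fewer than min_matches matches are left out.
    """
    by_player = defaultdict(list)
    for player, ras in match_scores:
        by_player[player].append(ras)

    rows = [
        (player, len(scores), mean(scores))
        for player, scores in by_player.items()
        if len(scores) >= min_matches
    ]
    # Most aggressive first; pass reverse=False for the conservative list.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top_n]


# Hypothetical toy data: Player C is dropped for having too few matches.
toy = (
    [("Player A", 1.0)] * 10
    + [("Player B", -0.5)] * 12
    + [("Player C", 2.0)] * 3
)

for player, n_matches, avg_ras in leaderboard(toy):
    print(f"{player:<10} {n_matches:>3} {avg_ras:+.2f}")
```

The ten-match minimum mirrors the cutoff used for the table above. Without it, a player with one or two unusually aggressive matches could top the list.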

Again, the more conservative players don’t look as extreme: Only Madison Brengle has a RAS more than one standard deviation below the mean. I’ve included the top 20 on this list because so many notable names (Wozniacki, Radwanska, Kerber) are between 11 and 20:

Player                Matches  RetPts     RAS  
Madison Brengle            11     702   -1.06  
Monica Niculescu           32    2099   -0.93  
Stefanie Voegele           12     855   -0.85  
Annika Beck                16    1181   -0.78  
Lara Arruabarrena          10     627   -0.72  
Johanna Larsson            14     873   -0.65  
Barbora Strycova           20    1275   -0.63  
Sara Errani                25    1546   -0.60  
Carla Suarez Navarro       36    2585   -0.55  
Svetlana Kuznetsova        27    2271   -0.55 

Player                Matches  RetPts     RAS  
Viktorija Golubic          16    1272   -0.53  
Agnieszka Radwanska        96    6239   -0.51  
Yulia Putintseva           22    1552   -0.51  
Caroline Wozniacki         80    5165   -0.50  
Christina McHale           11     763   -0.48  
Angelique Kerber           93    6611   -0.46  
Louisa Chirico             13     806   -0.44  
Darya Kasatkina            26    1586   -0.43  
Magdalena Rybarikova       12     725   -0.41  
Anastasija Sevastova       30    1952   -0.40

A few more notable names: Halep, Stephens, and Elina Svitolina all count among the next ten lowest, with RAS figures between -0.30 and -0.36. The most “average” player among the game’s best is Victoria Azarenka, who rates at -0.08. Venus Williams, Johanna Konta, and Garbine Muguruza make up a notable group of aggressive-but-not-really-aggressive women between +0.15 and +0.20, just outside of the game’s top third, while Maria Sharapova, at +0.63, misses our first list by only a few places.

Unsurprisingly, these results track quite closely to overall Aggression Score figures, as players who adopt a high-risk strategy overall are probably doing the same when facing the serve. This metric, however, allows us to identify players–or even single matches–for which the two strategies don’t move in concert. Further, the approach I’ve taken here, separating and normalizing winners and errors rather than treating them as an undifferentiated mass, could be applied to Aggression Score itself, or to other more targeted versions of the metric, such as a third-shot AS or a backhand-specific AS.

As always, the more data we have, the more we can learn from it. Analyses like these are only possible with the work of the volunteers who have contributed to the Match Charting Project. Please help us continue to expand our coverage and give analysts the opportunity to look at shot-by-shot data, instead of just the basics published by tennis’s official federations.