January 2021 – Heavy Topspin

Flipping Coins in the Rain

The singles final in last week’s ITF M15 Antalya event was washed out by rain. It went in the books as a walkover victory for Giovanni Fonio over Juan Manuel Cerundolo. Doubles specialist Harri Heliovaara explains (from the Finnish, via Google Translate, via Peter Wetz):

Namely, the ITF competitions were also played here last week, and the men’s final scheduled for Sunday of that competition could not be played at all due to the rain. Normally in that situation, each player only gets the ATP points and prize money of the losing finalist, but now the players threw themselves creatively and decided to decide the final winner with a coin toss. They agreed that the winner of the coin toss would receive a surrender win and thus the winner’s ATP points, while the loser would receive the winner’s prize money, so each received more than what would have resulted from not playing the final anyway.

This is indeed a clever solution, and one that was sometimes employed in the amateur era. When grass courts made up a bigger part of the tour and court maintenance in general was more primitive, it was more common for the tail end of events to be left unplayed. Usually the finalists (or semifinalists, in extreme cases) divided the prizes, but occasionally they resorted to a coin toss, especially in mixed doubles, which has always had a bit of an “exhibition” vibe.

While Fonio and Cerundolo benefited (in different ways) from the coin flip, parts of this scenario don’t quite smell right to me. First I’ll explain why, then I’ll offer a solution.

The winning player gave his prize money to the loser. Change the context a tiny bit, and that’s match fixing.
The ITF doesn’t have a provision in their rulebook for coin-flipping (as far as I know), so this solution only worked because the losing player agreed to claim an injury. Again, this is an unusual situation, but attesting to a fake injury is frowned upon, to say the least.
Fonio gets some extra ranking points. Typically when tournaments are washed out, those points aren’t awarded, so it isn’t as if there is a precedent that Fonio or Cerundolo “deserved” those points. Instead, Fonio gains an unearned edge (albeit a small one) over several similarly-ranked players, who are presumably competing for entry and seeding in the same events.
Players in other unfinished events–such as the Nur-Sultan and Potchefstroom Challengers that were halted due to Covid-19 last March–didn’t get a chance to divide the unawarded points, by coin-flipping or any other method.

We can collapse these four points into two issues: First, there’s no ITF rule, so swapping points for prize money requires treading very close to some ethical and rule-breaking lines. Second, allowing players to improvise (sometimes? depending on the attitude of the on-site supervisor? I don’t know) inevitably gives an unearned advantage to some players over others.

Edit the rulebook

Fortunately, this is an easy fix. By providing a simple guideline for situations like this, the ITF can avoid the iffy behavior of prize-swapping and lying about injuries, ensure fairness for players across the whole tour, and do a better job of delivering the rewards that players expect when they show up for a tournament.

How about this:

Matches that cannot be played due to weather or force majeure will be decided by a coin flip, with ranking points awarded to the winner.

I’m not a lawyer, so my one-liner is probably missing a few paragraphs, but the main idea is pretty straightforward.

Prize money is trickier. Should anyone–“winner” or “loser”–receive prize money for unplayed matches? I don’t know. Tournaments–even the occasional ITF–earn revenue from ticket sales and broadcast rights, so they would surely prefer to hold back prize money from unplayed matches. On the other hand, players spend money and travel to events with the expectation that certain rewards are on offer.

Reap the benefits

The biggest gain in establishing this rule is consistency. As fans, we expect that sporting bodies treat players equally, and at the moment, handling unplayable matches is a real (if rare) source of inconsistency.

The other benefit is in guaranteeing at least some of what competitors have been promised. In several past articles, I’ve used metrics such as “expected points” to quantify how a player can expect to perform at a tournament. If he has a 50% chance of reaching the second round, he has a 50% chance of earning those points; if he has a 15% chance of winning the tournament, he has a 15% chance of earning those additional points. Most players don’t explicitly choose tournaments by predicting exact draws and calculating expected points, but many–including Heliovaara, incidentally–very much think in these terms.
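To make the expected-points idea concrete, here is a minimal Python sketch. The finishing stages, probabilities, and point values are invented for illustration and are not taken from any real draw or ranking table; the only details borrowed from the text are the 50% chance of reaching the second round and the 15% chance of winning the title.

```python
def expected_points(outcomes):
    """Return the probability-weighted average of ranking points.

    outcomes: list of (probability of the run ending at this stage,
                       points awarded for that finish).
    """
    total_prob = sum(p for p, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("finish probabilities must sum to 1")
    return sum(p * pts for p, pts in outcomes)


# Hypothetical player: all numbers below are assumptions for the example.
example = [
    (0.50, 0),   # loses in the first round (so 50% to reach round two)
    (0.25, 1),   # loses in the second round
    (0.10, 3),   # loses somewhere between the second round and the final
    (0.15, 10),  # wins the title
]

print(expected_points(example))
```

If the tournament is abandoned before the later rounds are played, the last rows of a table like this one simply disappear, which is the loss the coin-flip rule would partly restore.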

If a tournament is forced to end early, those calculations–explicit or not–are worthless. As a qualifier and the fifth seed, respectively, Fonio and Cerundolo probably would’ve been happy with finalist points in Antalya. But what about the players who ended up making trips to Kazakhstan and South Africa last March for nothing better than quarter-finalist points?

As I’ve said above, we can debate whether tournaments should be expected to pay out prize money (and whether prize money should go to the coin-flip losers, as it did in Antalya), but there’s no reason for a similar dispute about ranking points. Sure, some players would get lucky in that a coin proclaims them the winner, but it’s not much different from finding oneself the beneficiary of a withdrawal, or even in a weak section of the draw.

Fonio and Cerundolo ended up with the right solution, and I’m glad the tournament supervisor didn’t stand in the way. I just wish the coin flip were standard practice, to avoid the ethical tightrope walk. I’m sure that players would appreciate the increased clarity, as well.

Recreating the 1957 Women’s Tennis Season in 2,600 Easy Steps

Another historical season in the database! In 1957, Althea Gibson was so good it was almost boring. She was in the middle of a 161-week streak at the top of the Elo rankings, and with a 66-2 won-loss record this year, she finished the campaign more than 200 Elo points ahead of the number two player, Dorothy Head Knode.

Of course, no one knew about Elo in 1957, and there weren’t even week-by-week rankings. It didn’t take an advanced algorithm to know that Gibson belonged at the top of the heap. However, the newspapermen who published the most respected year-end ranking lists had at least as many blind spots as the WTA computer does these days. While Althea was comfortably on top, Knode was never considered to be better than 5th.

About 240 events worth of results–that’s about 2,600 matches–from this season are now on Tennis Abstract, and you can jump in via the 1957 season page. There’s a week-by-week calendar, year-end rankings, stats breakdowns for the top players, the most common head-to-heads, and country-by-country comparisons. All of this is now available for 11 pre-Open Era seasons.

The raw data has been added to my GitHub repo, and I offer another hearty round of thanks to the contributors at tennisforum’s Blast From the Past forum, who did the heavy lifting of typing out so many of these results from contemporary newspapers and annuals.

Podcast Episode 92: Natural Experiments and Second-Order Pandemic Effects

Episode 92 of the Tennis Abstract Podcast, with Carl Bialik, of the Thirty Love podcast, addresses the opportunity generated by the Covid-19 pandemic to study natural experiments in sports.

Many of the things we used to take for granted–stadiums full of fans, weekly travel schedules, consistent training opportunities–have been disrupted for some or all players, in tennis and other major sports. We consider what we can learn about home-court advantage, the predictability of results, the role of unchanging venues, and even the speed of play, by comparing pre-pandemic numbers with their corresponding figures since sports got back underway. We also wonder about the limitations of these sorts of studies, because there are always confounding variables. The biggest confounder of all: the pandemic itself.

I’ve been writing about these issues occasionally. Click for my posts on the predictability of match results, the effect of an empty stadium on serves, and the pace of play with no fans, no towelkids, and no linespeople.

Thanks for listening!

In housekeeping notes:

The TAP book club will reconvene in four weeks or so with our next selection, John Updike’s 1968 novel, Couples. Read along with us, tell us what you think, and suggest topics/questions/comments for our discussion in a future episode.
Fans of the TA podcast will also want to check out Dangerous Exponents, Carl’s and my Covid-19 podcast. Later today, we’re releasing a new episode about masks–the science behind wearing them, the ways researchers study their benefits, how they stack up against other public health interventions, and much more.

(Note: this week’s episode is about 51 minutes long; in some browsers the audio player may display a different length. Sorry about that! Also, I refer to this episode as episode 91, because for a numbers guy, I’m pretty bad at counting.)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Serving In an Empty Stadium

The pandemic offers a wealth of natural experiments. Ever wondered how the presence of fans affects players? Before last March, we were mostly limited to speculation, because fans were almost always there. The tours have made various compromises to keep the action going, so we have a wealth of data from all sorts of different scenarios–with or without linespeople, with or without towelkids, and of course, with or without fans.

The closest thing we have to a “pure” natural experiment concerning the effect of fans on tennis players is the 2020 US Open. Flushing Meadows is usually packed with spectators on most courts, while in 2020, it was empty save a handful of support staff. There are confounding variables aplenty, such as the aforementioned lack of linespeople (on most courts) and towelkids, and we also must keep in mind that players entered the 2020 US Open with less recent match play than usual. It isn’t a perfect natural experiment–such things are exceedingly rare–but it is better than tennis usually offers.

What should we expect from spectator-free tennis? One suggestion comes from Ben Cohen and Joshua Robinson, who found in August that both basketball and soccer players were shooting more accurately in empty stadiums:

NBA players are making a higher percentage of their free throws and hitting corner 3-pointers at rates the league has never seen. Soccer players are striking dead balls more precisely than they did before the pandemic. Without the distraction of screaming fans, one part of their games seems to have improved: shooting.

We can already speculate that tennis won’t be so clear cut. For one thing, there weren’t screaming fans before the pandemic. For another, everything in tennis is a tradeoff: If you’re serving more accurately, you might be tempted to try for a bit more power or aim closer to the corners. The “accuracy” effect, then, wouldn’t show up as accuracy, but as increased speed, or some mix of several measures. But let’s not rush to throw in the towel (as it were)–let’s look at the numbers.

US Open, now and then

We’ll check four different stats for an empty-stadium effect: first serve in, double faults per second serve (the inverse of second serves made), first serve points won, and average first serve speed.

For each stat, we’ll calculate averages for men and women from 2019 and 2020 US Open singles main draw matches, adjusted for player. (I’m using the data available in my slam_pointbypoint GitHub repository.) That is, we’ll limit our focus to those players who appeared in both tournaments and weight each player’s numbers by the smaller of his or her two yearly point totals. A player who served 300 points in 2019 and 100 points in 2020 will have a weight of 100 points in both calculations; a player who served 250 points in both years will have a weight of 250 points in both. This corrects for the different mix of players (and the amount that each player competed) in the two adjacent years, which might otherwise affect the numbers in a misleading way.
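The weighting scheme can be sketched roughly as follows. This is an illustration of the idea as described, not the actual code behind the numbers; the data layout, the function name `adjusted_rates`, and the success counts in the example are all assumptions.

```python
def adjusted_rates(players, year_a, year_b):
    """Weighted average rate per year, restricted to players present in both.

    players: dict of name -> {year: (successes, attempts)}.
    Each player's weight is the smaller of the two attempt counts, and the
    same weight is applied to that player's rate in both years.
    """
    weighted = {year_a: 0.0, year_b: 0.0}
    total_weight = 0
    for stats in players.values():
        if year_a not in stats or year_b not in stats:
            continue  # skip anyone who missed one of the two tournaments
        weight = min(stats[year_a][1], stats[year_b][1])
        if weight == 0:
            continue  # no usable sample in at least one year
        for year in (year_a, year_b):
            made, attempts = stats[year]
            weighted[year] += weight * made / attempts
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no player has data in both years")
    return {year: total / total_weight for year, total in weighted.items()}


# Hypothetical first-serve-in counts; only the 300/100 and 250/250 point
# totals echo the example in the text.
sample = {
    "Player A": {2019: (180, 300), 2020: (65, 100)},   # weight 100
    "Player B": {2019: (150, 250), 2020: (160, 250)},  # weight 250
    "Player C": {2019: (90, 150)},                     # dropped: 2019 only
}

print(adjusted_rates(sample, 2019, 2020))
```

Because each player carries the same weight in both years, a change in the weighted average reflects how those players served, not a change in who happened to play the most points.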

Here are the results:

WOMEN        2020   2019  Change  
First in    61.8%  61.5%    0.5%  
DF/second   13.7%  13.4%    2.0%  
First won   66.4%  62.2%    6.6%  
First KM/H  158.6  155.2    2.1%  
                                  
MEN          2020   2019  Change  
First in    61.7%  59.5%    3.7%  
DF/second   10.7%  11.3%   -5.4%  
First won   72.9%  71.2%    2.3%  
First KM/H  186.2  184.8    0.8%
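
The “Change” column is a relative change, not a difference in percentage points. A small Python sketch of that arithmetic, using the rounded figures printed in the table (the function name is just for illustration):

```python
def relative_change(new, old):
    """Percentage change from old to new, relative to old."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


# Men's first serves in: 61.7% in 2020 versus 59.5% in 2019.
print(round(relative_change(61.7, 59.5), 1))
```

Recomputing the other rows from the rounded table values can land a tenth or two away from the published column, presumably because the published changes were calculated from unrounded inputs.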

The women didn’t really improve their accuracy: a slight uptick in first serves in, and a bigger decrease in second serves made. On the other hand, they won way more of their first-serve points in 2020 than in 2019, and they served 3.4 km/h faster. That puts the accuracy figure in perspective–no, they didn’t make more first serves, but it appears that they traded potential accuracy gains for speed. They did quite well in the bargain.

The men, on average, took a different approach. They made more first serves and committed fewer double faults (Alexander Zverev notwithstanding), but they didn’t increase their first serve speed as much. Men also won more first serve points, though their gain was not the enormous boost seen by the women.

Improvements in context

Based on these year-to-year comparisons, it looks like both men and women served better without spectators. The women’s giant boost in first serve points won suggests that there are other factors beyond those we can easily measure–perhaps players were missing first serves at a typical rate not only because they were hitting harder, but also because they were aiming for the corners. It’s also possible that post-restart rustiness affected returns more than serves–in lockdown, it’s easier to drill your own serves than to keep in practice against elite-level first serves.

Another consideration is the usual year-to-year fluctuations. For women, the small changes in first serves in and second serves missed are less than half the magnitude of the year-to-year changes at the US Open between 2015 and 2019. These numbers will always drift up and down for a variety of reasons, sheer randomness not least among them.

The 6.6% jump in the women’s rate of first serve points won, on the other hand, is quite unusual. The average fluctuation in the previous four pairs of years is 1.7%. The serve speed increase is also unusually large. It’s a 2.1% jump, compared to a typical movement of about 0.7%.

The 2019-to-2020 changes in men’s rates are less noteworthy in context, even if they do tell a suggestive story. The rate of first serves made is surprisingly noisy, fluctuating an average of 2.7% in each pair of years between 2015 and 2019, so the fans-to-no-fans shift of 3.7% doesn’t prove much of anything. The double fault and serve speed changes are no greater than previous fluctuations.

The only slightly convincing “pandemic effect” on the men’s side is the percentage of first serve points won. As we’ve seen, men won 2.3% more such points in 2020 than in 2019, adjusted for the mix of players–an increase half-again as large as the typical fluctuation of 1.5%. That’s hardly a slam-dunk case for better post-restart serving. It could be pure luck, or it could be attributed to a mix of the many confounding variables I’ve already mentioned.

Serving isn’t shooting

This stuff is complicated. Penalty kickers in soccer have an objective that is clear to all–to score a goal. While tennis servers have a similarly simple aim–to win the point–the only part of the point they can completely control is the serve, and given the tradeoffs between speed, precision, and keeping the ball in the box, there’s no single variable that tells us whether a player is serving better.

Things get even hairier when we look for data beyond the US Open. We could do the same exercise for the last few years of French Opens, but remember that the post-restart Roland Garros had a sprinkling of paying fans. Is that halfway between an empty stadium and normalcy? Is it worse than a full stadium, because individual voices are easier to discern? Is it a mix, because the French fans filled up the stands for their native players and left other courts empty? I have no idea.

It is clear that women served harder than usual at the 2020 US Open, and they won way more first serve points than in recent years. Men served more accurately, even if their success rate didn’t translate into the same whopping success as the women’s adjustments did. What we can’t say for sure is how much of those shifts can be attributed to the empty stands in Flushing last year. Even the purest natural experiments don’t always return bulletproof findings.

The Post-Covid Tennis World is Unpredictable. The Match Results Are Not.

Both the ATP and WTA patched together seasons in the second half of 2020, providing playing opportunities to competitors who had endured vastly different lockdowns–some who couldn’t practice for a while, some who came down with Covid-19, and others who got knee surgery.

When the tours came back, we didn’t know quite what to expect. I’m sure some of the players didn’t know, either. Yet when we take the 2020 season (plus a couple weeks of 2021) as a whole, what happened on court was pretty much what happened before. The Australian Open, with its dozens of players in hard quarantine for two weeks, may change that. But for about five months, players faced all kinds of other unfamiliar challenges, and they responded by posting results that wouldn’t have looked out of place in January 2020.

The Brier end

My usual metric for “predictability” is Brier Score, which measures both accuracy (did our pre-match favorite win?) and confidence (if we think four players are all 75% favorites, did three of them win?). Pre-match odds are determined by my Elo ratings, which are far from the final word, but are more than sufficient for these purposes. My tour-wide Brier Scores are usually in the neighborhood of 0.21, several steps better than the 0.25 Brier that results from pure coin-flipping. A lower score indicates more accurate forecasts and/or better calibrated confidence levels.
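For readers who haven’t met the metric before, here is a minimal Python sketch of the standard Brier Score calculation: the average squared gap between the forecast probability and the result. The match lists are toy data, not real forecasts, and the sketch says nothing about how the Elo-based probabilities themselves are produced.

```python
def brier_score(forecasts):
    """Mean squared difference between forecast and outcome.

    forecasts: list of (probability given to a player, 1 if that player
               won, 0 if that player lost).
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum((p - won) ** 2 for p, won in forecasts) / len(forecasts)


# Pure coin-flipping: every match is called 50/50, so the score is 0.25
# no matter who wins.
coin_flips = [(0.5, 1), (0.5, 0), (0.5, 1), (0.5, 0)]
print(brier_score(coin_flips))      # 0.25

# Four 75% favorites, three of whom win: well-calibrated confidence.
four_favorites = [(0.75, 1), (0.75, 1), (0.75, 1), (0.75, 0)]
print(brier_score(four_favorites))  # 0.1875
```

The second example shows why calibration matters: the forecaster is “wrong” about one match in four, yet still scores well below 0.25 because the stated confidence matched how often the favorites actually won.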

Here are the tour-wide Brier Scores for the ATP and WTA since the late-summer restart:

ATP: 0.213 (2017 – early 2020: 0.212)
WTA: 0.192 (2017 – early 2020: 0.212)

The ATP’s level of predictability is so steady that it’s almost suspicious, while the WTA has somehow been more predictable since the restart.

But we aren’t quite comparing apples to apples. The post-restart WTA was sparser than the pre-Covid women’s tour, and the post-restart ATP was closer to its pre-pandemic normal.

Let’s look at a few things that do line up. Most of the top players showed up for the main events of the restarted tour, such as the US Open, Roland Garros, Rome, “Cincinnati” (played in New York), and the men’s Masters event in Paris. Here are the 2019 and 2020 Brier Scores for each of those events:

Event          Men '19  Men '20  Women '19  Women '20  
Cincinnati       0.244    0.210      0.244      0.252  
US Open          0.210    0.167      0.178      0.186  
Roland Garros    0.163    0.199      0.191      0.226  
Rome             0.209    0.274      0.205      0.232  
Paris            0.226    0.199          -          -  
---
Total            0.204    0.202      0.198      0.218

(If you want even more numbers, I did similar calculations in August after Palermo, Lexington, and Prague.)

Three takeaways from this exercise:

Brier Scores are noisy. Any single tournament number can be heavily affected by a few major upsets.
Man, those ATP dudes were steady.
The WTA situation is more complicated than I thought.

Whether we look at the entire post-restart tour or solely the big events, the story on the ATP side is clear. Long layoffs, tournament bubbles, missing towelkids, Hawkeye Live … none of it had much effect on the status quo.

The predictability of the women’s tour is another thing entirely. The 12 top-level events between Palermo in July and Abu Dhabi in January were easier to forecast than a random sampling of a dozen tournaments from, say, 2018. But the four biggest events deviated from the script considerably more than they had in 2019 (or 2017 or 2018, for that matter).

From this, I offer a few tentative conclusions:

Big events, with their disproportionate number of star-versus-star matches, are a bit more predictable than other tournaments.
Accordingly, the post-restart WTA wasn’t as predictable as it first appeared. It was just lopsided in favor of tournaments that drew (most of) the top stars. Had the women’s tour featured a wider variety of events–which probably would’ve included a larger group of players, including some fringier ones–its post-restart Brier Score would’ve been higher. Perhaps even higher than the corresponding pre-Covid number.
Most tentative of all: The predictability of ATP and WTA match results might have itself been affected by the availability of tournaments. Top men were able to get into something like their usual groove, despite the weirdness of virus testing and empty stadiums. Most women never got a chance to play more than two or three weeks in a row.

Even six months after Palermo, the data is still limited. And by the time we have enough match results to do proper comparisons, some things will have gotten back to normal (hopefully!), complicating the analysis even further. That said, these findings are much clearer than my initial forays into post-restart Brier Scores in August. As for the Australian Open, quarantine and all, I’m forecasting a predictable tournament. At least for the men.

The 1958 Women’s Tennis Season, When Maria Bueno Held It All Together

I’ve added another historical season to the Tennis Abstract database, so we can now see thousands of results per year for a full decade before the beginning of the Open Era. 1958 might be the most interesting year of the bunch.

You can jump right in to the 1958 calendar, year-end Elo rankings, player stats, and more by clicking here for the season page.

1958 was the final full season as an amateur for Althea Gibson, and it was an awfully good one. She won her last 33 matches, including the Wimbledon and US Open titles. She turned 31 in August, and her performance in her age-30 campaign will forever leave us wondering what kind of career numbers she could have posted had she continued to play amateur tennis. Her lifetime totals are also clipped by the institutional racism that prevented her from competing on the world stage until well into her 20s.

Two of Gibson’s three losses in 1958 came at the hands of Janet Hopps Adkisson, herself an excellent player, one who just missed a top-ten year-end Elo finish in both 1957 and 1958. Hopps spent the years 1954-56 at Seattle University, where she played on the men’s tennis team (there was no alternative for women) and won 70% of her matches. When the ITA Women’s Collegiate Tennis Hall of Fame honored her in 1999, she quipped, “I never played in [an official] women’s match. I should be in the men’s hall of fame.”

Compared to later years, 1958 looks noticeably fractured. Gibson played almost four-fifths of her matches on grass, while British up-and-comer Shirley Bloomer Brasher played 47 of her 66 contests on clay, and American vet Beverly Baker Fleitz fought 23 of her 35 bouts on hard courts.

The only top player to tie it all together was Brazilian teen Maria Bueno, who played at least 102 matches in a year when no other notable player reached 70. Bueno started the year in Florida, played the Caribbean circuit (beating Hopps in five of seven meetings, all by early April), then shifted operations to Europe where she won Rome and reached the semis at Roland Garros. She followed the tour across the Channel, losing a grass-court final in Manchester to Gibson, beating Angela Mortimer in the title match the following week in Bristol, and falling in the Wimbledon final eight. Then back to Europe, after which she competed at Forest Hills and other US events before finishing the year at home in Brazil.

Bueno’s eleven-month marathon left her in 7th place in the year-end Elo rankings, but not for long: She would reach the top spot by the end of the following year. Like Mortimer, who held the number one position in early 1956 and would win it back in mid-1959, Bueno would have to wait until Gibson left the scene.

Again, I invite you to dig in to the 200+ events and 2,500+ matches from 1958 on Tennis Abstract. The season page provides an easy introduction.

I’ve added the raw data from 1958, along with all other historical seasons I’ve added, to my GitHub repo. My work rests heavily on the shoulders of the contributors to tennisforum.com’s Blast From the Past section, who have painstakingly recovered all of these results from newspapers and annuals, organizing and double-checking the often-messy records along the way. As always, a big round of thanks to them.

Charting Aryna Sabalenka’s Win Streak

Aryna Sabalenka has won 3 titles and 14 matches in a row. Let’s dig into the data and see if we can identify any improvements that would account for her success.

For the Match Charting Project, I’ve logged every shot of each of the Belarusian’s tour-level matches. (There are a few exceptions where I haven’t found video.) We’ll look at hard-court matches only today. With that constraint, we have 140 Sabalenka matches, dating back to early 2017 (including the current streak), and another 1,121 women’s tour-level contests over the same time period for reference.

Big serving?

Aryna always brings a powerful serve, but it remains a work in progress, at least tactically. The key metric for pure serve dominance is unreturned serves–quite simply, serves that don’t come back. While some are aces, they don’t have to be, and the distinction doesn’t really matter.

This first graph has a lot going on, but as I’ll use the same basic template for several more figures, it’s worth taking a moment to understand what we’re looking at. The two dotted lines show tour average rates of unreturned serves (the lower average is for all players; the higher one is for match winners), the thin jagged line shows Sabalenka’s rate of unreturned serves for each individual match, and the thicker red line shows her five-match rolling average.

Her five-match rolling average has been above 30% for the entire win streak. It’s not an unprecedented level for her, though–she sustained similarly high levels at various points over the last three years. (We should also be a bit cautious ascribing serve effectiveness to a player when the Ostrava, Linz, and Abu Dhabi courts might have been faster than average.) Consistently powerful serving has certainly helped Sabalenka’s cause, but it probably isn’t the whole story.
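
A five-match rolling average of this kind can be sketched in a few lines of Python. This version assumes a simple unweighted mean of the per-match rates over the most recent five matches; the actual figures may be computed differently (for example, weighted by the number of serve points in each match), and the sample rates are invented.

```python
def rolling_average(values, window=5):
    """Trailing mean of the last `window` values.

    Returns one number per match, starting with the first match that has
    a full window of matches behind it.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical per-match unreturned-serve rates, oldest match first.
rates = [0.30, 0.20, 0.40, 0.30, 0.30, 0.50]
print(rolling_average(rates))
```

Smoothing over five matches damps the match-to-match spikes, which is why the thicker line is easier to read than the jagged one.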

We might gain from breaking down Aryna’s serve effectiveness into first and second serves. First, let’s look at something else:

Serve plus one

There are two ways we could look at “serve plus one” effectiveness, and we’ll do both. First, let’s count Sabalenka’s opportunities to hit a second shot behind her serve, and see what percentage she puts away. (As with aces and other unreturned serves, the “winner” concept is a distraction: I’m counting second-shot winners together with shots that force errors. If you end the point, it doesn’t matter much whether your opponent touches the ball.)

The second figure shows us that, on hard courts, when women are faced with a second shot behind their serve, they finish the point about 20% of the time. Sabalenka’s career average is 28%. She far exceeded that over a string of four matches to finish Ostrava and start Linz, maxing out at 42% against Jennifer Brady in the Ostrava semi-final. Since then, her rate returned to roughly her (impressive) career average.

This measure is something of a “key to the match” for Sabalenka. When she converts at least 30% of second-shot opportunities behind her serve, she wins 91% of her matches. When she doesn’t, she wins 62%. Of course, 62% is nothing to be ashamed of, and the dip visible in early 2020 coincides with her Doha title, the one time in her career that the five-match rolling average fell below 20%.

Serve plus serve plus one

These first two measures are related, of course. A big server should post good numbers in both. But a great “pure” serving day might mean a worse-looking serve-plus-one day, because fewer weak returns are coming back at all. The reverse holds as well: A strong server might not hit as many unreturned serves as usual because her opponent is managing to just barely put them back in play–easy sitters for second shots.

To identify the combined benefits of good serving and efficient serve-plus-one’ing, we simply count how often Sabalenka wins service points in two shots or fewer.
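
As a rough sketch of that count in Python: the point representation here is hypothetical (it is not the Match Charting Project’s file format), and it assumes the denominator is every service point played, lost points included.

```python
def quick_service_points_won(points, max_server_shots=2):
    """Share of service points the server wins in two shots or fewer.

    points: list of (server_shots, server_won), where server_shots counts
            the serve plus any later shots the server hit in that point.
    """
    if not points:
        raise ValueError("need at least one service point")
    quick_wins = sum(
        1 for shots, won in points if won and shots <= max_server_shots
    )
    return quick_wins / len(points)


# Ten made-up service points.
service_points = [
    (1, True), (1, True),    # unreturned serves
    (2, True), (2, True),    # serve-plus-one putaways
    (2, False),              # server misses the second shot
    (4, True), (3, False),   # longer rallies
    (1, False),              # returner ends the point immediately
    (5, True), (2, True),
]
print(quick_service_points_won(service_points))  # 0.5
```

Unreturned serves and second-shot putaways both land in the same bucket, so a day with many of one and few of the other still shows up as a strong number.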

We’ve already seen the two components of this, so there are no surprises here. The typical player wins about 40% of her service points this way, and Aryna has historically averaged 46% on hard courts. This number looks as good for her recent winning streak as we’d expect. But as with the previous graph, it suggests weakness during her 2020 Doha title, so the predictive power here is limited.

First and second serves

The combined metric of unreturned serves plus second-shot putaways gives us a good snapshot of when the offensive game is working. Let’s break down the previous graph into first- and second-serve specific numbers:

These track the overall numbers. Aryna has generally been good lately on both first and second serves, but with neither one has she been more successful or consistent than in previous hot streaks. Second serves are particularly hard to rate because the per-match sample size is so small–fewer than 30 second serve points per player per match, and some of those end up as double faults.

Before moving on to the return game, let’s look at one more indicator of service-point success:

Longer points on serve

As I said at the outset, Sabalenka has always been a good server. While her current momentum might owe a bit to fewer mental lapses on serve, it would be logical to look elsewhere for an explanation, simply because there was more room to improve in other areas.

We’ve seen how her serve and second shot rate. What about serve points that go deeper? This metric considers all points where the returner’s second shot comes back, and then counts how often the server goes on to win the point.

The average hard-court WTA match winner claims almost exactly half of her service points when the rally reaches five shots. Over her career, Sabalenka has won 48%, worse than the typical match winner but better than the overall tour average.

Aryna has done better lately. To cherry-pick a starting point, she has won 51% of these points in her last 24 matches, dating back to the Doha second round. Her average over the first five matches in Abu Dhabi was 55%, the best she has managed since her breakout run in late 2018, when she pushed Naomi Osaka to three sets at the US Open and hoisted the Wuhan trophy a few weeks later.

Return winners

We’ll walk through the dimensions of her return performance in a similar manner, starting with return winners (and point-ending non-winners), then on to “return-plus-one” putaways, followed by the combination of the two.

First, return winners. I use the number of point-ending return winners divided by in-play serves–that is, excluding double faults.

Veronika Kudermetova had a rough day last Wednesday, so Sabalenka’s current five-match rolling average is as high as it’s been since early 2018. Apart from that last-minute burst of return dominance, her recent return winner rates look a bit like the serve stats: consistently solid, if not spectacular.

Return plus one

How about when the serve return doesn’t finish the job? This “return plus one” metric counts opportunities when the server puts her second shot in play and measures how often the returner hits a winner or forces an error with her own second shot. The sample sizes are getting a bit small here (each player has 43 such opportunities in an average hard-court match), so the per-match rates are rather spiky:

The small single-match samples, combined with the relationship between return-plus-one and return winners–almost interchangeable ways to respond successfully to a mediocre serve–render conclusions a bit tough to come by. Sabalenka was average by this measure in Ostrava, great in Linz, and all over the place in Abu Dhabi.
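To put a rough number on how noisy a 43-opportunity sample is, the sketch below computes a binomial standard error. It assumes each opportunity is an independent trial with the same success rate, which real points are not, so treat it as a ballpark. The worst case, a true rate of 50%, serves as an upper bound here rather than any measured rate.

```python
import math


def binomial_standard_error(rate, n):
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(rate * (1 - rate) / n)


# 43 opportunities per match comes from the text; 0.5 is the worst-case rate,
# chosen as an upper bound and not taken from any player's numbers.
upper_bound = binomial_standard_error(0.5, 43)
print(f"standard error is at most {upper_bound:.1%} per match")
```

With an error band that wide on a single match, swings of several percentage points from one match to the next are what chance alone would produce.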

Short return points won

Will things be clearer when we combine both methods of quickly winning a return point?

Aside from a weak return performance against Elena Rybakina in Abu Dhabi, Sabalenka has been comfortably above average in this metric in every match since she faced Victoria Azarenka in the Ostrava final.

Like “serve plus one,” this is a good indicator of overall success for the Belarusian. If we use this metric to split her 140 charted hard-court matches in half, the dividing line is 27.5% of return points won with a return winner or a return-plus-one putaway. Above that mark, she has won 62 matches, or 88.6%. Below it, she has won only 41, or 58.6%. She was above the line in nearly all of her matches in Linz and Abu Dhabi, and she sat at 25% or higher in every round of her 2020 Doha triumph, clearing 30% in three of five matches there.
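The split-in-half comparison can be sketched as follows. The function name, the tuple layout, and the four-match sample are hypothetical. The only real figures are the 62-of-70 and 41-of-70 counts quoted above, which the last lines convert to percentages.

```python
from statistics import median


def win_rates_around_median(matches):
    """matches: list of (metric_value, won) pairs, one per match.

    Splits the matches at the median metric value and returns the cutoff
    plus the win rate on each side. Matches exactly at the cutoff are skipped.
    """
    cutoff = median(value for value, _ in matches)
    above = [won for value, won in matches if value > cutoff]
    below = [won for value, won in matches if value < cutoff]
    return {
        "cutoff": cutoff,
        "above": sum(above) / len(above) if above else None,
        "below": sum(below) / len(below) if below else None,
    }


# Hypothetical four-match sample, for illustration only.
sample = [(0.31, True), (0.29, True), (0.26, False), (0.24, True)]
print(win_rates_around_median(sample))

# The counts from the text: 140 matches split into two halves of 70.
print(round(100 * 62 / 70, 1), round(100 * 41 / 70, 1))
```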

First and second serve returns

Has she been particularly devastating against first or second serves? Let’s see:

Few women feast on second serves the way Sabalenka does, and she’s been particularly relentless of late. The typical tour player wins about 30% of second-serve return points with a first- or second-shot putaway, and over her last 15 matches, Aryna has won 41% that way. 41% is a respectable total percentage of return points won against many servers, and Sabalenka would be winning that many even if she refused to hit more than two shots per rally.

Granted, Sabalenka doesn’t hit that many fifth or sixth shots. How does she fare when her return points extend that far?

Long return points

You’ll be glad to know that the code for this final* graph didn’t throw any divide-by-zero errors–Aryna has played at least one “long” return point in each of her hard-court matches. This metric tallies up all return points in which the server puts her third shot in play, then calculates how often the returner won the point.

* Yes! It’ll be over soon!
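A rough sketch of that tally, including the divide-by-zero guard, might look like this. It assumes each return point is stored as a pair of (shots put in play, whether the returner won), with the serve counted as shot one, so the server’s third shot is shot five. The data layout and function name are illustrative assumptions, not the code behind the graph.

```python
def long_return_win_rate(return_points, min_shots_in_play=5):
    """return_points: iterable of (shots_in_play, returner_won) pairs.

    Counting the serve as shot one, the server's third shot is shot five,
    so a "long" return point has at least five shots in play.
    """
    long_points = [won for shots, won in return_points if shots >= min_shots_in_play]
    if not long_points:
        return None  # no long points in this match: avoid dividing by zero
    return sum(long_points) / len(long_points)


# Hypothetical return points from one match.
points = [(5, True), (2, False), (7, False), (6, True), (3, True)]
print(long_return_win_rate(points))
```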

This is another spiky mess, with an average of only 20 points per match. Still, if we’re looking for a category in which Sabalenka is newly excelling–not just thriving as usual–this could be our smoking gun.

Tour average for match winners on this stat is 46.7%. The server has an advantage by definition, because she has just put the ball back in play. The Belarusian’s career mark is 44.4%, only a bit better than the overall average. Yet in her last 15 matches, she has won 48.0% of these long return points, her best 15-match span since early in her career, when she faced a weaker mix of opponents.

I don’t want to overemphasize this: When there are only 20 points of this type per match, an improvement of 3.6 percentage points translates to a gain of less than one point per match. That doesn’t explain the magnitude of Sabalenka’s recent gains. But it does indicate that she is shoring up one of her few weaknesses, and in combination with her solid play on long serve points, it suggests that she no longer needs to rely on a one-two punch, even if her one-two punch is as dizzying as anyone’s.
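The back-of-the-envelope arithmetic, spelled out with the figures quoted above (the variable names are just labels for this sketch):

```python
long_points_per_match = 20   # average long return points per match, from the text
career_rate = 0.444          # career win rate on these points
recent_rate = 0.480          # win rate over the last 15 matches

# Extra points won per match that the improvement accounts for.
extra_points_per_match = long_points_per_match * (recent_rate - career_rate)
print(round(extra_points_per_match, 2))
```

The result is 0.72, so the improvement is worth a little under three-quarters of a point per match.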

Don’t make me say consistency

Tennis matches are decided by a handful of points: While Sabalenka has been dominant lately, she lost more points than she won against Coco Gauff in the Ostrava opening round. As such, improvements always look minor when we try to quantify them, if we can quantify them at all.

I’ve pointed out some areas where Sabalenka may be improving, others where a good statistical showing usually coincides with a W, and still others where an excellent performance doesn’t seem to matter much. All of these categories have one thing in common: She is putting up stellar numbers right now.

Remember, in the twelve graphs above (yes, twelve, sheesh), the dotted yellow lines indicate the average performance of match winners. In every single one of the categories, Aryna’s five-match rolling average is above that line. Every single one! In most cases, it has been above the line for some time.

It doesn’t take any statistical savvy to see that if a player is better than the average match winner in every category, she’ll be awfully tough to beat. The rest of the Australian Open field can only cross their fingers that Sabalenka’s current form won’t survive two weeks of quarantine.

Podcast Episode 91: Book Club: A Handful of Summers by Gordon Forbes

Episode 91 of the Tennis Abstract Podcast, with Carl Bialik, of the Thirty Love podcast, recaps the first installment of our book club, on A Handful of Summers by Gordon Forbes.

Forbes’s book, first published in 1978, is a well-regarded memoir of 1950s and 1960s amateur tennis, and a timely read, as the South African died last month, aged 86. Carl and I talk about what we learned about pre-Open Era tennis, what set Rod Laver apart from his peers, how Forbes stacked up as a player, and whether the lifestyles of amateur and pro players were really so different. We also address the tricky subject of how to read a memoir with very of-the-time attitudes toward women, barely an acknowledgement of apartheid, and a 2017 prologue that has nothing to say about either issue. Despite those reservations, there’s much in the book to appreciate.

The TAP book club will reconvene in about one month with our next pick, John Updike’s 1968 novel, Couples. While the book is about much more than tennis, novelist Benjamin Markovits (a Thirty Love guest) gave it a place on his list of favorite tennis books.

Fans of the TA podcast will also want to check out Dangerous Exponents, the new Covid-19 podcast that Carl and I are doing. Later today, we’re releasing our 10th episode, about the tradeoffs faced by hospitals and policymakers between minimizing deaths and optimizing for other health-related outcomes.

(Note: this week’s episode is about 48 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Hello, 1959

Another season, another 2,300 matches on the Tennis Abstract site. The latest addition is the 1959 women’s tennis season, which you can dig into here.

Althea Gibson more or less retired from the amateur circuit after a dominant 1958 season. She did a bit of acting, some lounge singing, and returned to the courts only long enough to win the Chicago Pan-American Games in August. That left the field open for three other women to spend some time at number one–according to Elo, anyway.

Angela Mortimer was the first to unseat Gibson, holding the top spot for 18 weeks on the strength of her perfect 15-0 record in finals this season. Maria Bueno took over for a week in November, losing her position to Beverly Baker Fleitz for two weeks, then reclaiming the honor, which she would hold well into 1960. For Baker Fleitz, who was ambidextrous and played with two forehands (!), it was a fitting sendoff into retirement after an outstanding decade of top-level tennis.

As usual, the raw data is available in my GitHub repo. Another round of thanks is due to the contributors at the Blast From the Past forum, who did much of the heavy lifting you see here.

The 1960 Women’s Tennis Season, When Quality Topped Quantity

Our dive into the history of women’s tennis keeps getting deeper. Tennis Abstract now includes hundreds of events and thousands of matches from the 1960 season, which you can browse here.

1960 was the year of the first major title for Margaret Court, when the 17-year-old proved that, if nothing else, she was a glutton for punishment. But she didn’t travel abroad, which made her a non-factor for the rest of the season. With Althea Gibson out of the picture on the pro tour*, the field was open for stars such as Maria Bueno, Darlene Hard, and the largely forgotten Zsuzsa Kormoczy. Bueno won Wimbledon and narrowly lost to Hard in the finals at Forest Hills, spending most of the year at number one in the Elo rankings.

* I’m collecting pro results when I come across them, but the return so far is sparse. Most professional women’s matches were one-offs, akin to today’s exhibitions, and were generally played among a very small group of competitors.

For sheer endurance, the 1960 crown should go to Ann Jones. She played over 120 matches, won 106 of them, and took home 15 titles. (15.5, actually, as she reached the Montego Bay final, which was rained out.) Yet according to Elo, those eye-popping numbers weren’t quite enough to overtake Bueno. Amateur era tennis is full of tricky comparisons like this, with one elite player opting for a shorter schedule against top-flight competition, and another choosing to play almost every week, which out of necessity included weaker regional tournaments. Jones might be the best exemplar of the second category. I now have records of her playing over 1,300 career matches, and that figure is almost certainly missing some early-round tilts.

In 1960, Bueno played exactly half as many matches (evenly splitting her eight meetings with the Brit), yet narrowly edged Jones in the year-end Elo race, 2240 to 2237. The Brazilian held the number one position all year except for four weeks in May and June, when she ceded it to Kormoczy, who played an even more selective schedule. But while the 36-year-old Hungarian stayed home for most of the year, she reeled off a 19-match win streak on the Riviera circuit, capped by a win over Jones in the Rome final.

You can take your own look at the 1960 women’s season here. The linked page includes a full calendar of events, year-end Elo rankings, season stats, head-to-heads, and country comparisons.

The raw data, along with that of every season from 1961 to the present, is available in my GitHub repo. I’ve also recently added thousands of matches from second-tier events and qualifying in the early Open Era. This project owes a huge debt to the contributors at tennisforum.com’s Blast From the Past, who have been moving tennis data from dusty annuals and newspaper archives to the internet for the last decade.