Unpredictable Bounces, Predictable Results

These days, the grass court season is the awkward stepchild of the tennis calendar. It takes place almost entirely within a single country’s borders, lasts barely a month, and often suffers from the absence of top players who prefer to rest after the French Open.

The small number of grass court events makes the surface problematic for analysts, as well. The surface behaves differently than hard or clay courts and rewards certain playing styles, so it’s reasonable to assume that many players will be particularly good or bad on grass. But with 90% of tour-level matches contested on other surfaces, many players don’t have much of a track record with which we can assess their grass-court prowess.

I was surprised, then, to find that grass court results are rather predictable. Elo-based forecasts of ATP grass court matches are almost as accurate as hard court predictions and considerably more effective than clay court forecasts. Even when we use “pure” surface forecasts–that is, predicting matches using ratings which draw only on results from that surface–grass court forecasts are a bit better than clay court predictions.

I started with a dataset of the roughly 50,000 ATP matches from 2000 through last week, excluding retirements and withdrawals. As a benchmark, I used official ATP rankings to make predictions for each of those matches. 66.6% of them were right, and the Brier score for ATP rankings over that span is .210. (Brier score measures the accuracy of a set of forecasts by averaging the squared error of each individual forecast, so a lower number is better. To put tennis-specific Brier scores in context, in 2016, ATP rankings had a .208 Brier score, and aggregate betting odds had a .189 Brier score.)
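For anyone who wants to reproduce these two metrics, here is a minimal sketch of how they can be computed. The function names and the sample forecasts are made up for illustration, and this is not the script behind the numbers in this post.

```python
def brier_score(forecasts):
    """Average squared error of a set of probabilistic forecasts.

    `forecasts` is a list of (p_win, won) pairs: p_win is the probability
    given to one player, and won is 1 if that player won, else 0.
    """
    return sum((p - won) ** 2 for p, won in forecasts) / len(forecasts)


def favorite_win_rate(forecasts):
    """Share of matches won by whichever player was given more than 50%."""
    decided = [(p, won) for p, won in forecasts if p != 0.5]  # skip toss-ups
    hits = sum(1 for p, won in decided if (p > 0.5) == (won == 1))
    return hits / len(decided)


# Made-up forecasts, for illustration only: (probability, outcome)
sample = [(0.7, 1), (0.6, 0), (0.8, 1), (0.55, 1)]
print(round(brier_score(sample), 4))
print(favorite_win_rate(sample))
```

A forecaster who calls every match a coin flip scores exactly 0.25, which is a handy reference point for the figures in this post.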

Let’s break that down by surface and compare the performance of ATP rankings, Elo, and surface-specific Elo. “F%” is the percentage of matches won by the favorite–as determined by that system–and “Br” is Brier score:

Surface  ATP F%  ATP Br  Elo F%  Elo Br  sElo F%  sElo Br  
Hard      67.3%   0.207   68.0%   0.205    68.5%    0.202  
Clay      66.1%   0.211   67.1%   0.211    67.0%    0.213  
Grass     66.0%   0.215   67.6%   0.207    68.5%    0.207

All three rating systems do best on hard courts, and for good reason: official rankings and overall Elo are more heavily weighted toward hard court results than they are toward clay or grass. Surface-specific Elo does best on hard courts for a similar reason: more data.

Already, though, we can see the unexpected divergence of clay and grass courts, especially with surface-specific Elo. It’s possible to explain overall Elo’s better performance on grass courts by the presumed similarity between hard and grass–if a player excels on one, he’s probably good on the other, even if he’s horrible on clay. But that doesn’t explain sElo doing better on grass than on clay. There are 3.3 times as many tour-level matches on clay as on grass, so even allowing for the fact that players choose schedules to suit their surface preferences, almost everyone is going to have more results on dirt than on turf. More data should give us better results, but not here.

We can improve our forecasts even more by blending surface-specific ratings with overall ratings. After testing a wide range of possible mixes, it turns out that equally weighting Elo and sElo provides close to the best results. (The differences between, say, 60/40 and 50/50 are extremely small on all surfaces, so even where 60/40 is a bit better, I prefer to keep it simple with a half-and-half mix.) Here are the results for weighted surface Elos for all three surfaces:

Surface  Mix F%  Mix Br  
Hard      68.6%   0.202  
Clay      68.0%   0.207  
Grass     69.8%   0.196

Now grass courts are the most predictable of the major surfaces! Even when we use a weighted average of Elo and sElo, grass court forecasts rely on less data than those of the other surfaces–the surface-specific half of the grass court forecasts uses less than one-third the match results of clay court predictions and less than one-fifth the results of hard court forecasts. In fact, we can do at least as well–and perhaps a tiny bit better–with even less data: A 50/50 weighting of grass-specific Elo and hard-specific Elo is just as accurate as the half-and-half mix of grass-specific and overall Elo.
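Here is a rough sketch of what a half-and-half blend could look like in code. It rests on two assumptions that aren't spelled out above: that the blend is applied to the ratings themselves rather than to the resulting probabilities, and that win probability comes from the conventional logistic Elo formula on a 400-point scale. The function names and ratings are hypothetical.

```python
def elo_win_prob(rating_a, rating_b):
    """Conventional logistic Elo expectation that player A beats player B.

    Assumption: the standard 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def blended_rating(overall, surface, weight=0.5):
    """Weighted mix of overall Elo and surface-specific Elo.

    weight is the share given to the surface-specific rating;
    0.5 is the half-and-half mix.
    """
    return weight * surface + (1.0 - weight) * overall


# Hypothetical ratings, not taken from any real player
a = blended_rating(overall=2000, surface=1900)
b = blended_rating(overall=1900, surface=1950)
print(a, b)
print(round(elo_win_prob(a, b), 3))
```

Swapping the `overall` argument for a hard-court rating gives the grass-plus-hard variant, and changing `weight` to 0.6 gives the 60/40 mix.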

Regardless of the exact formula, it remains striking that we can predict ATP grass court results so accurately from such limited data. Even if one-third of ATP events were played on grass, I still wouldn’t have been surprised if grass court results turned out to be the least predictable. The more a surface favors the server–and it’s hardest to break on grass–the tighter the scoreline will tend to be, introducing more randomness into the end result. Despite that structural tendency, we’re able to pick winners as successfully on grass as on the more common surfaces.

Here’s my theory: Even though there aren’t many grass court events, the conditions at those few tournaments are quite consistent. Altitude is roughly sea level, groundskeepers follow the lead of the staff at Wimbledon, and rain clouds are almost always in sight. Compare that homogeneity to the variety of hard courts or clay courts. The high-altitude hard courts in Bogota are nothing like the slow ones in Indian Wells. The “clay” in Houston is only nominally equal to the crushed brick of Roland Garros. While grass courts are almost identical to each other, clay courts are nearly as different from each other as they are from other surfaces.

It makes sense that ratings based on a uniform surface would be more accurate than ratings based on a wide range of surfaces, and it’s reassuring to find that the limited available data doesn’t cancel out the advantage. This research also suggests a further path to better forecasts: grouping hard and clay matches by a more precise measure of surface speed. If 10% of tour matches are sufficient to make accurate grass court predictions, the same may be true of the slowest one-third of clay courts. More data is almost always better, but sometimes, precisely targeted data is best of all.

Podcast Episode 12: Fed’s Match Points, Vika’s Return, and Wimbledon’s Seeding Formula

In Episode 12 of the Tennis Abstract Podcast, Carl Bialik and I get excited about grass season, both for the glories of the surface itself and for the great players it has brought back, including Roger Federer and Victoria Azarenka. We talk Fed’s surprise loss, the arcane (but worthwhile) Wimbledon seeding formula, the returning WTA stars who will threaten on grass, David Goffin’s injury, and the debut Challenger title for 16-year-old Felix Auger-Aliassime.

We had a lot of fun recording this one. Hope you enjoy it as well!

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Just How Aggressive is Jelena Ostapenko?

If you picked up only two stats about surprise Roland Garros champion Jelena Ostapenko, you probably heard that, first, her average forehand is faster than Andy Murray’s, and second, she hit 299 winners in her seven French Open matches. I’m not yet sure how much emphasis we should put on shot speed, and I instinctively distrust raw totals, but even with those caveats, it’s hard not to be impressed.

Compared to the likes of Simona Halep, Timea Bacsinszky, and Caroline Wozniacki, the last three women she upset en route to her maiden title, Ostapenko was practically playing a different game. Her style is more reminiscent of fellow Slam winners Petra Kvitova and Maria Sharapova, who don’t construct points so much as they destruct them. What I’d like to know, then, is how Ostapenko stacks up against the most aggressive players on the WTA tour.

Thankfully we already have a metric for this: Aggression Score, which I’ll abbreviate as AGG. This stat requires that we know three things about every point: How many shots were hit, who won it, and how. With that data, we figure out what percentage of a player’s shots resulted in winners, unforced errors, or her opponent’s forced errors. (Technically, the denominator is “shot opportunities,” which includes shots a player didn’t manage to hit after her opponent hit a winner. That doesn’t affect the results too much.) For today’s purposes, I’m calculating AGG without a player’s serves–both aces and forced return errors–so we’re capturing only rally aggression.
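To make the definition concrete, here is a small sketch of the rally-only calculation from per-match totals. The field names and the counts are hypothetical, and this is not the script I used on the Match Charting data.

```python
from dataclasses import dataclass


@dataclass
class RallyTotals:
    """Hypothetical rally counts for one player in one match (serves excluded)."""
    shots_hit: int              # rally shots the player actually struck
    winners: int                # her rally winners
    unforced_errors: int        # her unforced errors
    forced_errors_induced: int  # forced errors she drew from her opponent
    opp_winners: int            # opponent winners she never got a racket on


def aggression_score(t):
    """Point-ending shots divided by shot opportunities."""
    point_ending = t.winners + t.unforced_errors + t.forced_errors_induced
    # Shot opportunities include shots she couldn't hit after an opponent winner
    shot_opportunities = t.shots_hit + t.opp_winners
    return point_ending / shot_opportunities


# Made-up totals, for illustration only
example = RallyTotals(shots_hit=280, winners=40, unforced_errors=35,
                      forced_errors_induced=25, opp_winners=20)
print(round(aggression_score(example), 3))
```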

The typical range of this version of AGG is between 0.1–very passive–and 0.3–extremely aggressive. Based on the nearly 1,600 women’s matches in the Match Charting Project dataset, Kvitova and Julia Goerges represent the aggressive end, with average AGGs around .275. We only have four Samantha Crawford matches in the database, but early signs suggest she could outpace even those women, as her average is at .312. At the other end of the spectrum, Madison Brengle is at 0.11, with Wozniacki and Sara Errani at 0.12. In the Match Charting data, there are single-day performances that rise as high as 0.44 (Serena Williams over Errani at the 2013 French Open) and fall as low as 0.06. In the final against Ostapenko, Halep’s aggression score was 0.08, half of her average of 0.16.

Context established, let’s see where Ostapenko fits in, starting with the Roland Garros final. Against Halep, her AGG was a whopping .327. That’s third highest of any player in a major final, behind Kvitova at Wimbledon in 2014 (.344) and Serena at the 2007 Australian Open (.328). (We have data for every Grand Slam final back to 1999, and most of them before that.) Using data from IBM Pointstream, which encompasses almost all matches at Roland Garros this year, Ostapenko’s aggression in the final was 7th-highest of any match in the tournament–out of 188 player-matches with the necessary data–behind two showings from Bethanie Mattek Sands, one each from Goerges, Madison Keys, and Mirjana Lucic … and Ostapenko’s first-round win against Louisa Chirico. It was also the third-highest recorded against Halep out of more than 200 Simona matches in the Match Charting dataset.

You get the picture: The French Open final was a serious display of aggression, at least from one side of the court. That level of ball-bashing was nothing new for the Latvian, either. We have charting data for her last three matches at Roland Garros, along with two matches from Charleston and one from Prague this clay season. Of those six performances, Ostapenko’s lowest AGG was .275, against Wozniacki in the Paris quarters. Her average across the six was .303.

If those recent matches indicate what we’ll see from her in the future, she will likely score as the most aggressive rallying player on the WTA tour. Because she played less aggressively in her earlier matches on tour, her career average still trails those of Kvitova and Goerges, but not by much–and probably not for long. It’s scary to consider what might happen as she gets stronger; we’ll have to wait and see how her tactics evolve, as well.

The Match Charting Project contains at least 15 matches on 62 different players–here is the rally-only aggression score for all of them:

PLAYER                    MATCHES  RALLY AGG  
Julia Goerges                  15      0.277  
Petra Kvitova                  57      0.277  
Jelena Ostapenko               17      0.271  
Madison Keys                   35      0.261  
Camila Giorgi                  17      0.257  
Sabine Lisicki                 19      0.246  
Caroline Garcia                15      0.242  
Coco Vandeweghe                17      0.238  
Serena Williams               108      0.237  
Laura Siegemund                19      0.235  
Anastasia Pavlyuchenkova       17      0.230  
Danka Kovinic                  15      0.223  
Kristina Mladenovic            28      0.222  
Na Li                          15      0.218  
Maria Sharapova                73      0.217  
Eugenie Bouchard               52      0.214  
Ana Ivanovic                   46      0.211  
Garbine Muguruza               57      0.210  
Lucie Safarova                 29      0.209  
Karolina Pliskova              42      0.207  
Elena Vesnina                  20      0.207  
Venus Williams                 46      0.205  
Johanna Konta                  31      0.205  
Monica Puig                    15      0.203  
Dominika Cibulkova             38      0.198  
Martina Navratilova            25      0.197  
Steffi Graf                    39      0.196  
Anastasija Sevastova           17      0.194  
Samantha Stosur                19      0.193  
Sloane Stephens                15      0.190  
Ekaterina Makarova             23      0.189  
Lauren Davis                   16      0.186  
Heather Watson                 16      0.185  
Daria Gavrilova                20      0.183  
Justine Henin                  28      0.183  
Kiki Bertens                   15      0.181  
Monica Seles                   18      0.179  
Svetlana Kuznetsova            28      0.174  
Timea Bacsinszky               28      0.174  
Victoria Azarenka              55      0.170  
Andrea Petkovic                24      0.166  
Roberta Vinci                  23      0.164  
Barbora Strycova               16      0.163  
Belinda Bencic                 31      0.163  
Jelena Jankovic                24      0.162  
Alison Riske                   15      0.161  
Angelique Kerber               83      0.161  
Flavia Pennetta                23      0.160  
Simona Halep                  218      0.160  
Carla Suarez Navarro           31      0.159  
Martina Hingis                 15      0.157  
Chris Evert                    20      0.152  
Darya Kasatkina                18      0.148  
Elina Svitolina                46      0.141  
Yulia Putintseva               15      0.137  
Alize Cornet                   18      0.136  
Agnieszka Radwanska            90      0.130  
Annika Beck                    16      0.126  
Monica Niculescu               25      0.124  
Caroline Wozniacki             62      0.122  
Sara Errani                    23      0.121

(A few of the match counts differ slightly from what you’ll find on the MCP home page. I’ve thrown out a few matches with too much missing data or in formats that didn’t play nice with the script I wrote to calculate aggression score.)

Is Jelena Ostapenko More Than the Next Iva Majoli?

Winning a Grand Slam as a teenager–or, in the case of this year’s French Open champion Jelena Ostapenko, a just-barely 20-year-old–is an impressive feat. But it isn’t always a guarantee of future greatness. Plenty of all-time greats launched their careers with Slam titles at age 20 or later, but three of the players who won their debut major at ages closest to Ostapenko’s serve as cautionary tales in the opposite direction: Iva Majoli, Mary Pierce, and Gabriela Sabatini. Each of these women was within three months of her 20th birthday when she won her first title, and of the three, only Pierce ever won another.

However, simply comparing her age to that of previous champions understates the Latvian’s achievement. Women’s tennis has gotten older over the last two decades: The average age of a women’s singles entrant in Paris this year was 25.6, a few days short of the record established at Roland Garros and Wimbledon last year. That’s two years older than the average player 15 years ago, and four years older than the typical entrant three decades ago. Headed into the French Open this year, there were only five teenagers ranked in the top 100; at the end of 2004, the year of Maria Sharapova’s and Svetlana Kuznetsova’s first major victories, there were nearly three times as many.

Thus, it doesn’t seem quite right to group Ostapenko with previous 19- and 20-year-old first-time winners. Instead, we might consider the Latvian’s “relative age”—the difference between her and the average player in the draw—of 5.68 years younger than the field. When I introduced the concept of relative age last week, it was in the context of Slam semifinalists, and in every era, there have been some very young players reaching the final four who burned out just as quickly. The same isn’t true of women who went on to win major titles.
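The calculation itself is simple. Here is a minimal sketch, with a made-up five-player "draw" standing in for a real field and a hypothetical function name:

```python
def relative_age(player_age, field_ages):
    """Years by which a player is younger than the average entrant.

    Positive values mean the player is younger than the field.
    """
    return sum(field_ages) / len(field_ages) - player_age


# Made-up ages, for illustration only
field = [20.0, 24.5, 26.0, 28.5, 31.0]
print(relative_age(20.0, field))
```

With the figures quoted above, an average entrant age of about 25.6 and a relative age of 5.68 correspond to a player aged roughly 19.9.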

In the last thirty years, only two players have won a major with a greater relative age than Ostapenko: Sharapova, who was 6.66 years younger than the 2004 Wimbledon field, and Martina Hingis, who won three-quarters of the Grand Slam in 1997 at age 16, between 6.3 and 6.6 years younger than each tournament’s average entrant. The rest of the top five emphasizes Ostapenko’s elite company, including Monica Seles (5.29, at the 1990 French Open) and Serena Williams (5.26, at the 1999 US Open).

Each of those four women went on to reach the No. 1 ranking and win at least five majors–an outrageously optimistic forecast for Ostapenko, who, even after winning Roland Garros, is ranked outside the top ten. By relative age, Majoli, Pierce, and Sabatini are poor comparisons for Saturday’s champion–Majoli and Pierce were only three years younger than the fields they overcame, and Sabatini was only two years younger than the average entrant. By comparison, Garbine Muguruza, who won last year’s French Open at age 22, was two and a half years younger than the field.

Which is it, then? Unfortunately I don’t have the answer to that, and we probably won’t have a better idea for several more years. For most of the Open Era, until about ten years ago, the average age on the women’s tour fluctuated between 21 and 23. Thus, for the overall population of first-time major champions, actual age and relative age are very highly correlated. It’s only with the last decade’s worth of debut winners that the numbers meaningfully diverge. For Ostapenko and Muguruza–and perhaps Victoria Azarenka and Petra Kvitova–we have yet to see what their entire career trajectory will look like. To build a bigger sample to test the hypothesis, we’ll need a few more young first-time Slam winners, something we may finally see with Sharapova and Williams out of the way.

For more post-French Open analysis, here’s my Economist piece on Ostapenko and projecting major winners in the long term. Also at the Game Theory blog, I wrote about Rafael Nadal and his absurd dominance on clay in a fast-court-friendly era.

Finally, check out Carl Bialik’s and my extra-long podcast, recorded Monday, with tons of thoughts on the winners and the fields in general.

Podcast Episode 11: The French Open in Review

In Episode 11 of the Tennis Abstract Podcast, Carl Bialik and I celebrate the 2017 edition of Roland Garros with a lot to say about the perils of forecasting. Could we have seen Rafa’s resurgence coming? What are the career prospects for Latvian sensation Jelena Ostapenko and women’s runner-up Simona Halep? And what on earth are we supposed to conclude about Andy Murray right now?

It’s a super-sized episode, clocking in just under 80 minutes … and we still couldn’t fit everything in. Enjoy!

Click to listen, subscribe on iTunes, find us on Stitcher, or use our feed to get updates on your favorite podcast software.

First Meetings in Grand Slam Finals

The 2017 Roland Garros final is crammed with firsts for 20-year-old Latvian Jelena Ostapenko. Playing in only her eighth major, she had never before reached the round of 16, let alone the final two. Her opponent, Simona Halep, has been here before–she lost the 2014 French Open final to Maria Sharapova–but the two women have one first in common: Halep and Ostapenko have never played each other.

Slam finals are usually reserved for an elite group, and that select few tends to play each other quite a bit. Since 1980, women’s major finalists have had an average of 12 previous meetings. The veteran Australian Open finalists this year, Serena Williams and Venus Williams, had faced off 27 times before their clash in Melbourne.

That makes the Halep-Ostapenko debut meeting an unusual one, but the situation is not unheard of. The 2012 Roland Garros final was the first match between Sharapova and Sara Errani (they’ve since played five more). Overall, counting this one, there have been five first meetings in women’s major finals in the last 35 years. Here are the previous four:

Slam     Winner           Finalist               
2012 RG  Maria Sharapova  Sara Errani         
2009 US  Kim Clijsters    Caroline Wozniacki  
2007 W   Venus Williams   Marion Bartoli      
1988 RG  Steffi Graf      Natalia Zvereva

(There were probably a few more before that, but my database is missing a lot of matches from the mid-1970s, so I don’t know for sure.)

In all of these cases, the established star defeated the upstart, which bodes well for Halep. On the other hand, the Romanian doesn’t quite measure up to the previous four winners, all of whom had won a Grand Slam title before their final on this list.

First meetings in Grand Slam finals are a bit more common in the men’s game, though it’s been nearly a decade since the last one. We’ll probably wait quite a bit longer, too. Rafael Nadal and Stanislas Wawrinka will play for the 19th time on Sunday, and of the 45 possible pairings in the current top ten, only Kei Nishikori and Alexander Zverev have yet to face off. The next highest-ranked pair without a head-to-head is Andy Murray and Jack Sock, which, come to think of it, would make for an interesting Wimbledon final next month.

The last debut clash on such a big stage was the 2008 Australian Open, between Novak Djokovic and Jo Wilfried Tsonga. It was the eighth in the last 35 years:

Slam     Winner            Finalist                
2008 AO  Novak Djokovic    Jo Wilfried Tsonga   
2003 US  Andy Roddick      Juan Carlos Ferrero  
1997 RG  Gustavo Kuerten   Sergi Bruguera       
1997 AO  Pete Sampras      Carlos Moya          
1996 W   Richard Krajicek  Malivai Washington   
1986 RG  Ivan Lendl        Mikael Pernfors      
1985 W   Boris Becker      Kevin Curren         
1984 AO  Mats Wilander     Kevin Curren

Before 1982, most first-meeting finals took place at the Australian Open, which at that time usually featured a weaker draw than the other Slams. For instance, the 1979 final was played by Guillermo Vilas and John Sadri. While Vilas is among the all-time greats, Sadri never advanced beyond the fourth round of any other major–where he might have encountered Vilas more often.

One thing seems certain: It won’t be the last meeting for Halep and Ostapenko. All of the pairs I’ve listed played at least once after their Slam final, and with the exception of Wilander-Curren, each one played at least twice more. Halep is only 25, so if she remains near the top of the game and Ostapenko continues climbing the ranks, the pair could aim to match Graf and Zvereva, who met 20 more times after the 1988 French Open final. The loser of today’s match will want to avoid Zvereva’s fate, though: In those 20 matches, the Belarusian won only once.

Dominic Thiem and Reversible Blowouts

A few weeks ago in Rome, Dominic Thiem got destroyed by Novak Djokovic, 6-1 6-0. It was a letdown after Thiem’s previous-round upset of Rafael Nadal, and it seemed to provide a reminder of the old adage that tennis is about matchups. Even someone good enough to beat the King of Clay might struggle against a different sort of opponent.

Those struggles didn’t last. On Wednesday, Thiem faced Djokovic again, this time in the French Open quarterfinals, and won in straight sets. In less than three weeks, the Austrian bounced back from a brutal loss to defeat one of the greatest players of all time.

I’ve written before about the limited value of head-to-head records: When the head-to-head suggests that one player will win but the rankings disagree, the rankings prove to be the better forecaster. More sophisticated rating systems such as Elo would presumably do better still, though I haven’t done that exact test. There are certainly individual cases in which something specific about a matchup casts doubt on the predictiveness of the rankings, but if you have to pick one or the other, head-to-heads are the loser.

What about blowouts? Going into Wednesday’s quarterfinal, my surface-specific Elo ratings suggested that Thiem had a 26% chance of scoring the upset. The recent 6-1 6-0 loss was factored into those numbers, but only as a loss–there’s no consideration of severity. Should we have been even more skeptical of Thiem’s chances, given the most recent head-to-head result?

As it turns out, Thiem is far from the first player to turn things around after such a nasty scoreline. The most famous example is Robin Soderling, who lost 6-1 6-0 to Nadal in Rome in 2009, then bounced back to register one of the biggest upsets in tennis history, knocking out Rafa at Roland Garros. Few recoveries are so dramatic, but there are hundreds more.

Most players who lose lopsided scorelines–for today’s purposes, I’m considering any match in which the loser won two games or fewer–never get a chance to redeem themselves. I found roughly 2,250 such matches in the ATP’s modern era, and the same two players met again less than half of those times. The fact that the head-to-head continues is a signal itself: Mediocre players–the ones you’d expect to lose badly–don’t get another chance. Even some top-20 players rarely meet each other on court, so the sort of player who earns the chance for redemption might have already proven that his lopsided loss was just an off day.

Of the 951 occasions that a player loses badly and faces the same opponent again, he gets revenge and wins the next match 277 times–about 29%. Crazy as it sounds, if the only thing we knew about Djokovic and Thiem entering Wednesday’s match was that Djokovic had won the last match 6-1 6-0, our base forecast would’ve been pretty close to the 26% that the much more sophisticated Elo algorithm offered us.

29% is much higher than I expected, but it is lower than the typical rate for players in this situation. I found all head-to-heads of at least two meetings, and for every match after the first, counted whether it maintained or reversed the previous result. In addition to isolating lopsided scores, I also considered matches in which the loser won a set, on the assumption that those might be tighter matchups. Finally, for each of those categories, I tracked whether the follow-up matches were on the same surface as the previous one. Here are the results, with all win percentages shown from the perspective of the player who, like Thiem, lost the first encounter:

Score     Next Surface  Matches   Wins  Win %  
Any loss  All             68128  26586  39.0%  
Any loss  Same            31084  11855  38.1%  
Any loss  Diff            37044  14731  39.8%  
Bad loss  All               951    277  29.1%  
Bad loss  Same              457    128  28.0%  
Bad loss  Diff              494    149  30.2%  
Won set   All             26075  11286  43.3%  
Won set   Same            11766   4974  42.3%  
Won set   Diff            14309   6312  44.1%

The chances of recovering from a bad loss are better than I thought, but they are considerably worse than the odds that a player reverses the result after a less conspicuous scoreline–39%. The table also shows that the player seeking revenge is more likely to get it if the opportunity arises on a different surface, though not by a wide margin.
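The counting behind that table can be sketched in a few lines. This is an illustrative version under my own assumptions about the data layout (a chronological list of winner, loser, and games won by the loser), with a hypothetical function name and made-up players. It isn't the code that produced the numbers above.

```python
def count_reversals(matches, max_loser_games=None):
    """Count rematches and how often the previous loser won.

    `matches` is a chronological list of (winner, loser, loser_games) tuples.
    If max_loser_games is set, only rematches that follow a match in which
    the loser won at most that many games are counted.
    Returns (reversals, rematches).
    """
    last = {}  # pair of players -> (winner, loser_games) of their latest meeting
    reversals = rematches = 0
    for winner, loser, loser_games in matches:
        pair = frozenset((winner, loser))
        if pair in last:
            prev_winner, prev_games = last[pair]
            if max_loser_games is None or prev_games <= max_loser_games:
                rematches += 1
                if winner != prev_winner:
                    reversals += 1
        last[pair] = (winner, loser_games)
    return reversals, rematches


# Made-up results, for illustration only
matches = [("A", "B", 1), ("B", "A", 9), ("C", "D", 8), ("C", "D", 2), ("C", "D", 5)]
print(count_reversals(matches))                     # every rematch
print(count_reversals(matches, max_loser_games=2))  # rematches after a bad loss
```

Dividing the first number by the second gives the win percentage from the previous loser's point of view, as in 277 / 951, or about 29.1%.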

It’s clear that players are less likely to recover from a bad loss than from a more typical one, but how much of that is selection bias? After all, most of the players who lose 6-1 6-0 aren’t of the caliber of Thiem or Soderling, even if they are good enough to stick around in main draws and ultimately face the same opponent again.

To answer that question, I looked again at those 951 post-blowout matches, this time with pre-match Elo ratings. After eliminating everything before 1980 and a few other matchups with very little data, we were left with just under 600 data points. In this subset, Elo predicted that the players who lost badly had a 33.6% chance of winning the follow-up match. As we’ve seen, the actual success rate was 29%. Players who won lopsided matches outperformed their Elo forecast in the next meeting.

It’s not a huge difference, but enough to suggest that the matchup tells us a little bit about how the next contest will go. One match can make a difference in the forecast–as long as it isn’t against Dominic Thiem.

Digging into the cases when a player lost badly and then recovered, I found a couple of entertaining examples:

  • Former No. 7 Harold Solomon beat Ivan Lendl in their first meeting, 6-1 6-1. Later that year, they met again at the US Open, and Lendl won, 6-1 6-0 6-0. Lendl also won their six matches after that.
  • Over the course of four years, Phil Dent and Mark Cox played three lopsided matches against each other. Cox won the first, Dent got revenge in the second, and Cox reversed things again in the third.