Surface speed – Page 2 – Heavy Topspin

Slow Conditions Might Just Flip the Outcome of Federer-Nadal XL

Roger Federer likes his courts fast. Rafael Nadal likes them slow. With eight Wimbledon titles to his name, Federer is the superior grass court player, but the conditions at the All England Club have been unusually slow this year, closer to those of a medium-speed hard court.

On Friday, Federer and Nadal will face off for the 40th time, their first encounter at Wimbledon since the Spaniard triumphed in their historic 2008 title-match battle. Rafa leads the head-to-head 24-15, including a straight-set victory at his favorite slam, Roland Garros, several weeks ago. But before that, Roger had won five in a row–all on hard courts–the last three without dropping a set.

Because of the contrast in styles and surface preferences, the speed of the conditions–a catch-all term for surface, balls, weather, and so on–is particularly important. Nadal is 14-2 against his rival on clay, with Federer holding a 13-10 edge on hard and grass. Another way of splitting up the results is by my surface speed metric, Simple Speed Rating (SSR). 22 of the matches have been on a court that is slower than tour average, with the other 17 at or above tour average speed:

Matches     Avg SSR  RN - RF  Unret%  <= 3 shots  Avg Rally  
SSR < 0.92     0.74     17-5   21.2%       49.5%        4.7  
SSR >= 1.0     1.14     7-10   27.0%       56.9%        4.3

At faster events–all of which are on hard or grass–fewer serves come back, more points end by the third shot, and the overall rally length is shorter. Fed has the edge, with 10 wins in 17 tries, while on slower surfaces–all of the clay matches, plus a handful of more stately hard courts–Rafa cleans up.

Rafa broke Elo

According to my surface-weighted Elo ratings, Federer is the big semi-final favorite. He leads Nadal by 300 points in the grass-only Elo ratings, which gives him a 75% chance of advancing to the final. The betting market strongly disagrees, believing that Rafa is the favorite, with a 57% chance of winning.

The collective wisdom of the punters is onto something. Elo has systematically underwhelmed when it comes to forecasting the 39 previous Fedal matches. Federer has more often been the higher-rated player, and if Roger and Rafa behaved like the algorithm expected them to, the Swiss would be narrowly leading the head-to-head, 21-18. We might reasonably conclude that, going into Friday’s semi-final, Elo is once again underestimating the King of Clay.

How big of a Fedal-specific adjustment is necessary? I fit a logit model to the previous 39 matches, using only the surface-weighted Elo forecast. The model makes a rough adjustment to account for Elo’s limitations, and reduces Roger’s chances of winning the semi-final from 74.8% all the way down to 48.5%.

Now, about those conditions

The updated 48.5% forecast takes the surface into account–that’s part of my Elo algorithm. But it doesn’t distinguish between slow grass and fast grass.

To fix that, I added SSR, my surface speed metric, to the logit model. The model’s prediction accuracy improved from 64% to 72%, its Brier score dropped slightly (a lower Brier score indicates better forecasts), and the revised model gives us a way of making surface-speed-specific forecasts for this matchup. Here are the forecasts for Federer at several surface speed ratings, from tour average (1.0) to the fastest ratings seen on the circuit:

SSR  p(Fed Wins)  
1.0        49.3%  
1.1        51.4%  
1.2        53.4%  
1.3        55.5%  
1.4        57.5%  
1.5        59.5%
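
The table is what a logistic model produces when one input (SSR) varies and everything else is held fixed: the log-odds of a Federer win move in a straight line with SSR. As a rough check, the sketch below backs out an intercept and slope from the first and last rows only, then regenerates the rows in between. The coefficients here are derived from the published table, not taken from the fitted model itself, and the function names are made up for illustration.

```python
import math

# (SSR, p(Fed wins)) pairs copied from the table above
TABLE = [(1.0, 0.493), (1.1, 0.514), (1.2, 0.534),
         (1.3, 0.555), (1.4, 0.575), (1.5, 0.595)]


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def implied_line(first, last):
    """Intercept and slope (in log-odds per SSR unit) through two table rows."""
    (x0, p0), (x1, p1) = first, last
    slope = (logit(p1) - logit(p0)) / (x1 - x0)
    intercept = logit(p0) - slope * x0
    return intercept, slope


intercept, slope = implied_line(TABLE[0], TABLE[-1])


def p_fed_wins(ssr):
    """Federer's implied win probability at a given surface speed rating."""
    return sigmoid(intercept + slope * ssr)


for ssr, published in TABLE:
    print(f"SSR {ssr:.1f}: published {published:.1%}, implied {p_fed_wins(ssr):.1%}")
```

The implied slope is a little over 0.8 in log-odds per full point of SSR, which works out to about two percentage points of win probability for every 0.1 of surface speed when the matchup is near 50-50.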

In the fifteen years since Rafa and Roger began their rivalry, the Wimbledon surface has averaged around 1.20, 20% quicker than tour average. In 2006, when they first met at SW19, it was 1.24, and in 2008, it was 1.15. Three times in the last decade it has topped 1.30, 30% faster than the average ATP surface. This year, it has dropped almost all the way to average, at 1.00, when both men’s and women’s results are taken into account.

As the table shows, such a dramatic difference in conditions has the potential to influence the outcome. On a faster surface, which we’ve seen as recently as 2014, Federer has the edge. At this year’s apparent level, the model narrowly favors Nadal. Rafa has said that the surface itself is unchanged, but that the balls have been heavier due to humidity. He should hope for another muggy day on Friday–the end result could depend on it.

The Grass Dies, But the Speed Lives On

Italian translation at settesei.it

Earlier this week, I trotted out some stats showing that the Wimbledon grass is playing slower this year, the latest tick in a years-long trend. Many fans suspect that by the second week, the conditions are even slower still, with huge brown spots around each baseline where the players have worn away the grass. Assuming that the dying-grass effect is similar each year, this is something we can test.

I ran my surface speed algorithm for several subsets of Wimbledon men’s singles matches: week 1, week 2, each round from 1 to 4, and the final 8. For a single year, the “week 2,” “round 4,” and “final 8” samples are too small to give us any reliable indicators. But over the course of two decades, the differences between weeks and rounds–the effect we’re interested in today–should become clear.

(Quick refresher on my surface speed method: It uses ace rate as a proxy for speed–not perfect, but functional, using a stat that is universally available–and takes into account the server and returner in each match. An average court speed is 1.0, and ratings typically range from about 0.5 for a venue like Monte Carlo to 1.5 for the fastest grass and indoor hard courts.)

For example, here are the week-by-week and round-by-round speed ratings for the 2018 Wimbledon men’s draw:

Week 1: 1.16
Week 2: 1.16
Round 1: 1.02
Round 2: 1.29
Round 3: 1.33
Round 4: 1.25
Last 8: 1.08

I promised noise, and there it is. Each week is equally speedy, but the first round and last few rounds are oddly slower than the rest. I don’t have a good explanation for the first round (and there might not be one–it could be random), but the last 8 often features fewer aces, even when adjusting for the players involved. We’ll come back to that in a bit.

Wimbledon, 2000-18

Here are the same numbers, averaged over the last 19 Wimbledons:

Week 1: 1.20
Week 2: 1.21
Round 1: 1.19
Round 2: 1.20
Round 3: 1.21
Round 4: 1.25
Last 8: 1.16

The sample of the last 8 still deviates from the rest, but with more data, the difference is much smaller. The gap between 1.20 and 1.16 is just an ace or two per match. That’s not enough to reverse the outcome of any but the very closest matches.

As usual, I must acknowledge that an ace-based metric isn’t definitive. There’s more to court speed than what aces can tell us. It’s possible that the surface behaves differently as the grass is worn away, even if it doesn’t show up in serve stats. Since net approaches are increasingly rare, the service-box grass lasts longer than the baseline grass, meaning that the speed at which serves move through the court would be relatively unchanged. On the other hand, the biggest brown spots on court are behind the baseline, so most groundstrokes also bounce on green grass, not on brown dirt.

The best versus the best

Even the small difference between the last 8 and the rest of the tournament may not have anything to do with the decaying of the surface. Since 2000, the US Open has exhibited the same trend: 1.07 for week 1, 1.06 for round 4, and 0.97 for the final 8. (The Australian Open numbers are much noisier than the other slams, perhaps due to frequent use of the roof, so I’m hesitant to use them.)

It seems safe to assume that the hard courts in Flushing don’t suddenly get slower starting on Tuesday or Wednesday of the second week. Instead, I think the answer is in the mix of players–or more precisely, how those players interact with each other. By this ace-based metric, the Tour Finals have often been rated as one of the slowest indoor hard court events–even though the official Court Pace Index (CPI) ratings disagree.

In other words, aces tend to go down when the best play the best. Maybe the elites serve more tactically when facing tough opponents? Perhaps they focus more consistently on return, rarely allowing cheap aces? Maybe the best players know each other’s games so well that they anticipate even better than usual? This seems like an interesting line of research, even if it’s not something I’m going to resolve today.

The bottom line is that partly-brown Wimbledon courts play just about as fast as totally-green Wimbledon courts do. There might be a very minor slowdown toward the end of the fortnight, but even there, we should remain skeptical. The conditions are slow this year, but at least they won’t get much slower.

Yep, Wimbledon is Playing Slower This Year

Italian translation at settesei.it

The players are right. Wimbledon’s surface–or balls, or atmosphere, or aura–has slowed down in comparison with recent years. We’ve heard comments to that effect from Roger Federer, Milos Raonic, Boris Becker, Rafael Nadal, and many others. Raonic attributes the change to the grass, and Nadal to the balls. Regardless of the reason, the numbers back up their perceptions.

Here is an overview of several surface-speed indicators for the first three rounds of singles matches at Wimbledon, 2017-19:

                     2017   2018   2019  
Aces (Men)           8.9%  10.0%   8.5%  
Aces (Women)         4.1%   4.2%   4.1%  
                                         
Unret (Men)         36.0%  36.6%  33.3%  
Unret (Women)       25.9%  27.6%  25.2%  
                                         
<= 3 Shots (Men)    65.2%  65.6%  61.9%  
<= 3 Shots (Women)  55.3%  57.9%  55.0%  
                                         
Avg Rally (Men)       3.4    3.5    3.7  
Avg Rally (Women)     4.0    3.8    4.1

The second set of rows, "Unret," is the percent of unreturned serves. The next set, "<=3 Shots," is the percent of points that ended in three shots or fewer. For all four of the stats shown, including aces and average rally length, men's numbers point to slower conditions. The women's numbers are less clear, but to the extent that they point in either direction, they concur.

Not just 2019

Aggregate numbers such as these usually give us an idea of what's going on. But we can do better. The numbers above do not control for the mix of players or the length of their matches. For instance, 2019's rates would be different if John Isner, instead of Mikhail Kukushkin, had played a third-round match. The surface speed might have affected that result, but if we're going to compare ace rate from one year to the next, we shouldn't compare Isner's ace rate with Kukushkin's ace rate.

That's where my surface speed metric comes in. For each tournament, I control for the mix of servers and returners (yes, returners affect ace rate, too) to boil down each event to one number, representing how the tournament's ace rate compares to tour average. While there's more to surface speed than ace rate, aces are a good proxy for many of those other indicators, and more importantly, aces are one of the few stats that are available for every match.
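
The post doesn't spell out the exact calculation, so what follows is only a minimal sketch of one way to make that kind of adjustment, not the actual algorithm. It assumes each player's ace tendencies are known as season-long rates and combines server and returner multiplicatively relative to tour average. The `ServeLine` record, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServeLine:
    """One player's serving line in one match (hypothetical structure)."""
    serve_points: int
    aces: int
    server_ace_rate: float            # server's usual ace rate across all events
    returner_ace_rate_against: float  # ace rate the returner usually allows


def expected_ace_rate(server_rate, returner_rate_against, tour_avg):
    """Assumed model: scale the server's rate by how the returner compares to average."""
    return server_rate * returner_rate_against / tour_avg


def speed_rating(lines, tour_avg):
    """Actual aces divided by the aces these players would be expected to hit."""
    actual = sum(line.aces for line in lines)
    expected = sum(
        line.serve_points
        * expected_ace_rate(line.server_ace_rate,
                            line.returner_ace_rate_against,
                            tour_avg)
        for line in lines
    )
    return actual / expected


# Illustrative numbers only: two serving lines from a made-up event.
sample = [
    ServeLine(serve_points=100, aces=12,
              server_ace_rate=0.10, returner_ace_rate_against=0.08),
    ServeLine(serve_points=100, aces=6,
              server_ace_rate=0.05, returner_ace_rate_against=0.08),
]
print(round(speed_rating(sample, tour_avg=0.08), 2))
```

A rating above 1.0 means the field hit more aces than its usual rates would predict, which is the sense in which a tournament "plays fast" by this measure.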

The resulting score usually ranges between 0.5 (50% fewer aces than average, usually on a slow clay court like Monte Carlo) and 1.5 (50% more aces than average, on a fast grass or indoor hard court, like Antalya or Metz). Over the last decade, Wimbledon's conditions have drifted from the high end of that range to the middle:

Year      Men    Women  Average  
2011     1.26     1.37     1.31  
2012     1.27     1.06     1.17  
2013     1.29     1.04     1.17  
2014     1.35     1.19     1.27  
2015     1.20     1.16     1.18  
2016     1.06     1.03     1.04  
2017     1.03     1.07     1.05  
2018     1.14     0.98     1.06  
2019     1.04     0.96     1.00

The men's numbers are usually more reliable measurements, because they are based on many more aces, which means that the ace rate for any given match is less fluky. Ideally, we'd see the men's and women's speed ratings move in lockstep, but there is some noise in the calculation, and the ratings are also relative to that year's tour average, which depends in turn on the changing speeds of dozens of other surfaces.

Caveats aside, the direction of the trend is clear. There isn't a substantial difference between 2019 and the last few years, but the gap between the first and second half of the decade is dramatic.

What is less clear–and will require considerable further research–is how much it matters. In 2014, Nick Kyrgios upset Nadal in four sets, while last week, the result was reversed. How much of that can we attribute to the surface? Would faster conditions have allowed Isner to outlast Kukushkin? Kevin Anderson to hold off Guido Pella? Jelena Ostapenko to withstand Su Wei Hsieh?

For now, those questions remain in the speculation-only file. Now that we can conclude that the grass really has gotten slower, we can focus that speculation on the fates of several grass court savants, including Federer, Raonic, and Karolina Pliskova. By the end of the fortnight, they–like Kyrgios–might be wishing it was 2014 again.

The Happy Slam is the Speedy Slam

Italian translation at settesei.it

Two years ago, during the 2017 Australian Open, I offered a partial explanation of the many upsets at that year’s first major. Novak Djokovic, Andy Murray, Angelique Kerber, Simona Halep and many others had been ousted before the quarter-finals, all to players with a more aggressive, attacking style. It turned out that the courts that year were playing particularly fast–quicker than any of the other slams, including Wimbledon, as well as most hard-court tour stops.

In Melbourne this year, the courts are playing even faster.

Through three rounds of play, almost 90% of the tournament’s singles matches are in the books. Based on my surface-speed metric, which measures how many aces are struck at each tournament while controlling for the mix of servers and returners, the 2019 Australian Open can boast the quickest surface at the event since at least 2011*, and the second-fastest conditions of any major in that time span.

* Match stats, even simple ones such as service points and aces, are increasingly tough to come by for the women’s game before 2011.

The average of my surface-speed ratings for the men’s and women’s events at 2019’s first major is 1.28, meaning that there have been 28% more aces than expected, given the mix of servers and returners across the matches played so far. The notably fast 2017 event was 1.23, the fastest US Open of the last eight years was 1.14 (in 2015), and last year’s Wimbledon, played on the surface that is supposed to be fastest of all, was a mere 1.06.

Here are the top ten fastest slam surfaces from 2011 to the present:

Speed Rating Tournament      
1.31     2011 Wimbledon    
1.28     2019 Australian Open* 
1.27     2014 Wimbledon    
1.27     2016 Australian Open 
1.23     2017 Australian Open 
1.20     2015 Australian Open 
1.18     2015 Wimbledon    
1.17     2013 Wimbledon    
1.17     2012 Wimbledon    
1.15     2014 Australian Open

* through first three rounds

Last year’s Aussie Open was a bit of an outlier, but even still, it barely missed this list, coming in 12th at 1.12.

At least most players arrived prepared. The warm-up events in Brisbane and Auckland ranked among the fastest conditions since the beginning of last season: Brisbane rates at 1.29 while Auckland came in at a blink-and-you’ll-miss-it 1.35. Last year, only four events per tour were faster.

In theory, such a speedy surface should work to the advantage of big servers with aggressive games. At least so far, it hasn’t worked out that way. Unlike in 2017, Djokovic, Halep, and Kerber are still in the running, while Kevin Anderson was an early casualty. On the other hand, the court speed does jibe with some results, like Maria Sharapova’s third-round upset of defending champion Caroline Wozniacki.

If the conditions are to impact the result of the tournament, it will have to happen in matches yet to come. A slick surface tends to favor Roger Federer, even if Djokovic remains the popular pick to hoist the trophy next Sunday. More immediately, a fast surface doesn’t bode well for Halep’s chances in her fourth-round match against Serena Williams. Facing Serena is difficult enough without the conditions working against you, too.

How Fast Was the Laver Cup Court?

Italian translation at settesei.it

Laver Cup has redefined what a tennis event can be, and so far, the new definition seems to involve fast courts. Last year, we saw nine tiebreaks out of eighteen traditional sets, plus a pair of match tiebreaks that went to 11-9. This year’s edition wasn’t quite so extreme, with five tiebreaks out of sixteen traditional sets, but it still featured more tight sets than the typical tour event, in which tiebreaks occur less than once every five frames.

As usual, teasing out surface speed comes with its share of obstacles. Yes, there were lots of tiebreaks and yes, there were plenty of aces, but the player field featured more than its share of big servers. John Isner, Nick Kyrgios, and Roger Federer each contested two matches each year, and in Chicago, Kevin Anderson represented one-quarter of Team World’s singles contribution. No matter what the surface, we’d expect these guys to give us more serve-dominated matches than the tour-wide average.

Let’s turn to the results of my surface speed metric, which compares tournaments by using ace rate, adjusted for the serving and returning tendencies of the players at each event. The table below shows raw ace rate (“Ace%”) and the speed rating (“Speed”) for ten events from the last 52 weeks: The four 2018 grand slams, the fastest and slowest tour stops (Metz and Estoril, respectively), the two Laver Cups, and the two events that rate closest to the Laver Cups (Antalya and New York).

Year  Event            Surface   Ace%  Speed  
2018  Metz             Hard     10.6%   1.57  
2018  Antalya          Grass     9.9%   1.28  
2017  Laver Cup        Hard     17.0%   1.26  
2018  Australian Open  Hard     11.7%   1.17  
2018  Wimbledon        Grass    12.9%   1.16  
2018  Laver Cup        Hard     13.3%   1.09  
2018  New York         Hard     15.7%   1.09  
2018  US Open          Hard     10.8%   1.02  
2018  Roland Garros    Clay      7.7%   0.74  
2018  Estoril          Clay      5.2%   0.55

The speed rating metric ranges from about 0.5 for the slowest surfaces to 1.5 for the fastest, meaning that the stickiest clay results in about half as many aces as the same players would tally on a neutral surface, while the quickest grass or plexipave would give the same guys about half again as many aces as a neutral court would.

Last year’s Laver Cup, despite a whopping 17% ace rate, was barely among the top ten fastest courts out of the 67 tour stops I was able to rate. The surface in Chicago was on the edge of the top third, behind the speedy clay of Quito and considerably slower than the Australian Open.

These conclusions come with the usual share of caveats. First, surface speed is about more than ace rate. I’ve stuck with my ace-based metric because it’s one of the few stats we have for every tour-level event, and because despite its simplicity, it tracks closely with intuition, other forms of measurement, and player comments. Second, we’re not exactly overloaded with observations from either edition of the Laver Cup. Last year’s event featured nine singles matches, and this year there were eight. It’s even worse than that, because third sets are swapped out for match tiebreaks, leaving us even less data. That said, while we don’t have many matches to work with, we know a lot about the players involved, which isn’t as true of, say, Newport or Shenzhen, where a larger number of matches are contested by players who don’t make many appearances on tour.

The two Laver Cup surfaces rate as speedy, but not out of line with other indoor hard courts on the ATP tour. There will be tiebreaks and plenty of aces wherever Isner and Anderson go, no matter what the conditions.

The US Open Surface Speed Puzzle

Italian translation at settesei.it

Almost everyone agrees that the courts were slower at the US Open this year. The players thought so, the media concurred, and the tournament director confirmed that they had slightly changed the physical makeup of the surface in order to slow things down. Even clay-court wizard Dominic Thiem got within two points of the semi-finals, so clearly something changed.

I’m not going to argue with that. But when I set out to measure the change and get a sense of who might have benefited, I kept finding odd results. Almost nothing I tried revealed any clear-cut slowing of the surface, and by some metrics, the courts in Flushing played faster this year. Maybe it was just the heat and humidity–though the numbers don’t make that clear, either.

My usual starting point is my own surface-speed stat, which compares the ace rate at each tournament while controlling for the mix of servers and returners. While the dearth of advanced stats means it is limited to some basic inputs, it usually matches up quite well with our intuition and doesn’t differ too much from Court Pace Index (CPI), an infrequently-available metric based on direct physical measurements. Using my algorithm, the US Open surface was 5% faster than the average surface at an ATP event in the last 52 weeks, compared to last year, when it was 4% slower. Compared to courts at the average WTA tournament, New York was 5% slower this year, versus 19% slower in 2017. The slowest tour-level surfaces (for either gender) have about 50% fewer aces than average, and the fastest have about 50% more.

2017 wasn’t just a blip, either in real life or in my metric. It was similar to 2016, which also rated as considerably slower than this year’s surfaces. We’re left with a discrepancy that may stem from using an algorithm that relies too much on aces: Perhaps players were overwhelmed by the heat and tried more than usual to keep rallies short, or they simply didn’t bother trying to put their racket on first serves as often.

The evidence is clearer that players were more aggressive this year than in 2017. The average rally length, excluding double faults, on courts covered by Slamtracker (179 of the 254 main draw singles matches) fell from 4.28 shots last year to 4.17 this year, a drop of 2.6%. That could be affected by the changing mix of players in the draw (as well as those selected to play on higher-profile courts) so I isolated the 27 players with at least two matches worth of data from both 2017 and 2018. Those 27 saw their rally length drop a tiny bit more, about 3% from last year to this year.

We have the beginnings of an explanation. If players were showing more aggression–perhaps because the heat encouraged them to adopt more first-strike tactics–that could cancel out the effect of a slower surface. We can drill down even further using the Aggression Score (AS) metric, which measures the rate of winners and unforced errors per shot. Across all matches, AS rose from 15.3% in 2017 to 16.1% this year, an increase of 5.7%. Using the 27 players with multiple matches from both years’ tournaments, the difference is more stark, rising by 8.7%.

It’s clear that we saw more aggressive tennis at the 2018 Open than the year before. If we take for granted that the courts played slower, the case is closed: Tactics, probably heat-induced, outweighed surface. But if we approached the problem without knowing what players, media, and tournament officials said, the same numbers would unequivocally point to an even simpler conclusion, that the courts played faster.

If tactics explain our discrepancy, one more place we might look is first serves. Maybe servers took more chances, increasing their ace rate at the expense of first-serve percentage. But the data doesn’t back us up: The overall first-serve percentage in Slamtracker matches fell by a mere 0.07%. Using year-to-year comparisons for our set of 27 players, the difference was larger, but still a measly 0.3%. If tactics are the answer, it must be on the return of serve, not the serve itself.

This is where the trail runs cold. Return tactics are tougher to quantify than serving strategy, and there’s a limit to how much we can do with the available data. We can tally return winners and induced forced errors (IFEs), points in which the returner ended things with a strong reply. If returners allowed more aces, it should be because they took a more aggressive approach, trading fewer opportunities for better odds of winning when they did make contact. Instead, the record shows that return winners and IFEs fell a whopping 7% from last year to this year. That number supports the theory of a slower surface, and it meets expectations for those players who adopted very conservative return positions, such as Rafael Nadal, whose return winner/IFE rate went down by 3%, and Thiem, whose rate decreased by 7%. But a slower surface and a lower return winner/IFE rate should add up to fewer aces, not more.

Compared to where we started, we have a lot more data but not many more answers. Some signs point to a faster surface, others to a slower; some indicate more aggressive tactics, others more conservative ones. Regardless of what we know about the physical makeup of the courts, there are many factors that influence what we refer to as “surface speed.” The hot, humid conditions in Flushing this year surely help complicate things–perhaps a study that took into account the heat index for each individual match would shed more light on these questions. We could also be seeing players adapt to the conditions–whether the heat or the slower surface–in different ways. Everyone may agree about how the courts played this year, but it’s much more difficult to pin down exactly what that means.

Dominic Thiem, Old-School Clay Court Specialist

Italian translation at settesei.it

With a tennis calendar tilted heavily toward hard court events, we don’t see many true clay court specialists these days. The best male players who excel on clay are forced to adapt their games to hard courts, as well: Rafael Nadal has won six majors off of clay, while Pablo Carreno Busta and Diego Schwartzman have both hoisted trophies at tour level hard court events. It’s possible to play a mostly-clay schedule at the Challenger level, but it’s nearly impossible to establish yourself as an ATP regular without winning some matches on hard courts.

Dominic Thiem is capable enough on fast surfaces, but more than any other tour player, he is considerably better on the dirt than off it. In the last 52 weeks, he has won 25 of 31 matches on clay, compared to only 24 of 42 on other surfaces. Against the top ten, he is a respectable 7-9 on clay (more impressive when you consider that 12 of the 16 matches were against the Big Four, seven of them facing Nadal, and two of the others came against Stan Wawrinka), but a dismal 2-15 on hard courts. If you, like me, had settled into thinking of Thiem as a solid but not particularly threatening member of the top ten, you probably didn’t realize quite how bad he is on hard courts–or just how good he has become on clay.

When only clay court results are taken into consideration, Thiem rates as the second-best player on the surface. According to clay court Elo, the Austrian outranks everyone on tour except for Rafa and Novak Djokovic, whose rating reflects his skill level when he last regularly played and very likely will overstate his ability when he returns. Thiem trails Nadal by about 180 points, 2410 to 2235, implying that in a head-to-head matchup, we’d expect the Austrian to win only 26% of the time. But when we compare Thiem to the rest of the pack and exclude the walking wounded–Djokovic, Wawrinka, Andy Murray, and Kei Nishikori–along with clay-skipper Roger Federer, his position looks much better. The next best clay courter, Alexander Zverev, trails Thiem by about the same margin, nearly 170 points.
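
For anyone who wants to check how a rating gap turns into a win probability, here is a minimal sketch. It assumes the standard Elo logistic curve with a 400-point scale; the function name is made up for illustration, and the ratings are the rounded ones quoted above.

```python
def elo_win_probability(rating_a, rating_b):
    """Chance that player A beats player B under the standard Elo curve.

    Assumes the conventional 400-point scale, where a 400-point edge
    corresponds to 10-to-1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


thiem_clay = 2235  # rounded clay court Elo quoted in the text
nadal_clay = 2410

p_thiem = elo_win_probability(thiem_clay, nadal_clay)
print(f"Thiem over Nadal on clay: {p_thiem:.1%}")
```

With these rounded inputs the result lands within a point of the 26% figure; the small gap would be consistent with the published ratings being rounded.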

A clay court Elo rating over 2200 is a useful marker of elite status. In the professional era, only 29 players have reached that level, 22 of whom can count at least one Grand Slam title to their names. Among active players, only the Big Four, Nishikori, Juan Martin del Potro, David Ferrer, and Thiem belong to the club.

Where Thiem really stands out is the juxtaposition of his clay court success and his hard court mediocrity. After his title last week in Buenos Aires, his Elo rating based only on clay court results was 2234, compared to a hard court rating of 1869. The first number, as we’ve seen, is good for third overall, second if we exclude Djokovic’s increasingly stale results; the second puts him 31st on tour, behind Schwartzman, Damir Dzumhur, and Fabio Fognini.

No one active today is more of a clay specialist–in the sense that his results on clay exceed his results on hard–than Thiem. (There are some even more extreme differences between grass and either hard or clay, but the brevity of the grass season means that many of those contrasts are due only to small samples.) The ratio of Thiem’s clay court Elo rating to his hard court rating–again, 2234 to 1869–is 1.20, far beyond any of the 44 other active players with a clay court Elo rating of 1800 or higher. Simone Bolelli comes in second, at 1.12, and a handful of players, including Nadal, register at 1.10. Here is the entire top 20:

Player                 Clay Elo  Hard Elo  Ratio
Dominic Thiem              2234      1869   1.20
Simone Bolelli             1834      1634   1.12
Rafael Nadal               2410      2182   1.10
Albert Ramos               1873      1696   1.10
Federico Delbonis          1869      1696   1.10
Pablo Carreno Busta        1921      1746   1.10
Pablo Cuevas               1873      1709   1.10
Nicolas Almagro            1903      1755   1.08
Karen Khachanov            1838      1701   1.08
Leonardo Mayer             1878      1741   1.08
Aljaz Bedene               1826      1695   1.08
David Ferrer               2017      1894   1.07
Philipp Kohlschreiber      1951      1845   1.06
Stan Wawrinka              2138      2027   1.06
Martin Klizan              1800      1709   1.05
Guido Pella                1825      1744   1.05
Borna Coric                1830      1760   1.04
Fernando Verdasco          1863      1794   1.04
Alexander Zverev           2067      1997   1.04
Feliciano Lopez            1830      1772   1.03

A few decades ago, when it was possible for top players to spend more than two or three months per year racking up points on clay courts, such lopsided ratings were a bit more common. Of the 29 men who have ever topped 2200 in clay court Elo rating, 11 have at some point recorded a ratio of 1.20 or higher. That includes Nadal, whose clay rating was 20% higher than his hard court number early in 2008, and Sergi Bruguera, whose ratio topped out at 1.29. Four other major titlists–Bjorn Borg, Juan Carlos Ferrero, Thomas Muster, and Guillermo Vilas–also exceeded 1.20 at some point during their career. To put Thiem’s specialization in context, though, consider that Guillermo Coria maxed out at 1.19 and Gustavo Kuerten peaked at 1.16. Even Ferrer–the epitome of the clay court specialist to a generation of fans–never exceeded 1.15 once his clay court Elo rating had passed the 2000-point threshold.

The category into which Thiem fits most neatly–specialists who are decidedly middle-of-the-pack on hard courts–largely belongs to an earlier era. When we lower our clay court Elo standard to a career peak of 2000 points, a mark equal to about 15th on tour right now, we’re left with a group of 145 players in the professional era. Of those, 65 (45%) were at some point as lopsided as Thiem is now, with a clay-to-hard rating ratio of at least 1.20. Yet only five of those 65 are active players (Nadal, Thiem, Fognini, Pablo Cuevas, and Nicolas Almagro) and two-thirds of them came before 1995.

In some cases, players with substantially better clay court results learn to compete at a higher level on faster surfaces. Thiem is 24, and Nadal had a similar specialist’s ratio at age 22. Other former greats enjoyed early success on clay and quickly figured out hard courts as well. The Austrian may prove to be a late bloomer in that regard. That’s unlikely, but when Nadal retires or (improbable as it seems) fades, Thiem is poised to rack up titles and emerge as the greatest clay court player of his generation, regardless of whether his hard court game improves.

Unpredictable Bounces, Predictable Results

Italian translation at settesei.it

These days, the grass court season is the awkward stepchild of the tennis calendar. It takes place almost entirely within a single country’s borders, lasts barely a month, and often suffers from the absence of top players who prefer to rest after the French Open.

The small number of grass court events makes the surface problematic for analysts, as well. The surface behaves differently than hard or clay courts and rewards certain playing styles, so it’s reasonable to assume that many players will be particularly good or bad on grass. But with 90% of tour-level matches contested on other surfaces, many players don’t have much of a track record with which we can assess their grass-court prowess.

I was surprised, then, to find that grass court results are rather predictable. Elo-based forecasts of ATP grass court matches are almost as accurate as hard court predictions and considerably more effective than clay court forecasts. Even when we use “pure” surface forecasts–that is, predicting matches using ratings which draw only on results from that surface–grass court forecasts are a bit better than clay court predictions.

I started with a dataset of the roughly 50,000 ATP matches from 2000 through last week, excluding retirements and withdrawals. As a benchmark, I used official ATP rankings to make predictions for each of those matches. 66.6% of them were right, and the Brier score for ATP rankings over that span is .210. (Brier score measures the accuracy of a set of forecasts by averaging the squared error of each individual forecast, so a lower number is better. To put tennis-specific Brier scores in context, in 2016, ATP rankings had a .208 Brier score, and aggregate betting odds had a .189 Brier score.)
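For readers who want to reproduce the two accuracy measures, here is a minimal sketch of how they can be computed. The function names and the toy forecasts are illustrative assumptions, not the dataset described above; each forecast is the probability given to one named player, and each outcome is 1 if that player won and 0 otherwise.

```python
def brier_score(forecasts, outcomes):
    """Average squared error between forecast probabilities and 0/1 results."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty inputs")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def favorite_win_rate(forecasts, outcomes):
    """Share of matches won by the player the forecast favored (p > 0.5)."""
    hits = sum(1 for p, o in zip(forecasts, outcomes) if (p > 0.5) == (o == 1))
    return hits / len(forecasts)


# Toy data (made up for illustration): probability that player A wins,
# and whether player A actually won.
toy_forecasts = [0.7, 0.6, 0.9, 0.55]
toy_outcomes = [1, 0, 1, 1]

print(brier_score(toy_forecasts, toy_outcomes))
print(favorite_win_rate(toy_forecasts, toy_outcomes))
```

One useful reference point: a forecaster who gives every player a 50% chance scores exactly .250, so the .189 to .215 range quoted here is a modest but real improvement over coin-flipping.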

Let’s break that down by surface and compare the performance of ATP rankings, Elo, and surface-specific Elo. “F%” is the percentage of matches won by the favorite, as determined by that system, and “Br” is Brier score:

Surface  ATP F%  ATP Br  Elo F%  Elo Br  sElo F%  sElo Br  
Hard      67.3%   0.207   68.0%   0.205    68.5%    0.202  
Clay      66.1%   0.211   67.1%   0.211    67.0%    0.213  
Grass     66.0%   0.215   67.6%   0.207    68.5%    0.207

All three rating systems do best on hard courts, and for good reason: official rankings and overall Elo are more heavily weighted toward hard court results than toward clay or grass. Surface-specific Elo does best on hard courts for a similar reason: more data.

Already, though, we can see the unexpected divergence of clay and grass courts, especially with surface-specific Elo. It’s possible to explain overall Elo’s better performance on grass courts by the presumed similarity between hard and grass–if a player excels on one, he’s probably good on the other, even if he’s horrible on clay. But that doesn’t explain sElo doing better on grass than on clay. There are 3.3 times as many tour-level matches on clay as on grass, so even allowing for the fact that players choose schedules to suit their surface preferences, almost everyone is going to have more results on dirt than on turf. More data should give us better results, but not here.

We can improve our forecasts even more by blending surface-specific ratings with overall ratings. After testing a wide range of possible mixes, it turns out that equally weighting Elo and sElo provides close to the best results. (The differences between, say, 60/40 and 50/50 are extremely small on all surfaces, so even where 60/40 is a bit better, I prefer to keep it simple with a half-and-half mix.) Here are the results for weighted surface Elos for all three surfaces:
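A minimal sketch of the half-and-half mix, under two assumptions the post does not spell out: that the blend is an average of the two ratings (rather than of two separate probabilities), and that ratings are converted to a win probability with the standard logistic Elo formula on a 400-point scale. The ratings in the example are hypothetical.

```python
def blended_rating(overall_elo, surface_elo, surface_weight=0.5):
    """Weighted average of overall Elo and surface-specific Elo (sElo)."""
    return surface_weight * surface_elo + (1.0 - surface_weight) * overall_elo


def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectation for player A (assumed, not taken from the post)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical players: (overall Elo, grass-only Elo)
player_a = blended_rating(2100, 1900)
player_b = blended_rating(1950, 1950)

print(elo_win_prob(player_a, player_b))
```

Changing `surface_weight` to 0.6 or 0.4 is the kind of alternative mix the testing above refers to.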

Surface  wElo F%  wElo Br  
Hard       68.6%    0.202  
Clay       68.0%    0.207  
Grass      69.8%    0.196

Now grass courts are the most predictable of the major surfaces! Even when we use a weighted average of Elo and sElo, grass court forecasts rely on less data than those of the other surfaces–the surface-specific half of the grass court forecasts uses less than one-third the match results of clay court predictions and less than one-fifth the results of hard court forecasts. In fact, we can do at least as well–and perhaps a tiny bit better–with even less data: A 50/50 weighting of grass-specific Elo and hard-specific Elo is just as accurate as the half-and-half mix of grass-specific and overall Elo.

Regardless of the exact formula, it remains striking that we can predict ATP grass court results so accurately from such limited data. Even if one-third of ATP events were played on grass, I still wouldn’t have been surprised if grass court results turned out to be the least predictable. The more a surface favors the server–and it’s hardest to break on grass–the tighter the scoreline will tend to be, introducing more randomness into the end result. Despite that structural tendency, we’re able to pick winners as successfully on grass as on the more common surfaces.

Here’s my theory: Even though there aren’t many grass court events, the conditions at those few tournaments are quite consistent. Altitude is roughly sea level, groundskeepers follow the lead of the staff at Wimbledon, and rain clouds are almost always in sight. Compare that homogeneity to the variety of hard courts or clay courts. The high-altitude hard courts in Bogota are nothing like the slow ones in Indian Wells. The “clay” in Houston is only nominally equal to the crushed brick of Roland Garros. While grass courts are almost identical to each other, clay courts are nearly as different from each other as they are from other surfaces.

It makes sense that ratings based on a uniform surface would be more accurate than ratings based on a wide range of surfaces, and it’s reassuring to find that the limited available data doesn’t cancel out the advantage. This research also suggests a further path to better forecasts: grouping hard and clay matches by a more precise measure of surface speed. If 10% of tour matches are sufficient to make accurate grass court predictions, the same may be true of the slowest one-third of clay courts. More data is almost always better, but sometimes, precisely targeted data is best of all.

The Proud Tradition of Americans Skipping Monte Carlo

Italian translation at settesei.it

The Monte Carlo Masters is unique among the ATP’s Masters 1000 events. The stakes are high, but attendance isn’t mandatory, so while most of the game’s top players show up, a few take the week off. No group has skipped Monte Carlo as consistently as players from the U.S.A.

This year, six U.S. players had rankings that would’ve gotten them into the Monte Carlo main draw, where winning a single match earns you 45 ranking points and just over €28,000 in prize money. Five of those players–including John Isner, who reached the third round two years ago and won a pair of tough Davis Cup matches at the same venue–opted out. All five played the 250-level Houston tournament last week instead. Only Ryan Harrison made the trip to Europe–losing in the opening round, as Carl Bialik and I safely predicted on this week’s podcast.

Choosing the low-stakes event on home soil isn’t the wise choice, but it’s nothing new. Since 2006, Americans have made only seven Monte Carlo main draw appearances: Isner twice, Harrison, Sam Querrey, Donald Young, Steve Johnson, and Denis Kudla, who qualified in 2015. From 2006 to 2016, 7 of the 11 Monte Carlo draws were entirely USA-free. In the same time span, Houston draws have featured 35 Americans ranked in the top 60–all players who probably would have earned direct entry in the higher-stakes clay event, as well.

For a player like Isner or Jack Sock, an April schedule can handle both tournaments. Four of the seven Americans who went to Monte Carlo played Houston as well, including Querrey in 2008, when he lost in the first round in Houston but reached the final eight in Monte Carlo.

Most U.S. players, including just about everyone I’ve mentioned so far, would much rather play on hard courts than on clay. (The Houston surface is more conducive to aggressive, first-strike tennis than is the Monte Carlo dirt, one of the slowest surfaces on the calendar.) However, as Isner and Querrey have shown, a one-dimensional power game can succeed on a slow court, even if it looks nothing like the strategy of a traditional clay specialist.

Isner, in particular, has racked up plenty of points on the surface. While he’d much rather play on home soil, he has twice reached the fourth round at the French Open and pushed none other than Rafael Nadal to a deciding set in both Paris and Monte Carlo. Sock is also a threat on the surface, having won nearly two-thirds of his tour-level matches on clay. Many of those wins came in Houston, but like Isner, he took a set from Nadal in Europe on the surface the Spaniard typically dominates.

Even if the top Americans had little chance of going deep in Monte Carlo, one wonders what the additional time on the surface would do for the rest of their clay season. Most will show up for Madrid and Rome, and all of them will play Roland Garros. It’s a bit of a chicken-and-egg question–do Americans avoid the dirt because they suck on clay, or do they suck because they avoid it?–but it couldn’t hurt to play on the more traditional European surface against elite-level opponents.

The difference in rewards between a 250 like Houston and a Masters 1000 like Monte Carlo makes it likely that the risk of playing in unfamiliar territory would pay off, as it did for Querrey in his one trip and for Isner two years ago. And I suspect that the rewards would stretch beyond the immediate shot at a bigger payday: If someone like Sock invested more time in developing his clay-court game now, he could become a legitimate threat at a faster clay tournament (such as the Madrid Masters) in a few years. It’s probably too late for the likes of Querrey, but the next generation of U.S. men’s stars would do well to break with tradition and give themselves more chances to excel on the dirt.

The Speed of Every Surface, 2016 Edition

More than five years after I first started trying to use ATP match stats to estimate surface speed, the issue remains a contentious one. Most commentators agree that surface speeds have converged and generally gotten slower. The ATP has begun to release a trickle of court speed data, but it raises more questions than it answers.

It’s been three years since I’ve published surface speed numbers, so we’re due for an update. Before we do that, it’s important to understand what exactly these figures mean, as well as their limitations.

Court surfaces–and, more broadly, the environments in which pro matches are played–have a variety of characteristics. Some courts are faster or slower and some cause higher or lower bounces. Tournaments use different balls, are played at a range of elevations, and take place in all sorts of weather conditions. All of these factors, and more, affect how matches are played.

Due to the limits of available tennis data, however, we can’t isolate those different factors. It would be great to know which surfaces allowed for the most effective slice approaches or the deadliest drop shots, but we don’t have the data to even begin trying to answer those questions. The Match Charting Project is a step in the right direction, but with only a few hundred men’s matches per year, there isn’t quite enough to compare surfaces while controlling for different players and playing styles.

So we work with what we have. Faster surfaces are more favorable to the server, which shows up in ace counts and service breaks. The ATP publishes those basic stats for every match, so that’s what we’ll use. When I first researched this issue, I discovered that there isn’t much difference between counting aces and counting service breaks, except that there’s a wider variation in ace rates between faster and slower surfaces, so the resulting numbers are easier to understand.

At the risk of repeating myself: Measuring surface speed by ace rate ignores a lot of court characteristics. It is far from complete and certainly imperfect. It does, however, give us an idea of how tournaments compare in one important regard.

Aces, adjusted

That said, simply counting aces–for example, 6.8% of points in Buenos Aires this year and 11.2% of points in Los Cabos–isn’t good enough. Players make scheduling choices based on their strengths and preferences, so the guys who show up for clay court events tend, on average, to be weaker servers than those who play on hard and grass courts. To take an extreme example, Gilles Muller managed to play only two matches on clay this season. As it turns out, the courts in Buenos Aires and Los Cabos had almost identical effects on ace rates–the difference is entirely due to the mix of players in each draw.

So we adjust for the makeup of the field. For every player with at least three tour-level matches on clay and another three on hard or grass, I calculated their season average ace rates on clay and hard/grass, which I then weighted (one-third clay, two-thirds hard/grass) so that the numbers give us an idea of what their ace rate would’ve been had they played an “average” (that is, unbiased by scheduling preferences) season. I’ve lumped hard and grass together here, not because they are the same–of course they’re not–but because the small number of grass court events makes it difficult to treat grass on its own.
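The weighting step is simple enough to sketch. The function name, argument layout, and sample numbers below are illustrative assumptions; only the one-third/two-thirds split comes from the text.

```python
def neutral_ace_rate(clay_aces, clay_points, fast_aces, fast_points,
                     clay_weight=1.0 / 3.0):
    """Ace rate a player would post in a schedule-neutral season.

    clay_*  : aces and service points in tour-level clay matches
    fast_*  : aces and service points on hard and grass combined
    """
    clay_rate = clay_aces / clay_points
    fast_rate = fast_aces / fast_points
    return clay_weight * clay_rate + (1.0 - clay_weight) * fast_rate


# Hypothetical server: 5% aces on clay, 10% on hard/grass
print(neutral_ace_rate(30, 600, 120, 1200))
```

The same calculation, applied to the aces hit against each player, would give the return-side number needed for the match-level predictions.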

With player averages in hand, we can go through every match of the season (between players who meet our minimums) and, using their ace rates and the rates at which players hit aces against them, calculate a “predicted” ace rate for the match, given a neutral surface. Then, by comparing the match’s actual ace rate to the neutral prediction, we get one data point regarding the surface’s effect on aces. If the actual ace rate is greater than the prediction, it suggests the surface is faster than average. If the prediction is greater than the ace rate, it implies the surface is slower than average.

No single match can tell us about a court’s tendency, but by aggregating all the matches at an event, we get a fairly good idea. With that final step, we get a single number per event. A neutral surface rates at 1, faster surfaces are greater than 1, and slower surfaces are less than 1. For instance, this algorithm rates the 2016 Paris Masters as 1.18, meaning that there were 18% more aces than we would expect on a neutral surface, rating Bercy as faster than all but 10 other events this season.
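Here is a rough sketch of the predict-then-aggregate idea. The post does not give the exact formulas, so two pieces are assumptions: the predicted ace rate is taken as the server's neutral rate scaled by how the returner's aces-against rate compares to tour average, and the event rating is taken as total actual aces divided by total predicted aces. The record layout and names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServeLine:
    """One player's serving half of a match (hypothetical record layout)."""
    aces: int
    serve_points: int
    server_ace_rate: float       # server's neutral-season ace rate
    returner_ace_against: float  # neutral rate of aces hit against the returner


def predicted_aces(line, tour_ace_rate):
    """Aces expected on a neutral surface (assumed multiplicative adjustment)."""
    rate = line.server_ace_rate * line.returner_ace_against / tour_ace_rate
    return rate * line.serve_points


def event_rating(lines, tour_ace_rate):
    """Actual aces over predicted aces: 1 is neutral, above 1 is faster."""
    actual = sum(line.aces for line in lines)
    expected = sum(predicted_aces(line, tour_ace_rate) for line in lines)
    return actual / expected


# Two serving lines from one made-up match, with a made-up 8% tour ace rate
lines = [
    ServeLine(aces=12, serve_points=100, server_ace_rate=0.10, returner_ace_against=0.08),
    ServeLine(aces=6, serve_points=100, server_ace_rate=0.06, returner_ace_against=0.08),
]
print(event_rating(lines, tour_ace_rate=0.08))
```

Pooling aces across the whole event before dividing, rather than averaging per-match ratios, keeps short matches from carrying as much weight as long ones. That is a design choice of this sketch, not necessarily of the published ratings.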

Whew! Here are the ace-based surface ratings for the last three seasons of every current tour-level event listed from fastest to slowest:

Tournament            Surface  2016 Ace%  2016  2015  2014  
Shenzhen                 Hard      12.9%  1.54  1.20  1.49  
Quito                    Clay      11.9%  1.50  0.89        
Metz                     Hard      12.6%  1.43  1.28  1.37  
Marseille                Hard      15.3%  1.38  1.28  1.26  
Stuttgart               Grass      13.3%  1.38  1.32  0.89  
Chengdu                  Hard      11.7%  1.27              
Australian Open          Hard      12.3%  1.25  1.19  1.12  
Queen's Club            Grass      14.3%  1.25  1.27  1.26  
Washington               Hard      19.5%  1.24  1.12  1.25  
Cincinnati Masters       Hard      14.2%  1.18  1.04  1.17  
Paris Masters            Hard      13.7%  1.18  1.03  1.03  
Brisbane                 Hard      12.2%  1.16  1.20  1.23  
Canada Masters           Hard      12.6%  1.16  1.08  1.00  
Halle                   Grass      12.2%  1.16  1.12  1.31  
Nottingham              Grass      12.0%  1.15  1.21        
Gstaad                   Clay      10.1%  1.12  0.84  0.77  
Basel                    Hard      10.1%  1.12  1.01  1.20  
Tokyo                    Hard      11.5%  1.12  1.00  1.06  
Chennai                  Hard      10.3%  1.12  0.91  0.65  
Auckland                 Hard      12.9%  1.11  1.21  1.01  
                                                            
Tournament            Surface  2016 Ace%  2016  2015  2014  
Doha                     Hard       8.8%  1.11  1.06  0.83  
Sydney                   Hard      10.5%  1.11  1.32  1.27  
Montpellier              Hard       9.7%  1.10  1.29  1.29  
Shanghai Masters         Hard      10.7%  1.10  1.05  1.34  
Kitzbuhel                Clay       6.9%  1.09  0.85  0.81  
's-Hertogenbosch        Grass      13.2%  1.08  1.06  1.05  
Winston-Salem            Hard      10.4%  1.07  1.33  1.10  
Newport                 Grass      11.0%  1.07  1.26  1.23  
Tour Finals              Hard       9.5%  1.06  0.99  0.89  
Wimbledon               Grass      11.8%  1.06  1.20  1.35  
Rotterdam                Hard       9.8%  1.04  1.19  1.08  
Vienna                   Hard      11.8%  1.02  1.39  1.26  
Memphis                  Hard       8.7%  1.00  1.19  0.94  
Miami Masters            Hard      10.0%  1.00  0.86  1.04  
Sofia                    Hard       8.4%  1.00              
Beijing                  Hard       9.4%  0.99  1.05  0.81  
Atlanta                  Hard      15.5%  0.97  1.35  0.90  
St. Petersburg           Hard       8.1%  0.97  0.98        
Marrakech                Clay       8.5%  0.95              
Olympics                 Hard       7.1%  0.95              
                                                            
Tournament            Surface  2016 Ace%  2016  2015  2014  
Moscow                   Hard       6.6%  0.94  1.08  1.12  
Antwerp                  Hard       8.6%  0.93              
Delray Beach             Hard       9.2%  0.92  0.88  0.93  
US Open                  Hard       8.9%  0.91  1.10  1.10  
Dubai                    Hard       9.4%  0.88  0.93  0.81  
Madrid Masters           Clay       8.6%  0.86  0.85  0.94  
Los Cabos                Hard      11.2%  0.85              
Buenos Aires             Clay       6.8%  0.85  0.78  0.64  
Houston                  Clay      11.5%  0.84  0.76  0.70  
Sao Paulo                Clay       7.1%  0.83  1.03  1.20  
Acapulco                 Hard      10.5%  0.83  0.67  0.98  
Indian Wells Masters     Hard       8.2%  0.83  0.99  0.90  
Stockholm                Hard       7.6%  0.82  1.13  1.15  
Rio de Janeiro           Clay       7.4%  0.81  0.80  0.77  
Estoril                  Clay       7.4%  0.80  0.63  0.62  
Nice                     Clay       6.3%  0.79  0.64  0.74  
Geneva                   Clay       8.3%  0.77  0.78        
Umag                     Clay       5.4%  0.77  0.67  0.76  
Roland Garros            Clay       7.6%  0.77  0.72  0.71  
Rome Masters             Clay       7.2%  0.76  0.94  0.74  
Bucharest                Clay       5.9%  0.71  0.59  0.51  
Munich                   Clay       6.3%  0.71  1.01  0.87  
Monte Carlo Masters      Clay       6.2%  0.70  0.63  0.64  
Istanbul                 Clay       5.7%  0.67  0.83        
Barcelona                Clay       5.4%  0.65  0.70  0.72  
Bastad                   Clay       5.3%  0.65  0.64  1.07  
Hamburg                  Clay       5.7%  0.60  0.62  0.79

As usual, we have an interesting mix of usual suspects and surprises. The top of the list is primarily indoor hard and grass courts, along with the high-altitude clay in Quito and Gstaad. However, in both of the latter cases, those tournaments had lower-than-expected ace rates in 2015. The surface ratings for 250s are particularly volatile because, in addition to the small number of matches, many of these matches must be discarded because one or both of the players didn’t meet our minimums. For the 2015 Quito event, we have only 11 matches to work with.

The sample size problem doesn’t apply to larger events, however, so we can have a fair amount of confidence in the ratings for the Australian Open, showing up here as the fastest of the Grand Slams–considerably faster than Wimbledon, which is only a few ticks above neutral.

Ace ratings and Court Pace Index

Last month, TennisTV released some data on court speed for this season’s Masters events. Court Pace Index (CPI) is a commonly accepted measure of the speed of the surface itself–that is, the physical makeup of the court. As I’ve said, that’s far from the only factor affecting how a court plays, but it is an important one.

Here’s how my surface ratings compare to CPI:

Tournament            Surface  TA Rating   CPI  
Cincinnati Masters       Hard       1.18  35.1  
Paris Masters            Hard       1.18  39.1  
Canada Masters           Hard       1.16  35.2  
Shanghai Masters         Hard       1.10  44.1  
Tour Finals              Hard       1.06  40.6  
Miami Masters            Hard       1.00  33.1  
Madrid Masters           Clay       0.86  22.5  
Indian Wells Masters     Hard       0.83  30.0  
Rome Masters             Clay       0.76  24.0  
Monte Carlo Masters      Clay       0.70  23.7

It’s noteworthy that Madrid is, by my measure, the most ace-friendly of the three clay-court Masters, while its CPI is the lowest. Altitude could account for the difference.

The biggest mismatch, though, is the Tour Finals. The O2 Arena has one of the highest CPIs, but it doesn’t rate very far above average in aces. The Tour Finals has always been a bit problematic, as there is an unusually small number of matches, and the level of returning is very, very high. My algorithm takes into account how well each player prevents aces, but perhaps that issue is more complex when our view is limited to only the very best players.
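To put a rough number on how closely the two measures track each other, one option is a Pearson correlation over the ten events in the table above. This is a descriptive sketch only: the choice of Pearson correlation is an assumption, and ten data points are too few to support strong conclusions.

```python
from math import sqrt

# Values copied from the TA Rating / CPI table, in the same row order
ta_rating = [1.18, 1.18, 1.16, 1.10, 1.06, 1.00, 0.86, 0.83, 0.76, 0.70]
cpi = [35.1, 39.1, 35.2, 44.1, 40.6, 33.1, 22.5, 30.0, 24.0, 23.7]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(ta_rating, cpi)
print(f"Pearson r = {r:.2f}")
```

Dropping a single row, such as the Tour Finals, and rerunning shows how much one event moves the result with a sample this small.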

TennisTV also showed CPI for the last several years of Tour Finals. Here is how those figures compare to my ratings:

Year  TA Rating   CPI  
2016       1.06  40.6  
2015       0.99  34.0  
2014       0.89  33.6  
2013       0.90  32.8  
2012       1.18  33.9

If the table cut off after 2013, it would look like a relatively good fit. As it is, the relationship between CPI and my rating for 2012 wouldn’t be out of place in the previous table, which included a 35.1 CPI for Cincinnati to go with an ace-based rating of 1.18.

I hope that this is a sign of more data to come. If so, we can move beyond approximations based on ace rate to get a better sense of what factors influence play at the ATP level. More data won’t settle the age-old surface speed debates, but it will make them a whole lot more interesting.