Unpredictable Bounces, Predictable Results

These days, the grass court season is the awkward stepchild of the tennis calendar. It takes place almost entirely within a single country’s borders, lasts barely a month, and often suffers from the absence of top players who prefer to rest after the French Open.

The small number of grass court events makes the surface problematic for analysts, as well. The surface behaves differently than hard or clay courts and rewards certain playing styles, so it’s reasonable to assume that many players will be particularly good or bad on grass. But with 90% of tour-level matches contested on other surfaces, many players don’t have much of a track record with which we can assess their grass-court prowess.

I was surprised, then, to find that grass court results are rather predictable. Elo-based forecasts of ATP grass court matches are almost as accurate as hard court predictions and considerably more effective than clay court forecasts. Even when we use “pure” surface forecasts–that is, predicting matches using ratings which draw only on results from that surface–grass court forecasts are a bit better than clay court predictions.

I started with a dataset of the roughly 50,000 ATP matches from 2000 through last week, excluding retirements and withdrawals. As a benchmark, I used official ATP rankings to make predictions for each of those matches. 66.6% of them were right, and the Brier score for ATP rankings over that span is .210. (Brier score measures the accuracy of a set of forecasts by averaging the squared error of each individual forecast, so a lower number is better. To put tennis-specific Brier scores in context, in 2016, ATP rankings had a .208 Brier score, and aggregate betting odds had a .189 Brier score.)
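
For readers who want to see the two metrics concretely, here is a minimal Python sketch of a Brier score and a favorite-win percentage. The function names and the four sample forecasts are hypothetical, made up for illustration, and are not drawn from the dataset described above.

```python
def brier_score(forecasts, outcomes):
    """Average squared error between forecast probabilities and 0/1 results.

    forecasts: probability that player A wins each match.
    outcomes:  1 if player A won, 0 otherwise.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def favorite_win_rate(forecasts, outcomes):
    """Share of matches won by whichever player the forecast favored."""
    hits = sum(1 for p, o in zip(forecasts, outcomes) if (p > 0.5) == (o == 1))
    return hits / len(forecasts)


# Hypothetical forecasts for four matches (not real data).
sample_forecasts = [0.7, 0.6, 0.9, 0.55]
sample_outcomes = [1, 0, 1, 1]
print(brier_score(sample_forecasts, sample_outcomes))
print(favorite_win_rate(sample_forecasts, sample_outcomes))
```

A forecaster who always says 50% scores exactly 0.25 on this measure, which is why figures around 0.20 represent a real but modest edge.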

Let’s break that down by surface and compare the performance of ATP rankings, Elo, and surface-specific Elo. “F%” is the percentage of matches won by the favorite, as determined by that system, and “Br” is the Brier score:

Surface  ATP F%  ATP Br  Elo F%  Elo Br  sElo F%  sElo Br  
Hard      67.3%   0.207   68.0%   0.205    68.5%    0.202  
Clay      66.1%   0.211   67.1%   0.211    67.0%    0.213  
Grass     66.0%   0.215   67.6%   0.207    68.5%    0.207

All three rating systems do best on hard courts, and for good reason: official rankings and overall Elo are more heavily weighted toward hard court results than toward clay or grass. Surface-specific Elo does best on hard courts for a similar reason: more data.

Already, though, we can see the unexpected divergence of clay and grass courts, especially with surface-specific Elo. Overall Elo’s better performance on grass courts can be explained by the presumed similarity between hard and grass–if a player excels on one, he’s probably good on the other, even if he’s horrible on clay. But that doesn’t explain sElo doing better on grass than on clay. There are 3.3 times as many tour-level matches on clay as on grass, so even allowing for the fact that players choose schedules to suit their surface preferences, almost everyone is going to have more results on dirt than on turf. More data should give us better results, but not here.

We can improve our forecasts even more by blending surface-specific ratings with overall ratings. After testing a wide range of possible mixes, it turns out that equally weighting Elo and sElo provides close to the best results. (The differences between, say, 60/40 and 50/50 are extremely small on all surfaces, so even where 60/40 is a bit better, I prefer to keep it simple with a half-and-half mix.) Here are the results for weighted surface Elos for all three surfaces:

Surface      F%      Br  
Hard      68.6%   0.202  
Clay      68.0%   0.207  
Grass     69.8%   0.196

Now grass courts are the most predictable of the major surfaces! Even when we use a weighted average of Elo and sElo, grass court forecasts rely on less data than those of the other surfaces–the surface-specific half of the grass court forecasts uses less than one-third the match results of clay court predictions and less than one-fifth the results of hard court forecasts. In fact, we can do at least as well–and perhaps a tiny bit better–with even less data: A 50/50 weighting of grass-specific Elo and hard-specific Elo is just as accurate as the half-and-half mix of grass-specific and overall Elo.
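
The half-and-half mix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the textbook Elo expectation formula with a 400-point scale, and it blends the two ratings before computing a probability. Neither detail is confirmed above, and all function names and ratings here are hypothetical.

```python
def elo_win_prob(rating_a, rating_b):
    """Textbook Elo expectation that player A beats player B (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def blended_rating(overall, surface, surface_weight=0.5):
    """Mix overall Elo with surface-specific Elo; 0.5 is the half-and-half mix."""
    return surface_weight * surface + (1.0 - surface_weight) * overall


def blended_win_prob(a_overall, a_surface, b_overall, b_surface, surface_weight=0.5):
    """Win probability for A, assuming the blend is applied to ratings (an assumption)."""
    a = blended_rating(a_overall, a_surface, surface_weight)
    b = blended_rating(b_overall, b_surface, surface_weight)
    return elo_win_prob(a, b)


# Hypothetical ratings: A is strong overall but weaker on grass.
print(blended_win_prob(2000, 1800, 1500, 1500))
```

Setting `surface_weight` to 0.6 or 0.4 gives the other mixes mentioned above. Swapping hard-court Elo in for the `overall` argument gives the grass/hard variant.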

Regardless of the exact formula, it remains striking that we can predict ATP grass court results so accurately from such limited data. Even if one-third of ATP events were played on grass, I still wouldn’t have been surprised if grass court results turned out to be the least predictable. The more a surface favors the server–and it’s hardest to break on grass–the tighter the scoreline will tend to be, introducing more randomness into the end result. Despite that structural tendency, we’re able to pick winners as successfully on grass as on the more common surfaces.

Here’s my theory: Even though there aren’t many grass court events, the conditions at those few tournaments are quite consistent. Altitude is roughly sea level, groundskeepers follow the lead of the staff at Wimbledon, and rain clouds are almost always in sight. Compare that homogeneity to the variety of hard courts or clay courts. The high-altitude hard courts in Bogota are nothing like the slow ones in Indian Wells. The “clay” in Houston is only nominally equal to the crushed brick of Roland Garros. While grass courts are almost identical to each other, clay courts are nearly as different from each other as they are from other surfaces.

It makes sense that ratings based on a uniform surface would be more accurate than ratings based on a wide range of surfaces, and it’s reassuring to find that the limited available data doesn’t cancel out the advantage. This research also suggests a further path to better forecasts: grouping hard and clay matches by a more precise measure of surface speed. If 10% of tour matches are sufficient to make accurate grass court predictions, the same may be true of the slowest one-third of clay courts. More data is almost always better, but sometimes, precisely targeted data is best of all.

The Proud Tradition of Americans Skipping Monte Carlo

The Monte Carlo Masters is unique among the ATP’s Masters 1000 series events. The stakes are high, but attendance isn’t mandatory, so while most of the game’s top players show up, a few take the week off. No group has skipped Monte Carlo as consistently as players from the U.S.A.

This year, six U.S. players had rankings that would’ve gotten them into the Monte Carlo main draw, where winning a single match earns you 45 ranking points and just over €28,000 in prize money. Five of those players–including John Isner, who reached the third round two years ago and won a pair of tough Davis Cup matches at the same venue–opted out. All five played the 250-level Houston tournament last week instead. Only Ryan Harrison made the trip to Europe–losing in the opening round, as Carl Bialik and I safely predicted on this week’s podcast.

Choosing the low-stakes event on home soil isn’t the wise choice, but it’s nothing new. Since 2006, only seven Americans have appeared in a Monte Carlo main draw: Isner twice, Harrison, Sam Querrey, Donald Young, Steve Johnson, and Denis Kudla, who qualified in 2015. From 2006 to 2016, 7 of the 11 Monte Carlo draws were entirely USA-free. In the same time span, Houston draws have featured 35 Americans ranked in the top 60–all players who probably would have earned direct entry in the higher-stakes clay event, as well.

For a player like Isner or Jack Sock, an April schedule can handle both tournaments. Four of the seven Americans who went to Monte Carlo played Houston as well, including Querrey in 2008, when he lost in the first round in Houston but reached the final eight in Monte Carlo.

Most U.S. players, including just about everyone I’ve mentioned so far, would much rather play on hard courts than on clay.  (The Houston surface is more conducive to aggressive, first-strike tennis than is the Monte Carlo dirt, one of the slowest surfaces on the calendar.) However, as Isner and Querrey have shown, a one-dimensional power game can succeed on a slow court, even if it looks nothing like the strategy of a traditional clay specialist.

Isner, in particular, has racked up plenty of points on the surface. While he’d much rather play on home soil, he has twice reached the fourth round at the French Open and pushed none other than Rafael Nadal to a deciding set in both Paris and Monte Carlo. Sock is also a threat on the surface, having won nearly two-thirds of his tour-level matches on clay. Many of those wins came in Houston, but like Isner, he took a set from Nadal in Europe on the surface the Spaniard typically dominates.

Even if the top Americans had little chance of going deep in Monte Carlo, one wonders what the additional time on the surface would do for the rest of their clay season. Most will show up for Madrid and Rome, and all of them will play Roland Garros. It’s a bit of a chicken-and-egg question–do Americans avoid the dirt because they suck on clay, or do they suck because they avoid it?–but it couldn’t hurt to play on the more traditional European surface against elite-level opponents.

The difference in rewards between a 250 like Houston and a Masters 1000 like Monte Carlo makes it likely that the risk of playing in unfamiliar territory would pay off, as it did for Querrey in his one trip and for Isner two years ago. And I suspect that the rewards would stretch beyond the immediate shot at a bigger payday: If someone like Sock invested more time in developing his clay-court game now, he could become a legitimate threat at a faster clay tournament (such as the Madrid Masters) in a few years. It’s probably too late for the likes of Querrey, but the next generation of U.S. men’s stars would do well to break with tradition and give themselves more chances to excel on the dirt.

The Speed of Every Surface, 2016 Edition

More than five years after I first started trying to use ATP match stats to estimate surface speed, the issue remains a contentious one. Most commentators agree that surface speeds have converged and generally gotten slower. The ATP has begun to release a trickle of court speed data, but it raises more questions than answers.

It’s been three years since I’ve published surface speed numbers, so we’re due for an update. Before we do that, it’s important to understand what exactly these figures mean, as well as their limitations.

Court surfaces–and, more broadly, the environments in which pro matches are played–have a variety of characteristics. Some courts are faster or slower and some cause higher or lower bounces. Tournaments use different balls, are played at a range of elevations, and take place in all sorts of weather conditions. All of these factors, and more, affect how matches are played.

Due to the limits of available tennis data, however, we can’t isolate those different factors. It would be great to know which surfaces allowed for the most effective slice approaches or the deadliest drop shots, but we don’t have the data to even begin trying to answer those questions. The Match Charting Project is a step in the right direction, but with only a few hundred men’s matches per year, there isn’t quite enough to compare surfaces while controlling for different players and playing styles.

So we work with what we have. Faster surfaces are more favorable to the server, which shows up in ace counts and service breaks. The ATP publishes those basic stats for every match, so that’s what we’ll use. When I first researched this issue, I discovered that there isn’t much difference between counting aces and counting service breaks, except that there’s a wider variation in ace rates between faster and slower surfaces, so the resulting numbers are easier to understand.

At the risk of repeating myself: Measuring surface speed by ace rate ignores a lot of court characteristics. It is far from complete and certainly imperfect. It does, however, give us an idea of how tournaments compare in one important regard.

Aces, adjusted

That said, simply counting aces–for example, 6.8% of points in Buenos Aires this year and 11.2% of points in Los Cabos–isn’t good enough. Players make scheduling choices based on their strengths and preferences, so the guys who show up for clay court events tend, on average, to be weaker servers than those who play on hard and grass courts. To take an extreme example, Gilles Muller managed to play only two matches on clay this season. As it turns out, the courts in Buenos Aires and Los Cabos had almost identical effects on ace rates–the difference is entirely due to the mix of players in each draw.

So we adjust for the makeup of the field. For every player with at least three tour-level matches on clay and another three on hard or grass, I calculated their season average ace rates on clay and hard/grass, which I then weighted (one-third clay, two-thirds hard/grass) so that the numbers give us an idea of what their ace rate would’ve been had they played an “average” (that is, unbiased by scheduling preferences) season. I’ve lumped hard and grass together here, not because they are the same–of course they’re not–but because the small number of grass court events makes grass difficult to treat on its own.
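
The weighting step can be written out as a short Python sketch. The one-third/two-thirds split and the three-match minimums come from the paragraph above; the function name, argument names, and sample numbers are hypothetical.

```python
def neutral_ace_rate(clay_rate, fast_rate, clay_matches, fast_matches,
                     min_matches=3, clay_weight=1.0 / 3.0):
    """Schedule-neutral ace rate: one-third clay, two-thirds hard/grass.

    Returns None when the player misses the minimum on either surface group,
    mirroring the three-match thresholds described in the text.
    """
    if clay_matches < min_matches or fast_matches < min_matches:
        return None
    return clay_weight * clay_rate + (1.0 - clay_weight) * fast_rate


# Hypothetical player: 6% aces on clay, 12% on hard/grass.
print(neutral_ace_rate(0.06, 0.12, clay_matches=10, fast_matches=30))

# A player with only two clay matches is excluded from the calculation.
print(neutral_ace_rate(0.05, 0.15, clay_matches=2, fast_matches=40))
```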

With player averages in hand, we can go through every match of the season (between players who meet our minimums) and, using their ace rates and the rates at which players hit aces against them, calculate a “predicted” ace rate for the match, given a neutral surface. Then, by comparing the match’s actual ace rate to the neutral prediction, we get one data point regarding the surface’s effect on aces. If the actual ace rate is greater than the prediction, it suggests the surface is faster than average. If the prediction is greater than the ace rate, it implies the surface is slower than average.

No single match can tell us about a court’s tendency, but by aggregating all the matches at an event, we get a fairly good idea. With that final step, we get a single number per event. A neutral surface rates at 1, faster surfaces are greater than 1, and slower surfaces are less than 1. For instance, this algorithm rates the 2016 Paris Masters as 1.18, meaning that there were 18% more aces than we would expect on a neutral surface, rating Bercy as faster than all but 10 other events this season.
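
To make the aggregation concrete, here is a rough Python sketch. The exact formula for the "predicted" ace rate is not spelled out above, so this version assumes one plausible choice: the server's neutral ace rate, scaled by how the returner's ace-allowed rate compares with the tour average. Pooling aces across the whole event is likewise an assumption, and every name and number below is hypothetical.

```python
def predicted_aces(server_rate, returner_allowed_rate, tour_rate, service_points):
    """Expected aces on a neutral surface (assumed formula, not the author's)."""
    neutral_rate = server_rate * returner_allowed_rate / tour_rate
    return neutral_rate * service_points


def surface_rating(serve_records, tour_rate):
    """Ratio of actual to predicted aces, pooled over an event.

    serve_records: iterable of
        (server_rate, returner_allowed_rate, service_points, aces)
    with one entry per player per match.
    Returns 1.0 for a neutral court, above 1.0 for a faster one.
    """
    actual = 0.0
    expected = 0.0
    for server_rate, allowed_rate, points, aces in serve_records:
        actual += aces
        expected += predicted_aces(server_rate, allowed_rate, tour_rate, points)
    if expected == 0:
        raise ValueError("no predicted aces; cannot rate this event")
    return actual / expected


# Hypothetical event with two serving performances.
records = [
    (0.10, 0.08, 100, 12),  # big server: 10 aces predicted, 12 hit
    (0.06, 0.08, 100, 7),   # weaker server: 6 predicted, 7 hit
]
print(surface_rating(records, tour_rate=0.08))
```

In this made-up example the event produces 19 aces against roughly 16 predicted, so the court rates a little under 1.19, in the same neighborhood as the Bercy figure quoted above.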

Whew! Here are the ace-based surface ratings for the last three seasons of every current tour-level event listed from fastest to slowest:

Tournament            Surface  2016 Ace%  2016  2015  2014  
Shenzhen                 Hard      12.9%  1.54  1.20  1.49  
Quito                    Clay      11.9%  1.50  0.89        
Metz                     Hard      12.6%  1.43  1.28  1.37  
Marseille                Hard      15.3%  1.38  1.28  1.26  
Stuttgart               Grass      13.3%  1.38  1.32  0.89  
Chengdu                  Hard      11.7%  1.27              
Australian Open          Hard      12.3%  1.25  1.19  1.12  
Queen's Club            Grass      14.3%  1.25  1.27  1.26  
Washington               Hard      19.5%  1.24  1.12  1.25  
Cincinnati Masters       Hard      14.2%  1.18  1.04  1.17  
Paris Masters            Hard      13.7%  1.18  1.03  1.03  
Brisbane                 Hard      12.2%  1.16  1.20  1.23  
Canada Masters           Hard      12.6%  1.16  1.08  1.00  
Halle                   Grass      12.2%  1.16  1.12  1.31  
Nottingham              Grass      12.0%  1.15  1.21        
Gstaad                   Clay      10.1%  1.12  0.84  0.77  
Basel                    Hard      10.1%  1.12  1.01  1.20  
Tokyo                    Hard      11.5%  1.12  1.00  1.06  
Chennai                  Hard      10.3%  1.12  0.91  0.65  
Auckland                 Hard      12.9%  1.11  1.21  1.01  
Doha                     Hard       8.8%  1.11  1.06  0.83  
Sydney                   Hard      10.5%  1.11  1.32  1.27  
Montpellier              Hard       9.7%  1.10  1.29  1.29  
Shanghai Masters         Hard      10.7%  1.10  1.05  1.34  
Kitzbuhel                Clay       6.9%  1.09  0.85  0.81  
s-Hertogenbosch         Grass      13.2%  1.08  1.06  1.05  
Winston-Salem            Hard      10.4%  1.07  1.33  1.10  
Newport                 Grass      11.0%  1.07  1.26  1.23  
Tour Finals              Hard       9.5%  1.06  0.99  0.89  
Wimbledon               Grass      11.8%  1.06  1.20  1.35  
Rotterdam                Hard       9.8%  1.04  1.19  1.08  
Vienna                   Hard      11.8%  1.02  1.39  1.26  
Memphis                  Hard       8.7%  1.00  1.19  0.94  
Miami Masters            Hard      10.0%  1.00  0.86  1.04  
Sofia                    Hard       8.4%  1.00              
Beijing                  Hard       9.4%  0.99  1.05  0.81  
Atlanta                  Hard      15.5%  0.97  1.35  0.90  
St.Petersburg            Hard       8.1%  0.97  0.98        
Marrakech                Clay       8.5%  0.95              
Olympics                 Hard       7.1%  0.95              
Moscow                   Hard       6.6%  0.94  1.08  1.12  
Antwerp                  Hard       8.6%  0.93              
Delray Beach             Hard       9.2%  0.92  0.88  0.93  
US Open                  Hard       8.9%  0.91  1.10  1.10  
Dubai                    Hard       9.4%  0.88  0.93  0.81  
Madrid Masters           Clay       8.6%  0.86  0.85  0.94  
Los Cabos                Hard      11.2%  0.85              
Buenos Aires             Clay       6.8%  0.85  0.78  0.64  
Houston                  Clay      11.5%  0.84  0.76  0.70  
Sao Paulo                Clay       7.1%  0.83  1.03  1.20  
Acapulco                 Hard      10.5%  0.83  0.67  0.98  
Indian Wells Masters     Hard       8.2%  0.83  0.99  0.90  
Stockholm                Hard       7.6%  0.82  1.13  1.15  
Rio de Janeiro           Clay       7.4%  0.81  0.80  0.77  
Estoril                  Clay       7.4%  0.80  0.63  0.62  
Nice                     Clay       6.3%  0.79  0.64  0.74  
Geneva                   Clay       8.3%  0.77  0.78        
Umag                     Clay       5.4%  0.77  0.67  0.76  
Roland Garros            Clay       7.6%  0.77  0.72  0.71  
Rome Masters             Clay       7.2%  0.76  0.94  0.74  
Bucharest                Clay       5.9%  0.71  0.59  0.51  
Munich                   Clay       6.3%  0.71  1.01  0.87  
Monte Carlo Masters      Clay       6.2%  0.70  0.63  0.64  
Istanbul                 Clay       5.7%  0.67  0.83        
Barcelona                Clay       5.4%  0.65  0.70  0.72  
Bastad                   Clay       5.3%  0.65  0.64  1.07  
Hamburg                  Clay       5.7%  0.60  0.62  0.79

As usual, we have an interesting mix of usual suspects and surprises. The top of the list is primarily indoor hard and grass courts, along with the high-altitude clay in Quito and Gstaad. However, in both of the latter cases, those tournaments had lower-than-expected ace rates in 2015. The surface ratings for 250s are particularly volatile because, in addition to the small number of matches, many of these matches must be discarded because one or both of the players didn’t meet our minimums. For the 2015 Quito event, we have only 11 matches to work with.

The sample size problem doesn’t apply to larger events, however, so we can have a fair amount of confidence in the ratings for the Australian Open, showing up here as the fastest of the Grand Slams–considerably faster than Wimbledon, which is only a few ticks above neutral.

Ace ratings and Court Pace Index

Last month, TennisTV released some data on court speed for this season’s Masters events. Court Pace Index (CPI) is a commonly accepted measure of the speed of the surface itself–that is, the physical makeup of the court. As I’ve said, that’s far from the only factor affecting how a court plays, but it is an important one.


Here’s how my surface ratings compare to CPI:

Tournament            Surface  TA Rating   CPI  
Cincinnati Masters       Hard       1.18  35.1  
Paris Masters            Hard       1.18  39.1  
Canada Masters           Hard       1.16  35.2  
Shanghai Masters         Hard       1.10  44.1  
Tour Finals              Hard       1.06  40.6  
Miami Masters            Hard       1.00  33.1  
Madrid Masters           Clay       0.86  22.5  
Indian Wells Masters     Hard       0.83  30.0  
Rome Masters             Clay       0.76  24.0  
Monte Carlo Masters      Clay       0.70  23.7

It’s noteworthy that Madrid is, by my measure, the most ace-friendly of the three clay-court Masters, while its CPI is the lowest. Altitude could account for the difference.

The biggest mismatch, though, is the Tour Finals. The O2 Arena has one of the highest CPIs, but it doesn’t rate very far above average in aces. The Tour Finals has always been a bit problematic, as there is an unusually small number of matches, and the level of returning is very, very high. My algorithm takes into account how well each player prevents aces, but perhaps that issue is more complex when our view is limited to only the very best players.

TennisTV also showed CPI for the last several years of Tour Finals:


Compared to my ratings:

Year  TA Rating   CPI  
2016       1.06  40.6  
2015       0.99  34.0  
2014       0.89  33.6  
2013       0.90  32.8  
2012       1.18  33.9

If the table cut off after 2013, it would look like a relatively good fit. As it is, the relationship between CPI and my rating for 2012 wouldn’t be out of place in the previous table, which included a 35.1 CPI for Cincinnati to go with an ace-based rating of 1.18.

I hope that this is a sign of more data to come. If so, we can move beyond approximations based on ace rate to get a better sense of what factors influence play at the ATP level. More data won’t settle the age-old surface speed debates, but it will make them a whole lot more interesting.

The Grass is Slowing: Another Look at Surface Speed Convergence

A few years ago, I posted one of my most-read and most-debated articles, called The Mirage of Surface Speed Convergence.  Using the ATP’s data on ace rates and breaks of serve going back to 1991, it argued that surface speeds aren’t really converging, at least to the extent we can measure them with those two tools.

One of the most frequent complaints was that I was looking at the wrong data–surface speed should really be quantified by rally length, spin rate, or any number of other things. As is so often the case with tennis analytics, we have only so much choice in the matter. At the time, I was using all the data that existed.

Thanks to the Match Charting Project–with a particular tip of the cap to Edo Salvati–a lot more data is available now. We have shot-by-shot stats for 223 Grand Slam finals, including over three-fourths of Slam finals back to 1980. While we’ll never be able to measure anything like ITF Court Pace Rating for surfaces thirty years in the past, this shot-by-shot data allows us to get closer to the truth of the matter.

Sure enough, when we take a look at a simple (but until recently, unavailable) metric such as rally length, we find that the sport’s major surfaces are playing a lot more similarly than they used to. The first graph shows a five-year rolling average* for the rally length in the men’s finals of each Grand Slam from 1985 to 2015:


* since some matches are missing, the five-year rolling averages each represent the mean of anywhere from two to five Slam finals.
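
A rolling average that tolerates missing finals might look like the following Python sketch. It assumes a trailing window (the current year plus the four before it) and requires at least two finals, in line with the footnote. The window alignment, function name, and sample figures are assumptions for illustration only.

```python
def rolling_rally_length(rally_by_year, year, window=5, min_finals=2):
    """Mean rally length over the trailing window, skipping uncharted finals.

    rally_by_year: dict mapping year -> average rally length of that final.
    Returns None when fewer than min_finals are available in the window.
    """
    years = range(year - window + 1, year + 1)
    available = [rally_by_year[y] for y in years if y in rally_by_year]
    if len(available) < min_finals:
        return None
    return sum(available) / len(available)


# Hypothetical data with the 1992 final missing (not real figures).
finals = {1990: 2.0, 1991: 3.0, 1993: 4.0}
print(rolling_rally_length(finals, 1993))  # mean of three available finals
print(rolling_rally_length(finals, 1997))  # only one final in window -> None
```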

Over the last decade and a half, the hard-court and grass-court slams have crept steadily upward, with average rally lengths now similar to those at Roland Garros, traditionally the slowest of the four Grand Slam surfaces. The movement is most dramatic in the Wimbledon grass, which for many years saw an average rally length of a mere two shots.

For all the advantages of rally length and shot-by-shot data, there’s one massive limitation to this analysis: It doesn’t control for player. (My older analysis, with more limited data per match, but for many more matches, was able to control for player.) Pete Sampras contributed to 15 of our data points, but none on clay. Andres Gomez makes an appearance, but only at Roland Garros. Until we have shot-by-shot data on multiple surfaces for more of these players, there’s not much we can do to control for this severe case of selection bias.

So we’re left with something of a chicken-and-egg problem.  Back in the early 90’s, when Roland Garros finals averaged almost six shots per point and Wimbledon finals averaged barely two shots per point, how much of the difference was due to the surface itself, and how much to the fact that certain players reached the final? The surface itself certainly doesn’t account for everything–in 1988, Mats Wilander and Ivan Lendl averaged over seven shots per point at the US Open, and in 2002, David Nalbandian and Lleyton Hewitt topped 5.5 shots per point at Wimbledon.

Still, outliers and selection bias aside, the rally length convergence we see in the graph above reflects a real phenomenon, even if it is amplified by the bias. After all, players who prefer short points win more matches on grass because grass lends itself to short points, and in an earlier era, “short points” meant something more extreme than it does today.

The same graph for women’s Grand Slam finals shows some convergence, though not as much:


Part of the reason that the convergence is more muted is that there’s less selection bias. The all-surface dominance of a few players–Chris Evert, Martina Navratilova, and Steffi Graf–means that, if only by historical accident, there is less bias than in men’s finals.

We still need a lot more data before we can make confident statements about surface speeds in 20th-century tennis. (You can help us get there by charting some matches!) But as we gather more information, we’re able to better illustrate how the surfaces have become less unique over the years.

The Speed of Every 2013 Surface

Few debates get tennis fans as riled up as the general slowing–or homogenization–of surface speeds.  While indoor tennis (to take a recent example) is a different animal than it was fifteen or twenty years ago, it’s tough to separate the effect of the court itself from the other changes in the game that have taken place in that time.

Further, the “court effect” itself is multi-dimensional.  The surface makes a big difference, as grass will almost always play quicker than a hard court, which will usually play faster than clay.  But as we’ve seen with the persistence of Sao Paulo as one of the fastest-playing events on tour, altitude is a major factor, as is weather, which can slow down a normally speedy tournament, as was the case with Hurricane Irene at the 2011 US Open.  The choice of balls can influence the speed of play as well.

With all of these factors in play, what we often refer to as “surface speed” is really “court speed” or even “playing environment.”  It’s not just the surface.  That said, I’ll continue to use the terms interchangeably.

Because only limited data is available, if we want to quantify surface differences, we must use a proxy for court speed.  What has worked in the past is ace rate–adjusted for the server and returner in each match.  On a fast court–a surface that doesn’t grip the ball; or one like grass with a low, less predictable bounce; or at a high altitude; or in particularly hot weather–a player who normally hits 5% of his service points for aces might see that number increase to 8%.  (Returners influence ace rate as well. A field with Andy Murray will allow fewer aces than a field with Juan Martin del Potro, so I’ve controlled for that as well.)

Aggregate these server- and returner-adjusted ace rates, and at the very least, we have an approximation of which courts on tour are most ace-friendly.  Since most of the characteristics of an ace-friendly court overlap with what we consider to be a fast court, we can use that number as a marker for surface speed.

2013 Court Speed Numbers

For the second year in a row, the high-altitude clay of Sao Paulo was the fastest-playing surface on tour.  The altitude also appears to play a role in making Gstaad quicker than the typical clay.

As for the slowing of indoor courts, the evidence is inconclusive.  The O2 Arena, site of the World Tour Finals, rated as slower than average in 2011 and 2012, on a level with some of the slowest hard courts on tour.  This year, it came out above average, and a three-year weighted average puts the O2 at the exact middle of the ATP court-speed range.

Valencia and the Paris Masters played about as fast as they have in the past, while Marseille remained near the top of the rankings. If there is evidence for a mass slowing of indoor speeds, it comes from some unlikely sources: Both Moscow and San Jose were among the quickest surfaces on tour in 2010 and 2011, but have been right in the middle of the pack for the last two years.

The table below shows the relative ace rate of every tournament for the last four years, along with a weighted average of the last three years.  The weighted average is the most useful number here, especially for the smaller 28- and 32-player events.  The limited extent of a 31-match tournament can amplify the anomalous performance of one player–as you can see from some of the bigger year-to-year movements.  But over the course of three years, individual outliers have less impact.

The “Sf” column is each event’s surface: “C” for clay, “H” for hard, and “G” for grass.  The numbers are multipliers, so Sao Paulo’s three-year weighted average of 1.58 means that players at that event hit 58% more aces than they would have on a neutral court.  Monte Carlo’s 0.67 means 33% less than neutral.
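
Reading the multipliers as percentages, and combining three seasons into one figure, can be sketched in Python as follows. The weighting scheme is not stated in this post, so the 1-2-3 weights (oldest to newest) are purely an assumption, and the function names are hypothetical.

```python
def percent_vs_neutral(multiplier):
    """Convert a court-speed multiplier into percent more (or fewer) aces."""
    return (multiplier - 1.0) * 100.0


def weighted_average(ratings, weights=(1, 2, 3)):
    """Weighted mean of yearly ratings, oldest first.

    The default 1-2-3 weights favor recent seasons; they are an assumed
    scheme for illustration, not one taken from the text.
    """
    if len(ratings) != len(weights):
        raise ValueError("need one weight per rating")
    total = sum(r * w for r, w in zip(ratings, weights))
    return total / sum(weights)


print(percent_vs_neutral(1.58))  # about 58% more aces than neutral
print(percent_vs_neutral(0.67))  # about 33% fewer aces than neutral

# Sao Paulo's 2011-2013 figures, as listed in the table.
print(round(weighted_average([1.08, 1.58, 1.74]), 2))
```

With these assumed weights, the Sao Paulo figures land close to the published three-year number. That does not confirm the scheme, since factors such as the number of matches each year may also enter the real calculation.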

Event            Sf  10 A%  11 A%  12 A%  13 A%   3yr  
Sao Paulo        C    1.44   1.08   1.58   1.74  1.58  
Marseille        H    1.09   1.24   1.41   1.26  1.30  
Halle            G    1.20   1.39   1.26   1.20  1.25  
Wimbledon        G    1.36   1.18   1.24   1.25  1.24  
Shanghai         H    0.96   1.05   1.08   1.37  1.22  
Montpellier      H    1.28          1.40   1.16  1.21  
Brisbane         H    1.01   1.20   1.08   1.27  1.19  
Tokyo            H    1.35   0.98   1.17   1.26  1.18  
Gstaad           C    0.87   1.13   0.90   1.35  1.16  
Winston-Salem    H           1.20   1.10   1.18  1.16  

Chennai          H    0.75   0.77   1.21   1.25  1.16  
Valencia         H    1.02   1.10   1.12   1.19  1.15  
Zagreb           H    1.09   1.16   1.20   1.11  1.15  
Washington       H    0.96   0.93   1.34   1.10  1.15  
Vienna           H    1.42   1.22   1.01   1.19  1.14  
Santiago         C    1.23   1.21   0.86   1.29  1.13  
Sydney           H    1.08   1.14   0.94   1.25  1.13  
Atlanta          H    0.92   0.82   1.06   1.26  1.12  
Eastbourne       G    1.07   1.13   0.92   1.22  1.11  
Queen's Club     G    1.07   1.13   1.09   1.12  1.11  

Paris            H    1.38   0.97   1.16   1.12  1.11  
Cincinnati       H    1.09   1.02   1.08   1.13  1.10  
s-Hertogenbosch  G    1.13   1.08   1.03   1.15  1.10  
Auckland         H    1.01   1.08   1.06   1.12  1.09  
Memphis          H    1.08   1.12   0.95   1.09  1.05  
Stuttgart        C    1.09   1.05   1.04   1.06  1.05  
Bogota           H                         1.09  1.05  
Rotterdam        H    0.88   1.21   0.83   1.12  1.04  
Stockholm        H    0.93   0.96   1.15   0.99  1.04  
Basel            H    0.98   1.05   1.16   0.96  1.04  

Bangkok          H    1.20   1.12   0.73   1.19  1.03  
Australian Open  H    0.98   1.10   0.92   1.08  1.03  
US Open          H    1.14   0.93   1.06   1.04  1.03  
San Jose         H    1.21   1.23   0.96   0.99  1.02  
Moscow           H    1.28   1.12   1.01   0.99  1.02  
Dubai            H    1.13   1.07   1.14   0.92  1.02  
Doha             H    0.88   1.29   0.90   0.98  1.00  
Tour Finals      H    1.07   0.93   0.87   1.11  1.00  
Beijing          H    1.01   1.01   1.06   0.94  0.99  
Canada           H    0.99   1.02   1.04   0.95  0.99  

Madrid           C    0.76   0.86   1.19   0.89  0.98  
Kitzbuhel        C           1.12   0.70   1.12  0.98  
Metz             H    1.14   0.96   1.07   0.90  0.97  
Dusseldorf       C                         0.92  0.96  
Munich           C    0.77   0.82   0.91   0.97  0.92  
St. Petersburg   H    1.02   0.84   0.86   0.99  0.92  
Acapulco         C    0.88   0.89   1.06   0.84  0.92  
Delray Beach     H    0.98   1.07   0.92   0.85  0.91  
Newport          G    1.46   0.72   1.04   0.89  0.91  
Kuala Lumpur     H    0.96   0.97   0.81   0.94  0.90  

Miami            H    0.91   0.98   0.86   0.89  0.89  
Umag             C    0.56   0.74   0.67   1.04  0.87  
Hamburg          C    1.04   0.85   0.75   0.92  0.85  
Buenos Aires     C    0.84   0.86   0.93   0.74  0.82  
Indian Wells     H    0.92   0.90   0.86   0.77  0.82  
Roland Garros    C    0.82   0.86   0.81   0.78  0.81  
Barcelona        C    0.73   0.65   0.91   0.78  0.80  
Casablanca       C    0.82   0.91   0.77   0.75  0.79  
Estoril          C    0.62   0.73   0.79   0.71  0.74  

Houston          C    0.85   0.71   0.71   0.77  0.74  
Bucharest        C    0.61   1.08   0.62   0.68  0.73  
Rome             C    0.78   0.67   0.64   0.81  0.73  
Nice             C    0.88   0.84   0.79   0.64  0.72  
Bastad           C    0.93   0.74   0.86   0.58  0.70  
Monte Carlo      C    0.63   0.60   0.71   0.67  0.67
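To make the multipliers concrete, here is a minimal Python sketch of how a weighted three-year figure and its "percent versus neutral" reading could be computed. The 1:2:3 weights (oldest to newest season) are my assumption, not something stated above; they happen to reproduce Sao Paulo's 1.58 from its 2011-13 values, but rows with missing seasons, such as Bogota, do not follow this simple rule, so the actual method presumably involves more than this.

```python
def weighted_average(values, weights):
    """Weighted mean of per-season multipliers."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def percent_vs_neutral(multiplier):
    """Convert a multiplier into percent more (+) or fewer (-) aces than neutral."""
    return (multiplier - 1.0) * 100.0


# Sao Paulo's 2011, 2012 and 2013 multipliers from the table.
sao_paulo = [1.08, 1.58, 1.74]

# ASSUMPTION: the most recent season counts three times as much as the oldest.
ASSUMED_WEIGHTS = [1, 2, 3]

three_year = weighted_average(sao_paulo, ASSUMED_WEIGHTS)
print(round(three_year, 2))              # 1.58
print(round(percent_vs_neutral(1.58)))   # 58
print(round(percent_vs_neutral(0.67)))   # -33
```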

If Surfaces are Converging…

Internet discussion has perked up about a post of mine from last month, The Mirage of Surface Speed Convergence.

Many people don’t like my results, and plenty of people just don’t like having someone challenge their preconceived notions–or those of the players they idolize.

Yet for all the chatter, no one has even attempted to address the question at the end of that post:

If surfaces are converging, why is there a bigger difference in aces now than there was 10, 15, or 20 years ago? Why don’t we see hard-court break rates getting any closer to clay-court break rates?

Unless there is a valid answer to those questions, it really doesn’t matter how you felt after watching the Miami final, or what a top player said in some press conference.

The Mirage of Surface Speed Convergence

Rafael Nadal won Indian Wells. Roger Federer won on the blue clay. Even Alessio Di Mauro won a match on a hard court last week.

That’s just a sliver of the anecdotal evidence for one of the most common complaints about contemporary ATP tennis: Surface speeds are converging. Hard courts used to play faster, allowing for more variety in the game and providing more opportunities to different types of players. Or so the story goes.

This debate skipped the stage of determining whether the convergence is actually happening. The media has moved straight to the more controversial subject of whether it should. (Coincidentally, it’s easier to churn out columns about the latter.)

We can test these things, and we’re going to in a minute.  First, it’s important to clarify what exactly we mean by surface speed, and what we can and cannot learn about it from traditional match statistics.

There are many factors that contribute to how fast a tennis ball moves through the air (altitude, humidity, ball type) and many that affect the nature of the bounce (all of the same, plus surface). If you’re actually on court, hitting balls, you’ll notice a lot of details: how high the ball is bouncing, how fast it seems to come off of your opponent’s racket, how the surface and the atmosphere are affecting spin, and more.  Hawk-Eye allows us to quantify some of those things, but the available data is very limited.

While things like ball bounce and shot speed can be quantified, they haven’t been tracked for long enough to help us here.  We’re stuck with the same old stats — aces, serve percentages, break points, and so on.

Thus, when we talk about “surface speed” or “court speed,” we’re not just talking about the immediate physical characteristics of the concrete, lawn, or dirt.  Instead, we’re referring to how the surface–together with the weather, the altitude, the balls, and a handful of other minor factors–affects play.  I can’t tell you whether balls bounced faster on hard courts in 2012 than in 1992.  But I can tell you that players hit about 25% more aces.

Quantifying the convergence

In what follows, we’ll use two stats: ace rate and break rate.  When courts play faster, there are more aces and fewer breaks of serve.  The slower the court, the more the advantage swings to the returner, limiting free points on serve and increasing the frequency of service breaks.

To compare hard courts to clay courts, I looked for instances where the same pair of players faced off during the same year on both surfaces.  There are plenty–about 100 such pairs for each of the last dozen years, and about 80 per year before that, back to 1991.  Focusing on these head-to-heads prevents us from giving too much weight to players who play almost exclusively on one surface.  Andy Roddick helped increase the ace rate and decrease the break rate on hard courts for years, but he barely influences the clay court numbers, since he skipped so many of those tournaments.

Thus, we’re comparing apples to apples, like the matches this year between David Ferrer and Fabio Fognini.  On clay, Ferrer aced Fognini only once per hundred service points; on hard, he did so six times as often.  Any one matchup could be misleading, but combine 100 of them and you have something worth looking at.  (This methodology, unfortunately, precludes measuring grass-court speed.  There simply aren’t enough matches on grass to give us a reliable sample.)
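The pair-matching step can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the numbers in this post: the record layout (`year`, `surface`, `p1`, `p2`) and the function name are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict


def pairs_on_both_surfaces(matches, surfaces=("C", "H")):
    """Group matches by (year, player pair); keep pairs seen on every surface."""
    grouped = defaultdict(lambda: defaultdict(list))
    for m in matches:
        # Sort the names so A-vs-B and B-vs-A land in the same bucket.
        key = (m["year"], *sorted((m["p1"], m["p2"])))
        grouped[key][m["surface"]].append(m)
    return {
        key: dict(by_surface)
        for key, by_surface in grouped.items()
        if all(s in by_surface for s in surfaces)
    }


# Made-up rows, purely to exercise the function.
matches = [
    {"year": 2012, "surface": "C", "p1": "Player A", "p2": "Player B"},
    {"year": 2012, "surface": "H", "p1": "Player B", "p2": "Player A"},
    {"year": 2012, "surface": "H", "p1": "Player A", "p2": "Player C"},
    {"year": 2011, "surface": "C", "p1": "Player A", "p2": "Player C"},
]

kept = pairs_on_both_surfaces(matches)
print(sorted(kept))  # [(2012, 'Player A', 'Player B')]
```

Players A and C met on both surfaces, but in different years, so that pair is dropped. Only matchups with a clay and a hard meeting in the same season survive.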

Aggregate all the clay court matches and all the hard court matches, and you have overall numbers that can be compared.  For instance, in 2012, service breaks accounted for 22.0% of these games on clay, against 20.5% of games on hard.  Divide one by the other, and we can see that the clay-court break rate is 7.4% higher than its hard-court counterpart.
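As a worked example of that arithmetic, the sketch below pools breaks and service games across matches, then expresses the clay rate relative to the hard rate. The field names and the sample match totals are hypothetical. Plugging in the rounded 2012 rates gives about 7.3%; the 7.4% quoted above presumably comes from the unrounded rates.

```python
def aggregate_break_rate(matches):
    """Breaks of serve as a share of all service games, pooled across matches."""
    breaks = sum(m["breaks"] for m in matches)
    games = sum(m["service_games"] for m in matches)
    return breaks / games


def relative_gap(clay_rate, hard_rate):
    """How much higher (in percent) the clay rate is than the hard rate."""
    return (clay_rate / hard_rate - 1.0) * 100.0


# Made-up match totals, just to show the pooling step.
sample = [
    {"breaks": 5, "service_games": 20},
    {"breaks": 3, "service_games": 20},
]
print(aggregate_break_rate(sample))  # 0.2

# Rounded 2012 break rates quoted in the text: 22.0% on clay, 20.5% on hard.
gap_2012 = relative_gap(22.0, 20.5)
print(f"{gap_2012:.1f}%")  # 7.3%
```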

That’s one of the smallest differences of the last 20 years, but it’s far from the whole story.  Run the same algorithm for every season back to 1991 (the extent of available stats), and you have everything from a 2.8% difference in 2002 to a 32.8% difference in 2003.  Smooth the outliers by calculating five-year moving averages, and you finally get something a bit more meaningful:


The larger the number, the bigger the gap between hard and clay courts.  The most extreme five-year period in this span was 2003-07, when there were 25.4% more breaks on clay courts than on hard courts.  There has been a steady decline since then (to 16.9% for 2008-12), but not to as low a point as the early 90s (14.0% for 1991-1996), and only a bit lower than the turn of the century (17.8% for 1998-2002).  These numbers hardly identify the good old days when men were men and hard courts were hard.
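The five-year smoothing can be sketched as a trailing moving average over the yearly gaps. One assumption to flag: this averages the yearly percentages directly, whereas the figures above could equally have been computed from pooled five-year totals. The input values here are placeholders, not the real yearly numbers.

```python
def moving_average(values, window=5):
    """Trailing moving average; one output per full window."""
    if window <= 0 or window > len(values):
        return []
    return [
        sum(values[end - window:end]) / window
        for end in range(window, len(values) + 1)
    ]


# Placeholder yearly clay-vs-hard gaps (percent), NOT the real figures.
yearly_gap = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

smoothed = moving_average(yearly_gap)
print(smoothed)  # [30.0, 40.0]
```

Six seasons yield two five-year windows, so a single extreme year like 2003 is diluted across every window it falls in rather than dominating the picture.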

When we turn to ace rate, the trend provides even less support for the surface-convergence theory.  Here are the same 5-year averages, representing the difference between hard-court ace rate and clay-court ace rate:


Here again, the widest gap occurred during the 5-year span from 2003 to 2007, when the hard-court ace rate was 51.3% higher than the clay-court ace rate.  Since then, the difference has fallen to 46%, still a relatively large gap, and one that was reached in only two individual seasons before 2003.

If surfaces are converging, why is there a bigger difference in aces now than there was 10, 15, or 20 years ago? Why don’t we see hard-court break rates getting any closer to clay-court break rates?

However fast or high balls are bouncing off of today’s tennis surfaces, courts just aren’t playing any less diversely than they used to.  In the last 20 years, the game has changed in any number of ways, some of which can make hard-court matches look like clay-court contests and vice versa.  But with the profiles of clay and hard courts relatively unchanged over the last 20 years, it’s time for pundits to find something else to complain about.