April 2017 - Heavy Topspin

Why Maria Sharapova Should Get a French Open Wild Card

Maria Sharapova has returned from a 15-month doping suspension and hardly missed a step, advancing to the semifinals of her first tournament back, in the WTA Premier event in Stuttgart. While the draw has done her some favors–Ekaterina Makarova knocked out Agnieszka Radwanska, and Anett Kontaveit ousted Garbine Muguruza–Sharapova has shown she’s ready to compete at the highest level, winning about 57% of points against three credible opponents.

Many players have publicly stated that Sharapova doesn’t deserve to get wild cards, often because WCs are a sort of bonus, and a player who broke the rules doesn’t deserve any kind of handout. We’re likely to hear a lot more about it, as we won’t learn her status for the French Open for another two weeks.

However, wild cards are at the discretion of each individual tournament, and barring new regulations for players returning from doping bans, tournaments have their own incentives. Events often choose wild cards from a marketing perspective, granting main draw spots to former stars, young prospects, or local favorites.

Tournaments don’t have an explicit contract with their fans, but if they did, it would have to begin with an obligation to put the highest-quality product on the court. Most of the time, the ATP and WTA ranking and entry systems accomplish this, guaranteeing main draw places to the highest-ranked players. Occasionally, though, the ranking system fails and massively underrates the quality of a player.

Sharapova, obviously, is such a case. Unranked this week, and ranked #262 next week if she loses today to Kristina Mladenovic, she is already performing at the level of a top-20 player. My research suggests she may very soon be the best active player in women’s tennis, even if it takes many months for her official ranking to catch up.

Wild cards are the only mechanism tournaments are given to correct for the limitations of the ranking system. If the French Open (or any other event) wants to improve the quality of its draw, it should give Sharapova a wild card. If I am right that tournaments owe it to the fans to put on court the highest-level competition they possibly can, there are few opportunities so clear-cut as this one to improve the quality of a draw with a single player’s entry.

I can hear the objections already. First, as so many have claimed, Sharapova doesn’t “deserve” this kind of benefit. Yet by definition, wild cards are for players who don’t deserve a main draw entry. If they deserved one, their ranking (or “special” or “protected” ranking, if returning from injury) would guarantee them one. We use the words “deserve” and “earn” rather vaguely in this context, perhaps saying that a former great in his final year deserves a wild card based on his past contributions to tennis, or that a player has earned the free entry because she won a play-off of some sort.

It’s certainly true that some wild cards are more earned than others, but ultimately it’s beside the point. Even if it offends our sense of fairness, the players who most deserve a place in a draw are those who will make it more competitive. Last year, the French Federation gave wild cards to the likes of Alize Lim and Tessah Andrianjafitrimo, who lost in the first round to Qiang Wang without winning a single game. The eight wild cards won a total of three matches–one of them against another wild card. Except by virtue of being French, most of these wild cards didn’t do much to earn their places, and they had almost no impact on the tournament itself.

Beyond the claim that Sharapova, having broken the rules, doesn’t deserve a handout, there is a more extreme position, that her 15-month suspension wasn’t a sufficiently severe punishment. We can group that with another potential objection, that the French Open can’t be seen to endorse a doper. This is one of the many unfortunate side-effects of a weak central authority in tennis. By this argument, every tournament with the option of granting Sharapova an entry is required to re-litigate her doping ban. Even if we sidestep some of the controversial aspects of her ban and stipulate that she knowingly did something very bad, this seems nonsensical.

The whole point of having a central authority for doping enforcement is so that tournaments needn’t all police the players themselves. By issuing a 15-month ban, the ITF essentially spoke for all affiliated tournaments, saying that after 15 months (and exactly one day, as it turned out), Sharapova’s penalty would be paid and she would, in a sense, be rehabilitated. Giving a rehabilitated player a wild card is in no way an endorsement of her behavior, any more than giving a job to an ex-convict is an endorsement of the criminal act that put him in jail.

As a fan–even when I wish Sharapova wouldn’t win so many matches against my favorites–I want to see the best possible level of tennis every week. Now that her suspension is complete, every week that Sharapova wants to compete but can’t enter a top-level event is a missed opportunity for the sport. As great as the French Open is, it would be better with Sharapova than without her.

Diego Schwartzman’s Return Game Is Even Better Than I Thought

Diego Schwartzman is one of the most unusual players on the ATP tour. He is even shorter than David Ferrer, so his serve will never be a weapon, and the only way he can compete is by neutralizing everyone else’s offerings and winning baseline battles. Up to No. 34 in this week’s official rankings and No. 35 on the Elo list, he’s proven he can do that against some very good players.

Using the ATP stats leaderboard at Tennis Abstract, we can get a quick sense of how his return game compares with the elites. At tour level in the last 52 weeks (through Monte Carlo), he ranks third with 42.3% return points won, behind only Andy Murray and Novak Djokovic. He is particularly effective against second serves, winning 56.6% of those, better than anyone else on tour. He has broken in 31.8% of his return games, another third-place showing, this time behind Murray and Rafael Nadal.

Yet the leaderboard warns us to tread carefully. In the last year, Murray’s opponents have been far superior to Schwartzman’s, with a median rank of 24 and a mean rank of 41.5. The Argentine’s opponents have rated at 45.5 and 54.8, respectively. Murray, Djokovic, and Nadal are far better all-around players than Schwartzman, so they regularly reach later rounds, where the quality of competition goes way up.

Competition quality is one of the knottiest aspects of tennis analytics, and it is far from being solved. If we want to compare Murray to Djokovic, competition quality isn’t such a big factor. One or the other might get lucky over a span of months, but in the long run, the two best players on tour will face roughly equivalent levels of competition. But when we expand our view to players like Schwartzman–or even a top-tenner such as Dominic Thiem–we can no longer assume that opponent quality will even out. To use a term from other sports, the ATP has a very unbalanced schedule, and the schedule is always more challenging for the best players.

Correcting for competition quality is also key to understanding how any particular player evolves over time. If a player’s results improve, he’ll usually start facing more challenging competition, as Schwartzman is doing this spring in his first shot at the full slate of clay-court Masters events. If his return numbers decline, is he actually playing worse, or is he simply competing at his past level against tougher opponents?

Adjusting for competition

To properly compare players, we need to identify similarities in their schedules. Any pair of tour regulars have played many of the same opponents, even if they’ve never played each other. For instance, since the beginning of last season, Murray and Djokovic have faced 18 of the same players–some more than once. Further down the ranking list, players tend to have fewer opponents in common, but as we’ll see, that’s an obstacle we can overcome.

Here’s how the adjustment works: For a pair of players, find all the opponents both men have faced on the same surface. For example, both Murray and Djokovic have played David Goffin on clay in the last 16 months. Murray won 53.7% of clay return points against the Belgian, while Djokovic won only 42.1%, meaning that Djokovic returned about 22% worse than Murray did. We repeat the process for every surface-player combination, weight the results so that longer matches (or larger numbers of matches) count more heavily, and find the average.
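The pairwise comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the numbers in this post: the function name, the match tuples, and the point counts are all made up (the counts are chosen only so that the clay percentages come out near 53.7% and 42.1%), and weighting each shared opponent by the smaller of the two return-point samples is just one assumed way to make longer matches count more.

```python
from collections import defaultdict

def relative_return_rate(matches_a, matches_b):
    """Compare B's return game to A's over shared (opponent, surface) pairs.

    Each match is (opponent, surface, return_points_won, return_points_played).
    Returns (weighted average of B's rate divided by A's rate, total weight),
    or (None, 0) when the two players share no usable opponents.
    """
    def totals(matches):
        # Pool all matches against the same opponent on the same surface.
        agg = defaultdict(lambda: [0, 0])
        for opponent, surface, won, played in matches:
            agg[(opponent, surface)][0] += won
            agg[(opponent, surface)][1] += played
        return agg

    a, b = totals(matches_a), totals(matches_b)
    weighted_sum, total_weight = 0.0, 0
    for key in a.keys() & b.keys():  # only opponents both men faced on that surface
        (a_won, a_played), (b_won, b_played) = a[key], b[key]
        if a_won == 0 or b_played == 0:
            continue  # the ratio is undefined, so skip this pairing
        ratio = (b_won / b_played) / (a_won / a_played)
        weight = min(a_played, b_played)  # assumed weighting scheme
        weighted_sum += ratio * weight
        total_weight += weight
    if total_weight == 0:
        return None, 0
    return weighted_sum / total_weight, total_weight

# Hypothetical point counts: 36/67 is about 53.7%, 32/76 is about 42.1%.
murray = [("Goffin", "clay", 36, 67), ("Goffin", "hard", 30, 70)]
djokovic = [("Goffin", "clay", 32, 76)]

ratio, weight = relative_return_rate(murray, djokovic)
print(f"Djokovic relative to Murray: {ratio:.3f} (weight {weight})")
```

With only the Goffin clay match in common, the ratio comes out a little above 0.78, which is the "about 22% worse" figure from the example. The hard-court match is ignored because only one of the two players has it.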

When we do that for the top two men, we find that Djokovic has returned 2.3% better. (That’s a percentage, not percentage points. A great returner wins about 40% of return points, and a 2.3% improvement on that is roughly 41%.) Our finding suggests that Murray has faced somewhat weaker-serving competition: Since the beginning of 2016, he has won 42.9% of return points, compared to Djokovic’s 43.3%–a smaller gap than the competition-adjusted one.

It takes more work to reliably compare someone like Schwartzman to the elites, since their schedules overlap so much less. So before adjusting Diego’s return numbers, we’ll take several intermediate steps. Let’s start with the world No. 3 Stanislas Wawrinka. We follow the above process twice: Once for Wawrinka and Murray, then again for Stan and Novak. Run the numbers, and we find that Wawrinka’s return game is 12.5% weaker than Murray’s and 14.3% weaker than Djokovic’s. Wawrinka’s rates relative to the other two players correspond very well with what we already found, suggesting that Djokovic is a little better than his rival. Weighting the two numbers by sample size–which, in this case, is almost identical–we slightly adjust those two comparisons and conclude that Wawrinka’s return game is 12.4% worse than Murray’s.

Generating competition-adjusted numbers for each subsequent player follows the same pattern. For No. 4 Federer, we run the algorithm three times, once for each of the players ranked above him, then we aggregate the results. For No. 34 Schwartzman, we go through the process 33 times. Thanks to the magic of computers, it takes only a few seconds to adjust 16 months’ worth of return stats for the ATP top 50.
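The aggregation step might look something like the sketch below. It assumes each comparison against a higher-ranked player yields a ratio and a sample weight, converts every ratio to the Murray scale through that player's own rating, and takes the weighted mean. The function name, the weights, and the "Player C" ratios are invented for illustration; only Murray's baseline of 1.0 and Djokovic's 2.3% edge come from the text.

```python
def chained_rating(comparisons):
    """Aggregate comparisons against players who already have a rating.

    comparisons: list of (reference_rating, ratio_vs_reference, weight).
    Each ratio is rescaled by the reference player's rating so that all
    comparisons sit on the same scale, then averaged by weight.
    """
    total_weight = sum(weight for _, _, weight in comparisons)
    if total_weight <= 0:
        raise ValueError("no overlap with any already-rated player")
    scaled = sum(ref * ratio * weight for ref, ratio, weight in comparisons)
    return scaled / total_weight

# Murray is the arbitrary baseline; ratings are filled in from the top down.
ratings = {"Murray": 1.00}
ratings["Djokovic"] = chained_rating([(ratings["Murray"], 1.023, 500)])

# Hypothetical third player: 5% worse than Murray, 7% worse than Djokovic.
ratings["Player C"] = chained_rating([
    (ratings["Murray"], 0.95, 300),
    (ratings["Djokovic"], 0.93, 300),
])
print({name: round(value, 3) for name, value in ratings.items()})
```

Processing players in ranking order means every comparison refers to someone whose rating is already known, which is why the 34th player needs 33 passes.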

Below are the results for 2016-17. Players are ranked by “relative return points won” (REL RPW), where a rating of 1.0 is arbitrarily given to Murray, and a rating of 0.98 means that a player wins 2% fewer return points than Murray against equivalent opposition. The “EX RPW” column puts those numbers in a more familiar context: The top-ranked player’s rating is set equal to 43.0%–approximately the best RPW of any player in the last few seasons–and everyone else’s is adjusted accordingly. The last two columns show each player’s actual rate of return points won and their rank among the ATP top 50:

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK  
1     Diego Schwartzman         1.04   43.0%   42.4%     4  
2     Novak Djokovic            1.02   42.1%   43.3%     1  
3     Andy Murray               1.00   41.2%   42.9%     2  
4     Rafael Nadal              0.98   40.3%   42.6%     3  
5     David Goffin              0.97   40.1%   41.3%     5  
6     Gilles Simon              0.96   39.6%   40.1%     9  
7     Kei Nishikori             0.95   39.3%   40.1%    10  
8     David Ferrer              0.95   39.1%   40.6%     7  
9     Roger Federer             0.94   38.7%   38.7%    15  
10    Gael Monfils              0.93   38.5%   39.8%    11  


RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
11    Roberto Bautista Agut     0.93   38.3%   40.3%     8  
12    Ryan Harrison             0.92   37.9%   36.7%    33  
13    Richard Gasquet           0.92   37.9%   40.8%     6  
14    Daniel Evans              0.91   37.6%   36.9%    27  
15    Juan Martin Del Potro     0.91   37.5%   36.8%    32  
16    Benoit Paire              0.90   37.0%   38.1%    19  
17    Mischa Zverev             0.90   36.9%   36.9%    28  
18    Grigor Dimitrov           0.89   36.4%   38.2%    18  
19    Fabio Fognini             0.88   36.4%   39.7%    12  
20    Fernando Verdasco         0.88   36.4%   38.3%    16  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
21    Joao Sousa                0.88   36.2%   38.3%    17  
22    Dominic Thiem             0.88   36.2%   38.1%    20  
23    Stani Wawrinka            0.88   36.1%   37.5%    22  
24    Alexander Zverev          0.88   36.0%   37.5%    23  
25    Albert Ramos              0.87   35.9%   38.9%    14  
26    Kyle Edmund               0.86   35.5%   36.1%    37  
27    Jack Sock                 0.86   35.5%   36.6%    34  
28    Viktor Troicki            0.86   35.4%   37.1%    26  
29    Marin Cilic               0.86   35.4%   37.3%    25  
30    Pablo Carreno Busta       0.86   35.3%   39.4%    13  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
31    Milos Raonic              0.86   35.2%   36.1%    38  
32    Pablo Cuevas              0.85   35.1%   36.9%    29  
33    Tomas Berdych             0.85   35.1%   36.9%    30  
34    Borna Coric               0.85   34.9%   36.1%    39  
35    Nick Kyrgios              0.85   34.9%   35.7%    41  
36    Philipp Kohlschreiber     0.84   34.7%   37.9%    21  
37    Jo Wilfried Tsonga        0.84   34.6%   36.2%    36  
38    Sam Querrey               0.83   34.3%   34.6%    44  
39    Lucas Pouille             0.82   33.9%   36.9%    31  
40    Feliciano Lopez           0.81   33.2%   35.2%    43  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
41    Robin Haase               0.80   33.0%   36.1%    40  
42    Paolo Lorenzi             0.80   32.9%   37.5%    24  
43    Donald Young              0.78   32.2%   36.3%    35  
44    Bernard Tomic             0.78   32.1%   34.1%    45  
45    Nicolas Mahut             0.76   31.4%   35.4%    42  
46    Steve Johnson             0.75   31.0%   33.8%    46  
47    Florian Mayer             0.74   30.3%   33.5%    47  
48    John Isner                0.73   30.0%   29.8%    49  
49    Gilles Muller             0.72   29.8%   32.4%    48  
50    Ivo Karlovic              0.63   25.9%   26.4%    50
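For anyone who wants to reproduce the EX RPW column, here is a rough sketch of the rescaling described above the table: divide each REL RPW by the best one and multiply by 43.0%. The function name is hypothetical, and because it starts from the two-decimal REL RPW values printed in the table, the results can differ from the published column by a tenth of a point or so.

```python
def expected_rpw(rel_rpw, top_rate=0.43):
    """Rescale relative ratings so the best returner sits at top_rate."""
    best = max(rel_rpw.values())
    return {player: top_rate * rel / best for player, rel in rel_rpw.items()}

# REL RPW values for the top three, as rounded in the table.
rel = {"Diego Schwartzman": 1.04, "Novak Djokovic": 1.02, "Andy Murray": 1.00}

ex = expected_rpw(rel)
for player, rate in ex.items():
    print(f"{player:<20} {rate:.1%}")
```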

The big surprise: Schwartzman is number one! While the average ranking of his opponents was considerably lower than that of the elites, it appears that he has faced bigger-serving opponents than have Murray or Djokovic. The top five on this list–Schwartzman, Murray, Djokovic, Nadal, and Goffin–do not force any major re-evaluation of who we consider to be the game’s best returners, but the competition-adjusted metric does offer more evidence that Schwartzman really belongs there.

There is a similar predictability at the bottom of the list. The five players rated the worst by the competition-adjusted metric–Steve Johnson, Florian Mayer, John Isner, Gilles Muller, and Ivo Karlovic–are the same five who sit at the bottom of the actual RPW ranking, with only Isner and Muller swapping places. This degree of consistency at the top and bottom of the list is reassuring: The metric is correcting for something important, but it isn’t spitting out any truly crazy results.

There are, however, some surprises. Three players do very well when their return games are adjusted for competition: Ryan Harrison, Daniel Evans, and Juan Martin del Potro, all of whom jump from the bottom half to the top 15. In a sense, this is a surface adjustment for Harrison and Evans, both of whom have played almost exclusively on hard courts. Players win fewer return points on faster surfaces (and faster surfaces attract bigger-serving competitors, magnifying the effect), so when adjusted for competition, someone who plays only on hard courts will see his numbers improve. Del Potro, on the other hand, has been absolutely hammered by tough competition, so in his case the correction is giving him credit for the difficult opponents he has had to face.

Several clay court specialists find their return stats adjusted in the wrong direction. Last week’s finalist, Albert Ramos, falls from 14th to 25th, Pablo Carreno Busta drops from 13th to 30th, and Roberto Bautista Agut and Paolo Lorenzi see their numbers take a hit as well. This is the reverse of the effect that pushed Harrison and Evans up the list: Clay-court specialists spend more time on the dirt and they play against weaker-serving opponents, so their season averages make them look like better returners than they really are. It appears that these players are all particularly bad on hard courts: When I ran the algorithm with only clay-court results, Bautista Agut, Ramos, and Carreno Busta all appeared among the top 12 in competition-adjusted return points won. It’s their abysmal hard-court performances that pull down their longer-term numbers.

Beyond RPW

This algorithm–or something like it–has a great deal of potential beyond simply correcting return points won for tour-level competition quality. It could be used for any stat, and if competition-adjusted return rates were combined with corrected rates of service points won, it would generate a plausible overall player rating system.

Such a rating system would be more valuable if the algorithm were extended to players beyond the top 50, as well. Just as Schwartzman doesn’t yet have that many common opponents with the elites, Challenger-level stalwarts don’t share many opponents with tour regulars. But there is enough overlap that, when combining the shared opponents of dozens of players, we might be able to get a better grip on how Challenger-level competition compares to that of the highest levels. Essentially, we can compare adjacent levels–the elites to the middle of the pack (say, ATP ranks 21 to 50), the middle of the pack to the next 50, and so on–to get a more comprehensive idea of how much players must improve to achieve certain goals.

Finally, by adjusting serve and return stats so that we have a set of competition-neutral numbers for every player, for each season of his career, we will gain a clearer picture of which players are improving and by how much. Official rankings and Elo ratings tell us a lot, but they are sometimes fooled by lucky breaks, close wins, or inconsistent opposition. And they cannot isolate individual stats, which may be particularly useful for developmental purposes.

Adjusting for opposition quality is standard practice for analysts of many other sports, and it will help tennis analytics move forward as well. If nothing else, it has shown us that one extreme performance–Schwartzman’s return game–is much more than a fluke, and that service return greatness isn’t limited to the big four.

Podcast Episode 4: La Decima, a Wild Fed Cup Weekend, and Sharapova’s Return

In Episode 4 of the Tennis Abstract Podcast, Carl Bialik and I desperately try to cover everything going on in tennis right now. We start with Rafael Nadal’s Decima of Monte Carlo titles, along with the bizarre underperformance of the rest of the ATP elites.

From there, we touch on various highlights from the weekend’s Fed Cup action, including our perspective on Ilie Nastase and the Romania-Great Britain tie he so shamefully overshadowed. Finally, we compare predictions for Sharapova’s comeback–for this week, the clay season as a whole, and the season in general. With Serena out and Maria (and soon Vika and Petra) back in, the WTA field leaves us with even more uncertainty than usual.

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Second-Strike Tennis: When Returners Dominate

Italian translation at settesei.it

On Wednesday, Diego Schwartzman scored a notable upset, knocking out 12th seed Roberto Bautista Agut in the second round of the Monte Carlo Masters. Even more unusual than Bautista Agut’s early exit was the way it happened. Both players won more than half of their return points: 61% for Schwartzman and 52% for Bautista Agut. There were 14 breaks of serve in 21 games.

Players like Schwartzman win more than half of return points fairly regularly. In the last 12 months, including both Challenger and tour-level matches, the Argentine–nicknamed El Peque for his diminutive stature–has done so more than 20 times. What is almost unheard of in the men’s game is for both players to return so well (or serve so poorly) that neither player wins at least half of his service points.

Since 1991–the first year for which ATP match stats are available–there have been fewer than 70 matches in which both players win more than half of their return points. (There are another 25 or so in which one player exceeded 50% and the other hit 50% exactly.) What’s more, these matches have become even less frequent over time: Wednesday’s result was the first instance on the ATP tour since 2014, and there have been fewer than 30 since 2000.
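The filter behind those counts is simple enough to sketch. A player's return points are his opponent's service points, so return points won (RPW) is whatever the server failed to win. The function names and the serve-point counts below are hypothetical; the first two match rows use the RPW figures from the table that follows, and the last two rows are invented to show what gets excluded.

```python
def return_points_won_rate(opp_serve_points, opp_serve_points_won):
    """RPW: the share of the opponent's service points that the returner won."""
    return (opp_serve_points - opp_serve_points_won) / opp_serve_points

def both_returners_dominated(w_rpw, l_rpw, threshold=0.5):
    """True only when winner and loser each won MORE than half of return points."""
    return w_rpw > threshold and l_rpw > threshold

# Hypothetical counts: the server wins 32 of 83 points, so the returner wins 51.
print(f"{return_points_won_rate(83, 32):.1%}")

matches = [
    ("2017 Monte Carlo", "Schwartzman d. RBA", 0.614, 0.519),
    ("2013 Monte Carlo", "Bautista Agut d. Simon", 0.588, 0.506),
    ("hypothetical", "Player A d. Player B", 0.550, 0.500),  # exactly 50%: excluded
    ("hypothetical", "Player C d. Player D", 0.450, 0.380),  # an ordinary match
]
dominated = [m for m in matches if both_returners_dominated(m[2], m[3])]
print([event for event, *_ in dominated])
```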

Here are the last 15 such matches, along with both the winner’s (W RPW) and loser’s (L RPW) rates of return points won. Few of the players or surfaces come as a surprise:

Year  Event            Players                 W RPW  L RPW  
2017  Monte Carlo      Schwartzman d. RBA      61.4%  51.9%  
2014  Rio de Janeiro   Fognini d. Bedene       50.6%  50.6%  
2014  Houston          Hewitt d. Polansky      51.3%  51.5%  
2014  Estoril          Berlocq d. Berdych      51.5%  50.6%  
2013  Monte Carlo      Bautista Agut d. Simon  58.8%  50.6%  
2013  Estoril          Goffin d. P Sousa       55.2%  50.5%  
2011  Casablanca       Fognini d. Kavcic       51.0%  51.9%  
2011  Belgrade         Granollers d. Troicki   61.5%  50.8%  
2008  Barcelona        Chela d. Garcia Lopez   54.3%  50.5%  
2008  Costa Do Sauipe  Coria d. Aldi           58.5%  51.9%  
2007  Rome Masters     Ferrero d. Hrbaty       52.9%  51.7%  
2007  Hamburg          Ferrer d. Bjorkman      50.6%  50.6%  
2006  Monte Carlo      Coria d. Kiefer         53.2%  50.9%  
2006  Hamburg Masters  Gaudio d. A Martin      57.3%  51.1%  
2006  Australian Open  Coria d. Hanescu        53.4%  50.6%

All but 8 of the 69 total matches were on clay. One of the exceptions is at the bottom of this list, from the 2006 Australian Open, and before 2006, there were another five hard-court contests, along with two on grass courts. (The ATP database isn’t completely reliable, but in each of these cases, the high rate of return points won is partially verified by a similarly high number of reported breaks of serve.)

Bautista Agut, who won one of these matches four years ago in Monte Carlo, is one of several players who participated in multiple return-dominated clashes. Guillermo Coria played in five, winning four, and Fabrice Santoro took part in four, winning three. Coria won more than half of his return points in 75 tour-level matches over the course of his career.

Of course, both Schwartzman and Bautista Agut cleared the 50% bar with plenty of room to spare. The Spaniard won 51.9% of return points and Schwartzman comfortably exceeded 60%, putting them in an even more elite category. It was only the 22nd match since 1991 in which both players won at least 51.9% of return points.

As rare as these matches are, Schwartzman is doing everything he can to add to the list. With a ranking now in the top 40, he has entered just about every clay tournament on the schedule, so the most return-oriented competitor in the game is going to play a lot more top-level matches on slow surfaces. If anyone has a chance at equaling Coria’s mark of winning four of these return-dominated matches, my money’s on El Peque.

Rafael Nadal’s Wide-Open Monte Carlo Draw

Italian translation at settesei.it

This afternoon, Rafael Nadal will take on Albert Ramos for a chance at his tenth Monte Carlo Masters title. Since 2005, Nadal has faced the best clay-court players in the sport and, with very few exceptions, beaten them all.

Yet this year, Nadal’s path to the trophy has been remarkably easy. The three top seeds–Andy Murray, Novak Djokovic, and Stan Wawrinka–all lost early, leaving Nadal to face David Goffin in the semifinals and Ramos (who ousted Murray) in the final. Goffin, at No. 13, was Rafa’s highest-ranked opponent, followed by Alexander Zverev, at No. 20, whom Nadal crushed in the third round.

When we run the numbers, we’ll see that this competition isn’t just weak: It’s the weakest faced by any Masters titlist in recent history. I’ll get into the mechanics and show you some numbers in a minute.

First, a disclaimer. By saying a draw is weak, I’m not arguing that the title “means less” or is somehow less deserved. It’s not in any way a reflection on the player. For all we know, Rafa would’ve cruised through the draw had he faced the toughest possible opponent in every round. The only thing a weak draw tells us about the champion is how to forecast his future. Had Nadal beaten multiple top-ten players this week, we might be more confident predicting future success for him than we are now, after he has beaten up on a bunch of players we already suspected he’d have no problem with.

Back to the numbers. To measure the difficulty of a player’s draw, I used jrank–my own surface-adjusted rating system, roughly similar to Elo–at the time of each Masters event back to 2002. For each tournament, I found the jrank of each player the titlist defeated, and calculated the likelihood that a typical Masters winner would beat that group of players.

That’s a mouthful, so let’s walk through an example. In the last 15 years, the median Masters winner was ranked No. 3, with a jrank (for the surface of the tournament) of about 4700, good for fourth at the moment. A 4700-rated player would have an 85.7% chance of beating Ramos, a 75.7% chance of defeating Goffin, and 87.3%, 68.4%, and 88.7% chances of knocking out Diego Schwartzman, Zverev, and Kyle Edmund, respectively. Multiply those together, and our average Masters winner would have a 34.3% chance of claiming the trophy, given that competition.
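The arithmetic in that example is just a product of per-round win probabilities. A minimal sketch using the five probabilities quoted above (the variable names are mine, and the jrank-to-probability conversion itself is not shown here):

```python
from math import prod

# Chances that a 4700-rated player beats each of Nadal's five opponents,
# as quoted in the paragraph above.
win_probs = {
    "Ramos": 0.857,
    "Goffin": 0.757,
    "Schwartzman": 0.873,
    "Zverev": 0.684,
    "Edmund": 0.887,
}

# Path Ease: the chance of winning every round, treating rounds as independent.
path_ease = prod(win_probs.values())
print(f"Path Ease: {path_ease:.1%}")
```

Because the inputs here are already rounded to one decimal place, the product can land a tenth of a point away from the 34.3% given above.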

I’m using a hypothetical average Masters winner so that we measure the level of competition against a constant level. It doesn’t matter whether 2017 Nadal, peak Nadal, or someone else entirely played that series of opponents. If Djokovic had faced the same five players, we’d want the numbers to come out the same.

Here are the ten easiest paths to a Masters title since 2002, measured by this algorithm:

Year  Event                Winner          Path Ease  
2017  Monte Carlo Masters  Rafael Nadal*       34.3%  
2016  Shanghai Masters     Andy Murray         33.0%  
2011  Shanghai Masters     Andy Murray         30.8%  
2013  Madrid Masters       Rafael Nadal        30.8%  
2012  Paris Masters        David Ferrer        30.4%  
2010  Monte Carlo Masters  Rafael Nadal        27.3%  
2012  Canada Masters       Novak Djokovic      25.8%  
2014  Madrid Masters       Rafael Nadal        25.3%  
2016  Paris Masters        Andy Murray         24.7%  
2010  Rome Masters         Rafael Nadal        24.6%

* pending; extremely likely

The average ‘Path Ease’ is 15.6%, and as we’ll see in a moment, some players have had it much, much harder. In Shanghai last year, Murray certainly did not: His draw turned out much like Rafa’s this week, complete with Goffin along the way and a three-named Spaniard in the final–in his case, Roberto Bautista Agut.

Here are the ten most difficult paths:

Year  Event                 Winner              Path Ease  
2007  Madrid Masters        David Nalbandian         4.1%  
2007  Paris Masters         David Nalbandian         6.2%  
2014  Canada Masters        Jo Wilfried Tsonga       6.6%  
2011  Rome Masters          Novak Djokovic           6.6%  
2009  Madrid Masters        Roger Federer            7.0%  
2010  Canada Masters        Andy Murray              7.7%  
2004  Cincinnati Masters    Andre Agassi             7.9%  
2007  Canada Masters        Novak Djokovic           8.0%  
2009  Indian Wells Masters  Rafael Nadal             8.0%  
2002  Canada Masters        Guillermo Canas          8.4%

Those of us who remember the end of David Nalbandian’s 2007 season won’t be surprised to see him atop this list. In Madrid, he beat Nadal, Djokovic, and Roger Federer in the final three rounds, and in Paris, he knocked out Federer and Nadal again, along with three other top-16 players. Making his paths even more difficult, he didn’t earn a first-round bye in either event.

Given that Monte Carlo is the one non-mandatory Masters event, I expected that, over the years, it would prove to have the weakest competition. That was wrong. Entering this week, Monte Carlo is only fourth-easiest of the nine current 1000-series events. Indian Wells–which requires at least six victories for a title, unlike most of the others, which require only five–has been the toughest, while Miami, which also requires six wins, is closer to the middle of the pack:

Event         Avg Path Ease  
Indian Wells          12.8%  
Canada                14.3%  
Rome                  14.6%  
Miami                 15.3%  
Cincinnati            15.7%  
Monte Carlo*          16.5%  
Madrid**              16.7%  
Paris                 16.8%  
Shanghai              21.5%

* through 2016; ** hard- and clay-court eras included

Finally, seeing the presence of Nadal, Djokovic, and Murray on the list of easiest title paths raises another question. How have the big four’s levels of competition differed at the Masters events?

Player          Titles  Avg Path Ease  
Roger Federer       26          14.6%  
Novak Djokovic      30          16.1%  
Rafael Nadal*       28          16.7%  
Andy Murray         14          18.1%

* not including 2017 Monte Carlo

Federer has had the most difficult paths, followed by Djokovic, Nadal, and then Murray. Assuming Rafa wins today, his number will tick up to 17.3%.
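That projected figure is a running-average update: fold this week's Path Ease into the average over Nadal's 28 previous titles. A quick check, with a hypothetical helper name and the inputs taken from the tables above:

```python
def updated_average(current_avg, n, new_value):
    """Average after adding one more observation to n existing ones."""
    return (current_avg * n + new_value) / (n + 1)

# 28 titles averaging 16.7%, plus a 34.3% path in Monte Carlo this week.
nadal_avg = updated_average(16.7, 28, 34.3)
print(f"{nadal_avg:.1f}%")
```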

To reach ten titles at a single event, as Nadal is on the brink of doing in Monte Carlo, requires one to thrive regardless of draw luck. Rafa’s path to the trophy last year was tougher than any of his previous Monte Carlo campaigns, rating a Path Ease of 9.1%, almost difficult enough to show up on the top ten list displayed above. His 2008 title was no cakewalk either–a typical Masters winner would have only a 10.0% chance of coming through that draw successfully.

This year, Rafa’s luck has decidedly changed. To no one’s surprise, the best clay court player in history is taking full advantage.

The Proud Tradition of Americans Skipping Monte Carlo

Italian translation at settesei.it

The Monte Carlo Masters is unique among the ATP’s 1,000 series events. The stakes are high, but attendance isn’t mandatory, so while most of the game’s top players show up, a few take the week off. No group has skipped Monte Carlo as consistently as players from the U.S.A.

This year, six U.S. players had rankings that would’ve gotten them into the Monte Carlo main draw, where winning a single match earns you 45 ranking points and just over €28,000 in prize money. Five of those players–including John Isner, who reached the third round two years ago and won a pair of tough Davis Cup matches at the same venue–opted out. All five played the 250-level Houston tournament last week instead. Only Ryan Harrison made the trip to Europe–losing in the opening round, as Carl Bialik and I safely predicted on this week’s podcast.

Choosing the low-stakes event on home soil isn’t the wise choice, but it’s nothing new. Since 2006, only seven Americans have appeared in a Monte Carlo main draw: Isner twice, Harrison, Sam Querrey, Donald Young, Steve Johnson, and Denis Kudla, who qualified in 2015. From 2006 to 2016, 7 of the 11 Monte Carlo draws were entirely USA-free. In the same time span, Houston draws have featured 35 Americans ranked in the top 60–all players who probably would have earned direct entry in the higher-stakes clay event, as well.

For a player like Isner or Jack Sock, an April schedule can handle both tournaments. Four of the seven Americans who went to Monte Carlo played Houston as well, including Querrey in 2008, when he lost in the first round in Houston but reached the final eight in Monte Carlo.

Most U.S. players, including just about everyone I’ve mentioned so far, would much rather play on hard courts than on clay. (The Houston surface is more conducive to aggressive, first-strike tennis than is the Monte Carlo dirt, one of the slowest surfaces on the calendar.) However, as Isner and Querrey have shown, a one-dimensional power game can succeed on a slow court, even if it looks nothing like the strategy of a traditional clay specialist.

Isner, in particular, has racked up plenty of points on the surface. While he’d much rather play on home soil, he has twice reached the fourth round at the French Open and pushed none other than Rafael Nadal to a deciding set in both Paris and Monte Carlo. Sock is also a threat on the surface, having won nearly two-thirds of his tour-level matches on clay. Many of those wins came in Houston, but like Isner, he took a set from Nadal in Europe on the surface the Spaniard typically dominates.

Even if the top Americans had little chance of going deep in Monte Carlo, one wonders what the additional time on the surface would do for the rest of their clay season. Most will show up for Madrid and Rome, and all of them will play Roland Garros. It’s a bit of a chicken-and-egg question–do Americans avoid the dirt because they suck on clay, or do they suck because they avoid it?–but it couldn’t hurt to play on the more traditional European surface against elite-level opponents.

The difference in rewards between a 250 like Houston and a Masters 1000 like Monte Carlo makes it likely that the risk of playing in unfamiliar territory would pay off, as it did for Querrey in his one trip and for Isner two years ago. And I suspect that the rewards would stretch beyond the immediate shot at a bigger payday: If someone like Sock invested more time in developing his clay-court game now, he could become a legitimate threat at a faster clay tournament (such as the Madrid Masters) in a few years. It’s probably too late for the likes of Querrey, but the next generation of U.S. men’s stars would do well to break with tradition and give themselves more chances to excel on the dirt.

Podcast Episode 3: Champions Young and Old

In Episode 3 of the Tennis Abstract Podcast, Carl Bialik and I discuss Francesca Schiavone and Marketa Vondrousova, the WTA titlists from last week at opposite ends of their careers. We speculate on the future of Borna Coric–another one of last week’s titlists–as well.

Also on this week’s podcast: David Goffin’s steady improvement, future Davis Cup dynasties, problems with fixing the international team competition, the Italian Open’s decision not to give Schiavone a wild card, and Ryan Harrison’s surprising trip to Monte Carlo while the rest of the U.S. contingent stays at home.

Thanks for listening, and for tolerating our production “values” as we figure out the best way to do this. It will get better!

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Podcast Episode 2: Doubles, Wild Cards, and Megastars

In the second episode of the Tennis Abstract Podcast, Carl Bialik and I give some much-deserved top billing to doubles, especially new ATP No. 1 Henri Kontinen and Elo doubles favorite Jack Sock.

We also cover the role of megastars in tennis, and the benefits and challenges they offer to the sport’s promoters. As we discuss, big names may be key to expanding the appeal of doubles, and they are the one major argument for the continuing existence of wild cards–on whichever side of the Maria Sharapova debate you find yourself.

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Little Data, Big Potential

This is a guest post by Carl Bialik.

I had more data on my last 30 minutes of playing tennis than I’d gotten in my first 10 years of playing tennis — and it just made me want so much more.

Ben Rothenberg and I had just played four supertiebreakers, after 10 minutes of warmup and before a forehand drill. And for most of that time — all but a brief break while PlaySight staff showed the WTA’s Micky Lawler the system — 10 PlaySight cameras were recording our every move and every shot: speed, spin, trajectory and whether it landed in or out. Immediately after every point, we could walk over to the kiosk right next to the net to watch video replays and get our stats. The tennis sure didn’t look professional-grade, but the stats did: spin rate, net clearance, winners, unforced errors, net points won.

Later that night, we could go online and watch and laugh with friends and family. If you’re as good as Ben and I are, laugh you will: As bad as we knew the tennis was by glancing over to Dominic Thiem and Jordan Thompson on the next practice court, it was so much worse when viewed on video, from the kind of camera angle that usually yields footage of uberfit tennis-playing pros, not uberslow tennis-writing bros.

This wasn’t the first time I’d seen video evidence of my take on tennis, an affront to aesthetes everywhere. Though my first decade and a half of awkward swings and shoddy footwork went thankfully unrecorded, in the last five years I’d started to quantify my tennis self. First there was the time my friend Alex, a techie, mounted a camera on a smartphone during our match in a London park. Then in Paris a few years later, I roped him into joining me for a test of Mojjo, a PlaySight competitor that used just one camera, enough to record video later published online, with our consent and to our shame. Last year, Tennis Abstract proprietor Jeff Sackmann and I demo-ed a PlaySight court with Gordon Uehling, founder of the company.

With PlaySight and Mojjo still only in a handful of courts available to civilians, that probably puts me — and Alex, Jeff and Ben — in the top 5 or 10 percent of players at our level for access to advanced data on our games. (Jeff may object to being included in this playing level, but our USPS Tennis Abstract Head2Head suggests he belongs.) So as a member of the upper echelon of stats-aware casual players, what’s left once I’m done geeking out on the video replays and RPM stats? What actionable information is there about how I should change my game?

Little data, modest lessons

After reviewing my footage and data, I’m still looking for answers. Just a little bit of tennis data isn’t much more useful than none.

Take the serve, the most common shot in tennis. In any one set, I might hit a few dozen. But what can I learn from them? Half are to the deuce court, and half are to the ad court. And almost half of the ones that land in are second serves. Even with my limited repertoire, some are flat while others have slice. Some are out wide, some down the T and some to the body — usually, for me, a euphemism for, I missed my target.

If I hit only five slice first serves out wide to the deuce court, three went in, one was unreturned and I won one of the two ensuing rallies, what the hell does that mean? It doesn’t tell me a whole lot about what would’ve happened if I’d gotten a chance to try that serve once more that day against Ben, let alone what would happen the next time we played, when he had his own racquet, when we weren’t hitting alongside pros and in front of confused fans, with different balls on a different surface without the desert sun above us, at a different time of day when we’re in different frames of mind. And the data says even less about how that serve would have done against a different opponent.
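To put a rough number on how little five serves can say, here is a minimal sketch that computes a 95% Wilson score interval for the "true" rate at which that serve lands in. The Wilson interval is one common choice for small samples, not something PlaySight reports, and the 300-of-500 comparison is a made-up illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
        / denom
    )
    return center - half_width, center + half_width


# Three of five slice first serves landed in.
low, high = wilson_interval(3, 5)
print(f"5 serves:   {low:.0%} to {high:.0%}")

# Hypothetical comparison: the same 60% rate over 500 serves.
low_big, high_big = wilson_interval(300, 500)
print(f"500 serves: {low_big:.0%} to {high_big:.0%}")
```

With three of five, the interval runs from roughly 23% to 88%, which is consistent with anything from a liability to a reliable serve. It takes hundreds of serves of the same type before the range gets tight enough to act on.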

That’s the serve, a shot I’ll hit at least once on about half of points in any match. The story’s even tougher for rarer shots, like a backhand drop half volley or a forehand crosscourt defensive lob, shots so rare they might come up once or twice every 10 matches.

More eyes on the court

It’s cool to know that my spinniest forehand had 1,010 RPM (I hit pretty flat compared to Jack Sock’s 3,337 rpm), but the real value I see is in the kind of data collected on that London court: the video. PlaySight doesn’t yet know enough about me to know that my footwork was sloppier than usual on that forehand, but I do, and it’s a good reminder to get moving quickly and take small steps. And if I were focusing on the ball and my own feet, I might have missed that Ben leans to his backhand side instead of truly split-stepping, but if I catch him on video I can use that tendency to attack his forehand side next time.

Video is especially useful for players who are most focused on technique. As you might have gathered, I’m not, but I can still get a tactical edge from studying patterns that PlaySight doesn’t yet identify.

Where PlaySight and its ilk could really drive breakthroughs is by combining all of the data at its disposal. The company’s software knows about only one of the thousands of hours I’ve spent playing tennis in the last five years. But it has tens of thousands of hours of tennis in its database. Even a player as idiosyncratic as me should have a doppelganger or two in a data set that big. And some of them must’ve faced an opponent like Ben. Then there are partial doppelgangers: women who serve like me even though all of our other shots are different; or juniors whose backhands resemble mine (and hopefully are being coached into a new one). Start grouping those videos together — I’m thinking of machine learning, clustering and classifying — and you can start building a sample of some meaningful size. PlaySight is already thinking this way, looking to add features that can tell a player, say, “Your backhand percentage in matches is 11 percent below other PlaySight users of a similar age/ability,” according to Jeff Angus, marketing manager for the company, who ran the demo for Ben and me.
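As a toy illustration of the doppelganger idea, here is a minimal sketch that finds the most similar players by nearest-neighbor distance over a few standardized stats. It is a much simpler stand-in for the clustering and classifying mentioned above, not PlaySight's method. The feature set (forehand RPM, first-serve percentage, share of points played at net) and every player and number in it are hypothetical.

```python
import math


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def nearest_players(target, players, k=2):
    """Return the k players whose standardized stats are closest to target's."""
    names = list(players)
    scaled = dict(zip(names, standardize([players[n] for n in names])))
    distances = [
        (math.dist(scaled[target], scaled[name]), name)
        for name in names
        if name != target
    ]
    return [name for _, name in sorted(distances)[:k]]


# Hypothetical stats: (forehand RPM, first-serve pct, share of points at net)
players = {
    "carl": (1000, 0.55, 0.10),
    "player_a": (1050, 0.57, 0.12),
    "player_b": (3300, 0.62, 0.08),
    "player_c": (950, 0.50, 0.30),
    "player_d": (2500, 0.65, 0.05),
}

print(nearest_players("carl", players, k=1))
```

Standardizing first matters because RPM is measured in thousands while the percentages sit between 0 and 1; without it, spin alone would decide who counts as similar. Partial doppelgangers, such as players who only serve alike, would amount to running the same search on a subset of the columns.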

The hardware side of PlaySight is tricky. It needs to install cameras and kiosks, weatherproofing them when the court is outdoors, and protect them from human error and carelessness. It’s in a handful of clubs, and the number probably won’t expand much: The company is focusing more on the college game. Even when Alex and I, two players at the very center of PlaySight’s target audience among casual players, happened to book a PlaySight court recently in San Francisco, we decided it wasn’t worth the few minutes it would have taken at the kiosk to register — or, in my case, remember my password. The cameras stood watch, but the footage was forever lost.

Bigger data, big questions

I’m more excited by PlaySight’s software side. I probably will never play enough points on PlaySight courts for the company to tell me how to play better or smarter — unless I pay to install the system at my home courts. But if it gets cheaper and easier to collect decent video of my own matches — really a matter of a decent mount and protector for a smartphone and enough storage space — why couldn’t I upload my video to the company? And why couldn’t it find video of enough Bizarro Carls and Bizarro Carl opponents around the world to make a decent guess about where I should be hitting forehands?

There are bigger, deeper tennis mysteries waiting to be solved. As memorably argued by John McPhee in Levels of the Game, tennis isn’t so much one sport as dozens of different ones, each a different level of play united only by common rules and equipment. And a match between two players even from adjacent levels in his hierarchy typically is a rout. Yet tactically my matches aren’t so different from the ones I see on TV, or even from the practice set played by Thiem and Thompson a few feet from us. Hit to the backhand, disguise your shots, attack short balls and approach the net, hit drop shots if your opponent is playing too far back. And always, make your first serve and get your returns in.

So can a tactic from one level of the game even translate to one much lower? I’m no Radwanska and Ben is no Cibulkova, but could our class of play share enough similarity (mathematically, is Carl : Ben :: Aga : Pome) that what works for the pros works for me? If so, then medium-sized data on my style is just a subset of big data from analogous styles at every level of the game, and I might even find out if that backhand drop half volley is a good idea. (Probably not.)

PlaySight was the prompt, but it’s not the company’s job to fulfill product features only I care about. It doesn’t have to be PlaySight. Maybe it’s Mojjo, maybe Cizr. Or maybe it’s some college student who likes tennis and is looking for a machine-learning class. Hawk-Eye, the higher-tech, higher-priced, older competitor to PlaySight, has been slow to share its data with researchers and journalists. If PlaySight has figured out that most coaches value the video and don’t care much for stats, why not release the raw footage and stats to researchers, anonymized, who might get cracking on the tennis classification question or any of a dozen other tennis analysis questions I’ve never thought to ask? (Here’s a list of some Jeff and I have brainstormed, and here are his six big ones.) I hear all the time from people who like tennis and data and want to marry the two, not for money but to practice, to learn, to discover, and to share their findings. And other than what Jeff’s made available on GitHub, there’s not much data to share. (Just the other week, an MIT grad asked for tennis data to start analyzing.)

Sharing data with outside researchers “isn’t currently in the road map for our product team, but that could change,” Angus said, if sharing data can help the company make its data “actionable” for users to improve their games.

Maybe there aren’t enough rec players who’d want the data with enough cash to make such ventures worthwhile. But college teams could use every edge. Rising juniors have the most plastic games and the biggest upside. And where a few inches can change a pro career, surely some of the top women and men could also benefit from PlaySight-driven insights.

Yet even the multimillionaire ruling class of the sport is subject to the same limitations driven by the fractured nature of the sport: Each event has its own data and own systems. Even at Indian Wells, where Hawk-Eye exists on every match court, just two practice courts have PlaySight; the company was hoping to install four more for this year’s tournament and is still aiming to install them soon. Realistically, unless pros pay to install PlaySight on their own practice courts and play lots of practice matches there, few will get enough data to be actionable. But if PlaySight, Hawk-Eye or a rival can make sense of all the collective video out there, maybe the most tactical players can turn smarts and stats into competitive advantages on par with big serves and wicked topspin forehands.

PlaySight has already done lots of cool stuff with its tennis data, but the real analytics breakthroughs in the sport are ahead of us.

Carl Bialik has written about tennis for fivethirtyeight.com and The Wall Street Journal. He lives and plays tennis in New York City and has a Tennis Abstract page.

The Tennis Abstract Podcast

Today, Carl Bialik and I are excited to announce our new podcast, called–you guessed it–the Tennis Abstract Podcast.

In the inaugural episode, recorded Monday, we discuss the results from Indian Wells and Miami, with a special focus on the renewed relevance of Roger Federer and Rafael Nadal. We also take a wider look at the upcoming clay season for both the ATP and WTA tours.

We’re still working out all the kinks, so please forgive us for the bare bones presentation and the occasional audio glitch. If all goes well, we’ll have a new episode most weeks, maybe even with fewer glitches to apologize for.

Anyway, we both really enjoyed recording this first episode, and we hope you like it as well!

Click here to listen to Episode 1.

We’ll soon be listed on iTunes and elsewhere. In the meantime, our xml feed — http://www.tennisabstract.com/podcast/feed.xml — may come in handy for those of you who would like to subscribe.