The Steadily Less Predictable WTA

Italian translation at settesei.it

Update: The numbers in this post summarizing the effectiveness of sElo are much too high–a bug in my code led to calculating effectiveness with post-match ratings instead of pre-match ratings. The parts of the post that don’t have to do with sElo are unaffected and–I hope–remain of interest.

One of the talking points throughout the 2017 WTA season has been the unpredictability of the field. With the absence of Serena Williams, Victoria Azarenka, and, until recently, Petra Kvitova and Maria Sharapova, there is a dearth of consistently dominant players. Many of the top remaining players have been unsteady as well, due to some combination of injury (Simona Halep), extreme surface preferences (Johanna Konta), and good old-fashioned regression to the mean (Angelique Kerber).

No top seed has won a title at the Premier level or above so far this year. Last week, Stephanie Kovalchik went into more detail, quantifying how seeds have failed to meet expectations and suggesting that the official WTA ranking system–the algorithm that determines which players get those seeds–has failed.

There are plenty of problems with the WTA ranking system, especially if you expect it to have predictive value–that is, if you want it to properly reflect the performance level of players right now. Kovalchik is correct that the rankings have done a particularly poor job this year identifying the best players. However, there’s something else going on: According to much more accurate algorithms, the WTA is more chaotic than it has been for decades.

Picking winners

Let’s start with a really basic measurement: picking winners. Through Rome, there had been more than 1100 completed WTA matches. The higher-ranked player won 62.4% of those. Since 1990, the ranking system has picked the winner of 67.9% of matches, and topped 70% during several years in the 1990s. It never fell below 66% until 2014, and this year’s 62.4% is the worst in the 28-year time frame under consideration.

Elo does a little better. It rates players by the quality of their opponents, meaning that draw luck is taken out of the equation, and does a better job of estimating the ability level of players like Serena and Sharapova, who for various reasons have missed long stretches of time. Since 1990, Elo has picked the winner of 68.6% of matches, falling to an all-time low of 63.1% so far in 2017.

For a big improvement, we need surface-specific Elo (sElo). An effective surface-based system isn’t as complicated as I expected it to be. By generating separate ratings for each surface (using only matches on that surface), sElo has correctly predicted the winner of 76.2% of matches since 1990, almost cracking 80% back in 1992. Even sElo is baffled by 2017, falling to its lowest point of 71.0%.
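As a rough illustration of the idea, here is a minimal sketch of surface-specific Elo in Python. It is not the code behind the numbers in this post: the K-factor of 32, the 1500 starting rating, the standard 400-point logistic curve, and the players "A" and "B" are all assumptions made for the example. The one detail worth noticing is that each pick is made from the ratings as they stood before the match, and only then are the ratings updated.

```python
from collections import defaultdict

DEFAULT_RATING = 1500.0  # assumed starting rating for a new player on a surface
K = 32.0                 # assumed K-factor; real systems often vary it


def win_probability(rating_a, rating_b):
    """Standard Elo logistic curve: chance that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def run_surface_elo(matches):
    """matches: iterable of (surface, winner, loser) in chronological order.

    Returns (ratings, correct, picked), where ratings is keyed by
    (surface, player) and picks are counted only when the two
    pre-match ratings differ.
    """
    ratings = defaultdict(lambda: DEFAULT_RATING)
    correct = picked = 0
    for surface, winner, loser in matches:
        # Read the PRE-match ratings before touching anything.
        r_win = ratings[(surface, winner)]
        r_lose = ratings[(surface, loser)]
        if r_win != r_lose:
            picked += 1
            correct += r_win > r_lose
        # Only now update, using results on this surface alone.
        shift = K * (1.0 - win_probability(r_win, r_lose))
        ratings[(surface, winner)] = r_win + shift
        ratings[(surface, loser)] = r_lose - shift
    return ratings, correct, picked


# Hypothetical results: A beats B twice on clay, B beats A once on hard.
sample = [("clay", "A", "B"), ("clay", "A", "B"), ("hard", "B", "A")]
ratings, correct, picked = run_surface_elo(sample)
print(correct, picked)
```

Because the clay and hard ratings live under separate keys, the hard-court result has no effect on either player's clay rating.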

(sElo for all three major surfaces is now shown on the Tennis Abstract Elo ratings report.)

This graph shows how effectively the three algorithms picked winners. It’s clear that sElo is far better, and the graph also shows that some external factor is driving the predictability of results, affecting the accuracy of all three systems to a similar degree:

Brier scores

We see a similar effect if we use a more sophisticated method to rate the WTA ranking system against Elo and sElo. The Brier score of a collection of predictions measures not only how accurate they are, but also how well calibrated they are–that is, a player forecast to win a matchup 90% of the time really does win nine out of ten, not six out of ten, and vice versa. Brier scores average the square of the difference between each prediction and its corresponding result. Because it uses the square, very bad predictions (for instance, that a player has a 95% chance of winning a match she ended up losing) far outweigh more pedestrian ones (like a player with a 95% chance going on to win).
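The calculation itself is short. A minimal sketch in Python, where each forecast is the probability given to one player and each outcome is 1 if she won and 0 if she lost (the function name and sample values are made up for illustration):

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 results."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# A 95% favorite who loses contributes (0.95 - 0)^2 = 0.9025,
# while a 95% favorite who wins contributes only (0.95 - 1)^2 = 0.0025.
upset = brier_score([0.95], [0])
routine = brier_score([0.95], [1])
print(round(upset, 4), round(routine, 4))
```

A forecaster who calls every match a coin flip scores exactly 0.25, which is a useful yardstick when reading the numbers that follow.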

In 2017 so far, the official WTA ranking system has a Brier score of .237, compared to Elo of .226 and sElo of .187. Lower is better, since we want a system that minimizes the difference between predictions and actual outcomes. All three numbers are the highest of any season since 1990. The corresponding averages over that time span are .207 (WTA), .202 (Elo), and .164 (sElo).

As with the simpler method of counting correct predictions, we see that Elo is a bit better than the official ranking, and both of the surface-agnostic methods are crushed by sElo, even though the surface-specific method uses considerably less data. (For instance, the clay-specific Elo ignores hard and grass court results entirely.) And just like the results of picking winners, we see that the differences in Brier scores of the three methods are fairly consistent, meaning that some other factor is causing the year-to-year differences:

The takeaway

The WTA ranking system has plenty of issues, but its unusually bad performance this year isn’t due to any quirk in the algorithm. Elo and sElo are structured completely differently–the only thing they have in common with the official system is that they use WTA match results–and they show the same trends in both of the above metrics.

One factor affecting the last two years of forecasting accuracy is the absence of players like Serena, Sharapova, and Azarenka. If those three played full schedules and won at their usual clip, there would be quite a few more correct predictions for all three systems, and perhaps there would be fewer big upsets from the players who have tried to replace them at the top of the game.

But that isn’t the whole story. A bunch of no-brainer predictions don’t affect Brier score very much, and the presence of heavily-favored players also makes it more likely that massively surprising results occur, such as Serena’s loss to Madison Brengle, or Sharapova’s ouster at the hands of Eugenie Bouchard. Many unexpected results are completely independent of the top ten, like Marketa Vondrousova’s recent title in Biel.

While some of the year-to-year differences in the graphs above are simply noise, the last several years look much more like a meaningful trend. It could be that we are seeing a large-scale changing of the guard, with young players (and their low rankings) regularly upsetting established stars, while the biggest names in the sport are spending more time on the sidelines. Upsets may also be somewhat contagious: When one 19-year-old aspirant sees a peer beating top-tenners, she may be more confident that she can do the same.

Whatever influences have given us the WTA’s current state of unpredictability, we can see that it’s not just a mirage created by a flawed ranking system. Upsets are more common now than at any other point in recent memory, whichever algorithm you use to pick your favorites.

Podcast Episode 8: Zverev’s Title, Emerging WTA Favorites, and a New Match Format

In Episode 8 of the Tennis Abstract Podcast, Carl Bialik and I survey the men’s and women’s fields in Rome and consider what last week’s top-tier events have to tell us about Roland Garros. We touch on Alexander Zverev’s maiden Masters title, the mixed signals of Dominic Thiem’s and Novak Djokovic’s tournaments, the rise of Elina Svitolina, and the continued relevance of Venus Williams.

We also have even more to say about wild cards (not Sharapova’s, I promise!) and dive into the potential of the best-of-five, first-to-four-games format set to debut at the NextGen ATP event in November.

Thanks for listening!

Click to listen, subscribe on iTunes, find us on Stitcher, or use our feed to get updates on your favorite podcast software.

Podcast Episode 7: Champion Simona, King Rafa, and Memories of Pico

In Episode 7 of the Tennis Abstract Podcast, Carl Bialik and I cover a lot of ground, from Simona Halep’s Madrid title and Kristina Mladenovic’s recent outspokenness, to Rafael Nadal’s unbeaten streak and Dominic Thiem’s rising status as a clay-court contender, along with the inevitability that someone born in the 1990s will eventually win a big ATP title.

Thanks for listening!

Click to listen, subscribe on iTunes, find us on Stitcher, or use our feed to get updates on your favorite podcast software.

Podcast Episode 6: Djokovic Therapy, More WTA Chaos, and Fivers in Roehampton

In Episode 6 of the Tennis Abstract Podcast, Carl Bialik and I discuss Novak Djokovic’s new coaching situation, consider the prospects of last week’s title winners (Marin Cilic and Alexander Zverev among them), and continue to watch Maria Sharapova in her return to tour. We also lament Wimbledon’s decision to charge admission for qualies and rejoice in an American or two who can play on clay.

Thanks for listening!

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Albert Ramos’s Record-Setting Doubles Futility

Last week, we learned that Albert Ramos is not very good at doubles. In Barcelona, he lost his first-round doubles match, running his losing streak to 21 straight and his career tour-level record to an astonishing 14-79.

Ramos hasn’t won a doubles match since Marrakech last year, so he has fallen off the doubles ranking list entirely. Elo isn’t so kind: Of the 268 players with at least one tour-level doubles match since 2014, Ramos ranks dead last, with an Elo rating of 1260, 130 points behind the second worst, Paul-Henri Mathieu, and 240 points below the default rating of 1500 given to a player when he first arrives on tour. If two players with Ramos’s rating were to play an elite team like Kontinen/Peers, Elo would give the Ramos team little more than a 2% chance of winning.
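To get a feel for what a 2% chance means in rating points, the Elo curve can be run in reverse. This is only a sketch, assuming the standard 400-point logistic formula; how two individual ratings are combined into a team rating is not specified here, so the code works purely with the gap between the two sides.

```python
import math


def elo_win_probability(gap):
    """Win probability for the side rated `gap` points below its opponent."""
    return 1.0 / (1.0 + 10 ** (gap / 400.0))


def gap_for_probability(p):
    """Rating deficit at which the weaker side wins with probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10((1.0 - p) / p)


print(round(gap_for_probability(0.02)))  # 676
```

Under those assumptions, a 2% underdog sits roughly 676 points below its opponent, a far larger gap than the 240 points separating Ramos from a tour debutant.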

It turns out that the Barcelona loss was a notable one, setting the mark for the longest tour-level doubles losing streak since 2000. Here is the list:

PLAYER               LOSSES     YEARS  
Albert Ramos             21   2016-17*  
Florent Serra            20   2008-10  
Lars Burgsmuller         18   2001-03  
Ryan Sweeting            17   2010-12  
Mikhail Kukushkin        17   2014-16  
Gael Monfils             16   2012-15  
Jack Waite               16   2001-02  
Mikhail Youzhny          16   2002-03  
Luke Jensen              15   2000-02  
Ratiwatana brothers      15   2008-09  
Taylor Dent              15   2001-04

* active streak

My database isn’t as complete before 2000, so I can’t confidently say whether there were longer streaks earlier in ATP history.
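For anyone who wants to run the same search on their own data, finding a streak is a single pass over a player's results in date order. A minimal sketch, with a made-up function name and a made-up sequence of results:

```python
def losing_streaks(results):
    """results: chronological list of booleans, True for a win.

    Returns (longest, active): the longest run of losses in the list
    and the run of losses the list currently ends on.
    """
    longest = active = 0
    for won in results:
        active = 0 if won else active + 1
        longest = max(longest, active)
    return longest, active


# Hypothetical record: win, two losses, win, one loss.
print(losing_streaks([True, False, False, True, False]))  # (2, 1)
```

The second value is what distinguishes an active streak, like the one marked with an asterisk above, from one that has already ended.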

Among active players, Ramos’s run of futility stands far above the pack. There are 14 players with active streaks of 8 or more tour-level losses, though as you’ll see, I’m defining “active” quite broadly:

PLAYER                STREAK  START  
Albert Ramos              21   2016  
Lukas Lacko               13   2012  
James Ward                11   2010  
Marinko Matosevic         11   2014  
Jimmy Wang                11   2006  
Zhe Li                    11   2010  
Omar Awadhy               10   2002  
Jose Rubin Statham        10   2006  
Mikhail Youzhny           10   2015  
Paul Henri Mathieu         9   2016  
Juan Monaco                9   2015  
Lucas Pouille              8   2016  
Andre Begemann             8   2016  
Daniel Gimeno Traver       8   2015

Many of the players on this list are attempting comebacks from injury or trying to rebuild their rankings to enter more ATP events, so few of them are likely to threaten Ramos’s mark. If he continues on tour, Mathieu may have the best chance: He has racked up five different losing streaks of 8 or more matches, including a 12-loss stretch between 2002 and 2005.

One of the things that makes Ramos’s streak so remarkable is that he has continued to enter doubles draws so frequently, playing both singles and doubles in 20 of his 31 events. Some of his peers have had poor doubles seasons, but few of them have kept trying so assiduously. Here are the 15 players with the worst doubles winning percentages in the last 52 weeks, minimum 10 matches:

PLAYER                   MATCHES  WINS  WIN PERC  
Albert Ramos                  20     0      0.0%  
Jiri Vesely                   10     1     10.0%  
Alexander Bury                13     2     15.4%  
Taylor Fritz                  11     2     18.2%  
Gilles Simon                  11     2     18.2%  
Benoit Paire                  16     3     18.8%  
Inigo Cervantes Huegun        10     2     20.0%  
Lucas Pouille                 15     3     20.0%  
Hans Podlipnik Castillo       13     3     23.1%  
Paolo Lorenzi                 33     8     24.2%  
Marcos Baghdatis              12     3     25.0%  
Adrian Mannarino              15     4     26.7%  
Andreas Seppi                 15     4     26.7%  
Joao Sousa                    30     8     26.7%  
Neal Skupski                  17     5     29.4%

Paolo Lorenzi might be a bit better than his position on this list makes him look: Over the last year, he has partnered Ramos four times, more than any other player.

Then again, Lorenzi has struggled with plenty of doubles partners. Here are the least successful doubles players since 2000, minimum 50 matches:

PLAYER              MATCHES  WINS  WIN PERC  
Albert Ramos             93    14     15.1%  
Robby Ginepri            97    21     21.6%  
Gilles Simon            151    33     21.9%  
Gael Monfils             92    21     22.8%  
Adrian Mannarino         58    14     24.1%  
Benoit Paire             93    23     24.7%  
Paul Henri Mathieu      105    26     24.8%  
Jack Waite               68    17     25.0%  
Florent Serra            72    18     25.0%  
Santiago Giraldo         99    27     27.3%  
Aleksandar Kitinov       88    24     27.3%  
Marinko Matosevic        61    17     27.9%  
Bernard Tomic            63    18     28.6%  
Younes El Aynaoui        56    16     28.6%  
Paolo Lorenzi           104    30     28.8%

Ramos, once again, is in a league of his own. Beyond him and Robby Ginepri, the list is dominated by a surprising number of Frenchmen, including Florent Serra, who outranks several of his countrymen, but appeared earlier with the 20-match losing streak that Ramos finally overtook.

Ironically, since Ramos’s losing streak has coincided with career-best success on the singles circuit, he will find it easier than ever to enter doubles draws. With the press that comes with the streak, however, potential partners may finally think twice before signing up with the worst tour-level doubles player of their generation.

Podcast Episode 5: Sharapova’s Back, Rafa’s the King, and Dropshots Are Cool

In Episode 5 of the Tennis Abstract Podcast, Carl Bialik and I talk Sharapova, Grand Slam wild cards, Laura Siegemund, drop shot tactics, Nadal’s second Decima, and a couple of interesting doubles storylines.

Thanks for listening!

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Why Maria Sharapova Should Get a French Open Wild Card

Italian translation at settesei.it

Maria Sharapova has returned from a 15-month doping suspension and hardly missed a step, advancing to the semifinals of her first tournament back, in the WTA Premier event in Stuttgart. While the draw has done her some favors–Ekaterina Makarova knocked out Agnieszka Radwanska, and Anett Kontaveit ousted Garbine Muguruza–Sharapova has shown she’s ready to compete at the highest level, winning about 57% of points against three credible opponents.

Many players have publicly stated that Sharapova doesn’t deserve to get wild cards, often because WCs are a sort of bonus, and a player who broke the rules doesn’t deserve any kind of handout. We’re likely to hear a lot more about it, as we won’t learn her status for the French Open for another two weeks.

However, wild cards are at the discretion of each individual tournament, and barring new regulations for players returning from doping bans, tournaments have their own incentives. Events often choose wild cards from a marketing perspective, granting main draw spots to former stars, young prospects, or local favorites.

Tournaments don’t have an explicit contract with their fans, but if they did, it would have to begin with an obligation to put the highest-quality product on the court. Most of the time, the ATP and WTA ranking and entry systems accomplish this, guaranteeing main draw places to the highest-ranked players. Occasionally, though, the ranking system fails and massively underrates the quality of a player.

Sharapova, obviously, is such a case. Unranked this week, and ranked #262 next week if she loses today to Kristina Mladenovic, she is already performing at the level of a top-20 player. My research suggests she may very soon be the best active player in women’s tennis, even if it takes many months for her official ranking to catch up.

Wild cards are the only mechanism tournaments are given to correct for the limitations of the ranking system. If the French Open (or any other event) wants to improve the quality of its draw, it should give Sharapova a wild card. If I am right that tournaments owe it to the fans to put on court the highest-level competition they possibly can, there are few opportunities so clear-cut as this one to improve the quality of a draw with a single player’s entry.

I can hear the objections already. First, as so many have claimed, Sharapova doesn’t “deserve” this kind of benefit. Yet by definition, wild cards are for players who don’t deserve a main draw entry. If they deserved one, their ranking (or “special” or “protected” ranking, if returning from injury) would guarantee them one. We use the words “deserve” and “earn” rather vaguely in this context, perhaps saying that a former great in his final year deserves a wild card based on his past contributions to tennis, or that a player has earned the free entry because she won a play-off of some sort.

It’s certainly true that some wild cards are more earned than others, but ultimately it’s beside the point. Even if it offends our sense of fairness, the players who most deserve a place in a draw are those who will make it more competitive. Last year, the French Federation gave wild cards to the likes of Alize Lim and Tessah Andrianjafitrimo, who lost in the first round to Qiang Wang without winning a single game. The eight wild cards won a total of three matches–one of them against another wild card. Except by virtue of being French, most of these wild cards didn’t do much to earn their places, and they had almost no impact on the tournament itself.

Beyond the claim that Sharapova, having broken the rules, doesn’t deserve a handout, there is a more extreme position, that her 15-month suspension wasn’t a sufficiently severe punishment. We can group that with another potential objection, that the French Open can’t be seen to endorse a doper. This is one of the many unfortunate side-effects of a weak central authority in tennis. By this argument, every tournament with the option of granting Sharapova an entry is required to re-litigate her doping ban. Even if we sidestep some of the controversial aspects of her ban and stipulate that she knowingly did something very bad, this seems nonsensical.

The whole point of having a central authority for doping enforcement is so that tournaments needn’t all police the players themselves. By issuing a 15-month ban, the ITF essentially spoke for all affiliated tournaments, saying that after 15 months (and exactly one day, as it turned out), Sharapova’s penalty would be paid and she would, in a sense, be rehabilitated. Giving a rehabilitated player a wild card is in no way an endorsement of her behavior, any more than giving a job to an ex-convict is an endorsement of the criminal act that put him in jail.

As a fan–even when I wish Sharapova wouldn’t win so many matches against my favorites–I want to see the best possible level of tennis every week. Now that her suspension is complete, every week that Sharapova wants to compete but can’t enter a top-level event is a missed opportunity for the sport. As great as the French Open is, it would be better with Sharapova than without her.

Diego Schwartzman’s Return Game Is Even Better Than I Thought

Click for an Italian translation

Diego Schwartzman is one of the most unusual players on the ATP tour. He is even shorter than David Ferrer, and his serve will never be a weapon, so the only way he can compete is by neutralizing everyone else’s offerings and winning baseline battles. Up to No. 34 in this week’s official rankings and No. 35 on the Elo list, he’s proven he can do that against some very good players.

Using the ATP stats leaderboard at Tennis Abstract, we can get a quick sense of how his return game compares with the elites. At tour level in the last 52 weeks (through Monte Carlo), he ranks third with 42.3% return points won, behind only Andy Murray and Novak Djokovic. He is particularly effective against second serves, winning 56.6% of those, better than anyone else on tour. He has broken in 31.8% of his return games, another third-place showing, this time behind Murray and Rafael Nadal.

Yet the leaderboard warns us to tread carefully. In the last year, Murray’s opponents have been far superior to Schwartzman’s, with a median rank of 24 and a mean rank of 41.5. The Argentine’s opponents have rated at 45.5 and 54.8, respectively. Murray, Djokovic, and Nadal are far better all-around players than Schwartzman, so they regularly reach later rounds, where the quality of competition goes way up.

Competition quality is one of the knottiest aspects of tennis analytics, and it is far from being solved. If we want to compare Murray to Djokovic, competition quality isn’t such a big factor. One or the other might get lucky over a span of months, but in the long run, the two best players on tour will face roughly equivalent levels of competition. But when we expand our view to players like Schwartzman–or even a top-tenner such as Dominic Thiem–we can no longer assume that opponent quality will even out. To use a term from other sports, the ATP has a very unbalanced schedule, and the schedule is always more challenging for the best players.

Correcting for competition quality is also key to understanding how any particular player evolves over time. If a player’s results improve, he’ll usually start facing more challenging competition, as Schwartzman is doing this spring in his first shot at the full slate of clay-court Masters events. If his return numbers decline, is he actually playing worse, or is he simply competing at his past level against tougher opponents?

Adjusting for competition

To properly compare players, we need to identify similarities in their schedules. Any pair of tour regulars have played many of the same opponents, even if they’ve never played each other. For instance, since the beginning of last season, Murray and Djokovic have faced 18 of the same players–some more than once. Further down the ranking list, players tend to have fewer opponents in common, but as we’ll see, that’s an obstacle we can overcome.

Here’s how the adjustment works: For a pair of players, find all the opponents both men have faced on the same surface. For example, both Murray and Djokovic have played David Goffin on clay in the last 16 months. Murray won 53.7% of clay return points against the Belgian, while Djokovic won only 42.1%, meaning that Djokovic returned about 22% worse than Murray did. We repeat the process for every surface-player combination, weight the results so that longer matches (or larger numbers of matches) count more heavily, and find the average.
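The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for this post: the data layout is invented, the point counts are hypothetical (chosen only so that they reproduce the 53.7% and 42.1% rates against Goffin), and the weight given to each shared opponent, the smaller of the two players' return-point totals, is a guess at one reasonable way to make longer matches count more.

```python
def relative_return(player, baseline):
    """Weighted average of player's RPW rate divided by baseline's.

    Both arguments map (opponent, surface) to a tuple of
    (return points won, return points played). Only keys present
    for both players are compared.
    """
    shared = player.keys() & baseline.keys()
    weighted_sum = total_weight = 0.0
    for key in shared:
        p_won, p_played = player[key]
        b_won, b_played = baseline[key]
        ratio = (p_won / p_played) / (b_won / b_played)
        weight = min(p_played, b_played)  # assumed weighting scheme
        weighted_sum += weight * ratio
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no shared opponents on the same surface")
    return weighted_sum / total_weight


# Hypothetical point counts: 29/54 is about 53.7%, 32/76 is about 42.1%.
murray = {("David Goffin", "clay"): (29, 54),
          ("Opponent X", "hard"): (40, 100)}   # not shared, so ignored
djokovic = {("David Goffin", "clay"): (32, 76)}

print(round(relative_return(djokovic, murray), 3))  # 0.784
```

A result of 0.784 is the "about 22% worse" figure from the Goffin example; with more shared opponents, the function returns the weighted average across all of them.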

When we do that for the top two men, we find that Djokovic has returned 2.3% better. (That’s a percentage, not percentage points. A great returner wins about 40% of return points, and a 2.3% improvement on that is roughly 41%.) Our finding suggests that Murray has faced somewhat weaker-serving competition: Since the beginning of 2016, he has won 42.9% of return points, compared to Djokovic’s 43.3%–a smaller gap than the competition-adjusted one.

It takes more work to reliably compare someone like Schwartzman to the elites, since their schedules overlap so much less. So before adjusting Diego’s return numbers, we’ll take several intermediate steps. Let’s start with the world No. 3 Stanislas Wawrinka. We follow the above process twice: Once for Wawrinka and Murray, then again for Stan and Novak. Run the numbers, and we find that Wawrinka’s return game is 12.5% weaker than Murray’s and 14.3% weaker than Djokovic’s. Wawrinka’s rates relative to the other two players correspond very well with what we already found, suggesting that Djokovic is a little better than his rival. Weighting the two numbers by sample size–which, in this case, is almost identical–we slightly adjust those two comparisons and conclude that Wawrinka’s return game is 12.4% worse than Murray’s.

Generating competition-adjusted numbers for each subsequent player follows the same pattern. For No. 4 Federer, we run the algorithm three times, one for each of the players ranked above him, then we aggregate the results. For No. 34 Schwartzman, we go through the process 33 times. Thanks to the magic of computers, it takes only a few seconds to adjust 16 months’ worth of return stats for the ATP top 50.
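The aggregation step might look something like the sketch below. It assumes that each head-to-head comparison is first converted to the common scale (by multiplying it by the already-computed rating of the player it was measured against) and that the results are then averaged with sample-size weights. The function name, the figures, and the weights are all hypothetical.

```python
def chained_rating(comparisons):
    """Combine comparisons against already-rated players into one rating.

    comparisons: list of (relative, reference_rating, weight), where
    `relative` is the new player's return rate as a fraction of the
    reference player's, `reference_rating` is that player's rating on
    the common scale (baseline = 1.0), and `weight` is a sample size.
    """
    total_weight = sum(weight for _, _, weight in comparisons)
    if total_weight <= 0:
        raise ValueError("need at least one comparison with positive weight")
    return sum(rel * ref * weight for rel, ref, weight in comparisons) / total_weight


# Hypothetical new player: 0.90 of the baseline player (rated 1.00)
# and 0.88 of a second player who is himself rated 1.02.
rating = chained_rating([(0.90, 1.00, 100), (0.88, 1.02, 100)])
print(round(rating, 3))
```

When the two converted figures land close together, as they do here, it is a sign that the comparisons are telling a consistent story.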

Below are the results for 2016-17. Players are ranked by “relative return points won” (REL RPW), where a rating of 1.0 is arbitrarily given to Murray, and a rating of 0.98 means that a player wins 2% fewer return points than Murray against equivalent opposition. The “EX RPW” column puts those numbers in a more familiar context: The top-ranked player’s rating is set equal to 43.0%–approximately the best RPW of any player in the last few seasons–and everyone else’s is adjusted accordingly. The last two columns show each player’s actual rate of return points won and their rank among the ATP top 50:

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK  
1     Diego Schwartzman         1.04   43.0%   42.4%     4  
2     Novak Djokovic            1.02   42.1%   43.3%     1  
3     Andy Murray               1.00   41.2%   42.9%     2  
4     Rafael Nadal              0.98   40.3%   42.6%     3  
5     David Goffin              0.97   40.1%   41.3%     5  
6     Gilles Simon              0.96   39.6%   40.1%     9  
7     Kei Nishikori             0.95   39.3%   40.1%    10  
8     David Ferrer              0.95   39.1%   40.6%     7  
9     Roger Federer             0.94   38.7%   38.7%    15  
10    Gael Monfils              0.93   38.5%   39.8%    11  


RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
11    Roberto Bautista Agut     0.93   38.3%   40.3%     8  
12    Ryan Harrison             0.92   37.9%   36.7%    33  
13    Richard Gasquet           0.92   37.9%   40.8%     6  
14    Daniel Evans              0.91   37.6%   36.9%    27  
15    Juan Martin Del Potro     0.91   37.5%   36.8%    32  
16    Benoit Paire              0.90   37.0%   38.1%    19  
17    Mischa Zverev             0.90   36.9%   36.9%    28  
18    Grigor Dimitrov           0.89   36.4%   38.2%    18  
19    Fabio Fognini             0.88   36.4%   39.7%    12  
20    Fernando Verdasco         0.88   36.4%   38.3%    16  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
21    Joao Sousa                0.88   36.2%   38.3%    17  
22    Dominic Thiem             0.88   36.2%   38.1%    20  
23    Stani Wawrinka            0.88   36.1%   37.5%    22  
24    Alexander Zverev          0.88   36.0%   37.5%    23  
25    Albert Ramos              0.87   35.9%   38.9%    14  
26    Kyle Edmund               0.86   35.5%   36.1%    37  
27    Jack Sock                 0.86   35.5%   36.6%    34  
28    Viktor Troicki            0.86   35.4%   37.1%    26  
29    Marin Cilic               0.86   35.4%   37.3%    25  
30    Pablo Carreno Busta       0.86   35.3%   39.4%    13  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
31    Milos Raonic              0.86   35.2%   36.1%    38  
32    Pablo Cuevas              0.85   35.1%   36.9%    29  
33    Tomas Berdych             0.85   35.1%   36.9%    30  
34    Borna Coric               0.85   34.9%   36.1%    39  
35    Nick Kyrgios              0.85   34.9%   35.7%    41  
36    Philipp Kohlschreiber     0.84   34.7%   37.9%    21  
37    Jo Wilfried Tsonga        0.84   34.6%   36.2%    36  
38    Sam Querrey               0.83   34.3%   34.6%    44  
39    Lucas Pouille             0.82   33.9%   36.9%    31  
40    Feliciano Lopez           0.81   33.2%   35.2%    43  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
41    Robin Haase               0.80   33.0%   36.1%    40  
42    Paolo Lorenzi             0.80   32.9%   37.5%    24  
43    Donald Young              0.78   32.2%   36.3%    35  
44    Bernard Tomic             0.78   32.1%   34.1%    45  
45    Nicolas Mahut             0.76   31.4%   35.4%    42  
46    Steve Johnson             0.75   31.0%   33.8%    46  
47    Florian Mayer             0.74   30.3%   33.5%    47  
48    John Isner                0.73   30.0%   29.8%    49  
49    Gilles Muller             0.72   29.8%   32.4%    48  
50    Ivo Karlovic              0.63   25.9%   26.4%    50
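The EX RPW column in the table above is just a rescaling of REL RPW, which a short Python sketch makes concrete. The function name is made up, and the inputs are the rounded two-decimal REL RPW values from the table, so the output can differ from the published column by a tenth or two of a percentage point.

```python
def expected_rpw(relative, top_rate=0.43):
    """Rescale relative ratings so the best one equals top_rate."""
    best = max(relative.values())
    return {name: top_rate * rel / best for name, rel in relative.items()}


# Rounded REL RPW values taken from the top of the table.
rel_rpw = {"Diego Schwartzman": 1.04, "Novak Djokovic": 1.02, "Andy Murray": 1.00}

for name, rate in expected_rpw(rel_rpw).items():
    print(f"{name}: {rate:.1%}")
```

The choice of 43.0% is only an anchor: changing it shifts every figure in the column by the same proportion and leaves the order untouched.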

The big surprise: Schwartzman is number one! While the average ranking of his opponents was considerably lower than that of the elites, it appears that he has faced bigger-serving opponents than have Murray or Djokovic. The top five on this list–Schwartzman, Murray, Djokovic, Nadal, and Goffin–do not force any major re-evaluation of who we consider to be the game’s best returners, but the competition-adjusted metric does offer more evidence that Schwartzman really belongs there.

There is a similar predictability at the bottom of the list. The five players rated the worst by the competition-adjusted metric–Steve Johnson, Florian Mayer, John Isner, Gilles Muller, and Ivo Karlovic–are the same five who sit at the bottom of the actual RPW ranking, with only Isner and Muller swapping places. This degree of consistency at the top and bottom of the list is reassuring: The metric is correcting for something important, but it isn’t spitting out any truly crazy results.

There are, however, some surprises. Three players do very well when their return games are adjusted for competition: Ryan Harrison, Daniel Evans, and Juan Martin del Potro, all of whom jump from the bottom half to the top 15. In a sense, this is a surface adjustment for Harrison and Evans, both of whom have played almost exclusively on hard courts. Players win fewer return points on faster surfaces (and faster surfaces attract bigger-serving competitors, magnifying the effect), so when adjusted for competition, someone who plays only on hard courts will see his numbers improve. Del Potro, on the other hand, has been absolutely hammered by tough competition, so in his case the correction is giving him credit for the difficult opponents he has had to face.

Several clay-court specialists find their return stats adjusted in the wrong direction. Last week’s finalist, Albert Ramos, falls from 14th to 25th, Pablo Carreno Busta drops from 13th to 30th, and Roberto Bautista Agut and Paolo Lorenzi see their numbers take a hit as well. This is the reverse of the effect that pushed Harrison and Evans up the list: Clay-court specialists spend more time on the dirt and they play against weaker-serving opponents, so their season averages make them look like better returners than they really are. It appears that these players are all particularly bad on hard courts: When I ran the algorithm with only clay-court results, Bautista Agut, Ramos, and Carreno Busta all appeared among the top 12 in competition-adjusted return points won. It’s their abysmal hard-court performances that pull down their longer-term numbers.

Beyond RPW

This algorithm–or something like it–has a great deal of potential beyond simply correcting return points won for tour-level competition quality. It could be used for any stat, and if competition-adjusted return rates were combined with corrected rates of service points won, it would generate a plausible overall player rating system.

Such a rating system would be more valuable if the algorithm were extended to players beyond the top 50, as well. Just as Schwartzman doesn’t yet have that many common opponents with the elites, Challenger-level stalwarts don’t share many opponents with tour regulars. But there is enough overlap that, when combining the shared opponents of dozens of players, we might be able to get a better grip on how Challenger-level competition compares to that of the highest levels. Essentially, we can compare adjacent levels–the elites to the middle of the pack (say, ATP ranks 21 to 50), the middle of the pack to the next 50, and so on–to get a more comprehensive idea of how much players must improve to achieve certain goals.
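To make the shared-opponent idea concrete, here is a minimal Python sketch. It is not the algorithm behind the rankings above, only an illustration under stated assumptions: the input format (one row per player-opponent pairing with return points won and played), the function names, and the sample numbers are all hypothetical.

```python
from collections import defaultdict

def rpw_by_opponent(rows):
    """Map player -> opponent -> [return points won, return points played].

    `rows` is a hypothetical input format: (player, opponent, won, played).
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for player, opponent, won, played in rows:
        totals[player][opponent][0] += won
        totals[player][opponent][1] += played
    return totals

def common_opponent_rpw(rows, a, b):
    """Compare two players' RPW using only opponents both have faced.

    Returns (rpw_a, rpw_b, shared_opponents), or None if they share none.
    """
    totals = rpw_by_opponent(rows)
    shared = (set(totals[a]) & set(totals[b])) - {a, b}
    if not shared:
        return None

    def rate(player):
        won = sum(totals[player][opp][0] for opp in shared)
        played = sum(totals[player][opp][1] for opp in shared)
        return won / played

    return rate(a), rate(b), sorted(shared)

# Made-up numbers for illustration only; these are not real match stats.
sample_rows = [
    ("A", "X", 40, 100), ("A", "Y", 30, 100), ("A", "Z", 50, 100),
    ("B", "X", 35, 100), ("B", "Y", 25, 100), ("B", "W", 60, 100),
]

# A and B share opponents X and Y: A wins 70 of 200 return points (0.35),
# B wins 60 of 200 (0.30). Results against Z and W are ignored.
result = common_opponent_rpw(sample_rows, "A", "B")
print(result)
```

This version pools points across shared opponents, so an opponent faced over more return points counts for more. Averaging the per-opponent rates instead would weight each shared opponent equally; which choice is better depends on how uneven the samples are.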

Finally, by adjusting serve and return stats so that we have a set of competition-neutral numbers for every player, for each season of his career, we will gain a clearer picture of which players are improving and by how much. Official rankings and Elo ratings tell us a lot, but they are sometimes fooled by lucky breaks, close wins, or inconsistent opposition. And they cannot isolate individual stats, which may be particularly useful for developmental purposes.

Adjusting for opposition quality is standard practice for analysts of many other sports, and it will help tennis analytics move forward as well. If nothing else, it has shown us that one extreme performance–Schwartzman’s return game–is much more than a fluke, and that service return greatness isn’t limited to the big four.

Podcast Episode 4: La Decima, a Wild Fed Cup Weekend, and Sharapova’s Return

In Episode 4 of the Tennis Abstract Podcast, Carl Bialik and I desperately try to cover everything going on in tennis right now. We start with Rafael Nadal’s Decima of Monte Carlo titles, along with the bizarre underperformance of the rest of the ATP elites.

From there, we touch on various highlights from the weekend’s Fed Cup action, including our perspective on Ilie Nastase and the Romania-Great Britain tie he so shamefully overshadowed. Finally, we compare predictions for Sharapova’s comeback–for this week, the clay season as a whole, and the season in general. With Serena out and Maria (and soon Vika and Petra) back in, the WTA field leaves us with even more uncertainty than usual.

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

 

Second-Strike Tennis: When Returners Dominate

Italian translation at settesei.it

On Wednesday, Diego Schwartzman scored a notable upset, knocking out 12th seed Roberto Bautista Agut in the second round of the Monte Carlo Masters. Even more unusual than Bautista Agut’s early exit was the way it happened. Both players won more than half of their return points: 61% for Schwartzman and 52% for Bautista Agut. There were 14 breaks of serve in 21 games.

Players like Schwartzman win more than half of return points fairly regularly. In the last 12 months, including both Challenger and tour-level matches, the Argentine–nicknamed El Peque for his diminutive stature–has done so more than 20 times. What is almost unheard of in the men’s game is for both players to return so well (or serve so poorly) that neither player wins at least half of his service points.

Since 1991–the first year for which ATP match stats are available–there have been fewer than 70 matches in which both players win more than half of their return points. (There are another 25 or so in which one player exceeded 50% and the other hit 50% exactly.) What’s more, these matches have become even less frequent over time: Wednesday’s result was the first instance on the ATP tour since 2014, and there have been fewer than 30 since 2000.
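For anyone who wants to run the same kind of filter, here is a minimal Python sketch of the test described above. It assumes each match record carries serve-point totals for the winner and loser; the field names (`w_svpt`, `w_sv_won`, `l_svpt`, `l_sv_won`) and the sample counts are hypothetical, not taken from the ATP database.

```python
def return_points_won(opp_serve_points, opp_serve_points_won):
    """A player's RPW rate: the share of the opponent's serve points
    that the opponent failed to win."""
    return 1 - opp_serve_points_won / opp_serve_points

def both_above(match, threshold=0.5, strict=True):
    """True if both players' RPW clears the threshold.

    strict=True means "more than" the threshold; strict=False means
    "at least" the threshold.
    """
    # The winner's return points are the loser's serve points, and vice versa.
    w_rpw = return_points_won(match["l_svpt"], match["l_sv_won"])
    l_rpw = return_points_won(match["w_svpt"], match["w_sv_won"])
    low = min(w_rpw, l_rpw)
    return low > threshold if strict else low >= threshold

# Illustrative counts only; these are not the real stats from any match.
return_heavy = {"w_svpt": 80, "w_sv_won": 38, "l_svpt": 70, "l_sv_won": 27}
serve_heavy = {"w_svpt": 60, "w_sv_won": 45, "l_svpt": 65, "l_sv_won": 40}
exactly_half = {"w_svpt": 80, "w_sv_won": 40, "l_svpt": 70, "l_sv_won": 27}

print(both_above(return_heavy))                # both returners above 50%
print(both_above(serve_heavy))                 # neither returner above 50%
print(both_above(exactly_half))                # loser's RPW is exactly 50%
print(both_above(exactly_half, strict=False))  # "at least 50%" admits it
```

The `strict` flag separates the two groups mentioned above: matches in which both players exceeded 50%, and the extra handful in which one player landed on 50% exactly. Raising `threshold` narrows the filter to more extreme matches.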

Here are the last 15 such matches, along with both the winner’s (W RPW) and loser’s (L RPW) rates of return points won. Few of the players or surfaces come as a surprise:

Year  Event            Players                 W RPW  L RPW  
2017  Monte Carlo      Schwartzman d. RBA      61.4%  51.9%  
2014  Rio de Janeiro   Fognini d. Bedene       50.6%  50.6%  
2014  Houston          Hewitt d. Polansky      51.3%  51.5%  
2014  Estoril          Berlocq d. Berdych      51.5%  50.6%  
2013  Monte Carlo      Bautista Agut d. Simon  58.8%  50.6%  
2013  Estoril          Goffin d. P Sousa       55.2%  50.5%  
2011  Casablanca       Fognini d. Kavcic       51.0%  51.9%  
2011  Belgrade         Granollers d. Troicki   61.5%  50.8%  
2008  Barcelona        Chela d. Garcia Lopez   54.3%  50.5%  
2008  Costa Do Sauipe  Coria d. Aldi           58.5%  51.9%  
2007  Rome Masters     Ferrero d. Hrbaty       52.9%  51.7%  
2007  Hamburg          Ferrer d. Bjorkman      50.6%  50.6%  
2006  Monte Carlo      Coria d. Kiefer         53.2%  50.9%  
2006  Hamburg Masters  Gaudio d. A Martin      57.3%  51.1%  
2006  Australian Open  Coria d. Hanescu        53.4%  50.6%

All but 8 of the 69 total matches were on clay. One of the exceptions is at the bottom of this list, from the 2006 Australian Open, and before 2006, there were another five hard-court contests, along with two on grass courts. (The ATP database isn’t completely reliable, but in each of these cases, the high rate of return points won is partially verified by a similarly high number of reported breaks of serve.)

Bautista Agut, who won one of these matches four years ago in Monte Carlo, is one of several players who participated in multiple return-dominated clashes. Guillermo Coria played in five, winning four, and Fabrice Santoro took part in four, winning three. Coria won more than half of his return points in 75 tour-level matches over the course of his career.

Of course, both Schwartzman and Bautista Agut cleared the 50% bar with plenty of room to spare. The Spaniard won 51.9% of return points and Schwartzman comfortably exceeded 60%, putting them in an even more elite category. It was only the 22nd match since 1991 in which both players won at least 51.9% of return points.

As rare as these matches are, Schwartzman is doing everything he can to add to the list. With a ranking now in the top 40, he has entered just about every clay tournament on the schedule, so the most return-oriented competitor in the game is going to play a lot more top-level matches on slow surfaces. If anyone has a chance at equaling Coria’s mark of winning four of these return-dominated matches, my money’s on El Peque.