jsackmann – Page 51 – Heavy Topspin

So, About Those Stale Rankings

Both the ATP and WTA have adjusted their official rankings algorithms because of the pandemic. Because many events were cancelled last year (and at least a few more are getting canned this year), and because the tours don’t want to overly penalize players for limiting their travel, they have adopted what is essentially a two-year ranking system. For today’s purposes, the details don’t really matter–the point is that the rankings are based on a longer time frame than usual.

The adjustment is good for people like Roger Federer, who missed 14 months and is still ranked #6. Same for Ashleigh Barty, who didn’t play for 11 months yet returned to action in Australia as the top seed at a major. It’s bad for young players and others who have won a lot of matches lately. Their victories still result in rankings improvements, but they’re stuck behind a lot of players who haven’t done much lately.

The tweaked algorithms reflect the dual purposes of the ranking system. On the one hand, they aim to list the best players, in order. On the other hand, they try to maintain other kinds of “fairness” and serve the purposes of the tours and certain events. The ATP and WTA computers are pretty good at properly ranking players, even if other algorithms are better. Because the pandemic has forced a bunch of adjustments, it stands to reason that the formulas aren’t as good as they usually are at that fundamental task.

Hypothesis

We can test this!

Imagine that we have a definitive list, handed down from God (or Martina Navratilova), that ranks the top 100 players according to their ability right now. No “fairness,” no catering to what tournament owners want, and no debates–this list is the final word.

The closer a ranking table matches this definitive list, the better, right? There are statistics for this kind of thing, and I’ll be using one called the Kendall rank correlation coefficient, or Kendall’s tau. (That’s the Greek letter τ, as in Τσιτσιπάς.) It compares lists of rankings, and if two lists are identical, tau = 1. If there is no correlation whatsoever, tau = 0. Higher tau, stronger relationship between the lists.

My hypothesis is that the official rankings have gotten worse, in the sense that the pandemic-related algorithm adjustments result in a list that is less closely related to that authoritative, handed-down-from-Martina list. In other words, tau has decreased.

We don’t have a definitive list, but we do have Elo. Elo ratings are designed for only one purpose, and my version of the algorithm does that job pretty well. For the most part, my Elo formula has not changed due to the pandemic*, so it serves as a constant reference point against which we can compare the official rankings.

* This isn’t quite true, because my algorithm usually has an injury/absence penalty that kicks in after a player is out of action for about two months. Because the pandemic caused all sorts of absences for all sorts of reasons, I’ve suspended that penalty until things are a bit more normal.

Tau meets the rankings

Here is the current ATP top ten, including Elo rankings:

Player       ATP  Elo  
Djokovic       1    1  
Nadal          2    2  
Medvedev       3    3  
Thiem          4    5  
Tsitsipas      5    6  
Federer        6    -  
Zverev         7    7  
Rublev         8    4  
Schwartzman    9   10  
Berrettini    10    8

I’m treating Federer as if he doesn’t have an Elo rating right now, because he hasn’t played for more than a year. If we take the ordering of the other nine players and plug them into the formula for Kendall’s tau, we get 0.778. The exact value doesn’t really tell you anything without context, but it gives you an idea of where we’re starting. While the two lists are fairly similar, with many players ranked identically, there are a couple of differences, like Elo’s higher estimate of Andrey Rublev and its swapping of Diego Schwartzman and Matteo Berrettini.
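For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation on the nine players above. It assumes the simplest form of Kendall's tau (no tied ranks), and the function name `kendall_tau` is just an illustrative choice, not something from my actual code.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau for two equal-length rankings, assuming no ties."""
    if len(xs) != len(ys):
        raise ValueError("rankings must have the same length")
    concordant = discordant = 0
    # Compare every pair of players: do both lists order them the same way?
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# ATP and Elo ranks from the table above, with Federer left out.
# Order: Djokovic, Nadal, Medvedev, Thiem, Tsitsipas, Zverev,
#        Rublev, Schwartzman, Berrettini
atp = [1, 2, 3, 4, 5, 7, 8, 9, 10]
elo = [1, 2, 3, 5, 6, 7, 4, 10, 8]

print(round(kendall_tau(atp, elo), 3))  # 0.778
```

Of the 36 possible pairs, only four are ordered differently by the two lists: Rublev against each of Thiem, Tsitsipas, and Zverev, plus Schwartzman against Berrettini.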

Let’s do the same exercise with a bigger group of players. I’ll take the top 100 players in the ATP rankings who met the modest playing time minimum to also have a current Elo rating. Plug those lists into the formula, and we get 0.705.

This is where my hypothesis falls apart. I ran the same numbers on year-end ATP rankings and year-end Elo ratings all the way back to 1990. The average tau over those 30-plus years is about 0.68. In other words, if we accept that Elo ratings are doing their job (and they are indeed about as predictive as usual), it looks like the pandemic-adjusted official rankings are better than usual, not worse.

Here are the year-by-year tau values, with a tau value based on current rankings as the right-most data point:

And the same for the WTA, to confirm that the result isn’t just a quirk of the makeup of the men’s tour:

The 30-year average for women’s rankings is 0.723, and the current tau value is 0.764.

What about…

You might wonder if the pandemic is wreaking some hidden havoc with the data set. Remember, I said that I’m only considering players who meet the playing time minimum to have an Elo rating. For this purpose, that’s 20 matches over 52 weeks, which excludes about one-third of top-100 ranked men and closer to half of top-100 women. The above calculations still consider 100 players for year-end 2020 and today, but I had to go deeper in the rankings to find them. Thus, the definition of “top 100” shifts a bit from year-end 2019 to year-end 2020 to the present.

We can’t entirely address this problem, because the pandemic has messed with things in many dimensions. It isn’t anything close to a true natural experiment. But we can look only at “true” top-100 players, even if the length of the list is smaller than usual for current rankings. So instead of taking the top 100 qualifying players (those who meet a playing time minimum and thus have an Elo ranking), we take a smaller number of players, all of whom have top-100 rankings on the official list.

The results are the same. For men, the tau based on today’s rankings and today’s Elo ratings is 0.694 versus the historical average of 0.678. For women, it’s 0.721 versus 0.719.

Still, the rankings feel awfully stale. The key issue is one that Elo can’t help us solve. So far, we’ve been looking at players who are keeping active. But the really out-of-date names on the official lists are the ones who have stayed home. Should Federer still be #6? Heck if I know! In the past, if an elite player missed 14 months, Elo would knock him down a couple hundred points, and if that adjustment were applied to Fed now, it would push down tau. But there’s no straightforward answer for how the inactive (or mostly inactive) players should be rated.

What we’ve learned today

This is the part of the post where I’m supposed to explain why this finding makes sense and why we should have suspected it all along. I don’t think I can manage that.

A good way to think about this might be that there is a sort of tour-within-a-tour that is continuing to play regularly. Federer, Barty, and many others haven’t usually been part of it, while several dozen players are competing as often as they can. The relative rankings of that second group are pretty good.

It doesn’t seem quite fair that Clara Tauson is stuck just inside the top 100 while her Elo is already top-50, or that Rublev remains behind Federer despite an eye-popping six months of results while Roger sat at home. And for some historical considerations–say, weeks inside the top 50 for Tauson or the top 5 for Rublev–maybe it isn’t fair that they’re stuck behind peers who are choosing not to play, or who are resting on the laurels of 18-month-old wins.

But in other important ways, the absolute rankings often don’t matter. Rublev has been a top-five seed at every event he’s played since late September except for Roland Garros, the Tour Finals, and the Australian Open, despite never being ranked above #8. When the tour-within-a-tour plays, he is a top-five guy. The likes of Rublev and Tauson will continue to have the deck slightly stacked against them at the majors, but even that disadvantage will steadily erode if they continue to play at their current levels.

Believing in science as I do, I will take these findings to heart. That means I’ll continue to complain about the problems with the official rankings–but no more than I did before the pandemic.

Expected Points, March 8: Clara Tauson, the Next Danish Tennis Superstar

Expected Points, my new short, daily podcast, highlights three numbers to illustrate stats, trends, and interesting trivia around the sport.

Up today: Tauson follows in the imposing footsteps of Caroline Wozniacki, Zizou Bergs leads a pack of qualifiers to unlikely feats, and Dubai becomes the first 1000-level event in the WTA’s puzzling new system.

Scroll down for a transcript.

You can subscribe on iTunes, Spotify, Stitcher, and elsewhere in the podcast universe.

The Expected Points podcast is still a work in progress, so please let me know what you think.

Expected Points, March 5: Qualifiers Reach Quarterfinals in Doha and Buenos Aires

Up today: Jessica Pegula deals Karolina Pliskova her worst loss in recent memory, Francisco Cerundolo tries to keep up with his younger brother, and Marcus Willis officially says goodbye.

Podcast Episode 97: Matt Futterman on the Australian Open and Sports With Fans

This week’s guest is Matt Futterman, reporter for the New York Times and author of Running to the Edge: A Band of Misfits and the Guru Who Unlocked the Secrets of Speed and Players: How Sports Became a Business.

Matt, who spent 15 days in hotel quarantine so that he could cover the Australian Open, talks about his time in isolation and what it was like to emerge into a semblance of normal life. He explains why sports aren’t really sports without fans, how close the Australian Open came to not happening, and why Sofia Kenin isn’t a bigger star.

I also take advantage of Matt’s extensive knowledge of distance running to ask whether the unique schedules of marathoners provide any insight into how tennis players can better manage the pandemic, how tennis pros can gain some of the benefits of being part of a team, and which active player would run the fastest marathon.

Thanks for listening!

(Note: this week’s episode is about 48 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Podcast housekeeping:

In case you haven’t heard, I’m one month into a short (~4 minutes) daily podcast called Expected Points. Here’s today’s episode.
The TAP book club will reconvene next week with our next selection, John Updike’s 1968 novel, Couples. Read along with us, share your thoughts, and suggest topics/questions/comments for our discussion in a future episode. (Yes, I know I said “next week” last week, too. This time I mean it. Probably.)

Expected Points, March 4: A Trio of Upsets Shift the Rotterdam Balance of Power

Up today: Garbine Muguruza and Aryna Sabalenka deliver a high-quality match, all the remaining seeds in the Rotterdam top half are gone, and Sumit Nagal goes where few Indians have gone before.

Expected Points, March 3: Garbine Muguruza Sets Up a Titanic Second-Rounder in Doha

Up today: Frances Tiafoe and Sumit Nagal make Buenos Aires an unusually international affair, Alex De Minaur needs 44 shots to put away John Millman, and Muguruza gets another round-of-16 draw worthy of a final.

Expected Points, March 2: Novak Djokovic is Poised to Break a Coveted Record

Up today: Clara Tauson scores a breakthrough victory in Lyon, Alexander Bublik sets a new standard for hopeless returning in Singapore, and Djokovic topples another one of Roger Federer’s all-time records.

Alexander Bublik and Return of Serve Futility

In Sunday’s Singapore final, Alexander Bublik won six return points. Not a typo. Out of Alexei Popyrin’s 52 serve points, that’s a win percentage of 11.6%. The technical term for this level of performance is… bad.

Yet somehow, Bublik concentrated four of those points in the fifth game and broke serve. (Popyrin helped–one of the four was a double fault.) Even more miraculously, it was the only break in the opening set, so the Russian won the set and got halfway to the title. Alas, he cranked the futility up another notch, winning only one return point the rest of the way, and it was Popyrin who came away with his maiden championship.

Freakish statistical feats tend to raise three questions: What are the odds? Has this ever happened before? And, can we learn anything from this nonsense?

What are the odds?

If Bublik had that exact 11.6% chance of winning each return point, his probability of breaking in any given game would be 0.26%, or about 1 in 384. In reality, it’s probably higher than that, because servers aren’t robots. Presumably Popyrin’s level dipped a bit. Still, if we take that 0.26% as the answer, Bublik’s likelihood of breaking serve at least once in the 22-game match was less than 3%.
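Here is a rough sketch of where those numbers come from, under the usual simplifying assumption that every point is independent and the returner wins each one with the same probability. The function name `p_win_game` is illustrative, and the count of 11 return games is my assumption that Bublik returned in half of the 22 games.

```python
def p_win_game(p):
    """Chance of winning a game when each point is won independently with probability p."""
    q = 1 - p
    # Win the game to love, 15, or 30 (four points won, with 0, 1, or 2 lost first).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (three points each), then win two points in a row before the opponent does.
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce

p_return = 0.116      # Bublik's return-point win rate, as quoted above
return_games = 11     # assumed: half of a 22-game match

p_break = p_win_game(p_return)
p_any_break = 1 - (1 - p_break) ** return_games

print(f"break in a given game: {p_break:.2%} (about 1 in {1 / p_break:.0f})")
print(f"at least one break in {return_games} return games: {p_any_break:.1%}")
```

The same function works from the server's side: pass in the server's point-winning rate and it returns the probability of holding.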

You probably don’t need the precise numbers to recognize that, if you win six return points in the whole match, your odds of breaking serve aren’t that great.

Has this ever happened before?

The answer depends on what you mean by “this.” In our 30 years of ATP tour matches with stats on things like return points won and breaks of serve, the Singapore final was the first time that a player broke serve and won a set despite winning six or fewer return points.

It’s fairly common for a player to have a very bad return day, or face an extremely hot server. On average, there are about 30 completed tour-level matches per year in which the loser manages six or fewer return points. But of those 900-plus matches, the official stats only show seven times that the loser managed to break serve. (I emphasize “official” here because the ATP’s stats do have errors, and extreme situations like these tend to bring them out of hiding. A simple data-entry error can easily make a routine match look like a record-breaker.)

The most recent instance of six-return-points-and-a-break was in 2010, when Lukasz Kubot concentrated his efforts in a single return game of a Bucharest first-round match against Filippo Volandri. Every match on the list was a first-rounder except for a 1995 quarter-final at the Tokyo Indoors, when Alexander Volkov managed to break Michael Chang despite winning only those few return points.

Every six-pointer was a straight-set loss, at least until Bublik came along.

Except… it’s possible to win six or fewer return points and win a set without breaking serve. In fact, it’s theoretically possible to win an entire match with only two return points going your way, if you deploy them in tiebreaks and remain flawless on your own deal. Reilly Opelka did exactly that (well, he won six points, not two) in Basel two years ago against Cristian Garin. Garin won all but 6 of his 69 service points but lost, 7-6(5) 7-6(10).

Bublik’s feat in the Singapore final wasn’t quite that level of oddity, but as an accomplishment amid return futility, his break-and-a-set is a close second.

Can we learn anything from this nonsense?

Bublik is a talented player, but he’s not a very good returner. This was his third career ATP final (excluding a two-game retirement in January), and his rates of return points won in those matches are 26.7%, 18.9%, and now 11.6%. It’s no surprise that he’s still looking for his first title. It turns out that underarm serving doesn’t have any secret advantages for his return game.

He has won 35.6% of his return points over the last 52 weeks–an improvement over his 34.1% mark at tour level in 2019, but still only good for 42nd out of the current ATP top 50. If he continues to serve big, that’s good enough for an Isner-like career, possibly spending considerable time in the top 20, maybe even with a brief stop in the top ten.

But to reach the next level, the Russian will need to return a lot better. Several years ago, I looked at the “minimum viable return game” necessary for an elite player. At the time, I was interested in Nick Kyrgios’s chances at a spot near the top of the rankings despite his own brand of return futility. In the 25 years between 1991 and 2015, when I wrote that piece, only four players finished a season in the top five while winning less than 37% of their return points, and two of those were within a percentage point of the threshold.

Kyrgios wasn’t close to that level then, and he still isn’t. Bublik is closer, but he’s still on the wrong side of the line. Optimists can point to the Russian’s relative youth–he turns 24 in June–and trust he’ll improve. Of course he might, but history isn’t on his side there, either. Kyrgios’s lack of progress is typical of the breed. Mediocre returners may improve their skills and tactics, but as they do so, they face more difficult opponents, keeping their numbers down.

If there is a positive take-away from the Singapore final, it’s that Bublik did manage to bunch his return points. Kyrgios outplays his numbers by saving his heroics for bigger moments. (Another way of looking at “outplaying his numbers” is “underperforming given his skills.”) Bublik shows signs of doing the same, so when he does manage to win more than six return points, he may be able to eke disproportionate gains out of them.

That’s the theory, anyway.

Expected Points, March 1: A Weekend of Teenage Exploits

Up today: Juan Manuel Cerundolo makes Argentinian tennis history, Iga Swiatek proves she’s more than a one-hit wonder, and Rotterdam kicks off with a slightly depleted field.

Do Players Like Daniil Medvedev Eventually Start Winning on Clay?

Daniil Medvedev is within a whisker of the ATP number two ranking, and he has twice reached a grand slam final. He has a big serve, but he’s more than a serve-bot, and his resourceful, varied baseline game suggests he has the tools to excel on all surfaces.

Yet out of 28 career tour-level matches on clay, he’s won 10. Ten wins is an awfully meager haul for a 25-year-old with his sights set on the sport’s top honors.

I put together a list of about 140 ATP top-tenners–that’s basically all of them, with the exception of those whose careers were well underway at the start of the Open Era. For each one, I tallied up their clay court winning percentage in their first 28 matches (or 29 or 30, if the 28th came in the middle of an event), their hard-court results up to the same point in their career, and their eventual clay court results.

When Medvedev played his most recent match on dirt last September, it dropped his clay winning percentage to 35.7%, compared to a hard court record of 116-51, or 69.5%. Few top ATPers have begun their careers so ineffective on clay or so deadly on hard.

In fact, only 5 of the 140 players were worse in their first 28 clay matches. It’s a motley bunch, ranging from Joachim Johansson (who only played 17 matches on the surface in his career) to Kevin Curren (who only got there at age 34) to Diego Schwartzman, who is best on clay, but was overmatched early in his career. The guys tied with Medvedev are an equally mixed crowd, including those who preferred to skip the clay–Tim Henman, Paradorn Srichaphan–and those who took some time to get their footing at tour level, such as Nicolas Almagro and Robin Soderling.

Unlike Daniil Medvedev

This sampling of names suggests that the question I started with is difficult to define. On paper, Henman was “like” Medvedev. By the time he finished his 28th clay match, he had already played 152 times at tour level on hard courts, winning two thirds of them. But their playing styles are so different that the statistical similarities could be misleading.

Let’s narrow the list of comparable players to those who meet the following criteria:

lost more than half of their first 28 clay matches
had played at least 75 hard-court matches by the time they played their 28th on clay (in other words, they weren’t slow-starting dirtballers like Schwartzman or Almagro)
played at least 40 more clay-court matches in their careers (to exclude the blatant clay-avoiders like Curren and Srichaphan)

The following table shows the remaining 14 players, plus Medvedev. I’ve included the age when they played their 28th clay match, and their winning percentages on clay and hard up to that point. The final three columns show how things proceeded from there–after the tournament when they played their 28th clay match (“Future”), you can see how many clay matches they played, what percentage they won, and how many titles they took home:

Player        Age  Clay%  Hard%  Future: M   W%  Titles  
T Johansson  24.1    29%    56%         79  38%       0  
Soderling    21.7    34%    53%        109  70%       3  
Henman       24.7    34%    66%         90  59%       0  
Medvedev     24.6    36%    69%                          
Enqvist      22.1    39%    67%         93  51%       1  
Federer      19.8    41%    62%        266  80%      11  
Rafter       24.4    41%    60%         41  59%       0  
Cilic        20.5    43%    64%        174  65%       2  
Anderson     25.9    43%    60%         80  56%       0  
Isner        25.9    45%    60%        100  57%       1  
Kiefer       21.9    45%    62%         94  45%       0  
Blake        23.4    46%    56%         72  46%       0  
Murray       21.9    48%    76%        125  74%       3  
Bjorkman     25.1    48%    60%         71  31%       0  
Rusedski     24.0    48%    54%         50  30%       0

The results don’t exactly leave Rafael Nadal quaking in his Nikes. Eight of the 14 never won a clay title, and Isner’s 2013 win in Houston barely saves him from making it nine. The combined post-28th-match winning percentage of these guys is just shy of 60%, which isn’t bad, until you consider that without Roger Federer, the rate drops to 55%. The four players who offer some hope for Medvedev–Federer, Soderling, Andy Murray, and Marin Cilic–all played their 28th tour-level match on clay before their 22nd birthday, and even given their relative inexperience, all but Soderling did better in their first 28 than the Russian did.

When we take age into consideration, Henman looks like an even better comp, alongside characters like Pat Rafter and Greg Rusedski. They were more obviously one-dimensional than Medvedev is, but their early-career results offer decent parallels. Medvedev can only hope the similarities end there.

One thing I learned in putting together this list was probably already obvious to most of you–there aren’t a lot of players, now or in the past, who can easily be described as “like” Daniil Medvedev. That makes forecasting even trickier than usual. His height and recent serving prowess almost classes him with Isner and Kevin Anderson, while his game style puts him in a category with … Murray?

There’s another lesson in trying to locate parallels for Medvedev. He’d better hope that he continues to defy easy classification. It’s a bit late to become the next Federer, so if he’s going to become more than an occasional threat on dirt, he’ll have a whole lot of historical precedent to overcome.