Expected Points, Feb. 7: Time to Embrace the Russian Future

Expected Points, my new short, daily podcast, highlights three numbers to illustrate stats, trends, and interesting trivia around the sport.

Daniil Medvedev and Andrey Rublev place Team Russia at the top of the international heap, Felix Auger Aliassime’s final-round struggles continue, and Serena Williams chases a record that, adjusted for common sense, she has already passed.

You can subscribe on iTunes, Spotify, Stitcher, and elsewhere in the podcast universe.

This is very much still an experiment, so please let me know what you think.

Everybody’s a Tricky Lefty

Tennis players like routine, so maybe that’s what makes left-handed opponents “tricky.” The phrase “tricky lefty” is so common as to be a cliché, leading me to ask on Twitter last night whether there’s such a thing as a lefty who isn’t described as tricky.

A few of you responded, suggesting names such as Petra Kvitova and Rafael Nadal. It’s true, great left-handed players win matches because they’re great, not because they’re unusual. Plenty of adjectives come to mind for Petra and Rafa before “tricky.” Tour coach Marc Lucero suggested a broader framework:

I suspect Marc is right, and he would know better than I would. It doesn’t make sense that all lefties are tricky, even if players don’t face them very often.

Let’s be pedantic and go back to my original question, though: Are there any lefties who aren’t described as tricky? Mihaela Buzarnescu is certainly no Kvitova, but she is more aggressive than the average WTAer, at least according to Match Charting Project stats. To answer this question, I did some hardcore 21st-century research and googled it.

More specifically, I googled the following:

“tricky lefty” tennis

That’s not a perfect filter, because it excludes things like “tricky left-hander,” “a lefty whose tricky game…” and so on. But it gives us a good overview. Skipping over results with instructional content (“how to handle a tricky lefty serve!”) and pages discussing amateur players, here are the first 27 players Google told me are tricky lefties:

Alas, the world’s content writers do not hold to Lucero’s logically consistent definition. While some of the examples that Google gave me come from blogs, which we might not expect to maintain high editorial standards (pot, kettle, etc), one of the mentions of Nadal’s trickiness came from a very respectable publication, written by a pundit whose name you would know. Many of the other players were described as tricky on the tour websites, or in direct quotes from players. (Caroline Wozniacki used the t-word for Buzarnescu.)

Are lefties tricky?

As I said at the outset, tennis players like routine. Unless you’ve reached the finals at Roland Garros, facing a lefty is out of the ordinary. It’s the same type of unusual as drawing an opponent with a monster serve (Ivo Karlovic is incessantly deemed “tricky”) or a finely-honed backhand slice. There’s a whole range of tired tennis tropes for the underspinners–they “slice and dice” (really? they chop up the tennis balls into small cubes?), and their trickiness is rivaled only by how “crafty” they are.

We can’t quantify this unless we reframe the question. If lefties are tricky–or, let’s say, they have more capacity to be tricky than right-handers do–it’s roughly equivalent to saying that left-handers have an advantage. And if southpaws have an edge, we’d expect to see more of them in high-level tennis than in the population as a whole.

Is there a disproportionate number of lefties? This was one of the first tennis analytics questions I tried to answer, almost exactly a decade ago, and my conclusion then was: not really.

In February 2011, 12 of the top 100 players in the ATP rankings were left-handed. That includes Nadal, who complicates things a bit, as he’s a natural righty. About 10% of the population is left-handed, so 11 natural-born lefties out of 100 players is awfully close to what we’d expect if there were no advantage.

Don’t read too much into this, but things have changed a bit! At the moment, 15 of the top 100 ATPers are left-handed. (Still including Rafa, of course.) There’s only about a 4% chance that there would be so many lefties purely due to chance, or 7% if you class Rafa with the natural-born righties. It’s hardly a statistical slam dunk, and the case gets weaker when we broaden our view. There are 12 lefties among the next hundred male players, and only 18 lefties–fewer than we’d expect from chance alone–in the WTA top 200.
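For anyone who wants to check figures like these, here is a minimal sketch of the tail calculation. It assumes a plain binomial model in which each of the 100 ranked players is independently left-handed with probability 10%; the function name `prob_at_least` is only for illustration. Under that assumption the answer depends on whether you ask about "at least 15" or "more than 15" lefties, so the sketch prints both.

```python
from math import comb

def prob_at_least(k, n=100, p=0.10):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more lefties among n players."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: handedness is independent across players, with a 10% base rate.
for lefties in (14, 15):
    at_least = prob_at_least(lefties)
    more_than = prob_at_least(lefties + 1)
    print(f"{lefties} lefties: P(at least) = {at_least:.3f}, P(more than) = {more_than:.3f}")
```

With these inputs, the strict "more than" version gives roughly 4% for 15 lefties and 7% for 14, while the "at least" version gives roughly 7% and 12%. Either way, it is suggestive rather than conclusive.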

Paradoxically, the more lefty regulars on tour, the less uncomfortable they are to face. Put another way, the trickier they are, the less tricky they are.

There may well be an advantage to left-handedness, and its inherent trickery, in the junior or amateur ranks. (There is certainly an advantage when facing me!) But the evidence is flimsy that it extends to the highest level of the game. The real trick would be convincing everyone to start using a different adjective or–gasp!–treating non-superstar lefties as individuals with games that aren’t interchangeable, even if they do all use the same dominant hand.

Expected Points, Feb. 6: Garbine Muguruza Destroys the Field

Garbine Muguruza posts her fourth consecutive easy victory, Felix Auger Aliassime is hitting untouchable first serves, and Sofia Kenin will have to beat the odds to defend her Australian Open title.

You can subscribe on iTunes, Spotify, Stitcher, and elsewhere in the podcast universe.

Expected Points, Feb. 5: A Two-Match Day Leaves Karen Khachanov the Top Man Standing

Some of the biggest names on the women’s tour advance via the just-instituted third-set match tiebreak, Khachanov will play Jannik Sinner in the only meeting between seeds in Saturday’s men’s semi-finals, and Tony Trabert leaves a legacy of excellent play and longtime service to the game.

You can subscribe on iTunes, Spotify, Stitcher (click the “subscribe” button in the player), and elsewhere in the podcast universe.

Expected Points, Feb. 4: A Bad Day For Positive Tests

On today’s episode: Thursday’s action in Melbourne was canceled due to a positive coronavirus test, Dayana Yastremska won’t be returning to action anytime soon, and Benoit Paire gets back in the swing of things with an ignominious service game.

You can subscribe on Spotify or Stitcher (click the “subscribe” button in the player), and iTunes is coming soon.

New Mini-Podcast! Expected Points, Feb. 3

I’m trying something new: A short, daily(!?) podcast to keep you up to date with the tennis world. Patterned after the Numbers by Barron’s finance podcast, Expected Points highlights three numbers to illustrate stats, trends, and interesting trivia around the sport. Today’s pilot episode is under four minutes long, and I’ll aim to keep each installment around this length.

Joining me for the inaugural episode is Carl Bialik of the Thirty Love podcast. Given the short duration, this will probably be a solo podcast most of the time, but I look forward to including other voices as time and logistics permit.

Today’s episode features Carlos Alcaraz, superstars falling early in the women’s tournaments, and the imminent return of Roger Federer.

Expected Points isn’t yet on iTunes … or anywhere else, for that matter. It will be soon. In the meantime, you can listen right here using the player below:

Please let me know what you think–format, content, whatever. I’ve opened up comments on this post so you can respond, and you may also send comments my way on Twitter.

(Don’t worry, the long-form Tennis Abstract Podcast isn’t going anywhere–we’ll continue with our sporadic schedule throughout the year.)

So You’d Like To Do Some Tennis Research

Great! Here’s some data.

Maybe you’ve got a class project that will allow you to pick your own dataset. Or perhaps you just think that tennis analytics are cool, and you’d like to jump in. One of the more common questions I get is from people in this situation who are looking for a little guidance in choosing a subject. Here are a few tips.

1. Scratch your own itch

I try not to pick topics for others, because I generally find that people do better work (and are more likely to stick with it) when they are “scratching their own itch,” working on what they find particularly interesting. If nothing comes to mind, keep reading.

2. Get skeptical

When you’re watching tennis or reading about it, get in the habit of questioning everything. Does that player really hit more wide serves on break points? Does that guy really play better when he’s leading? If you listen with this type of mindset, you can come away from watching a single match with half a dozen new ideas.

This tip presupposes what might be step 0 — watch and read about tennis! I assume that if you’ve found my blog and want to do analytics, you’re already a pretty big fan. Keep it up–any analyst can benefit from attentively watching more tennis. Reading analytical work is also key, both to get ideas, and to learn what effective studies look like.

3. Think analogically

Many of us who do tennis analytics also work in other sports. Others are academics such as economists and statisticians whose “real jobs” have them working in fields far from athletics. Non-tennis subjects aren’t irrelevant–quite the contrary! If you do an interesting hockey study, or read about an interesting experimental design in development economics, think about how else you could apply a similar approach. Sometimes it’s a dead end with no direct application to tennis, but the exercise itself has value–practicing this kind of thinking eventually pays off.

This tip can be particularly useful for those of you doing a class project. If your professor provides examples of the type of work they’d like to see, consider if there’s a close cousin in tennis analytics. That first thought might not be where you end up, but it’s a good way both to get ideas and to ensure that you’re doing roughly the sort of work that’s asked of you.

4. Chart a match (or ten)

The Match Charting Project is the largest public dataset of shot-by-shot tennis data. It can be overwhelming at first, so if you are considering doing research with the dataset, I strongly recommend charting a match or two as a way to get familiar with it.

Charting a match is also a great way to generate more questions. It forces you to watch closely, so you’ll notice tactics that you might not have otherwise seen. As you chart, you might find yourself dreaming up hypotheses–say, that a player’s service return is particularly effective when she steps inside the baseline. The rest of the match will offer more data to confirm or contradict, and it might help you develop more ideas about where to go from there.

5. Collect your own data

There’s more than enough tennis data out there to keep you busy for a very long time. But don’t be afraid to strike out in a new direction. Perhaps you’d like to study whether certain players are more effective under the lights, which would require tracking the start time of matches. Maybe you’d like to see if certain coaches are particularly good at extracting better performances from their charges, which means you’d need to build a database of coaches, look up when they worked with each of their players, and how the players fared during that time.

Many analysts think that their job is just that–analysis. But in some areas, there is more to be gained from better data than from better analysis. Plus, building a new dataset doesn’t have to be a monumental task. The coaches example I gave might include only a few dozen coaches, who worked with a handful of players each.

6. Start small

Following some of my suggestions above can lead you into a huge, ambitious project. The most common result of taking on a huge project is an unfinished project, as I can tell you from experience. Before going big, try to find a “proof of concept” both to get your feet wet, and to see whether you’re on a useful track.

In the coaches example I just gave, you might look at what happened to the WTA rankings of Wim Fissette’s players when they worked with him. I don’t know if there’s a “Fissette effect,” and now that I mention it, I’m curious! That’s a mini-project you could do in an afternoon, and it gets you started on the path of a more thorough study.
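A first pass at a mini-project like that could be as simple as the sketch below. The three records are hypothetical placeholders, not real rankings, and `rank_changes` is an illustrative name; you would replace the list with each player's actual WTA rank at the start and end of the coaching stint.

```python
from statistics import median

# Hypothetical placeholder records, NOT real rankings:
# (player, WTA rank when the coaching stint began, rank when it ended)
stints = [
    ("Player A", 40, 12),
    ("Player B", 15, 22),
    ("Player C", 80, 35),
]

def rank_changes(records):
    """Map each player to places gained; a positive number means she moved up."""
    return {name: start - end for name, start, end in records}

changes = rank_changes(stints)
improved = sum(1 for delta in changes.values() if delta > 0)
print(f"{improved} of {len(changes)} players improved; "
      f"median change: {median(changes.values())} places")
```

A raw before-and-after comparison like this is only a starting point. A more convincing version would probably compare against how similarly ranked players fared over the same span.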

Ok, ok, here’s a list

Still stuck? A few years ago, Carl and I put together a list of potential research topics. I’ve since taken it down, but Peter forked it, so it still exists on GitHub.

Some of the topics have already been done, and several others are beyond the scope of what’s possible with publicly-available data. That still leaves you with dozens of ideas.

Finally, once you’ve completed a study–big or small–be sure to post it on Twitter and share with other tennis analysts. Your work might be the key that gives the next graduate student or hobby analyst the spark to start a project of their own.

Flipping Coins in the Rain

The singles final in last week’s ITF M15 Antalya event was washed out by rain. It went in the books as a walkover victory for Giovanni Fonio over Juan Manuel Cerundolo. Doubles specialist Harri Heliovaara explains (from the Finnish, via Google Translate, via Peter Wetz):

Namely, the ITF competitions were also played here last week, and the men’s final scheduled for Sunday of that competition could not be played at all due to the rain. Normally in that situation, each player only gets the ATP points and prize money of the losing finalist, but now the players threw themselves creatively and decided to decide the final winner with a coin toss. They agreed that the winner of the coin toss would receive a surrender win and thus the winner’s ATP points, while the loser would receive the winner’s prize money, so each received more than what would have resulted from not playing the final anyway.

This is indeed a clever solution, and one that was sometimes employed in the amateur era. When grass courts made up a bigger part of the tour and court maintenance in general was more primitive, it was more common for the tail end of events to be left unplayed. Usually the finalists (or semifinalists, in extreme cases) divided the prizes, but occasionally they resorted to a coin toss, especially in mixed doubles, which has always had a bit of an “exhibition” vibe.

While Fonio and Cerundolo benefited (in different ways) from the coin flip, parts of this scenario don’t quite smell right to me. First I’ll explain why, then I’ll offer a solution.

  1. The winning player gave his prize money to the loser. Change the context a tiny bit, and that’s match fixing.
  2. The ITF doesn’t have a provision in their rulebook for coin-flipping (as far as I know), so this solution only worked because the losing player agreed to claim an injury. Again, this is an unusual situation, but attesting to a fake injury is frowned upon, to say the least.
  3. Fonio gets some extra ranking points. Typically when tournaments are washed out, those points aren’t awarded, so it isn’t as if there is a precedent that Fonio or Cerundolo “deserved” those points. Instead, Fonio gains an unearned edge (albeit a small one) over several similarly-ranked players, who are presumably competing for entry and seeding in the same events.
  4. Players in other unfinished events–such as the Nur Sultan and Potchefstroom Challengers that were halted due to Covid-19 last March–didn’t get a chance to divide the unawarded points, by coin-flipping or any other method.

We can collapse these four points into two issues: First, there’s no ITF rule, so swapping points for prize money requires treading very close to some ethical and rule-breaking lines. Second, allowing players to improvise (sometimes? depending on the attitude of the on-site supervisor? I don’t know) inevitably gives an unearned advantage to some players over others.

Edit the rulebook

Fortunately, this is an easy fix. By providing a simple guideline for situations like this, the ITF can avoid the iffy behavior of prize-swapping and lying about injuries, ensure fairness for players across the whole tour, and do a better job of delivering the rewards that players expect when they show up for a tournament.

How about this:

Matches that cannot be played due to weather or force majeure will be decided by a coin flip, with ranking points awarded to the winner.

I’m not a lawyer, so my one-liner is probably missing a few paragraphs, but the main idea is pretty straightforward.

Prize money is trickier. Should anyone–“winner” or “loser”–receive prize money for unplayed matches? I don’t know. Tournaments–even the occasional ITF–earn revenue from ticket sales and broadcast rights, so they would surely prefer to hold back prize money from unplayed matches. On the other hand, players spend money and travel to events with the expectation that certain rewards are on offer.

Reap the benefits

The biggest gain in establishing this rule is consistency. As fans, we expect that sporting bodies treat players equally, and at the moment, handling unplayable matches is a real (if rare) source of inconsistency.

The other benefit is in guaranteeing at least some of what competitors have been promised. In several past articles, I’ve used metrics such as “expected points” to quantify how a player can expect to perform at a tournament. If he has a 50% chance of reaching the second round, he has a 50% chance of earning those points; if he has a 15% chance of winning the tournament, he has a 15% chance of earning those additional points. Most players don’t explicitly choose tournaments by predicting exact draws and calculating expected points, but many–including Heliovaara, incidentally–very much think in these terms.
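The arithmetic behind that kind of estimate is simple enough to sketch. The stage probabilities and point values below are made-up placeholders (not the real ITF points table or anyone's actual forecast), and `expected_points` is just an illustrative name. Each stage contributes its probability times the additional points earned for getting there.

```python
# Hypothetical inputs for one player at one event, for illustration only.
# Each entry: (stage, probability of reaching it, ADDITIONAL points for reaching it)
stages = [
    ("second round", 0.50, 1),
    ("quarter-final", 0.30, 1),
    ("semi-final", 0.22, 2),
    ("final", 0.18, 2),
    ("title", 0.15, 4),
]

def expected_points(stage_list):
    """Sum of (chance of reaching a stage) x (extra points that stage is worth)."""
    return sum(prob * extra for _, prob, extra in stage_list)

full_event = expected_points(stages)
# If the final is never played and no title points are awarded,
# the last term simply drops out of the sum.
no_final = expected_points(stages[:-1])
print(f"Expected points, full event: {full_event:.2f}")
print(f"Expected points, final washed out: {no_final:.2f}")
```

In this toy example, losing the final to the weather costs the player more than a quarter of his expected haul before a ball is struck.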

If a tournament is forced to end early, those calculations–explicit or not–are worthless. As a qualifier and the fifth seed, respectively, Fonio and Cerundolo probably would’ve been happy with finalist points in Antalya. But what about the players who ended up making trips to Kazakhstan and South Africa last March for nothing better than quarter-finalist points?

As I’ve said above, we can debate whether tournaments should be expected to pay out prize money (and whether prize money should go to the coin-flip losers, as it did in Antalya), but there’s no reason for a similar dispute about ranking points. Sure, some players would get lucky in that a coin proclaims them the winner, but it’s not much different from finding oneself the beneficiary of a withdrawal, or even in a weak section of the draw.

Fonio and Cerundolo ended up with the right solution, and I’m glad the tournament supervisor didn’t stand in the way. I just wish the coin flip were standard practice, to avoid the ethical tightrope walk. I’m sure that players would appreciate the increased clarity, as well.

Recreating the 1957 Women’s Tennis Season in 2,600 Easy Steps

Another historical season in the database! In 1957, Althea Gibson was so good it was almost boring. She was in the middle of a 161-week streak at the top of the Elo rankings, and with a 66-2 won-loss record this year, she finished the campaign more than 200 Elo points ahead of the number two player, Dorothy Head Knode.
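To put a 200-point gap in perspective, here is a small sketch that converts a rating difference into a head-to-head win expectation. It assumes the textbook Elo formula with its usual 400-point scale, which may differ in detail from the ratings on Tennis Abstract; `elo_win_probability` is an illustrative name.

```python
def elo_win_probability(rating_gap, scale=400):
    """Expected score for the higher-rated player under the standard Elo logistic curve."""
    return 1 / (1 + 10 ** (-rating_gap / scale))

# Assumption: standard Elo with a 400-point scale.
print(f"{elo_win_probability(200):.1%}")  # prints 76.0%
```

Under that assumption, a 200-point edge means the higher-rated player would be expected to win about three of every four meetings.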

Of course, no one knew about Elo in 1957, and there weren’t even week-by-week rankings. It didn’t take an advanced algorithm to know that Gibson belonged at the top of the heap. However, the newspapermen who published the most respected year-end ranking lists had at least as many blind spots as the WTA computer does these days. While Althea was comfortably on top, Knode was never considered to be better than 5th.

About 240 events worth of results–that’s about 2,600 matches–from this season are now on Tennis Abstract, and you can jump in via the 1957 season page. There’s a week-by-week calendar, year-end rankings, stats breakdowns for the top players, the most common head-to-heads, and country-by-country comparisons. All of this is now available for 11 pre-Open Era seasons.

The raw data has been added to my GitHub repo, and I offer another hearty round of thanks to the contributors at tennisforum’s Blast From the Past forum, who did the heavy lifting of typing out so many of these results from contemporary newspapers and annuals.

Podcast Episode 92: Natural Experiments and Second-Order Pandemic Effects

Episode 92 of the Tennis Abstract Podcast, with Carl Bialik, of the Thirty Love podcast, addresses the opportunity generated by the Covid-19 pandemic to study natural experiments in sports.

Many of the things we used to take for granted–stadiums full of fans, weekly travel schedules, consistent training opportunities–have been disrupted for some or all players, in tennis and other major sports. We consider what we can learn about home-court advantage, the predictability of results, the role of unchanging venues, and even the speed of play, by comparing pre-pandemic numbers with their corresponding figures since sports got back underway. We also wonder about the limitations of these sorts of studies, because there are always confounding variables. The biggest confounder of all: the pandemic itself.

I’ve been writing about these issues occasionally. Click for my posts on the predictability of match results, the effect of an empty stadium on serves, and the pace of play with no fans, no towelkids, and no linespeople.

Thanks for listening!

In housekeeping notes:

  • The TAP book club will reconvene in four weeks or so with our next selection, John Updike’s 1968 novel, Couples. Read along with us, tell us what you think, and suggest topics/questions/comments for our discussion in a future episode.
  • Fans of the TA podcast will also want to check out Dangerous Exponents, Carl’s and my Covid-19 podcast. Later today, we’re releasing a new episode about masks–the science behind wearing them, the ways researchers study their benefits, how they stack up against other public health interventions, and much more.

(Note: this week’s episode is about 51 minutes long; in some browsers the audio player may display a different length. Sorry about that! Also, I refer to this episode as episode 91, because for a numbers guy, I’m pretty bad at counting.)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.