Podcast Episode 2: Doubles, Wild Cards, and Megastars

In the second episode of the Tennis Abstract Podcast, Carl Bialik and I give some much-deserved top billing to doubles, especially new ATP No. 1 Henri Kontinen and Elo doubles favorite Jack Sock.

We also cover the role of megastars in tennis, and the benefits and challenges they offer to the sport’s promoters. As we discuss, big names may be key to expanding the appeal of doubles, and they are the one major argument for the continuing existence of wild cards–on whichever side of the Maria Sharapova debate you find yourself.

Listen here, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

 

New at Tennis Abstract: ATP Doubles!

At last, I’ve added ATP doubles results to player pages at Tennis Abstract. Doubles has long been relegated to second-class status by tennis analytics, largely because the data just isn’t there. Now, much more is readily available.

Tennis Abstract now has career doubles results (including Challengers, Futures, and Satellites) for thousands of ATP players, and they’ll be updated throughout each day’s play, just like singles results. Match times and traditional match stats are included for most 2016 ATP and Challenger tournaments, and I hope that will continue to be the case in 2017 and beyond.

Let me give you a brief tour of what you’ll find, using doubles legend Jack Sock as a starting point:

The big red “1” shows where to click to switch over to doubles results. For full-time doubles specialists, you won’t have to click–the site will automatically show you doubles results.

The “2” indicates three new doubles-specific filters: by partner, by opponent, and by opposing team. For instance, you can see Sock’s results with Vasek Pospisil, his eight matches against Daniel Nestor, or his twelve meetings with the Bryan Brothers. You may always combine multiple filters, so for example, you can look at Sock’s record against the Bryans only when partnering Pospisil.

There are three more new filters, marked by the big “3” toward the bottom. The “vs Hands” filter allows you to select matches against righty-righty, righty-lefty, or lefty-lefty teams. “Partner Hand” and “Partner Rank” make it possible to limit matches to those in which the partner had certain characteristics.

Finally, the “4” shows you where to access more detailed stats. Doubles results take a lot more room to display than singles results, so on the default view, the only “stats” on offer are match time and Dominance Ratio. Click on “Serve,” “Return,” or “Raw” to get the other traditional numbers, such as ace rate, first-serve points won, or break points converted. All of these numbers are totals for each team; individual player stats are almost never available for doubles matches.

I hope you enjoy this new resource. It’s something I’ve wanted for a long time, so I’m excited to be able to use it myself. There are still some minor gaps in the record, as well as some kinks in the functionality, so please be patient as I try to work all of that out.

For those of you who’d like to see WTA doubles results, as well: Me too! I can’t promise any particular deadline, but I’ve already done much of the work to build the dataset, so I’m hoping to add them to women’s player pages early this year. Stay tuned!

The Unexpectedly Predictable IPTL

December is here, and with the tennis offseason almost five days old, it’s time to resume the annual ritual of pretending we care about exhibitions. The hit-and-giggle circuit gets underway in earnest tomorrow with the kickoff, in Japan, of the 2016 IPTL slate.

The star-studded IPTL, or International Premier Tennis League, is two years old, and uses a format similar to that of the USA’s World Team Tennis. Each match consists of five separate sets: one each of men’s singles, women’s singles, (men’s) champions’ singles, men’s doubles, and mixed doubles. Games are no-ad, each set is played to six games, and a tiebreak is played at 5-5. At the end of all those sets, if both teams have the same number of games, representatives of each side’s sponsors thumb-wrestle to determine the winner. Or something like that. It doesn’t really matter.

As with any exhibition, players don’t take the competition too seriously. Elites who sit out November tournaments due to injury find themselves able to compete in December, given a sufficient appearance fee. It’s entertaining, but compared to the first eleven months of the year, it isn’t “real” tennis.

That triggers an unusual research question: How predictable are IPTL sets? If players have nothing at stake, are outcomes simply random? Or do all the participants ease off to an equivalent degree, resulting in the usual proportion of sets going the way of the favorite?

Last season, there were 29 IPTL “matches,” meaning that we have a dataset consisting of 29 sets each of men’s singles, women’s singles, and men’s doubles. (For lack of data, I won’t look at mixed doubles, and for lack of interest, forget about champions’ singles.) Except for a handful of singles specialists who played doubles, we have plenty of data on every player. Using Elo ratings, we can generate forecasts for every set based on each competitor’s level at the time.

Elo-based predictions spit out forecasts for standard best-of-three contests, so we’ll need to adjust those a bit. Single-set results are more random, so we would expect a few more upsets. For instance, when Roger Federer faced Ivo Karlovic last December, Elo gave him an 89.9% chance of winning a traditional match, and the relevant IPTL forecast is a more modest 80.3%. With these estimates, we can see how many sets went the way of the favorite and how many upsets we should have expected given the short format.
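One simple way to get from a best-of-three forecast to a single-set forecast is sketched below. It assumes every set is won independently with the same probability, which is a simplification and not necessarily the exact adjustment used for these numbers. The function names are illustrative.

```python
def best_of_three_prob(set_prob: float) -> float:
    """Chance of winning a best-of-three match given a fixed per-set win chance."""
    # Win in two sets (p^2) or split the first two and win the third (2 * p^2 * (1 - p)).
    return set_prob ** 2 * (3 - 2 * set_prob)


def single_set_prob(match_prob: float, tol: float = 1e-9) -> float:
    """Invert best_of_three_prob by bisection to recover the per-set win chance."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if best_of_three_prob(mid) < match_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Federer's 89.9% best-of-three forecast against Karlovic.
federer_set = single_set_prob(0.899)
print(f"{federer_set:.1%}")
```

Under that independence assumption, the 89.9% match forecast maps back to a single-set figure of roughly 80.3%, in line with the number quoted above.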

Let’s start with men’s singles. Karlovic beat Federer, and Nick Kyrgios lost a set to Ivan Dodig, but in general, decisions went the direction we would expect. Of the 29 sets, favorites won 18, or 62.1%. The Elo single-set forecasts imply that the favorites should have won 64.2%, or 18.6 sets. So far, so predictable: If IPTL were a regular-season event, its results wouldn’t be statistically out of place.

The results are similar for women’s singles. The forecasts show the women’s field to be more lopsided, due mostly to the presence of Serena Williams and Maria Sharapova. Elo expected that the favorites would win 20.4, or 70.4% of the 29 sets. In fact, the favorites won 21 of 29.

The men’s doubles results are more complex, but they nonetheless provide further evidence that IPTL results are predictable. Elo implied that most of the men’s doubles matches were close: Only one match (Kei Nishikori and Pierre-Hugues Herbert against Gael Monfils and Rohan Bopanna) had a forecast above 62%, and overall, the system expected only 16.4 victories for the favorites, or 56.4%. In fact, the Elo-favored teams won 19, or 65.5% of the 29 sets, more than the singles favorites did.

The difference of less than three wins in a small sample could easily just be noise, but even so, a couple of explanations spring to mind. First, almost every team had at least one doubles specialist, and those guys are accustomed to the rapid-fire no-ad format. Second, the higher-than-usual number of non-specialists–such as Federer, Nishikori, and Monfils–means that the player ratings may not be as reliable as they are for specialists, or for singles. It might be the case that Nishikori is a better doubles player than Monfils, but because both usually stick to singles, no rating system can capture the difference in abilities very accurately.
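To put a rough number on the “noise” explanation, a binomial tail is a quick check. The sketch below treats every men’s doubles set as an independent trial at the average Elo forecast of 56.4%, which is a shortcut: the real forecasts vary from set to set.

```python
from math import comb


def prob_at_least(wins: int, sets: int, fave_prob: float) -> float:
    """P(favorites win at least `wins` of `sets`) if each set is an independent
    trial that the favorite wins with probability `fave_prob`."""
    return sum(
        comb(sets, k) * fave_prob ** k * (1 - fave_prob) ** (sets - k)
        for k in range(wins, sets + 1)
    )


# 29 men's doubles sets, 19 won by favorites, average Elo forecast of 56.4%.
tail = prob_at_least(19, 29, 0.564)
print(f"Chance of 19 or more favorite wins: {tail:.0%}")
```

If the printed probability sits well above the usual 5% threshold, a 19-win result is unremarkable for a sample this small.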

Here is a summary of all these results:

Competition      Sets  Fave W  Fave W%  Elo Forecast%  
Men's Singles      29      18    62.1%          64.2%  
Women's Singles    29      21    72.4%          70.4%  
ALL SINGLES        58      39    67.2%          67.3%  
                                                       
Men's Doubles      29      19    65.5%          56.4%  
ALL SETS           87      58    66.7%          63.7%

Taken together, last season’s evidence shows that IPTL contests tend to go the way of the favorites. In fact, when we account for the differences in format, favorites win more often than we’d expect. That’s the surprising bit. The conventional wisdom suggests that the elites became champions thanks to their prowess at high-pressure moments; many dozens of pros could reach the top if they were only stronger mentally. In exhos, the mental game is largely taken out of the picture, yet in this case, the elites are still winning.

No matter how often the favorites win, these matches are still meaningless, and I’m not about to include them in the next round of player ratings. However, it’s a mistake to disregard exhibitions entirely. By offering a contrast to the high-pressure tournaments of the regular season, they may offer us perspectives we can’t get anywhere else.

Forecasting Davis Cup Doubles

One of the most enjoyable aspects of Davis Cup is the spotlight it shines on doubles. At ATP events, doubles matches are typically relegated to poorly-attended side courts. In Davis Cup, doubles gets a day of its own, and crowds turn out in force. Even better, the importance of Davis Cup inspires many players who normally skip doubles to participate.

Because singles specialists are more likely to play doubles, and because most Davis Cup doubles teams are not regular pairings, forecasting these matches is particularly difficult. In the past, I haven’t even tried. But now that we have D-Lo–Elo ratings for doubles–it’s a more manageable task.

To my surprise, D-Lo is even more effective with Davis Cup than it is with regular-season tour-level matches. D-Lo correctly predicts the outcome of about 65% of tour-level doubles matches since 2003. For Davis Cup World Group and World Group Play-Offs in that time frame, D-Lo is right 70% of the time. To put it another way, this is more evidence that Davis Cup is about the chalk.

What’s particularly odd about that result is that D-Lo itself isn’t that confident in its Davis Cup forecasts. For ATP events, D-Lo forecasts are well-calibrated, meaning that if you look at 100 matches where the favorite is given a 60% chance of winning, the favorite will win about 60 times. For the Davis Cup forecasts, D-Lo thinks the favorite should win about 60% of the time, but the higher-rated team ends up winning 70 matches out of 100.
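A calibration check of this kind can be sketched in a few lines: group the forecasts into bins, then compare each bin’s average forecast with how often the favorite actually won. The function name and the sample data below are made up for illustration and are not drawn from the D-Lo dataset.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, bin_pct=10):
    """Group favorite forecasts into percentage bins and compare each bin's
    average forecast with the share of matches the favorite actually won."""
    bins = defaultdict(list)
    for prob, won in zip(forecasts, outcomes):
        # Work in whole percentage points to avoid floating-point bin edges.
        bins[int(round(prob * 100)) // bin_pct].append((prob, won))
    rows = []
    for key in sorted(bins):
        pairs = bins[key]
        avg_forecast = sum(p for p, _ in pairs) / len(pairs)
        win_rate = sum(w for _, w in pairs) / len(pairs)
        rows.append((key * bin_pct, len(pairs), avg_forecast, win_rate))
    return rows


# Made-up example: ten matches forecast at 60%, seven won by the favorite.
forecasts = [0.60] * 10
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
for lower, n, avg_forecast, win_rate in calibration_table(forecasts, outcomes):
    print(f"{lower}%+  n={n}  forecast={avg_forecast:.0%}  actual={win_rate:.0%}")
```

A well-calibrated system shows forecast and actual columns that roughly agree in every bin; the Davis Cup pattern described above would show up as the actual column running ahead of the forecast column.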

Davis Cup’s best-of-five format is responsible for part of that discrepancy. In a typical ATP doubles match, the no-ad scoring and third-set tiebreak introduce more luck into the mix, making upsets more likely. A matchup that would result in a 60% forecast in the no-ad, super-tiebreak format translates to a 64.5% forecast in the best-of-five format. That accounts for about half the difference: Davis Cup results are less likely to be influenced by luck.

The other half may be due to the importance of the event. For many players, regular-season doubles matches are a distant second priority to singles, so they may not play at a consistent level from one match to the next. In Davis Cup, however, it’s a rare competitor who doesn’t give the doubles rubber 100% of their effort. Thus, we appear to have quite a few matches in which D-Lo picks the winner, but since it uses primarily tour-level results, it doesn’t realize how heavily the winner should have been favored.

Incidentally, home-court advantage doesn’t seem to play a big role in doubles outcomes. The hosting side has won 52.6% of doubles matches, an edge which could have as much to do with hosts’ ability to choose the surface as it does with screaming crowds and home cooking. This isn’t a factor that affects D-Lo forecasts, as the system’s predictions are as accurate when it picks the away side as when it picks the home side.

Forecasting Argentina-Croatia doubles

Here are the D-Lo ratings for the eight nominated players this weekend. The asterisks indicate those players who are currently slated to contest tomorrow’s doubles rubber:

Player                 Side  D-Lo     
Juan Martin del Potro  ARG   1759     
Leonardo Mayer         ARG   1593  *  
Federico Delbonis      ARG   1540     
Guido Pella            ARG   1454  *  
                                      
Ivan Dodig             CRO   1856  *  
Marin Cilic            CRO   1677     
Ivo Karlovic           CRO   1580     
Franco Skugor          CRO   1569  *

As it stands now, Croatia has a sizable advantage. Based on the D-Lo ratings of the currently scheduled doubles teams, the home side has a 189-point edge, which converts to a 74.8% probability of winning. But remember, that’s the chance of winning a no-ad, super-tiebreak match, with all the luck that entails. In best-of-five, that translates to a whopping 83.7% chance of winning.
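The 189-point edge and the 74.8% figure can be reproduced with a short sketch, assuming a team’s rating is the plain average of its two players’ D-Lo ratings and that the standard Elo logistic curve on a 400-point scale applies. Both assumptions happen to match the numbers in the table, but they are inferred rather than stated. The best-of-five adjustment is not covered here.

```python
def team_rating(player_a: float, player_b: float) -> float:
    """Assumed team strength: the plain average of the two players' ratings."""
    return (player_a + player_b) / 2


def elo_win_prob(rating_diff: float) -> float:
    """Standard Elo logistic curve on the usual 400-point scale."""
    return 1 / (1 + 10 ** (-rating_diff / 400))


croatia = team_rating(1856, 1569)    # Dodig / Skugor
argentina = team_rating(1593, 1454)  # Mayer / Pella
edge = croatia - argentina
print(edge, f"{elo_win_prob(edge):.1%}")
```

Swapping Cilic (1677) in for Skugor gives the 243-point gap mentioned in the next paragraph by the same arithmetic.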

Making matters worse for Argentina, it’s likely that Croatia could improve their side. Argentina could increase their odds of winning the doubles rubber by playing Juan Martin del Potro, but given Delpo’s shaky physical health, it’s unlikely he’ll play all three days. Marin Cilic, on the other hand, could very well play as much as possible. A Cilic-Ivan Dodig pairing would have a 243-point advantage over Leonardo Mayer and Guido Pella, which translates to an 89% chance of winning a best-of-five match. Even Mayer’s Davis Cup heroics are unlikely to overcome a challenge of that magnitude.

Given the likelihood that Pella will sit on the bench for every meaningful singles match, it’s easy to wonder if there is a better option. Sure enough, in Horacio Zeballos, Argentina has a quality doubles player sitting at home. The two-time Grand Slam doubles semifinalist has a current D-Lo rating of 1758, almost identical to del Potro’s. Paired with Mayer, Zeballos would bring Argentina’s chances of upsetting a Dodig-Franco Skugor team to 43%. Zeballos-Mayer would also have a 32% chance of defeating Dodig-Cilic.

A full Argentina-Croatia forecast

With the doubles rubber sorted, let’s see who is likely to win the 2016 Davis Cup. Here are the Elo- and D-Lo-based forecasts for each currently scheduled match, shown from the perspective of Croatia:

Rubber                      Forecast (CRO)  
Cilic v Delbonis                     90.8%  
Karlovic v del Potro                 15.8%  
Dodig/Skugor v Mayer/Pella           83.7%  
Cilic v del Potro                    36.3%  
Karlovic v Delbonis                  75.8%

Elo still believes Delpo is an elite-level player, which is why it makes him the favorite in the pivotal fourth rubber against Cilic. The system is less positive about Federico Delbonis, who it ranks 68th in the world, against his #41 spot on the ATP computer.

These match-by-match forecasts imply a 74.2% probability that Croatia will win the tie. That’s more optimistic than the betting market, which, a few hours before play begins, gives Croatia about a 65% chance.
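As a rough check on how five rubber forecasts combine into a tie forecast, the sketch below treats each rubber as independent and assumes all five are played as scheduled. That ignores lineup changes and dead rubbers, so it is an approximation rather than the exact calculation behind the published figure. The function name is illustrative.

```python
def prob_win_tie(rubber_probs, needed=3):
    """Chance of winning at least `needed` rubbers, treating each rubber as
    an independent event and assuming all of them are played to a finish."""
    # dist[k] holds the probability of exactly k wins so far.
    dist = [1.0]
    for p in rubber_probs:
        nxt = [0.0] * (len(dist) + 1)
        for wins, mass in enumerate(dist):
            nxt[wins] += mass * (1 - p)
            nxt[wins + 1] += mass * p
        dist = nxt
    return sum(dist[needed:])


# Croatia's forecast in each rubber, in scheduled order.
croatia_rubbers = [0.908, 0.158, 0.837, 0.363, 0.758]
print(f"{prob_win_tie(croatia_rubbers):.1%}")
```

With the rounded inputs from the table, this lands within a fraction of a point of the 74.2% quoted above.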

However, most of the tweaks we could make would move the needle further toward a Croatia victory. Delpo’s body may not allow him to play two singles matches at full strength, and the gap in singles skill between him and Mayer is huge. Croatia could improve their doubles chances if Cilic plays. And if there is a home-court or surface advantage, it would probably work against the South Americans.

Even more likely than a Croatian victory is a 1-1 split of the first two matches. If that happens, everything will hang in the balance tomorrow, when the world tunes in to watch a doubles match.

Forecasting the 2016 ATP World Tour Finals

Andy Murray is the #1 seed this week in London, but as I wrote for The Economist, Novak Djokovic likely remains the best player in the world. According to my Elo ratings, he would have a 63% chance of winning a head-to-head match between the two. And with the added benefit of an easier round-robin draw, the math heavily favors Djokovic to win the tournament.

Here are the results of a Monte Carlo simulation of the draw:

Player        SF      F      W  
Djokovic   95.3%  73.9%  54.6%  
Murray     86.3%  58.3%  29.7%  
Nishikori  60.4%  24.9%   7.8%  
Raonic     50.9%  16.3%   3.3%  
Wawrinka   29.4%   7.8%   1.6%  
Monfils    33.2%   8.7%   1.4%  
Cilic      23.9%   5.8%   1.1%  
Thiem      20.7%   4.1%   0.5%
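For readers curious how a simulation like this works, here is a minimal sketch of a round-robin Monte Carlo. The ratings and group assignments are hypothetical placeholders, not the Elo ratings behind the table above, and ties in the group standings are broken at random instead of by the tour’s actual tiebreak rules.

```python
import random
from collections import Counter
from itertools import combinations


def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Standard Elo logistic curve: chance that A beats B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def play(a, b, ratings, rng):
    """Return the winner of one simulated match."""
    return a if rng.random() < elo_win_prob(ratings[a], ratings[b]) else b


def group_top_two(group, ratings, rng):
    """Play a round robin; rank by wins, breaking ties at random (a shortcut)."""
    wins = Counter({player: 0 for player in group})
    for a, b in combinations(group, 2):
        wins[play(a, b, ratings, rng)] += 1
    ranked = sorted(group, key=lambda p: (wins[p], rng.random()), reverse=True)
    return ranked[0], ranked[1]


def simulate(groups, ratings, runs=10000, seed=0):
    """Estimate each player's chance of reaching the SF, the F, and winning."""
    rng = random.Random(seed)
    semis, finals, titles = Counter(), Counter(), Counter()
    for _ in range(runs):
        (a1, a2), (b1, b2) = (group_top_two(g, ratings, rng) for g in groups)
        semis.update([a1, a2, b1, b2])
        # Each group winner meets the other group's runner-up.
        finalists = [play(a1, b2, ratings, rng), play(b1, a2, ratings, rng)]
        finals.update(finalists)
        titles[play(finalists[0], finalists[1], ratings, rng)] += 1
    return {p: (semis[p] / runs, finals[p] / runs, titles[p] / runs) for p in ratings}


# Hypothetical ratings and groups, for illustration only.
ratings = {"A": 2300, "B": 2050, "C": 2000, "D": 1950,
           "E": 2200, "F": 2100, "G": 2000, "H": 1980}
groups = [["A", "B", "C", "D"], ["E", "F", "G", "H"]]
for player, (sf, f, w) in simulate(groups, ratings).items():
    print(f"{player}  {sf:6.1%} {f:6.1%} {w:6.1%}")
```

Because four players reach the semifinals in every run, the SF column sums to 400%, the F column to 200%, and the W column to 100%, which is a handy sanity check on any table of this kind.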

I don’t think I’ve ever seen a player favored so heavily to progress out of the group stage. Murray’s 86% chance of doing so is quite high in itself; Novak’s 95% is otherworldly. His head-to-heads against the other players in his group are backed up by major differences in Elo points–Dominic Thiem is a lowly 15th on the Elo list, given only a 7.4% chance of beating the Serb.

If Milos Raonic is unable to compete, Djokovic’s chances climb even higher. Here are the probabilities if David Goffin takes Raonic’s place in the bracket:

Player        SF      F      W  
Djokovic   96.8%  75.2%  55.4%  
Murray     86.2%  60.7%  30.6%  
Nishikori  60.7%  26.3%   8.1%  
Monfils    47.7%  12.4%   1.8%  
Wawrinka   29.3%   8.5%   1.7%  
Cilic      23.8%   6.2%   1.1%  
Thiem      29.5%   5.8%   0.7%  
Goffin     26.0%   4.9%   0.5%

The luck of the draw was on Novak’s side. I ran another simulation with Djokovic and Murray swapping groups. Here, Djokovic is still heavily favored to win the tournament, but Murray’s semifinal chances get a sizable boost:

Player        SF      F      W  
Djokovic   92.8%  75.1%  54.9%  
Murray     90.9%  58.1%  29.8%  
Nishikori  58.4%  26.9%   7.5%  
Raonic     52.3%  14.3%   3.3%  
Wawrinka   26.9%   8.4%   1.6%  
Monfils    35.3%   7.5%   1.4%  
Cilic      21.9%   6.2%   1.0%  
Thiem      21.6%   3.4%   0.5%

Elo rates Djokovic so highly that he is favored no matter what the draw. But the draw certainly helped.

Doubles!

I’ve finally put together a sufficient doubles dataset to generate Elo ratings and tournament forecasts for ATP doubles. While I’m not quite ready to go into detail, I can say that, by using the Elo algorithm and rating players individually, the resulting forecasts outperform the ATP rankings about as much as singles Elo ratings do.
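Since the details are not spelled out here, the following is only a generic sketch of what rating doubles players individually with Elo might look like. The team rating as an average of the two partners, the equal split of the adjustment, and the K value of 20 are all assumptions made for illustration, not a description of the actual system.

```python
def expected_score(team_a: float, team_b: float) -> float:
    """Standard Elo expectation for team A on the 400-point scale."""
    return 1 / (1 + 10 ** ((team_b - team_a) / 400))


def update_doubles(ratings, winners, losers, k=20.0):
    """Update four individual ratings in place after one doubles match.

    Team strength is taken as the average of its two players' ratings, and
    both partners receive the same adjustment. K = 20 is an arbitrary choice.
    """
    win_team = sum(ratings[p] for p in winners) / 2
    lose_team = sum(ratings[p] for p in losers) / 2
    delta = k * (1 - expected_score(win_team, lose_team))
    for p in winners:
        ratings[p] += delta
    for p in losers:
        ratings[p] -= delta
    return delta


# Hypothetical starting ratings for four players.
ratings = {"P1": 1600.0, "P2": 1500.0, "P3": 1550.0, "P4": 1550.0}
update_doubles(ratings, winners=("P1", "P2"), losers=("P3", "P4"))
print(ratings)
```

Rating individuals rather than fixed pairs means a player carries a rating from one partnership to the next, which matters in a discipline where teams change often.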

Here is the forecast for the doubles event at the World Tour Finals:

Team               SF      F      W  
Herbert/Mahut   76.4%  49.5%  32.1%  
Bryan/Bryan     68.7%  36.8%  19.9%  
Kontinen/Peers  55.7%  29.1%  13.8%  
Dodig/Melo      58.4%  28.1%  13.2%  
Murray/Soares   48.3%  20.8%   8.6%  
Lopez/Lopez     37.7%  16.4%   6.2%  
Klaasen/Ram     30.2%  11.9%   4.0%  
Huey/Mirnyi     24.6%   7.3%   2.2%

This distribution is more like what round-robin forecasts usually look like, without a massive gap between the top of the field and the rest. Pierre-Hugues Herbert and Nicolas Mahut are the top rated team, followed closely by Bob Bryan and Mike Bryan. Max Mirnyi was, at his peak, one of the highest Elo-rated doubles players, but his pairing with Treat Huey is the weakest of the bunch.

The men’s doubles bracket has some legendary names, along with some players–like Herbert and Henri Kontinen–who may develop into all-time greats, but it has no competitors who loom over the rest of the field like Murray and Djokovic do in singles.

The Weirdest Thing About David Marrero’s Suspicious Mixed Doubles Match

You’ve probably seen the news: There was suspicious betting activity on a mixed doubles match a few days ago, hinting that some bettors knew ahead of time that David Marrero and Lara Arruabarena were going to lose to Andrea Hlavackova and Lukasz Kubot.

I don’t know whether it was a fix, or if someone leaked information, or if it was a publicity stunt by Pinnacle, who reported the suspicious activity. I don’t really care. Instead, what stuck out to me was this odd claim from Marrero, as reported by the Times:

“Normally, when I play, I play full power, in doubles or singles,” said Marrero, who won the doubles title at the 2013 ATP World Tour Finals. “But when I see the lady in front of me, I feel my hand wants to play, but my head says, ‘Be careful.’ This is not a good combination.”

As the Times also points out, Marrero’s record in mixed doubles is abysmal: 7-21 (with nine different partners), including 10 consecutive losses. He has, at times, ranked among the best doubles players in the world, yet managed to lose mixed matches alongside other greats, such as Hlavackova and Sara Errani. In six matches with Arantxa Parra-Santonja, a doubles specialist with eight tour-level titles, he’s lost the lot.

Assuming Marrero isn’t regularly fixing Grand Slam mixed doubles matches–after all, fixing a match this week would be awfully dumb–it’s clear that he’s not very good in this format. Here’s the weird thing: Before this mini-scandal, nobody was paying any attention.

Yeah, of course, it’s mixed doubles, which is little more than a glorified exhibition. Tennis isn’t great when it comes to statkeeping, and there’s virtually no one paying attention to doubles stats. The situation with mixed doubles is even worse. But if a singles player had a losing streak of 10 of just about anything, fans would know about it, and people would be watching closely.

Given the nature of the mixed doubles event–specialists frequently switch partners, and the format includes a super-tiebreak in place of a third set–we wouldn’t expect too many extremes. In fact, of the 36 players who have contested at least 15 mixed matches since 2009 (28 slams plus the 2012 Olympics), only Leander Paes, with a 63-21 record, has been as good as Marrero has been bad. No one else has won more than 70% of their mixed matches.

And since mixed doubles draws are full of non-specialists (like Naomi Broady and Neal Skupski, who beat Marrero and Parra-Santonja at Wimbledon in 2014) we would expect the specialists to perform better than average. Sure enough, of those 36 regulars, 25 have winning percentages of 50% or better, and all but four have won at least 43% of matches. Only Marrero and Raquel Atawo (formerly Kops-Jones) hold winning percentages below 36%.

Let’s say we give Marrero the benefit of the doubt–as far as the fixing goes, anyway–and accept his claim that he’s uncomfortable playing when there’s a woman across the net. It’s a strange state of affairs when (a) he continues playing almost every possible mixed doubles event despite his discomfort; (b) women choose to partner with him, either ignorant of his discomfort or simply happy to get into the draw; and (c) it’s possible to play 21 Grand Slams before the public gets any inkling that one of the 64 players in the mixed draw has a fundamental issue playing normally on the mixed doubles court.

Such comprehensive, long-standing ignorance isn’t out of place in tennis, especially in doubles. But given what we now know about David Marrero, the suspicious betting activity isn’t the influx of money against him–it’s the fact that anyone ever put money on him to win a mixed doubles match.

Lopsided Four-Setters, Orderly Doubles, and Sock’s Luck

On Wednesday, Guillermo Garcia-Lopez appeared to give Juan Martin del Potro quite the battle, taking him to four sets, with two tiebreaks along the way.  It wasn’t what anyone expected from Delpo’s first-round match against someone ranked outside the top 70.

Looking behind the scoreline, however, it becomes evident that the Argentine dominated the match.  Frequent HT commenter Tom Welsh pointed out that del Potro’s Dominance Ratio (DR) was 1.64, a mark that Delpo had not reached in his previous nine matches, and not since posting a 1.68 DR in a routine victory against Bernard Tomic in Washington.

Of course, a stat like DR, which considers the total number of return points won and service points lost, will not capture the ups and downs within a match.  What it does tell you is, over the course of the afternoon, how well both guys were playing.  And comparatively speaking, del Potro was playing much better.
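For reference, DR is commonly computed as the share of return points a player wins divided by the share of service points he loses. The sketch below assumes that definition; the point totals are invented and are not taken from the del Potro match.

```python
def dominance_ratio(return_pts_won, return_pts_played,
                    serve_pts_lost, serve_pts_played):
    """Share of return points won divided by share of service points lost."""
    return_rate = return_pts_won / return_pts_played
    serve_loss_rate = serve_pts_lost / serve_pts_played
    return return_rate / serve_loss_rate


# Made-up point totals, for illustration only.
dr = dominance_ratio(return_pts_won=50, return_pts_played=120,
                     serve_pts_lost=30, serve_pts_played=110)
print(round(dr, 2))
```

A DR of 1.0 means the two players were equally effective on return; anything well above that points to a one-sided match, whatever the set score says.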

Delpo had previously played 29 matches in his career in which he finished with a DR between 1.6 and 1.7, and in all but one of those (a three-setter against Dudi Sela in Washington in 2008) he won in straight sets.

It turns out, though, that in Grand Slam play, dropping a set in the middle of an otherwise routine performance–as measured by DR–isn’t that uncommon.  While the average DR in a Slam four-setter is only 1.37, the winner has tallied a DR of 1.64 or better in more than 12% of Slam matches since 1991.

If there’s a takeaway here, it’s something we should already know.  In a tennis match–especially one with tiebreaks–some points are tremendously more important than others.  Garcia-Lopez saved 9 of 13 break points.  Take away one of those in the second set, and we’re not having this discussion.  Give Delpo one more of the first 12 points in the second-set tiebreak, and things could’ve turned out differently.  One well-timed, high-leverage point has the potential to overturn dozens of points’ worth of poor play.

Yesterday I mused on the chaos that is men’s doubles, and the Bryan brothers’ ability to rise above it.  Yesterday’s action was surprisingly unchaotic.

By the end of play yesterday, 15 of the 16 men’s doubles seeds had completed their first-round matches.  (Sixth seeds Edouard Roger-Vasselin and Rohan Bopanna play today.)  Of those 15, 10 reached the second round, including every top-seven seed who has played.

Compare that to men’s singles, in which 10 of 32 seeds crashed out in the first round.  For a more direct comparison, consider that 4 of the top 16 men’s singles seeds lost in the first round.  Arguably, the doubles players have a tougher task.  Since the field is made up of only 64 teams, the first round can be more challenging in doubles than in singles.

What makes the sticking power of these top seeds surprising is the number of good doubles players who aren’t part of seeded teams.  Because the game is less physically demanding, doubles specialists can play on to much more advanced ages than can singles players.  One of the teams that executed an upset yesterday, Jonathan Erlich and Andy Ram, was in 2008 ranked among the top few pairings in the world.  Further, plenty of singles players have proven themselves quite adept at doubles, but don’t play enough to amass much of a ranking.

Part of the reason why the seeds have progressed more-or-less intact is the US Open format of three full sets.  At other levels, the third-set match tiebreak essentially turns the contest into a coin flip.  Both the second- and fifth-seeded pairs were forced into a third set, and at an event with a ten-point tiebreak, the odds would’ve been much higher that one of them would be headed home.

Jack Sock is playing only his fifth Grand Slam, and his first as a direct entry, having recently gotten his ranking into the top 100.  Part of the reason he was able to move into that rarefied air is his lucky path to the third round in last year’s US Open.

In 2012, his first-round draw was Florian Mayer, who retired in the middle of the third set.  That gave him a shot at the relatively weak Flavio Cipolla, who he beat in straight sets.  He gave Nicolas Almagro a scare in the third round but ultimately lost.  Still, he took home 90 ranking points instead of the 10 he would’ve collected had he lost to a healthy Mayer in the first round.

Defending those points, one might expect the young American to take a tumble in the rankings after the US Open.  After all, your typical 86th-ranked player doesn’t have much chance to reach the third round, let alone do so two years in a row.

But fortune has favored him again.  In the first round, he drew Philipp Petzschner, who retired in the middle of the third set.  (Sound familiar?)  Yesterday, he defeated the clay-court specialist qualifier Maximo Gonzalez, who did him the huge favor of knocking out Jerzy Janowicz in the first round.

It’s hard to imagine an easier route to a Slam round of 32.

At his site Betting Market Analytics, Michael Beuoy shows us the trajectory of Vicky Duval’s historic first-round upset, similar to some of the win-probability work I’ve done in the past.

Finally, more Duval: I charted her match last night, and have reams of data to show for it.