Danielle Collins and Surprise Major Semi-finalists

Italian translation at settesei.it

With a three-set win today over Anastasia Pavlyuchenkova, Danielle Collins became the first woman into the 2019 Australian Open semi-finals. She was already the biggest surprise of the eight quarter-finalists. A week ago, most pundits (myself included) would’ve picked dozens of players more likely to find themselves in the final four.

Collins, a 25-year-old American, has doubled her grand slam match experience at a single tournament. She first made a name for herself as a stellar collegiate player, winning national titles in 2014 and 2016, which earned her wild cards into her first two majors. While she gave Simona Halep a scare by taking a set in their 2014 US Open encounter, no wins resulted from either of the wild cards. After her run to the Miami semi-finals last year, she earned her way into three more slams, but she drew seeds at all three and had to settle for first-round loser’s checks. All told, Collins’s experience at majors amounted to five main draws, five first-round losses, and a couple of wins in qualifying.

There’s simply no precedent for what she has done in Melbourne. She opened by narrowly upsetting 14th seed Julia Goerges, then won six sets in a row to knock out Sachia Vickery, 19th seed Caroline Garcia, and 2nd seed Angelique Kerber, needing barely one hour per match. Today’s contest took a bit longer, but the end result was the same: a 2-6 7-5 6-1 victory over Pavlyuchenkova, who was playing in her fifth major quarter-final.

A berth in a major semi-final with no previous grand slam match wins: that’s something worth a database query. Since 1980, only three other women have done the same: Monica Seles at the 1989 French Open, Jennifer Capriati at the 1990 French, and Alexandra Stevenson in 1999 at Wimbledon. Collins doesn’t exactly fit in with that trio: Seles and Capriati were playing their first majors, and neither had reached their 16th birthdays. Stevenson was 18 years old, playing only her third slam main draw. The closest comp for Collins is found in the men’s game, where 25-year-old Marco Cecchinato reached the semis at Roland Garros last year despite recording no wins in his previous attempts at majors.

Reaching the final four in one’s sixth slam isn’t as rare. Twelve different women have done so, including Seles, Capriati, and Stevenson, along with Venus Williams and Eugenie Bouchard. But again, Collins’s time at the University of Virginia sets her apart from this group: all but one were teenagers, and the only other exception, Clarisa Fernandez, was 20 years old when she reached the 2002 Roland Garros semi-final. The least experienced 25-year-old semi-finalist was Fabiola Zuluaga, who made it to the 2004 Australian Open semis in her 17th major, with 22 match wins in her first 16 tries.

History offers few precedents for Collins. While male collegiates such as Kevin Anderson and John Isner have established themselves in the top ten and gone deep at majors, the women’s game has always skewed younger. Yes, the days of 15-year-old sensations like Capriati and Seles are behind us, but the most recent major title went to 20-year-old Naomi Osaka, and the same year that Collins won her first national title for Virginia, Bouchard–who is two months younger than the American–reached the Wimbledon final. The greatest success story in women’s collegiate tennis belongs to Lisa Raymond, who is best known for her exploits on the doubles court.

Perhaps Collins’s success will change that, much as Anderson–whose first major semi-final came at age 31, in his 34th slam–has shown that college can fit in the plans of a would-be ATP star. With 20% of the WTA top 100 in their thirties, there’s more for a late starter to look forward to than ever before. It’s unreasonable to expect that Collins will be a regular feature at the tail end of grand slams, but it’s possible she’ll outdo Raymond, who peaked at 15th in the singles rankings. Next time we see her in the second week of a major, we won’t be so surprised.

Frances Tiafoe’s Narrow Margins

Italian translation at settesei.it

Yesterday, Frances Tiafoe added another breakthrough to his young career with a fourth-round defeat of 20th seed Grigor Dimitrov at the Australian Open. The whole tournament has been a coming-out party for the just-turned 21-year-old, as Tiafoe only got this far thanks to an even more impressive upset of 5th seed Kevin Anderson in the second round. The American will see his ranking climb into the top 30 for the first time, and his marketability as a potential superstar will soar even higher.

The role of the statistical analyst is often to stand athwart an exciting trend yelling “Stop!,” and I’m afraid that’s my role today. Yes, Tiafoe is a compelling young player with a lot of potential. Throughout 2018 he repeatedly demonstrated he could hang with the best players in the world, something he further solidified with the win over Anderson last week. But the Dimitrov win, life-changing as it may be, was a bit of a fluke.

In fact, yesterday’s match was–by a couple of simple metrics–less impressive than a lot of his 2018 losses, including a defeat at the hands of Dimitrov in Toronto last year. Across 337 points against the Bulgarian on Sunday, Tiafoe lost more than half of them, winning only 34.7% of his return points compared to Dimitrov’s 39.5%. The resulting Dominance Ratio (DR) for the match is 0.88, a mark that almost never results in victory. (DR is the ratio of return points won to opponent return points won: 1.0 means that the players performed equally, and higher is better.) On the ATP tour last year, more than 92% of winners recorded a DR of 1.0 or better, and 97.4% of winners–that’s 39 out of every 40–won enough points to amass a DR of at least 0.9.
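For readers who want to check the arithmetic, here is a minimal sketch of the DR calculation as defined in the parenthetical above. The function name is just an illustrative choice; the two percentages are the return-point rates quoted for this match.

```python
def dominance_ratio(return_pts_won_pct, opp_return_pts_won_pct):
    """DR = a player's return-point win rate divided by the opponent's."""
    if opp_return_pts_won_pct <= 0:
        raise ValueError("opponent return-point rate must be positive")
    return return_pts_won_pct / opp_return_pts_won_pct


# Return-point rates quoted above: Tiafoe 34.7%, Dimitrov 39.5%
tiafoe_dr = dominance_ratio(34.7, 39.5)
print(round(tiafoe_dr, 2))  # 0.88
```

A value of exactly 1.0 comes out whenever the two rates are equal, which is the "players performed equally" case.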

As I’ve said, many of Tiafoe’s losses have seen him play better. Against Dimitrov in Toronto, his DR was 0.98; versus Anderson in Miami his DR was 0.99 in a straight-set defeat; and even in his routine, 6-4 6-4 loss to Joao Sousa in the Estoril final, his DR was almost as good as it was yesterday, at 0.87. In the range of close-but-outplayed matches–let’s say DRs from 0.85 to 0.99–Tiafoe won 4 of 18 last year, and all but one of the wins were closer than yesterday’s triumph.

The trick to winning a match while tallying fewer than half the total points and a lower rate of return points than your opponent is to play better in the big moments, like break points. The American certainly did that, converting 5 of 13 break opportunities while limiting Dimitrov to only 3 of 18. Execution in tiebreaks also helps, though it didn’t make a difference in yesterday’s upset, as the two men split a pair of breakers. To Tiafoe’s credit, he outplayed the Bulgarian when it mattered most. In that sense, he deserved the victory, no matter what the stats say.

But break point and tiebreak performance tends to even out. Just because the 21-year-old captured lightning in a bottle at a few key moments to win a high-profile match doesn’t mean he’ll be able to do it again. Just as there are almost no players who win tiebreaks any more often than their overall performance would suggest, players with excellent single-year break-point records quickly regress to the mean. It may not be correct to say that Tiafoe was lucky to win yesterday–he may well have kept his focus and maintained his level better than his opponent did–but whatever made the difference, it’s not something with predictive power. Next week, next major, or next year, he isn’t any more likely than the next guy to post a DR of 0.88 and come out on top.

Still, I’m not here just to throw cold water on a young player’s prospects. For one thing, had a couple of break points gone the other way yesterday and Dimitrov gotten through, a fourth-round result would still represent an encouraging step forward for the American. His upset of Anderson sported a particularly impressive DR of 1.29–35.1% of return points won compared to Kevin’s 27.2%–which was better than all but ten of Anderson’s matches last year. (Three of those ten came at the hands of Novak Djokovic, and seven of the ten were against top ten players.)

Tiafoe is getting better, and there are plenty of signs that indicate he’s the brightest young star in American men’s tennis. He’s accomplished a lot of things in Melbourne, but outplaying Dimitrov isn’t one of them.

Podcast Episode 45: Australian Open Week One

Episode 45 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, is our attempt to cover the entirety of seven days of grand slam tennis in a one-hour podcast. On the men’s side, we discuss Federer’s vulnerability to an early upset, what to think about Tiafoe and the young American resurgence in general, and some solid under-the-radar performances from Milos Raonic and Roberto Bautista Agut.

We then make some cautious predictions about the Simona-Serena fourth round match and consider whether we should be as excited about Ashleigh Barty as my Elo ratings are. We even talk a bit about doubles, though it’s mostly about why it’s hard to talk about doubles. But don’t worry–we’ll keep trying.

Thanks for listening!

(Note: this week’s episode is about 65 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

The Happy Slam is the Speedy Slam

Italian translation at settesei.it

Two years ago, during the 2017 Australian Open, I offered a partial explanation of the many upsets at that year’s first major. Novak Djokovic, Andy Murray, Angelique Kerber, Simona Halep and many others had been ousted before the quarter-finals, all to players with a more aggressive, attacking style. It turned out that the courts that year were playing particularly fast–quicker than any of the other slams, including Wimbledon, as well as most hard-court tour stops.

In Melbourne this year, the courts are playing even faster.

Through three rounds of play, almost 90% of the tournament’s singles matches are in the books. Based on my surface-speed metric, which measures how many aces are struck at each tournament while controlling for the mix of servers and returners, the 2019 Australian Open can boast the quickest surface at the event since at least 2011*, and the second-fastest conditions of any major in that time span.

* Match stats, even simple ones such as service points and aces, are increasingly tough to come by for the women’s game before 2011.

The average of my surface-speed ratings for the men’s and women’s events at 2019’s first major is 1.28, meaning that there have been 28% more aces than expected, given the mix of servers and returners across the matches played so far. The notably fast 2017 event was 1.23, the fastest US Open of the last eight years was 1.14 (in 2015), and last year’s Wimbledon, played on the surface that is supposed to be fastest of all, was a mere 1.06.
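To make the "28% more aces than expected" reading concrete, here is a rough sketch of the ratio itself, not the actual metric. It assumes we already have, for each match, an expected ace count for that pairing of server and returner; how those expectations are derived isn't described here, so `expected_aces` is a hypothetical input and the sample numbers are made up for illustration.

```python
def speed_rating(matches):
    """Ratio of actual to expected aces over (actual_aces, expected_aces) pairs."""
    actual = sum(a for a, _ in matches)
    expected = sum(e for _, e in matches)
    if expected <= 0:
        raise ValueError("total expected aces must be positive")
    return actual / expected


# Made-up matches: (actual aces, expected aces given the server/returner mix)
sample = [(16, 12.5), (8, 6.0), (24, 19.0)]
print(f"{speed_rating(sample):.2f}")  # 1.28, i.e. 28% more aces than expected
```

A rating of 1.0 would mean the aces struck matched expectations exactly.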

Here are the top ten fastest slam surfaces from 2011 to the present:

Speed Rating  Tournament
1.31          2011 Wimbledon
1.28          2019 Australian Open*
1.27          2014 Wimbledon
1.27          2016 Australian Open
1.23          2017 Australian Open
1.20          2015 Australian Open
1.18          2015 Wimbledon
1.17          2013 Wimbledon
1.17          2012 Wimbledon
1.15          2014 Australian Open

* through first three rounds

Last year’s Aussie Open was a bit of an outlier, but even still, it barely missed this list, coming in 12th at 1.12.

At least most players arrived prepared. The warm-up events in Brisbane and Auckland ranked among the fastest conditions since the beginning of last season: Brisbane rates at 1.29 while Auckland came in at a blink-and-you’ll-miss-it 1.35. Last year, only four events per tour were faster.

In theory, such a speedy surface should work to the advantage of big servers with aggressive games. At least so far, it hasn’t worked out that way. Unlike in 2017, Djokovic, Halep, and Kerber are still in the running, while Kevin Anderson was an early casualty. On the other hand, the court speed does jibe with some results, like Maria Sharapova’s third-round upset of defending champion Caroline Wozniacki.

If the conditions are to impact the result of the tournament, it will have to happen in matches yet to come. A slick surface tends to favor Roger Federer, even if Djokovic remains the popular pick to hoist the trophy next Sunday. More immediately, a fast surface doesn’t bode well for Halep’s chances in her fourth-round match against Serena Williams. Facing Serena is difficult enough without the conditions working against you, too.

Dayana Yastremska Hits Harder Than You

Italian translation at settesei.it

At the 2019 Australian Open, tennis balls have more to fear than ever before. Serena Williams is back and appears to be in top form, Maria Sharapova is playing well enough to oust defending champion Caroline Wozniacki, and Petra Kvitova has followed up her Sydney title with a stress-free jaunt through the first three rounds.

And then there are the youngsters. Hyper-aggressive 20-year-old Aryna Sabalenka crashed out in the third round against an even younger threat, Amanda Anisimova. But still in the draw, facing Serena on Saturday, is the hardest hitter of all, 18-year-old Ukrainian Dayana Yastremska. Watch a couple of Sabalenka matches, and you might wonder if we’ve reached the apex of aggression on the tennis court. Nope: Yastremska turns it up to 11.

When Lowell first introduced his aggression score metric a few years ago, Kvitova was the clear leader of the pack, the player who ended points–for good or ill–most frequently with the ball on her racket. Madison Keys wasn’t far behind, with Serena coming in third among the small group of players for which we had sufficient data. Since then, two things have changed: The Match Charting Project now has a lot more data on many more players, and a new generation of ball-bashers has threatened to make the rest of the tour look like weaklings in comparison.

The aggression score metric packs a lot of explanatory power in a simple calculation: It’s the number of point-ending shots (winners, unforced errors, or shots that induce a forced error from the opponent) divided by the number of shot opportunities. The resulting statistic ranges from about 10% at the lower extreme–Sara Errani’s career average is 11.6%–to 30%* at the top end. Individual matches can be even higher or lower, but no player with at least five charted matches sits outside of that range.

* Readers with a keen memory or a penchant for following links will notice that in Lowell’s original post, Kvitova’s aggregate score was 33% and Keys was also a tick above 30%. I’m not sure whether those were flukes that have since come back down with larger samples, or whether I’m using a slightly different formula. Either way, the ordering of players has remained consistent, and that’s the important thing.
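The calculation described above can be sketched in a few lines. This follows the definition as stated here (point-ending shots divided by shot opportunities); the exact formula behind the published numbers may differ slightly, and the counts below are made up for illustration.

```python
def aggression_score(winners, unforced_errors, induced_forced_errors,
                     shot_opportunities):
    """Share of shot opportunities on which the player ended the point."""
    if shot_opportunities <= 0:
        raise ValueError("shot_opportunities must be positive")
    point_ending = winners + unforced_errors + induced_forced_errors
    return point_ending / shot_opportunities


# Made-up match totals, for illustration only
score = aggression_score(winners=25, unforced_errors=30,
                         induced_forced_errors=10, shot_opportunities=250)
print(f"{score:.1%}")  # 26.0%
```

Note that unforced errors count toward the score just as winners do, which is why the metric measures how often a player ends the point, for good or ill, rather than how well she plays.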

Here are the top ten most aggressive WTA tour regulars of the 2010s before Sabalenka and Yastremska came along:

Rank  Player                      Agg 
1     Petra Kvitova             27.1%  
2     Julia Goerges             26.8%  
3     Serena Williams           26.8%  
4     Jelena Ostapenko          26.5%  
5     Camila Giorgi             26.0%  
6     Madison Keys              25.9%  
7     Coco Vandeweghe           25.9%  
8     Sabine Lisicki            25.6%  
9     Anastasia Pavlyuchenkova  24.0%  
10    Maria Sharapova           23.2%

All of these women rank among the top 15% of most aggressive players. They end points more frequently on their own racket than plenty of competitors we also consider aggressive, like Venus Williams (21.9%), Karolina Pliskova (21.6%), and Johanna Konta (22.3%). Ostapenko bridges the gap between the two generations; she wasn’t part of the discussion when aggression score was first introduced, though once she started winning matches, it was immediately clear that she’d challenge Kvitova at the top of this list.

Here’s the current top ten:

Rank  Player               Agg  
1     Dayana Yastremska  28.6%  
2     Aryna Sabalenka    27.6%  
3     Petra Kvitova      27.1%  
4     Julia Goerges      26.8%  
5     Serena Williams    26.8%  
6     Jelena Ostapenko   26.5%  
7     Viktoria Kuzmova   26.0%  
8     Camila Giorgi      26.0%  
9     Madison Keys       25.9%  
10    Coco Vandeweghe    25.9%

Yastremska, Sabalenka, and even Viktoria Kuzmova have elbowed their way into the top ten. Yastremska’s and Kuzmova’s places on this list might be a little premature, since their scores are based on only seven and nine matches, respectively. But Sabalenka’s pugnaciousness is well-documented: her Petra-topping score of 27.6% is an average across almost 30 matches.

Tennis tends to swing between extremes, with one generation developing skills to counteract the abilities of the previous one. It’s not yet clear whether the aggression of these young women will catapult them to the top–after all, Sabalenka won only five games today against Anisimova, whose aggression score is a more modestly high 23.0%. Perhaps as they gain experience, they’ll develop more well-rounded games and return Kvitova to her place at the top.

In the meantime, we have the privilege of watching some of the hardest hitters in WTA history battle it out. Tomorrow, Yastremska will contest her first third round at a major in a must-watch match against Serena. There will be fireworks, but it’s safe to say there won’t be much in the way of rallies.

A Closer Look at Tiebreak Tactics

Italian translation at settesei.it

In theory, tiebreaks are a showcase for big serving, the skill that generates enough holds of serve to push a set to 6-6. But no matter how two players get there, the tiebreak itself doesn’t always work out that way.

Two examples suffice from Wednesday’s Australian Open action. Roger Federer’s second-round match against Daniel Evans opened with twelve straight service holds, threatened by only one break point. Yet in the tiebreak, which Federer won 7-5, the returner claimed 9 of 12 points. Across the grounds in front of a much smaller crowd, Thomas Fabbiano and Reilly Opelka forced a fifth-set super-tiebreak. Through 52 games and 319 points, Opelka hit 67 aces and the pair averaged 2.9 shots per “rally.” In the match-deciding tiebreak, Opelka hit no aces, Fabbiano got all but one of his serves back in play, and they averaged 5.5 shots per point.

When I started researching tiebreaks several years ago, I found that the balance of power shifts away from the server: returners win more points in tiebreaks than at other points during the set. It’s not a huge effect, accounting for about a 6% drop in server winning percentage, possibly due to the fact that players almost always give 100% on each point, unlike weak returners facing 40-0 in the middle of the set. Sure, Federer-Evans and Fabbiano-Opelka are outliers: even if servers suffer a bit in the typical tiebreak, the whole sport doesn’t usually turn upside down. Still, the effect is worth a deeper dive.

Isner isn’t the only conservative

Let’s start with some overall trends. Filtering for men’s matches from 2010-19, I found 831 tiebreaks with shot-by-shot data from the Match Charting Project. For each set that ended in a tiebreak, I tallied several stats for both tiebreak points and non-tiebreak points, calculated the single-set ratio for each stat, and then aggregated all 831 breakers to get some tour-wide numbers. Here’s what happens to stats in tiebreaks:

  • Service points won: -6.5%
  • Aces: -6.1%
  • First serve in: +1.3%
  • Returns in play: +8.5%
  • Rally length: +18.9%

(Technical note: When aggregating the ratios from all 831 tiebreaks, I weighted by the number of points in each tiebreak, but only up to a maximum of 11. Longer tiebreaks tend to be the ones in which servers are the strongest, like the 17-15 marathon in the first set of Fabbiano-Opelka. If those were weighted for their true length, we’d bias the results towards the best serving performances.)
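The capped weighting in the technical note can be sketched as follows. This is an illustration under assumptions, not the code behind the numbers above: each tiebreak is represented as a hypothetical (ratio, points) pair, where the ratio is a stat's tiebreak value divided by its non-tiebreak value in the same set, and the sample values are made up.

```python
def capped_weighted_mean(tiebreaks, max_weight=11):
    """Average per-tiebreak ratios, weighting by points played up to a cap."""
    total = 0.0
    total_weight = 0
    for ratio, n_points in tiebreaks:
        weight = min(n_points, max_weight)  # long breakers count as 11 points
        total += ratio * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no tiebreak points to aggregate")
    return total / total_weight


# Made-up ratios; the last entry mimics a 32-point (17-15) marathon
sample = [(0.90, 9), (1.00, 12), (1.10, 32)]
capped = capped_weighted_mean(sample)
uncapped = capped_weighted_mean(sample, max_weight=10**6)
print(capped < uncapped)  # True: the cap limits the marathon's pull
```

With the cap, the 32-point breaker carries the same weight as a 12-point one, so a single server-dominated marathon can't drag the tour-wide figure toward itself.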

Judging by the increase in successful first offerings, it looks like servers are a bit more conservative in tiebreaks. The large drop in aces and even bigger increase in returns in play provide additional evidence. Focused returners may be able to erase a small number of aces, but not that many, and they wouldn’t be able to convert so many into successful returns. The nearly 20% increase in rally length can be explained in part by the drop in aces (those one-shot rallies are replaced with more-shot exchanges), but the magnitude of the rally length effect suggests that players are more conservative on both sides of the ball.

More than one way

Not every player handles breakers the same way. Several men, including Federer, serve about as well as usual in these high-pressure situations. Certain others, like Rafael Nadal, appear to be more conservative, but make up for it by feasting on the toned-down offerings of opposing servers. Still others, like the impossible-to-write-about-tiebreaks-without-bringing-up Ivo Karlovic, underperform on both sides of the ball.

Here are the 20 players with the most tiebreaks recorded by the Match Charting Project since 2010. For each one, you can see how their rates of service points and return points won in tiebreaks compare to non-tiebreak situations. For instance, Jo Wilfried Tsonga wins 5.4% more service points in tiebreaks than otherwise, compared to the usual shift of 6.5% in the opposite direction. But Tsonga’s rate of return points won falls 3.4%, while the typical player increases his haul on return by 6.5%.

Player                    SPW    RPW  
Jo Wilfried Tsonga       5.4%  -3.4%  
Roger Federer            0.4%   3.2%  
Stan Wawrinka           -0.1%   4.2%  
John Isner              -0.6%   6.4%  
Novak Djokovic          -0.8%  11.8%  
Andy Murray             -2.2%   8.7%  
Alexander Zverev        -2.7%  18.7%  
Juan Martin del Potro   -3.3%   5.3%  
Nick Kyrgios            -4.1%  10.5%  
Dominic Thiem           -4.6%  12.1%  
----ATP AVERAGE----     -6.5%   6.5%  
Kevin Anderson          -7.1%   8.9%  
Gilles Simon            -8.0%  16.3%  
Tomas Berdych           -8.4%   6.8%  
Milos Raonic            -9.2%   9.1%  
Rafael Nadal            -9.4%  13.6%  
Marin Cilic            -10.2%   5.8%  
Bernard Tomic          -11.3%   4.5%  
Ivo Karlovic           -12.6%  -0.9%  
Grigor Dimitrov        -13.8%   5.1%  
Karen Khachanov        -25.1%  -5.4%

For most players, the goal appears to be to win enough extra return points to counteract the drop in service success. Nadal is the most extreme example, winning almost 10% fewer service points than usual, but doing even more damage to his opponents. Alexander Zverev is the most impressive of the bunch, dropping his serve level only a bit, while converting himself into a Rafa-like returner. As you might expect, his tiebreak record is outstanding: he won far more breakers than expected last season. We’ll see whether his eye-popping numbers persist.

A winning strategy

Ideally, I would wrap up a post like this with a recommendation. You know, analyzing the various approaches, based on these numbers, we can confidently say that players should….

It’s not that easy. It’s hard enough to identify which players are good at tiebreaks, let alone why. As I’ve written many times before, tiebreak results are closely related to overall tennis-playing skill, but not to serving prowess or excellence in the clutch. In any given season, some players amass outstanding tiebreak records, but their success one year rarely translates to the next. At various times in the past, I’ve highlighted Federer, Isner, Nadal, and Andy Murray as players who defy the odds and consistently outperform expectations in tiebreaks, but even they don’t always manage it. Isner, the poster boy for triumph via tiebreak, won slightly fewer breakers than expected in both 2016 and 2018.

Still, let’s look at these four guys in the light of the shot-by-shot data I’ve shared so far. Federer, Isner, and Murray are in the minority of players who hit more aces in tiebreaks than otherwise. However, it doesn’t necessarily mean they are much more aggressive; of the three, only Federer makes fewer first serves than usual. Isner manages to reduce the number of returns in play by 10%, compared to non-tiebreak situations, while the other two do not. Nadal breaks the mold entirely, making 6% more first serves than usual and hitting barely half as many aces.

In other words, there’s no single path to success. Federer and Isner maintain their superlative serving while taking advantage of their opponents’ nerves or conservative tactics. (I’ve previously suggested that the difference in serve points won comes from players like Isner upping their return game in pressure situations. He does, but not any more than the average player.) Nadal plays to his own strengths, forcing players into rallies from both sides of the ball. There may be some quality that ties these four men together (like focus), but we’re not going to find it here.

Mackie McDonald’s Secret Weapon

Italian translation at settesei.it

In the first round on Monday, the 23-year-old American Mackenzie McDonald defeated young Russian Andrey Rublev in four sets, 6-4 6-4 2-6 6-4. While Rublev missed part of the 2018 season due to injury and carries a ranking just inside the top 100, the victory still qualifies as a bit of an upset for McDonald, who has never come close to Rublev’s peak of No. 31.

The handful of fans who kept tabs on Court 10 were treated to an unusual display. The American relentlessly attacked Rublev’s second serve, rushing the net behind his return almost two dozen times. Many players don’t hit return approach shots that often in an entire year. What’s more, the tactic worked. Without it, the already close match would have been a coin flip.

By my count, in the log I kept for the Match Charting Project, McDonald came in behind his second serve return 22 times. Approach shot counts are never precise, because when a player hits a winner or an error, he may lean forward as if to continue toward the net, but quickly stop when he realizes it’s unnecessary. To be precise, he came in at least 22 times, and perhaps one more return winner or a couple of return errors should also be added to the total. No matter, the conclusions are similar regardless of whether the number is 22 or 24.

Rublev hit 62 second serves, but 9 of those resulted in double faults, so we’re looking at 53 playable second serves. McDonald netrushed 22 of those, winning 10. Of the other 31, he won only 11. That’s a return winning percentage of 45% on return approaches compared to 35% on other returns. Had he won all of those points at the 35% rate, it would have cost him two, perhaps three points off his overall total. He barely outscored Rublev as it was, 124 points to 118, so every little bit helped.
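The arithmetic in the paragraph above can be reproduced with a short sketch. The counts are the ones from the match log; the function name is just an illustrative choice.

```python
def approach_gain(approach_pts, approach_won, other_pts, other_won):
    """Extra points won on approaches versus winning them at the other rate."""
    other_rate = other_won / other_pts
    return approach_won - approach_pts * other_rate


playable = 62 - 9            # second serves minus double faults = 53
other_pts = playable - 22    # returns without a net rush = 31

print(f"{10 / 22:.1%} on approaches vs {11 / other_pts:.1%} otherwise")
print(f"{approach_gain(22, 10, other_pts, 11):.1f} points gained")  # 2.2
```

That gain of a little over two points is the "two, perhaps three" figure cited above.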

A rarity in context

The Match Charting Project has shot-by-shot data for nearly 2,000 men’s matches from this decade, and Monday’s four-setter was the first one of those in which a player hit at least 20 second-serve return approaches. (Dustin Brown approached at a higher rate in multiple matches, including his 2015 Wimbledon upset of Rafael Nadal.) There are only ten other matches in the database in which one player hit at least ten such approaches, and Mischa Zverev accounts for three of them. More than three-quarters of the time, the total number of second-serve return approaches is zero.

McDonald is not alone in enjoying some success with the tactic: The 1500 or so second-serve return approaches in the dataset were about 14% more effective than non-approaches in the same matches. However, it’s hard to be sure what that number is telling us, since most players approach so rarely. Some of the attacks are probably on-the-fly decisions against particularly weak serves, not pre-planned plays like many of Mackie’s netrushes on Monday.

Thus, it’s difficult to know how much success most men would have with the tactic, were they to adopt it more often. The fact that they employ it so rarely might tell us all we need to know: If more players thought that attacking the net behind the second serve return would win them more points, they’d do it. But for McDonald, it doesn’t matter what his peers do; it only matters what works for him. These 22 return approaches represented a lot more aggression than he displayed in the four previous matches we’ve charted, and it paid off.

It wasn’t enough to get him a win today against Marin Cilic, but he did outperform expectations, taking a set against the 6th seed and defending finalist. Best of all, he won more than half of Cilic’s second-serve points–a better rate than he managed against Rublev, and several ticks above 46%, the fraction that the average opponent manages against Cilic. In a sport often criticized for its uniformity of tactics, McDonald is an up-and-comer worth watching.

Watch Out For Tomas Berdych

Italian translation at settesei.it

For years, Tomas Berdych has flown beneath the radar. Even when he spent several seasons in the top ten, he rarely challenged the big four, picking up his 13 career titles against weaker competition. His quarter-final showing at last year’s Australian Open was surprising, but it was also symbolic of his entire career: a couple of nice wins followed by a straight-set loss to Roger Federer.

The rest of Berdych’s 2018 campaign went downhill from there. He won back-to-back matches only twice more (one of the pairs came in Marseille, thanks to a Damir Dzumhur retirement), lost five in a row between Miami and the French Open, and surrendered to a back injury before Wimbledon, missing the rest of the season. He turned 33 during his time away, so it would have been understandable had he struggled upon return, or even if he decided that 2019 would represent his farewell tour.

Neither appears to be the case. The Czech reached the final in his first tournament back, in Doha this month, coming within a set of ousting Roberto Bautista Agut and bagging his first title since 2016. On Monday in Melbourne, he barely broke a sweat en route to a straight-set defeat of 13th seed and defending semi-finalist Kyle Edmund. A 33-year-old returning from a back injury is unlikely to return to his career high of No. 4 in the rankings, but should he stay healthy, the top ten isn’t an impossible goal, especially among a somewhat weaker field than the one he faced in the early part of the decade. After all, we learned last week that the players who manage to stick around can improve even into their mid-30’s.

A big part of the case for a Berdych resurgence is that his abbreviated 2018 season wasn’t as bad as it looked. Yes, he lost as many matches as he won, and only one of his victories came against a top-20 player. But even without accounting for the injury that slowed him down, he was quite unlucky. Of his eleven losses, he was at least the equal of his opponent in five of them, according to Dominance Ratio (DR), the ratio between return points won and opponent return points won. That’s just bad luck: In his career through 2017, he lost 35 such matches, but won another 35 when his opponents slightly outplayed him. Flip a few of those results, and Berdych’s 11-11 record becomes at least 14-8 in those matches, and we would have seen more of him in late rounds, assuming his body allowed it.

A more precise way to pin down his 2018 performance is by using stats adjusted for competition level, which I outlined in a previous post. His adjusted DR for each season is displayed below, with age along the horizontal axis:

His adjusted DR last year–his age 33 season–was 1.22, his best single-year mark since 2012, when he finished 6th in the year-end rankings. With only 22 matches in the books, we could be looking at a fluky result due to the limited sample, but on the other hand, a healthier Berdych should be even better. A stronger back should be able to cancel out the effect of a few bounces failing to go his way.

And based on some very early results, “stronger” is exactly the word for it. In his five matches at the Australian Open last year, his average first serve speed fluctuated between 191 and 198 km/h (119 to 123 mph), including a first-round mean of 195 km/h. On Monday against Edmund, he averaged 201 km/h (125 mph). His fastest serve of the 2018 Australian was 212 km/h (132 mph) in the third round; he peaked at 211 km/h yesterday. His 2018 overall rate of serve points won was his lowest since 2009, meaning that his solid overall numbers were thanks to superior returning. If he comes back serving better than he did last year, it’s another positive sign.

The rest of this week offers a good test of Berdych’s form. On Wednesday he’ll face Robin Haase, an opponent that a would-be top-tenner should dispatch easily. The third round may involve a clash with Diego Schwartzman, a matchup that slightly favors the Czech on a hard court, but will force him to work harder than the Edmund match did. Should he reach the second week, his probable fourth-round foe would be Rafael Nadal. He would enter that match with extremely low expectations, but hey, that’s no different than the many times that they faced off in the past. And there’s always hope: Rafa has won 18 of their last 19 meetings, but the sole loss came almost exactly four years ago, at the Australian Open.

What I Should’ve Known About Playing Styles and Upsets

In the podcast Carl Bialik and I recorded yesterday, I mentioned a pet theory I’ve had for a while, that upsets are more likely in matches between players with contrasting styles. The logic is fairly simple. If you have two counterpunchers going at it, the better counterpuncher will probably win. If two big servers face off, the better big server should have no problem. But if a big server plays a counterpuncher … then, all bets are off.

We’ve seen Rafael Nadal struggle against the likes of John Isner and Dustin Brown, and we’ve seen big servers neutralized by their opposites, as in Marin Cilic’s 1-6 record against Gilles Simon. There are upsets when similar styles clash, as well, but as untested theories go, this one is appealing and not obviously flawed.

Then, to kick off the 2019 Australian Open, Reilly Opelka knocked out Isner. Playing styles don’t come much more evenly matched, and the veteran was the heavy favorite. It was a perfect example of the kind of match I would expect to follow the script, yet the underdog came out on top. They played four tiebreaks and there were only two breaks of serve, but Opelka didn’t even need the Australian Open’s new fifth-set 10-point tiebreak. While it’s just one match, of course, it suggested that I ought to look more closely at my assumptions.

After a couple of hours playing with data this afternoon, my theory is no longer untested … and it turned out to be flawed. Fortunately, it isn’t just another negative result. Playing style is related to upset likelihood, but not in the way I predicted.

Measuring predictability

Let me explain how I tested the idea, and we’ll work our way to the results. First, I used Match Charting Project data to calculate aggression score for every ATP player with at least 10 charted matches since 2010. Aggression score is, essentially, the percentage of shots that end the point (by winner, unforced error, or inducing a forced error), and will serve as our proxy for playing style. That gives us a group of 106 players, from the conservative Simon and Yoshihito Nishioka with aggression scores around 13%, to the freewheeling Brown and Ivo Karlovic, with scores nearing 30%. I divided those 106 players into quartiles (by number of matches, not number of players, so each quartile contains between 21 and 31 players) so we could see how each general playing style fares against the others. Here are the groups:

(Aggression score conflates two things: big serving/big hitting and tactical aggression. Isner is sometimes not particularly aggressive, but because of his size and serve skill, he is able to end points so frequently that, statistically, he appears to be extremely aggressive. Accordingly, I’ll refer to “big servers” and “aggressive players” interchangeably, even though in reality, there are plenty of differences between the two groups.)
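
To make the metric concrete, here is a minimal Python sketch of aggression score as described above: the share of a player's shots that end the point. The function name and the shot counts are hypothetical, chosen only to land near the two ends of the range mentioned in the text. The actual Match Charting Project calculation may handle details (such as which shots count as opportunities) differently.

```python
def aggression_score(winners, unforced_errors, induced_forced_errors, total_shots):
    """Share of a player's shots that end the point, one way or another."""
    point_ending = winners + unforced_errors + induced_forced_errors
    return point_ending / total_shots


# Hypothetical shot counts for two players (not real charting data).
counterpuncher = aggression_score(40, 35, 25, 770)
big_hitter = aggression_score(90, 80, 50, 750)

print(f"counterpuncher: {counterpuncher:.1%}")
print(f"big hitter:     {big_hitter:.1%}")
```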

Limiting our view to these 106 men, I found just over 11,000 matches to evaluate and divided them into groups based on which quartiles the two players fell into. Each of the ten possible subsets of matches, like Q1 vs Q2, or Q4 vs Q4, contains at least 400 examples.

For every match, I used surface-adjusted Elo ratings to determine the likelihood that the favorite would win. That gives us pre-match odds that aren’t quite as accurate as what sportsbooks might offer, though they’re close.
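
For readers unfamiliar with Elo, the sketch below shows how a pair of ratings turns into a pre-match probability. It uses the standard Elo expectation formula with the conventional 400-point scale. The surface adjustment mentioned above is not shown, and the ratings are made-up numbers, so treat this as an assumption-laden illustration rather than the exact model behind these results.

```python
def elo_win_probability(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def favorite_odds(rating_a, rating_b):
    """Win probability of whichever player the ratings favor."""
    p = elo_win_probability(rating_a, rating_b)
    return max(p, 1 - p)


# Hypothetical ratings: a 100-point gap makes the higher-rated player
# roughly a 64% favorite.
print(f"{favorite_odds(1900, 2000):.1%}")
```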

Those pre-match odds are key to determining whether certain groups are more predictable than others. If there are 100 matches in which the favorite is given a 60% chance of winning, and the favorites win 70 of them, we’d say that the results were more predictable than expected. If the favorites win only 50, the results were less predictable.
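
The comparison in that example can be sketched in a few lines of Python: divide the favorites' actual win rate by their average pre-match probability. The function name is an illustrative choice, and the inputs are the hypothetical 100-match scenarios from the paragraph above.

```python
def predictability_ratio(favorite_probs, favorite_won):
    """Actual favorite win rate divided by the average pre-match favorite probability."""
    expected = sum(favorite_probs) / len(favorite_probs)
    actual = sum(favorite_won) / len(favorite_won)
    return actual / expected


probs = [0.60] * 100  # 100 matches, each with a 60% favorite

more_predictable = predictability_ratio(probs, [True] * 70 + [False] * 30)
less_predictable = predictability_ratio(probs, [True] * 50 + [False] * 50)

print(f"favorites win 70: ratio {more_predictable:.2f}")  # above 1.0
print(f"favorites win 50: ratio {less_predictable:.2f}")  # below 1.0
```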

Goodbye, pet theory

For the matches in each of the ten quartile-vs-quartile subsets, I calculated the average favorite’s chance of winning (“Fave Odds”), then compared that to the frequency with which the favorites went on to win (“Fave Win%”). The table below shows the results, along with the relationship between those two numbers (“Ratio”). A ratio of 1.0 means that matches within the subset are exactly as predictable as expected; higher ratios mean that the favorites were even better bets than the odds gave them credit for, and lower ratios indicate more upsets than expected.

Matchup    Matches  Fave Odds  Fave Win%  Ratio
Q1 vs Q1       412      71.1%      75.2%   1.06
Q1 vs Q2      1072      69.5%      70.6%   1.02
Q1 vs Q3      1382      69.7%      68.6%   0.98
Q1 vs Q4      1187      69.7%      70.0%   1.00
Q2 vs Q2       612      70.2%      69.9%   1.00
Q2 vs Q3      1616      68.8%      67.8%   0.99
Q2 vs Q4      1434      68.8%      67.4%   0.98
Q3 vs Q3       886      66.7%      60.3%   0.90
Q3 vs Q4      1685      67.3%      66.8%   0.99
Q4 vs Q4       791      67.1%      61.2%   0.91

There’s a striking finding here: The largest ratio, marking the most predictable bucket of matches, is for the most conservative pairs of players, while the two smallest ratios, pointing to the most frequent upsets, are for pairs of aggressive players (Q3 vs Q3 and Q4 vs Q4).

Before analyzing the relationship, let’s check one more thing. The very best players aren’t evenly divided throughout the quartiles, since Q1 has two of the big four. Elo-based match predictions–one of the building blocks of these results–are tougher to get right for the best players and the most uneven matchups, so we need to be careful whenever the elites might be influencing our findings. Therefore, let’s look at the same numbers, but this time for only those matches in which the favorite has a 50% to 70% chance of winning. This way, we exclude many of the best players’ matchups and all of their more lopsided contests:

Matchup    Matches  Fave Odds  Fave Win%  Ratio
Q1 vs Q1       196      59.5%      62.8%   1.05
Q1 vs Q2       604      59.8%      60.6%   1.01
Q1 vs Q3       731      59.7%      58.1%   0.97
Q1 vs Q4       663      59.9%      60.6%   1.01
Q2 vs Q2       322      59.0%      54.7%   0.93
Q2 vs Q3       931      59.8%      59.8%   1.00
Q2 vs Q4       822      59.7%      57.2%   0.96
Q3 vs Q3       544      59.5%      55.0%   0.92
Q3 vs Q4      1024      59.5%      58.2%   0.98
Q4 vs Q4       493      59.3%      55.0%   0.93

We discard about 40% of our sample, but the predictability trend remains generally the same. In both the overall sample and the narrower 50%- to 70%-favorite subset, the strongest relationship I could find was between the predictability ratio and the quartile of the less aggressive player. In other words, a counterpuncher is likely to have more predictable results–regardless of whether he faces a big server, a fellow counterpuncher, or anyone in between–than a more aggressive player.
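
One rough way to see that relationship is to pool the full-sample rows by the quartile of the less aggressive player and recompute the ratio for each pool. The Python sketch below does so from the rounded figures published in the first table, so its output is approximate. The pooling method and function name are assumptions made for illustration, not necessarily how the relationship was originally measured.

```python
from collections import defaultdict

# (less aggressive quartile, more aggressive quartile, matches, fave odds, fave win rate)
# Figures are the rounded values from the full-sample table.
ROWS = [
    (1, 1, 412, 0.711, 0.752),
    (1, 2, 1072, 0.695, 0.706),
    (1, 3, 1382, 0.697, 0.686),
    (1, 4, 1187, 0.697, 0.700),
    (2, 2, 612, 0.702, 0.699),
    (2, 3, 1616, 0.688, 0.678),
    (2, 4, 1434, 0.688, 0.674),
    (3, 3, 886, 0.667, 0.603),
    (3, 4, 1685, 0.673, 0.668),
    (4, 4, 791, 0.671, 0.612),
]


def ratio_by_less_aggressive_quartile(rows):
    """Pool rows by the less aggressive player's quartile; return actual/expected favorite wins."""
    expected = defaultdict(float)
    actual = defaultdict(float)
    for low_q, _high_q, matches, odds, win_rate in rows:
        expected[low_q] += matches * odds      # favorite wins implied by the odds
        actual[low_q] += matches * win_rate    # favorite wins that actually happened
    return {q: actual[q] / expected[q] for q in sorted(expected)}


ratios = ratio_by_less_aggressive_quartile(ROWS)
for quartile, ratio in ratios.items():
    print(f"Less aggressive player in Q{quartile}: ratio {ratio:.3f}")
```

Pooled this way, the ratio falls steadily from Q1 to Q4, in line with the trend described above.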

Back to basics

My initial theory is clearly wrong. I expected to find that Q1 vs Q1 matches were more predictable than average, and I was right. But by my logic, I also guessed that Q4 vs Q4 matches would go according to script, and that other pairings, like Q1 vs Q4, would be more upset-prone. I would have done better had I let the neighbor’s cat make my predictions for me.

Instead, we find that matches with more aggressive players are more likely to result in surprises. That doesn’t sound so groundbreaking, and it’s something I should’ve seen coming. Big servers tend to hold serve more often and break serve less frequently, meaning that their matches end with narrower margins, opening the door for luck to play a larger role, especially when sets and matches are determined by tiebreaks.
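
To get a feel for how quickly holds pile up as serving improves, here is a small Python sketch of the textbook game-win formula. It assumes every point on serve is won independently with the same probability, a simplification that is not part of the analysis above. The serve-point percentages in the loop are illustrative values, not measured figures for any player.

```python
def hold_probability(p):
    """Chance the server wins a game if he wins each point with probability p."""
    q = 1 - p
    # Win to love, 15, or 30: the server takes the last point plus three of the earlier ones.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce at three points each, then get two points clear before the returner does.
    from_deuce = 20 * p**3 * q**3 * (p**2 / (1 - 2 * p * q))
    return before_deuce + from_deuce


for p in (0.62, 0.66, 0.70):
    print(f"serve points won {p:.0%} -> hold {hold_probability(p):.1%}")
```

Under this simple model, a few extra percentage points on serve translate into a much larger jump in hold rate, which is why two big servers so often end up in tiebreaks.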

After all this, you might be thinking that I’ve squandered my afternoon, plus another few minutes of your attention, arriving at something obvious and unremarkable. I agree that it’s not that exciting to proclaim that big servers are more influenced by luck. But there’s still a useful–even surprising–discovery buried here.

Exponential upset potential

We know that the most one-dimensional players are more subject than others to the ups and downs of luck, thanks to the narrow margins of tiebreaks. For a man who rarely breaks serve, no match is a guaranteed win; for a man who rarely gets broken, no opponent is impossible to beat. However, I would have expected that the unpredictability of big servers was already incorporated into our match predictions, via the Elo ratings of the big servers. If a player has unusually random results, we’d expect his rating to drift toward tour average. That’s one reason that it’s very difficult for poor returners to reach the very top of the rankings.

But apparently, that isn’t quite right. The randomness-driven Elo ratings of our big servers do a nearly perfect job of predicting match outcomes against counterpunchers, and they’re only a little bit too confident against the more middle-of-the-road players in Q2 and Q3. Against each other, though, upsets run rampant. That extremely volatile fraction of results–the tiebreak-packed outcomes when the biggest servers face off–only accounts for part of these players’ ratings.

We’re accustomed to getting unpredictable results from the most aggressive players, with their big serves, inconsistent returns, and short rallies. Today’s findings give us a better idea of when these do and do not occur. Against counterpunchers, things aren’t so unpredictable after all. But when big servers play each other, we expect the unexpected–and the results are even more unpredictable than that.

Podcast Episode 44: Murray’s Retirement and an Australian Open Preview

Episode 44 of the Tennis Abstract Podcast, with Carl Bialik of the Thirty Love podcast, starts with some reflection on the outstanding career and premature retirement of Andy Murray. We spend some time talking about the surprising ATP aging curves I wrote about a few days ago, then delve into the 2019 Australian Open.

We assess the dangers awaiting the coachless Simona Halep, including a potential fourth-round meeting with Serena Williams, as well as the long list of women with a plausible chance to lift the trophy two weeks from now. We agree on the predictable pick to win the men’s title, but note that Novak Djokovic’s last few months carry a few warning signs.

Thanks for listening!

(Note: this week’s episode is about 67 minutes long; in some browsers the audio player may display a different length. Sorry about that!)
