Facts, Figures and Myths About Walkovers

Novak Djokovic advanced to the final of the Miami Masters today when Kei Nishikori withdrew from the event due to injury. Oddly, it was the second match at the Sony Open that Djokovic didn’t have to play, as Florian Mayer pulled out before their scheduled third round match.

It’s a rare occurrence in professional tennis–so rare that it had only happened once since 1968, when several players benefited from multiple walkovers at the French Open. In Miami two years ago, Andy Murray also skipped his third round and semifinal matches, as both Milos Raonic and Rafael Nadal dropped out due to injury.

The fact that it was Djokovic who got the free pass immediately gave rise to all sorts of speculation. Will the lack of match play hurt the Serbian? Does Novak get more walkovers than most? Are opponents more likely to withdraw if they’re facing a top player?

Let’s take these questions in order. I addressed a similar issue a couple of years ago in this post. Walkovers are rare, but the available evidence suggests that there’s no positive or negative effect from winning via withdrawal. A player’s chances of winning his next match are roughly what they would’ve been anyway.

Djokovic does gain from walkovers more often than the average player, but he’s far from the top of the list. Opponents have withdrawn five times in his 695 matches, good for 0.7%, roughly the same rate as opponents of Murray, Nadal, Roger Federer … and Donald Young and Dmitry Tursunov. Jo-Wilfried Tsonga has benefited from six walkovers in 432 matches, a 1.4% rate, highest among tour veterans.

Top players win by walkover more often than others–but as we’ll see in a moment, it isn’t because they are top players. It’s intuitive to figure that mildly injured players are more likely to take the court if they think they have a better chance of winning, but the evidence suggests there’s little, if any, effect.

Men ranked in the top five win by walkover 0.6% of the time, while those in the next five get free passes 0.3% of the time, and most of the rest of the pack benefits at the tour average rate of 0.2%–once every 500 matches. (All of these aggregate rates are based on tour-level matches from 1991 through 2014 Indian Wells.)

For the most part, top players get walkovers because they hang around until the late rounds of tournaments. Walkovers occur at the highest rate in the quarterfinals of events, when 1.1% of matches end before they begin. Round-of-16 contests are almost as bad, at 1.0%, and semifinals are also considerably more walkover-prone than average, at 0.6%.

When we take these dangerous middle rounds out of the equation, the number of walkovers shrinks, as does the difference between top players and the rest of the pack. Less than 0.15% of pre-R16 matches end in walkover, and the rate at which top-five players benefit from them falls to 0.4%. That’s still more frequent than the rate for the rest of the field, but keep in mind the tiny numbers we’re dealing with here. It’s 13 walkovers in over 3000 matches. Take away five of those withdrawals–roughly two per decade–and the top five would benefit at the same rate as players ranked 16-20.
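The arithmetic behind that last point is easy to reproduce. The sketch below assumes exactly 3,000 matches, since the text only says “over 3000”; the function and variable names are purely illustrative.

```python
def walkover_rate(walkovers: int, matches: int) -> float:
    """Share of matches won by walkover, expressed as a percentage."""
    return 100.0 * walkovers / matches

# Assumption: "over 3000" pre-R16 matches is rounded down to 3000 here.
TOP_FIVE_MATCHES = 3000

rate_all = walkover_rate(13, TOP_FIVE_MATCHES)            # all 13 walkovers
rate_minus_five = walkover_rate(13 - 5, TOP_FIVE_MATCHES)  # five withdrawals removed

print(f"{rate_all:.2f}% with all 13, {rate_minus_five:.2f}% without five of them")
```

With so few events in the numerator, a handful of withdrawals moves the rate a long way, which is why the gap between the top five and everyone else deserves a skeptical look.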

It’s not as interesting a narrative, but it appears that players usually withdraw when they are too injured to compete, and that’s most likely to happen midway through a tournament. The highest-ranked players benefit–because of their previous success on the court, not their intimidating influence off of it.

Disorder of Play

Imagine you’re a rabid Chicago Cubs fan (sorry), and you’re looking forward to the season starting in a couple of weeks. You’re thinking of making a road trip to see your favorite team. You go to the Cubs website, and all you can find are some vague references to a big series in St. Louis in May. Nothing more.

You check out MLB.com and find a story about the matchup between the Cubs and White Sox, but it’s mostly about last year. Finally you start checking the websites for other MLB stadiums, and you discover that the Cubs are scheduled to play in Milwaukee for three days in June. You consider checking another couple dozen sites and finally give up.

Baseball fans know just how ridiculous that is–you can find a Cubs schedule in any of hundreds of places, with clickable links to every other MLB team’s slate for the season. You can see a list of every Opening Day matchup or, if you want, every game scheduled for the 5th of September.

Yet this fictional scenario of fruitless schedule-hunting is exactly what tennis fans face every week of the season. It’s easy to find out where tournaments will be held, but often impossible–and always irritating–to establish who will be playing. If you want to know what the next few weeks look like for your favorite player (especially if your favorite player isn’t named Roger, Rafa, Novak, or Andy), good luck. Patience is a virtue, I guess.

Unlisted lists

Players formally commit to tour events several weeks ahead of time. Each tournament has an entry deadline (top-tier events are six weeks in advance, Challengers three weeks), and once entries are in, we have what is called–you guessed it–an entry list. You can see the list for the ATP Houston event here, since the tournament organizers chose to publish it. Not all events do.

And even when they do, they rarely keep them up to date. Throughout the several weeks between the initial list and the beginning of qualifying rounds, players withdraw and alternates enter the mix. Especially at the 250 level, it’s not uncommon for 10 or more alternates to find their way into the main draw.  But with an old list (if there is a list at all), how to know whether Tim Smyczek or Dominic Thiem or Dudi Sela or Somdev Devvarman is going to be there?

Making matters worse, Wild Card entries–players who are chosen in part to increase fan interest at an event–are often published elsewhere, for instance in a press release. 18 days from the opening of the tournament, Houston hasn’t said anything about who any of those players will be. (Though if I were a betting man, here’s where I’d put my money.)

Usually, if you’re willing to put in some effort and you want to know a specific fact–Is Bernard Tomic going to play Monte Carlo? Is Tommy Robredo going to defend his title in Casablanca?–you can find it. But is that really the best the ATP can do? Again, think of the scenario in which it takes a super sleuth to find out where the Cubs will be playing in a month.

Help us become bigger fans

Sporting organizations thrive on big fans, the ones who travel to events (paying for lots of tickets), pony up for year-long subscriptions to streaming services, and stock up on branded merchandise. These fans want to know what’s going on all the time, and they care about more than just the two players who might appear on the front page of the newspaper.

It would be so simple to make available an actual schedule, like other sports started doing back in the 19th century. In fact, before the ATP password-protected their entry lists, I did just that. Here’s a simple page that shows everyone who was on an entry list in a six-week period, along with links to the lists for each event.

That information is out there. It’s an insult to fans to hide it. We want to get excited about our favorite players–both the ones who are guaranteed a seed and the ones who are holding out hope of a spot in the main draw. We deserve better.

Uncontrolled Aggression

Italian translation at settesei.it

Listen to tennis commentary–or a broadcast of any sport, really–and wait for the first mention of “consistency.” You won’t have to wait for long.

“Consistent” is good, and “inconsistent” is bad. Or so we’re told. At first blush, it makes sense. Consistency is a good thing when it comes to following through on your forehand or brushing your teeth every day. But unless you’re the very best player in the world, consistency doesn’t win you Grand Slam titles.

Think of it this way: Every player has an “average” level they are capable of playing. If average Rafael Nadal plays average anybody else on clay, average Nadal wins. If average Richard Gasquet plays average anybody-outside-the-top-fifty, average Gasquet wins. These situations, for the likes of Nadal and Gasquet, are when consistency is actually a good thing. Sure, Rafa might be able to raise his game to previously unheard-of heights, but what’s the point? It’s a matter of winning 6-1 6-0 instead of 6-3 6-2. Nadal’s main concern is avoiding an off day.

Consider the same example from the perspective of Rafa’s opponent. If you’re Tomas Berdych and you play at your usual level against Nadal, you’ll lose. That’s what consistency gets you: thirteen straight losses.

Uncontrolled aggression

Very aggressive players tend to get a bad rap. The guys who always go for their shots–think Lukas Rosol or Nikolay Davydenko–rack up huge winner and unforced error counts. Sometimes it works and often it doesn’t. When it doesn’t, the conventional wisdom always seems to be that these players need to rein in their aggression. They need to be more consistent.

But they don’t. If Rosol stopped unleashing huge shots in every direction, he’d make fewer unforced errors, but he’d hit far fewer winners. He might still hover around #50 in the world, but more likely, he’d still be lurking in the Challenger ranks, looking for the breakthrough that such a passive style might never earn for him. As it is, Rosol’s go-for-broke approach got him that career-defining upset over Nadal, not to mention an ATP title in Bucharest last spring, when he beat three higher-ranked players.

Rather than the pundit’s favored phrase of “controlled aggression,” players score big upsets and major breakthroughs with uncontrolled aggression. (It only looks controlled because it’s working that day.) If you rein in an aggressive player, he may win more of the matches he’s supposed to win, but he’s much less likely to score an upset.

The balance myth

The game of tennis has so much variety–surfaces, climates, playing styles–and so much alternation–deuce/ad, serve/return–that pundits are constantly endorsing balance. Andy Murray needs to get better on clay, they say. Jerzy Janowicz needs to improve his return game. Monica Niculescu needs to learn how to hit a forehand.

It’s a tempting argument to make, because the best players in the game do have that balance. Nadal and Djokovic and Serena and Li have a wide variety of devastating shots and tactics that are effective on every surface. If you want to play like them and reap the same rewards, you need to have that same balance.

Except that, for the vast majority of players–even top-tenners–that just isn’t going to happen. I don’t care if David Ferrer hires a coaching team of Pete Sampras and Mark Philippoussis, he’ll never be much more effective on serve. John Isner could work all offseason with Andre Agassi and remain among the game’s weakest returners.

What’s keeping these players from climbing any higher in the rankings isn’t the fact that they aren’t more balanced. It’s the simple fact that they aren’t better. By definition, most people will never be a once-in-a-generation talent.

Most players are not balanced. And that’s fine. Rather than chasing the impossible dream of out-Novaking Novak, they need to take more risks to outplay their betters in one or two areas. When it doesn’t work, it doesn’t matter–they would’ve lost anyway.

The cluster principle

Tennis rewards the streaky. If you only win four return points in a set, it’s much better to win them consecutively than to spread them out. It’s better to win five matches in one week and go winless for the next four weeks than win one match per week.

Whether it’s points, games, sets, matches, or even titles, it’s better to cluster your triumphs.
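A toy example makes the point about return points concrete. The sketch below scores service games under standard rules (first to four points, win by two); the point sequences are invented for illustration, with 'S' meaning the server won the point and 'R' meaning the returner did.

```python
def server_holds(points: str) -> bool:
    """Score one service game from a string of 'S'/'R' point winners."""
    server = returner = 0
    for winner in points:
        if winner == "S":
            server += 1
        else:
            returner += 1
        # A game ends at four or more points with a two-point margin.
        if server >= 4 and server - returner >= 2:
            return True
        if returner >= 4 and returner - server >= 2:
            return False
    raise ValueError("point sequence does not finish the game")

def count_breaks(games: list[str]) -> int:
    """Number of service games in which the returner broke serve."""
    return sum(not server_holds(game) for game in games)

# Hypothetical return games: the returner wins four points either way.
clustered = ["RRRR", "SSSS", "SSSS", "SSSS"]    # all four in one game
spread = ["RSSSS", "SRSSS", "SSRSS", "SSSRS"]   # one per game

print(count_breaks(clustered), count_breaks(spread))
```

Both sequences contain the same four return points won, but only the clustered version produces a break of serve.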

If you strive for a balanced game, the best players simply won’t let you go on a streak. Fabio Fognini or Sabine Lisicki might give you a few gifts, but Nadal never will. The only way to cluster your victories over Rafa is to play such aggressive tennis that even he can’t neutralize it. It usually won’t work, but for most players, it’s their only hope. There’s a reason the hyper-aggressive Davydenko is the only active player with a winning record against him.

Stan’s untold narrative

Stanislas Wawrinka probably wouldn’t have beaten a healthy Nadal over five sets on Sunday. But he was winning when Rafa’s back acted up, and he did so by unleashing every weapon in his arsenal.

Whatever the rankings say this week, Wawrinka isn’t one of the best three tennis players in the world. At least “average Stan” isn’t. But that’s the whole point. Tennis doesn’t reward players with ranking points and prize money for consistency. Consistency got Berdych into the top ten and has kept him there for so long … but it has prevented him from spending much time in the top five.

Wawrinka won’t always beat Nadal or Djokovic, and he’ll continue to suffer his share of defeats at the hands of the players ranked below him. The high-risk style of play that earned him a place in the history books won’t always pay off. That’s all part of the package. Stan didn’t get this far by being consistent.

A Glimmer of Hope for Stan Wawrinka

Stanislas Wawrinka has played 26 sets of tennis against Rafael Nadal, and lost them all. That doesn’t bode well for Stan’s chances in his first Grand Slam final.

As Novak Djokovic can tell you, though, Wawrinka has improved. He has long been a threat to top-ten players, and even before beating Djokovic in the quarterfinals, he had taken the Serb to five sets twice in twelve months.

One of the hidden signs of Stan’s rise comes from his last match with Rafa, at the London Tour Finals last November. Wawrinka lost that match in straight sets–as he has done, of course, every time he’s played Nadal–but it was the tightest match they’d played in four years, going to a pair of tiebreaks.

If we look beyond the scoreline, last fall’s contest was even closer than the pair’s previous two-tiebreak match at the 2009 Miami Masters. This time, Wawrinka won more points than Nadal did–83 to 80, good for 51% of the total. While it isn’t unheard of for the player who wins more points to lose the match, the player who wins more points does end up triumphant in more than 95% of tour-level matches. In their eleven previous meetings, Stan had never won as many as 48% of the total points played.

The quality of Wawrinka’s performance is even more striking when we turn to Dominance Ratio (DR), the ratio of the winner’s rate of return points won to the loser’s rate of return points won. In 93.5% of matches, the winner of the match is the man who won the higher rate of return points. By expressing this as a ratio, we can get an idea of the winning player’s dominance. 1.0 is a dead heat, and the higher the number, the more dominant the winning player.

In the match last November, Nadal’s DR was 0.86. Rafa won 31.1% of return points while Stan won 36.0%. If you look at 100 straight-sets matches with those stats, you’ll rarely find even one in which the 31.1% RPW player comes out the winner.
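For anyone who wants to check the number, here is a minimal sketch of the DR calculation as described above, using the return-point rates from that match. The function name is my own shorthand, not an official stat definition.

```python
def dominance_ratio(winner_rpw: float, loser_rpw: float) -> float:
    """Winner's rate of return points won divided by the loser's rate."""
    return winner_rpw / loser_rpw

# Nadal won the match with 31.1% of return points; Wawrinka won 36.0%.
nadal_dr = dominance_ratio(31.1, 36.0)
print(f"{nadal_dr:.2f}")
```

Anything below 1.0 means the match winner was outplayed on return, which is what makes this straight-sets result so unusual.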

In fact, since 1991, there have been fewer than 150 matches in which a player had a DR less than or equal to 0.9 and still won in straight sets. (Matches that go the distance more commonly have this sort of profile, when the winner takes two [or three] tight sets but loses a blowout set, with a score like 7-6 1-6 7-6.)  Only about 50 of these were more extreme than the Nadal-Wawrinka match.

Based on the evidence of this last matchup, we can conclude that Wawrinka has the skills to challenge Nadal. Yet despite coming much closer than in any of their eleven previous meetings–Rafa’s lowest DR in any of them was 1.13–the Swiss didn’t win. Why not?

Let’s recognize the core issue: Stan may have won more points, but he won them at the wrong times. (Or, he didn’t win quite enough of them at the right times.) He held serve more convincingly than his opponent did but didn’t play as well in the tiebreaks. Any explanation has to address this “wrong time” issue. Here are a few:

  1. Nadal raised his game in the important moments. There’s some evidence for this–he outperforms expectations in tiebreaks, and he also wins more break points than non-break points. Some of the break point advantage comes from being left-handed (and taking proper advantage of it), though his break point advantage seems to be even bigger than his lefty advantage.
  2. Wawrinka faltered in the important moments. From the stat sheet of a single match, it would be tough to distinguish this from the first explanation. But perhaps he was overwhelmed by the opportunity he had generated for himself.
  3. Luck. Randomness in tennis isn’t limited to net cords, bad calls, and mishits. If you put two tennis-playing robots out on the court and had them play five consecutive matches, the result wouldn’t be the same every time. Wawrinka misses shots sometimes, and according to the stat sheets (though I’ve never seen it myself), Nadal does too. Just because one of those errors comes at a key moment doesn’t mean the man who committed it is a mental midget.

As much as we like to assign narratives to every possible nook and cranny of a tennis match, I suspect the truth of the matter is a hefty dose of #3 with a bit of #1 thrown in. When the outcome of a match comes down to two seven-point tiebreaks, it’s anybody’s game. It just wasn’t Stan’s that day.
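The tennis-playing-robots idea can be sketched as a small simulation. This assumes every point is independent and that each player wins a fixed share of his service points, neither of which is strictly true of real tennis. The serve rates are taken from the November match (Stan won 100 − 31.1 = 68.9% of his service points, Nadal 100 − 36.0 = 64.0%); the function names and the trial count are arbitrary choices.

```python
import random

def a_wins_tiebreak(p_a_serve: float, p_b_serve: float, rng: random.Random) -> bool:
    """Play one tiebreak: first to 7 points, win by 2. Player A serves first."""
    a = b = 0
    point = 0
    while not (max(a, b) >= 7 and abs(a - b) >= 2):
        # Serve order is A, B, B, A, A, B, B, ...
        a_serving = ((point + 1) // 2) % 2 == 0
        if a_serving:
            a_takes_point = rng.random() < p_a_serve
        else:
            a_takes_point = rng.random() >= p_b_serve
        if a_takes_point:
            a += 1
        else:
            b += 1
        point += 1
    return a > b

def tiebreak_win_rate(p_a_serve: float, p_b_serve: float,
                      trials: int = 20000, seed: int = 0) -> float:
    """Share of simulated tiebreaks won by player A."""
    rng = random.Random(seed)
    wins = sum(a_wins_tiebreak(p_a_serve, p_b_serve, rng) for _ in range(trials))
    return wins / trials

identical_robots = tiebreak_win_rate(0.65, 0.65)   # two equal players
stan_vs_rafa = tiebreak_win_rate(0.689, 0.640)     # rates from the November match

print(f"identical robots: {identical_robots:.3f}")
print(f"Stan's share of tiebreaks: {stan_vs_rafa:.3f}")
```

Under these assumptions, the player with the better serve numbers is favored in any one tiebreak but is far from a lock, so dropping two in a row needs no story about nerves.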

If I’m right, there just might be hope for Wawrinka today. In his last two sets against Nadal, he held his own, which is more than just about everybody else on tour can say for themselves.

Unfortunately for Stan, one meeting doesn’t outweigh eleven, and a bit of momentum won’t erase Rafa’s well-earned status as the world #1. Perhaps worst of all, Wawrinka has proven himself Nadal’s almost-equal in two sets. Today, he’ll have to win three.

Bouchard, Radwanska, and Second Serve Futility

In yesterday’s women’s semifinals, we were treated to some impressive return-of-serve performances. Li Na won almost 65% of points on Eugenie Bouchard’s serve–a higher percentage than she won on her own.

A less positive view of the situation is that we saw some dreadful serving performances. In particular, both Bouchard and Agnieszka Radwanska struggled to win any points at all on their second serves. Genie won just 5 of 27 after missing her first serve, while Aga won only 2 of 16.

You don’t need an IBM Key to the Match to realize that those numbers aren’t going to cut it.

The WTA features a more return-oriented game and more breaks of serve than the ATP does, but these numbers are far out of the ordinary, especially for a solid server such as Bouchard. Here are some circuit-wide averages, derived from about 1,000 tour-level matches played last season:

  • WTA players win 55.5% of service points: 62.3% on first serves and 44.6% on second serves.
  • When the second serve lands in play–in other words, excluding double faults–players win 51.8% of second-serve points.
  • In the average losing performance, players won 57.1% of first-serve points, 40.0% of second-serve points, and 47.2% of second-serve points in play.
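The gap between the two second-serve figures is explained entirely by double faults, so the pair of numbers implies how often second serves land in. A minimal sketch of that back-calculation follows; the function name is made up for illustration, and the double-fault rate is inferred from the averages above, not separately measured.

```python
def second_serves_in_play(pct_won_all: float, pct_won_in_play: float) -> float:
    """Implied percentage of second serves that land in play.

    Points won on all second serves = (share in play) x (points won when in play),
    so the share in play is the ratio of the two win rates.
    """
    return 100.0 * pct_won_all / pct_won_in_play

tour_in_play = second_serves_in_play(44.6, 51.8)     # tour-wide averages
losers_in_play = second_serves_in_play(40.0, 47.2)   # average losing performance

print(f"tour: {tour_in_play:.1f}% in, {100 - tour_in_play:.1f}% double faults")
print(f"losers: {losers_in_play:.1f}% in, {100 - losers_in_play:.1f}% double faults")
```

By this reckoning roughly one second serve in seven is a double fault across the tour, and slightly more in losing efforts.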

Then again, Li and Dominika Cibulkova–especially the Slovakian–aren’t average returners. In 16 Cibulkova wins for which I have serve statistics, she never failed to win at least half of second-serve return points. Only once did she win less than 58% of them, and her median performance was a whopping 63% of second-serve points won. In 7 of the 16 matches, she won second-serve return points at a higher rate than her own first-serve points.

Domi’s dismantling of Radwanska’s second serve still stands out, but in this context, it doesn’t look quite so unusual. When Cibulkova is hitting the ball well, you might as well be throwing batting practice once you miss your first offering.

While Li’s best return performances don’t quite stack up with Cibulkova’s, she has little trouble neutralizing her opponents in Melbourne. In six matches, she has won more than half of second-serve return points in every match, peaking with a 12-of-15 performance in the fourth round against Ekaterina Makarova. Overall, Li has won 86 of 136 second-serve return points in the tournament, good for 63%.

On Saturday, one of these powerful forces will have to give way to the other. The last time Li and Cibulkova met, in Toronto last summer, Domi had one of her worst serving performances of the year, winning only 35.5% of second-serve points, 44.0% of those that landed in play. In that match, Cibulkova failed to display the dominating return game that has been her trademark in Australia, winning barely half of Li’s second offerings, and only 41% when excluding double faults.

But as Cibulkova showed by crushing Radwanska for only the second time in six career meetings, her performances aren’t predictable. Her all-or-nothing style guarantees that we’ll see some fireworks in the final from both servers and returners. And at the rate she’s going, Domi might set some more records in the process.

For even more detailed analysis of yesterday’s semifinals, check out the charting-based analysis of Li-Bouchard and Radwanska-Cibulkova.

Surprise Semifinalists at the Australian Open

Of the eight singles semifinalists in Melbourne, only two entered the tournament seeded in the top four. Rafael Nadal, the top seed in the men’s draw, has survived, and Li Na, the fourth seed in the women’s draw, is the highest-ranked player still alive on her side.

We haven’t exactly followed the script.

The women’s singles draw, with the top three seeds eliminated, is particularly unusual. It is only the 10th time in the last 35 years that none of the top three seeds have made it through to the final four of a Grand Slam. Such events have been heavily concentrated in the last decade or so–the fourth seed was the highest-ranked surviving player at Wimbledon in 2011 (Victoria Azarenka) and 2013 (Agnieszka Radwanska), and the fifth seed was the apparent favorite at Roland Garros in 2011 (Francesca Schiavone).

You might notice a pattern. In the nine previous Slams when no top-three seed reached the semifinal stage, the best remaining player didn’t fare so well. Both Vika and Aga fell to lower-ranked opponents when they were the remaining favorites at Wimbledon, and Schiavone lost her shot at the French Open to Li. Only twice in these nine majors did the highest-remaining seed in the semifinals go on to win: Martina Hingis, when she was seeded fourth at the 1997 Australian Open, and Anastasia Myskina, when she was the sixth seed at the 2004 French Open.

In a tournament full of surprises, we might not be done yet. It stands to reason that once the favorites are eliminated, the odds of subsequent upsets increase. The lower you go in the rankings, the less difference there usually is between players–there’s a bigger gap between Azarenka and Maria Sharapova than there is between, say, Jelena Jankovic and Angelique Kerber. The smaller the gap, the more likely the upset.

While only one top-four seed remains in the men’s draw, the odds of upsets are moving in the opposite direction. While Nadal can always count on a tough fight from second-seed Novak Djokovic, he typically has little trouble with lower-ranked players. He has won his last 15 matches against the other three players left in the draw–Roger Federer, Tomas Berdych, and Stanislas Wawrinka–and lost only 4 of 41 matches against the trio since 2008.

The historical precedent for this sort of semifinal draw also favors Rafa. 14 Grand Slams in the Open Era have featured a semifinal round in which the top seed is the only one remaining of the top four. The top seed has gone on to win 9 of the 14, including 8 of the last 10. The most recent final four that fit this profile was in Melbourne four years ago, when Federer swept the final two rounds without losing a set.

But even this rosy picture for Nadal offers Roger a glimmer of hope.  The last time the top seed was alone in the final four and didn’t go on to win was the 2002 US Open. Lleyton Hewitt was the #1 who failed, paving the way for a 31-year-old Pete Sampras to win one final slam before he retired.

Roger isn’t going to call it quits this week, but he’d sure like to emulate Pete’s success in seizing a wide-open Grand Slam draw.

A Quarterfinal on Federer’s Racquet

The Roger Federer–Andy Murray head-to-head is a bit of a baffling one. In twenty career meetings–18 of them on hard courts–Murray has won 11, including four of the last five.

Yet for a superficially tight one-on-one record, Fed and Murray haven’t played many tight matches against each other, especially lately. When they went five sets in last year’s Australian Open semifinal, it was the first time they had gone the distance in ten matches. The outcome of a match between them is up for grabs, but whoever wins it tends to do so by a handy margin.

Even that five-set semifinal last year wasn’t as close as it looked. Murray won 54.0% of total points and racked up a Dominance Ratio (DR) of 1.32, meaning that he won far more return points than Roger did. Five-setters are usually much closer to 50% and 1.0, respectively. While Murray won far more points, Federer displayed his historically-great tiebreak skill to keep himself in the match.

DR is a convenient measure of the closeness of a match, where 1.0 is a dead heat. Only two Fed-Murray matches–both before 2009–fell in the range between 0.85 and 1.15. By contrast, Novak Djokovic and Rafael Nadal have played seven matches (including two Grand Slam finals) in that range, and Djokovic and Murray have played five.

Tactical nonsense

To traffic in conventional wisdom for a moment, Federer is the most aggressive of the Big Four, while Murray is the most passive. To the extent Andy is likely to hurt Roger, it has more to do with his ability to force Fed into trying to do too much, particularly on the backhand side. If Federer plays patiently and picks his spots, he can crush Murray. If he plays too passively or hits bunches of unforced errors, it can be a rough day at the office.

However, there may not be much Murray can do to determine which Roger shows up.  Simply forcing Fed to hit backhands certainly isn’t enough. The Match Charting Project has amassed shot-by-shot data, including the number of groundstrokes hit from either side, for 23 Federer matches so far. Nadal is particularly good at directing the ball to Federer’s backhand, forcing Roger to hit 56% to 58% of groundstrokes from the backhand side in both a win (last year’s World Tour Finals) and a bad loss (the 2011 Tour Finals).

Taking the average of these 23 matches (most of which are Federer wins, as the Match Charting Project seems to have drawn lots of Fed fans), Roger hits 52.5% of his groundstrokes from the forehand side. This reflects the balance of two factors: Federer wanting to hit his forehand, and opponents trying to keep the ball away from it.

Surprisingly, hitting lots of balls to Fed’s backhand side seems to have few benefits. There is no meaningful correlation between DR and the percentage of groundstrokes Fed hit on the backhand side.

Based on the limited data available, it appears that Murray has tried a variety of tactics.

In the two Fed-Murray matches for which we have shot-by-shot data–the 2010 Australian Open final and the 2012 Dubai final–Murray took opposite approaches to the problem. In the Melbourne final, he managed to direct 57% of balls to Fed’s backhand, which is as good as anyone but Nadal has managed. In the Dubai match, Roger hit 64% of his groundstrokes from the forehand side, the second-highest rate of any of the 23 Federer matches in the database.

In both cases, Murray lost. To take another example, Juan Martin del Potro has beaten Fed while letting him hit 57% forehands and lost to him while forcing him to hit 57% backhands.

The database–limited in matches and biased as it is toward Fed’s victories–probably can’t take us any farther. But from here, we can speculate that Federer has it in his power to win or lose regardless of the tactics thrown his way. Murray, like Nadal, has always forced him to hit one extra ball. The sort of aggression that takes a player far out of position to hit, for instance, an inside-out forehand can backfire against such a talented defensive player.

In four matches at the Australian Open so far, Federer has offered us plenty of glimpses of his glory days. Murray will likely prove to be his biggest test of the tournament, but Fed’s fate still hangs on his own racquet.

Bouchard, Halep, and First-Time Quarterfinalists

Two of the final eight women in Melbourne, Eugenie Bouchard and Simona Halep, are playing in their first Grand Slam quarterfinals. Let’s take a look at how other women have done in their first appearances this late in a Slam.

In the Open era, 267 different women have reached the final eight of a Slam. At the time of their debut quarterfinal, their average age was roughly 21 years and four months. Their average WTA ranking was 42, not considering those who predated the ranking system or those who reached their first quarterfinal as an unranked player.

Of the 267, 197 (73.8%) progressed no further in their breakthrough slam. 52 (19.5%) won one more match, losing in the semifinals; 12 (4.5%) reached the final but lost; and the remaining six players won the title when they reached their first Open-era quarter.

However small 6 of 267 sounds, such an outcome is actually even rarer. Three of those six first-time quarterfinalists don’t really count–they reached their first QF in 1968, the first year of the Open era. Billie Jean King, winner of the Australian Open that year, isn’t that great a comp for Bouchard or Halep. The only other players to win a Grand Slam in their first quarterfinal appearance are Chris O’Neil (1978 Australian), Barbara Jordan (1979 Australian), and Serena Williams (1999 US Open).

While we can’t count on Bouchard or Halep winning the tournament this week, their appearances in Slam quarterfinals at relatively young ages bode well. The earlier a player reaches her first major QF, the more QFs she is likely to reach over the course of her career. In fact, of the 22 women who have reached more than 10 Slam quarterfinals since 1984, only one of them–Jana Novotna–failed to reach her first one in her teens. She didn’t make it until the ripe old age of 20 years and 8 months.

Bouchard has just snuck in before her 20th birthday, which she’ll celebrate next month. Her most age-appropriate comp is Victoria Azarenka, who reached her first major quarterfinal–at the 2009 French Open–just a few weeks younger than Genie is now. Less than five years later, Vika will play her 12th Slam QF.

Less optimistic comparisons for Bouchard are Yanina Wickmayer and Anna Chakvetadze, both of whom reached their first major quarterfinal in the last two months of their teens. Chakvetadze made two more final eights; Wickmayer is still looking for her second.

If history is any guide, Halep’s prospects are bleaker. At 22 years and four months, she is much older than any of the players who have reached double-digit Slam quarterfinals except for Li Na, who is playing in her 10th QF this week. Li didn’t play in the final eight of a Grand Slam until she was 24 years old.

The 61 players who reached their first Slam QF at an older age than Halep did not, on average, achieve much more. They’ve totaled 81 additional QFs–well below two per person.

Of course, the age profile of the WTA is changing, so a 22-year-old debutante isn’t nearly the oddity it was a decade or two ago. It’s no coincidence that Halep’s most optimistic comp is Li, an active player. That’s the most positive outlook for the Romanian, anyway. To rack up an impressive career record, she’ll have to follow Li’s lead and overcome a late start.

The ATP final eight also features a newbie, Grigor Dimitrov. The changing age profile of the ATP is even more drastic, so age-based analysis is less meaningful. But we can take a quick look at the precedents for the Bulgarian’s first Slam quarterfinal.

There have been 329 ATP Slam quarterfinalists in the Open era, and first-timers stand a better chance in the men’s game. 32.5% of debut Slam quarterfinalists have advanced to the semis, and 13 of them (4.0%) went on to win the tournament. Then again, none of them had to beat Rafael Nadal in the quarters.

While Dimitrov is older than Halep–and as noted, 22-year-olds didn’t use to be considered so young on the ATP tour–there are some positive examples for Grigor to follow.

Michael Stich reached his first Slam QF at almost exactly the same age as Dimitrov is now, and he not only reached the semis at that event (the 1991 French Open), but qualified for the final eight in nine more majors. Jo Wilfried Tsonga, David Ferrer, and Nikolay Davydenko all reached their first Slam QF later than Dimitrov, and each has played in the final eight at least ten times.

On average, those optimistic comps are outweighed by all the guys who made it to one or two Slam QFs later in their career. The 153 players who reached their first final eight later than Dimitrov’s current age have returned to a total of 362 additional quarterfinals, an average of a little over two more appearances per player.

Despite all the hype, Dimitrov’s performance this year isn’t a drastic breakthrough. It’s only a single step in the right direction–especially considering that he reached this milestone by beating the #73 player in the world. He could be the next Tsonga, or he could be the next Robby Ginepri.

The Limited Value of Head-to-Head Records

Italian translation at settesei.it

Yesterday at the Australian Open, Ana Ivanovic defeated Serena Williams, despite having failed to take a set in four previous meetings. Later in the day, Tomas Berdych beat Kevin Anderson for the tenth straight time.

Commentators and bettors love head-to-head records. You’ll often hear people say, “tennis is a game of matchups,” which, I suppose, is hardly disprovable.

But how much do head-to-head records really mean?  If Player A has a better record than Player B but Player B has won the majority of their career meetings, who do you pick? To what extent does head-to-head record trump everything (or anything) else?

It’s important to remember that, most of the time, head-to-head records don’t clash with any other measurement of relative skill. On the ATP tour, head-to-head record agrees with relative ranking 69% of the time–that is, the player who is leading the H2H is also the one with the better ranking. When a pair of players have faced each other five or more times, H2H agrees with relative ranking 75% of the time.

Usually, then, the head-to-head record is right. It’s less clear whether it adds anything to our understanding. Sure, Rafael Nadal owns Stanislas Wawrinka, but would we expect anything much different from the matchup of a dominant number one and a steady-but-unspectacular number eight?

H2H against the rankings

If head-to-head records have much value, we’d expect them–at least for some subset of matches–to outperform the ATP rankings. That’s a pretty low bar–the official rankings are riddled with limitations that keep them from being very predictive.

To see if H2Hs met that standard, I looked at ATP tour-level matches since 1996. For each match, I recorded whether the winner was ranked higher than his opponent and what his head-to-head record was against that opponent. (I didn’t consider matches outside of the ATP tour in calculating head-to-heads.)

Thus, for each head-to-head record (for instance, five wins in eight career meetings), we can determine how many the H2H-favored player won, how many the higher-ranked player won, and so on.
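One way to picture that bookkeeping is the minimal sketch below. It is not the code behind the numbers in this post; the function name, the tuple layout, and the toy matches are all assumptions made for illustration. It walks through matches in chronological order, looks up the head-to-head as it stood before each match, and counts how often the H2H leader and the higher-ranked player won.

```python
from collections import defaultdict

def tally_h2h_vs_ranking(matches):
    """Group matches by prior head-to-head record and count correct picks.

    `matches` is a hypothetical chronological list of tuples:
    (winner, loser, winner_rank, loser_rank), where a lower rank number is better.
    """
    h2h = defaultdict(int)  # (a, b) -> number of prior wins by a over b
    buckets = defaultdict(lambda: {"matches": 0, "h2h_right": 0, "rank_right": 0})

    for winner, loser, w_rank, l_rank in matches:
        w_wins = h2h[(winner, loser)]
        l_wins = h2h[(loser, winner)]
        if w_wins != l_wins:  # skip first meetings and tied head-to-heads
            key = (max(w_wins, l_wins), min(w_wins, l_wins))  # e.g. (4, 1)
            bucket = buckets[key]
            bucket["matches"] += 1
            bucket["h2h_right"] += int(w_wins > l_wins)   # H2H leader won
            bucket["rank_right"] += int(w_rank < l_rank)  # higher-ranked player won
        # Update the record only after the match has been classified.
        h2h[(winner, loser)] += 1

    return dict(buckets)

# Toy data: A (ranked 5) beats B (ranked 10) twice, then B wins the third meeting.
toy = [("A", "B", 5, 10), ("A", "B", 5, 10), ("B", "A", 10, 5)]
result = tally_h2h_vs_ranking(toy)
for record, counts in sorted(result.items()):
    print(record, counts)
```

Updating the head-to-head only after each match is classified keeps the result of a match from leaking into its own prediction.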

For instance, I found 1,040 matches in which one of the players had beaten his opponent in exactly four of their previous five meetings.  65.0% of those matches went the way of the player favored by the head-to-head record, while 68.8% went to the higher-ranked player. (54.5% of the matches fell in both categories.)

Things get more interesting in the 258 matches in which the two metrics did not agree.  When the player with the 4-1 record was lower in the rankings, he won only 109 (42.2%) of those matchups. In other words, at least in this group of matches, you’d be better off going with ATP rankings than with head-to-head results.
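The percentages for the 4-1 group can be cross-checked against each other. The short sketch below uses only the figures quoted above; the count of 567 matches in which both metrics picked the winner is an assumption, inferred by rounding 54.5% of 1,040, and the variable names are illustrative.

```python
total = 1040                   # matches with a prior 4-1 head-to-head
disagree = 258                 # matches where H2H and ranking favored different players
h2h_wins_when_disagree = 109   # H2H favorite won despite the lower ranking

# Assumption: inferred from the rounded 54.5% figure, not stated directly.
both_right = round(0.545 * total)

rank_wins_when_disagree = disagree - h2h_wins_when_disagree

h2h_pct = 100 * (both_right + h2h_wins_when_disagree) / total
rank_pct = 100 * (both_right + rank_wins_when_disagree) / total
h2h_pct_when_disagree = 100 * h2h_wins_when_disagree / disagree

print(f"H2H favorite won:        {h2h_pct:.1f}%")
print(f"Higher-ranked won:       {rank_pct:.1f}%")
print(f"H2H favorite, conflicts: {h2h_pct_when_disagree:.1f}%")
```

Under that assumption the three results round to 65.0%, 68.8%, and 42.2%, matching the figures in the text.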

Broader view, similar conclusions

For almost every head-to-head record, the findings are the same. There were 26 head-to-head records–everything from 1-0 to 7-3–for which we have at least 100 matches worth of results, and in 20 of them, the player with the higher ranking did better than the player with the better head-to-head.  In 19 of the 26 groups, when the ranking disagreed with the head-to-head, ranking was a more accurate predictor of the outcome.

If we tally the results for head-to-heads with at least five meetings, we get an overall picture of how these two approaches perform. 68.5% of the time, the player with the higher ranking wins, while 66.0% of the time, the match goes to the man who leads in the head-to-head. When the head-to-head and the relative ranking don’t match, ranking proves to be the better indicator 56.5% of the time.

The most extreme head-to-heads–that is, undefeated pairings such as 7-0, 8-0, and so on–are the only groups in which H2H consistently tells us more than ATP ranking does. 80% of the time, these matches go to the higher-ranked player, while 81.9% of the time, the undefeated man prevails. In the 78 matches for which H2H and ranking don’t agree, H2H is a better predictor exactly two-thirds of the time.

Explanations against intuition

When you weigh a head-to-head record more heavily than a pair of ATP rankings, you’re relying on a very small sample instead of a very big one. Yes, that small sample may be much better targeted, but it is also very small.

Not only is the sample small, often it is not as applicable as you might think. When Roger Federer defeated Lleyton Hewitt in the fourth round of the 2004 Australian Open, he had beaten the Aussie only twice in nine career meetings. Yet at that point in their careers, the 22-year-old, #2-ranked Fed was clearly in the ascendancy while Hewitt was having difficulty keeping up. Even though most of their prior meetings had been on the same surface and Hewitt had won the three most recent encounters, that small subset of Roger’s performances did not account for his steady improvement.

The most recent Fed-Hewitt meeting is another good illustration. Entering the Brisbane final, Roger had won 15 of their previous 16 matches, but while Hewitt has maintained a middle-of-the-pack level for the last several years, Federer has declined. Although the two had played 26 times before the Brisbane final, none of those contests had come in the previous two years.

Whether it’s surface, recency, injury, weather conditions, or any one of dozens of other factors, head-to-heads are riddled with external factors. That’s the problem with any small sample size–the noise is much more likely to overwhelm the signal. If noise can win out in the extensive Fed-Hewitt head-to-head, most one-on-one records don’t stand a chance.

Any set of rankings, whether the ATP’s points system or my somewhat more sophisticated (and more predictive) jrank algorithm, takes into account every match both players have been involved in for a fairly long stretch of time. In most cases, having all that perspective on both players’ current levels is much more valuable than a noise-ridden handful of matches. If head-to-heads can’t beat ATP rankings, they would look even worse against a better algorithm.

Some players surely do have an edge on particular opponents or types of opponents, whether it’s Andy Murray with lefties or David Ferrer with Nicolas Almagro. But most of the time, those edges are reflected in the rankings–even if the rankings don’t explicitly set out to incorporate such things.

When Kevin Anderson next draws Berdych, he should take heart. His odds of beating the Czech aren’t much different from those of any other man ranked around #20 facing someone in the bottom half of the top ten. Even accounting for the slight effect I’ve observed in undefeated head-to-heads, a lopsided one-on-one record isn’t fate.

Should WTA Players Approach the Net More?

Italian translation at settesei.it

21st-century women’s tennis is a baseline game. Some players are better able to identify opportunities to approach the net than others, and some can handle themselves quite well when they get there. But if a fan from a few decades ago were dropped off at the 2014 Australian Open, she would be shocked by the rarity of net points and the clumsiness of many players when they move forward.

Since almost all television commentators were excellent players in a more net-centric era, a frequent refrain during almost any broadcast is that players should rush the net more often. “Frequent” might be understating it–in a fit of pique, I was driven to say this:

Regardless of repetition, it’s worth further investigation. It’s certainly true that a skilled netwoman could win more points by moving forward. But when pros don’t emphasize that part of their game and they gain little match experience approaching the net, do they have the skills necessary to take advantage of such an opportunity?

Enter some numbers

At this point, you might be tempted to look at the oft-collected “Net Points” stat. Resist the urge. In a baseline-oriented match, net points can have little to do with net approaches. Attempting to return a drop shot is considered a net point. Putting away a weak service return is considered a net point. In many WTA matches, more than half of “net points” do not involve an approach. The player was induced to come to the net for some reason.

Making matters worse, that non-approach segment of net points has little to do with net approaches. Given a weak, floating return, any competent player should be able to whack it for a swinging volley winner. At the other end of the spectrum, chasing down a drop shot relies on a different set of skills than picking a moment to hit an approach shot and then confidently placing a volley or two.

Fortunately, the Match Charting Project gives us some more detailed, approach-specific data.

Twenty matches in the charting database are from the first month of the 2014 WTA season, most of them from the first week in Melbourne. This data differentiates between “net approaches” and “net points.” In one of the more aggressive performances in the database, Angelique Kerber, in her loss to Tsvetana Pironkova in Sydney, won 15 of 19 net points. Of her ten net approaches, she won all ten.

(For any match report in the charting database–here’s the Kerber-Pironkova match–click one of the two “Net Points” links to see those stats. There is a different table for each player.)

Kerber’s ten net approaches are tied for the most in any WTA match charted this year. Last night, Garbine Muguruza also tallied ten net approaches, though she did so in a longer match.

In these twenty matches, only 27 of 40 players made even one traditional net approach. Including those who made zero, the average is just over three net approaches per match. The 27 who approached the net at least once averaged 4.7 per match.

Clearly, a lot of opportunities for offense are going unclaimed.

How they’re doing

Of the 126 net approaches we’ve tracked, the approaching player has won 84–exactly two-thirds. While that isn’t an overwhelming endorsement–many approach shots are hit in response to a weak groundstroke that already puts the opponent at a disadvantage–it certainly doesn’t count as evidence against the practice.
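The averages and the win rate quoted in this section fit together, as the quick check below shows. It assumes that the 126 tracked approaches are the full total across the 40 player-matches in the twenty charted matches; the function and key names are illustrative only.

```python
def net_approach_summary(total_approaches, approaches_won, players, players_with_approach):
    """Return per-player averages and the overall win rate on net approaches."""
    return {
        "per_player_match": total_approaches / players,
        "per_approaching_player": total_approaches / players_with_approach,
        "win_rate": approaches_won / total_approaches,
    }

# Figures quoted in the text: 126 approaches, 84 won, 40 players, 27 with at least one approach.
summary = net_approach_summary(126, 84, 40, 27)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The per-player figure comes out just over three and the figure for players who approached at least once rounds to 4.7, in line with the numbers above.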

In half of all net approaches, the netrusher either hits an outright winner at the net or induces a forced error with a net shot.  Only 12% of the time does the opponent hit a passing shot winner. In another 5% of these points, the opponent induces a forced error with a passing shot. In 12% of net approach points, the player who moved forward hits an unforced error at the net.

Of the 27 players in the database who approached the net at least once, only six failed to win half of those points (three of whom only came forward once), and three more won exactly half of their net approach points.

The women in this sample who seize the most opportunities to rush the net have been particularly successful, as well. Seven of the eight who moved forward the most won more than half of their approach points.  This allows us to tentatively conclude that all the other players–the ones who picked only a few spots to approach the net during their matches–could have seized more opportunities. There may be a limit in the modern game to how much netrushing is wise, but the observed maximum of ten points per match doesn’t seem to be it.

Inevitable unknowns

Whether we look at Kerber and her 10/10 net-approach performance in Sydney or Sloane Stephens and her 1/1 tally yesterday against Elina Svitolina, it’s impossible to know the results of the next approach shot–or the next five.  We can compare single-match results and see that it’s possible for a WTA player to have a perfect record on her ten net approaches, but we can’t perform lab experiments in which Sloane plays Svitolina again and comes forward ten times instead of one.

For all the success that players enjoy when they do move forward, there are plenty of reasons not to. As I said at the outset, today’s players don’t practice net skills nearly as much as baseline skills, and they certainly don’t get much in-match practice. If someone isn’t comfortable approaching the net at a certain time, is it really a good idea for her to do so?

In the abstract, both intuition and statistical analysis support the position that WTA players could move forward more. When they do approach the net, they are often successful, putting away volley winners and rarely getting passed. But I suspect this implies a long-term strategy more than the sort of thing a coach should emphasize during a changeover.

When commentators suggest that a player should move forward, what I think they really mean is this: “If this player were more comfortable with her transition game, this would be a great opportunity to take advantage of that.” Or: “Players should work harder on their approach shots on the practice court so that they’re ready for opportunities like this one.” Or simply: “Martina would have won that point ten shots ago.”

There seems to be opportunity waiting for more, well, opportunistic young players. But it isn’t one that can be generated simply by a sudden coaching change or a harangue from John McEnroe. Only when a player emerges with the baseline game to contend with the best pros and a transition/net game that exceeds most of those on the tour today will we find out just how much opportunity today’s players have wasted.