Break Point Conversions and the Close Matches Federer Isn’t Winning

Italian translation at settesei.it

The career head-to-head between Roger Federer and Novak Djokovic sits at 21-21, but the current era of this rivalry is hardly even. Since the beginning of 2011, Djokovic has won 15 of 23, including last night’s US Open final.

These matches tend to be close ones. In only 7 of the 23 matches has either player won more than 55% of points, and in more than half (12 of 23), neither player has won more than 53% of points, fitting my proposed definition of lottery matches.

In the 12 lottery matches between Fed and Novak since 2011, the player who won the most points always won the match. Yet Djokovic wins far more (9 of 12) of these close matches. Last night was a perfect example: Federer won more return points than his opponent, and it was the third time since the 2012 Tour Finals that Novak beat Fed while winning 50.3% of points.

When a player wins 50.3% of points, he wins the match only 59% of the time. Even at 51.8%, Novak’s total points won in three other Federer matches, the player with more points wins only 91% of the time.

If many of the matches are close, and one player is winning so many of the matches, there must be more to the story.

Back to break points

Clearly, Novak is winning more big points than Roger is. Since Federer has won more than half of the tiebreaks between them, the next logical place to look is break points.

Federer’s perceived inability to convert break points has been a concern for years. Early last year, I wrote about his success rate on break points, and found that while he does, in fact, convert fewer break points than expected, it’s only a few percentage points. Further, it’s not a new problem: He was winning fewer break points than he should have been back when he was the unchallenged top player in the game.

Against Novak, though, it’s another story, and since they’ve faced each other so often, we can no longer write off a poor break-point performance as an outlier.

In these last 23 matches–including last night’s 4-for-23 on break points–Federer has converted 15% fewer break points than expected, twice as bad as his worst single-season mark. Djokovic, on the other hand, has converted break points at almost the same rate as other return points.

I’m often hesitant to use the c-words, but the evidence is piling up that in these particular clutch situations, Roger is choking. At the very least, we can eliminate a couple of alternative explanations, those based on break point opportunities and on performance in the ad court.

Let’s start with break point opportunities. 4-for-23 on break points is painful to look at, but there is a positive: You have to play very well to generate 23 break point chances against a top player. In fact, there’s a very clear, almost linear relationship between return points won and break point chances generated, and Federer beat expectations by 77% yesterday. Over 21 return games, a player who won 39% of return points, as Roger did, would be expected to create only 13 break point opportunities. A 4-for-13 mark would still be disappointing, but it wouldn’t induce nearly as many grimaces.

In these 23 matches, Federer has generated exactly as many break point chances as expected. Djokovic has done the same. The story here is clearly about performance at 30-40 or 40-AD, not about anything earlier in the game. On non-break points yesterday, Fed returned more effectively.

The other explanation would be that Roger’s poor break point record has to do with the ad court. Against Rafael Nadal, that might be true: Much of the Spaniard’s effectiveness saving break points has to do with the way he skillfully uses left-handed serving in that court.

But in the Novak-Fed head-to-head, we can rule this out as well. According to Match Charting Project data, which includes more than 40 Djokovic matches and 90 Federer matches, neither player performs much better in either half of the court. Djokovic wins slightly more service points in the deuce court (65% to 64% in general, 66% to 64% on hard courts), and Federer wins return points at the same rate in both courts.

Pundits like to say that tennis is a game of matchups, and in this rivalry, both players defy their typical patterns. Over the course of his career, Novak has saved break points more effectively than average, but not nearly as well as he does against Federer. Federer, for his part, has turned in some of his best return performances against Djokovic … except for these dismal efforts converting break points, when he is far worse than his already-weak averages.

Perhaps the only solution for Roger is to find even more ways to improve his world-class service games. In the previous match against Novak, he converted only one of eight break point chances–the sort of stat that would easily explain a loss. That day in Cincinnati, though, Federer’s one break of serve was better than Djokovic’s zero.

Fed won 56.4% of total points in that match, his third-highest rate against Djokovic since 2011. If Novak is going to play better clutch tennis and win the close matches, that leaves Federer with an unenviable alternative. To win, he must decisively outplay the best player in the world.

A New Way of Looking at Lottery Matches

When Rafael Nadal was eliminated from the US Open last week, a bit of bad luck was involved. He won only two fewer points than his opponent, Fabio Fognini, claiming 49.7% of the total points played. In his career up to that point, Rafa had won 8 of 18 matches in which he won between 49% and 50% of total points. It doesn’t take much to flip the result of such a match.

Matches in which neither player wins more than 51% of points represent nearly one in ten contests on the ATP tour. As Michael Beuoy demonstrated last year, those matches are very much up for grabs: the player with the most points wins less than 65% of the time.

In writing about the small subset of matches in which the loser wins a higher percentage of return points than the winner, Carl Bialik has coined the useful term “lottery matches.” However, Bialik has limited the term to those matches that have an unexpected result. I’d like to expand the definition a bit to all those tight matches that could go either way, even if the player who wins the most points ends up winning as expected.

(A quick side note: Bialik prefers comparing return points, the building blocks of his Dominance Ratio metric. Matches are won a bit more frequently when the winner’s DR is below 1.0 than when he wins fewer than 50% of total points played. These metrics often overlap, of course. To make this arcane subject a bit more accessible, I’m going to stick with the traditional total-points-won stat.)

As Beuoy showed, matches aren’t guaranteed to go to the player who wins the most points unless that guy wins at least 53% of points. (Even then, there’s a slight possibility of an upset, but it’s sufficiently rare that, for today’s purposes, I’m going to ignore it.) 52.5% is much better than 50.5%, but at 52.5%, you’re still going to lose about one of every 25 matches.

By extending the “lottery match” umbrella to all those matches in which neither player wins 51%, 52%, or even 53% of total points, we acknowledge that none of these matches are sure things, and we can look at a broader range of matches to determine whether players are winning as many tight matches as they should. Further, by considering such a category of tight matches, we’ll be able to identify those men who play a lot of them–and by doing so, leave themselves vulnerable to lucky upsets.

Winning the lottery (matches)

Let’s start with the broadest category: all matches in which neither player won more than 53% of total points. These represent everything from true toss-ups at 50% to near-guarantees at 52.9%. Using Beuoy’s model, we can take the total points won from each of these matches and calculate the likelihood that the player with the greater number of points won the match.

Nadal, for instance, is one of the more effective players in these tight matches. Going into the US Open, he had played 168 of them, winning 115. By taking the total points won from each of these matches, we find that he “should have” won only 102.5 of them, meaning that by some combination of clutch play and luck, he’s outperformed expectations by 12%.
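The comparison in the previous paragraph boils down to summing per-match win probabilities and dividing actual wins by that sum. Here is a minimal Python sketch of that arithmetic. The helper names `expected_wins` and `outperformance` are hypothetical, and the three sample probabilities are made up for illustration; in practice they would come from a points-to-win-probability model such as Beuoy's, which is not reproduced here. Only Nadal's totals (115 wins, 102.5 expected) come from the text.

```python
def expected_wins(win_probs):
    """Sum per-match win probabilities to get the expected number of wins."""
    return sum(win_probs)


def outperformance(actual_wins, expected):
    """Fraction by which actual wins exceed (or trail) the expected total."""
    return actual_wins / expected - 1


# Hypothetical per-match probabilities for three tight matches (assumed values).
sample_probs = [0.59, 0.91, 0.41]
print(round(expected_wins(sample_probs), 2))

# Nadal's totals from the text: 115 wins against 102.5 expected.
print(f"{outperformance(115, 102.5):+.1%}")  # +12.2%
```

The ratio form is an assumption about how the 12% figure was derived, though it matches the numbers quoted above.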

Among active players with at least 100 of these matches, Nadal ranks an impressive fourth overall, behind John Isner, Fognini, and Jurgen Melzer. Novak Djokovic and Andy Murray are just inside the top 20, exceeding expectations by 6% and 5%, respectively, while Roger Federer is much further down the list, winning 7% fewer of these tight matches than he should.

Finding Fed on the negative end of this list is a surprise, since Federer, Nadal, and Isner are among the very, very few players who consistently beat expectations in tiebreaks. Tiebreak skill should be closely related to outperforming expectations in tight matches. In any event, my collaborator on a related project, Ryan Rodenberg, has written at length about Federer’s lack of success in some lottery matches.

When we narrow the focus to matches in which neither player won more than 51% of points–true toss-up matches–Nadal is still among the best. In fact, the top four of Rafa, Fognini, Melzer, and Isner remains the same, as each of those players has won between 36% and 38% more often than they should in contests with these extremely slim margins.  Once again, Djokovic and Murray are positive, at +16% and +6%, respectively, while Federer trails far behind, at -9%.

Careening downward

A big advantage of using the broader, 53-percent-of-points definition of lottery matches is that it gives us a larger sample to work with. Nadal has only played 27 matches in his career when the loser won more points than the winner did, and only 40 when neither player topped 51% of total points won.

In the 53% category, though, Nadal has amassed several matches each year of his career, allowing us to look at more meaningful trends. From 2005-11, he averaged about 15 tight matches per year, and in each season he won at least one more than we would’ve expected of him, often two or three. Since the beginning of last year, though, he’s played 25, winning only 13 when he should have won 16.

Even with the bigger sample, these are small margins. If Nadal comes roaring back next year and beyond, again winning more close matches than expected, we’ll ultimately see these two seasons as outliers. Yet most of Nadal’s peers post surprisingly consistent records in tight matches. In the last decade, Djokovic and Murray have each had only one season below -10%, and Federer has reliably underperformed, never reaching +7% for a full season. Not every player is as good in these matches as Nadal, but the ones who do excel post roughly similar numbers from one year to the next.

The bigger picture

Winning tight matches is useful, but as Federer’s experience demonstrates, it’s hardly necessary. And in the case of Fognini, exceeding expectations in lottery matches is hardly sufficient for more general success.

Even better than winning tight matches is winning easy matches, and a useful side effect of studying lottery matches is generating measurements of who plays them the most–and, of course, the least.

Lottery matches–again, those in which neither player wins more than 53% of points–represent fewer than 20% of Rafa’s career matches. His 19.7% rate of close contests is lower than that of any other player since 2000 (minimum 100 matches). In this category, the big four are bunched together as expected. Among active players, Federer is second lowest, Djokovic is third, and Murray is eighth. Kei Nishikori and David Ferrer are also among the top ten.

At the other end of the spectrum, we find the usual big-serving suspects. Vasek Pospisil tops the list at 49.5%, with Ivo Karlovic (44.5%), Isner (41.9%), and Jerzy Janowicz (40.5%) filling out the top four.

Analyzing the results of very close matches–whichever definition you prefer–is a useful way of identifying players on lucky or unlucky streaks, or even those who appear to play particularly well on big points. However, the more meaningful metric–certainly the one that more closely correlates with elite-level success–is the one that tells us who is avoiding tight matches. The only thing better than luck is not needing it.

The Effects (and Maybe Even Momentum) of a Long Rally

In yesterday’s quarterfinal between Simona Halep and Victoria Azarenka, a highlight early in the third set was a 25-shot rally that Vika finished off with a forehand winner. It was the longest point of the match, and moved her within a point of holding serve to open the set.

As very long rallies often do, the point seemed like it might represent a momentum shift. Instead, Halep sent the game back to deuce after a 10-stroke rally on the next point. If there was any momentum conferred by these two points, it disappeared as quickly as it arose. It took eight more points before Azarenka finally sealed the hold of serve.

Does a long rally tell us anything at all? Does it have predictive value for the next point, or even the entire game, or is it just highlight-reel fodder that is forgotten as soon as the umpire announces the score?

To answer those questions, I delved into the shot-by-shot data of the Match Charting Project, which now contains point-by-point accounts of nearly 1,100 matches. I identified the longest 1% of points–17 shots or longer for women, 18 shots for men–and analyzed what happened afterwards, looking for both fatigue and momentum effects.

The next point

There’s one clear effect of a long rally: The next point will be shorter than average. The 10-shot rally contested by Vika and Simona yesterday was an outlier: Women average 4.45 shots on the point after a long rally, while the overall average (controlled for server and first or second serve) is 4.85. Men average 4.03 shots on the following point, compared to an average of 4.64.

For women, fatigue is also a factor for the server. Following a long rally, women land only 61.3% of first serves, compared to an average of 64.6%. Men don’t exhibit the same fatigue effect; the equivalent numbers are 62.3% and 62.2%.

There’s more evidence of an immediate fatigue factor for women, as well. The players who win those long rallies are slightly better than their opponents, winning 50.7% of points on average. Immediately after a long rally, however, those same players win only 49% of points. It’s not obvious to me why this should be the case. Perhaps the player who won the long rally worked a bit harder than her opponent, maybe putting all of her remaining effort into a groundstroke winner, or finishing the point with a couple of athletic shots at the net.

In any case, there’s no equivalent effect for men.  After winning a long rally, players win 51.1% of their next points, compared to an expected 50.8%. That’s either a very small momentum effect or, more likely, a bit of statistical noise.

Both men and women double fault more often than usual after a long rally, though the effect is much greater for women. Immediately following these points, women double fault 4.7% of the time, compared to an average of 3.3%. Men double fault 4.5% of the time after a long rally, compared to an expected rate of 4.2%.

Longer-term momentum

Beyond a slight effect on the characteristics of the next point, does a long rally influence the outcome of the game? The evidence suggests that it doesn’t.

For each long rally, I identified whether the winner of the rally went on to win the game, as Vika did yesterday. I also combined the score after the long rally with the average rate of points won on the appropriate player’s serve to calculate the odds that, from such a score, the player who won the rally would go on to win the game. To use yesterday’s example, when Azarenka held game point at AD-40, her chances of winning the game were 77.6%.
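The score-to-odds step can be sketched as a small recursion over point scores. This is only an illustration under a simplifying assumption: the server wins every point with the same probability `p`, independently of the score. The function name `game_win_prob` and the input 0.518 are hypothetical; that value was picked here because it lands near the 77.6% quoted above, and it is not a figure from the text.

```python
def game_win_prob(p, server_pts=0, returner_pts=0):
    """Probability that the server wins the game from the given score.

    p is the server's chance of winning any single point (assumed constant).
    Points are counted 0, 1, 2, 3 for love, 15, 30, 40; AD-in is (4, 3).
    """
    q = 1 - p
    deuce = p * p / (p * p + q * q)  # closed form for winning from deuce

    def win(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0  # server has won the game
        if b >= 4 and b - a >= 2:
            return 0.0  # returner has won the game
        if a >= 3 and b >= 3:
            if a == b:
                return deuce
            if a > b:
                return p + q * deuce  # advantage server
            return p * deuce  # advantage returner
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_pts, returner_pts)


print(game_win_prob(0.5, 4, 3))  # 0.75
print(round(game_win_prob(0.518, 4, 3), 3))  # 0.776
```

Comparing the share of games actually won with the average of these probabilities across all long-rally points gives the "expected" figures used in the comparison.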

For both men and women, there is no significant effect. Women who won long rallies went on to win 66.2% of those games, while they would have been expected to win 65.7%. Men won 64.4% of those games, compared to an expected rate of 64.1%.

With a much larger dataset, these findings might indicate a very slight momentum effect. But limited to under 1,000 long-rally points for each gender, the differences represent only a few games that went the way of the player who won the long point.

For now, we’ll have to conclude that the aftereffects of a long rally have a very short lifespan: barely one point for women, perhaps not even that long for men. These points may well have a greater effect on fans than they do on the players themselves.

Is Kevin Anderson Developing Into an Elite Player?

With his upset win over Andy Murray on Monday, Kevin Anderson reached his first career Grand Slam quarterfinal. At age 29, he’ll ascend to a new peak ranking, and with a bit of cooperation from the rest of the draw, one more win could put him in the top ten for the first time.

Anderson has been a stalwart in the top 20 for two years now, but this additional step comes as a bit of a surprise. Despite the overall aging of the ATP tour and the emergence of Stan Wawrinka as a multi-Slam champion, it’s still a bit difficult to imagine a player in his late twenties taking major steps forward in his career.

What’s more, Anderson’s game is very serve-dependent. With an excellent backhand, he isn’t as one-dimensional a player as John Isner, Ivo Karlovic, or perhaps even Milos Raonic, but it’s much easier to categorize him with those players than with more baseline-oriented peers.

In today’s game, it is very difficult to reach the very top ranks without a quality return game. Tiebreaks are too much of a lottery to depend on in the long term; you have to consistently break serve to win matches. As I wrote in a post about Nick Kyrgios earlier this year, almost no players have finished a season in the top ten without winning at least 37% of return points. Anderson has achieved that mark only once, in 2010. Entering the US Open this year, he was winning only 34.2% of return points.

The only top-ten player this year with a lower rate of return points won is Raonic, at 30.2%. Raonic is a historical anomaly, and as his tiebreak winning percentage has tumbled, from a near-record 75% last year to a more typical 51% this year, his place in the top ten is in jeopardy as well. In other words, the only servebot in the top ten has to rely on plenty of luck–or outstanding, perhaps one-of-a-kind skills in the clutch–to remain among the game’s elite.

Anderson is a more well-rounded player than Raonic, and he wins more return points than that. But he still falls well short of the next-worst return game in the top ten, Wawrinka’s 36.7%. The 2.5 percentage points between Anderson and Wawrinka represent a big gap, almost one-fifth of the entire range between the game’s best and worst returners.

The less effective a player’s return game, the more he must rely on tiebreaks to win sets, and that’s one explanation for Anderson’s success this season. His 62% (26-16) tiebreak winning percentage in 2015 is the best of his career, and considerably higher than his career tiebreak winning percentage of 54%. Again, it sounds like a small difference, but take away three or four of the tiebreaks he’s won this year, and he might not have reached the final at Queen’s Club … or might not be preparing for a quarterfinal in New York.
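To see how thin that margin is, here is a quick Python check of the tiebreak arithmetic. It assumes that "taking away" a tiebreak means turning a win into a loss, which is one reading of the sentence above; the helper name `win_pct` is hypothetical.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction of decided tiebreaks."""
    return wins / (wins + losses)


wins, losses = 26, 16  # Anderson's 2015 tiebreak record from the text
print(f"{win_pct(wins, losses):.1%}")  # 61.9%

# Flip three or four of those wins into losses (assumed interpretation).
for flipped in (3, 4):
    print(flipped, f"{win_pct(wins - flipped, losses + flipped):.1%}")
# 3 54.8%
# 4 52.4%
```

Under that reading, three flipped tiebreaks would put his 2015 rate almost exactly at his career mark of 54%.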

Very few players have managed to spend meaningful time in the top ten while depending so heavily on winning tiebreaks. Another metric to help us see this is the percentage of sets won that are won in tiebreaks. Entering the US Open, just over 25% of Anderson’s sets won were won in tiebreaks. Only four times since 1991 has a player sustained a rate that high and ended the year in the top ten: Raonic last year, Andy Roddick in 2007 and 2009, and Greg Rusedski in 1998.

In fact, between 1991 and 2014, only 17 times did a player finish a season in the top ten with this rate above 20%. Roddick represents five of those times, and almost all, except for Roddick at his peak, were players who finished outside the top five. Wawrinka’s and Raonic’s 2014 seasons were the only occurrences in the last five years.

The one ray of light in Anderson’s statistical profile this season is a significantly improved first serve. His 2015 ace rate is over 18%, compared to the 2014 (and career average) rate of 14%. His percentage of first-serve points won is up to 78.8%, from last season’s 75.4% and a career average of 75.8%.

This is a major improvement, and is the reason why he is one of only five players on tour (along with Isner, Karlovic, Roger Federer, and Novak Djokovic) winning more than 69% of service points this year. In many ways, Anderson’s stats are similar to those of Feliciano Lopez, but the Spaniard–another player who has long stood on the fringes of the top ten–has never topped 68% of service points won for a full season.

If Anderson can sustain this new level of first-serve effectiveness, he will–at the very least–continue to see a bit more success in tiebreaks. A tiebreak winning percentage higher than his career average of 54% (though still probably below his 2015 rate of 62%) will help keep him in the top 15. However, even for the best servers, tiebreaks are often little more than coin flips, and players don’t join the game’s elite by relying on coin flips.

As his quarterfinal appearance at the Open shows, Anderson is moving in the right direction. It’s easy to see a path for him that involves ending the season in the top ten. But to move up to the level above that, following the path of someone like Wawrinka, he’ll need to start serving like peak Andy Roddick, or–perhaps just as difficult–significantly improve his return game.

Break Point Persistence: Why Venus is Better Than Her Ranking

Some points matter a lot more than others. A couple of clutch break point conversions or a well-played tiebreak make it possible to win a match despite winning fewer than half of the points. Even when such statistical anomalies don’t occur, one point won at the right time can erase the damage done by several other points lost.

Break points are among the most important points, and because tennis’s governing bodies track them, we can easily study them. I’ve previously looked at break point stats, with a special emphasis on Federer, here and here. Today we’ll focus on break points in the women’s game.

The first step is to put break points in context. Rather than simply looking at a percentage saved or converted, we need to compare those rates to a player’s serve or return points won in general. Serena Williams is always going to save a higher percentage of break points than Sara Errani does, but that has much more to do with her excellent service game than any special skills on break points.

Once we do that, we have two results for each player: How much better (or worse) she is when facing break point on serve, and how much better (or worse) she is with a break point on return.
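As a rough illustration of that adjustment, the Python sketch below compares a break point rate to the overall rate on the same side of the ball. It assumes the comparison is a ratio (break point rate divided by overall rate, minus one); a difference in percentage points would be another reasonable reading. The function name `relative_rate` and the sample counts are hypothetical.

```python
def relative_rate(bp_won, bp_total, pts_won, pts_total):
    """Break point win rate relative to the overall win rate on the same side.

    Works for serve (break points saved vs. service points won) and for
    return (break points converted vs. return points won).
    """
    bp_rate = bp_won / bp_total
    base_rate = pts_won / pts_total
    return bp_rate / base_rate - 1


# Hypothetical server: saves 60 of 100 break points, wins 580 of 1,000 service points.
print(f"{relative_rate(60, 100, 580, 1000):+.1%}")  # +3.4%
```

A positive result means the player does better on break points than on her other points; a negative one means she does worse.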

For instance, this year Serena has won 2.8% more service points than average when facing break point, and 7.5% more return points than average with a break point opportunity. The latter number is particularly good–not only compared to other players, but compared to Serena’s own record over the last ten years, when she’s converted break points exactly as often as she has won other return points.

Serena’s experience isn’t unusual. From one year to the next, these rates aren’t persistent, meaning that most players don’t consistently win or lose many more break points than expected. Since 2006, Maria Sharapova has converted 1% fewer break points than expected. Caroline Wozniacki has recorded exactly the same rate, while Victoria Azarenka has converted 2% fewer break points than expected.

On serve, the story is similar, with a slight twist. Inexperienced players seem to perform a little worse when trying to convert a break point against a more experienced opponent, so most top players save break points about 4% more often than they win other service points. Serena, Sharapova, Wozniacki, Azarenka, and Petra Kvitova all have career rates at about this level.

Unlike in the men’s game, there’s little evidence that left-handers have a special advantage saving break points on serve. Angelique Kerber is a few percentage points above average, but Kvitova, Lucie Safarova, and Ekaterina Makarova are all within one percentage point of neutral.

While a few marginal players are as much as ten percentage points away from neutral saving break points or converting them, the main takeaway here is that no one is building a great career on the back of consistent clutch performances on break points. Among women with at least 250 tour-level matches in the last decade, only Barbora Strycova has won more than 3% more break points (serve and return combined) than expected. Maria Kirilenko is the only player more than 3% below expected.

This analysis doesn’t tell us anything very interesting about the intrinsic skills of our favorite players, but that doesn’t mean it’s without value. If we can count on almost all players posting average numbers over the long term, we can identify short-term extremes and predict that certain players will return to normal.

And that (finally) brings us to Venus Williams. Since 2006, Venus has played break points a little bit worse than average, saving 2% more break points than typical serve points (compared to +4% for most stars) and winning break points on return 3% less often than other return points.

But this year, Venus has saved break points 17% less often than typical service points, the lowest single-season number from someone who played more than 20 tour-level matches. That’s roughly once per match this year that Venus has failed to save a break point that–in an average year–she would’ve saved.

There’s no guarantee that saving those additional break points would’ve changed many of Venus’s results this year, but given the usual strength of her service game, holding serve even a little bit more would make a difference.

This type of analysis can’t say whether a rough patch like Venus’s is due to bad luck, mental lapses, or something else entirely, but it does suggest very strongly that she will bounce back. In fact, she already has. In her successful US Open run, she’s won about 66% of service points while saving 63% of break points. That’s not nearly as good as Serena’s performance this year, but it’s much closer to her own career average.

Like so many tennis stats that fluctuate from match to match or year to year, this is another one that evens out in the end. A particularly good or bad number probably isn’t a sign of a long-term trend. Instead, it’s a signal that the short-term streak is unlikely to last.

Sabr Metrics: The Case For the Hyper-Aggressive Return

Roger Federer has made waves the last few weeks by occasionally moving way up the court to return second serves. While the old-school tactic was nearly extinct in today’s game of baseline attrition, it seems to be working for Fed.

At least in one sense, it’s too early to say whether the kamikaze return is an effective tactic. Federer has used it sparingly for only a handful of matches, and in that tiny sample, he’s missed plenty of returns. But in the view of many pundits, the hyper-aggressive return gets in his opponents’ heads, making the tactic more valuable than simply changing the result of a few points. Presumably Roger agrees, since he keeps using it.

I agree that the tactic is a good one, though for a different reason. By taking greater risks, Fed is generating more unpredictability, or streakiness, on his opponents’ service games, which is valuable even if he doesn’t win any more return points.

Watching and waiting

To win a match, a player usually needs to break serve, and in the contemporary men’s game, that’s not an easy thing to do. On average, servers win about 64% of points and hold about 80% of service games. On hard courts, the equivalent numbers are even higher. Against a good server–let alone John Isner, Fed’s opponent tonight–they are higher still.

Returners who stand well behind the baseline and try only to put the ball back in play are basically crossing their fingers and hoping for the best. Maybe their opponent will miss several first serves, or the server will make a couple of errors against those weak returns. It can work, and for a brilliant returner such as Novak Djokovic, hitting moderately aggressive returns and winning some of the ensuing rallies is usually good enough for several breaks per match.

For most players, however, breaks of serve rely more on the server’s occasional lapses. To put it in numerical terms: A passive returner is playing the lottery in every return game–a lottery with only a 10% to 20% chance of winning.

Generating the coin flip

The best way to earn more breaks of serve, of course, is to win more return points. But unless you’re spending the offseason at Djokovic’s training camp, that’s unlikely.

The alternative is to change the rules of the lottery. Instead of accepting a steady rate of 35% of return points, a hyper-aggressive strategy is more likely to make the point-by-point results more streaky, even if the overall rate doesn’t change.

To see why this is effective, we need to oversimplify a bit. A player who wins 35% of return points will, on average, break in 17% of his return games. If we introduce a slight variation in the rate of return points won, we see a slight improvement in break rate, as well. If that same player wins 30% of return points in half of his games and 40% of return points in the other half, he’ll break serve 18% of the time.

That one percent improvement is barely noticeable. It probably represents what’s already going on in most matches, often because servers are a bit streaky already. The more volatility we introduce, though, the more the odds tilt toward the returner.

Double the variation and say that the returner wins 25% of return points half the time and 45% the other half. Now he’ll break serve in 21% of games, or one extra break per 25 return games. Still not overwhelming, but that’s one extra break in a five-setter.

The real magic happens when we expand the variation to an even split between 20% of return points and 50% of return points. In that scenario–when, remember, our returner is still winning 35% of points–the break rate improves to 26%, almost one more break per ten return games. On average, that’s an extra break per best-of-three match, and closer to two extra breaks in a typical best-of-five match.
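The break rates in these scenarios can be reproduced with the standard formula for winning a game when every point is won with the same probability. The Python sketch below makes that simplifying assumption (points are independent, and the rate is fixed within each game); the function names `hold_prob` and `break_rate` are hypothetical.

```python
def hold_prob(p):
    """Chance the server holds, given a constant probability p of winning each point."""
    q = 1 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)  # hold at love, 15, or 30
    reach_deuce = 20 * p**3 * q**3  # chance the game reaches 40-40
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


def break_rate(return_rates):
    """Average break probability over games played at each return-point rate."""
    breaks = [1 - hold_prob(1 - r) for r in return_rates]
    return sum(breaks) / len(breaks)


for rates in ([0.35], [0.30, 0.40], [0.25, 0.45], [0.20, 0.50]):
    print(rates, f"{break_rate(rates):.0%}")
# [0.35] 17%
# [0.3, 0.4] 18%
# [0.25, 0.45] 21%
# [0.2, 0.5] 26%
```

The gain comes from the shape of the curve: at these rates, the break probability rises faster above 35% than it falls below it, so an even split around the same average produces more breaks.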

Back to reality

A hyper-aggressive return game is going to result in more return errors as well as more return winners. That’s true regardless of return position: Mikhail Kukushkin managed to break Marin Cilic four times on Friday by going for return winners, even if he stayed in the general area of the baseline.

So a new return tactic is unlikely to make a player much better in general. And of course, it’s unlikely to generate anything like the neat, theoretical examples shown above, when one game is better and one game is worse.

However, I suspect that higher-risk shots are more likely to be streaky, which would result in something like those neat examples. And if the pundits are right, that Fed’s kamikaze return unnerves his opponents, that ought to make his return games even streakier still, as his opponents deal with a new challenge mid-match.

Whenever there’s an opportunity to change the nature of the game and make it less predictable, the underdog should take it. Odd as it is to think of Federer as the underdog, he–like everyone else on the men’s tour–is in fact fighting an uphill battle in every return game. Hyper-aggressive tactics are a small step toward leveling the field.

A Closer Look at the Winner-Unforced Error Ratio

Italian translation at settesei.it

Few tennis statistics are more frequently cited than winners and unforced errors. Nearly every broadcast displays them, and the ratio between the two numbers is discussed during matches as much as any other metric in the game.

If we set aside the problems with unforced errors, the winner-unforced error (W/UFE) ratio does appear to have some value. Winners are unquestionably good, so more winners must be better than fewer winners. Errors are definitely bad, so fewer is better.

It’s one small step from those anodyne assumptions to the conventional wisdom that a player should aim to tally more winners than unforced errors, resulting in a ratio of 1.0 or more.

Like any metric, this one isn’t perfect. With the help of detailed stats from over 1,000 matches in Match Charting Project data, we can take a closer look.

Is the W/UFE ratio all it’s cracked up to be?

If you compare two players’ W/UFE ratios, you’ll find that the player with the better ratio almost always wins. No surprise there, since winners and unforced errors directly represent points won and lost.

It isn’t perfect, though. In both men’s and women’s matches, the player with the lower W/UFE ratio wins the match 11% of the time. Winners and unforced errors only represent about 70% of total points, so if the remaining 30% of points tilt heavily in one direction–especially in a close match–we’ll see an unexpected result.

Things get a little messier when we test the magic W/UFE ratio of 1.0. That’s the number commentators cite all the time, as if it is the line between winning and losing. W/UFE ratios differ quite a bit by gender, so we’ll need to look at men and women separately.

In the 512 men’s matches logged by the Match Charting Project, players recorded a ratio of 1.0 or better only 41.3% of the time. In over a quarter of those “successes,” though, they lost the match. That means we have plenty of false positives and false negatives: losers who beat the target ratio as well as winners who failed to meet it.

Players who met or exceeded a 1.0 ratio won 74% of men’s matches. But the range just above the target–from 1.0 to 1.1–only resulted in wins about 60% of the time.
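
To put rough numbers on those false positives and false negatives, here is a back-of-the-envelope sketch. It assumes the 41.3% figure counts each player's performance separately (two per match) and relies on the fact that exactly half of all performances end in a win. The variable names are illustrative only.

```python
met_share = 0.413        # performances with a W/UFE ratio of 1.0 or better
win_rate_if_met = 0.74   # share of those performances that ended in a win
winner_share = 0.5       # every match has one winner and one loser

# Shares of ALL performances.
winners_meeting = met_share * win_rate_if_met
losers_meeting = met_share * (1 - win_rate_if_met)
winners_missing = winner_share - winners_meeting

# False negatives: winners who fell short of 1.0.
share_of_winners_below = winners_missing / winner_share
# False positives: losers who reached 1.0 anyway.
share_of_losers_at_or_above = losers_meeting / (1 - winner_share)

print(f"Winners below 1.0: {share_of_winners_below:.0%}")
print(f"Losers at or above 1.0: {share_of_losers_at_or_above:.0%}")
```

If those assumptions hold, nearly four in ten winners fell short of the 1.0 target, and about one in five losers reached it.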

There’s no clear line separating a good ratio from a bad one: Even at 1.2 W/UFE, men only win about 70% of matches. As low as 0.8, they win nearly half.

Much of the problem here is that players influence each other’s numbers. Against a defensive baseliner, an average player will see his winners decrease and his unforced error count rise. In that hypothetical match, both players will have ratios below 1.0. Against an aggressive, big server, that same player will hit more winners, and because rallies end sooner, will tally fewer unforced errors. That scenario will often give you two ratios above 1.0.

A different story for women

In the sample of 552 women’s matches, players only recorded W/UFE ratios of 1.0 or better 26% of the time. Because the average ratio is so low–about 0.7–there aren’t very many false positives. Players who met the 1.0 standard won 89% of matches.

For women, a more reasonable target is in the 0.85 range. It’s roughly equivalent to 1.2 for men, in that a ratio at that level translates into about a 70% chance of winning.

There’s certainly no magic number. Even if we settle on revised targets like 0.85, winner and unforced error counts leave out too much data. In yesterday’s up-and-down match between Sara Errani and Jelena Ostapenko, Errani tallied 11 winners against 24 unforced. Ostapenko struck 54 winners against 49 unforced. A 0.46 ratio, like Errani’s, results in a win only 29% of the time, while a 1.1 ratio, like Ostapenko’s, is good for a victory 87% of the time. Yet, Errani is the one still standing.
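
For anyone checking the arithmetic, the two ratios come straight from the counts quoted above. This is only a small sketch, and `wufe_ratio` is a name invented for the example.

```python
def wufe_ratio(winners: int, unforced_errors: int) -> float:
    """Winners divided by unforced errors; infinite if no errors were made."""
    if unforced_errors == 0:
        return float("inf")
    return winners / unforced_errors


errani = wufe_ratio(11, 24)
ostapenko = wufe_ratio(54, 49)

print(f"Errani: {errani:.2f}")
print(f"Ostapenko: {ostapenko:.2f}")
```

The player with the far better ratio lost, which is exactly the kind of result the ratio alone cannot explain.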

Targeting the components

The Errani-Ostapenko match suggests another way of looking at the subject. Errani’s ratio was dreadful, but by keeping her unforced error rate low, she achieved at least half of the goal, leading to more Ostapenko errors. And while Ostapenko hit tons of winners, her own unforced error count was high enough to keep Errani in the match.

Looking at winners and unforced errors independently still doesn’t give us any magic numbers, but it does tell us more than the W/UFE ratio reveals by itself. Errani committed unforced errors on only 14% of points, which–taken by itself–results in a win about 70% of the time. Ostapenko’s error rate of 28% translates into success only 20% of the time.

By isolating the two components of the ratio, we can come up with clear targets for each. In women’s tennis, an error rate between about 14% and 16%–taken by itself–results in a 70% chance of winning. Consider winners independently, and we see that a winner rate of 19% to 20% also implies a 70% chance of victory.

These findings also cast a bit of light on another frequent question: Which is more important, increasing winners or decreasing errors? Based on this evidence, the answer is decreasing errors, but only by a whisker–and only in women’s matches. The player with more winners claims 68% of contests, while the player with fewer errors wins 73% of matches. A more sophisticated look, in which I separated all matches into buckets based on winner rate and error rate, suggests an even narrower margin. The relationship between error rate and winning percentage was very slightly stronger (r^2 = 0.92) than the relationship between winner rate and winning percentage (r^2 = 0.90).

Men’s components

For men, the 70% thresholds are different. Taken alone, a winner rate of about 22% will get you a 70% chance of winning. An unforced error percentage of 15% will achieve the same goal.

The relative importance of winners and unforced errors is different on the ATP tour, perhaps because aces–which are counted as winners–are such a large part of the game. Again, the difference is minor, but here, the relationship between winner rate and winning percentage is a bit stronger (r^2 = 0.94) than the relationship between error rate and winning percentage (r^2 = 0.92).

I’m almost done

Most men play plenty of matches in which they meet the W/UFE target of 1.0 and still lose. Most women fail to reach the 1.0 standard much of the time, and some players, like Errani, put together excellent careers despite almost never reaching it. We could do a lot better.

For a generic rule-of-thumb, the W/UFE target ratio of 1.0 isn’t horrible. But as we’ve seen, a slightly more nuanced view–one that takes into account the differences between men and women, as well as the independent value of winner rate and error rate–would be considerably more valuable.

The Myth of the Tricky First Meeting

Italian translation at settesei.it

Today, both Roger Federer and Stan Wawrinka will play opponents they’ve never faced before. In Federer’s case, the challenger is Steve Darcis, a 31-year-old serve-and-volleyer playing in his 22nd Grand Slam event. Wawrinka will face Hyeon Chung, a 19-year-old baseliner in only his second Slam draw.

For all those differences, both Federer and Wawrinka will need to contend with a new opponent–slightly different spins, angles, and playing styles than they’ve seen before. In the broadcast introduction to each match, we can expect to hear about this from the commentators. Something along the lines of, “No matter what the ranking, it’s never easy to play someone for the first time. He’s probably watched some video, but it’s different being out there on the court.”

All true, as even rec players can attest. But does it matter? After all, both players are facing a new opponent. While Darcis, for example, has surely watched a lot more video of Federer than Roger has of him, isn’t it just as different being out on the court facing Federer for the first time?

Attempting to apply common sense to the cliche will only get us so far. Let’s turn to the numbers.

Math is tricky; these matches aren’t

Usually, when we talk about “tricky first meetings,” we’re referring to these sorts of star-versus-newcomer or star-versus-journeyman battles. When two newcomers or two journeymen face off for the first time, it isn’t so notable. So, looking at data from the last fifteen years, I limited the view to matches between top-ten players and unseeded opponents.

This gives us a pretty hefty sample of nearly 7,000 matches. About 2,000 of those were first meetings. Even though the sample is limited to matches since 2000, I checked 1990s data–including Challengers–to ensure that these “first meetings” really were firsts.
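
The bookkeeping behind that split can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for this post: the records below are made-up toy data, already limited to top-ten-versus-unseeded matches and sorted chronologically, and `split_by_meeting` and `earlier_pairs` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical toy records, in chronological order:
# (top-ten player, unseeded opponent, did the top-ten player win?)
matches = [
    ("Star A", "Newcomer X", True),
    ("Star A", "Newcomer X", False),
    ("Star B", "Journeyman Y", True),
    ("Star A", "Journeyman Y", True),
    ("Star B", "Journeyman Y", True),
]


def split_by_meeting(matches, earlier_pairs=()):
    """Favorite's winning percentage in first meetings vs. rematches.

    earlier_pairs holds pairings known from before the sample starts,
    so that an old rivalry is not mistaken for a first meeting.
    """
    seen = {frozenset(pair) for pair in earlier_pairs}
    tallies = defaultdict(lambda: [0, 0])  # label -> [favorite wins, matches]
    for favorite, underdog, favorite_won in matches:
        pair = frozenset((favorite, underdog))
        label = "rematch" if pair in seen else "first"
        seen.add(pair)
        tallies[label][0] += int(favorite_won)
        tallies[label][1] += 1
    return {label: wins / total for label, (wins, total) in tallies.items()}


rates = split_by_meeting(matches)
print(rates)
```

Passing older pairings through `earlier_pairs` plays the same role as checking the 1990s results: without it, the first match in the sample between two long-time opponents would be miscounted as a first meeting.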

Let’s start with the basics. Top-tenners have won 86.4% of these first meetings. The details of who they’re facing don’t matter too much. Their record when the new opponent is a wild card is almost identical, as is the success rate when the new opponent came through qualifying.

The first-meeting winning percentage is influenced a bit by age. When a top-tenner faces a player under the age of 24 for the first time, he wins 84.6% of matches. Against 24-year-olds and up, the equivalent rate is 88.0%. That jibes with what we’d expect: a newcomer like Chung or Borna Coric is more likely to cause problems for a top player than someone like Darcis or Joao Souza, Novak Djokovic’s first-round victim.

The overall rate of 86.4% doesn’t do justice to guys like Federer. As a top-tenner, Roger has won 95% of his matches against first-time opponents, losing just 8 of 167 meetings. Djokovic, Rafael Nadal, and Andy Murray are all close behind, each within rounding distance of 93%.

By every comparison I could devise, the first-time meeting is the easiest type of match for top players.

The broadest (though approximate) control group consists of matches between top-tenners and unseeded players they have faced before. Favorites won 76.9% of those matches. Federer and Djokovic win 91% of those matches, while Nadal wins 89% and Murray 86%. In all of these comparisons, first-time meetings are more favorable to the high-ranked player.

A more tailored control group involves first-time meetings that had at least one rematch. In those cases, we can look at the winning percentage in the first match and the corresponding rate in the second match, having removed much of the bias from the larger sample.

Against opponents they would face again, top-tenners won their first meetings 85.1% of the time. In their second meeting, that success rate fell to 80.2%. It’s tough to say exactly why that rate went down–in part, it can be explained by underdogs improving their games, or learning something in the first match–but to make a weak version of the argument, it certainly doesn’t provide any evidence that first matches are the tough ones.

It may be true that first matches–no matter the quality of the opponent–feel tricky. It’s possible it takes more time to get used to first-time opponents, and that those underdogs are more likely to take a first set, or at least push it to a tiebreak. That’s a natural thing to think when such a match turns out closer than expected.

Whether or not any of that is true, the end result is the same. Top players appear to be generally immune to whatever trickiness first meetings hold, and they win such contests at a rate higher than any comparable set of matches.

Certainly, Fed fans have little to worry about. Most of his first-meeting losses were against players who would go on to have excellent careers: Mario Ancic, Guillermo Canas, Gilles Simon, Tomas Berdych, and Richard Gasquet.

His last loss facing a new opponent was his three-tiebreak heartbreaker to Nick Kyrgios in Madrid, only his third first-meeting defeat in a decade. As a rising star, Kyrgios fits the pattern of Fed’s previous first-meeting conquerors. Darcis, however, looks like yet another opponent that Federer will find distinctly not tricky.

Will the US Open First-Round Bloodbath Benefit Serena Williams?

After only two days of play, the US Open women’s draw is a shell of its former self.

Ten seeds have been eliminated, only the fifth time in the 32-seed era that the number of first-round upsets has reached double digits. Four of the top ten seeds were among the victims, marking the first time since 1994 that so many top-tenners failed to reach the second round of a Grand Slam.

Things are particularly dramatic in the top half of the draw, where Serena Williams can now reach the final without playing a single top-ten opponent. In a single day of play, my (conservative) forecast of her chances of winning the tournament rose from 42% to 47%, only a small fraction of which owed to her defeat of Vitalia Diatchenko.

However, plenty of obstacles remain. Serena could face Agnieszka Radwanska or Madison Keys in the fourth round, and then Belinda Bencic–the last player to beat her–in the quarters. A possible semifinal opponent is Elina Svitolina, a rising star who took a set from Serena at this year’s Australian Open.

The first-round carnage didn’t include most of the players who have demonstrated they can challenge the top seed. Five of the last six players to beat Serena–Bencic, Petra Kvitova, Simona Halep, Venus Williams, and Garbine Muguruza–are still alive. Only Alize Cornet, the 27th seed who holds an improbable .500 career record against Serena, is out of the picture.

What’s more, early-round bloodbaths haven’t, in the past, cleared the way for favorites. In the 59 majors since 2001, when the number of seeds increased to 32, the number of first-round upsets has had little to do with the likelihood that the top seed goes on to win the tournament.

In 18 of those 59 Slams, four or fewer seeds were upset in the first round. The top seed went on to win five times. In 22 of the 59, five or six seeds were upset in the first round, and the top seed won eight times.

In the remaining 19 Slams, in which seven or more seeds were upset in the first round, the top seed won only five times. Serena has “lost” four of those events, most recently last year’s Wimbledon, when nine seeds fell in their opening matches and Cornet defeated her in the third round.
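
Converting those three groups into rates makes the comparison easier to read. The counts are the ones given above; the dictionary layout is just one convenient way to hold them.

```python
# Seeds upset in the first round -> (number of Slams, titles won by the top seed)
buckets = {
    "4 or fewer": (18, 5),
    "5 or 6": (22, 8),
    "7 or more": (19, 5),
}

total_slams = sum(slams for slams, _ in buckets.values())
rates = {label: titles / slams for label, (slams, titles) in buckets.items()}

for label, rate in rates.items():
    slams, titles = buckets[label]
    print(f"{label} upsets: top seed won {titles} of {slams} ({rate:.0%})")
print(f"Slams covered: {total_slams}")
```

The top seed's title rate sits between roughly a quarter and a third in all three groups, and with so few events in each, the gaps between them are well within what chance alone could produce.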

This is necessarily a small sample, and even setting aside statistical qualms, it doesn’t tell the whole story. While Serena has failed to win four of these carnage-ridden majors, she has won three more of them when she wasn’t the top seed, including the 2012 US Open, when ten seeds lost in the first round and Williams went on to beat Victoria Azarenka in the final.

Taken together, the evidence is decidedly mixed. With the exception of Cornet, the ten defeated seeds aren’t the ones Serena would’ve chosen to remove from her path. While her odds have improved a bit on paper, the path through Keys, Bencic, Svitolina, and Halep or Kvitova in the final is as difficult as any she was likely to face.

The Unalarming Rate of Grand Slam Retirements

Italian translation at settesei.it

Yesterday, Vitalia Diatchenko proved to be even less of a match for Serena Williams than expected. She retired down 6-0, 2-0, winning only 5 of 37 points. She also sparked the usual array of questions about how Grand Slam prize money–$39,500 for first-round losers–incentivizes players to show up and collect a check even if they aren’t physically fit to play.

Diatchenko wasn’t the only player to exit yesterday without finishing a match. Of the 32 men’s matches, six ended in retirement. On the other hand, none of those were nearly as bad. All six injured men played at least two sets, and five of them won a set.

The prominence of Serena’s first-round match, combined with the sheer number of Monday retirements, is sure to keep pundits busy for a few days proposing rule changes. As we’ll see, however, there’s little evidence of a trend, and no need to change the rules.

Men’s slam retirements in context

Before yesterday’s bloodbath, there had been only five first-round retirements in the men’s halves of this year’s Grand Slams. The up-to-date total of 11 retirements is exactly equal to the annual average from 1997-2014 and the same as the number of first-round retirements in 1994.

The number of first-round Slam retirements has trended up slightly over the last 20 years. From 1995 to 2004, an average of ten men bowed out of their first-round matches each year. From 2005 to 2014, the average was 12.2–in large part thanks to the total of 19 first-round retirements last season.

That rise represents an increase in injuries and retirements in general, not a jump in unfit players showing up for Slams. From 1995 to 2004, an average of 8.5 players retired or withdrew from Slam matches after the first round, while in the following ten years, that number rose to 10.8.

Retirements at other tour-level events tell the same story. At non-Slams from 1995-2004, the retirement rate was about 1.3%, and in the following ten years, it rose to approximately 1.8%. (There isn’t much of a difference between first-round and later-round retirements at non-Slams.)
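
A quick way to compare these trends is to express each one as a relative increase from the first decade to the second. The figures are the ones quoted above; the non-Slam rates are given only approximately, so the last result in particular is a rough estimate.

```python
# (1995-2004 value, 2005-2014 value)
changes = {
    "first-round Slam retirements per year": (10.0, 12.2),
    "later-round Slam retirements and withdrawals per year": (8.5, 10.8),
    "non-Slam retirement rate, in percent": (1.3, 1.8),
}

increase = {
    label: (after - before) / before
    for label, (before, after) in changes.items()
}

for label, rise in increase.items():
    print(f"{label}: up {rise:.0%}")
```

On these numbers, first-round Slam retirements grew no faster than later-round exits or retirements at other events, which is what we would expect if the rise reflects injuries in general rather than unfit players turning up for a check.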

Injury rates in general have risen–exactly what we’d expect from a sport that has become increasingly physical. Based on recent results, we shouldn’t be surprised to see more retirements in best-of-five matches, as most of yesterday’s victims would’ve survived to the end of a best-of-three contest.

Women’s slam retirements

In most seasons, the rate of first-round retirements in women’s Grand Slam draws is barely half of the corresponding rate in other tour events.

In the last ten years, just over 1.2% of Slam entrants have quit their first-round match early. The equivalent rate in later Slam rounds is 1.1%, and the first-round rate at non-Slam tournaments is 2.26%. Diatchenko was the fifth woman to retire in a Slam first round this year, and if one more does so today, the total of six retirements will be exactly in line with the 1.2% average.
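
The link between a 1.2% rate and six retirements per year follows from the size of the draws. This sketch assumes the standard 128-player singles draw at each of the four Slams and treats the quoted rates as exact, which they are not.

```python
DRAW_SIZE = 128       # players in a Grand Slam singles draw
SLAMS_PER_YEAR = 4

slam_first_round_rate = 0.012        # "just over 1.2%" of Slam entrants
non_slam_first_round_rate = 0.0226   # first-round rate at other tour events

entrants_per_year = DRAW_SIZE * SLAMS_PER_YEAR
expected_at_slam_rate = entrants_per_year * slam_first_round_rate
expected_at_non_slam_rate = entrants_per_year * non_slam_first_round_rate

print(f"Entrants per year: {entrants_per_year}")
print(f"Expected first-round retirements at the Slam rate: {expected_at_slam_rate:.1f}")
print(f"Expected if Slams matched the non-Slam rate: {expected_at_non_slam_rate:.1f}")
```

In other words, if players quit Slam first rounds as often as they do at other tournaments, we would see close to a dozen women's retirements per year instead of about six.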

One painful anecdote isn’t a trend, and the spotlight of a high-profile match shouldn’t give any more weight to a single data point. Even with the giant checks on offer to first-round losers, players are not showing up unfit to play any more often than they do throughout the rest of the season.