Measuring the Clutchness of Everything

Matches are often won or lost by a player’s performance on “big points.” With a few clutch aces or un-clutch errors, it’s easy to gain a reputation as a mental giant or a choker.

Aside from the traditional break point stats, which have plenty of limitations, we don’t have a good way to measure clutch performance in tennis. There’s a lot more to this issue than counting break points won and lost, and it turns out that a lot of the work necessary to quantify clutchness is already done.

I’ve written many times about win probability in tennis. At any given point score, we can calculate the likelihood that each player will go on to win the match. Back in 2010, I borrowed a page from baseball analysts and introduced the concept of volatility, as well. (Click the link to see a visual representation of both metrics for an entire match.) Volatility, or leverage, measures the importance of each point–the difference in win probability between a player winning it and losing it.

To put it simply, the higher the leverage of a point, the more valuable it is to win. “High leverage point” is just a more technical way of saying “big point.” To be considered clutch, a player should win a higher percentage of high-leverage points than low-leverage points. You don’t have to win a disproportionate number of high-leverage points to be a very good player–Roger Federer’s break point record is proof of that–but high-leverage points are key to being a clutch player.

(I’m not the only person to think about these issues. Stephanie wrote about this topic in December and calculated a full-year clutch metric for the 2015 ATP season.)

To make this more concrete, I calculated win probability and leverage (LEV) for every point in the Wimbledon semifinal between Federer and Milos Raonic. For the first point of the match, LEV = 2.2%. Raonic could boost his match odds to 50.7% by winning it or drop to 48.5% by losing it. The highest leverage in the match was a whopping 32.8%, when Federer (twice) had game point at 1-2 in the fifth set. The lowest leverage of the match was a mere 0.03%, when Raonic served at 40-0, down a break in the third set. The average LEV in the match was 5.7%, a rather high figure befitting such a tight match.
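As a minimal sketch of the definition, leverage is just the gap between the two win probabilities a point can lead to. The function name `leverage` is my own label for illustration, and the inputs are the first-point figures quoted above. Computing the win probabilities themselves requires a full match model, which isn't shown here.

```python
def leverage(wp_if_won: float, wp_if_lost: float) -> float:
    """Swing in match win probability that rides on a single point."""
    return wp_if_won - wp_if_lost


# First point of the match from Raonic's side: 50.7% if he wins it,
# 48.5% if he loses it (figures quoted in the text).
first_point_lev = leverage(0.507, 0.485)
print(f"LEV = {first_point_lev:.1%}")
```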

On average, the 166 points that Raonic won were slightly more important, with LEV = 5.85%, than Federer’s 160, at LEV = 5.62%. Without doing a lot more work with match-level leverage figures, I don’t know whether that’s a terribly meaningful difference. What is clear, though, is that certain parts of Federer’s game fell apart when he needed them most.

By Wimbledon’s official count, Federer committed nine unforced errors, not counting his five double faults, which we’ll get to in a minute. (The Match Charting Project log says Fed had 15, but that’s a discussion for another day.) There were 180 points in the match where the return was put in play, with an average LEV = 6.0%. Federer’s unforced errors, by contrast, had an average LEV nearly twice as high, at 11.0%! The typical leverage of Raonic’s unforced errors was a much less noteworthy 6.8%.

Fed’s double fault timing was even worse. Those of us who watched the fourth set don’t need a fancy metric to tell us that, but I’ll do it anyway. His five double faults had an average LEV of 13.7%. Raonic double faulted more than twice as often, but the average LEV of those points, 4.0%, means that his 11 doubles had less of an impact on the outcome of the match than Roger’s five.
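One rough way to see why five double faults can outweigh eleven is to multiply each player's count by the average leverage of those points. This is only a sketch under a loud assumption: summed leverage overstates the win probability actually lost, since the cost of losing a point is smaller than the full swing between winning and losing it. The variable names are mine, and the figures are the ones quoted above.

```python
# (count, average LEV) for each player's double faults, from the text.
double_faults = {"Federer": (5, 0.137), "Raonic": (11, 0.040)}

# Summed leverage: a crude proxy (an assumption, not an official stat)
# for how much match-deciding weight rode on the points each player
# gave away with a double fault.
summed_lev = {name: n * lev for name, (n, lev) in double_faults.items()}

for name, total in summed_lev.items():
    print(f"{name}: {total:.3f}")
```

By this measure Federer's five double faults carry more total weight than Raonic's eleven, which is the comparison the averages imply.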

Even the famous Federer forehand looks like less of a weapon when we add leverage to the mix. Fed hit 26 forehand winners, in points with average LEV = 5.1%. Raonic’s 23 forehand winners occurred during points with average LEV = 7.0%.

Taking these three stats together, it seems like Federer saved his greatness for the points that didn’t matter as much.

The bigger picture

When we look at a handful of stats from a single match, we’re not improving much on a commentator who vaguely summarizes a performance by saying that a player didn’t win enough of the big points. While it’s nice to attach concrete numbers to these things, the numbers are only worth so much without more context.

In order to gain a more meaningful understanding of this (or any) performance with leverage stats, there are many, many more questions we should be able to answer. Were Federer’s high-leverage performances typical? Does Milos often double fault on less important points? Do higher-leverage points usually result in more returns in play? How much can leverage explain the outcome of very close matches?

These questions (and dozens, if not hundreds more) signal to me that this is a fruitful field for further study. The smaller-scale numbers, like the average leverage of points ending with unforced errors, seem to have particular potential. For instance, it may be that Federer is less likely to go for a big forehand on a high-leverage point.

Despite the dangers of small samples, these metrics allow us to pinpoint what, exactly, players did at more crucial moments. Unlike some of the more simplistic stats that tennis fans are forced to rely on, leverage numbers could help us understand the situational tendencies of every player on tour, leading to a better grasp of each match as it happens.

Break Point Conversions and the Close Matches Federer Isn’t Winning

The career head-to-head between Roger Federer and Novak Djokovic sits at 21-21, but the current era of this rivalry is hardly even. Since the beginning of 2011, Djokovic has won 15 of 23, including last night’s US Open final.

These matches tend to be close ones. In only 7 of the 23 matches has either player won more than 55% of points, and in more than half (12 of 23), neither player has won more than 53% of points, fitting my proposed definition of lottery matches.

In the 12 lottery matches between Fed and Novak since 2011, the player who won more points always won the match. Yet Djokovic wins far more (9 of 12) of these close matches. Last night was a perfect example: Federer won more return points than his opponent, and it was the third time since the 2012 Tour Finals that Novak beat Fed while winning 50.3% of points.

When a player wins 50.3% of points, he wins the match only 59% of the time. Even at 51.8%, Novak’s share of total points in three other matches against Federer, the player with more points wins only 91% of the time.

If many of the matches are close, and one player is winning so many of the matches, there must be more to the story.

Back to break points

Clearly, Novak is winning more big points than Roger is. Since Federer has won more than half of the tiebreaks between them, the next logical place to look is break points.

Federer’s perceived inability to convert break points has been a concern for years. Early last year, I wrote about his success rate on break points, and found that while he does, in fact, convert fewer break points than expected, it’s only a few percentage points. Further, it’s not a new problem: He was winning fewer break points than he should have been back when he was the unchallenged top player in the game.

Against Novak, though, it’s another story, and since they’ve faced each other so often, we can no longer write off a poor break-point performance as an outlier.

In these last 23 matches–including last night’s 4-for-23 on break points–Federer has converted 15% fewer break points than expected, twice as bad as his worst single-season mark. Djokovic, on the other hand, has converted break points at almost the same rate as other return points.

I’m often hesitant to use the c-words, but the evidence is piling up that in these particular clutch situations, Roger is choking. At the very least, we can eliminate a couple of alternative explanations, those based on break point opportunities and on performance in the ad court.

Let’s start with break point opportunities. 4-for-23 on break points is painful to look at, but there is a positive: You have to play very well to generate 23 break point chances against a top player. In fact, there’s a very clear, almost linear relationship between return points won and break point chances generated, and Federer beat expectations by 77% yesterday. Over 21 return games, a player who won 39% of return points, as Roger did, would be expected to create only 13 break point opportunities. A 4-for-13 mark would still be disappointing, but it wouldn’t induce nearly as many grimaces.
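The arithmetic behind those figures is quick to check. The sketch below takes the expected count of 13 as given from the text; the model relating return points won to break point chances is not reproduced here, and the variable names are my own.

```python
actual_chances = 23    # break points Federer earned in the final
expected_chances = 13  # expected at 39% of return points over 21 return games
conversions = 4

# How far Federer outperformed the expected number of chances.
excess = actual_chances / expected_chances - 1

# Conversion rate on the chances he had, and on the hypothetical 13.
actual_rate = conversions / actual_chances
hypothetical_rate = conversions / expected_chances

print(f"chances above expectation: {excess:.0%}")
print(f"4-for-23: {actual_rate:.0%}, 4-for-13: {hypothetical_rate:.0%}")
```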

In these 23 matches, Federer has generated exactly as many break point chances as expected. Djokovic has done the same. The story here is clearly about performance at 30-40 or 40-AD, not about anything earlier in the game. On non-break points yesterday, Fed returned more effectively.

The other explanation would be that Roger’s poor break point record has to do with the ad court. Against Rafael Nadal, that might be true: Much of the Spaniard’s effectiveness saving break points has to do with the way he skillfully uses left-handed serving in that court.

But in the Novak-Fed head-to-head, we can rule this out as well. According to Match Charting Project data, which includes more than 40 Djokovic matches and 90 Federer matches, neither player performs much better in one half of the court than in the other. Djokovic wins slightly more service points in the deuce court (65% to 64% in general, 66% to 64% on hard courts), and Federer wins return points at the same rate in both courts.

Pundits like to say that tennis is a game of matchups, and in this rivalry, both players defy their typical patterns. Over the course of his career, Novak has saved break points more effectively than average, but not nearly as well as he does against Federer. Federer, for his part, has turned in some of his best return performances against Djokovic … except for these dismal efforts converting break points, when he is far worse than his already-weak averages.

Perhaps the only solution for Roger is to find even more ways to improve his world-class service games. In the previous match against Novak, he converted only one of eight break point chances–the sort of stat that would easily explain a loss. That day in Cincinnati, though, Federer’s one break of serve was better than Djokovic’s zero.

Fed won 56.4% of total points in that match, his third-highest rate against Djokovic since 2011. If Novak is going to play better clutch tennis and win the close matches, that leaves Federer with an unenviable alternative. To win, he must decisively outplay the best player in the world.

Sabr Metrics: The Case For the Hyper-Aggressive Return

Roger Federer has made waves the last few weeks by occasionally moving way up the court to return second serves. While the old-school tactic was nearly extinct in today’s game of baseline attrition, it seems to be working for Fed.

At least in one sense, it’s too early to say whether the kamikaze return is an effective tactic. Federer has used it sparingly for only a handful of matches, and in that tiny sample, he’s missed plenty of returns. But in the view of many pundits, the hyper-aggressive return gets in his opponents’ heads, making the tactic more valuable than simply changing the result of a few points. Presumably Roger agrees, since he keeps using it.

I agree that the tactic is a good one, though for a different reason. By taking greater risks, Fed is generating more unpredictability, or streakiness, on his opponents’ service games, which is valuable even if he doesn’t win any more return points.

Watching and waiting

To win a match, a player usually needs to break serve, and in the contemporary men’s game, that’s not an easy thing to do. On average, servers win about 64% of points and hold about 80% of service games. On hard courts, the equivalent numbers are even higher. Against a good server–let alone John Isner, Fed’s opponent tonight–they are higher still.
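The link between the point-level and game-level numbers can be sketched with the standard formula for winning a game when every point is won independently with the same probability. That independence assumption is mine, not something measured from tour data, and `game_win_prob` is a name I'm introducing for illustration. At 64% of service points, this model gives a hold rate of roughly 81%, in the same neighborhood as the tour figure of about 80%.

```python
def game_win_prob(p: float) -> float:
    """Chance of winning a game given point-win probability p,
    assuming every point is independent with the same p."""
    q = 1.0 - p
    # Win to love, to 15, or to 30, before the game reaches deuce.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (three points each), then get two points clear
    # before the opponent does.
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


hold_rate = game_win_prob(0.64)
print(f"hold rate at 64% of service points: {hold_rate:.1%}")
```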

Returners who stand well behind the baseline and try only to put the ball back in play are basically crossing their fingers and hoping for the best. Maybe their opponent will miss several first serves, or the server will make a couple of errors against those weak returns. It can work, and for a brilliant returner such as Novak Djokovic, hitting moderately aggressive returns and winning some of the ensuing rallies is usually good enough for several breaks per match.

For most players, however, breaks of serve rely more on the server’s occasional lapses. To put it in numerical terms: A passive returner is playing the lottery in every return game–a lottery with only a 10% to 20% chance of winning.

Generating the coin flip

The best way to earn more breaks of serve, of course, is to win more return points. But unless you’re spending the offseason at Djokovic’s training camp, that’s unlikely.

The alternative is to change the rules of the lottery. Instead of accepting a steady rate of 35% of return points won, a returner can adopt a hyper-aggressive strategy, which is more likely to make the point-by-point results streakier, even if the overall rate doesn’t change.

To see why this is effective, we need to oversimplify a bit. A player who wins 35% of return points will, on average, break in 17% of his return games. If we introduce a slight variation in the rate of return points won, we see a slight improvement in break rate, as well. If that same player wins 30% of return points in half of his games and 40% of return points in the other half, he’ll break serve 18% of the time.

That one percent improvement is barely noticeable. It probably represents what’s already going on in most matches, often because servers are a bit streaky already. The more volatility we introduce, though, the more the odds tilt toward the returner.

Double the variation and say that the returner wins 25% of return points half the time and 45% the other half. Now he’ll break serve in 21% of games, or one extra break per 25 return games. Still not overwhelming, but that’s one extra break in a five-setter.

The real magic happens when we expand the variation to an even split between 20% of return points and 50% of return points. In that scenario–when, remember, our returner is still winning 35% of points–the break rate improves to 26%, almost one more break per ten return games. On average, that’s an extra break per best-of-three match, and closer to two extra breaks in a typical best-of-five match.
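The break rates in these scenarios can be reproduced with a short sketch. It assumes, as the simplified examples do, that every point within a game is won independently at a fixed rate, and that return games are split evenly between a low rate and a high rate. The function names are hypothetical labels of mine.

```python
def game_win_prob(p: float) -> float:
    """Chance of winning a game given point-win probability p,
    assuming every point is independent with the same p."""
    q = 1.0 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


def break_rate(low: float, high: float) -> float:
    """Break rate when half of return games are played at each rate."""
    return 0.5 * (game_win_prob(low) + game_win_prob(high))


# Every scenario averages 35% of return points won.
scenarios = [(0.35, 0.35), (0.30, 0.40), (0.25, 0.45), (0.20, 0.50)]
break_rates = [break_rate(low, high) for low, high in scenarios]

for (low, high), rate in zip(scenarios, break_rates):
    print(f"{low:.0%} / {high:.0%} split: break rate {rate:.0%}")
```

The gain comes entirely from the shape of the curve: in this range, the chance of winning a game rises faster than linearly with the chance of winning a point, so the good games gain more than the bad games lose.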

Back to reality

A hyper-aggressive return game is going to result in more return errors as well as more return winners. That’s true regardless of return position: Mikhail Kukushkin managed to break Marin Cilic four times on Friday by going for return winners, even if he stayed in the general area of the baseline.

So a new return tactic is unlikely to make a player much better in general. And of course, it’s unlikely to generate anything like the neat, theoretical examples shown above, when one game is better and one game is worse.

However, I suspect that higher-risk shots are more likely to be streaky, which would result in something like those neat examples. And if the pundits are right that Fed’s kamikaze return unnerves his opponents, that ought to make his return games streakier still, as his opponents deal with a new challenge mid-match.

Whenever there’s an opportunity to change the nature of the game and make it less predictable, the underdog should take it. Odd as it is to think of Federer as the underdog, he–like everyone else on the men’s tour–is in fact fighting an uphill battle in every return game. Hyper-aggressive tactics are a small step toward leveling the field.