How Much Is a Challenge Worth?

When the Hawkeye line-calling system is available, tennis players are given the right to make three incorrect challenges per set. As with any situation involving scarcity, there’s a choice to make: Take the chance of getting a call overturned, or make sure to keep your options open for later?

We’ve learned over the last several years that human line-calling is pretty darn good, so players don’t turn to Hawkeye that often. At the Australian Open this year, men challenged fewer than nine calls per match–well under three per set or, put another way, less than 1.5 challenges per player per set. Even at that low rate of fewer than once per thirty points, players are usually wrong. Only about one in three calls are overturned.

So while challenges are technically scarce, they aren’t that scarce. It’s a rare match in which a player challenges so often and is so frequently incorrect that he runs out. That said, it does happen, and while running out of challenges is low-probability, it’s very high risk. Getting a call overturned at a crucial moment could be the difference between winning and losing a tight match. Most of the time, challenges seem worthless, but in certain circumstances, they can be very valuable indeed.

Just how valuable? That’s what I hope to figure out. To do so, we’ll need to estimate the frequency with which players miss opportunities to overturn line calls because they’ve exhausted their challenges, and we’ll need to calculate the potential impact of failing to overturn those calls.

A few notes before we get any further. The extra challenge awarded to each player at the beginning of a tiebreak would make the analysis much more daunting, so I’ve ignored both that extra challenge and points played in tiebreaks. I suspect it has little effect on the results. I’ve limited this analysis to the ATP, since men challenge more frequently and get calls overturned more often. And finally, this is a very complex, sprawling subject, so we often have to make simplifying assumptions or plug in educated guesses where data isn’t available.

Running out of challenges

The Australian Open data mentioned above is typical for ATP challenges. It is very similar to a subset of Match Charting Project data, suggesting that both challenge frequency and accuracy are about the same across the tour as they are in Melbourne.

Let’s assume that each player challenges a call roughly once every sixty points, or 1.7%. Given an approximate success rate of 30%, each player makes an incorrect challenge on about 1.2% of points and a correct challenge on 0.5% of points. Later on, I’ll introduce a different set of assumptions so we can see what different parameters do to the results.
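As a quick check on that arithmetic, the per-point rates follow directly from the two assumptions. This is only a minimal sketch; the variable names are made up for the illustration.

```python
# Assumptions taken from the paragraph above.
challenge_rate = 1 / 60   # one challenge per sixty points, per player
success_rate = 0.30       # share of challenges that overturn the call

# Per-point probability of each kind of challenge for a single player.
p_incorrect = challenge_rate * (1 - success_rate)
p_correct = challenge_rate * success_rate

print(f"challenge on {challenge_rate:.1%} of points")
print(f"incorrect challenge on {p_incorrect:.1%} of points")
print(f"correct challenge on {p_correct:.1%} of points")
```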

Running out of challenges isn’t in itself a problem. We’re interested in scenarios when a player not only exhausts his challenges, but when he also misses an opportunity to overturn a call later in the set. These situations are much less common than all of those in which a player might want to contest a call, but we don’t care about the 70% of those challenges that would be wrong, as they wouldn’t have any effect on the outcome of the match.

For each possible set length, from 24-point golden sets up to 93-point marathons, I ran a Monte Carlo simulation, using the assumptions given above, to determine the probability that, in a set of that length, a player would miss a chance to overturn a later call. As noted above, I’ve excluded tiebreaks from this analysis, so I counted only the number of points up to 6-6. I also excluded all “advantage” fifth sets.

For example, the most common set length in the data set is 57 points, which occurred 647 times. In 10,000 simulations, a player missed a chance to overturn a call 0.27% of the time. The longer the set, the more likely that challenge scarcity would become an issue. In 10,000 simulations of 85-point sets, players ran out of challenges more than three times as often. In 0.92% of the simulations, a player was unable to challenge a call that would have been overturned.
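A simulation along these lines fits in a few lines of Python. Treat it as a simplified illustration rather than the exact code behind the numbers above: the function name and parameters are invented for the example, and it assumes that every point is identical, that only incorrect challenges use up the allotment of three, and that a "miss" is any would-be-correct challenge after the allotment is gone.

```python
import random

def miss_probability(n_points, p_incorrect=0.012, p_correct=0.005,
                     challenges=3, n_sims=10_000, seed=None):
    """Estimate how often a player exhausts his challenges and then
    faces a call he could have overturned later in the same set."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sims):
        remaining = challenges
        for _ in range(n_points):
            u = rng.random()
            if remaining > 0:
                # Only an incorrect challenge costs the player anything.
                if u < p_incorrect:
                    remaining -= 1
            elif u < p_correct:
                # Out of challenges on a call that would have been overturned.
                misses += 1
                break
    return misses / n_sims

for length in (57, 85):
    print(length, miss_probability(length, seed=1))
```

Because each run is random, the estimates will wobble from one seed to the next, especially with probabilities this small.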

These simulations are simple, assuming that each point is identical. Of course, players are aware of the cap on challenges, so with only one challenge remaining, they may be less likely to contest a “probably correct” call, and they would be very unlikely to use a challenge to earn a few extra seconds of rest. Further, the fact that players sometimes use Hawkeye for a bit of a break suggests that what we might call “true” challenges–instances in which the player believes the original call was wrong–are a bit less frequent than the numbers we’re using. Ultimately, we can’t address these concerns without a more complex model and quite a bit of data we don’t have.

Back to the results. Taking every possible set length and the results of the simulation for each one, we find the average player is likely to run out of challenges and miss a chance to overturn a call roughly once every 320 sets, or 0.31% of the time. That’s not very often–for almost all players, it’s less than once per season.

The impact of (not) overturning a call

Just because such an outcome is infrequent doesn’t necessarily mean it isn’t important. If a low-probability event has a high enough impact when it does occur, it’s still worth planning for.

Toward the end of a set, when most of these missed chances would occur, points can be very important, like break point at 5-6. But other points are almost meaningless, like 40-0 in just about any game.

To estimate the impact of these missed opportunities, I ran another set of Monte Carlo simulations. (This gets a bit hairy–bear with me.) For each set length, for those cases when a player ran out of challenges, I found the average number of points at which he used his last challenge. Then, for each run of the simulation, I took a random set from the last few years of ATP data with the corresponding number of points, chose a random point between the average time that the challenges ran out and the end of the set, and measured the importance of that point.

To quantify the importance of the point, I calculated three probabilities from the perspective of the player who lost the point and, had he conserved his challenges, could have overturned it:

  1. his odds of winning the set before that point was played
  2. his odds of winning the set after that point was played (and not overturned)
  3. his odds of winning the set had the call been overturned and the point awarded to him.

(To generate these probabilities, I used my win probability code posted here with the assumption that each player wins 65% of his service points. The model treats points as independent–that is, the outcome of one point does not depend on the outcomes of previous points–which is not precisely true, but it’s close, and it makes things immensely more straightforward. Alert readers will also note that I’ve ignored the possibility of yet another call that could be overturned. However, the extremely low probability of that event convinced me to avoid the additional complexity required to model it.)

Given these numbers, we can calculate the possible effects of the challenge he couldn’t make. The difference between (2) and (3) is the effect if the call would’ve been overturned and awarded to him. The difference between (1) and (2) is the effect if the point would have been replayed. This is essentially the same concept as “leverage index” in baseball analytics.

Again, we’re missing some data–I have no idea what percentage of overturned calls result in each of those two outcomes. For today, we’ll say it’s half and half, so to boil down the effect of the missed challenge to a single number, we’ll average those two differences.

For example, let’s say we’re at five games all, and the returner wins the first point of the 11th game. The server’s odds of winning the set have decreased from 50% (at 5-all, love-all) to 43.0%. If the server got the call overturned and was awarded the point, his odds would increase to 53.8%. Thus, the win probability impact of overturning the call and taking the point is 10.8%, while the effect of forcing a replay is 7.0%. For the purposes of this simulation, we’re averaging these two numbers and using 8.9% as the win probability impact of this missed opportunity to challenge.
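The numbers in this example can be reproduced with a small stand-in for the win probability model. This sketch is not the linked code: it only handles the 5-all case, it assumes both players win 65% of service points, and it treats a tiebreak between equal players as a coin flip. The function names are made up for the illustration.

```python
def game_win_prob(server_pts, returner_pts, p=0.65):
    """Probability the server wins the game from the given point score,
    with points counted 0, 1, 2, 3 (love, 15, 30, 40)."""
    if server_pts >= 4 and server_pts - returner_pts >= 2:
        return 1.0
    if returner_pts >= 4 and returner_pts - server_pts >= 2:
        return 0.0
    if server_pts >= 3 and server_pts == returner_pts:
        # Deuce: closed form for winning two points in a row first.
        return p * p / (p * p + (1 - p) * (1 - p))
    return (p * game_win_prob(server_pts + 1, returner_pts, p)
            + (1 - p) * game_win_prob(server_pts, returner_pts + 1, p))

def set_win_prob_at_5_all(game_prob, hold_prob):
    """Server's chance of winning the set at 5-5, given his chance of
    winning the current game. Assumes a 50-50 tiebreak at 6-6."""
    # After a hold: break for 7-5, or get held and play the tiebreak.
    win_if_hold = (1 - hold_prob) + hold_prob * 0.5
    # After being broken: must break back just to reach the tiebreak.
    win_if_broken = (1 - hold_prob) * 0.5
    return game_prob * win_if_hold + (1 - game_prob) * win_if_broken

hold = game_win_prob(0, 0)
before = set_win_prob_at_5_all(hold, hold)                     # 5-5, love-all
after = set_win_prob_at_5_all(game_win_prob(0, 1), hold)       # 5-5, 0-15
overturned = set_win_prob_at_5_all(game_win_prob(1, 0), hold)  # 5-5, 15-0

award_effect = overturned - after   # call overturned, point awarded
replay_effect = before - after      # call overturned, point replayed
impact = (award_effect + replay_effect) / 2

print(f"before {before:.1%}, after {after:.1%}, overturned {overturned:.1%}")
print(f"average impact {impact:.1%}")
```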

Back to the big picture. For each set length, I ran 1,000 simulations like what I’ve described above and averaged the results. In short sets under 40 points, the win probability impact of the missed challenge is less than five percentage points. The longer the set, the bigger the effect: Long sets are typically closer and the points tend to be higher-leverage. In 85-point sets, for instance, the average effect of the missed challenge is a whopping 20 percentage points–meaning that if a player more skillfully conserved his challenges in five such sets, he’d be able to reverse the outcome of one of them.

On average, the win probability effect of the missed challenge is 12.4 percentage points. In other words, better challenge management would win a player one more set for every eight times he didn’t lose such an opportunity by squandering his challenges.

The (small) big picture

Let’s put together the two findings. Based on our assumptions, players run out of challenges and forgo a chance to overturn a later call about once every 320 sets. We now know that the cost of such a mistake is, on average, a 12.4 percentage point win probability hit.

Thus, challenge management costs an average player one set out of every 2600. Given that many matches are played on clay or on courts without Hawkeye, that’s maybe once in a career. As long as the assumptions I’ve used are in the right ballpark, the effect isn’t even worth talking about. The mental cost of a player thinking more carefully before challenging might be greater than this exceedingly unlikely benefit.
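The combination is a single division. A minimal sketch, with variable names invented for the illustration:

```python
sets_per_miss = 320   # one missed chance to overturn a call per 320 sets
avg_impact = 0.124    # average win probability swing per missed chance

# Expected number of sets played for each set lost to poor challenge management.
sets_per_lost_set = sets_per_miss / avg_impact
print(round(sets_per_lost_set))
```

That comes out just under the rounded figure of 2600 used above.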

What if some of the assumptions are wrong? Anecdotally, it seems like challenges cluster in certain matches, because of poor officiating, bad lighting, extreme spin, precise hitting, or some combination of these. It seems possible that certain scenarios would arise in which a player would want to challenge much more frequently, and even though he might gain some accuracy, he would still increase the risk.

I ran the same algorithms for what seems to me to be an extreme case, almost doubling the frequency with which each player challenges, to 3.0%, and somewhat increasing the accuracy rate, to 40%.

With these parameters, a player would run out of challenges and miss an opportunity to overturn a call about six times more often–once every 54 sets, or 1.8% of the time. The impact of each of these missed opportunities doesn’t change, so the overall result also increases by a factor of six. In this extreme case, poor challenge management would cost a player the set 0.28% of the time, or once every 356 sets. That’s a less outrageous number, representing perhaps one set every second year, but it also applies to unusual sets of circumstances which are very unlikely to follow a player to every match.

It seems clear that three challenges is enough. Even in long sets, players usually don’t run out, and when they do, it’s rare that they miss an opportunity that a fourth challenge would have afforded them. The effect of a missed chance can be enormous, but such chances are so infrequent that players would see little or no benefit from tactically conserving challenges.

Winners, Errors, and Misinformation

Of the general ways in which points end–winners, unforced errors, and forced errors–which is the most common? It’s so basic a question that I’d never thought to investigate it. As it turns out, other people have, and they’re making tenuous claims based on their results.

A friend sent me a link to this advertisement for an instructional course, which–eventually, far into a painfully slow video–explains that more points on the pro tour end in forced errors than in winners or unforced errors. And because of this, the video argues, you can use some of the same patterns the pros use with the goal of generating forced errors. Apparently, aiming for winners is too risky, as is waiting for unforced errors.

Pedagogically, it seems reasonable enough to encourage patience and tactical conservatism. I don’t know the first thing about helping amateurs improve their tennis game, and I’ll happily defer to the experts.

However, the use of pro tennis data sparked my interest. I was immediately skeptical of these claims, which were apparently based on Grand Slam matches from 2012.

Using my datasets extracted from IBM Pointstream’s records of the last several slams, I tested the 2015 French Open and the 2015 US Open, tallying winners, unforced errors, and forced errors for men and women at both events. Here’s how they break down:

Dataset    Winners  Unforced  Forced  
FO Men       33.8%     32.9%   33.3%  
FO Women     32.7%     37.8%   29.5%  
                                      
USO Men      34.3%     31.6%   34.1%  
USO Women    31.0%     38.0%   30.9%

On both surfaces, men’s points split fairly evenly among the three categories. For women, winners are roughly even with forced errors (though there are more winners on clay) and unforced errors are the most common type of point-ending shot.

The Pointstream-based dataset has a limitation, though, and you might have already guessed what it is. A sizable percentage of forced errors are serve returns, which don’t really seem pertinent to a discussion of tactics. We can separate aces from winners and double faults from unforced errors, but not forced error returns from forced errors.

For that, we need the resources of the Match Charting Project. That data gives us almost 1500 matches (evenly split between men and women) once we limit our view to tour-level contests. The MCP dataset contains everything Pointstream does–winners, unforced and forced errors–and much, much more. For our purposes, the key addition is rally length, which allows us to differentiate between forced error returns and forced errors that came later in rallies.

With the MCP data, we can remove serve statistics from this discussion altogether, excluding aces, double faults, and forced error returns, none of which are tactics in the sense we usually use the word.
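The filtering step might look something like the sketch below. The records and field names are hypothetical, not the Match Charting Project's actual format, and the `shots` field assumes a convention in which every shot struck is counted, including the serve and the error itself, so a forced error on the return has two shots.

```python
from collections import Counter

# Hypothetical point records (made-up data for illustration only).
points = [
    {"ender": "ace", "shots": 1},
    {"ender": "forced_error", "shots": 2},    # forced error on the return
    {"ender": "winner", "shots": 5},
    {"ender": "unforced_error", "shots": 8},
    {"ender": "double_fault", "shots": 0},
    {"ender": "forced_error", "shots": 6},
    {"ender": "unforced_error", "shots": 3},
]

def is_serve_outcome(point):
    """True for aces, double faults, and forced errors on the return."""
    if point["ender"] in ("ace", "double_fault"):
        return True
    return point["ender"] == "forced_error" and point["shots"] <= 2

# Keep only points decided after the return was (or should have been) in play.
rally_points = [p for p in points if not is_serve_outcome(p)]
counts = Counter(p["ender"] for p in rally_points)
shares = {ender: n / len(rally_points) for ender, n in counts.items()}

for ender, share in shares.items():
    print(f"{ender:15s} {share:.1%}")
```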

Here’s the frequency of each type of point-ender:

Dataset  Winners  Unforced  Forced  
Men        32.5%     45.8%   21.7%  
Women      32.4%     49.4%   18.2%

When serves are no longer cluttering the picture, winners retain their relative importance, but the distribution of errors changes enormously. Now, we see that once the returner gets the ball back in play (or receives a serve he or she should be able to put back in play), unforced errors outnumber forced errors by more than two to one.

(I also calculated clay-specific numbers, and all the rates were within one percentage point of the overall averages.)

Forced errors are the most common type of point-ender in only 14 of 728 charted men’s matches and 4 of 751 charted women’s matches. Even if you’re concerned about the representativeness of the MCP sample or the error-labeling tendencies of the charters and make substantial adjustments to allow for them, these results overwhelmingly establish that unforced errors are the most common way in which rallies end.

I’m not sure how applicable the tactics and tendencies of pro players are to amateur coaching, so it’s possible that these numbers are irrelevant to a great deal of coaching pedagogy. But if you’re going to base your instructional technique on pro tennis stats, it seems reasonable to start by getting the numbers right.

The Match Charting Project is making it possible to answer questions about tennis that were previously unanswerable. Project data is open to all researchers. Please help us grow the project by watching tennis and charting matches!

The Difficulty (and Importance) of Finding the Backhand

One disadvantage of some one-handed backhands is that they tend to sit up a little more when they’re hit crosscourt. That gives an opponent more time to prepare and, often, enough time to run around a crosscourt shot and hit a forehand, which opens up more tactical possibilities.

With the 700 men’s matches in the Match Charting Project database (please contribute!), we can start to quantify this disadvantage–if indeed it has a negative effect on one-handers. Once we’ve determined whether one-handers can find their opponents’ backhands, we can try to answer the more important question of how much it matters.

The scenario

Let’s take all baseline rallies between right-handers. Your opponent hits a shot to your backhand side, and you have three choices: drive (flat or topspin) backhand, slice backhand, or run around to hit a forehand. You’ll occasionally go for a winner down the line and you’ll sometimes be forced to hit a weak reply down the middle, but usually, your goal is to return the shot crosscourt, ideally finding your opponent’s backhand.

Considering all righty-righty matchups including at least one player among the last week’s ATP top 72 (I wanted to include Nicolas Almagro), here are the frequency and results of each of those choices:

SHOT    FREQ  FH REP  BH REP    UFE  WINNER  PT WON  
ALL             9.9%   68.1%  10.8%    5.8%   43.1%  
SLICE  11.9%   34.1%   49.5%   7.1%    0.6%   40.2%  
FH     44.9%    2.8%   69.0%  13.0%    9.8%   42.1%  
BH     43.3%   10.7%   72.2%   9.5%    3.1%   45.0%  
                                                     
1HBH   42.6%   12.0%   69.5%   9.3%    3.8%   44.2%  
2HBH   43.5%   10.0%   73.4%   9.6%    2.8%   45.4%

“FH REP” and “BH REP” refer to a forehand or backhand reply, and we can see just how much shot selection matters in keeping the ball away from your opponent’s forehand. A slice does a very poor job, while an inside-out forehand almost guarantees a backhand reply, though it comes with an increased risk of error.

The differences between one- and two-handed backhands aren’t as stark. One-handers don’t find the backhand quite as frequently, though they hit a few more winners. They hit drive backhands a bit less often, but that doesn’t necessarily mean they are hitting forehands instead. On average, two-handers hit a few more forehands from the backhand corner, while one-handers are forced to hit more slices.

One hand, many types

Not all one-handed backhands are created equal, and these numbers bear that out. Stanislas Wawrinka’s backhand is as effective as the best two-handers, while Roger Federer’s is typically the jumping-off point for discussions of why the one-hander is dying.

Here are the 28 players for whom we have at least 500 instances (excluding service returns) when the player responded to a shot hit to his backhand corner. For each, I’ve shown how often he chose a drive backhand or forehand, and the frequency with which he found the backhand–excluding his own errors and winners.

Player                 BH  BH FRQ  FIND BH%  FH FRQ  FIND BH%  
Alexandr Dolgopolov     2   45.7%     94.2%   43.3%     98.7%  
Kei Nishikori           2   51.1%     94.0%   38.9%     98.1%  
Andy Murray             2   41.0%     92.4%   46.5%     98.6%  
Stanislas Wawrinka      1   48.6%     92.1%   37.5%     98.0%  
Bernard Tomic           2   33.8%     91.7%   43.8%     97.9%  
Novak Djokovic          2   47.2%     91.7%   41.4%     98.5%  
Kevin Anderson          2   41.0%     91.5%   45.8%     96.6%  
Borna Coric             2   46.5%     90.7%   44.2%     96.9%  
Pablo Cuevas            1   41.9%     90.6%   54.5%     96.5%  
Marin Cilic             2   45.4%     89.7%   43.3%     97.2%  
                                                               
Player                 BH  BH FRQ  FIND BH%  FH FRQ  FIND BH%  
Tomas Berdych           2   41.6%     89.3%   44.2%     97.5%  
Pablo Carreno Busta     2   55.4%     87.8%   41.1%     93.5%  
Fabio Fognini           2   46.0%     87.4%   47.0%     96.1%  
Richard Gasquet         1   57.2%     87.3%   32.1%     96.8%  
Andreas Seppi           2   40.3%     87.2%   50.0%     93.9%  
Nicolas Almagro         1   53.6%     86.5%   39.3%     98.0%  
Dominic Thiem           1   38.5%     86.2%   50.0%     96.5%  
Gael Monfils            2   48.0%     85.3%   46.3%     85.3%  
David Ferrer            2   48.2%     84.9%   40.4%     97.1%  
Roger Federer           1   42.7%     84.8%   43.6%     94.5%  
                                                               
Player                 BH  BH FRQ  FIND BH%  FH FRQ  FIND BH%  
Gilles Simon            2   46.9%     84.6%   46.5%     94.6%  
David Goffin            2   45.4%     84.6%   45.7%     94.9%  
Roberto Bautista Agut   2   39.6%     83.3%   46.7%     98.4%  
Jo Wilfried Tsonga      2   43.5%     82.0%   44.5%     96.3%  
Grigor Dimitrov         1   41.4%     78.6%   39.4%     92.8%  
Milos Raonic            2   31.5%     63.5%   56.5%     94.3%  
Jack Sock               2   27.0%     62.5%   62.9%     96.3%  
Tommy Robredo           1   26.6%     56.1%   62.3%     88.4%

One-handers Wawrinka, Pablo Cuevas, and Richard Gasquet (barely) are among the top half of these players, in terms of finding the backhand with their own backhand. Federer and his would-be clone Grigor Dimitrov are at the other end of the spectrum.

Taking all 60 righties I included in this analysis (not just those shown above), there is a mild negative correlation (r = -0.16) between a player’s likelihood of finding the opponent’s backhand with his own and the rate at which he chooses to hit a forehand from that corner. In other words, the worse he is at finding the backhand, the more inside-out forehands he hits. Tommy Robredo and Jack Sock are the one- and two-handed poster boys for this, struggling more than any other players to find the backhand, and compensating by hitting as many forehands as possible.
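For readers who want to see the mechanics, here is a minimal sketch of a Pearson correlation on eight players from the table above. It will not reproduce the figure for all 60 righties: this subset deliberately mixes the best and worst at finding the backhand, so the relationship will look much stronger than it does across the full group.

```python
from math import sqrt

# (find-backhand % with the drive backhand, forehand frequency %),
# copied from the table above for a hand-picked subset of players.
players = {
    "Dolgopolov": (94.2, 43.3),
    "Nishikori": (94.0, 38.9),
    "Wawrinka": (92.1, 37.5),
    "Federer": (84.8, 43.6),
    "Dimitrov": (78.6, 39.4),
    "Raonic": (63.5, 56.5),
    "Sock": (62.5, 62.9),
    "Robredo": (56.1, 62.3),
}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

find_bh, fh_freq = zip(*players.values())
r = pearson_r(find_bh, fh_freq)
print(f"r = {r:.2f}")
```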

However, Federer–and, to an even greater extent, Dimitrov–don’t fit this mold. The average one-hander runs around balls in their backhand corner 44.6% of the time, while Fed is one percentage point under that and Dimitrov is below 40%. Federer is perceived to be particularly aggressive with his inside-out (and inside-in) forehands, but that may be because he chooses his moments wisely.

Ultimate outcomes

Let’s look at this from one more angle. In the end, what matters is whether you win the point, no matter how you get there. For each of the 28 players listed above, I calculated the rate at which they won points for each shot selection. For instance, when Novak Djokovic hits a drive backhand from his backhand corner, he wins the point 45.4% of the time, compared to 42.3% when he hits a slice and 42.4% when he hits a forehand.

Against his own average, Djokovic is about 3.6% better when he chooses (or to think of it another way, is able to choose) a drive backhand. For all of these players, here’s how each of the three shot choices compares to their average outcome:

Player                 BH   BH W   SL W   FH W  
Dominic Thiem           1  1.209  0.633  0.924  
David Goffin            2  1.111  0.656  0.956  
Grigor Dimitrov         1  1.104  0.730  1.022  
Gilles Simon            2  1.097  0.922  0.913  
Tomas Berdych           2  1.085  0.884  0.957  
Pablo Carreno Busta     2  1.081  0.982  0.892  
Kei Nishikori           2  1.070  0.777  0.965  
Roberto Bautista Agut   2  1.055  0.747  1.027  
Stanislas Wawrinka      1  1.050  0.995  0.936  
Borna Coric             2  1.049  1.033  0.941  
                                                
Player                 BH   BH W   SL W   FH W  
Bernard Tomic           2  1.049  1.037  0.943  
Jack Sock               2  1.049  0.811  1.010  
Gael Monfils            2  1.048  1.100  0.938  
Fabio Fognini           2  1.048  0.775  0.987  
Milos Raonic            2  1.048  0.996  0.974  
Nicolas Almagro         1  1.046  0.848  0.964  
Kevin Anderson          2  1.038  1.056  0.950  
Novak Djokovic          2  1.036  0.966  0.969  
Andy Murray             2  1.031  1.039  0.962  
Roger Federer           1  1.023  1.005  0.976  
                                                
Player                 BH   BH W   SL W   FH W  
Richard Gasquet         1  1.020  0.795  1.033  
Andreas Seppi           2  1.019  0.883  1.008  
David Ferrer            2  1.018  0.853  1.020  
Alexandr Dolgopolov     2  1.010  1.010  0.987  
Marin Cilic             2  1.006  1.009  0.991  
Pablo Cuevas            1  0.987  0.425  1.048  
Jo Wilfried Tsonga      2  0.956  0.805  1.095  
Tommy Robredo           1  0.845  0.930  1.079
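The ratios in the table above can be approximated for one player from numbers already given. This sketch makes two assumptions that the text does not spell out: that a player's "average" is the frequency-weighted mean of his three win rates, and that any shot from the backhand corner that is not a drive backhand or a forehand is a slice.

```python
# Djokovic's figures from the text and the earlier frequency table.
win_rate = {"drive_bh": 45.4, "slice": 42.3, "forehand": 42.4}  # % of points won
frequency = {"drive_bh": 47.2, "forehand": 41.4}                # % of shots

# Assumption: the remainder of his shots from that corner are slices.
frequency["slice"] = 100 - frequency["drive_bh"] - frequency["forehand"]

# Frequency-weighted average win rate from the backhand corner.
average = sum(win_rate[shot] * frequency[shot] for shot in win_rate) / 100

# Each shot's win rate relative to that average.
ratios = {shot: win_rate[shot] / average for shot in win_rate}
for shot, ratio in ratios.items():
    print(f"{shot:9s} {ratio:.3f}")
```

With these rounded inputs the three ratios land within a couple of thousandths of Djokovic's row in the table.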

In this view, Dimitrov–along with his fellow one-handed flame carrier Dominic Thiem–looks a lot better. His crosscourt backhand doesn’t find many backhands, but it is by far his most effective shot from his own backhand corner. We would expect him to win more points with a drive backhand than with a slice (since he probably opts for slices in more defensive positions), but it’s surprising to me that his backhand is so much better than the inside-out forehand.

While Dimitrov and Thiem are more extreme than most, almost all of these players have better results with crosscourt drive backhands than with inside-out (or inside-in) forehands. Only five–including Robredo but, shockingly, not including Sock–win more points after hitting forehands from the backhand corner.

It’s clear that one-handers do, in fact, have a slightly more difficult time forcing their opponents to hit backhands. It’s much less clear how much it matters. Even Federer, with his famously dodgy backhand and even more famously dominant inside-out forehand, is slightly better off hitting a backhand from his backhand corner. We’ll never know what would happen if Fed had Djokovic’s backhand instead, but even though Federer’s one-hander isn’t finding as many backhands as Novak’s two-hander does, it’s getting the job done at a surprisingly high rate.

Are Two First Serves Ever Better Than One?

It’s one of those ideas that never really goes away. Some players have such strong first serves that we often wonder what would happen if they hit only first serves. That is, if a player went all-out on every serve, would his results be any better?

Last year, Carl Bialik answered that question: It’s a reasonably straightforward “no.”

Bialik showed that among ATP tour regulars in 2014, only Ivo Karlovic would benefit from what I’ll call the “double-first” strategy, and his gains would be minimal. When I ran the numbers for 2015–assuming for all players that their rates of making first serves and winning first-serve points would stay the same–I found that Karlovic only breaks even. Going back to 2010, 2014 Ivo was the only player-season with at least 40 matches for whom two first serves would be better than one.

Still, it’s not an open-and-shut case. What struck me is that the disadvantage of a double-first strategy would be so minimal. For Karlovic (and others, mainly big servers, such as Jerzy Janowicz, Milos Raonic, and John Isner), hitting two first serves would only slightly decrease their overall rate of service points won. For Rafael Nadal and Andy Murray, opting for double-first would reduce their rate of service points won by just under two percentage points.

Here’s a visual look at 2015 tour regulars (minimum 30 matches), showing the hypothetical disadvantage of two first serves. The diagonal line is the breakeven level; Ivo, Janowicz, and Isner are the three points nearly on the line.

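[Figure: hypothetical disadvantage of hitting two first serves for 2015 tour regulars, with a diagonal breakeven line]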

Since some players are so close to breaking even, I started to wonder if some matchups make the double-first strategy a winning proposition. For example, Novak Djokovic is so dominant against second serves that, perhaps, opponents would be better off letting him see only first serves.

However, it remains a good idea–at least in general–to take the traditional approach against Djokovic. Hypothetically, two first serves would result in Novak raising his rate of return points won by 1.2 percentage points. Gilles Simon and Andy Murray are in similar territory, right around 1 percentage point.

Here’s the same plot, showing the disadvantage of double-first against tour-regular returners this season:

[Figure: the same plot for tour-regular returners this season]

There just aren’t any returners who would cause the strategy to come as close to breaking even as some big servers do.

The match-level tactic

What happens if a nearly-breakeven server, like Karlovic, faces a not-far-from-breakeven returner, like Djokovic? If opting for double-first is almost a good idea for Ivo against the average returner, what happens when he faces someone particularly skilled at attacking second serves?

Sure enough, there are lots of matches in which two first serves would have been better than one. I found about 1300 matches between tour regulars (players with 30+ matches) this season, and for each one, I calculated each player’s actual service points won along with their estimated points won had they hit two first serves. About one-quarter of the time, double-first would have been an improvement.

This finding holds up in longer matches, too, avoiding some of the danger of tiny samples in short matches. In one-quarter of longer-than-average matches, a player would have still benefited from the double-first strategy. Here’s a look at how those matches are distributed:

[Figure: distribution of double-first results in longer-than-average matches]

Finally, some action on the left side of the line! One of those outliers in the far upper right of the graph is, in fact, Ivo’s upset of Djokovic in Doha this year. Karlovic won 85% of first-serve points but only 50% of second-serve points. Had he hit only first serves, he would’ve won about 79% of his service points instead of the 75% that he recorded that day.
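The double-first calculation itself is simple enough to sketch. The functions below are illustrative, not the code used for the analysis, and they rely on the same simplifying assumption: a first serve goes in and wins the point at the same rates whether it is the first or second attempt. The first-serve-in rate for the Doha match is not quoted above, so the sketch back-solves it from the rounded percentages, which means the result is only approximate.

```python
def actual_rate(first_in, first_won, second_won):
    """Service points won with a normal first and second serve."""
    return first_in * first_won + (1 - first_in) * second_won

def double_first_rate(first_in, first_won):
    """Service points won if the second serve is another first serve.
    A miss on both attempts is a double fault."""
    return first_in * first_won + (1 - first_in) * first_in * first_won

# Karlovic in Doha, rounded figures from the text.
first_won, second_won, overall = 0.85, 0.50, 0.75

# Implied first-serve-in rate (an inference from rounded inputs).
first_in = (overall - second_won) / (first_won - second_won)

two_firsts = double_first_rate(first_in, first_won)
print(f"implied first serves in: {first_in:.1%}")
print(f"actual: {actual_rate(first_in, first_won, second_won):.1%}")
print(f"double-first: {two_firsts:.1%}")
```

With these rounded inputs the double-first estimate comes out around 78%, slightly below the figure quoted above, which was presumably computed from unrounded match stats.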

Another standout example is Karlovic’s match against Simon in Cincinnati. Ivo won 81% of first-serve points and only 39% of second-serve points. He won the match anyway, but if he had pursued a double-first strategy, Simon could’ve caught an earlier flight home.

Predicting double-first opportunities

Armed with all this data, we would still have a very difficult time identifying opportunities for players to take advantage of the strategy.

For each player in every match, I multiplied his “double-first disadvantage” (the number of percentage points of serve points won he would lose by hitting two first serves) with the returner’s double-first disadvantage. Ranking all matches by the resulting product puts combinations like Karlovic-Djokovic and Murray-Isner together at one extreme. If we are to find instances where we could retroactively predict an advantage from hitting two first serves, they would be here.

When we divide all these matches into quintiles, there is a strong relationship between the double-first results we would predict using season-aggregate numbers and the double-first results we see in individual matches. However, even in the most double-first-friendly quintile–the one filled with Ivo serving and Novak returning–there’s still, on average, a one-percentage-point advantage to the traditional serving tactic.

It is only at the most extreme that we could even consider recommending two first serves. When we take the 2% of matches with the smallest products–that is, the ones we would most expect to benefit from double first–26 of those 50 matches are ones in which the server would’ve done better to hit two first serves.

In other words, there’s a ton of variance at the individual match level, and since the margins are so slim, there are almost no situations where it would be sensible for a player to hit two first serves.

A brief coda in the real world

All of this analysis is based on some simplifying assumptions, namely that players would make their first serves at the same rate if they were hitting two instead of one, and that players would win the same number of points behind their first serves even if they were hitting them twice as often.

We can only speculate how much those assumptions mask. I suspect that if a player hit only first serves, he would be more likely to see streaks of both success and failure; without second serves to mix things up, it would be easier to find oneself repeating mechanics, whether perfect or flawed.

The second assumption is probably the more important one. If a server hit only first serves, his ability to mix things up and disguise serving patterns would be hampered. I have no idea how much that would affect the outcome of service points–but it would probably act to the advantage of the returner.

All that said, even if we can’t recommend that players hit two first serves in any but the extreme matchups, it is worth emphasizing that the margins we’re discussing are small. And since they are small, the risk of hitting big second serves isn’t that great. There may be room for players to profitably experiment with more aggressive second serving, especially when a returner starts crushing second serves.

Ceding the advantage on second-serve points to a player like Djokovic must be disheartening. If the risk of a few more double faults is tolerable, we may have stumbled on a way for servers to occasionally stop the bleeding.

Sabr Metrics: The Case For the Hyper-Aggressive Return

Roger Federer has made waves the last few weeks by occasionally moving way up the court to return second serves. While the old-school tactic was nearly extinct in today’s game of baseline attrition, it seems to be working for Fed.

At least in one sense, it’s too early to say whether the kamikaze return is an effective tactic. Federer has used it sparingly for only a handful of matches, and in that tiny sample, he’s missed plenty of returns. But in the view of many pundits, the hyper-aggressive return gets in his opponents’ heads, making the tactic more valuable than simply changing the result of a few points. Presumably Roger agrees, since he keeps using it.

I agree that the tactic is a good one, though for a different reason. By taking greater risks, Fed is generating more unpredictability, or streakiness, on his opponents’ service games, which is valuable even if he doesn’t win any more return points.

Watching and waiting

To win a match, a player usually needs to break serve, and in the contemporary men’s game, that’s not an easy thing to do. On average, servers win about 64% of points and hold about 80% of service games. On hard courts, the equivalent numbers are even higher. Against a good server–let alone John Isner, Fed’s opponent tonight–they are higher still.

Returners who stand well behind the baseline and try only to put the ball back in play are basically crossing their fingers and hoping for the best. Maybe their opponent will miss several first serves, or the server will make a couple of errors against those weak returns. It can work, and for a brilliant returner such as Novak Djokovic, hitting moderately aggressive returns and winning some of the ensuing rallies is usually good enough for several breaks per match.

For most players, however, breaks of serve rely more on the server’s occasional lapses. To put it in numerical terms: A passive returner is playing the lottery in every return game–a lottery with only a 10% to 20% chance of winning.

Generating the coin flip

The best way to earn more breaks of serve, of course, is to win more return points. But unless you’re spending the offseason at Djokovic’s training camp, that’s unlikely.

The alternative is to change the rules of the lottery. Instead of accepting a steady rate of 35% of return points, a hyper-aggressive strategy is more likely to make the point-by-point results more streaky, even if the overall rate doesn’t change.

To see why this is effective, we need to oversimplify a bit. A player who wins 35% of return points will, on average, break in 17% of his return games. If we introduce a slight variation in the rate of return points won, we see a slight improvement in break rate, as well. If that same player wins 30% of return points in half of his games and 40% of return points in the other half, he’ll break serve 18% of the time.

That one-percentage-point improvement is barely noticeable. It probably represents what’s already going on in most matches, often because servers are a bit streaky already. The more volatility we introduce, though, the more the odds tilt toward the returner.

Double the variation and say that the returner wins 25% of return points half the time and 45% the other half. Now he’ll break serve in 21% of games, or one extra break per 25 return games. Still not overwhelming, but that’s one extra break in a five-setter.

The real magic happens when we expand the variation to an even split between 20% of return points and 50% of return points. In that scenario–when, remember, our returner is still winning 35% of points–the break rate improves to 26%, almost one more break per ten return games. On average, that’s an extra break per best-of-three match, and closer to two extra breaks in a typical best-of-five match.
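The break rates in these scenarios can be approximated with a short sketch. It assumes every point within a game is won independently at a fixed rate, which is a simplification and not necessarily the exact model behind the figures above. The function names `game_win_prob` and `split_break_rate` are illustrative only.

```python
def game_win_prob(p):
    """Chance of winning a game when each point is won independently
    with probability p (an assumed, simplified model)."""
    q = 1 - p
    # Win the game to love, 15, or 30: four points won while the
    # opponent wins 0, 1, or 2 (1, 4, and 10 possible orderings).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (20 orderings of three points each), then win two
    # points in a row before the opponent does.
    from_deuce = 20 * p**3 * q**3 * (p**2 / (1 - 2 * p * q))
    return before_deuce + from_deuce


def split_break_rate(low, high):
    """Break rate when half of return games are played at `low`
    and the other half at `high`."""
    return (game_win_prob(low) + game_win_prob(high)) / 2


# Each pair averages to 35% of return points won.
for low, high in [(0.35, 0.35), (0.30, 0.40), (0.25, 0.45), (0.20, 0.50)]:
    rate = split_break_rate(low, high)
    print(f"{low:.0%} / {high:.0%} split: break rate {rate:.1%}")
```

Under that assumption, the four splits round to the 17%, 18%, 21%, and 26% break rates quoted above, even though the overall share of return points won never moves from 35%.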

Back to reality

A hyper-aggressive return game is going to result in more return errors as well as more return winners. That’s true regardless of return position: Mikhail Kukushkin managed to break Marin Cilic four times on Friday by going for return winners, even if he stayed in the general area of the baseline.

So a new return tactic is unlikely to make a player much better in general. And of course, it’s unlikely to generate anything like the neat, theoretical examples shown above, when one game is better and one game is worse.

However, I suspect that higher-risk shots are more likely to be streaky, which would result in something like those neat examples. And if the pundits are right, that Fed’s kamikaze return unnerves his opponents, that ought to make his return games even streakier still, as his opponents deal with a new challenge mid-match.

Whenever there’s an opportunity to change the nature of the game and make it less predictable, the underdog should take it. Odd as it is to think of Federer as the underdog, he–like everyone else on the men’s tour–is in fact fighting an uphill battle in every return game. Hyper-aggressive tactics are a small step toward leveling the field.

Measuring WTA Tactics With Aggression Score

Editor’s note: Please welcome guest author Lowell! He’s a prolific contributor to the Match Charting Project, and the author of the first guest post on this blog.

The Problem

Quantifying aggression in tennis presents a quandary for the outsider. An aggressive shot and a defensive shot can occur on the same stroke at the same place on the court at the same point in a rally. To know whether one occurred, we need information on court positioning and shot speed, not only of the current shot, but the shots beforehand.

Since this data only exists for a fraction of tennis matches (via Hawkeye) and is not publicly available, using aggressive shots as a metric is untenable for public consumption. In a different era, net points may have been a suitable metric, but almost all current tennis, especially women’s tennis, revolves around baseline play.

Net points also can take on a random quality and may not actually reflect aggression. Elina Svitolina, according to data from the Match Charting Project, had 41 net points in her match against Yulia Putintseva at Roland Garros this year. However, this was not an indicator of Svitolina’s aggressive play so much as Putintseva hitting 51 drop shots in the match.

The Match Charting Project does give some data to help with this problem, however. We can use the data to get the length of rallies and whether a player finished the point, i.e. whether they hit a winner or unforced error or their opponent hit a forced error. If we assume an aggressive player would be more likely to finish the point and would be more likely to try to finish the point sooner rather than later in a rally, we can build a metric.

The Metric

To calculate aggression using these assumptions, we need to know how often a player finished the point and how many opportunities they had to finish the point, i.e. the number of times they had the ball in play on their side of the net. To measure the number of times a player finished the point, we add up the points where they hit a winner or unforced error or their opponent hit a forced error. For short, I will refer to these as “Points on Racquet”.

To measure how many opportunities a player had to finish the point, we calculate the number of times the ball was in play on each player’s side of the net. For service points, we add 1 to the length of each rally and divide it by 2, rounding up if the result is not an integer. For return points, we divide each rally by 2, rounding up if the result is not an integer. These adjustments allow us to accurately count how often a player had the ball in play on their side of the net. For brevity, I will call these values “Shot Opportunities”.

If we divide Points on Racquet by Shot Opportunities we will get a value between 0 and 1. If a player has a value of 0, they never finish points when the ball is on their side of the net. If the player has a value of 1, they only hit shots that end the point. As the value increases, a player is considered more aggressive. For short, I will call this measure an “Aggression Score.”
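The metric can be sketched in a few lines of code. This is a minimal illustration, not the script used for this post: the `(rally_length, serving, finished)` tuple layout and the sample points are hypothetical and do not reflect the Match Charting Project's actual file format.

```python
import math


def shot_opportunities(rally_length, serving):
    """Shot Opportunities for one point, using the rounding rule
    described in the text."""
    if serving:
        return math.ceil((rally_length + 1) / 2)
    return math.ceil(rally_length / 2)


def aggression_score(points):
    """points: (rally_length, serving, finished) tuples, where
    `finished` is True if the player hit a winner or unforced error,
    or the opponent hit a forced error (hypothetical layout)."""
    points = list(points)
    on_racquet = sum(1 for _, _, finished in points if finished)
    opportunities = sum(
        shot_opportunities(length, serving) for length, serving, _ in points
    )
    return on_racquet / opportunities if opportunities else 0.0


# Made-up points for illustration only.
sample = [
    (1, True, True),    # service point, finished: 1 opportunity
    (4, True, False),   # service point, not finished: 3 opportunities
    (3, False, True),   # return point, finished: 2 opportunities
    (6, False, False),  # return point, not finished: 3 opportunities
]
print(round(aggression_score(sample), 4))
```

In this made-up sample the player finishes 2 points across 9 Shot Opportunities, so the score is 2/9.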

The Data

Taking data from the latest upload of the Match Charting Project, I found women’s players with 2000 or more completed points in the database (i.e. all points that were not point penalties or missed points). Eighteen players met these criteria. Since the Match Charting Project is, unfortunately, a nonrandom sample of matches, I felt uncomfortable making assessments without a very large number of data points. With 2000 or more points per player, however, it would take an overwhelming amount of new data to overturn these assessments, which gives some confidence that, while bias exists, we get in the neighborhood of the true aggression values.

The Results

Below are the results from the analysis. Tables 1-3 provide the Aggression Scores for each player overall, broken down into serve and return scores and further broken down into first and second serves. They also provide differences between where we would expect the player to be more aggressive (Serve v. Return, First Serve v. Second Serve and Second Serve Return v. First Serve Return).

Table 1: Aggression Scores

Name         Overall  On Serve  On Return  S-R Spread  
S Williams     0.281    0.3114     0.2476      0.0638  
S Halep       0.1818    0.2058     0.1537      0.0521  
M Sharapova   0.2421    0.2471     0.2358      0.0113  
C Wozniacki   0.1526    0.1788     0.1185      0.0603  
P Kvitova     0.3306     0.347      0.309       0.038  
L Safarova    0.2475    0.2694     0.2182      0.0512  
A Ivanovic    0.2413     0.247     0.2335      0.0135  
Ka Pliskova    0.256    0.2898     0.2095      0.0803  
G Muguruza     0.231     0.238     0.2214      0.0166  
A Kerber      0.1766    0.2044     0.1433      0.0611  
B Bencic      0.1742    0.1784     0.1687      0.0097  
A Radwanska   0.1473    0.1688     0.1207      0.0481  
S Errani      0.1232    0.1184     0.1297     -0.0113  
E Svitolina   0.1654    0.1769     0.1511      0.0258  
M Keys        0.3017    0.3284     0.2677      0.0607  
V Azarenka    0.1892    0.1988     0.1762      0.0226  
V Williams    0.2251     0.247     0.1944      0.0526  
E Bouchard    0.2458    0.2695     0.2157      0.0538  
WTA Tour       0.209    0.2254     0.1877      0.0377

Table 2: Serve Aggression Scores

Name          Serve  First Serve  Second Serve  1-2 Spread  
S Williams   0.3114       0.3958        0.2048       0.191  
S Halep      0.2058       0.2298        0.1587      0.0711  
M Sharapova  0.2471       0.2715        0.1989      0.0726  
C Wozniacki  0.1788       0.2016         0.121      0.0806  
P Kvitova     0.347       0.3924        0.2705      0.1219  
L Safarova   0.2694       0.3079        0.1983      0.1096  
A Ivanovic    0.247       0.2961        0.1732      0.1229  
Ka Pliskova  0.2898       0.3552        0.1985      0.1567  
G Muguruza    0.238       0.2906        0.1676       0.123  
A Kerber     0.2044       0.2337        0.1384      0.0953  
B Bencic     0.1784       0.2118        0.1218        0.09  
A Radwanska  0.1688       0.2083        0.0931      0.1152  
S Errani     0.1184       0.1254        0.0819      0.0435  
E Svitolina  0.1769       0.2196         0.105      0.1146  
M Keys       0.3284       0.3958        0.2453      0.1505  
V Azarenka   0.1988       0.2257        0.1347       0.091  
V Williams    0.247       0.3033        0.1716      0.1317  
E Bouchard   0.2695       0.3043        0.2162      0.0881  
WTA Tour     0.2254       0.2578        0.1679      0.0899

Table 3: Return Aggression Scores

Name         Return  1st Return  2nd Return  Spread  
S Williams   0.2476      0.2108      0.3116  0.1008  
S Halep      0.1537      0.1399      0.1778  0.0379  
M Sharapova  0.2358      0.2133      0.2774  0.0641  
C Wozniacki  0.1185      0.1098       0.132  0.0222  
P Kvitova     0.309      0.2676      0.3803  0.1127  
L Safarova   0.2182      0.1778      0.2725  0.0947  
A Ivanovic   0.2335      0.1952      0.3027  0.1075  
Ka Pliskova  0.2095      0.1731      0.2715  0.0984  
G Muguruza   0.2214      0.1888      0.2855  0.0967  
A Kerber     0.1433      0.1127       0.191  0.0783  
B Bencic     0.1687      0.1514       0.197  0.0456  
A Radwanska  0.1207      0.1049      0.1464  0.0415  
S Errani     0.1297      0.1131      0.1613  0.0482  
E Svitolina  0.1511      0.1175      0.1981  0.0806  
M Keys       0.2677      0.2322      0.3464  0.1142  
V Azarenka   0.1762      0.1499      0.2164  0.0665  
V Williams   0.1944      0.1586       0.255  0.0964  
E Bouchard   0.2157      0.1757      0.2837   0.108  
WTA Tour     0.1877      0.1609      0.2341  0.0732

The first plot shows the relationship between serve and return aggression scores as well as the regression line with a confidence interval (note: since there are only 18 players in the sample, treat this regression line and all of the others in this post with caution).

Figure2

The second and third plots show the relationships between players’ aggression scores on first serves and their aggression scores on second serves for serve and return points respectively as well as the regression lines with confidence intervals.

Figure3

Figure4

The fourth and fifth plots show the relationship between the spread of serve and return aggression scores between first and second serve and the more aggressive point for the player, i.e. first serve for service points and second serve for return points as well as the regression lines with confidence intervals.

Figure5

Figure6

 

We can take away five preliminary observations.

Sara Errani knows where her money is made. The WTA is notoriously terrible at providing statistics. However, they do provide leaderboards for particular statistics, including return points and games won. Errani leads the tour in both this year. She is also the only player with a higher Aggression Score on return points than on serve points. From this information, we can hypothesize that Errani may play more aggressively on return points because she has greater confidence she can win those points or because she relies on those points more to win.

Maria Sharapova is insensitive to context; Elina Svitolina is highly sensitive to context. Sharapova falls outside of the confidence interval in all five plots. More specifically, Sharapova consistently is more aggressive on return points, second serve service points and first serve return points than her scores for service points, first serve service points and second serve return points respectively would predict. She also has lower spreads on serve and return than her more aggressive points would predict.

This result suggests that Sharapova differentiates relatively little in how she approaches points according to whether she is serving or returning or whether it is first serve or second serve. Svitolina exhibits the opposite trend. Considering anecdotal thoughts from watching Sharapova and Svitolina, these results make sense. Sharapova’s serve does not seem to vary between first and second and we see a lot of double faults. Svitolina can vary between aggressive shot-making and big first serves and conservative play. Hot takes are not always wrong.

Lucie Safarova, meet Eugenie Bouchard; Ana Ivanovic, meet Garbine Muguruza. Looking at the plots, it is interesting to note how Safarova and Bouchard seem to follow each other across the various measures. The same is true for Ivanovic and Muguruza. A potential application of the aggression score is that it can point us to players that are comparable and may have similar results. Players with good results against Safarova and Ivanovic may have good results against Bouchard and Muguruza, two younger players whom they are much less likely to have played.

Serena Williams and Karolina Pliskova serve like Madison Keys and Petra Kvitova, but they are very different. Serena, Pliskova, Keys and Kvitova are all players that are known for their serves as their weapons. Serena and Pliskova have the third and fourth highest Aggression Scores respectively. However, they also have wide spreads on serve and return scores and they have much lower second serve service point scores than their first serve scores would predict, whereas Keys is about where the prediction places her and Kvitova is far more aggressive than her first serve points would predict.

While Serena is still a relatively aggressive returner, she rates lower on first serve return aggression than Maria Sharapova. Pliskova falls to the middle of the pack on return aggression. Kvitova and Keys, in contrast, are both very aggressive on return points. My hypothesis for the difference is that while Serena and Pliskova are aggressive players, their scores get inflated by using their first serve as a weapon and they are only somewhat more aggressive than the players that score below them. Kvitova and Keys, on the other hand, are exceptionally aggressive players.

The WTA runs through Victoria Azarenka and Madison Keys. Oddly, the players who seemed to best capture the relationships between all of the aggression scores and spreads of aggression scores were Victoria Azarenka and Madison Keys. Neither strayed outside of the confidence interval, and both often ended up on the best-fit line from the regressions. They define average for the WTA top 20.

These thoughts are preliminary and any suggestions on how they could be used or improved would be helpful. I also must beseech you to help with the Match Charting Project to put more players over the 2,000-point mark and get more points for the players on this list, so that their Aggression Scores better reflect reality.

Uncontrolled Aggression

Listen to tennis commentary–or a broadcast of any sport, really–and wait for the first mention of “consistency.” You won’t have to wait for long.

“Consistent” is good, and “inconsistent” is bad. Or so we’re told. At first blush, it makes sense. Consistency is a good thing when it comes to following through on your forehand or brushing your teeth every day. But unless you’re the very best player in the world, consistency doesn’t win you Grand Slam titles.

Think of it this way: Every player has an “average” level they are capable of playing. If average Rafael Nadal plays average anybody else on clay, average Nadal wins. If average Richard Gasquet plays average anybody-outside-the-top-fifty, average Gasquet wins. These situations, for the likes of Nadal and Gasquet, are when consistency is actually a good thing. Sure, Rafa might be able to raise his game to previously unheard-of heights, but what’s the point? It’s a matter of winning 6-1 6-0 instead of 6-3 6-2. Nadal’s main concern is avoiding an off day.

Consider the same example from the perspective of Rafa’s opponent. If you’re Tomas Berdych and you play at your usual level against Nadal, you’ll lose. That’s what consistency gets you: thirteen straight losses.

Uncontrolled aggression

Very aggressive players tend to get a bad rap. The guys who always go for their shots–think Lukas Rosol or Nikolay Davydenko–rack up huge winner and unforced error counts. Sometimes it works and often it doesn’t. When it doesn’t, the conventional wisdom always seems to be that these players need to rein in their aggression. They need to be more consistent.

But they don’t. If Rosol stopped unleashing huge shots in every direction, he’d make fewer unforced errors, but he’d hit far fewer winners. He might still hover around #50 in the world, but more likely, he’d still be lurking in the Challenger ranks, looking for the breakthrough that such a passive style might never earn for him. As it is, Rosol’s go-for-broke approach got him that career-defining upset over Nadal, not to mention an ATP title in Bucharest last spring, when he beat three higher-ranked players.

Rather than the pundit’s favored phrase of “controlled aggression,” players score big upsets and major breakthroughs with uncontrolled aggression. (It only looks controlled because it’s working that day.) If you rein in an aggressive player, he may win more of the matches he’s supposed to win, but he’s much less likely to score an upset.

The balance myth

The game of tennis has so much variety–surfaces, climates, playing styles–and so much alternation–deuce/ad, serve/return–that pundits are constantly endorsing balance. Andy Murray needs to get better on clay, they say. Jerzy Janowicz needs to improve his return game. Monica Niculescu needs to learn how to hit a forehand.

It’s a tempting argument to make, because the best players in the game do have that balance. Nadal and Djokovic and Serena and Li have a wide variety of devastating shots and tactics that are effective on every surface. If you want to play like them and reap the same rewards, you need to have that same balance.

Except that, for the vast majority of players–even top-tenners–that just isn’t going to happen. I don’t care if David Ferrer hires a coaching team of Pete Sampras and Mark Philippoussis, he’ll never be much more effective on serve. John Isner could work all offseason with Andre Agassi and remain among the game’s weakest returners.

What’s keeping these players from climbing any higher in the rankings isn’t the fact that they aren’t more balanced. It’s the simple fact that they aren’t better. By definition, most people will never be a once-in-a-generation talent.

Most players are not balanced. And that’s fine. Rather than chasing the impossible dream of out-Novaking Novak, they need to take more risks to outplay their betters in one or two areas. When it doesn’t work, it doesn’t matter–they would’ve lost anyway.

The cluster principle

Tennis rewards the streaky. If you only win four return points in a set, it’s much better to win them consecutively than to spread them out. It’s better to win five matches in one week and go winless for the next four weeks than win one match per week.

Whether it’s points, games, sets, matches, or even titles, it’s better to cluster your triumphs.

If you strive for a balanced game, the best players simply won’t let you go on a streak. Fabio Fognini or Sabine Lisicki might give you a few gifts, but Nadal never will. The only way to cluster your victories over Rafa is to play such aggressive tennis that even he can’t neutralize it. It usually won’t work, but for most players, it’s their only hope. There’s a reason the hyper-aggressive Davydenko is the only active player with a winning record against him.

Stan’s untold narrative

Stanislas Wawrinka probably wouldn’t have beaten a healthy Nadal over five sets on Sunday. But he was winning when Rafa’s back acted up, and he did so by unleashing every weapon in his arsenal.

Whatever the rankings say this week, Wawrinka isn’t one of the best three tennis players in the world. At least “average Stan” isn’t. But that’s the whole point. Tennis doesn’t reward players with ranking points and prize money for consistency. Consistency got Berdych into the top ten and has kept him there for so long … but it has prevented him from spending much time in the top five.

Wawrinka won’t always beat Nadal or Djokovic, and he’ll continue to suffer his share of defeats at the hands of the players ranked below him. The high-risk style of play that earned him a place in the history books won’t always pay off. That’s all part of the package. Stan didn’t get this far by being consistent.