The Problem With “Unforced Errors”

In any sport, there are a handful of stats that are frequently cited, but are ultimately of limited use.  Often, these statistics tell you something, but are misunderstood to imply something more.  Simple examples are many “counting” stats — points scored in basketball, touchdowns thrown in football, RBI in baseball.  In all of those cases, they indicate something good, but don’t give you context — lots of field goal attempts, a great offensive line, or good hitters on base in front of you, to take those three cases.

The stat in tennis that aggravates me most is the unforced error.  Not only does it ignore some important context (as in the other-sport stats I just mentioned), but it relies on the judgment of a scorer.


The second problem is the more serious one.  How much does a number mean if two people watching the same match wouldn’t come up with the same result?  This was a hot-button issue during Wimbledon, when the scorers were assigning an unusually small number of unforced errors (UEs), especially on serve returns.

If you’re watching the match, you might not notice.  If the end-of-set stats show that Nadal had 8 UEs and Federer had 17, that does tell you something … Federer was making more obvious mistakes.  But if you want to compare that to a Nadal/Federer match three weeks ago, or last year, those numbers are all but useless.

I suspect that, at events like Wimbledon, someone from the ITF, or maybe IBM, is giving standardized instructions to scorers with general rules for categorizing errors.  That would be a good start, especially if it were implemented across all tournaments at all professional levels.

…but it doesn’t matter

I suspect that no matter how consistent scorers are, the distinction between “unforced” and “forced” errors will always be arbitrary.  Consider the case of service returns.  There are occasional points, especially on second serve returns, where the returning player misses an easy shot.  But more frequently, the returning player is immediately on defense.  When is an error “unforced” on the return of a 130 mile-per-hour serve?

Ultimately, we will probably have computerized systems that classify errors for us.  If you have all the necessary data and crunch the numbers, a 125-mph serve down the T in the ad court might be returned 60% of the time, meaning there is a 40% chance of an error or non-return.  With those numbers on every serve (and every other shot, eventually), we could set the line for an “unforced” error on a shot that the average top-100 player would make, say, 75% of the time.  Or we could have different classifications: “unforced errors,” “disastrous errors,” “mildly forced errors,” and so on, indicating different percentage ranges.
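
To make the idea concrete, here is a minimal sketch of that kind of classifier. The 75% line for “unforced” comes from the paragraph above; the 90% and 50% cutoffs, the extra “forced error” label, and the function name are assumptions chosen purely for illustration, not numbers anyone has measured.

```python
def classify_error(make_probability):
    """Label an error by how often an average top-100 player makes the shot.

    make_probability: estimated share of the time the shot is made (0 to 1).
    Only the 0.75 cutoff comes from the text; 0.90 and 0.50 are assumed.
    """
    if not 0.0 <= make_probability <= 1.0:
        raise ValueError("make_probability must be between 0 and 1")
    if make_probability >= 0.90:
        return "disastrous error"
    if make_probability >= 0.75:
        return "unforced error"
    if make_probability >= 0.50:
        return "mildly forced error"
    return "forced error"


# The 125-mph serve down the T, returned 60% of the time:
print(classify_error(0.60))
```

Under these assumed cutoffs, missing that return would count as a “mildly forced error” rather than an unforced one.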

The problem we have now is that professionals are so good (and their equipment is so advanced) that almost every shot can be offensive, meaning that players are almost always, to some extent, on defense.  If you’re rallying with Nadal, you might hit some winners, but you’re always fighting the spin.  If you’re rallying with Federer, the spin isn’t so bad, but you’re always trying to keep the ball away from his forehand.  (If you’re rallying with Djokovic, you’re wishing you had hit a better serve.)  That perpetual semi-defensive posture means that nearly every error is, to some extent, forced.  And because players are so good, we expect them to return every reachable ball, suggesting that nearly every error is, to some extent, unforced.


The wisdom of baseball analysts

A very similar problem arises in baseball.  If a fielder makes a misplay (according to the official scorer), he is charged with an “error.”  Paradoxically, some of the best fielders end up with the highest error totals.  If, say, a shortstop has great range, he’ll reach a lot of groundballs, and have more chances to make bad throws, thus racking up the errors.

For decades, fans considered errors to be the standard measure of defensive prowess–a stat called “fielding percentage” measures the ratio of plays-successfully-made to chances.   (In other words, 1 minus “error rate.”)  But because of the paradox mentioned above, the highest fielding percentages do not necessarily belong to the best fielders.

The solution: Ignore errors, look only at plays made.  (This is an oversimplification, but not by much.)  If Shortstop A makes more plays than Shortstop B, it doesn’t matter whether A makes more errors.  The guy you want on your team is the one who makes more plays.
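
The paradox is easy to see with a little arithmetic. The sketch below uses the simplified definition given above (plays made divided by chances, where chances are plays made plus errors); the two shortstops and their season totals are invented for illustration only.

```python
def fielding_percentage(plays_made, errors):
    """Plays made divided by chances (plays made plus errors)."""
    chances = plays_made + errors
    return plays_made / chances


# Hypothetical season totals, made up to illustrate the paradox.
shortstop_a = {"plays_made": 480, "errors": 20}  # great range, more chances
shortstop_b = {"plays_made": 420, "errors": 10}  # limited range, fewer chances

for name, s in (("A", shortstop_a), ("B", shortstop_b)):
    pct = fielding_percentage(s["plays_made"], s["errors"])
    print(f"Shortstop {name}: {s['plays_made']} plays made, "
          f"fielding percentage {pct:.3f}")
```

In this made-up case, Shortstop B has the better fielding percentage, yet Shortstop A turned 60 more balls into outs.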

Essentially, baseball errors correspond to tennis unforced errors, and baseball plays-not-made (shortstop dives for the ball and can’t reach it) correspond to tennis forced errors.  The stat that ends up mattering to baseball analysts–“plays made”–corresponds to “shots successfully returned.”  The analogy is imperfect, but it illustrates the problem with separating one type of non-play from another.


If we don’t distinguish between different types of errors, we’re left with “shots made” and “shots not made,” or–even less satisfactorily–“points won” and “points lost.”  Not exactly a step in the right direction, since we’re already counting points!

Still, I suspect it’s better to have no stat than to have a misleading stat.  Rally counts are a positive step, since we can look at outcomes for different types of points.  If you win a lot of 10-or-more-stroke rallies, that identifies you as a certain type of player (or playing a certain kind of match).  It doesn’t matter whether you lose that sort of point on an unforced error or your opponent’s winner–both outcomes might stem from the same tactical mistake three or four strokes sooner.
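
As a rough sketch of how a rally-count stat could be computed, assume each point is recorded as a rally length and whether the player won it. That record format, the function name, and the sample points below are all hypothetical.

```python
def long_rally_win_rate(points, min_strokes=10):
    """Share of rallies of at least min_strokes that the player won.

    points: iterable of (rally_length, won) pairs, where won is a bool.
    Returns None if no rally reached min_strokes.
    """
    long_points = [won for length, won in points if length >= min_strokes]
    if not long_points:
        return None
    return sum(long_points) / len(long_points)


# Made-up points from one player's perspective: (strokes, won?)
sample_points = [(3, True), (12, True), (15, False),
                 (10, True), (4, False), (11, True)]
print(long_rally_win_rate(sample_points))
```

Note that how each point ended (winner, forced error, unforced error) never enters the calculation, which is the point: only the outcome and the length of the rally matter.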

Either that, or we can wait until we can calculate real-time win probability and start categorizing errors with extreme precision.  “Unforced errors” aren’t going away any time soon, but as fans, we can be smarter about how much attention we grant to individual numbers.

5 thoughts on “The Problem With “Unforced Errors””

  1. Always read (via Rick Devereux) what you have to say with great interest as you both are striving to shed light upon tennis results that both explain what just happened and will predict what happens in the future.

    I noticed a new tennis stat during telecasts toward the end of the hardcourt season, in the midst of another Joker win over Nadal: the percentage of rallies won that lasted at least 10 strokes. The fact that Joker won well over half of such rallies indicated that not only was Rafa losing, but he was losing in a domain (the long rally) in which he was considered to be king.

    Arguably, if not clearly, that particular stat would be strongly indicative of future outcomes between THESE TWO PLAYERS.

    However, what if Isner had beaten Nadal in the French Open? In trying to explain that outcome and predict future outcomes between THOSE TWO PLAYERS, I suspect the “rally length” stat wouldn’t be so useful.

    I’m guessing no single stat can be the holy grail when predicting outcomes.

    As you noted, the quality and quantity (Heavy Topspin) of Nadal’s topspin FH is a factor that elicits so-called unforced errors.

    Could there be such a stat for the heaviness of one’s topspin, as there already is for the MPH of one’s serve?

    Corresponding to Rafa’s heavy topspin, there is Isner’s height, which diminishes the nuisance of Rafa’s topspin.

    Granted, the Isner / Rafa match-up is an anomaly and I’m sure it won’t prevent you from eventually de-mystifying match results.

    As a total non sequitur, I’d like to make one brief comment, or pose a question, on RBIs.
    Shouldn’t the number of runners in scoring position who are left on base be somehow factored into and partially subtracted from one’s RBI total?

    1. Thanks for reading!

      Yes, the rally length stat is of limited use — it probably tells you if a player is exceeding expectations (as in Rafa/Novak) — but is more limited with Isner/Rafa. In that case, since Isner is so automatic on serve, maybe ‘exceeding expectations’ for Isner means winning 30% of long rallies, since he starts with an advantage serving.

      Eventually, yes, I think we will have a stat for strength of topspin. I saw something on a telecast this week noting the height of an Isner kick serve crossing the baseline, so that could be calculated for any shot, or estimated from the ball’s trajectory in the case of a ball that is picked up on the rise. The spin itself might be difficult to quantify in a meaningful way, but simply having “max height” for a player’s groundstrokes would be a step in the right direction. And as you note, that max height would have different levels of effectiveness depending on the opponent’s height.

      On RBI — it’s pretty much irredeemable as a stat, because it is so context-dependent. I think some baseball folks have tried to adjust it for context, like RBI per runner in scoring position, but no matter what you do with it, it doesn’t have much value beyond simply looking at what the batter does himself.

  2. Steve, I like your point about Joker winning the longer rallies against Rafa. It doesn’t take statistics to see what effect this may have been having on Rafa’s head, but if we knew that this had become a pattern in all or most of Rafa’s losses (at least on hard courts), it would suggest that winning long rallies against him is an important factor in beating him. To back up that conviction with stats, this would have to be compared to other common factors in his losses, and looked at in the context of his wins against others ranked similarly.

  3. Maybe instead of examining this from the error-ee’s perspective, a better way of looking at this would be from the opponent’s. Two things come to mind:
    1. who forces the most errors (I’m guessing a Raonic, Isner, Fed type) and
    2. who causes the most unforced errors (Nadal / Djokovic / Murray: ahh I can’t get it by them, I’ll go for more…beep).

    This would make all errors almost an offensive stat, and treat each player more like a pitcher as opposed to a fielder. Then you could probably even get into metrics comparing these numbers against specific players, or in head-to-heads, which would be pretty useful. (For example, Nadal probably forces quite a few more errors out of Fed than Djok does.)
