What Grega Zemlja Can Tell Us About American Tennis

Italian translation at settesei.it

Last week, virtually unknown Slovenian qualifier Grega Zemlja reached the final in Vienna.  Like many players–Eastern Europeans in particular–in the back half of the top 100, he has finally established a toehold on tour after putting together a good sequence of challenger results.

The final run in Vienna–only his 16th tour-level event–will help keep him in the top 100 for most of the next year, earning him direct entries into all of the Grand Slams and many smaller ATP events.

Zemlja turned 26 one month ago, so he is hardly a “prospect.”  But I call your attention to him because he has achieved his new berth in the top 50 almost entirely by merit.  When the AELTC awarded him a wild card into the Wimbledon main draw this summer, it was the first tour-level wild card of his career.  In fact, he has only received a single wild card into a challenger main draw.

While the Slovenian has been a fixture in the top 200 since the end of 2008, he hasn’t gotten any favors.

The distribution of wild cards

As it turns out, he’s not alone.  21 players in the top 100 (including Tomas Berdych and Janko Tipsarevic) didn’t receive a single tour-level wild card before their 25th birthday.  Another 16 (Novak Djokovic and David Ferrer among them) got only one, and yet another 23 received only two.

When I started researching this post, I expected to find that Zemlja was uniquely disadvantaged.  But no: Wild cards are the privilege of players who happen to be born in the right places.  Free entries tend to go to home favorites, with a few more awarded to star youngsters like Grigor Dimitrov.

Thus, the geographical distribution of wild cards has everything to do with where tournaments are located.  And tournament locations have an awful lot to do with where the tennis world was centered 20, 50, or even 100 years ago.

The U.S. of Assistance

Much has been said of Donald Young’s 27 tour-level wild cards.  (Some of it by Patrick McEnroe, recipient of 37.)  But that’s just the tip of the iceberg.  Did you know that the seven active players who received the most wild cards before age 25 play for the USA?  Young is followed by Mardy Fish, Ryan Harrison, Sam Querrey, Jesse Levine, John Isner, and James Blake.  (Blake has been handed by far the most career wild cards, but the majority have come in his more recent comeback attempts.)

The current top 200 players received 748 wild cards before the age of 25.  139, or 18.6% of those, have gone to these seven, or 3.5% of players.
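For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above.  The variable names are mine, and the "over-representation" ratio is just one illustrative way to summarize the gap.

```python
# Figures quoted in the post; nothing here comes from an external source.
total_wild_cards = 748       # wild cards received before age 25 by the current top 200
top_seven_wild_cards = 139   # of those, the number that went to the seven Americans
top_seven_players = 7
total_players = 200


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


wild_card_share = share(top_seven_wild_cards, total_wild_cards)
player_share = share(top_seven_players, total_players)

# How many times their "proportional" share of wild cards the seven received.
over_representation = wild_card_share / player_share

print(f"{wild_card_share:.1f}% of wild cards went to {player_share:.1f}% of players")
print(f"That is roughly {over_representation:.1f} times a proportional share")
```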

Put simply, the distribution of tennis tournaments doesn’t match the distribution of tennis talent.  The US is the only country with more than one Masters 1000 event–it has three.  Plus a slam.  And two 500s.  And another seven 250s, at least in 2012.

All those tournaments have at least three wild cards to give out.  This year, seven of them handed main draw spots to Jack Sock, who at age 20 has already amassed 10 career tour-level wild cards, more WCs than 90% of the top 200 have received.

A structural problem

This is an easy subject to get worked up about, especially if you prefer to support players like Zemlja.  Yet it’s difficult to blame anyone in particular.

Tournaments fiercely guard the few wild card spots they are given, so it would be difficult for the ATP to meddle.  The events want to attract fans, and an up-and-comer with an easy-to-pronounce name is a great way to sell tickets.  And you certainly can’t blame a player for accepting main draw berths.

Here’s a modest proposal: Convert a few more “wild card” spots to merit-based spots.  The USTA is doing more of this, setting up playoffs for reciprocal wild card placements at the Australian and French Opens, among other strategies.  But that doesn’t help with geographical distribution, since only Americans can compete!

Better yet is a version of how Zemlja got into Wimbledon.  He won the Nottingham challenger two weeks previous, and the AELTC wasn’t going to give away all the free spots to Brits.  The Slovenian was a deserving up-and-comer, even though he doesn’t play under the right flag.

Perhaps every Slam and Masters event should reserve a spot for the winner of a corresponding challenger.  Or every tournament with a 48-or-bigger draw should be required to hand at least one wild card to a non-national.

If a player is good enough, he’ll break in eventually.  But wouldn’t the sport be better off if some players didn’t have to wait longer than others, based simply on how many tournaments are played in the country they represent?

Daniel Brands and Ace Records in Context

In the Vienna round of 16 last week, Juan Martin Del Potro beat Daniel Brands in a three-set, three-tiebreak match.  The courts are fast, Delpo serves big, and apparently Brands has quite the weapon of his own, as both players hit at least 30 aces.  Brands hit 32.

We can’t help but be impressed at the sheer numbers.  As it turns out, it’s an ATP first, at least since 1991, when the ATP started keeping such stats.  Never before had both players hit at least 30 aces in a three-set match.

Here are the top nine matches in the ATP record books, in which both servers reached a certain ace milestone:

minAces  Winner                 Loser              Year  Event               Surface  Score                 wAces  lAces  
30       Juan Martin Del Potro  Daniel Brands      2012  Vienna              Hard     6-7(5) 7-6(4) 7-6(6)     30     32  
29       John Isner             Gilles Muller      2010  Atlanta             Hard     4-6 7-6(6) 7-6(7)        33     29  
28       Andrei Pavel           Gregory Carraz     2005  Milan               Carpet   7-6(0) 6-7(5) 7-6(3)     28     33  
25       Greg Rusedski          Joachim Johansson  2004  Moscow              Carpet   7-6(5) 6-7(1) 7-6(7)     25     26  
25       Arnaud Clement         Thomas Johansson   2008  Cincinnati Masters  Hard     7-6(4) 6-7(5) 6-3        25     28  
24       Mark Philippoussis     Greg Rusedski      2002  Queen's Club        Grass    6-7(1) 7-6(3) 7-6(5)     25     24  
24       Joachim Johansson      Kristof Vliegen    2006  Stockholm           Hard     6-7(5) 7-6(5) 7-6(7)     24     24  
24       Andy Roddick           Ivo Karlovic       2009  Queen's Club        Grass    7-6(4) 7-6(5)            24     26  
24       Richard Gasquet        Joachim Johansson  2009  Kuala Lumpur        Hard     4-6 7-6(1) 6-2           26     24

(There are several matches in which both players hit 23, including two on clay, both from 2011: Isner/Karlovic in Houston, and Federer/Feliciano Lopez in Madrid.  Both went to three tiebreaks.)
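The ranking in the table above depends on the lower of the two ace counts in each match.  As a rough sketch of that sort key, here are three rows copied from the table; the tuple layout and function name are my own choices for illustration.

```python
# (winner, loser, winner_aces, loser_aces), copied from three rows of the table above
matches = [
    ("Andrei Pavel", "Gregory Carraz", 28, 33),
    ("Juan Martin Del Potro", "Daniel Brands", 30, 32),
    ("John Isner", "Gilles Muller", 33, 29),
]


def min_aces(match):
    """The milestone that BOTH servers reached: the smaller of the two ace counts."""
    _, _, winner_aces, loser_aces = match
    return min(winner_aces, loser_aces)


# Highest shared milestone first
ranked = sorted(matches, key=min_aces, reverse=True)

for match in ranked:
    print(min_aces(match), match[0], "d.", match[1])
```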

Aces in a losing effort

Even independent of Del Potro’s 30 aces, it stands out that Brands racked up 32 aces in a best-of-three losing effort.  But that’s not a record–it ties him for 16th of all time with several others, including Sam Querrey, Milos Raonic, Ivo Karlovic, and Goran Ivanisevic, who did it twice.

Mardy Fish may not be proud of this record, but he simply blows away the rest of the field, having served past the eminently ace-able Olivier Rochus 43 times despite losing to the Belgian.  Though Karlovic may not sit atop the list, he makes up for it by dominating the middle.

lAces  Winner              Loser             Year  Event             Surface  Score                  wAces  
43     Olivier Rochus      Mardy Fish        2007  Lyon              Carpet   6-7(5) 7-6(6) 7-6(15)      2  
37     Yevgeny Kafelnikov  Alexander Waske   2002  Tashkent          Hard     6-7(6) 7-6(5) 7-6(6)      10  
35     Pete Sampras        Goran Ivanisevic  1996  Tour Finals       Carpet   6-7(6) 7-6(4) 7-5         17  
35     Andy Roddick        Feliciano Lopez   2011  Queen's Club      Grass    7-6(2) 6-7(5) 6-4         15  
35     Feliciano Lopez     Ivo Karlovic      2004  Madrid Masters    Hard     6-4 6-7(10) 7-6(5)         8  
35     Yen Hsun Lu         Ivo Karlovic      2012  Queen's Club      Grass    6-7(3) 7-6(6) 7-6(7)       6  
35     Rafael Nadal        Ivo Karlovic      2008  Queen's Club      Grass    6-7(5) 7-6(5) 7-6(4)       6  
35     Arnaud Clement      Ivo Karlovic      2004  's-Hertogenbosch  Grass    7-6(8) 6-7(5) 6-3          2  
34     Thomas Johansson    Ivan Ljubicic     2002  Canada Masters    Hard     4-6 6-4 7-6(6)            17  
34     Lars Burgsmuller    Wayne Arthurs     2006  Tokyo             Hard     6-7(5) 7-6(7) 7-6(3)      10  
34     Richey Reneberg     Richard Krajicek  1997  Halle             Grass    4-6 7-6(2) 7-6(6)          6

Total aces in a single match

If there has never been a match in which both players hit 30 aces, a match total of 62 aces must be pretty impressive, right?

Indeed it is.  Del Potro and Brands are now tied for the record, initially set by John Isner and Gilles Muller two years ago in Atlanta.  It’s only the fourth time that two players have combined for 60 or more aces in a best-of-three contest.

totAces  Winner                 Loser             Year  Event               Surface  Score                 wAces  lAces  
62       Juan Martin Del Potro  Daniel Brands     2012  Vienna              Hard     6-7(5) 7-6(4) 7-6(6)     30     32  
62       John Isner             Gilles Muller     2010  Atlanta             Hard     4-6 7-6(6) 7-6(7)        33     29  
61       Andrei Pavel           Gregory Carraz    2005  Milan               Carpet   7-6(0) 6-7(5) 7-6(3)     28     33  
60       Goran Ivanisevic       Magnus Norman     1997  Zagreb              Carpet   7-6(5) 6-7(4) 7-5        40     20  
58       Frank Dancevic         Peter Wessels     2007  Stockholm           Hard     6-1 6-7(7) 7-6(6)        35     23  
55       Jan Michael Gambill    Wayne Arthurs     2002  San Jose            Hard     7-5 6-7(5) 7-6(4)        22     33  
55       Bohdan Ulihrach        Goran Ivanisevic  1999  Rotterdam           Carpet   6-7(6) 7-6(3) 7-5        23     32  
53       Andy Roddick           Wayne Arthurs     2006  Memphis             Hard     6-7(4) 7-6(9) 7-6(2)     20     33  
53       Andy Roddick           Sam Querrey       2010  San Jose            Hard     2-6 7-6(5) 7-6(4)        21     32  
53       Arnaud Clement         Thomas Johansson  2008  Cincinnati Masters  Hard     7-6(4) 6-7(5) 6-3        25     28  
53       Joachim Johansson      Gregory Carraz    2004  Canada Masters      Hard     7-6(4) 6-7(3) 7-6(4)     30     23

The higher bar of ace rate

If you want to set a record in a best-of-three-sets match, getting to those three tiebreaks is a good idea.  The more points you play, the more likely you’ll hit more aces, as evidenced by Fish’s losing performance, where he not only reached three tiebreaks, but played at least twelve points in each one!

For greater context, we should open up the field to all matches regardless of length, and compare them by ace rate.

Del Potro’s 30 aces came in 125 service points, for an ace rate of 24%.  Brands hit 32 in 131, for an ace rate of 24.4%.  It’s not often that one player (not named Isner, anyway) hits nearly one-quarter of his serves for aces, so it is particularly unusual for both players to do so.
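Ace rate is simply aces divided by service points.  A short sketch with the Vienna numbers quoted above follows; the function name is my own, and the combined rate pools both players' serves.

```python
def ace_rate(aces, service_points):
    """Fraction of service points that ended in an ace."""
    return aces / service_points


del_potro = ace_rate(30, 125)
brands = ace_rate(32, 131)

# The "minimum" rate is the lower of the two; the combined rate pools all serves.
minimum = min(del_potro, brands)
combined = ace_rate(30 + 32, 125 + 131)

print(f"Del Potro {del_potro:.1%}, Brands {brands:.1%}")
print(f"minimum {minimum:.1%}, combined {combined:.1%}")
```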

In all tour-level matches (including grand slams) since 1991, a minimum ace rate of 24.0% is only good for 17th.  Andy Roddick was particularly adept at bringing about these kinds of matches, appearing in 6 of the top 11 on this list:

minA%  Winner            Loser              Year  Event            Surface  Score                wA%    lA%  
33.3%  Andy Roddick      Ivo Karlovic       2009  Queen's Club     Grass    7-6(4) 7-6(5)      33.3%  35.1%  
29.8%  Mikhail Youzhny   Ivan Ljubicic      2007  Rotterdam        Hard     6-2 6-4            29.8%  29.8%  
29.2%  Gregory Carraz    Martin Verkerk     2004  Milan            Carpet   6-3 7-6(3)         30.4%  29.2%  
27.3%  Goran Ivanisevic  Boris Becker       1996  Antwerp          Carpet   6-4 7-6(5)         30.8%  27.3%  
27.1%  John Isner        Gilles Muller      2010  Atlanta          Hard     4-6 7-6(6) 7-6(7)  27.5%  27.1%  
27.0%  Robin Soderling   Andy Roddick       2008  Lyon             Carpet   7-6(5) 7-6(5)      27.0%  27.2%  
26.7%  Janko Tipsarevic  Peter Luczak       2010  's-Hertogenbosch Grass    6-3 6-3            26.7%  27.1%  
26.1%  Andy Roddick      Gilles Muller      2008  Memphis          Hard     6-4 7-6(4)         27.4%  26.1%  
25.4%  Andy Roddick      Joachim Johansson  2004  San Jose         Hard     6-3 7-6(7)         36.5%  25.4%  
25.4%  Andy Roddick      Nicolas Mahut      2008  Lyon             Carpet   7-6(5) 6-4         29.0%  25.4%  
25.3%  Andy Roddick      Feliciano Lopez    2008  Dubai            Hard     6-7(8) 6-4 6-2     26.2%  25.3%

Ace rate in a losing effort

While losers rarely hit as many aces as Brands did last week, losers often hit aces at a much higher rate.  Brands doesn’t register anywhere near the top of this all-time list.

Think of it this way: The shorter the match, the more likely a player will do something off-the-charts, rate-wise.  Karlovic tops this list, with 28 aces in his 70 service points.  Brands didn’t maintain anywhere near the same rate that Ivo did, but Brands did have to hit nearly twice as many serves!  Had Karlovic continued for 61 more serves, he probably would’ve done better than 24.4%, but it is very unlikely he would have continued at a 4-in-10 pace.
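To put a number on that thought experiment, here is a back-of-the-envelope sketch.  It uses only the figures above and asks how few aces Karlovic would have needed over the extra serves to match Brands's rate.

```python
karlovic_aces, karlovic_points = 28, 70
brands_aces, brands_points = 32, 131

# Extra serves Karlovic would need to hit to face the same workload as Brands
extra_points = brands_points - karlovic_points

# With equal service points, matching Brands's rate means matching his ace count
extra_aces_to_match = brands_aces - karlovic_aces

# Ace rate required over the extra serves alone
required_rate = extra_aces_to_match / extra_points

print(f"{extra_aces_to_match} aces in {extra_points} serves ({required_rate:.1%})")
```

In other words, Karlovic could have fallen far below his 4-in-10 pace over those extra serves and still matched Brands.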

This is also a reason why we haven’t seen many best-of-five matches on the ace-rate leaderboards.  Even if one player is acing like a madman while quickly losing, he still has to keep up the pace for three sets.

lA%    Winner              Loser               Year  Event               Surface  Score                     lAces  
40.0%  Florent Serra       Ivo Karlovic        2009  Basel               Hard     7-6(5) 6-4                   28  
37.5%  Alex O'Brien        Mark Philippoussis  1996  Cincinnati Masters  Hard     6-4 6-4                      21  
36.6%  Thomas Johansson    Ivan Ljubicic       2002  Canada Masters      Hard     4-6 6-4 7-6(6)               34  
35.8%  Richey Reneberg     Richard Krajicek    1997  Halle               Grass    4-6 7-6(2) 7-6(6)            34  
35.1%  Andy Roddick        Ivo Karlovic        2009  Queen's Club        Grass    7-6(4) 7-6(5)                26  
34.8%  Paul Henri Mathieu  Ivo Karlovic        2009  Cincinnati Masters  Hard     7-6(9) 6-4                   23  
34.8%  Paul Henri Mathieu  Chris Guccione      2008  Adelaide            Hard     4-6 6-3 6-4                  24  
34.2%  Andre Agassi        Joachim Johansson   2005  Australian Open     Hard     6-7(4) 7-6(5) 7-6(3) 6-4     51  
33.8%  Jonas Bjorkman      Mark Philippoussis  2002  Memphis             Hard     7-6(6) 7-6(1)                26  
33.3%  Thomas Johansson    Wayne Arthurs       2001  Nottingham          Grass    7-6(3) 7-6(3)                24  
33.3%  Yevgeny Kafelnikov  Marc Rosset         2002  Marseille           Hard     6-3 7-6(5)                   19  
33.3%  Andre Agassi        Goran Ivanisevic    1994  Vienna              Carpet   6-4 6-4                      19

Combined ace rate

As you might have guessed by now, 24% isn’t going to be good enough to crack this final all-time list.  Roddick, Karlovic, and Mark Philippoussis simply played too many matches to allow that to happen.

Indeed, the Brands/Del Potro combined rate of 24.2% isn’t even close to the top of this list.  To show up here, it’s necessary to come within an ace or two of the 30% mark.  With Andy’s retirement and Ivo’s decline, this leaderboard looks particularly safe at the moment.

totA%  Winner              Loser              Year  Event                 Surface  Score          totAces    wA%    lA%  
34.2%  Andy Roddick        Ivo Karlovic       2009  Queen's Club          Grass    7-6(4) 7-6(5)       50  33.3%  35.1%  
31.6%  Andy Roddick        Thomas Johansson   2004  Bangkok               Hard     6-3 6-4             31  38.2%  23.3%  
31.6%  Andy Roddick        Joachim Johansson  2004  San Jose              Hard     6-3 7-6(7)          42  36.5%  25.4%  
31.6%  Martin Verkerk      Thomas Enqvist     2003  Milan                 Carpet   6-3 6-4             30  46.0%  15.6%  
30.6%  Robin Soderling     Gregory Carraz     2004  Marseille             Hard     6-3 6-4             30  42.6%  19.6%  
30.4%  Jonathan Stark      Goran Ivanisevic   1997  Indian Wells Masters  Hard     7-5 6-3             34  37.7%  23.7%  
29.9%  Mark Philippoussis  Lionel Roux        1996  Paris Masters         Carpet   6-4 6-4             35  49.1%  11.7%  
29.8%  Mikhail Youzhny     Ivan Ljubicic      2007  Rotterdam             Hard     6-2 6-4             28  29.8%  29.8%  
29.8%  Gregory Carraz      Martin Verkerk     2004  Milan                 Carpet   6-3 7-6(3)          36  30.4%  29.2%  
29.0%  Jonathan Stark      Thomas Enqvist     1993  Halle                 Grass    6-4 6-2             27  37.8%  20.8%  
29.0%  Goran Ivanisevic    Boris Becker       1996  Antwerp               Carpet   6-4 7-6(5)          38  30.8%  27.3%

Andy, we’re missing you already.

Responding to Pressure at 5-5

In a post last week, I presented some data that suggested that servers weaken a bit under the pressure of a tiebreak.  It’s not a strong effect, but it’s a consistent one.  A possible explanation–that all that time between points gives servers a chance to psych themselves out, yet may not affect returners the same way–would apply almost as much to games toward the business end of a set, such as at 5-5 or 5-6.

In other words, if players don’t serve as well (or they return better) when things get tight, we’d expect to see more breaks toward the end of a set–more breaks than expected at 5-5, but perhaps fewer breaks than expected at 2-2.

This also opens up a possible method for evaluating players, as Carl Bialik has suggested.  If someone is losing more sets 5-7 than they are winning 7-5, it may be that they are wilting under the pressure of 5-5 more than the average player.  It would make sense if the players who consistently exceed tiebreak expectations also regularly outperform 7-5 expectations.

Within the constraints of the ATP’s Matchstats, 7-5 sets are a great way to identify these patterns.  While some 6-4 sets end with a break (or a break followed by a set-sealing hold), a 6-4 set doesn’t necessarily end that way.  But a 7-5 set must have reached 5-5 before one player took control.

If the hypothesis is correct that players get tighter on serve as the end of the set approaches, we would expect more 7-5 sets in the real world than simulations would imply.

To estimate the number of sets that should end 7-5, we need to take each player’s service points won from each match.  With that, we can calculate the probabilities that sets will end at any given score.  Repeat the process for every match over a period of time and we get a general idea of how often we should see 7-5 sets.
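For readers who want to experiment, here is a minimal sketch of that kind of calculation.  It assumes every point on a player's serve is won independently with the same probability, which is a simplification and not necessarily the exact model behind the numbers in this post.  The function names and the sample inputs (65% and 62%) are made up for illustration.

```python
def hold_prob(p):
    """Probability the server wins a game, given probability p of winning each point."""
    q = 1.0 - p
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)   # game to love, 15, or 30
    reach_deuce = 20 * p**3 * q**3                      # 3-3 in points
    win_from_deuce = p**2 / (p**2 + q**2)
    return win_before_deuce + reach_deuce * win_from_deuce


def set_score_probs(p_a, p_b):
    """Probability of each final set score (games_a, games_b), with A serving first.

    p_a and p_b are each player's service-point win probabilities.
    The score (6, 6) stands for "reached a tiebreak".
    """
    hold_a, hold_b = hold_prob(p_a), hold_prob(p_b)
    live = {(0, 0): 1.0}   # scores still in progress
    final = {}             # finished sets
    while live:
        nxt = {}
        for (a, b), prob in live.items():
            a_serving = (a + b) % 2 == 0
            p_a_wins = hold_a if a_serving else 1.0 - hold_b
            for score, p in (((a + 1, b), prob * p_a_wins),
                             ((a, b + 1), prob * (1.0 - p_a_wins))):
                hi, lo = max(score), min(score)
                done = (hi >= 6 and hi - lo >= 2) or score == (6, 6)
                bucket = final if done else nxt
                bucket[score] = bucket.get(score, 0.0) + p
        live = nxt
    return final


def seven_five_share(p_a, p_b):
    """Probability that the set ends 7-5 in either player's favor."""
    probs = set_score_probs(p_a, p_b)
    return probs.get((7, 5), 0.0) + probs.get((5, 7), 0.0)


# Hypothetical matchup: A wins 65% of service points, B wins 62%
print(f"P(set ends 7-5 either way) = {seven_five_share(0.65, 0.62):.3f}")
```

Averaging `seven_five_share` over every match in a dataset, weighted by sets played, would give an expected share of 7-5 sets to compare against the observed one.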

As it turns out, 7-5 sets should make up about 7.8% of all sets.  In fact, 8.8% of sets end 7-5.  Not a huge difference, but one that is fairly consistent from year to year.  In every year since 1991, when this dataset begins, there have been more 7-5s than expected.  It certainly adds more weight to the claim that the balance of power swings to the returner toward the end of a tight set.

(My set-prediction model doesn’t exactly replicate reality, since players win more games than their service winning percentages predict, in large part because almost all servers are better in either the deuce or ad court, and the variance between them makes it more likely that the player wins a given service game.  When applying a crude adjustment for this, the crumbling-server hypothesis looks even better–the more games servers are predicted to win, the fewer predicted 7-5 sets.)

Identifying the unbreakable

This type of discussion must make you wonder: Which players are good at this stuff?  If it is true that late-set pressure results in more breaks, it seems obvious that some players are more prone to that pressure, and that other players take advantage of that pressure.

In an ideal world, we’d be able to identify some great 7-5 records, point out some 5-7 records, and have some great new insights into players.

As it is … we might.

As we saw last week with tiebreak analysis, we can’t simply count up a player’s 7-5 sets and compare that total to his 5-7 set losses.  Over the last three years, Andy Roddick won more than 55% of his 7-5 and 5-7 sets, but given the players he faced in those sets and their performances in those matches, he should have won 62%.

There are two ways to quantify player accomplishments in this department.  The first evaluates how well a player avoids losing 5-7 when he reaches 5-5; the other compares his ability to break for 7-5 against his proneness to being broken for 5-7.

Let’s call the first stat Five-Seven AVoidance, or FSAV.  For any player, we first add up the sets that reached 5-5, then count the sets that he won 7-5 or reached a tiebreak.  Then we use the general method described above to estimate how many times the player should have reached 5-5, and how many of those times he should have avoided 5-7.   Since the beginning of 2010, Kei Nishikori has avoided a 5-7 finish in about 92% of the sets in which he reached 5-5.  My model would have expected him to avoid 5-7 only about 84% of the time.  (The model expects that most players will avoid 5-7 about 82-90% of the time they reach 5-5.)

From those numbers, we discover that Nishikori lost 5-7 less than half as often as we would have expected him to.  No other player comes close to that mark. In everyday language, FSAV approximates how often a player was able to hold serve at 5-5 or 5-6.  Important skill, that.
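The comparison behind that claim can be sketched in a few lines.  The 92% and 84% figures are the ones quoted above.  The set counts in the first example are hypothetical, chosen only because they produce a 92% rate, and the ratio of loss rates is one plausible way to express the comparison, not necessarily the exact FSAV formula.

```python
def avoidance_rate(sets_at_5_5, sets_lost_5_7):
    """Share of sets reaching 5-5 that did NOT end 5-7 (won 7-5 or went to a tiebreak)."""
    return 1.0 - sets_lost_5_7 / sets_at_5_5


# Hypothetical counts, NOT Nishikori's real totals: 2 losses in 25 sets gives 92%
example_rate = avoidance_rate(sets_at_5_5=25, sets_lost_5_7=2)

# Rates quoted in the post
actual_avoid = 0.92
expected_avoid = 0.84

# Compare how often he lost 5-7 with how often the model expected him to
loss_ratio = (1.0 - actual_avoid) / (1.0 - expected_avoid)

print(f"example avoidance rate: {example_rate:.0%}")
print(f"actual 5-7 losses relative to expected: {loss_ratio:.2f}")
```

With the rounded rates, the ratio comes out at about one half, in line with the claim above.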

The second stat is more narrowly focused on 5-5 sets that do not reach a tiebreak.  Let’s call this one the Seven-Five Outperformance Rate, or SFOR, similar to the TBOR (TieBreak Outperformance Rate) I introduced last week.

Here, instead of comparing 5-7s to all 5-5 sets, we compare 5-7s to 7-5s.  In other words: Is the player more likely to break for 7-5 or be broken for 5-7?  As with the previous stat, after calculating the simple rate (that is, number of 7-5 sets divided by total number of 7-5 and 5-7 sets), we compare that to the results that the model would have expected the player to post.

Bizarrely enough, our three-year leader in SFOR is Ernests Gulbis, who has won about 73% of his 7-5 and 5-7 sets, compared to the 50% the model expects of him.  (It’s even more impressive when compared to the 7% that I personally would have expected from him.)
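Here is a rough sketch of the SFOR comparison.  The 73% and 50% figures are the ones quoted above, but the set counts are hypothetical, and expressing outperformance as a simple difference between actual and expected rates is my assumption for illustration; the normalization used for the published numbers may differ.

```python
def seven_five_rate(won_7_5, lost_5_7):
    """Simple rate: 7-5 wins as a share of all sets that ended 7-5 or 5-7."""
    return won_7_5 / (won_7_5 + lost_5_7)


def outperformance(actual_rate, expected_rate):
    """Assumed definition: gap between the actual and expected rates."""
    return actual_rate - expected_rate


# Hypothetical counts, NOT Gulbis's real totals: 11 wins and 4 losses is about 73%
example_rate = seven_five_rate(won_7_5=11, lost_5_7=4)

# Rates quoted in the post
gap = outperformance(actual_rate=0.73, expected_rate=0.50)

print(f"example simple rate: {example_rate:.1%}")
print(f"outperformance: {gap:+.0%}")
```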

As the highlighting of Gulbis suggests, these stats probably don’t yet belong in our everyday toolbox.  There simply aren’t very many 7-5 sets, even if–as I established above–there are a few more than we would expect.  For reference, there are almost twice as many tiebreaks as 7-5s.

And to keep Gulbis in the spotlight, it may be that winning 7-5 sets is more a function of getting to 5-5 when you shouldn’t.  Perhaps many of those 7-5s racked up by the Latvian came when he should have put the set away 6-2.  Once 5-5 came along, he finally decided to get serious.  As Gulbis himself might tell you, it’s anybody’s guess.

Follow the jump for FSAV and SFOR on about 50 of the most active players, sorted by FSAV, and decide for yourself.  The numbers cover all tour-level matches, excluding Davis Cup, since the beginning of 2010.


More New Toys on TennisAbstract.com

If you’re not yet using TennisAbstract.com as your go-to ATP results and stats resource, it might be time to switch.

Last week, I added tournament pages for every ATP and Challenger event.   For instance, you can now see every match (and all of its stats, and every player’s ranking) for any tournament, like last week’s Shanghai Masters, or the 2001 Milan event.  Most sites require that you click a pop-up window to get match stats.  Now you can compare every match (including qualies, for the last several years) by  ace rate, return points won, and dozens of other stats.

As with the player pages, the match table is sortable by almost every column, and a handful of filters in the left-hand sidebar allow you to zero in on the matches you are interested in.

One feature I particularly like is the ability to select subgroups of players in the top-right table.  Each tourney page defaults to displaying event totals for the eight men who reached the quarters, so you can compare tournament-long statistics for the top contenders.  By clicking the links at the bottom of that table, you can get a quick glimpse of the seeds, qualifiers, or wild cards … or stats for everyone in the main draw.

These tournament pages are accessible from every player page (and vice versa).

There are several more new features that I hope you find interesting:

  • If you haven’t already seen TA’s current tournament pages, this is a great week to check them out, with three tournaments in action.  These pages show all completed matches, all upcoming matches, and jrank-derived odds given who is left in the draw.
  • A small recent adjustment to those pages is particularly handy.  For the projections, you can click on previous round names (e.g. “R32”) to see what the projections looked like at that point in the tournament.  Find out who scored the biggest upsets, who has increased their odds the most, or just hunt for my model’s most egregious errors.
  • Many of you have asked for regularly updated surface rankings.  Here you go!
  • The first of several new reports is the Head-to-Head Matrix.  See H2H records for the current top 15.  Click on the records themselves to see a fuller view of all of the relevant head-to-heads.
  • Also new: Current Rankings by Age.  Find the top players under 19, under 21, and under 23, along with those 28+, 30+ and 32+.  Use the dropdown menu to see similar reports for each year’s year-end rankings back to 1984.
  • Inspired by a long-ago blog post, see which players are performing the most and least consistently.
  • Finally, compare tournaments by field quality.  This is fascinating stuff, so much so that I made a report for the last 52 weeks of Challengers, as well.

All of the reports are accessible from the TennisAbstract.com home page.  I hope you enjoy them, and that you keep an eye out for the next wave of new toys, as well.

The Luck of the Tiebreak

Italian translation at settesei.it

Yesterday, I introduced a method to separate “good tiebreak playing” from “good tennis playing.”  For the most part, better players win more tiebreaks, but some guys win more tiebreaks than their general betterness would suggest.

That raises some questions: Why do those players win more tiebreaks than expected?  Do they do so regularly?  Is it their style of play?  Is it magical tiebreak-fu?  Is it possible to get through two paragraphs of a post about tiebreaks without mentioning John Isner?

Here are two hypotheses, which I will discuss in turn:

  1. Players who win more tiebreaks than expected do so because their game is suited to tiebreaks–which probably means that they serve particularly well.
  2. Players who win more tiebreaks than expected do so because, in some intangible way, they are very good at tiebreaks, perhaps due to clutch play, calm under pressure, or intimidation of their opponents.

The server advantage hypothesis

Earlier this week, I reported my results that players seem to serve worse (fewer aces, fewer points won) in tiebreaks than in the sets that preceded those tiebreaks.  If everyone declined the same amount, everyone would win roughly the number of tiebreaks we expect of them.

But much more likely, some players do not see their serves decline in tiebreaks.  Some might even improve in breakers.  If they do, they outperform the average, and they win more tiebreaks than expected.

Another angle here is that for some players, a bit of serve decline doesn’t matter much.  In last week’s match between Isner and Kevin Anderson, Isner won 79% of service points and Anderson won 77%.  Nearly one in five serves for the entire match went for aces–imagine how many more were service winners.  If both players served a bit more conservatively in the breakers, would we even notice?  When Fernando Verdasco starts playing it safe, it’s impossible not to notice–and easier to beat him in a breaker.  Perhaps that isn’t so for the likes of Isner.

These are appealing theories.  (Especially to me–I thought them up myself and believed in them for several hours.)  However, the numbers don’t bear them out.  There is no consistent statistical relationship between big serving and outperforming tiebreak expectations.  To take a few examples: Isner is a tiebreak monster–probably the best tiebreak-player of this generation.  Pete Sampras and Roger Federer are also among the greats.  Below average, though, are the likes of Ivo Karlovic, Sam Querrey, Marc Rosset, and Robin Soderling.

Let’s try another…

The intangibles hypothesis

If there is some intangible mental factor that causes some players to win more tiebreaks than they would otherwise, it’s impossible to test for that effect directly–if it were possible, it wouldn’t be intangible.

But, if some players had that tiebreak-fu, they would probably hold on to it for more than a single season.  For instance, when Novak Djokovic won an impressive 19% and 16% more tiebreaks than expected in 2006 and 2007, respectively, we should have been able to assume that he’s really good at tiebreaks, then predict that he would continue to excel in breakers in 2008.  Yet in 2008, 2009, and 2010, Djokovic barely outperformed average, winning 2% or 3% more than expected.  Ok, so we have a new forecast for Novak in the new decade: just a bit more tiebreak-magic than others.  Yet in 2011, Djokovic won 10% fewer tiebreaks than expected.  He’s 9% below average this year.

Sometimes, these changes might be explained by confidence.  But more often, they are just plain random.  While a few players (including Isner and Federer) put up great numbers every year, the vast majority of the field fluctuates, seemingly at random.  The year-to-year correlation for the population of players with at least 15 tiebreaks in two consecutive years (going back to 1991) is almost exactly zero.  (Set the bar higher if you wish; still barely distinguishable from zero.)

If tiebreak-related intangibles were widespread, there would be some kind of year-to-year correlation.  Perhaps a small number of players do have that magic, but for the purposes of most analysis, it is more accurate to assume that when it comes to a player’s overperformance in tiebreaks, his record one year has very little to do with how he’ll perform the next.

One tiny ray of light

This gets a bit frustrating after a while.  It seems that something should turn up as the cause of tiebreak excellence.  One simple stat does, to a small degree: number of tiebreaks played.  In other words, the guys who play the most tiebreaks tend to be the ones who beat expectations in those tiebreaks.

The connection that immediately springs to mind (after serving prowess, which we’ve already discarded) is practice.  The more match-court breakers you play, the better you become.  Isner, Federer, Sampras–they spend more time at 6-6 than almost anyone, and their tiebreak records are among the best.

Of course, the causation could go the other way.  Perhaps confidence in one’s tiebreak skills causes a player to be more comfortable going to a breaker.  While Djokovic or Andy Murray would press particularly hard for a break a game away from a 6-4 or 7-5 set, Isner is comfortable cruising into a tiebreak.

It’s a minor effect (r < 0.2), one that doesn’t explain anywhere near the observed year-to-year variance in tiebreak under- and over-performance.  But it’s something.

The implications of the luck of the tiebreak

What if overperforming or underperforming your expected tiebreak performance is, essentially, luck?  Or more generally (and safely) speaking, what if it says little about your likelihood of being good or bad at tiebreaks in the future?

For one thing, it would have a major impact on forecasting.  If tiebreak performance one year doesn’t predict tiebreak performance the next, players with extreme under- or over-performances one year can be expected to regress to the mean the following year.  It’s unclear exactly what that would mean in practice, but if you take away the five tiebreaks Feliciano Lopez won beyond expectations in 2011, you’re left with a player who probably isn’t ranked within the top 20.  You would expect a decline as he stops winning quite so many breakers.

On a more practical level, these implications might aid the confidence of players with middling tiebreak records.  If you’re Andreas Seppi, who has a career losing record in breakers, you might be excused for some negativity when you reach 6-6 against, say, Karlovic.  But if you know your own poor record is only loosely related to your skills, and Karlovic’s record isn’t nearly as good as it looks, you might take a different approach.  Indeed, Seppi underperformed tiebreak expectations every year from 2006 to 2011, but has won more than expected this season–including one breaker each against Djokovic and Isner.

There’s plenty more work to do here–calling a couple of popular hypotheses into question hardly puts the issue to bed.  But if we’ve learned nothing else this week, it is that tiebreaks are not at all what they seem.  The players you think are masters are often middling performers, and regardless of the conventional wisdom, the breaker is about a whole lot more than a big serve.

Who Actually Excels in Tiebreaks?

Italian translation at settesei.it

I’ve never understood the fixation that some fans and commentators seem to have with tiebreak winning percentage.  Sure, winning tiebreaks is nice, but it seems obvious that the main cause of exemplary tiebreak performance is being good at tennis.  Though some players may in fact be better than others at this facet of the game, a big part of what tiebreak winning percentage tells us is about general tennis skill.

In other words, Roger Federer is very good at tiebreaks because he is very good at serving and returning, the same skills that get him so many wins, regardless of whether any of the sets go to tiebreaks.

If we ignore tiebreak winning percentage, what are we left with?  It’s still tempting to wonder whether some players have a kind of special skill–calm under pressure, a particularly consistent serve–that leads them to outperform expectations in breakers.

The key word there is “expectations.”  Given Federer’s general ability on the tennis court, we should expect him to win most tiebreaks–for example, two of the last three breakers he’s played came against Stanislas Wawrinka, who he should beat regardless of the format.  But our intuition will fail us if we look at Federer’s match record and try to estimate how many tiebreaks he should have won, then compare the “should” to the “did.”

Expected tiebreaks

Sounds like something computers do better than humans.  Given a player’s percentage of service and return points won in a certain match, we can estimate how likely he was to win a tiebreak–on the assumption that his performance level stayed the same throughout the match.

If two players are equally matched, each one would be “expected” to win 0.5 tiebreaks.  That’s nonsensical for a single match, but over the course of this season, we see that of John Isner’s 53 tiebreaks, the algorithm would expect him to win 29.  In fact, he has won 38, exceeding expectations (in raw terms, anyway) more than anyone else on tour this year.
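One common way to turn serve and return rates into a tiebreak expectation is to treat every point as independent and walk through each possible score.  The Python sketch below does that; it is an assumption about how such an estimate could be built, not necessarily the exact algorithm behind the numbers in this post, and the function names and sample rates are hypothetical.

```python
from functools import lru_cache


def tiebreak_win_prob(p_serve, p_return):
    """Chance of winning a first-to-7, win-by-2 tiebreak.

    p_serve:  share of the player's own service points he wins
    p_return: share of the opponent's service points he wins
    Assumes every point is independent and that the player serves the
    first point.
    """
    if not (0 < p_serve < 1 and 0 < p_return < 1):
        raise ValueError("rates must be strictly between 0 and 1")

    @lru_cache(maxsize=None)
    def prob(won, lost):
        if won == 7:
            return 1.0
        if lost == 7:
            return 0.0
        if won == 6 and lost == 6:
            # From 6-6 on, points come in pairs with one serve each; the
            # tiebreak ends when one player takes both points of a pair.
            win_pair = p_serve * p_return
            lose_pair = (1 - p_serve) * (1 - p_return)
            return win_pair / (win_pair + lose_pair)
        played = won + lost
        # Serving order: player, opponent, opponent, player, player, ...
        player_serving = ((played + 1) // 2) % 2 == 0
        p = p_serve if player_serving else p_return
        return p * prob(won + 1, lost) + (1 - p) * prob(won, lost + 1)

    return prob(0, 0)


def expected_tiebreaks(rates):
    """Sum per-tiebreak win probabilities into an expected total."""
    return sum(tiebreak_win_prob(ps, pr) for ps, pr in rates)


# Evenly matched players: each wins 65% on serve, so 35% on return.
even = tiebreak_win_prob(0.65, 0.35)
# A player who is better on both serve and return.
stronger = tiebreak_win_prob(0.70, 0.40)
```

Summing these probabilities over every tiebreak a player contests in a season gives an expected total that can be set against his actual tiebreak wins.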

This gives us two stats that offer more insight into a player’s tiebreak performance than “tiebreaks won” and “tiebreak winning percentage.”  The raw number, the difference between actual tiebreaks won and expected tiebreaks won, tells us how many additional sets a player has taken because of his tiebreak performance.  Call it TBOE: TieBreaks Over Expectations.  A similar rate stat is derived by dividing TBOE by the number of tiebreaks, allowing us to compare players regardless of how many tiebreaks they played.  Call that one TBOR: TieBreak Outperformance Rate.

As we’ve seen, Isner is the 2012 king of TBOE, performing well in tiebreaks and playing far more of them than anyone else on tour.  Yet three players–Steve Darcis, Andy Murray, and Jurgen Melzer–have done better by TBOR, exceeding expectations at a greater rate than Isner has.  Darcis is particularly remarkable, winning 16 of his 19 tiebreaks through last week, despite his serve and return rates in those matches suggesting he should have won only 10 of them.

(And in Vienna on Monday, he won another one, extending his already untouchable lead over the pack.)
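Both stats are simple arithmetic once the expected total is known.  Here is a short Python sketch using the Isner and Darcis figures quoted above; the function names are my own, and the expected totals are the rounded ones from the text, so the rates are approximate.

```python
def tboe(won, expected):
    """TieBreaks Over Expectations: actual wins minus expected wins."""
    return won - expected


def tbor(won, expected, played):
    """TieBreak Outperformance Rate: TBOE per tiebreak played."""
    if played <= 0:
        raise ValueError("played must be positive")
    return (won - expected) / played


# Figures quoted in the text (expected totals are rounded).
isner_tboe = tboe(38, 29)          # 38 won, 29 expected
isner_tbor = tbor(38, 29, 53)      # over 53 tiebreaks
darcis_tboe = tboe(16, 10)         # 16 won, 10 expected
darcis_tbor = tbor(16, 10, 19)     # over 19 tiebreaks
```

Isner comes out ahead on the counting stat (nine extra tiebreaks to six), while Darcis comes out ahead on the rate stat because his surplus is spread over far fewer tiebreaks.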

I’ll have more to say about this tomorrow, including a look at just how much meaning we can extract from TBOE and TBOR.  In the meantime, look after the jump for the current 2012 leaderboard–through Shanghai, sorted by TBOR, minimum 15 tiebreaks.


What Matters in Tiebreaks?

Italian translation at settesei.it

Players and fans tend to look at tiebreaks as a unique part of the sport of tennis, perhaps one susceptible to special skills.  The ATP website last week devoted an article to what those skills might be.  Players generally seemed to agree that it was nice to have a good serve, and a good return would also be handy.  Clearly, more analysis is needed.

Let me give you my hypothesis.  Tiebreaks are pressure packed, and pressure can affect any part of a player’s game.  But in general, they should impact some parts more than others.  You could make the case for either side of the ball–on the one hand, serving is a more “automatic” activity; on the other, there’s more time to think before each serve, and thinking can be dangerous when the pressure is on.  This is where it’s nice to have some data.

I found 388 tiebreaks from the last eight ATP slams.  For each one, I compared each player’s winning percentage on serve during the first 12 games of the set to his winning percentage on serve during the tiebreak.  If players were robots, there might be a difference between the set and the tiebreak for any given match, but in general, the numbers should be the same.

But players aren’t robots.  As it turns out, players win more return points than expected during tiebreaks.  The difference is noticeable if not enormous: about one more return point than expected every three matches.
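The comparison described above can be sketched in a few lines of Python.  The record layout and the sample numbers are hypothetical, and this is only one plausible way to set up the calculation: it projects each server's rate from the first 12 games onto his tiebreak service points, then counts how many more points the returner won than that projection allows.

```python
from dataclasses import dataclass


@dataclass
class ServeSplit:
    """One player's serving numbers in one tiebreak set (hypothetical layout)."""
    set_serve_points: int   # service points played in the first 12 games
    set_serve_won: int      # service points won in the first 12 games
    tb_serve_points: int    # service points played in the tiebreak
    tb_serve_won: int       # service points won in the tiebreak


def excess_return_points(splits):
    """Return points won in tiebreaks beyond what set-level serving implies."""
    excess = 0.0
    for s in splits:
        set_rate = s.set_serve_won / s.set_serve_points
        expected_server_wins = set_rate * s.tb_serve_points
        # Every point the server falls short of expectation goes to the returner.
        excess += expected_server_wins - s.tb_serve_won
    return excess


# Made-up example: both servers dip slightly in the tiebreak.
sample = [
    ServeSplit(set_serve_points=40, set_serve_won=28, tb_serve_points=6, tb_serve_won=4),
    ServeSplit(set_serve_points=36, set_serve_won=27, tb_serve_points=4, tb_serve_won=2),
]
total_excess = excess_return_points(sample)
```

A positive total means returners did better in the breaker than the set-level numbers would predict; for robots, the total would hover around zero.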

Thus, tiebreaks are different from the sets that precede them in one of two ways.  Either some players are unable to serve up to their usual standard during tiebreaks, or some players manage to raise their return game in tiebreaks.

A breakdown by tournament suggests the answer.  The difference between server winning percentage in sets and tiebreaks is about the same for the Australian Open, the US Open, and Wimbledon, but is less than half as much at the French.  It seems, then, that faster courts give returners a bigger boost in the breaker.  A more likely interpretation is that servers are unable to hold on to their advantage on faster courts.  There’s less of an advantage to lose on clay.

My hypothesis at the outset focused on pressure, and combined with the numbers, it suggests that players are more affected by pressure when serving than when returning.  It’s also possible that players find it more difficult to get into a serving rhythm with only two serve points at a time.  Or returners may simply be less likely to concede aces during tiebreaks, meaning that the same serve quality and return potential result in more return points won.

Whether it is a matter of server timidity or returner aggression, there are certainly fewer aces in tiebreaks.  In these 388 tiebreaks, there were 83 fewer aces than would be expected if players kept acing at the rate of their first twelve games.  Given the relative infrequency of aces, that’s a more striking decrease than that of service winning percentage in general.

This analysis is hardly the final word.  But for aspiring tiebreak masters, it does offer a slightly more specific prescription than “get better at tennis.”  Rather than assuming that the tiebreak is all about the serve, recognize that returners have a slight advantage.  On serve, players can improve simply by ignoring the pressure (easy, right?) and serving as well as they did during the set.  When returning, players can be more aggressive in the knowledge that in general, servers will not be.

After all, a good serve may be the key to tiebreak success, but only if the serve is as good as usual in the breaker.

The Case for the Race

Last week, Peter Bodo argued in favor of giving the ATP year-to-date “Race to London” more weight over the traditional rolling 52-week ranking.  It’s a relevant point right now, when Roger Federer leads in the 52-week tally, but Novak Djokovic dominates in the year-to-date numbers.

In other words, Fed is racking up more records at #1 while Djokovic will almost certainly go in the books as the top player of 2012.  Bodo doesn’t go far enough: The old-fashioned rankings are weird, confusing, and–why stop there?–bad for tennis.

In most of the world’s most popular sports, everybody starts the year with a clean slate.  Imagine if a baseball team opened their schedule having to “defend” their previous year’s April winning streak.  Or if your favorite football team started the season seventh in their division.  This is essentially what happens when the ATP heads to Australia in January, altering rankings only when players do something different than what they accomplished last year.

Not only does this make it hard to root for underdogs in tennis, it makes it hard for the underdogs themselves.  You may not pity Bernard Tomic, but he surely spoke for many mid-pack players when he described the mental challenge of defending points, not just beating world-class tennis players.  In other sports, hope springs eternal.  In tennis, it’s an immense struggle to crack the top 20 for a single week.

The greatest advantage of the Race is that it is so easy to understand.  Tomas Berdych reached the semifinals last week, so he gets 360 points.  Simple as that.  No comparison to last year’s totals, no concern about whether points are going on or coming off at a stagger from last year because of the Olympics, and–blessedly–nary a mention of zero-pointers.  Tennis rankings will always be more than simply incrementing the win column, but this is pretty close.
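To make the contrast concrete, here is a deliberately simplified Python sketch of the two tallies.  It ignores the real system's rules about which results count (including zero-pointers), and the results list is made up apart from the 360 points for a semifinal mentioned above.

```python
from datetime import date, timedelta


def race_points(results, as_of):
    """Year-to-date total: everything earned in the current calendar year."""
    return sum(pts for day, pts in results
               if day.year == as_of.year and day <= as_of)


def rolling_points(results, as_of):
    """Rolling total: everything earned in the 52 weeks up to `as_of`."""
    cutoff = as_of - timedelta(weeks=52)
    return sum(pts for day, pts in results if cutoff < day <= as_of)


# Hypothetical results as (date, points) pairs.
results = [
    (date(2011, 8, 29), 90),    # more than 52 weeks old: counts in neither tally
    (date(2011, 11, 7), 600),   # last autumn: rolling ranking only
    (date(2012, 1, 16), 180),
    (date(2012, 10, 8), 360),   # last week's semifinal
]
today = date(2012, 10, 15)
race = race_points(results, today)
rolling = rolling_points(results, today)
```

In the Race, last week's 360 points are simply added to the total.  In the rolling tally, the same 360 points arrive while an older result may be dropping off, so the net change depends on what happened a year earlier.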

Bodo cites the unpredictability of the turn-of-the-century Australian Open as a reason why the Race didn’t catch on.  It doesn’t make sense to have Petr Korda atop much of anything, right?  In fact, that’s the beauty of it.  The 52-week rankings simply entrench the Big Four in our minds, while an emphasis on the race would make us think twice the next time a Korda, or a Marcos Baghdatis, or a Marin Cilic, makes a January splash.  Fans are smart enough to realize that leading the rankings early in the season isn’t the same as finishing at the top.

Some version of the 52-week ranking system will never go away, and that’s how it should be.  Its purpose is to rate players–for seeding, and even more importantly, for tournament entry.  As I’ve written at length, it’s not a very good system for that purpose.  If we focused on the Race instead, the tournament entry methodology could become much more sophisticated and do a better job of putting the best players on court every week.

With its increasing focus on qualification for the Tour Finals, the ATP has taken some big steps toward presenting tennis as a high-stakes, year-long season, not merely a disjointed mishmash of events competing for attention.  Highlighting the Race rankings would make for much more spectator enjoyment.  It might even open the door to more important discussions of the chaotic tour schedule, eventually offering fans a coherent tennis season to follow every week.

Withdrawal Effects

Italian translation at settesei.it

Yesterday, Mardy Fish withdrew from his fourth-round match against Roger Federer.  As we saw earlier today, Federer may gain some benefit from the extra rest, but with the additional rest days built into the grand slam schedule, Roger runs the risk of getting too little time on court.

What’s the true effect, then?  Will the extra rest make Federer an even bigger favorite in his quarterfinal match against Tomas Berdych?  Or will match-court rust hold him back?

As it turns out, there is virtually no effect.  Players who are handed a walkover go on to win almost exactly half of their next matches, and a closer look at those matches reveals that 50% is about what we would’ve expected from them, walkover or not.

To hunt for a potential relationship, I found 139 ATP main draw walkovers since 2001 where the winner went on to play another match at the same tournament–in other words, excluding finals.  While it may seem that players tend to withdraw when they’re least likely to win a match (as with Fish this week, or like the other two players to withdraw before facing Federer this year), there’s nothing to that theory, either. The average pre-match odds of the withdrawing player are about 51%.

Thus, we can work on the assumption that there’s little bias in the pool of 139 men who received a free pass to the next round.  For every Federer, there’s a Donald Young advancing uncontested over Richard Gasquet.  Balancing the withdrawals of players without a chance may be higher-ranked players who are quicker to withdraw because their success allows them to play it safe and make longer-term decisions.

In the 139 follow-up matches, our players went 67-72, winning 48.2% of the time.  Prematch predictions (generated by Jrank) would have projected a winning percentage of 48.9%.
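As a rough check on how small that gap is, the Python sketch below compares actual wins with projected wins and expresses the difference in standard errors.  Treating all 139 matches as if they shared the average projected win rate is a simplifying assumption of mine (the individual forecasts are not listed here), so take the result as a ballpark figure only.

```python
from math import sqrt


def walkover_gap(wins, matches, projected_rate):
    """Compare actual wins with projected wins.

    Returns (expected_wins, z), where z is the shortfall or surplus in
    standard errors under a simple binomial approximation.
    """
    expected_wins = projected_rate * matches
    std_error = sqrt(matches * projected_rate * (1 - projected_rate))
    return expected_wins, (wins - expected_wins) / std_error


# Figures from the text: 67 wins in 139 matches, 48.9% projected.
expected_wins, z = walkover_gap(wins=67, matches=139, projected_rate=0.489)
```

The projection works out to about 68 wins against 67 actual wins, a gap of a small fraction of one standard error, which is consistent with the walkover having no measurable effect.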

If we narrow the search to slams, we get a nearly-meaningless pool of only 12 matches.  The player coming off the walkover went 6-6; prematch numbers would’ve predicted 7-5.  Perhaps rust does play a small part; considerably more likely is that the walkover simply doesn’t affect the beneficiary.

For Federer fans, though, there’s little reason for concern.  This is the ninth time in his career he’s advanced via walkover, and he’s only lost the next match twice.  One of those was in 2002.  The other was in Indian Wells in 2008.  The man who beat Fed?  Mardy Fish.

At Slams, Do Shorter Matches Lead to Later Success?

Italian translation at settesei.it

Over the weekend, Tom Perrotta made the claim that grand slam champions such as Roger Federer and Serena Williams got that way, in part, by keeping early matches short.  In his words: “They’re great at not being exhausted.”

This is intuitively appealing, especially after a third round in which Federer and Novak Djokovic barely broke a sweat, while Andy Murray, David Ferrer, and Tomas Berdych each dropped a set.  (Even Juan Martin Del Potro was forced to a tiebreak by Leonardo Mayer.)

Before we get carried away, let’s find out what the numbers tell us.  As we’ll see, slam champions usually are the men who spent fewer minutes on court getting to the final.  It’s less clear, though, whether there is a causal link: After all, a better player should have an easier time of it in the early going.

The ATP has complete match-length numbers for our purposes going back to 2001.  That gives us enough data to look at the last 47 slams.

In the last 47 grand slam finals, the favorite (defined simply as the guy with the better ATP ranking) won 33 times.  In 6 of the 14 slam finals in which the underdog won, the underdog had spent less time on court in his previous six matches than the favorite did in his.  Pretty good, huh?

One problem: Six other times, the favorite won the final despite having spent more time on court.  So if you have to pick between the favorite and the better-rested player, there’s nothing in this sample to differentiate your choices.

A more positive takeaway occurs when the favorite has spent less time on court.  There have been 35 such finals since 2001, and the better-rested favorite has gone 27-8.  Most of the time, the favorite has reached the final expending less effort than his challenger did, and perhaps we can view that as a confirmation of his status as favorite.

(If you prefer games played to minutes on court, perhaps in deference to the Nadal and Djokovic speed of play, rest assured the numbers come out almost identical.  There are a few cases where players spent less time on court but played more games–or vice versa–but if the analysis above replaced minutes with games, the results would be the same.)
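The tallies above fit into a two-by-two table, and laying them out that way makes the "nothing to differentiate your choices" point easy to verify.  A small Python sketch, with the counts taken from the text and the layout and names my own:

```python
# Keys are (who won the final, who spent less time on court getting there).
finals = {
    ("favorite", "favorite"): 27,
    ("favorite", "underdog"): 6,
    ("underdog", "underdog"): 6,
    ("underdog", "favorite"): 8,
}

total = sum(finals.values())

# Rule 1: always pick the favorite.
favorite_wins = sum(n for (winner, _), n in finals.items()
                    if winner == "favorite")

# Rule 2: always pick the better-rested finalist.
rested_wins = sum(n for (winner, rested), n in finals.items()
                  if winner == rested)

# Record of the favorite when he is also the better-rested player.
rested_favorite_record = (finals[("favorite", "favorite")],
                          finals[("underdog", "favorite")])
```

Both rules pick the winner in 33 of the 47 finals.  They disagree only in the 12 finals where the favorite spent more time on court, and those split 6-6.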

All else equal, we’d bet on the finalist who has spent less time on court.  But that doesn’t necessarily imply that the better-rested player is more likely to win the final because he hasn’t spent as much time on court.  That seems particularly true at slams, where players almost always get a day of rest between matches, and where top contenders almost never play doubles.

More likely is that one player spent less time on court because he is the favorite.  Surely no one was surprised when Federer breezed past Verdasco, and few were surprised that Murray needed more time to put away Feliciano Lopez.  Time on court is a clue that one man is playing better tennis, regardless of whether the extra rest aids him in later matches.

We can probably all agree on a safer claim: All else equal, the world’s best would certainly prefer to spend less time on court, even if it doesn’t boost his odds of winning the final.  It might be gratifying to fight off an early challenge, but surely it’s more enjoyable to remind the rest of the field why you’re the favorite.