The Most (and Least) Consistent Players on the ATP Tour

“Consistency” is one of the many terms that commentators frequently use but rarely define.  It’s often misused, too: we say we want a player to be more consistent, when we really just want him to stop playing badly.

To me, consistency for a tennis player is similar to the notion of “playing up to his ranking.”  In other words, if a player is consistent, he usually beats players ranked lower, and he usually loses to players ranked higher.  No player is perfect in this regard, but clearly, some are much more reliable than others.

A recent poster boy for inconsistency is Ernests Gulbis.  At Roland Garros, he lost to Blaz Kavcic, ranked 82nd in the world.  That was on clay, a surface on which Gulbis had posted some excellent results the previous year.  Two months later, in Los Angeles, he beat Juan Martin del Potro, someone he shouldn’t have even challenged on a hard court.

Quantifying consistency

With my player rankings and match prediction system, I’m able to assign a win probability to each player for every match.  For instance, when Ivan Dodig beat Nadal last week, I had given him a 14.4% chance of doing so.  As you might imagine, that’s a major upset–as I wrote the next day, it was the 10th-biggest upset of the season.

In these terms, an ideally consistent player will never be on either end of an upset.  If he is the favorite, he wins; if he is the underdog, he loses.  In practice, no tour-level player accomplishes this, though over the last two years, Florent Serra and Eduardo Schwank have come very close.

I’ve come up with a metric to measure consistency.  This is how it works:

  • Gather a list of all ATP-level matches for the desired time period.  (Today, I’m using everything from January 2010 through Montreal last week.)
  • Eliminate matches that ended in retirement or walkover, as well as those where we don’t have enough information to make an educated prediction.  (e.g. the first few comeback matches of Tommy Haas, or one with a wildcard playing his first professional match.)
  • For each player, count how many matches he played.
  • For each player, find the matches where he was the favorite and lost, or was the underdog and won.
  • For each of those matches, take the probability that the eventual winner would win (e.g. 18%, always under 50%), multiply by 100 (e.g. 18, not 18%), subtract it from 50 (e.g. 50 – 18 = 32), and square the result (e.g. 32*32 = 1024).
  • Sum all of the squares, then divide by the number of total matches–not just the ones where the favorite lost.

Whew!  In something more like layman’s terms, we’re taking all the upsets a player was involved in, coming up with a number to represent how big (or surprising) the upset was, then averaging the results.

Using this method, we give big upsets considerably more weight than mini-upsets.  If a player had a 45% chance of winning a match and ends up winning, it barely counts as an upset–and this system treats it accordingly.  By dividing by the total number of matches, we give consistency credit to players who win the matches they’re “supposed to” win, and lose those they are supposed to lose.
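For readers who prefer code, the steps above can be sketched in a few lines of Python.  This is only an illustration, not the code behind the tables: the function name `upset_score` and the input format (each match as a pair of the player's pre-match win probability and whether he won) are my assumptions.

```python
def upset_score(matches):
    """Average squared 'surprise' over all of a player's matches.

    `matches` is a list of (win_prob, won) pairs from the player's point of
    view: win_prob is his pre-match chance (0 to 1), won is True or False.
    This input format is an assumption made for the sketch.
    """
    if not matches:
        raise ValueError("need at least one match")
    total = 0.0
    for win_prob, won in matches:
        # Probability that had been assigned to the eventual winner.
        winner_prob = win_prob if won else 1.0 - win_prob
        if winner_prob < 0.5:  # the underdog won, so this match is an upset
            total += (50.0 - 100.0 * winner_prob) ** 2
    # Divide by ALL matches, not just the upsets.
    return total / len(matches)


# The worked example from the list: an 18% underdog wins, giving (50 - 18)^2.
print(upset_score([(0.18, True)]))
# The same upset diluted by three matches that went as expected.
print(upset_score([(0.18, True), (0.90, True), (0.30, False), (0.75, True)]))
```

The second call shows why dividing by total matches rewards reliability: the same upset counts a quarter as much once it sits alongside three results that went to form.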

Most importantly, the numbers this algorithm spits out are completely believable, matching up well with the conventional wisdom of which players are consistent and inconsistent.

The consistency of the top ten

The most consistent player on the tour, since the beginning of 2010, has been … Florent Serra.  Amazingly, Igor Kunitsyn comes in second.  But I doubt many of you care much about the consistency of guys like that.

Let’s start with the current top 10, ranked from most to least consistent:

Player              Upsets  Matches    Up%  UpsetScore  
David Ferrer            25      119  21.0%          55  
Rafael Nadal            16      131  12.2%          68  
Novak Djokovic          18      119  15.1%          69  
Roger Federer           21      123  17.1%          69  
Jo-Wilfried Tsonga      20       80  25.0%          75  
Mardy Fish              19       77  24.7%          82  
Tomas Berdych           32      113  28.3%         106  
Gael Monfils            24       89  27.0%         107  
Robin Soderling         23      115  20.0%         130  
Andy Murray             24       97  24.7%         151

The relevant column is the rightmost, “UpsetScore,” which is the result of the algorithm described above.  Ferrer has been part of more upsets than any of the top three (“Up%”), but his upsets are more minor.  Except for losses to Ivo Karlovic and Jarkko Nieminen early in the year on hard courts, Ferrer has not lost a match he had a 60% or better chance of winning.

The two ends of this list certainly line up with what I would have expected: Ferrer and Nadal are rock-solid (last week’s loss to Dodig notwithstanding), while Soderling and Murray both can be picked off by anybody, and frequently threaten higher-ranked players.

Right now, you may be tempted to put Djokovic higher on the list–after all, he’s ranked #1 and he’s beating everybody.  However, in the slightly longer term of 20 months, his movement around the top three has included some unexpected results, like losing to Ljubicic at Indian Wells last year, and victories over Federer and Nadal before his ranking suggested he would do so.

Tour wild cards

Outside of the top 10, there are a handful of players who are almost impossible to predict.  Some names that come to mind are Marcos Baghdatis, Ernests Gulbis, and Nikolay Davydenko, men who can take out one of the top three on a good day (well, maybe not Gulbis), but can lose to a qualifier on the next.

Player                 Upsets  Matches  Up%  UpsetScore  
Nikolay Davydenko          32       73  44%         273  
Marin Cilic                27       91  30%         180  
Marcos Baghdatis           38       89  43%         177  
Olivier Rochus             20       52  38%         164  
Milos Raonic               16       41  39%         164  
Juan Martin del Potro      11       48  23%         154  
Andy Murray                24       97  25%         151  
Jurgen Melzer              28       96  29%         150  
Fernando Verdasco          40      104  38%         150  
Ivan Ljubicic              27       69  39%         149  
Florian Mayer              30       72  42%         147  
Samuel Querrey             26       66  39%         146  
Andrei Goloubev            24       55  44%         143  
Ernests Gulbis             22       69  32%         140  
Jeremy Chardy              31       65  48%         133  
Juan Monaco                26       73  36%         131  
Robin Soderling            23      115  20%         130  
Michael Llodra             26       67  39%         130  
Rainer Schuettler          15       42  36%         119  
Mikhail Youzhny            25       78  32%         116

The “upset score” number tells the story for Davydenko.  The man who beat Nadal at the beginning of the year and threatened Djokovic last week recently suffered defeat at the hands of Cedrik-Marcel Stebe (twice!) and Antonio Veic.

While no one is in Davydenko’s league, names like Cilic, Baghdatis, Murray, and Verdasco seem appropriate.  Verdasco, along with Melzer and Milos Raonic, points to a flaw in this approach: the algorithm reads very fast improvement or decline as inconsistency, which isn’t quite right.  Yes, Raonic has shocked the tennis world repeatedly this season, but he hasn’t mixed in too many disastrous losses alongside the surprise upsets.  I tinkered with ways to include that in the model, but nothing worked very well.

A couple more interesting notes from the “most inconsistent” players are found in the upset percentage column.  Guys like Davydenko, Baghdatis, Mayer, Goloubev, and Chardy are involved in upsets nearly half the time.  Chardy is highest in that category.  In fact, if I expanded the study to challenger events, he might rocket to the top of this list, as he plays quite a few, and often manages to lose against players outside the top 100.

The consistent ones

The flip side is considerably less star-studded.  In the 20 most-consistent players of the last 19-20 months, Ferrer is the only top-10 guy present, though #11 Nicolas Almagro is there as well.

Here’s my seat-of-the-pants theory.  In this sense, “consistent” isn’t good.  Yes, “consistent” sounds good, especially when “inconsistent” means Davydenko losing to Antonio Veic or Mayer falling to Federico del Bonis.  But inconsistent also means Davy beating Federer and Mayer beating Soderling.  So, the players who show up as “most consistent” are in fact consistent, but they are also mediocre.  Their consistency (perhaps a mental advantage) has helped them move up from the top 200 to the top 50 or 100, but that’s all they can do.

Ferrer and Almagro are good examples of this, actually.  Neither has the weaponry that makes commentators say, “This guy could be number one!”  But they’ve earned their rankings by regularly reaching the quarters and semis of tournaments, not suffering the boneheaded losses that afflict the likes of Cilic and Baghdatis.

All that said, here’s the list:

Player            Upsets  Matches  Up%  UpsetScore  
Florent Serra         11       56  20%          23  
Igor Kunitsyn         14       40  35%          33  
Ilia Marchenko        14       46  30%          40  
Potito Starace        28       81  35%          46  
Victor Hanescu        26       77  34%          50  
Tobias Kamke          12       41  29%          52  
Andreas Seppi         24       81  30%          53  
Julien Benneteau      23       59  39%          53  
Viktor Troicki        25      101  25%          54  
David Ferrer          25      119  21%          55  
Fabio Fognini         18       71  25%          55  
Pere Riba             13       41  32%          56  
Lukas Lacko           14       44  32%          57  
Igor Andreev          17       62  27%          58  
Lukasz Kubot          26       63  41%          59  
Nicolas Almagro       22      112  20%          59  
Frederico Gil         15       40  38%          60  
Denis Istomin         25       76  33%          65  
Jarkko Nieminen       25       74  34%          66  
John Isner            21       82  26%          67

These lists hardly represent the final word on who is or is not consistent–for one thing, I haven’t said anything about consistency within matches, which may be a completely separate issue.  But this approach does, I think, provide some insight into who is more likely to be part of an upset, and suggests that consistency might not be such a good thing after all.

Andy Murray and The Worst Upsets of the Year

On Tuesday in Montreal, Andy Murray played an ugly, listless match against world #35 Kevin Anderson, losing 6-3 6-1.  While Murray has played some solid matches this year and is in no immediate danger of losing his top-four ranking, the Anderson loss is hardly the first disaster of his season.  Back in Indian Wells and Miami, he managed to lose to Donald Young and Alex Bogomolov in successive matches.  Ouch.

Using my rankings and match projection system, I’ve generated win probabilities for every ATP match of the season.  Combined with match outcomes, that allows us to find the upsets that were least expected on that surface, at that time.
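Here is a minimal Python sketch of that ranking step.  The four real probabilities are the ones quoted in this post; the tuple layout, the function name `biggest_upsets`, and the “Player A / Player B” row are placeholders I made up for illustration.

```python
def biggest_upsets(matches, threshold=0.5):
    """Return upsets sorted from most to least surprising.

    Each match is (winner, loser, winner_prob), where winner_prob is the
    pre-match chance given to the player who ended up winning.
    """
    upsets = [m for m in matches if m[2] < threshold]
    return sorted(upsets, key=lambda m: m[2])


matches = [
    ("Kevin Anderson", "Andy Murray", 0.163),
    ("Donald Young", "Andy Murray", 0.086),
    ("Player A", "Player B", 0.700),  # hypothetical: the favorite won, so no upset
    ("Alex Bogomolov", "Andy Murray", 0.101),
    ("Ivan Dodig", "Rafael Nadal", 0.144),
]

for rank, (winner, loser, prob) in enumerate(biggest_upsets(matches), start=1):
    print(f"{rank}. {winner} d. {loser} ({prob:.1%})")
```

Lowering `threshold` to 0.2 would keep only the kind of long shots that make a “biggest upsets” table.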

Pre-match, my system gave Anderson a 16.3% chance of beating Murray–only a smidge better than Dodig’s 14.4% against Nadal.  (My system has never given the South African much credit; his hard-court ranking right now is #58.)  In fact, Anderson was the 4th-biggest underdog going into the 2nd round, ahead only of Dodig, Michael Russell, and Vasek Pospisil.

As it turns out, Anderson’s victory was the 14th-biggest upset win of the ATP season.  (I took out retirements and “comeback players,” like Fernando Gonzalez and Tommy Haas, whose rankings aren’t very predictive.)  That’s 14 out of nearly 1,700.

But, as you might guess, 14th-best of the season isn’t enough to be 1st with Murray on the losing end.  The Murray loss to Young in March is the biggest upset of the year–Donald entered the match with an 8.6% chance of winning.  The Bogomolov match comes in 4th overall; the American had a 10.1% chance before play began.

Edit: This is what I get for writing a draft the night before!  Dodig’s upset victory comes in tied for #10 on the season, pushing Murray/Anderson down one more spot on the list.  Nadal will go home having suffered the biggest upset of the Rogers Cup, though he played far and away better than Murray did to achieve the same outcome.

The biggest upsets of the year

I couldn’t possibly give you those numbers without following through with a complete table.  Here are the 36 matches where the winner entered the match with less than a 20% chance of winning.  This list is through last week’s matches, so it doesn’t yet show Murray’s latest meltdown and Dodig’s shocker.

(This site doesn’t show wide tables very well; click here for a clearer version.)

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
 8.6%     Donald Young           Andy Murray         Indian Wells     7-6(4) 6-3
 9.4%     Bernard Tomic          Robin Soderling     Wimbledon        6-1 6-4 7-5
 9.7%     Jimmy Wang             Igor Kunitsyn       Newport          4-6 7-5 6-2
10.1%     Alex Bogomolov         Andy Murray         Miami            6-1 7-5
11.6%     Stephane Robert        Tomas Berdych       French Open      3-6 3-6 6-2 6-2 9-7
13.1%     Milos Raonic           Mikhail Youzhny     Australian Open  6-4 7-5 4-6 6-4
13.8%     Denis Kudla            Grigor Dimitrov     Newport          6-1 6-4
14.0%     James Ward             Stanislas Wawrinka  Queen's Club     7-6(3) 6-3
14.1%     Federico del Bonis     Florian Mayer       Stuttgart        6-2 6-3
14.4%     Jo-Wilfried Tsonga     Rafael Nadal        Queen's Club     6-7(3) 6-4 6-1               

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
15.0%     Jan Hernych            Thomaz Bellucci     Australian Open  6-2 6-7(11) 6-4 6-7(3) 8-6
15.2%     Nikolay Davydenko      Rafael Nadal        Doha             6-3 6-2
15.6%     Andrey Kuznetsov       Marcos Baghdatis    Casablanca       6-4 4-6 6-4
16.4%     Flavio Cipolla         Andy Roddick        Madrid Masters   6-4 6-7(7) 6-3
16.6%     Lukas Rosol            Jurgen Melzer       French Open      6-7(4) 6-4 4-6 7-6(3) 6-4
16.8%     Antonio Veic           Nikolay Davydenko   French Open      3-6 6-2 7-5 3-6 6-1
17.0%     Sergei Bubka Jr.       Daniel Gimeno       Doha             6-0 6-3
17.1%     Leonardo Mayer         Marcos Baghdatis    French Open      7-5 6-4 7-6(6)
17.6%     Ivan Dodig             Robin Soderling     Barcelona        6-2 6-4
17.6%     Michael Yani           Dudi Sela           Newport          7-6(5) 6-3                   

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
17.7%     Frank Dancevic         Feliciano Lopez     Johannesburg     6-7(5) 6-2 7-6(8)
17.8%     Alexander Dolgopolov   Robin Soderling     Australian Open  1-6 6-3 6-1 4-6 6-2
17.9%     James Ward             Samuel Querrey      Queen's Club     3-6 6-3 6-4
17.9%     Jan Hernych            Sergey Stakhovsky   Halle            6-3 6-7(5) 7-6(8)
18.2%     Jan Hernych            Denis Istomin       Australian Open  6-3 6-4 3-6 6-2
18.3%     Jo-Wilfried Tsonga     Roger Federer       Wimbledon        3-6 6-7(3) 6-4 6-4 6-4
19.0%     Milos Raonic           Fernando Verdasco   San Jose         7-6(6) 7-6(5)
19.0%     Lukasz Kubot           Gael Monfils        Wimbledon        6-3 3-6 6-3 6-3
19.2%     Federico del Bonis     Sergey Stakhovsky   Stuttgart        6-4 6-3
19.3%     Lukasz Kubot           Nicolas Almagro     French Open      3-6 2-6 7-6(3) 7-6(5) 6-4    

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
19.3%     Bernard Tomic          Feliciano Lopez     Australian Open  7-6(4) 7-6(3) 6-3
19.7%     Thomaz Bellucci        Andy Murray         Madrid Masters   6-4 6-2
19.7%     Pavol Cervenak         Victor Hanescu      Stuttgart        6-3 7-6(6)
19.7%     Richard Gasquet        Roger Federer       Rome Masters     4-6 7-6(2) 7-6(4)
19.9%     Rajeev Ram             Grigor Dimitrov     Atlanta          6-4 6-4
20.0%     Philipp Kohlschreiber  Robin Soderling     Indian Wells     7-6(8) 6-4

Do Points Get Shorter as the Match Progresses?

On Friday, some interesting ideas were batted around in the comments to my post on the 61-shot rally.  One of the simpler ones boils down to the question that titles today’s post: Do points get shorter as the match progresses?

Two forces seem to work in opposite directions:

  • As players get used to each other’s games (and specifically their serves), more balls get returned.  Before looking at the numbers, I would’ve bet that this was the case, meaning that aces and service winners decline as you go deeper into a match.
  • The longer the match, the more tired the players.  Tired (or even slightly injured) players take more risks and probably have shorter rallies.

To answer the question, I looked at rally lengths shown in Pointstream at the last three grand slams.  That gives us close to 250 men’s matches, all best-of-five sets.

The short, unsatisfying conclusion is: The results are mixed.  At Wimbledon and Roland Garros, rally length increased later in matches–as much as 10% in London and 20% in Paris.  At the Australian Open, the result was the exact opposite, with rally length decreasing substantially.  Perhaps rally length increases in most cases, except when it is extremely hot or the players are not yet in top shape.  The blistering heat in Melbourne is certainly a plausible reason for a decrease in rally length.

As we’ll see when we move into more specific findings, the results get even more jumbled.  It seems that points generally get longer as a match progresses, but not necessarily because players read and return serves better.  While rally lengths increase, the number of one-stroke points (aces, service winners, service return errors) often increases, as well.
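One way to run this kind of comparison is to average rally length separately for each set, and to track the share of one-stroke points alongside it.  The sketch below is not the methodology behind the numbers above; the function name and the (set number, rally length) input format are assumptions, and the sample points are made up purely to show the mechanics.

```python
from collections import defaultdict


def rally_length_by_set(points):
    """Average rally length and one-stroke share for each set.

    `points` is an iterable of (set_number, rally_length) pairs.
    Returns {set_number: (mean_length, one_stroke_share)}.
    """
    by_set = defaultdict(list)
    for set_number, rally_length in points:
        by_set[set_number].append(rally_length)
    return {
        s: (
            sum(lengths) / len(lengths),
            sum(1 for n in lengths if n == 1) / len(lengths),
        )
        for s, lengths in sorted(by_set.items())
    }


# Made-up points, NOT Pointstream data.
sample = [(1, 1), (1, 4), (1, 7), (2, 1), (2, 1), (2, 13)]
for set_number, (mean_len, one_stroke) in rally_length_by_set(sample).items():
    print(set_number, round(mean_len, 2), round(one_stroke, 2))
```

The toy data is rigged to show how both things can be true at once: the average rally gets longer in the second set even though more points end after a single stroke.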

Follow the jump for my methodology and full results.


The Simon/Monfils 61-Shot Rally: In Perspective

A couple of weeks ago, Gael Monfils and Gilles Simon made the unorthodox decision of extending their warm-up into the first game of the match.  Or something.  At 40-40 in the opening game, they counterpunched each other into oblivion, needing sixty-one shots before Monfils finally sent a slice long to end the point.

If you haven’t seen it, or you suffer from insomnia, click the link here.

What might be most remarkable about the rally is that, when Monfils made his error, there was no sign of the point drawing to a close — it isn’t hard to imagine those two hitting another 61 shots like that.  But even at 61, it’s an awfully long point.

So (asks the statistician) … how long was it?  Rally length is not widely available for ATP matches.  But thanks to IBM Pointstream, I do have rally length for each point on a Hawkeye court from the French Open.  (I’ve played around a bit with those numbers.)

From the French Open, we have roughly 20,000 men’s points to look at, which doesn’t count double faults.  About 35% of those points lasted only one stroke: an ace, a service winner, or an error of some sort on the return.  Only 15% of the points went 8 strokes or longer, and fewer than 10% reached 10 strokes.
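Those shares are simple to compute once you have one rally length per point.  A rough sketch, assuming the lengths sit in a plain Python list (the function names are mine, and the sample list is invented, not the French Open data):

```python
def share_at_least(rally_lengths, n):
    """Fraction of points whose rally lasted n strokes or more."""
    if not rally_lengths:
        raise ValueError("no points supplied")
    return sum(1 for length in rally_lengths if length >= n) / len(rally_lengths)


def share_one_stroke(rally_lengths):
    """Fraction of points that ended after a single stroke."""
    if not rally_lengths:
        raise ValueError("no points supplied")
    return sum(1 for length in rally_lengths if length == 1) / len(rally_lengths)


# Invented rally lengths for ten points (double faults already excluded).
sample = [1, 1, 3, 5, 8, 12, 1, 2, 9, 4]
print(f"one stroke: {share_one_stroke(sample):.0%}")
print(f"8 or more:  {share_at_least(sample, 8):.0%}")
print(f"10 or more: {share_at_least(sample, 10):.0%}")
```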

In the entire tournament, only 12 rallies hit the 30-shot mark–only halfway to the Simon/Monfils level.  You won’t be surprised at most of the names involved in those dozen extreme points:

Mardy Fish    Gilles Simon       38  
Andy Murray   Viktor Troicki     37  
Gilles Simon  Robin Soderling    36  
David Ferrer  Sergiy Stakhovsky  33  
Andy Murray   Viktor Troicki     33  
David Ferrer  Gael Monfils       33  
Rafael Nadal  Pablo Andujar      32  
Tobias Kamke  Viktor Troicki     31  
David Ferrer  Sergiy Stakhovsky  31  
Rafael Nadal  Andy Murray        31  
Rafael Nadal  Pablo Andujar      30  
Andy Murray   Viktor Troicki     30

Both Simon and Monfils make an appearance, with Ferrer, Murray, and Nadal showing up multiple times.  What surprises me a bit are some of the guys who hung in there with the counterpunchers, especially Fish and Troicki.

In any event, 61 shots still stands out as a once-in-a-blue-moon accomplishment.

WTA rally length

Incidentally, you might suspect (as I did) that some WTA players would slug it out even longer.  Again using Pointstream data from the Hawkeye courts at the French, it turns out that ladies only reached the 30-shot threshold twice.  Marion Bartoli went to 33 against Olga Govortsova, and Na Li got to 32 shots against Silvia Soler-Espinosa.  The tongue-tying Wozniacki-Wozniak matchup comes in third, with a 28-stroke rally.

Wimbledon rallies

While we’re at it, let’s check the Wimbledon data.  Surprise, surprise–tied for the longest rally of the tournament is a 31-stroke exchange between Juan Martin del Potro and … Gilles Simon.  In fact, that match featured four of the 20 longest rallies of the tournament.

Also notable is Novak Djokovic, who reached 31, 30, and 29 against Bernard Tomic, and 25 (twice) and 24 against Marcos Baghdatis.

The true oddity in the top ten is John Isner and Nicolas Mahut, who somehow took a break from aces and errant groundstrokes to go 25-deep.  It was the only point of the match that went longer than 12 shots.

Doubly Lopsided Matches

On Sunday, Novak Djokovic beat Rafael Nadal by a somewhat unusual score: 6-4 6-1 1-6 6-3.  A four-setter in the final doesn’t raise any eyebrows, but a 1-6 set … that’s a bit of a head-scratcher, especially on a fast surface.  Wimbledon is better known for server domination, which means 6-4’s, 7-5’s, tiebreaks, and the occasional 70-68.

The Djokovic-Nadal score got me curious about two questions:

  1. How often does a player lose a set 1-6 (or even 0-6) yet still win the match?
  2. How often does a player both win and lose a lopsided (6-1 or 6-0) set?

(Note: Yes, sometimes a 6-1 set includes only two breaks, in which case it is similar to a 6-2 set.  Yet 6-1/1-6’s are far less frequent than 6-2/2-6’s.  It would be nice to distinguish “two-break” 6-1’s from “three-break” 6-1’s, but for now, all we can do is enjoy the trivia and accept the limitations.)

Bi-directional bagels

First things first.  As we might guess, scores such as these are extremely rare at Wimbledon.  This year, the final was one of only two such matches.  The other was Xavier Malisse’s second-round win over Florian Mayer, which went in the books as 1-6 6-3 6-2 6-2.  Last year, only one Wimbledon match qualified: a first-rounder between Victor Hanescu and Andrey Kuznetsov.  Oddly enough, Hanescu dropped the third set 1-6 after splitting two tiebreaks.  In neither of these matches did the winner take his own lopsided set, as Djokovic did.

In this department, Wimbledon remains unique among the majors–it isn’t just a matter of “clay” and “everything else.”  At this year’s Australian Open, there were eight matches with 1-6 or 0-6 scores; last year there were 11.  At the 2010 US Open, there were six.  These scores are more common at the slams, because the five-set format makes it more likely that the loser of an early set (by any score) can come back to win the match.

The numbers

Last year, there were roughly 2600 tour-level matches that were played to their conclusion.  (That is, neither player retired.)   Of those, about two-thirds were straight-set victories, leaving us with 871 matches that went three sets (or five, at the slams).

Of those 871, only 94 matches contained a 1-6 or 0-6 set, and only 30 included a “lopsided” set in favor of both players, as in the Nadal-Djokovic final.  Both have been somewhat less frequent so far this year; in 1546 matches, 48 saw the winner lose a lopsided set, and 11 saw both players lose a lopsided set.  Combining the two years of data, the likelihood that any given match will include a 6-1 (or 6-0) and a 1-6 (or 0-6) is almost exactly 1 in 100.  Again, the five-set format of the slams increases the probability a bit, while the fast courts at Wimbledon have the reverse effect.
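To check the “1 in 100” figure, here is the arithmetic spelled out in Python, using only the counts quoted above.  The variable names are mine, and remember that the 2600 is itself a rounded number.

```python
# Figures quoted in the text.
matches_last_year = 2600          # "roughly 2600" completed tour-level matches
matches_this_year = 1546
both_lopsided_last_year = 30      # matches with a lopsided set for each player
both_lopsided_this_year = 11

both_lopsided = both_lopsided_last_year + both_lopsided_this_year
all_matches = matches_last_year + matches_this_year
rate = both_lopsided / all_matches

print(f"{both_lopsided} of {all_matches} matches: {rate:.2%}")
print(f"about 1 in {1 / rate:.0f}")
```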

The offenders

Which players find themselves in these roller-coaster matches?  To answer that question, we have to stick with the less-specific filter of matches that include a 1-6 or 0-6 set.  If we also require a 6-1/6-0 from the winner, there isn’t enough data to make things interesting.

One might guess that the strongest servers would be far down the list, while counterpunchers populate the top.  That isn’t the case.  The players who are known for mental lapses–regardless of their serving and returning skills–seem to dominate the upper tier.

Looking at all tour-level matches from 2007 through last week, we find that Andy Murray takes the cake.  He has played in 18 of these matches, dropping a lopsided set in 10 of his victories, while winning a lopsided set in 8 of his losses.  Murray is in a class by himself–number two on the list is Guillermo Garcia-Lopez, at 13.  In third place is Djokovic, with 12 (he is 8-4 in such matches), though the Wimbledon final was the only occurrence so far in 2011.

Twelve men are clustered at 10 and 11 of these matches, and the list features a lot of Frenchmen, and several other players known for questionable mental strength:

  • 11: Julien Benneteau, David Ferrer, Fabio Fognini, Fernando Verdasco
  • 10: Thomaz Bellucci, Mardy Fish, Richard Gasquet, Paul-Henri Mathieu, Philipp Petzschner, Tommy Robredo, Radek Stepanek, Jo-Wilfried Tsonga

Of these, Fognini (9-2) and Tsonga (8-2) have the dubious honor of winning the most matches–that is, they are on the list because they drop lopsided sets in matches that they win.  Mathieu (2-8) is at the other extreme, dominating sets in the middle of losses.

The Wimbledon final was a rarity for Nadal–it was only the fourth time he’d been involved in a match with this sort of score, and it was only the second time he won a lopsided set in the middle of a loss.  Roger Federer has only played in three such matches.

We probably can’t read too much into these numbers, but it is interesting to see so many of the same types of players show up at the top of a list.  At the very least, we’ve learned that the 1-6 set in Sunday’s final was quite rare, and the 6-1 1-6 sequence was even rarer.

The Weak, Weak Newport Field

The ATP 250-level tournament in Newport this week is empty of the game’s best players.  The top seed is John Isner, ranked 46, and the 8th seed is Tobias Kamke, who is barely within the top 100.  This is no surprise.  Newport has one of the weakest ATP fields every year, situated as it is the week after Wimbledon, simultaneous with Davis Cup.

In a little study I did last year, I discovered that at least in 2009, Newport did have the weakest field of any ATP 250 event.  If you click the link, you’ll find a variety of metrics, but I think we can focus on just one: the median rank of main draw players.  By using median instead of average, the numbers aren’t skewed by a lowly-ranked wild card or qualifier.

In 2009, the players in the Newport draw had a median ranking of 125–that is, half the players in the main draw of an ATP event were ranked outside the top 125.  Grand slams usually manage about 110 players inside the top 125, but Newport only got 16–and most of those were closer to 125 than to 1.  Last year, the median slipped to 129.5.  It may be a small consolation that Johannesburg’s field was equally weak.
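To see why the median is the safer yardstick, here is a toy example in Python.  The eight rankings are invented (they are not a real Newport draw), and `field_strength` is just a name I picked for the helper.

```python
from statistics import mean, median


def field_strength(ranks):
    """Return (median rank, mean rank) for a list of main-draw rankings."""
    return median(ranks), mean(ranks)


# Hypothetical field: seven players ranked 46 to 140, plus one wild card at 900.
ranks = [46, 60, 85, 98, 110, 125, 140, 900]
med, avg = field_strength(ranks)
print("median:", med)
print("mean:  ", avg)

# Drop the wild card and see how much each number moves.
med2, avg2 = field_strength(ranks[:-1])
print("median without wild card:", med2)
print("mean without wild card:  ", avg2)
```

With an even number of players the median is the midpoint of the two middle rankings, which is how a figure like 129.5 can turn up.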

A glance at this year’s draw can tell you that not much has changed.  Thanks to many late withdrawals, the cut fell to #218, which is considerably weaker than the cut at some challengers.  For all that, the field quality has improved somewhat, to a median rank of 111.  That leaves Jo’burg in the dust; the South African event had a median rank of 118.5.

The non-challenger challengers

A few tour-level events–Newport, Jo’burg, and perhaps San Jose–obscure the line between the tour and challenger levels.  In the eyes of the ranking system, they are very different–Newport is worth 250 points to the winner, while no challenger is worth more than 125.  But for all intents and purposes, Newport and Jo’burg are challengers.

Last year, the May event in Bordeaux attracted a field with a median rank of 128–slightly stronger than last year’s Newport and Jo’burg numbers.  This March, the odd 24-man field at Le Gosier had a median rank of 123.  Already in 2011, six challengers with 32-man fields had median ranks better than 150, putting them in the same ballpark as the lowest rungs of the tour.

All of this is another strike against the ranking system, which treats Newport as if it were equivalent to, say, Sydney, where the last direct acceptance this year (#53 Benjamin Becker) was higher-ranked than Newport’s second seed (#60 Grigor Dimitrov).  Bad news for properly ordering second-tier pros, but good news for Isner, who can take advantage of this week’s cupcake draw to bounce back to as high as #36.

Bernard Tomic and the ATP Top 100: In Perspective

With his quarterfinal showing at Wimbledon, Bernard Tomic will break into the ATP top 100 for the first time on Monday.  He’ll do so with style, jumping from #158 to approximately #70.  (He will be considerably higher in my rankings–before the tournament, I had him just inside the top 50.)

As I’ve written before, a player’s chances of reaching the top of the men’s game have a lot to do with how early he cracks the top 100.  If you’re going to be a top-tenner, odds are you’re flashing some measure of those skills as a teenager.  In fact, to quote myself:

In the last 30 years, only one #1-ranked player (Pat Rafter) hadn’t reached the top 100 as a teenager, and he made it into the top 100 when he was 20.  Almost every eventual top-10 player had broken into the top 100 by age 21.

In that sense, Tomic is well ahead of the curve.  He doesn’t turn 19 until October, making him five months younger than Ryan Harrison, another teenager soon to break into the top 100.  Reaching #70 at such a young age isn’t a guarantee of future success, but it strongly points in that direction.  Again from my earlier post: 11% of players who cracked the top 100 at age 18 went on to become #1, and more than half (61%) eventually reached the top ten.

Tomic’s “comps”

Let’s take a narrower look and examine the 20 players who broke into the top 100 at ages closest to Tomic’s current age of 18.7 years.  It’s an impressive list, including Andy Roddick and Ivan Lendl, along with another 11 top-tenners.  Of these players, the only “busts” were Andreas Vinciguerra (peak ranking: 33), Richard Fromberg (peak: 24), and Evgeny Korolev, who may yet improve on his peak ranking of 46.

In this group of 20 players, the average peak ranking is 11, and the median peak ranking is 8.  The average number of weeks in the top 100 is 362 (roughly eight years) and the median number of weeks is 410 (more than nine years).  Even 410 slightly understates a reasonable projection, since a few of these players (Roddick, Gael Monfils, Tommy Robredo, and Mikhail Youzhny) are guaranteed to add to their totals.

What may be most impressive about Tomic’s ranking at such a young age is that he has accomplished it the hard way.  He’s gotten plenty of wild cards–including at the Australian Open, where he reached the third round–but he qualified at Wimbledon, and a substantial chunk of his ranking points come from the challenger level, where he has reached four semifinals in 2011 alone.  His only “cheap” points are from Indian Wells, where he was wildcarded in, then beat Rohan Bopanna in the first round.

Now, Tomic’s ranking ensures that wild cards won’t be an issue, except at a few Masters 1000 tournaments.  If history is any guide, he’ll be a regular feature in the top echelon of the tour for most of this decade.

Fun With French Open Rally Length

At the ATP level, the ability to hang around in long rallies seems to be a key to success, especially on clay.  Most of the top players are such good defenders that a one-dimensional serve/forehand game just doesn’t cut it.

One stat you’ll occasionally see on television broadcasts is the number of points that reach a certain length, along with how each player is performing on those points.  The cutoff I’ve seen most frequently is 8 shots, and that seems like a reasonable enough line to draw.

Armed with point-by-point data from (most of) the men’s singles matches at the French Open, we can take a closer look.  The following table shows three numbers for each of the 16 players who reached the 4th round:

  • Average rally length–that’s the total number of shots per point for every point that the player contested.
  • Percentage of points that reached eight or more shots.
  • Percentage of eight-or-more-shot rallies that the player won.

PLAYER              Shots/Pt     8+  8+Wins
Juan Ignacio Chela       5.3  25.7%   48.0%  
Gilles Simon             5.3  25.7%   59.8%  
Andy Murray              5.1  22.8%   50.5%  
Viktor Troicki           4.7  19.2%   48.9%  
Rafael Nadal             4.6  18.3%   56.5%  
Robin Soderling          4.6  19.5%   55.1%  
David Ferrer             4.5  16.8%   70.7%  
Alejandro Falla          4.5  19.7%   47.9%  
Gael Monfils             4.3  17.3%   44.8%  
Albert Montanes          4.3  15.2%   46.1%  
Fabio Fognini            4.3  15.5%   59.5%  
Novak Djokovic           4.1  16.0%   63.6%  
Richard Gasquet          4.0  13.9%   57.0%  
Roger Federer            3.9  14.0%   49.7%  
Ivan Ljubicic            3.7  11.8%   49.4%  
Stanislas Wawrinka       3.6  11.1%   46.2%

Unsurprisingly, the first two stats correlate quite closely.  The more eight-shot rallies you play, the higher your per-point average will be.  What may be more of a surprise is that the number of eight-shot rallies you play doesn’t appear to have much effect on your success in eight-shot rallies.  Andy Murray may be an instructive example here: He’s good at keeping himself in long points, but not always so good at doing what he needs to do to win them.
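
To put rough numbers on those two observations, here is a minimal Python sketch that computes Pearson correlations from the table above.  The `pearson` helper and the variable names are introduced only for illustration, and 16 rows is far too few to treat the output as more than a sanity check.

```python
from math import sqrt

# (shots per point, % of points reaching 8+ shots, % of 8+ shot rallies won),
# copied row by row from the table above.
ROWS = [
    (5.3, 25.7, 48.0), (5.3, 25.7, 59.8), (5.1, 22.8, 50.5), (4.7, 19.2, 48.9),
    (4.6, 18.3, 56.5), (4.6, 19.5, 55.1), (4.5, 16.8, 70.7), (4.5, 19.7, 47.9),
    (4.3, 17.3, 44.8), (4.3, 15.2, 46.1), (4.3, 15.5, 59.5), (4.1, 16.0, 63.6),
    (4.0, 13.9, 57.0), (3.9, 14.0, 49.7), (3.7, 11.8, 49.4), (3.6, 11.1, 46.2),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


shots = [row[0] for row in ROWS]
long_share = [row[1] for row in ROWS]
long_wins = [row[2] for row in ROWS]

# Rally length vs. share of long rallies, then share of long rallies vs. success in them.
r_length_vs_share = pearson(shots, long_share)
r_share_vs_wins = pearson(long_share, long_wins)

print(f"shots/pt vs 8+ share: {r_length_vs_share:.2f}")
print(f"8+ share vs 8+ wins:  {r_share_vs_wins:.2f}")
```

On this data the first correlation comes out strongly positive and the second close to zero, which matches the reading of the table given above.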

These numbers are far from authoritative–none of these stats covers more than seven matches, and many cover only four.  With so little data, a single opponent can skew the numbers.  For instance, Nadal was closer to the top of the rally-length leaderboard in the Australian Open, but at the French, a disproportionate number of his points came against John Isner, who is normally at the extreme other end.  Matches against Ljubicic and Federer also kept Nadal’s average down.

The same warning should be made about Ferrer’s impressive 70.7% winning percentage on long points.  I don’t doubt that he’s usually quite good in such rallies, but his four matches included two against players who are the exact opposite: Jarkko Nieminen and Sergiy Stakhovsky.

As more data of this sort becomes available, it will be interesting to see what trends emerge.

The French Open’s New Balls

Much has been made of the new balls at Roland Garros this year–players have complained that they are lighter, heavier, that they bounce differently.  As far as bounce and spin are concerned, there isn’t much we can glean from the available data.  But we can take a broad look at server dominance to get a sense of how the French is playing this year.

Several months ago, I looked at most of the ATP-level matches from 2010, and determined that the server wins points on different surfaces at the following rates:

  • Clay: 61.5%
  • Hard: 63.7%
  • Grass: 65.9%

The gaps between those numbers may not look very big, but they are a major indicator of the differences between surfaces.  If the gap between clay and hard is only 2.2 percentage points, then 2.2 percentage points must be a pretty big deal!

I’ve also determined the following regarding ace rates–again, using 2010 data.  “Ace rate” is simply the percentage of serves that are aces:

  • Clay: 5.5%
  • Hard: 8.5%
  • Grass: 10.5%

Now that’s a big difference.

Roland Garros

What about the French?  Taking the 2010 event as a whole, players won 62.4% of service points, and served aces 6.6% of the time.  Thus, I think it’s reasonable to conclude that the courts at RG last year played faster than most other clay-court ATP events.  Weather and–you guessed it–other equipment, such as balls, can also make a difference.

This year, 112 of the 127 men’s matches have already been played, including over 17,000 points, so I think it’s safe to start drawing conclusions.  Servers are winning only 62.0% of points–roughly halfway between the clay-court average and the results from last year’s “fast” RG.  More dramatically, players are scoring aces on only 5.6% of points, well below last year’s figure at the French.

I can’t shed any light on the specific quirks shown by the new balls, but for whatever reason, the French is playing more like an average clay-court event than it did last year.
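
As a quick check on the arithmetic, the sketch below places each edition of the French on a scale running from the 2010 clay-court average (0.0) to the 2010 hard-court average (1.0), using only the server-win percentages quoted in this post.  The `position_between` helper is a name made up for this example, and a linear scale is just one convenient way to compare the figures.

```python
# Server points won (%), as quoted in this post.
CLAY_AVG_2010 = 61.5
HARD_AVG_2010 = 63.7
RG_2010 = 62.4
RG_2011 = 62.0


def position_between(value, low, high):
    """Where `value` sits on a linear scale from `low` (0.0) to `high` (1.0)."""
    return (value - low) / (high - low)


# Midpoint between the clay average and last year's "fast" Roland Garros.
midpoint = (CLAY_AVG_2010 + RG_2010) / 2

rg_2010_pos = position_between(RG_2010, CLAY_AVG_2010, HARD_AVG_2010)
rg_2011_pos = position_between(RG_2011, CLAY_AVG_2010, HARD_AVG_2010)

print(f"midpoint of clay average and RG 2010: {midpoint:.2f}%")
print(f"RG 2010 position (0 = clay, 1 = hard): {rg_2010_pos:.2f}")
print(f"RG 2011 position (0 = clay, 1 = hard): {rg_2011_pos:.2f}")
```

The midpoint works out to 61.95%, so this year's 62.0% does sit almost exactly halfway, and on the clay-to-hard scale the 2011 event lands noticeably closer to the clay end than the 2010 event did.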

Winners and Losers in the French Open Draw

Yesterday I offered a full breakdown of the French Open draw, with each player’s chances of advancing to each of several rounds.  In any draw of this kind, there are winners and losers, thanks to the luck of the … well, draw.

In a 128-player field seeded in the manner that Grand Slams are seeded, the number of possible draws is astronomically large, far more than anyone could enumerate.  The vast majority of the differences don’t matter.  For instance, if you swapped Thiemo de Bakker and John Isner so that de Bakker played Nadal in the first round and Isner played Djokovic in the first round (instead of vice versa), no one’s chances of winning the title would change much.

But, of course, many of the possible permutations would matter a whole lot.  Just ask de Bakker or Isner!  Imagine how much better it would be for Isner to have a first round draw against, say, Yen-Hsun Lu, followed by a probable second-rounder against Sergiy Stakhovsky.  In fact, that’s Kei Nishikori’s draw, and in that sense, Nishikori was very lucky that the chips fell where they did.

Stepping back

In my previous draw simulations–like the one I published yesterday–I took the actual bracket as a given.  To generate the probabilities you see in yesterday’s chart, I had a computer program “play” the tournament 100,000 times, each time pitting Isner against Nadal in the first round, then the winner of that match against the winner of Giraldo/Andujar, and so on.
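
For readers who want to see the mechanics, here is a minimal Python sketch of that kind of fixed-bracket simulation.  It is not the program behind yesterday's chart: the Elo-style `win_prob` formula, the eight-player bracket, and every rating below are hypothetical stand-ins, since the actual prediction system isn't reproduced here.

```python
import random
from collections import Counter


def win_prob(rating_a, rating_b):
    """Hypothetical Elo-style chance that A beats B (an assumption, not the real model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def play_bracket(bracket, ratings, rng):
    """Play one single-elimination event; adjacent entries in `bracket` meet first."""
    alive = list(bracket)
    while len(alive) > 1:
        next_round = []
        for a, b in zip(alive[::2], alive[1::2]):
            a_wins = rng.random() < win_prob(ratings[a], ratings[b])
            next_round.append(a if a_wins else b)
        alive = next_round
    return alive[0]  # the champion


def simulate(bracket, ratings, n_sims=100_000, seed=0):
    """Replay the same bracket n_sims times and return each player's title odds."""
    rng = random.Random(seed)
    titles = Counter(play_bracket(bracket, ratings, rng) for _ in range(n_sims))
    return {player: titles[player] / n_sims for player in bracket}


# Toy eight-player draw with made-up ratings; the bracket length must be a power of two.
RATINGS = {"A": 2300, "B": 2000, "C": 1950, "D": 1900,
           "E": 2100, "F": 1850, "G": 1800, "H": 2200}
BRACKET = ["A", "B", "C", "D", "E", "F", "G", "H"]

title_odds = simulate(BRACKET, RATINGS, n_sims=20_000)
for player, odds in sorted(title_odds.items(), key=lambda kv: -kv[1]):
    print(f"{player}: {odds:.1%}")
```

Tracking how far each player advances, rather than only the champion, would give the round-by-round probabilities instead of just title odds.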

There’s a different way we could approach this.  Instead of starting the simulation from the point at which the draw is set, we could start from the point at which the field was set and seeded.  At that point, Isner would know that he is not seeded–and thus, that he would probably face a seed in the first or second round–but not which higher-ranked player he would face.

So, instead of 100,000 simulations of the actual French Open bracket, we can do 100,000 simulations of the draw itself, followed by simulating each ensuing bracket.  Sometimes, Isner draws Nadal, sometimes he draws Hanescu, and so on.
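
The difference between the two approaches can be sketched in a few lines of Python.  Everything here is a toy under stated assumptions: the seeds are pinned to fixed slots (a real Slam draw also randomizes where most seeds land), the Elo-style `win_prob` formula is a stand-in for a real prediction model, and the players and ratings are invented.

```python
import random


def win_prob(rating_a, rating_b):
    """Hypothetical Elo-style chance that A beats B (an assumption, not the real model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def random_draw(seed_slots, unseeded, size, rng):
    """Toy draw: seeds stay in their given slots, unseeded players fill the rest at random."""
    pool = list(unseeded)
    rng.shuffle(pool)
    bracket = [None] * size
    for player, slot in seed_slots.items():
        bracket[slot] = player
    return [p if p is not None else pool.pop() for p in bracket]


def rounds_won(bracket, ratings, rng):
    """Play one single-elimination event and count each player's match wins."""
    wins = {p: 0 for p in bracket}
    alive = list(bracket)
    while len(alive) > 1:
        next_round = []
        for a, b in zip(alive[::2], alive[1::2]):
            winner = a if rng.random() < win_prob(ratings[a], ratings[b]) else b
            wins[winner] += 1
            next_round.append(winner)
        alive = next_round
    return wins


def avg_rounds_won(player, draw_fn, ratings, n_sims=20_000, seed=0):
    """Average match wins for `player`, building a bracket with draw_fn on every run."""
    rng = random.Random(seed)
    total = sum(rounds_won(draw_fn(rng), ratings, rng)[player] for _ in range(n_sims))
    return total / n_sims


# Invented field: two seeds pinned to opposite ends, six unseeded players.
RATINGS = {"Seed1": 2300, "Seed2": 2200, "Big Server": 1950,
           "U1": 1900, "U2": 1850, "U3": 1800, "U4": 1750, "U5": 1700}
SEED_SLOTS = {"Seed1": 0, "Seed2": 7}
UNSEEDED = [p for p in RATINGS if p not in SEED_SLOTS]

# "Actual" bracket in which Big Server drew the top seed in the first round.
ACTUAL = ["Seed1", "Big Server", "U1", "U2", "U3", "U4", "U5", "Seed2"]

fixed = avg_rounds_won("Big Server", lambda rng: ACTUAL, RATINGS)
randomized = avg_rounds_won(
    "Big Server", lambda rng: random_draw(SEED_SLOTS, UNSEEDED, 8, rng), RATINGS)
print(f"average wins, actual draw:     {fixed:.2f}")
print(f"average wins, randomized draw: {randomized:.2f}")
```

In this toy field the unseeded player who drew the top seed fares much better, on average, once the draw itself is re-rolled on every run.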

Measuring draw implications

A good way to gauge a player’s overall chances at a tournament is his predicted prize money.  Most players don’t have a significant chance of winning most tournaments (especially slams), so to compare Giraldo’s 0.01% chance of winning the title with Cuevas’s 0.02% chance doesn’t tell us much.  But if we consider the possibility that each player reaches each round, we can estimate that Giraldo will take home E24,600, while Cuevas will collect E29,500.  These numbers represent an average of the first-round prize money, second-round prize money, and so on, weighted by the probability that the player’s tournament ends at each of those stages.
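
Here is a minimal Python sketch of that weighted average.  The probability of finishing at a given stage is the chance of reaching it minus the chance of reaching the next one.  The `expected_prize` function is named for this example only, and the payouts and probabilities are invented round numbers, not the actual French Open prize schedule.

```python
def expected_prize(reach_probs, prizes):
    """Predicted prize money from round-by-round probabilities.

    reach_probs[i] is the chance of reaching stage i, with reach_probs[0] == 1.0
    (everyone plays the first round) and the last entry the chance of winning the title.
    prizes[i] is the payout for a player whose tournament ends at stage i,
    with the last entry being the champion's cheque.
    """
    if len(reach_probs) != len(prizes):
        raise ValueError("need one prize per stage")
    total = 0.0
    for i, prize in enumerate(prizes):
        # Chance of finishing exactly here = reach this stage, but not the next one.
        p_next = reach_probs[i + 1] if i + 1 < len(reach_probs) else 0.0
        total += (reach_probs[i] - p_next) * prize
    return total


# Hypothetical four-stage event: lose in R1, lose in SF, lose in F, win the title.
prizes = [10, 20, 40, 80]
reach_probs = [1.0, 0.5, 0.25, 0.1]

predicted = expected_prize(reach_probs, prizes)
print(predicted)  # 0.5*10 + 0.25*20 + 0.15*40 + 0.1*80 = 24.0
```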

With this metric, we can compare the implications of the actual draw with the implications of the randomized draw, in which, for instance, Nadal could play any one of the 96 unseeded players in the first round.

Let’s compare the two outcomes in an extreme case.  As we’ve seen, the draw was not kind to John Isner.  My algorithm gives him a 12% chance of reaching the second round, and less than a 1% chance of reaching the semis.  Crunch the numbers, and you have predicted prize money of E22,700.  When you randomize the draw and he no longer has to beat Nadal in the first round, his chances of reaching the second round leap to 60%, and he has a 2% shot at a semifinal berth.  Predicted prize money: E40,100.

As it turns out, Isner is our biggest loser.  His predicted prize money fell more than 40% between the beginning and end of the draw ceremony.  What’s remarkable is that the next four players on the list all come from the same 1/16th of the draw–you guessed it, Djokovic’s section.

The draw effect on Thiemo de Bakker is similar to that on Isner–it doesn’t get any worse than drawing Djokovic in the first round.  Next on the list are Ernests Gulbis, Ivo Karlovic, and Juan Martin del Potro.  Karlovic and Gulbis not only have the misfortune of drawing Delpo in the first two rounds, but if by some chance they get past the Argentine, then they face Djokovic!  Each of those players lost more than 30% of their predicted prize money through the vagaries of the draw.

Del Potro is an interesting case.  As is, his predicted prize money is E184,600.  Before the draw was set, he could expect E266,000.  The biggest difference, of course, is his chance of reaching the round of 16.  In real life, he’ll need to beat Djokovic to get there, and he has a 30% chance of getting that far.  Before the bracket was drawn, the expectation was that he’d need only to defeat someone in the top 16 (or possibly, a player who had upset someone in the top 16).  He had a 63% chance of doing so.

Winners

Naturally, if there are so many players whose predicted prize money decreased, some players must have benefited from the way the draw played out.

One of the biggest winners was Andy Murray.  As we’ve seen, plenty of dangerous players are concentrated in Djokovic’s quarter; in fact, Djokovic, Nadal, and Federer were all hurt by the draw.  But Murray’s draw boosted his predicted prize money from E191,700 to E240,800.  He’ll face a qualifier or lucky loser in each of the first two rounds, then no one more challenging than Milos Raonic in the third.  Next would be Dolgopolov or Troicki–no pushovers, but compare that to Nadal’s possible fourth-rounder of Verdasco, and you see how the breaks went in Andy’s favor.

Murray’s quarter is the softest of the four, and other men benefit even more.  In fact, the two players whose chances the draw boosted the most are Nicolas Almagro and Jurgen Melzer, who will likely play in the fourth round for a matchup with Murray in the quarters.  Almagro, Melzer, and Juan Ignacio Chela (also in this section) all saw their predicted prize money jump by more than 40%.  For example, Almagro went from E76,100 to E112,400–and more than doubled his chances of winning the title from 0.6% to 1.3%–by landing where he did in the bracket.
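
The percentage swings quoted in this post follow directly from the before-and-after figures, as this short Python check shows.  It uses only the predicted prize money stated above for four of the players; the `pct_change` helper is just a name chosen for the example.

```python
# (predicted prize money before the draw, after the draw), in euros, as quoted in this post.
PREDICTED = {
    "John Isner": (40_100, 22_700),
    "Juan Martin del Potro": (266_000, 184_600),
    "Andy Murray": (191_700, 240_800),
    "Nicolas Almagro": (76_100, 112_400),
}


def pct_change(before, after):
    """Percentage change from `before` to `after` (negative means the draw hurt)."""
    return 100.0 * (after - before) / before


changes = {player: pct_change(before, after)
           for player, (before, after) in PREDICTED.items()}

# Print from biggest loser to biggest winner.
for player, change in sorted(changes.items(), key=lambda kv: kv[1]):
    print(f"{player}: {change:+.1f}%")
```

Isner's drop comes out a little over 40% and del Potro's a little over 30%, while Almagro's gain is well above 40%, in line with the thresholds mentioned in the text.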

Regardless of any player’s specific placement, the best man will probably win.  But the draw certainly has a say in how tricky the route to the title will be.