If Rafa Only Plays on Clay

Since suffering the injury that would lead him to miss the second half of 2012, Rafael Nadal has said that he may have to cut back his tournament schedule so that he plays fewer matches on hard courts.

For someone who wants to remain at the top of the game, that’s a tough ask.  The majority of ATP ranking points come from hard-court tournaments.  If Rafa stuck to the clay, he would only be able to contest one of the four majors.

Becoming a full-time clay courter would almost certainly knock Nadal out of the running for world #1.  (As well as give him plenty of R&R in Mallorca.)  But how bad is it?  Let’s consider the possibility that in some future season, he only plays on clay.

Here is a possible 2013 schedule for a clay-only player, along with each event’s ranking points.  Three 250s are on this schedule, placed to provide warm-ups after each multi-week layoff:

20-Feb  Buenos Aires   250   
27-Feb  Acapulco       500   
09-Apr  Casablanca     250   
16-Apr  Monte Carlo    1000  
23-Apr  Barcelona      500   
07-May  Madrid         1000  
14-May  Rome           1000  
28-May  Roland Garros  2000  
09-Jul  Stuttgart      250   
16-Jul  Hamburg        500

If Rafa ran the table and won all of those events, that’s 7000 ranking points (only two of the 250s would count).  Unless the rest of the field becomes much more level, that won’t be good enough for the #1 ranking.  But it is a greater point total than Rafa has right now, and it would keep him in the top four.  Even averaging finalist points for these 10 events would allow him to remain in the top eight.
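The 7000-point total can be checked with a short tally. This is a minimal sketch that takes the "only two 250s count" cap straight from the paragraph above; it is a simplification of the real ranking rules, and the function name is just for illustration.

```python
# Schedule from the list above: (event, winner's ranking points).
SCHEDULE = [
    ("Buenos Aires", 250), ("Acapulco", 500), ("Casablanca", 250),
    ("Monte Carlo", 1000), ("Barcelona", 500), ("Madrid", 1000),
    ("Rome", 1000), ("Roland Garros", 2000), ("Stuttgart", 250),
    ("Hamburg", 500),
]

def countable_points(schedule, max_250s=2):
    """Sum winner's points, counting only the best `max_250s` 250-level titles.

    The cap is an assumption lifted from the text, not the full ATP rule.
    """
    big = [pts for _, pts in schedule if pts > 250]
    small = sorted((pts for _, pts in schedule if pts == 250), reverse=True)
    return sum(big) + sum(small[:max_250s])

total = countable_points(SCHEDULE)
print(total)  # 7000
```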

(Getting credit for those tournament wins would be a little trickier.  Players are required to show up for at least four 500-level events, including one after the US Open.  If you only play on clay, there are no such options late in the season.  To avoid the dreaded “zero-pointer” for not playing, Rafa might have to contest, say, Valencia.  However, points from those events no longer automatically count as one of a player’s top 18 events, so as long as the requirement was met, Rafa’s six non-slam, non-required-Masters events could be Monte Carlo, Acapulco, Barcelona, Hamburg, and two 250s.)

In practice, it’s tough to imagine that Rafa (or anyone else, short of Alessio Di Mauro) would avoid hard-court events entirely.  Much more likely is a scenario in which he plays all the clay court events possible and competes in hard-court events only when he feels sufficiently healthy.  That might mean an occasional semifinal run; it probably also means more second-round exits.

As unlikely and unusual as it would be, the all-clay schedule may be Nadal’s best route to setting more records.  With fewer injuries and much more rest, it’s easy to imagine him racking up another four or five French Open titles, along with perhaps ten more Masters crowns.  It would be an unusual career trajectory, to be sure, but it would also generate more fodder for the next ten years of GOAT debates.

 

The Speed of Every Surface, Redux

One of the most popular posts on this blog has been this one, which quantified the speed of every ATP tournament’s surface.  At the very least, it’s time to provide some updated numbers.  Beyond that, we can improve on the methodology and say more about how much we can learn from the numbers.

I was prompted to improve the methodology when I ran an update this week to see how fast the courts are at the O2 Arena in London.  The algorithm, which compares the number of aces (or service points won, or first service points won) to the number we’d expect from those players based on their season average, told me that London is much slower than average–almost 20% below average, on par with Roland Garros and the pre-blue clay Madrid Masters.

Counterintuitive conclusions are fun, but that’s just wrong.

Here’s the problem: Service stats aren’t only affected by servers.  Sure, when Milos Raonic is serving, there will be more aces than when Mikhail Youzhny is serving.  But how many aces Raonic hits is also influenced by the returning skills of the man on the other side of the net.  It’s clear why the algorithm got London so wrong: The eight or nine best players in the world got to where they are (in part, anyway) by getting more balls back.  No matter how fast the court, Mardy Fish wasn’t going to hit as many aces past Jo Wilfried Tsonga or Rafael Nadal in London as he did against Bernard Tomic in Shanghai or Tokyo.

I’ll be more succinct.  The goal is to compare the number of aces on a particular surface to the number of aces we’d expect on a neutral surface.  The number of expected aces depends on more than just the man serving; it also depends on the man receiving.

(In my article last year, I used three different stats (ace rate, first serve winning percentage, and overall winning percentage on serve) to measure surface speed.  They track each other fairly closely, so there’s not a lot of additional value gained by using more than one.  From here on out, I’m measuring surface speed only by relative ace rate.)

Incorporating more data

To factor in the additional variable, we need each player’s ace rate for the season along with his ace against rate.  With those two numbers, together with the overall ATP average, we can apply the odds ratio method to get a better idea of each match’s expected aces.
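The exact formula isn't spelled out here, so the following is a sketch of one common form of the odds ratio method rather than the code behind these numbers. It multiplies the server's odds of an ace by the returner's odds of being aced, then divides by the tour-average odds. The function names and the sample rates are hypothetical.

```python
def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)

def expected_ace_rate(server_rate, returner_rate_against, tour_rate):
    """Expected ace rate for one server against one returner.

    All three inputs are per-service-point ace rates between 0 and 1.
    Assumed odds-ratio form: odds = server odds * returner odds / tour odds.
    """
    o = odds(server_rate) * odds(returner_rate_against) / odds(tour_rate)
    return o / (1.0 + o)

# Hypothetical rates: a 15% ace server, a tour average of 8%.
vs_average_returner = expected_ace_rate(0.15, 0.08, 0.08)  # equals the server's own rate
vs_strong_returner = expected_ace_rate(0.15, 0.05, 0.08)   # lower than 15%
print(round(vs_average_returner, 3), round(vs_strong_returner, 3))
```

A useful property of this form is that a tour-average returner leaves the server's season rate unchanged, while a returner who is aced less often than average pulls the expectation down.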

For each server in each match, we compare his actual aces to his expected aces, and then take the average of all of those ratios.  The tournament-wide average gives us an estimate of how fast the courts played at that event.
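As a rough illustration of that averaging step, here is a minimal sketch. It assumes each server-match has already been reduced to a pair of actual and expected aces; the sample pairs are made up, not real match data.

```python
from statistics import mean

def relative_ace_rate(server_matches):
    """Average of actual/expected ace ratios across one tournament.

    `server_matches` holds one (actual_aces, expected_aces) pair per server
    per match. Pairs with no expected aces are skipped to avoid dividing by zero.
    """
    ratios = [actual / expected
              for actual, expected in server_matches
              if expected > 0]
    return mean(ratios)

# Made-up pairs for illustration only.
sample = [(12, 10.0), (3, 4.0), (8, 8.0), (5, 6.25)]
speed = relative_ace_rate(sample)
print(f"{speed:.2f}")  # below 1.00 means fewer aces than expected
```

Averaging per-server ratios gives every server-match equal weight; summing all actual aces and dividing by all expected aces would instead weight long matches more heavily, which is a different design choice.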

The improved algorithm still insists that aces were 3% lower than on a neutral surface at the 2011 Tour Finals, but counters that with the conclusion that aces were 18% and 8% more than on a neutral surface in 2009 and 2010, respectively.  A weighted average of those three seasons (more on that in a bit) estimates that the O2 Arena gives us 4% more aces than a neutral surface.

The variance from year to year–in some cases, like that of London, suggesting that a surface is faster than average one year, slower than average the next–is a bit worrisome.  At the very least, we can’t simply take a one-year calculation for a single tournament and treat it as the final word, especially when the event only includes 15 matches.

Multi-year averages and (extremely mild) projections

If we want to know exactly what happened in one edition of a tournament, the single-year number is instructive.  Perhaps the weather, or the lighting, was very bad or very good, causing an unusually high or low number of aces.  Just because a tournament’s number for 2012 doesn’t match its numbers for any of the previous three years doesn’t mean it’s wrong.

However, the variety of effects that give us this year-to-year variance do warn us that last year’s number will not accurately predict this year’s number.

The year-to-year correlation of relative ace rate (as I’ve described it above) is not very strong (r = .35).  One way to modestly improve it is to use a three-year weighted average.  A 3/2/1 weighted average of 2011, 2010, and 2009 numbers gives us a better forecast of how the surface will play in the following year (r = .5).
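Here is a small sketch of the 3/2/1 weighting, using the London figures quoted earlier (+18%, +8%, and -3% for 2009 through 2011). How events with a missing year are handled isn't stated, so renormalizing over the available years is an assumption of this sketch.

```python
def weighted_speed(rates_by_year, latest_year, weights=(3, 2, 1)):
    """Weighted average of relative ace rates, most recent year weighted most.

    Years missing from `rates_by_year` are skipped and the remaining weights
    are renormalized (an assumption, not something the post specifies).
    """
    total = 0.0
    weight_sum = 0
    for offset, w in enumerate(weights):
        rate = rates_by_year.get(latest_year - offset)
        if rate is not None:
            total += w * rate
            weight_sum += w
    if weight_sum == 0:
        raise ValueError("no data in the weighting window")
    return total / weight_sum

# London, from the text: 1.0 is a neutral surface.
london = {2009: 1.18, 2010: 1.08, 2011: 0.97}
print(round(weighted_speed(london, 2011), 3))  # 1.042
```

That works out to roughly 4% more aces than a neutral surface, matching the estimate for the O2 Arena.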

Another way of looking at these more reliable forecasts is that they get closer to isolating the effect of the surface.  As I noted in last year’s article, the weather effects of Hurricane Irene dampened the ace rate at last year’s US Open.  By my new algorithm, the ace rate last year was 7% lower than a neutral surface, while this year it was 5% higher than a neutral surface.  The three-year weighted average would have been able to look past Irene; using data from 2009-11, it estimated that courts in Flushing were exactly neutral.  That not only turned out to be a better projection for 2012 than the -7% of 2011, it also probably better described the influence of the court surface, as separate from the weather conditions.

Below the jump, find the complete list of all tour-level events that have been played in 2011 and/or 2012.  The first four numerical columns show the relative ace rate for each year from 2009 to 2012.  For instance, in Costa Do Sauipe this year, there were a staggering 61% more aces than expected.  The final two columns show the weighted averages for 2011 and 2012.  Each event’s “2012 Wgt” is my best estimate of the current state of the surface and how it will play next year.

I’ve also created a prettier, sortable version of the same table.


The Influence of a First-Set Tiebreak

Italian translation at settesei.it

In the first two rounds of last week’s Paris Masters, 12 matches began with a first-set tiebreak.  Of those dozen matches, nine of them finished as straight-set wins, with the second set more decisive than the first.  Polish qualifier Jerzy Janowicz won both of his first two matches according to this pattern.

This isn’t exactly what we’d expect.  A tiebreak isn’t purely random, but it’s close.  And if two players have reached a tiebreak, the available evidence suggests that they are playing at about the same level.  Thus, the winner of the first set is more likely to win the match–and perhaps a bit more likely to win the second set–but there is little reason to expect him to find the following set easier going.

Anecdotally, this seems like a familiar pattern.  Tough fight in the first set, then the tiebreak winner cruises in the second–perhaps due to his own momentum, perhaps because the first-set loser stops trying so hard.

And it is fairly common.  Since 2000, about 9% of tour-level best-of-threes are straight-set wins in which a tiebreak is followed by a more decisive set.  When the first set is decided by a tiebreak, by far the most frequent outcome (roughly half of these matches) is a straight-set victory where the second set is more decisive than the first.

Evidence or forecast?

So what does it mean?  Does winning a first-set tiebreak actually give a player the boost he needs to run away with the second?  Or are first-set tiebreaks evidence that the tiebreak winner was the better player all along, suggesting that we could have forecast the ensuing 6-3 or 6-4 set before the match even started?

We won’t arrive at a clear answer to this question, but we can try to get closer.

To give us some context, let’s start by comparing matches with first-set tiebreaks to the overall pool of best-of-three contests since 2000:

  • In best-of-threes, the first-set winner wins in straight sets 66.1% of the time.  If the first set is decided by a tiebreak, the first-set winner takes the match in straights 60.5% of the time.
  • In all best-of-threes, the first-set winner wins the second set by at least one break (that is, without needing to play a breaker) 57.1% of the time.  If the first set was a tiebreak, the first-set winner wins the second set by at least one break 50.0% of the time.
  • The first set winner loses a best-of-three match 18.0% of the time.  If the first set is decided by a tiebreak, he loses 22.3% of the time.
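The bullets leave one outcome implicit: the first-set winner taking the match in three sets. A quick sketch recovers it, assuming the three outcomes (straight-set win, three-set win, loss) cover every match.

```python
def outcome_split(straights_pct, loss_pct):
    """Split first-set winners' results into three outcomes, in percent.

    Assumes straight-set wins, three-set wins, and losses are exhaustive.
    """
    three_set_win = round(100.0 - straights_pct - loss_pct, 1)
    return {
        "straights": straights_pct,
        "three_set_win": three_set_win,
        "loss": loss_pct,
    }

all_matches = outcome_split(66.1, 18.0)
after_tiebreak = outcome_split(60.5, 22.3)
print(all_matches["three_set_win"], after_tiebreak["three_set_win"])  # 15.9 17.2
```

So a first-set tiebreak shifts about five and a half points away from straight-set wins, most of it toward losses and a little toward three-set wins.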

Clearly, first-set tiebreaks indicate closer matches than average.  (You probably didn’t need me to crunch the numbers to tell you that.)  It’s still far from clear whether the first-set tiebreak gives the winning player a boost, or it simply reflects the balance between the two competitors.

Factoring favorite status

To isolate the effect of player skill, let’s look at matches with first-set tiebreaks, divided into four categories determined by how much the first-set winner was favored:

             Straights  Easy 2nd   Loss  
Underdogs        48.5%     39.3%  33.8%  
Even(ish)        61.2%     51.4%  19.2%  
Favorite         69.4%     57.3%  14.1%  
Extreme Fav      74.1%     62.0%   9.2%

No surprises here.  The more the first-set tiebreak winner is favored, the more likely he is to win the match in straight sets, the more likely he is to win the second set by at least one break, and the less likely he is to lose the match.

More importantly, a bit more crunching of these numbers shows that almost all–at least 80%–of the variation in these three percentages is determined by the relative skill levels of the two players.  It’s possible that a bit of the remainder can be ascribed to the lingering effects of a tight first-set triumph, but only possible, and only a bit.

A story for every sequence

I suggested at the outset that this pattern–7-6, 6-something–seems like a familiar one.  And of course it is, because there are only so many score permutations in best-of-three matches.

When we watch such a match, it’s easy to come up with a narrative that seems universal.  “Federer won the last three points of the tiebreak, leaving Isner looking overmatched.  No one was surprised when Isner got broken for the first time in the following game.”  The simple story accurately reflects at least part of the match and explains the scoreline, so it’s tempting to theorize that (a) Isner’s break was due to his loss of the first-set tiebreak, and (b) players generally suffer an early break in the second set after losing a tiebreak.

Fine.  Except often (just as often?), we have reason to construct another narrative: “Murray won the last three points of the tiebreak, leaving Tsonga looking overmatched.  No one was surprised, though, when Murray came out a bit stale in the second set and got broken for the first time in the following game.”

Some stories reflect actual trends, and that’s why so many of my posts on this site investigate the most popular stories.  But for any given story, it’s more likely than not that it has been constructed simply to give a bit more meaning to underlying randomness.

Janko Tipsarevic and the Masters of Retirement

When Janko Tipsarevic retired six points away from defeat against Jerzy Janowicz on Friday, many tennis fans were … unsurprised.  The Serb has quite the record when it comes to quitting early, having retired from matches at all four Grand Slams, the Olympics, and nearly half of the Masters 1000 events.  He has retired on every surface and in every round.

It’s hardly a record to be proud of.  Tipsarevic’s departure on Friday was his 17th career tour-level retirement–about 1 in every 25 matches over his 434-match career.  His “retirement rate” of 3.9% is the highest among active players with at least 400 matches.  It’s more than double the tour average of about 1.5%.
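The arithmetic behind those figures is simple enough to verify. A minimal sketch, using only the numbers quoted above (the tour average of 1.5% is the approximate figure from the text):

```python
def retirement_rate(retirements, matches):
    """Share of career matches that ended with the player retiring."""
    return retirements / matches

tipsarevic = retirement_rate(17, 434)
tour_average = 0.015  # approximate tour-wide rate quoted in the text

print(f"{tipsarevic:.1%}")                    # 3.9%
print(f"1 in {1 / tipsarevic:.1f}")           # 1 in 25.5
print(f"{tipsarevic / tour_average:.1f}x")    # 2.6x the tour average
```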

But that “at least 400” hides some context.  Expand the field to a still-respectable minimum of 200 tour-level matches and we have the following leaders in career retirement rate:

Player              Matches  Ret Rate  
Sergiy Stakhovsky       209      4.8%  
Michael Llodra          370      4.6%  
Yen Hsun Lu             222      4.5%  
Janko Tipsarevic        434      3.9%  
Denis Istomin           211      3.8%  
Paul Henri Mathieu      456      3.7%  
Filippo Volandri        367      3.5%  
Potito Starace          347      3.5%  
Xavier Malisse          531      3.0%  
Viktor Troicki          300      3.0%

Tipsy is still a standout, yet not an egregious one.  Both Paul Henri Mathieu and Xavier Malisse have retired in three of the four slams.  Michael Llodra has dropped out of Wimbledon three times, and the US Open twice.  (Not to mention retiring against Jo Wilfried Tsonga three times, and perhaps more remarkably, against both Tipsarevic and Mathieu.)

For a fuller view of the state of ATP retirement–including the 22 members of the top 100 who have never retired from a match–click here for a sortable table with more fun stats.  (A few numbers are different from those above, because my full database doesn’t yet include 2012 Bercy.)  Janko may quit early, but that doesn’t mean you have to.

The 2012 World Tour Finals Forecast

With Jo Wilfried Tsonga’s win last night over Nicolas Almagro, the field is set for the tour finals.  Novak Djokovic and Roger Federer will each head one of the two round robin groups, and will be joined by Andy Murray, David Ferrer, Tomas Berdych, Juan Martin Del Potro, Tsonga, and Janko Tipsarevic.

Despite Federer’s dominance on indoor hard courts last year, he is hardly the same unstoppable force this season.  Not only did he lose in last week’s final to Del Potro, but my rating algorithm, Jrank, views him as a slightly inferior hard-court player to Murray.  Though it will certainly be close, my forecast favors both the Serb and the Brit over the soon-to-be world #2:

Player         SF      F      W  
Djokovic    77.7%  47.7%  28.8%  
Murray      70.0%  41.9%  23.3%  
Federer     72.6%  40.4%  22.3%  
Del Potro   45.9%  20.2%   8.3%  
Ferrer      45.4%  17.7%   6.5%  
Berdych     38.8%  15.2%   5.5%  
Tsonga      30.4%  11.3%   3.8%  
Tipsarevic  19.2%   5.5%   1.5%

As always, there are as many reasons to question these numbers as there are to put one’s faith in them.  Djokovic’s loss to Sam Querrey this week calls into serious question his current ability to play his best tennis.  Murray’s loss to rising star Jerzy Janowicz isn’t quite so troubling, but it also fails to fit the profile of a dominant player.

In the bottom half of the pack, one or two of these guys are likely to play in the Paris final, meaning they’ll be relatively tired upon arrival in London.  It’s one thing to play the first round of a tournament on weak legs; it’s another when that event is the Tour Finals and your first opponent is a fellow top-tenner.

[UPDATE, 3 Nov]

The draw is set.  Federer is joined in Group B by Ferrer, Del Potro, and Tipsarevic, leaving Djokovic with Murray, Berdych, and Tsonga.  This is a dream setup for Federer, and even dreamier for Delpo.

Federer’s career H2H against the three men in his group is 31-3.  His career H2H against Novak’s opponents is 27-18.  He might prefer not to face Del Potro again so soon, but historically, the Argentine hasn’t been any more dangerous for Roger than any of the three men Djokovic will have to face.

As noted, it’s the absolute perfect draw for Delpo, too.  Statistically, Federer is weaker than Djokovic.  My numbers might overstate Ferrer’s competitiveness in London (and they still aren’t very high), and Tipsarevic is essentially a non-factor.  In the pre-draw simulation above, Del Potro has a 45.9% chance of reaching the semis and an 8.3% chance of winning it all.  Post-draw, 54.4% and 9.2%.  It’s an uphill battle no matter what the draw, but avoiding the Murray group is a huge help.

Here are the projections, now reflecting the draw:

Player         SF      F      W  
Djokovic    74.0%  47.2%  28.2%  
Federer     76.7%  41.2%  23.0%  
Murray      68.5%  41.6%  22.6%  
Del Potro   54.4%  22.4%   9.2%  
Ferrer      46.9%  17.9%   6.8%  
Berdych     31.2%  13.5%   5.0%  
Tsonga      26.3%  10.4%   3.6%  
Tipsarevic  22.1%   5.8%   1.6%

Thanks to his relatively weak round-robin group, Federer has the best shot at reaching the semis, but only the third-best chance of reaching the final, since he’s likely to face either Djokovic or Murray in his semi.  Despite the tougher draw, Djokovic remains the favorite to win the event and put an exclamation point on his season-ending #1 ranking.
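For readers curious how a post-draw forecast of this shape can be produced, here is a bare-bones Monte Carlo sketch. It is not the Jrank-based simulation behind the table: the ratings are hypothetical placeholders on an Elo-like scale, the win-probability curve is an assumed logistic, and group ties are broken at random instead of by the real tiebreak rules.

```python
import random
from collections import Counter

# HYPOTHETICAL ratings for illustration only; these are not Jrank values.
RATINGS = {
    "Djokovic": 2300, "Murray": 2250, "Federer": 2250, "Del Potro": 2150,
    "Ferrer": 2130, "Berdych": 2110, "Tsonga": 2080, "Tipsarevic": 2020,
}
DJOKOVIC_GROUP = ["Djokovic", "Murray", "Berdych", "Tsonga"]
FEDERER_GROUP = ["Federer", "Ferrer", "Del Potro", "Tipsarevic"]

def win_prob(a, b):
    """Assumed Elo-style chance that player a beats player b."""
    return 1.0 / (1.0 + 10 ** ((RATINGS[b] - RATINGS[a]) / 400.0))

def play(a, b, rng):
    """Simulate one match and return the winner."""
    return a if rng.random() < win_prob(a, b) else b

def group_stage(group, rng):
    """Round robin; return (winner, runner-up) by match wins, ties broken randomly."""
    wins = Counter({p: 0 for p in group})
    for i, a in enumerate(group):
        for b in group[i + 1:]:
            wins[play(a, b, rng)] += 1
    ranked = sorted(group, key=lambda p: (wins[p], rng.random()), reverse=True)
    return ranked[0], ranked[1]

def simulate(n=20000, seed=1):
    """Return {player: (P(semifinal), P(final), P(title))} over n simulated events."""
    rng = random.Random(seed)
    sf, f, w = Counter(), Counter(), Counter()
    for _ in range(n):
        a1, a2 = group_stage(DJOKOVIC_GROUP, rng)
        b1, b2 = group_stage(FEDERER_GROUP, rng)
        sf.update([a1, a2, b1, b2])
        # Each group winner meets the other group's runner-up.
        f1, f2 = play(a1, b2, rng), play(b1, a2, rng)
        f.update([f1, f2])
        w[play(f1, f2, rng)] += 1
    return {p: (sf[p] / n, f[p] / n, w[p] / n) for p in RATINGS}

for player, (p_sf, p_f, p_w) in simulate().items():
    print(f"{player:<11}{p_sf:6.1%} {p_f:6.1%} {p_w:6.1%}")
```

Even with placeholder ratings, a setup like this shows the structural effect described above: a strong player in the weaker group gets a higher semifinal probability, then tends to run into one of the other group's top two in the semis.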

(A quick programming note for regular readers: I won’t be able to update these predictions throughout the tournament on TennisAbstract.com, and due to an uncooperative travel schedule, the next TA.com update (including Bercy results) may not occur until Tuesday or Wednesday.)

Bouncing Back From a Bagel

Yesterday, Sam Querrey posted an unusual achievement and did so in an unusual way.  He beat soon-to-be-#1 Novak Djokovic–a career milestone no matter how it happened.  And he did it after losing the first set 6-0.

This was only the fifth time in his ATP-level career that Querrey lost a set 6-0 (though it was the second time in two weeks), and it was the first time he was bageled in the first set.  Big servers like Sam aren’t generally found on either end of a bagel, since their style of play tends to ensure that both players win a service game or two.  Querrey has only bageled other players five times on tour.  Oddly enough, three of those have been in Los Angeles.

However rare 6-0 sets are, the shocking thing here is that he bounced back.  Not just in the sense that he recovered from the mental blow of winning a mere 10 of 35 first-set points, but that he won two sets from a player who seemed to be so vastly superior to him on court.

As you might imagine, that doesn’t happen very often.  Of about 2100 best-of-three matches this year through the end of last week, 58 began with a bagel.  The first-set loser only came back to win three of those 58 times.  And of course, the losers in those three-setters were hardly of Djokovic’s caliber: Peter Polansky, Maximo Gonzalez, and Jarkko Nieminen.  (It wasn’t the first time for the Finn–he lost a match 6-0, 6-7, 6-7 in 2009.)

A bit of context

2012 has been a tough year for the victims of first-set bagels.  When we expand our focus to the entire 21st century, it turns out that first-set bagels have been occurring at a typical rate this year–about 2.5%, or 1 in 40 matches–but that players are finding it tougher to bounce back.

In best-of-three matches over the last thirteen seasons, there have been 753 first-set bagels.  The winner closed it out in straight sets 568, or 75.4%, of those times.  The rate this year has been almost identical, with straight-set wins finishing off 43 of the 58 matches with first-set bagels.

In the remaining matches, the underdogs have historically found easier going.  Over the last thirteen years, the player who lost the first set 0-6 managed to come back and win the match 75 times–about once every ten matches.  This year, Querrey was only the fourth (of 59, now) to do so.

What’s most interesting about the historical total of 75 is that it is not much less than the number of matches that the first-set winner wins in three sets.

Let me put that another way.  Since 2000, the player who was bageled in the first set has come back to win the second set 185 times.  Since the vast majority of those second set scores are 7-6, 7-5, and 6-4, the first-set winner almost always had a more dominant run than the second set winner.  But that once-dominant first-set winner only wins those three-setters about 60% of the time; the bageled player completes the comeback in the other 40%.
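The historical counts quoted in this section fit together, and a few lines of arithmetic make the split explicit. This sketch uses only figures from the text and assumes every first-set bagel that wasn't a straight-set win went to a third set (753 − 568 = 185, which matches the number of second sets won by the bageled player).

```python
first_set_bagels = 753    # best-of-three matches since 2000 that opened 6-0
straight_set_wins = 568   # the bageler also took the second set
comebacks = 75            # the bageled player won the match

three_setters = first_set_bagels - straight_set_wins   # 185
bageler_three_set_wins = three_setters - comebacks     # 110

print(f"{straight_set_wins / first_set_bagels:.1%}")       # 75.4% end in straight sets
print(f"{comebacks / first_set_bagels:.1%}")               # 10.0% end in a comeback
print(f"{comebacks / three_setters:.1%}")                  # 40.5% of three-setters
print(f"{bageler_three_set_wins / three_setters:.1%}")     # 59.5% of three-setters
```

In other words, once the bageled player forces a third set, he wins about two matches for every three won by the man who handed him the 6-0.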

As we’ve seen, Querrey was only the fourth player to complete the comeback this year, though he was the 13th to reach a third set.  Based on the previous rate, we should have seen another couple of recoveries from an 0-6 start.

Winning the second set, as Querrey did yesterday, doesn’t exactly put the comebacker on equal footing, but recent history shows that we can’t put too much weight on that outlier of a first set.  Perhaps 6-0s are simply too extreme to carry much weight.  Or perhaps winning the second set–even if it’s a much tighter margin than the first–provides a boost that carries over into the third set.

In any event, players should take heart in the knowledge that after dropping the first set 6-0, all is not lost.  But Querrey, who has now been bageled more in the last two weeks than he had been in the previous four years combined, probably shouldn’t hinge his hopes on many more fights like the one he posted yesterday.

After the jump, find the complete list of tour-level 0-6 comebacks since 2000.
