Andrey Kuznetsov and Career Highs of ATP Non-Semifinalists

While following this week’s ATP 250 tournament in Winston-Salem and seeing Andrey Kuznetsov in the quarterfinals, one question came to mind: Will he finally make it into the first ATP semifinal of his career? As shown here, Andrey, with a ranking of 42, is currently (by far) the best-ranked player who has not reached an ATP SF. And it looks as if he will stay on top of this list for some time longer after losing to Pablo Carreno Busta 4-6 3-6 on Wednesday.

With a record of 0-10 in ATP quarterfinals, he is still pretty far away from Teymuraz Gabashvili’s streak of 0-16. Despite having lost six more quarterfinals than Andrey before winning his first QF this January against a retiring Bernard Tomic, Teymuraz climbed only to a ranking of 50. Still, we could argue that Teymuraz’s QF losing streak is not really over, since his only win came against a possibly injured player.

Running the numbers can answer questions such as “Who could climb up highest in the rankings without having won an ATP quarterfinal?” Doing so will put Andrey’s number 42 into perspective and will possibly reveal some other statistical trivia.

Player                Rank            Date   On
Andrei Chesnokov        30      1986.11.03    1
Yen Hsun Lu             33      2010.11.01    1
Nick Kyrgios            34      2015.04.06    1
Adrian Voinea           36      1996.04.15    1
Paul Haarhuis           36      1990.07.09    1
Jaime Yzaga             40      1986.03.03    1
Antonio Zugarelli       41      1973.08.23    1
Bernard Tomic           41      2011.11.07    1
Omar Camporese          41      1989.10.09    1
Wayne Ferreira          41      1991.12.02    1
Andrey Kuznetsov        42      2016.08.22    0
David Goffin            42      2012.10.29    1
Mischa Zverev           45      2009.06.08    1
Alexandr Dolgopolov     46      2010.06.07    1
Andrew Sznajder         46      1989.09.25    1
Lukas Rosol             46      2013.04.08    1
Ulf Stenlund            46      1986.07.07    1
Dominic Thiem           47      2014.07.21    1
Janko Tipsarevic        47      2007.07.16    1
Paul Annacone           47      1985.04.08    1
Renzo Furlan            47      1991.06.17    1
Mike Fishbach           47      1978.01.16    0
Oscar Hernandez         48      2007.10.08    1
Ronald Agenor           48      1985.11.25    1
Gary Donnelly           48      1986.11.10    0
Francisco Gonzalez      49      1978.07.12    1
Paolo Lorenzi           49      2013.03.04    1
Boris Becker            50      1985.05.06    1
Brett Steven            50      1993.02.15    1
Dominik Hrbaty          50      1997.05.19    1
Mike Leach              50      1985.02.18    1
Patrik Kuhnen           50      1988.08.01    1
Teymuraz Gabashvili     50      2015.07.20    1
Blaine Willenborg       50      1984.09.10    0

The table shows the best ranking (up to #50) each player reached before winning his first ATP QF. A 0 in the last column indicates that the player has not won a QF yet, while a 1 indicates that he has. Active players marked with a 0 can still climb up in this table. Retired players may also be marked with a 0 if they never managed to get past a QF during their careers.
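For readers who want to build a table like this themselves, here is a minimal Python sketch of the logic. It assumes two hypothetical inputs that are not part of this post: a player's weekly rankings as (date, rank) pairs and his ATP quarterfinals as (date, won) pairs. The function name, the cut-off parameter, and the sample data are made up for illustration, and I assume that only rankings dated strictly before the first QF win count.

```python
from datetime import date

def best_rank_before_first_qf_win(rankings, quarterfinals, cutoff_rank=50):
    """Return (best_rank, date_reached, has_won_qf), or None if not in the table.

    rankings:      iterable of (date, rank) pairs, one per ranking week
    quarterfinals: iterable of (date, won) pairs, won is True or False
    Assumption: only rankings dated strictly before the first QF win count.
    """
    win_dates = [d for d, won in quarterfinals if won]
    first_win = min(win_dates) if win_dates else None

    eligible = [(rank, d) for d, rank in rankings
                if first_win is None or d < first_win]
    if not eligible:
        return None

    best_rank, reached = min(eligible)  # lowest rank, earliest date on ties
    if best_rank > cutoff_rank:
        return None                     # outside the table's cut-off
    return best_rank, reached, int(first_win is not None)


# Made-up example data, not taken from any real player
rankings = [(date(2015, 1, 5), 60), (date(2015, 4, 6), 34), (date(2015, 6, 1), 30)]
quarterfinals = [(date(2015, 1, 28), False), (date(2015, 5, 1), True)]
print(best_rank_before_first_qf_win(rankings, quarterfinals))
```

In the made-up example, the ranking of 30 is ignored because it was reached after the first QF win, so the function reports 34 together with a flag of 1.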

I wonder, who had Andrei Chesnokov on the radar for this? Before winning his first ATP QF he pushed his ranking as high as 30. He later went on to reach a career high of 9. Nick Kyrgios also improved his ranking quickly without needing to go as deep as an ATP SF. His Wimbledon 2014 QF, US Open 2014 R32, and Australian Open 2015 QF runs helped him climb to #34 without a single win in an ATP QF. I would also particularly like to highlight Alexandr Dolgopolov, who reached #46 before having played even a single QF.

Looking only at players who are still active and able to improve their ranking without an ATP SF, we get the following picture:

Player                 Rank            Date
Andrey Kuznetsov         42      2016.08.22
Rui Machado              59      2011.10.03
Tatsuma Ito              60      2012.10.22
Matthew Ebden            61      2012.10.01
Kenny De Schepper        62      2014.04.07
Pere Riba                65      2011.05.16
Tim Smyczek              68      2015.04.06
Blaz Kavcic              68      2012.08.06
Alejandro Gonzalez       70      2014.06.09

Andrey is well clear of the field: Rui Machado, second on the list, reached his highest ranking about five years ago. Skimming through the remainder of the table, we would be surprised if anyone came close to Andrey’s 42 soon, though a sudden, unexpected run by an up-and-coming player could always change that.

So what are the practical implications for analyzing tennis? Hardly any, I am afraid. Still, we can infer that it is possible to get well within the top 50 without ever winning more than two matches at a single tournament, even over a player’s whole career. Of course, it would be interesting to see how long such players can stay in this ranking range, which guarantees direct acceptance into ATP tournaments and, hence, a more or less regular income from R32, R16, and QF prize money. Moreover, as the case of 2015-ish Nick Kyrgios shows, the question arises of how a player’s ranking points are composed: Performing well on the big stage of Masters or Grand Slams can be enough for a decent ranking despite poor performance at ATP 250s. On the other hand, are there players whose ATP points breakdown reveals that they go for easier points at ATP 250s while never having deep runs at Masters or Grand Slams? These are questions which I would like to answer in a future post.

This is a guest article by me, Peter Wetz. I am a computer scientist interested in racket sports and data analytics based in Vienna, Austria. I would like to thank Jeff for being open-minded and allowing me to post these surface-scratching lines here.

Teymuraz Gabashvili and ATP Quarterfinal Losing Streaks

Yesterday in Moscow, Teymuraz Gabashvili played his 16th career tour-level quarterfinal. Facing 118th-ranked Evgeny Donskoy, it was his best chance yet to reach an ATP semifinal, but just as in each of his previous 15 attempts, he lost.

No other player has contested so many tour-level quarterfinals without ever winning one. But while the streak of 16 consecutive quarterfinal losses is a rarity, it’s not a record. The all-time mark belongs to Gianluca Pozzi, who dropped 18 in a row between 1993 and 2000. Pozzi’s record, depressing as that streak is, might be an inspiration to Gabashvili: At age 35, Pozzi finally broke the streak, defeating Marat Safin, one of the best players he ever faced in a quarterfinal.

Gabashvili and Pozzi are among only twelve players who have strung together at least 10 quarterfinal losses at tour level. Here’s the complete list, including the dates of the first and last loss in each streak:

Player               QFs L Streak     Start       End  
Gianluca Pozzi                 18  19930104  20000501  
Teymuraz Gabashvili            16  20070219         *  
Paul Annacone                  14  19860127  19880704  
Ivan Molina                    12  19751110  19791105  
Mischa Zverev                  11  20060925  20090713  
Diego Perez                    11  19861124  19920810  
Anand Amritraj                 11  19750304  19810706  
Dennis Ralston                 11  19701101  19800602  
Bob Carmichael                 11  19720918  19751231  
Ricardas Berankis              10  20120917         *  
Yen Hsun Lu                    10  20070219  20130923  
Mikhail Youzhny                10  20041101  20060130
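A streak list like this boils down to counting runs of consecutive losses. Here is a minimal sketch in Python, assuming a player's quarterfinal results are available as a chronological list of booleans (True for a win); the function name is my own and the list format is an assumption for illustration.

```python
def qf_losing_streaks(results):
    """Return (longest, active) runs of consecutive quarterfinal losses.

    results: chronological list of booleans, True for a QF win.
    """
    longest = current = 0
    for won in results:
        current = 0 if won else current + 1
        longest = max(longest, current)
    return longest, current  # current is the streak still running at the end


# Gabashvili as described above: 16 quarterfinals, 16 losses
print(qf_losing_streaks([False] * 16))
```

An active streak, marked with an asterisk in the table, is simply one where the second value is still greater than zero.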

Ricardas Berankis is the only other player on this list to have an active streak, and since he’s five years younger than Gabashvili, another few years of mild success and quarterfinal futility could put him in the running for the all-time record. Alas, neither player is likely to repeat the post-streak success of Mikhail Youzhny, who went on to play 63 more tour-level quarterfinals, winning 33 of them.

If there’s a silver lining for Gabashvili, it’s that he’s reached all of those quarterfinals, sparing himself the fate of Rolf Thung, a Dutch player from the 1970s who reached the round of 16 at 18 tour events and lost them all.

Benoit Paire and Overqualified Challenger Contenders

With three ATP tour-level events on the slate this week, Benoit Paire considered his options and elected to play none of them. Instead, the world #23 is the top seed at the Brest Challenger, making him the highest ranked player to enter a challenger this year–by a wide margin.

Top-50 players may only enter challengers if they are given a wild card, and top-ten players may not enter them at all. Still, since 1990, a top-50 player has played a challenger just over 500 times, at a rate of about 20 per year. (Some of these players didn’t need a wild card, as entry is determined by ranking several weeks before the tournament, during which time rankings rise and fall.)

Many of the high-ranked wild cards fall into one of two categories: Players who lose early in Slams, Indian Wells, or Miami; and clay-court specialists seeking more matches on dirt. Paire’s decision this week–like the Frenchman himself–doesn’t follow one of these common patterns.

Anyway, here are the top-ranked players to contest challengers since 1990, along with their results. A result of “W” means that the player won the title, while any other result indicates the round in which the player lost.

Year  Event           Player               Rank  Result  
2003  Braunschweig    Rainer Schuettler    8     R16     
1991  Johannesburg    Petr Korda           9     SF      
1994  Barcelona       Alberto Berasategui  10    W       
1994  Graz            Alberto Berasategui  11    R16     
2008  Sunrise         Fernando Gonzalez    12    QF      
2004  Luxembourg      Joachim Johansson    12    W       
2011  Prostejov       Mikhail Youzhny      13    QF      
2008  Prostejov       Tomas Berdych        13    QF      
2003  Prague          Sjeng Schalken       13    W       
2005  Zagreb          Ivan Ljubicic        14    W       
2004  Bratislava      Dominik Hrbaty       14    F       
2004  Prostejov       Jiri Novak           14    QF      
2003  Prostejov       Jiri Novak           14    R32     
2007  Dnepropetrovsk  Guillermo Canas      15    SF      
2002  Prostejov       Jiri Novak           15    F       
1998  Segovia         Alberto Berasategui  15    QF      
1997  Braunschweig    Felix Mantilla       15    F       
1997  Zagreb          Alberto Berasategui  15    W

(Schuettler and Korda were outside the top ten a couple of weeks before their respective challengers.)

A look at this list suggests that Alberto Berasategui entered challengers as a top-fifty player more than anyone else. He’s close–with 12 such entries, he’s tied for second with Jordi Arrese. The player who dropped down a level the most times is Dominik Hrbaty, who played 17 challengers while ranked in the top 50. (The active leaders are Jarkko Nieminen with ten and Andreas Seppi with nine.)

Despite all those attempts, Hrbaty wasn’t particularly successful as a high-ranked challenger player. He won only 2 of those 17 events, reaching only one other final. Top-50 players aren’t guaranteed to win these titles, of course, but in general, they have outperformed Hrbaty, winning 18% of possible titles. Here are top-50 players’ results broken down by round:

Result       Frequency  
Title            18.1%  
Loss in F         9.3%  
Loss in SF       11.3%  
Loss in QF       17.1%  
Loss in R16      22.0%  
Loss in R32      22.2%
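A breakdown like this is just a frequency count over result labels. The sketch below shows one way to compute it in Python; the function name and the label strings are assumptions for illustration. Only Hrbaty's totals come from this post, and the split of his other 14 results is not given here, so they are lumped together.

```python
from collections import Counter

def result_frequencies(results):
    """Map each result label (e.g. 'W', 'F', 'SF') to its share of all entries."""
    counts = Counter(results)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hrbaty's record as described above: 2 titles and 1 lost final in 17 entries
hrbaty = ["W"] * 2 + ["F"] * 1 + ["other"] * 14
freqs = result_frequencies(hrbaty)
print(f"Hrbaty title rate: {freqs['W']:.1%} vs. 18.1% for all top-50 entries")
```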

Paire is a better player than this sample’s average ranking of 37. Combined with a favorable surface, he gets a much more optimistic forecast from my algorithm, with a slightly better than one-in-three chance of winning the title. With a futures title, an ATP trophy, and a pair of challenger triumphs already in the books this year, it seems fitting that Benoit would add another oddity to his wide-ranging season.

Lucky Losers and Familiar Faces

In the final round of qualifying Monday in Moscow, Darya Kasatkina easily defeated Paula Kania. Thanks to a couple of late withdrawals, both players ended up making the main draw … and tomorrow, they’ll play each other again.

This scenario is rare, but not unheard of. Since the mid-1990s, there have been 30 other instances when two women faced each other in qualifying and then again in the main draw. Most recently, Lauren Davis defeated Svetlana Kuznetsova twice at the 2013 Canadian Open. One year earlier, in Sydney, Alexandra Dulgheru beat Sofia Arvidsson in the first round of the main draw despite losing to her in the final round of qualifying.

Tomorrow’s Kasatkina-Kania rematch is far from a sure thing. In those 30 prior rematches, the winner of the qualifying match went on to win the main-draw match as well only 17 times, barely more than half.

This sort of rematch is similarly uncommon on the ATP tour. Since 2007 (the earliest year for which I have qualifying results), this has happened a dozen times. Most recently, Albert Ramos-Vinolas defeated Robin Haase in back-to-back rounds in Monte Carlo. Ramos was on the opposite side of things five years ago, when Pablo Cuevas beat him twice in Valencia.

Earlier this year, in a variation on the theme in Auckland, Kenny de Schepper beat Alejandro Falla to qualify, and after both players won their first-round matches, Falla triumphed in the second-round rematch.

Programming note: After watching this sort of ad hoc research disappear into the barely-searchable void that is the Twitter archive, it occurred to me to post occasional brief notes such as this one. It’s not groundbreaking stuff, but at least it’ll be easier to find in the future. These curiosities won’t interfere with or replace my longer, more analytical posts.

A New Way of Looking at Lottery Matches

When Rafael Nadal was eliminated from the US Open last week, a bit of bad luck was involved. He won only two fewer points than his opponent, Fabio Fognini, claiming 49.7% of the total points played. In his career up to that point, Rafa had won 8 of 18 matches in which he won between 49% and 50% of total points. It doesn’t take much to flip the result of such a match.

Matches in which neither player wins more than 51% of points represent nearly one in ten contests on the ATP tour. As Michael Beuoy demonstrated last year, those matches are very much up for grabs: the player with the most points wins less than 65% of the time.

In writing about the small subset of matches in which the loser wins a higher percentage of return points than the winner, Carl Bialik has coined the useful term “lottery matches.” However, Bialik has limited the term to those matches that have an unexpected result. I’d like to expand the definition a bit to all those tight matches that could go either way, even if the player who wins the most points ends up winning as expected.

(A quick side note: Bialik prefers comparing return points, the building blocks of his Dominance Ratio metric. Matches are won a bit more frequently when the winner’s DR is below 1.0 than when he wins fewer than 50% of total points played. These metrics often overlap, of course. To make this arcane subject a bit more accessible, I’m going to stick with the traditional total-points-won stat.)

As Beuoy showed, matches aren’t guaranteed to go to the player who wins the most points unless that guy wins at least 53% of points. (Even then, there’s a slight possibility of an upset, but it’s sufficiently rare that, for today’s purposes, I’m going to ignore it.) 52.5% is much better than 50.5%, but at 52.5%, you’re still going to lose about one of every 25 matches.

By extending the “lottery match” umbrella to all those matches in which neither player wins 51%, 52%, or even 53% of total points, we acknowledge that none of these matches are sure things, and we can look at a broader range of matches to determine whether players are winning as many tight matches as they should. Further, by considering such a category of tight matches, we’ll be able to identify those men who play a lot of them–and by doing so, leave themselves vulnerable to lucky upsets.

Winning the lottery (matches)

Let’s start with the broadest category: all matches in which neither player won more than 53% of total points. These represent everything from true toss-ups at 50% to near-guarantees at 52.9%. Using Beuoy’s model, we can take the total points won from each of these matches and calculate the likelihood that the player with the greater number of points won the match.

Nadal, for instance, is one of the more effective players in these tight matches. Going into the US Open, he had played 168 of them, winning 115. By taking the total points won from each of these matches, we find that he “should have” won only 102.5 of them, meaning that by some combination of clutch play and luck, he’s outperformed expectations by 12%.
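The arithmetic behind that 12% can be sketched in a few lines of Python. I am not reproducing Beuoy's model here: the sketch assumes that some model has already supplied a win probability for each match, and the function names are my own. Reading "outperformed by 12%" as the ratio of actual to expected wins is an inference from the numbers quoted above.

```python
def expected_wins(win_probs):
    """Sum of per-match win probabilities, i.e. the expected number of wins."""
    return sum(win_probs)

def outperformance(actual_wins, expected):
    """Relative surplus of actual over expected wins (0.12 means +12%)."""
    return actual_wins / expected - 1


# Hypothetical probabilities for three tight matches
print(expected_wins([0.5, 0.65, 0.96]))

# Nadal's figures quoted above: 115 actual wins vs. 102.5 expected
print(f"{outperformance(115, 102.5):+.0%}")
```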

Among active players with at least 100 of these matches, Nadal ranks an impressive fourth overall, behind John Isner, Fognini, and Jurgen Melzer. Novak Djokovic and Andy Murray are just inside the top 20, exceeding expectations by 6% and 5%, respectively, while Roger Federer is much further down the list, winning 7% fewer of these tight matches than he should.

Finding Fed on the negative end of this list is a surprise, since Federer, Nadal, and Isner are among the very, very few players who consistently beat expectations in tiebreaks. Tiebreak skill should be closely related to outperforming expectations in tight matches. In any event, my collaborator on a related project, Ryan Rodenberg, has written at length about Federer’s lack of success in some lottery matches.

When we narrow the focus to matches in which neither player won more than 51% of points–true toss-up matches–Nadal is still among the best. In fact, the top four of Rafa, Fognini, Melzer, and Isner remains the same, as each of those players has won between 36% and 38% more often than they should in contests with these extremely slim margins.  Once again, Djokovic and Murray are positive, at +16% and +6%, respectively, while Federer trails far behind, at -9%.

Careening downward

A big advantage of using the broader, 53-percent-of-points definition of lottery matches is that it gives us a larger sample to work with. Nadal has only played 27 matches in his career when the loser won more points than the winner did, and only 40 when neither player topped 51% of total points won.

In the 53% category, though, Nadal has amassed several matches each year of his career, allowing us to look at more meaningful trends. Each season from 2005-11, he averaged about 15 tight matches per year, and won at least one more than we would’ve expected of him, often two or three. Since the beginning of last year, though, he’s played 25, winning only 13 when he should have won 16.
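A season-by-season view like this only requires grouping the tight matches by year. Here is a minimal sketch, assuming each match is available as a (year, won, win_prob) tuple, where win_prob is the model's probability that the player won given his share of points. The tuple layout, the function name, and the sample data are assumptions for illustration.

```python
from collections import defaultdict

def season_splits(matches):
    """Aggregate tight matches by season.

    matches: iterable of (year, won, win_prob) tuples.
    Returns {year: (played, wins, expected_wins)}.
    """
    totals = defaultdict(lambda: [0, 0, 0.0])
    for year, won, win_prob in matches:
        row = totals[year]
        row[0] += 1            # matches played
        row[1] += int(won)     # actual wins
        row[2] += win_prob     # expected wins
    return {year: tuple(row) for year, row in sorted(totals.items())}


# Made-up matches for illustration only
sample = [(2014, True, 0.7), (2014, False, 0.6), (2015, True, 0.4)]
for year, (played, wins, expected) in season_splits(sample).items():
    print(year, played, wins, round(expected, 2))
```

Applied to the figures above, 13 wins against 16 expected works out to 13/16 - 1, or roughly -19%.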

Even with the bigger sample, these are small margins. If Nadal comes roaring back next year and beyond, again winning more close matches than expected, we’ll ultimately see these two seasons as outliers. Yet most of Nadal’s peers post surprisingly consistent records in tight matches. In the last decade, Djokovic and Murray have each had only one season below -10%, and Federer has reliably underperformed, never reaching +7% for a full season. Not every player is as good in these matches as Nadal, but the ones who do excel post roughly similar numbers from one year to the next.

The bigger picture

Winning tight matches is useful, but as Federer’s experience demonstrates, it’s hardly necessary. And in the case of Fognini, exceeding expectations in lottery matches is hardly sufficient for more general success.

Even better than winning tight matches is winning easy matches, and a useful side effect of studying lottery matches is generating measurements of who plays them the most–and, of course, the least.

Lottery matches–again, those in which neither player wins more than 53% of points–represent fewer than 20% of Rafa’s career matches. His 19.7% rate of close contests is lower than that of any other player since 2000 (minimum 100 matches). In this category, the big four are bunched together as expected. Among active players, Federer is second lowest, Djokovic is third, and Murray is eighth. Kei Nishikori and David Ferrer are also among the top ten.
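The rate of lottery matches is straightforward to compute from point totals. Here is a minimal Python sketch, assuming each match is stored as a (points_won, points_lost) pair from one player's perspective. The function names and the sample totals are made up for illustration.

```python
def is_lottery_match(points_won, points_lost, threshold=0.53):
    """True if neither player won more than `threshold` of total points."""
    total = points_won + points_lost
    top_share = max(points_won, points_lost) / total
    return top_share <= threshold

def lottery_rate(matches, threshold=0.53):
    """Share of a player's matches that qualify as lottery matches.

    matches: list of (points_won, points_lost) pairs.
    """
    if not matches:
        return 0.0
    tight = sum(is_lottery_match(won, lost, threshold) for won, lost in matches)
    return tight / len(matches)


# Made-up point totals for five matches
sample = [(80, 60), (101, 99), (70, 72), (90, 55), (65, 40)]
print(f"{lottery_rate(sample):.0%}")
```

Lowering the threshold to 0.51 gives the narrower toss-up definition used earlier in the post.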

At the other end of the spectrum, we find the usual big-serving suspects. Vasek Pospisil tops the list at 49.5%, with Ivo Karlovic (44.5%), Isner (41.9%), and Jerzy Janowicz (40.5%) filling out the top four.

Analyzing the results of very close matches–whichever definition you prefer–is a useful way of identifying players on lucky or unlucky streaks, or even those who appear to play particularly well on big points. However, the more meaningful metric–certainly the one that more closely correlates with elite-level success–is the one that tells us who is avoiding tight matches. The only thing better than luck is not needing it.

Statistical Quirks in Munich and Oeiras

From Monday to Sunday last week, the ATP 250s in Munich and Oeiras were filled with statistical quirks.  Here’s a rundown of some of the oddities you might have missed:

  • In both events, the top four seeds in qualifying advanced to the main draw. Since the beginning of the decade, there have been 173 ATP 250s, and these two events were only the 7th and 8th of those in which the four qualifiers were the top four seeds.
  • Not only that, but the four players those qualifiers defeated were the 5th through 8th seeds. Put another way, the eight players in the final qualifying round were the top eight seeds. That hadn’t happened before in the 2010s, and then it happened in two different tournaments last week. Of the previous 171 events, seven seeds reached the final qualifying round on four occasions, but never eight.
  • In Munich, Tommy Haas won his first two matches as a 36-year-old. He wasn’t the first man that old to win a match this year–Marc Gicquel beat him to it in Montpellier–but he’s only the 12th player to do so since 2000. Still, he has a long way to go to catch Ken Rosewall, who won over 350 tour-level matches after his 36th birthday.
  • In four matches, Fabio Fognini reached the Munich final, but he didn’t play a single direct entrant. After a first round bye, he faced Dustin Brown, a wild card, followed by three qualifiers, Thomaz Bellucci, Jan Lennard Struff, and Martin Klizan. Before last week, no one in ATP history had played every round of an event without facing another direct entrant. There have been a few instances when a player faced four non-direct entrants (notably Richard Gasquet’s 2007 Wimbledon run). I also found a couple of WTA $10Ks in which a player faced five non-direct entrants, but there are eight qualifying and four wild card spots at that level.
  • Fognini lost to Klizan in the final, a repeat of the final result in 2012 in St. Petersburg. Klizan’s two titles both came against Fognini, making him only the fifth player in ATP history to win his only two finals against the same player.  The Slovak is in good company: The most recent guy on the list is David Ferrer, and before that was Carlos Moya.
  • Back in Portugal, the heavy favorite Tomas Berdych won the first set over Carlos Berlocq by the score of 6-0. That’s rare enough in a tour-level final. Berlocq made it much more unusual, though, when he came back to win.  It was only the 10th time in ATP history that a player won a final after dropping the first set 0-6. The last occurrence was quite recent, when Marcel Granollers came back to win in the Kitzbuhel final last year.
  • Berlocq is known as a fighter, but he had never come back from a 0-6 hole in a tour-level match before. He had done so only once as a pro, in a Challenger match against Marcos Daniel in 2004.
  • Berdych’s record was even purer: he had never lost a professional match after winning the first set 6-0.

One more quirk from the week: By winning the Tallahassee Challenger, Robby Ginepri ended an 11-year-long title drought at that level. (Though he did win several ATP titles in that time.) Amazingly, that isn’t a record. Thomas Johansson went 12 years and two months between Challenger titles, and Tommy Robredo ended his own nearly 12-year drought when he won Caltanissetta in 2012.

The Madrid Masters has star power, but this year it’s unlikely to produce as many historical oddities as the week it follows.

The Misleading Stat Sheet

A glance at the stat sheet from Serena Williams’s third-round match against Jie Zheng suggests that Serena dominated.  23 aces to 1, 3 break point conversions to none, 54 winners to 21, 84% 2nd-serve points won to 50%, and 55% of the total points played.

Of course, according to the more important stats–games and sets–Serena didn’t dominate.  She barely snuck through, losing a first-set tiebreak and going to 9-7 in the third.

Rick Devereaux, who brought this contrast to my attention, suggests that grass-court tennis–with more clean winners and fewer unforced errors than slower-paced styles–may be responsible.  That’s certainly part of the equation.

In fact, the Serena/Zheng match highlights the limits of the traditional stat sheet, especially on a surface that particularly favors the server.  Except for winners and unforced errors, nearly every stat directly captures some aspect of serving prowess–either yours or your opponent’s.  And in an era where nearly everyone is an excellent server, it doesn’t matter much whether you’ve set down a great serving performance or merely a good one.

To get to tiebreaks (or 9-7, or 70-68), you don’t have to be as good as your opponent, you just need to be good enough to hold.  Even the “winners” stat has to do with serving dominance, since so many are third shots behind a serve.  The vast majority of the stats from Serena’s match tell us that the American was more dominant on her serve than Zheng was.  And, of course, while Zheng was good enough to hold to 6-6 and 7-7, she lost the second set fairly badly, so the stats are a weighted average of two almost-even sets and one lopsided one.

When we find a mismatch between stat sheet and scoreline, we’re usually seeing one of two things:

  1. One player was much more dominant on serve (think 4 or 5-point games instead of 6+)
  2. One player won a lot of clutch points (like deuce, on serve) while losing unimportant ones (like 40-0 on serve), thus padding her opponent’s stat sheet.

Oddly, in the men’s game, the players who we think of as most dominant on serve rarely give us mismatched score sheets like this–quite the opposite.  Note the wording: “one player was much more dominant.”  There’s no doubt John Isner can dominate on serve, but since almost all his opponents are also good servers, Isner’s weak return game means that he is often the less dominant server, winning service games at 40-30 and losing return games at 0-40 or 15-40.  In fact, Isner has won more than 20 career matches despite losing more than half of the points played!
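To see how a player can win while losing the majority of points, it helps to count the points in a stylized scoreline. The Python sketch below uses an invented match, not a real one: player A holds every service game from 40-30, player B holds every service game from 40-15, and A wins both tiebreaks 7-5. The function name and the game list are assumptions for illustration.

```python
def tally_points(games):
    """Sum points over a list of (points_for_a, points_for_b) games."""
    a = sum(pa for pa, _ in games)
    b = sum(pb for _, pb in games)
    return a, b


# One hypothetical set with no breaks of serve:
# A holds six times from 40-30 (4 points to 2),
# B holds six times from 40-15 (4 points to 1),
# and A takes the tiebreak 7-5.
set_games = [(4, 2), (1, 4)] * 6 + [(7, 5)]

a, b = tally_points(set_games * 2)  # A wins 7-6 7-6
print(a, b, f"{a / (a + b):.1%}")
```

In this invented match A wins in straight sets with 74 of 156 points, well under half, because B's holds are more comfortable than A's.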

The same reasoning doesn’t apply to Serena.  She may be as big a server (relative to her opponents) as Isner, but her return game is also world-class.  And in the WTA, there are far more weak-to-middling servers.  On grass, as Rick points out, those weak-to-middling servers are (usually) still able to hold, making it more likely that a dominant performance on paper ends at 9-7 in a deciding set.