Sebastian Ofner and ATP Debuts

This is a guest post by Peter Wetz.

Sebastian Ofner, the still relatively young Austrian, received some media attention this June when he qualified for the Wimbledon main draw at his first attempt and even reached the round of 32 by beating Thomaz Bellucci and Jack Sock. Therefore, some people, including me, had an eye on the 21-year-old when he made his ATP tour debut* at Kitzbuhel a few weeks later, where he was awarded a wild card.

Stunningly, Ofner made it into the semifinals despite having drawn top seed Pablo Cuevas in the second round. Cuevas, who admittedly seems to be out of form lately (or possibly is just regressing to his mean), had a 79% chance of reaching the quarterfinal when the draw came out, according to First Ball In’s forecast.

Let’s look at the numbers to contextualize Ofner’s achievement. How deep do players go when making their debut at ATP level? How often would we expect to see what Ofner did in Kitzbuhel?

The following table shows how far ATP debutants have gone, broken down by type of entry into the main draw (WC = wild card, Q = qualifier, Direct = direct acceptance, All = WC + Q + Direct). Each figure is the share of debutants who reached at least the given round. The data covers tournaments since 1990.

Round   WC       Q        Direct   All
R16     14.51%   26.73%   24.46%   21.77%
QF       2.39%    6.39%    4.32%    4.64%
SF       0.51%    2.30%    2.16%    1.59%
F        0.17%    0.64%    0.72%    0.46%
W        0.17%    0.26%    0.72%    0.27%

Since 1990 there have been 1507 ATP debuts: 586 wild cards (39%), 782 qualifiers (52%) and 139 direct acceptances (9%). With 586 wild card debuts in 28 years (about 21 per year) and 0.51% of them reaching the semifinal or better, we would expect a wild card debutant to get that far roughly once every nine years. In other words, it is a once-in-a-decade feat. In fact, in the 28 years of data, only Lleyton Hewitt (Adelaide 1998), Michael Ryderstedt (Stockholm 2004) and Ernests Gulbis (St. Petersburg 2006) accomplished what Ofner did. Only Hewitt went on to win the tournament.
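The once-in-a-decade estimate can be reproduced with a few lines of arithmetic. The sketch below uses only figures quoted in this post (586 wild card debuts, 28 years of data, the 0.51% semifinal rate); the variable names are illustrative.

```python
# Figures quoted in the post.
wild_card_debuts = 586
years_of_data = 28
sf_rate = 0.0051  # share of wild card debutants reaching the SF or better

# Average number of wild card debuts per year, and how many of them
# we would expect to end in a semifinal run.
debuts_per_year = wild_card_debuts / years_of_data
sf_runs_per_year = debuts_per_year * sf_rate

# The expected waiting time is the inverse of the yearly rate.
years_between_runs = 1 / sf_runs_per_year

print(f"{debuts_per_year:.1f} wild card debuts per year")
print(f"one semifinal run every {years_between_runs:.1f} years")
```

The same rate also implies about three such runs in the whole data set, which matches the three names listed above.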

More than half of the debutants of all entry types who reached the final went on to win the tournament: in absolute terms, 4 of 7 finalists. (Due to the small sample size, it is perfectly possible that this is just noise in the data.)

If we ignore the semifinals and later rounds because of the small sample sizes, qualifiers outperform direct acceptances. This may be the result of qualifiers having already played two or three matches and become accustomed to the conditions, making it easier for them than it is for debutants who were accepted directly into the main draw. But to really prove this, more investigation is needed.

For now we know that what Sebastian Ofner has achieved rarely happens. We should also know that by no means is his feat a predictor of future greatness.

* I define Kitzbuhel as Ofner’s ATP tour debut because Grand Slam events are run by the ITF. However, Grand Slam statistics, such as match wins, are included in ATP statistics.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Putting the Antalya Draw Into Perspective

This is a guest post by Peter Wetz.

When the ATP announced the pre-Wimbledon grass court tournament in Antalya in May 2016, some people were scratching their heads: Which top players would be willing to play in Antalya, Turkey, one week ahead of Wimbledon? Even more so because, one week earlier, two events are played in London and Halle, the latter being considerably closer to London than Antalya is. A player wanting to participate in Antalya would have to fly from Halle (or London) to Antalya and then back to London for Wimbledon, not an ideal itinerary.

Taking a glance at the entry list, the doubts are confirmed: after Dominic Thiem, the only top 10 player entered in the event, there are just three other men (Paolo Lorenzi, Viktor Troicki and Fernando Verdasco) ranked within the top 40. Only three of the 28 players directly accepted into the main draw (Thiem, Verdasco and Lorenzi) will be seeded at Wimbledon.

But how weak is the field really compared to others? Of course there are countless ways to measure the strength of a draw, but for a quick and dirty approach we will simply look at two measures, that is, the last direct acceptance (LDA) and the mean rank of quarterfinalists.

The LDA is the rank of the last player who gained direct entrance into a tournament’s main draw excluding lucky losers, qualifiers and special exempts. Comparing the last direct acceptance of the Antalya draw (86, Radu Albot) to all other ATP Tour level events with a draw size of 32 or 28 players, it turns out that Antalya is at the 39th percentile. This means that 39% of the other tournaments have a better/lower (or equal) LDA and that 61% have a worse/higher LDA, respectively. The following image shows a percentile plot of LDAs of tournaments since 2012, highlighting this week’s event in Antalya:
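As a minimal sketch of how such a percentile can be computed, the function below counts the share of tournaments whose LDA is lower than or equal to a given value, matching the definition used above. The list of LDAs is hypothetical and only there to make the example runnable; it is not the data behind the 39th percentile.

```python
def percentile_of(value, population):
    """Share (in %) of the population that is lower than or equal to value."""
    if not population:
        raise ValueError("population must not be empty")
    at_or_below = sum(1 for x in population if x <= value)
    return 100 * at_or_below / len(population)


# Hypothetical LDAs of ten 28/32-draw events (not real data).
example_ldas = [62, 70, 75, 81, 86, 90, 95, 101, 110, 124]
antalya_lda = 86  # the LDA quoted in the post (Radu Albot)

print(percentile_of(antalya_lda, example_ldas))
```

With the real list of LDAs since 2012 in place of `example_ldas`, the same call would yield the percentile reported in the post.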

The fact that the LDA compares well against the other tournaments tells us that despite the lack of top ranked seeds, the field seems to be more dense at the bottom. Not that bad after all?

Let us take a look at the mean rank of the eight players who made it into the quarterfinals. Choosing quarterfinalists limits the calculation to the players who were able to perform well at the event, winning at least one, and usually two, matches. This should reduce some of the noise in the data that would be otherwise included due to lucky first round wins.

The mean rank of the quarterfinalists at the Antalya Open 2017 is 109. Out of the 726 tournaments since 2000 with 32 or 28 player draws which were considered in this analysis, only 35 tournaments had a higher mean rank of players at the quarterfinal stage. With nine out of those 35 tournaments, the Hall of Fame Tennis Championships at Newport–which takes place each year after Wimbledon–stands out from the pack. As the following plot shows, the Antalya Open is at the 95th percentile in this category. This seems to be more aligned with what we would have expected.
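The two numbers in this paragraph are easy to recompute. The sketch below averages eight quarterfinalist ranks and then derives the percentile from the counts quoted above (35 of 726 tournaments with a worse mean rank). The eight ranks are hypothetical, chosen only so that they average 109; they are not the actual Antalya quarterfinalists.

```python
from statistics import mean


def mean_quarterfinalist_rank(ranks):
    """Mean ATP rank of the eight quarterfinalists of one tournament."""
    if len(ranks) != 8:
        raise ValueError("expected exactly eight quarterfinalists")
    return mean(ranks)


# Hypothetical ranks, chosen only so that they average 109.
example_ranks = [30, 55, 62, 80, 98, 120, 167, 260]
print(mean_quarterfinalist_rank(example_ranks))

# Counts quoted in the post.
tournaments = 726
with_worse_mean = 35

# Share of tournaments whose quarterfinalists were, on average,
# ranked as well as or better than Antalya's.
percentile = 100 * (tournaments - with_worse_mean) / tournaments
print(f"{percentile:.1f}")
```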

To provide some context, the following table lists the 10 tournaments with the worst mean rank of quarterfinalists.

#  Tournament           Mean QF Rank
1  Newport '10          240
2  Newport '01          197
3  Delray Beach '16     191
4  Moscow '13           166
5  Newport '11          166
6  Newport '07          165
7  's-Hertogenbosch '09 164
8  Newport '08          163
9  Gstaad '14           156
10 Amsterdam '01        152
...
36 Antalya '17          109

The seeds are to blame for this: Of the eight seeds, only Verdasco managed to win a match. The other seven went winless. We have to go back as far as 1983’s Tel Aviv tournament to find a draw where only one seed won a match. In Tel Aviv, however, the third seed Colin Dowdeswell won three matches in total, whereas Fernando Verdasco crashed out in the second round. By the way, Tel Aviv 1983 marks the first title of the then 16 years and 2 months old Aaron Krickstein, still the youngest player to win a singles title on the ATP Tour. Draws in which only two of the eight seeds win their first match occur about once per year. The last time this happened was at the 2016 Brasil Open, where only Pablo Cuevas and Federico Delbonis won matches as seeds.

Despite the presence of only one top 30 player in this year’s Antalya draw, the middle and bottom of the field looked surprisingly solid, as we saw when considering the last direct acceptance. However, if we take into account the development of the tournament and calculate the mean rank of quarterfinalists, it becomes clear that the field got progressively weaker. Still, there have been worse draws in the past and there will doubtless be worse draws in future. Maybe even in the not too distant future, if we take a glance at this year’s Newport entry list.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Dominic Thiem played Davis Cup in Barcelona. Sort of…

This is a guest post by Peter Wetz.

Last week Dominic Thiem fought his way into the finals of the Barcelona Open by winning against Kyle Edmund, Daniel Evans, Yuichi Sugita, and Andy Murray. Three of these four players play for the same flag and Thiem won against each of them. Thiem is not exactly a champion of the current Davis Cup format–he has opted out of playing for Austria several times and has a rather poor record of 2-3 when he does compete–but in Barcelona he has, at least, shown that he can beat several players from the same country over a short amount of time. And that’s what Davis Cup is about, right?

In this post my goal is to put this statistical hiccup into some context. It is not the first time the Austrian defeated three players of the same nationality at one event: In 2016 at Buenos Aires Thiem already beat three players from Spain. However, given that Spanish players appear much more frequently in draws than Britons do, I will take a closer look.

Before Thiem’s run, there had been only three tournaments since 1990 in which a single player faced three players from Great Britain, and only one of those players won each encounter. The following table shows the three tournaments and each of the matches in which the same player faced a Briton. Until last week, Wally Masur was the only player since 1990 to defeat three players from Great Britain in a single tournament. Thiem remains the only player to have achieved this outside of the island.

Tournament     Round Winner        Loser           Score
'93 Manchester R32   Wally Masur   Ross Matheson   6-4 6-4
'93 Manchester R16   Wally Masur   Chris Wilkinson 6-3 6-7(4) 6-3
'93 Manchester QF    Wally Masur   Jeremy Bates    6-4 6-3

'97 Nottingham R32   Karol Kucera  Martin Lee      6-1 6-1
'97 Nottingham SF    Karol Kucera  Tim Henman      6-4 2-6 6-4
'97 Nottingham F     Greg Rusedski Karol Kucera    6-4 7-5

'01 Nottingham R32   Martin Lee    Lee Childs      6-4 5-7 6-0
'01 Nottingham R16   Martin Lee    Arvind Parmar   6-4 6-3
'01 Nottingham QF    Greg Rusedski Martin Lee      6-3 6-2

Obviously, there are not many chances to face three Britons in a single tournament. And when one of those opponents is likely to be Andy Murray, a player’s chances of beating all three are even slimmer.

Let’s broaden the perspective a bit and take a look at how often a player defeated three (or more) players from the same country without looking only at Great Britain. The following table displays the results of this analysis. The first column contains the country, the second column (3W) shows how often a player defeated three players of this country, the third column (3WL) shows how often a player defeated two players of this country and then lost to a player of the same country, and so on.

Country  3W  3WL  4W  4WL  5W  5WL
USA      119 179  19  30   1   4
ESP      98  157  17  18   3   2
FRA      28  45   5   2    1   0
ARG      22  26   5   3    0   0
GER      15  18   1   1    0   0
AUS      13  9    0   0    0   0
SWE      9   16   1   0    0   0
CZE      4   5    0   0    0   0
NED      4   4    0   0    0   0
RUS      4   3    0   0    0   0
ITA      2   3    1   0    0   0
BRA      1   3    1   0    0   0
GBR      1   2    0   0    0   0
CHI      1   1    0   0    0   0
SUI      1   1    0   0    0   0
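For readers who want to reproduce the W columns, here is a rough sketch of the counting step. It assumes match records of the form (tournament, winner, loser, loser's country), which is an illustrative layout and not necessarily the one used for the table. The four example rows are Thiem's Barcelona wins described at the top of this post. The WL columns would additionally need the match in which the player lost, which this sketch leaves out.

```python
from collections import Counter


def same_country_wins(matches, minimum=3):
    """Count wins per (tournament, winner, loser_country) and keep
    the combinations with at least `minimum` wins."""
    counts = Counter(
        (tournament, winner, country)
        for tournament, winner, _loser, country in matches
    )
    return {key: n for key, n in counts.items() if n >= minimum}


# Assumed record layout: (tournament, winner, loser, loser_country).
matches = [
    ("Barcelona", "Dominic Thiem", "Kyle Edmund", "GBR"),
    ("Barcelona", "Dominic Thiem", "Daniel Evans", "GBR"),
    ("Barcelona", "Dominic Thiem", "Yuichi Sugita", "JPN"),
    ("Barcelona", "Dominic Thiem", "Andy Murray", "GBR"),
]

print(same_country_wins(matches))
```

Grouping the resulting keys by country and by number of wins would then give the 3W, 4W and 5W counts.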

As we could have imagined, USA, ESP, and FRA come out on top here, simply because for years they have had the highest density of players in the rankings. These are also the only countries from which a single opponent faced five players at one tournament. Facing six or more players from the same country never happened, according to the data at hand. The following table shows the most recent occurrence of each 5W entry in the above table.

Tournament    Round Winner        Loser             Score
'91 Charlotte R32   Jaime Yzaga   Chris Garner      7-6 6-3
'91 Charlotte R16   Jaime Yzaga   Jimmy Brown       6-4 6-4
'91 Charlotte QF    Jaime Yzaga   Michael Chang     7-6 6-1
'91 Charlotte SF    Jaime Yzaga   M. Washington     7-5 6-2
'91 Charlotte F     Jaime Yzaga   Jimmy Arias       6-3 7-5
                                                 
'07 Lyon      R32   Sebastien Gr. Rodolphe Cadart   6-3 6-2
'07 Lyon      R16   Sebastien Gr. Fabrice Santoro   4-6 6-1 6-2
'07 Lyon      QF    Sebastien Gr. Julien Benneteau  6-7 6-2 7-6
'07 Lyon      SF    Sebastien Gr. Jo Tsonga         6-1 6-2
'07 Lyon      F     Sebastien Gr. Marc Gicquel      7-6 6-4
                                                  
'08 Valencia  R32   David Ferrer  Ivan Navarro      6-3 6-4
'08 Valencia  R16   David Ferrer  Pablo Andujar     6-3 6-4
'08 Valencia  QF    David Ferrer  Fernando Verdasco 6-3 1-6 7-5
'08 Valencia  SF    David Ferrer  Tommy Robredo     2-6 6-2 6-3
'08 Valencia  F     David Ferrer  Nicolas Almagro   4-6 6-2 7-6

Finally, we take a look at the big four. Did they ever eliminate three or more players from the same country in a single tournament? Yes, they did. In 2014 Roger Federer beat three Czech players in Dubai. In 2005, 2008, and 2013 he beat three German players in Halle. In 2009 Andy Murray beat three Spanish players in Valencia. In 2007 Novak Djokovic beat three Spanish players in Estoril. In 2013 Rafael Nadal beat three Argentinian players both in Acapulco and Sao Paulo. In 2015 he even beat four Argentinian players in Buenos Aires. And there are many other examples where Rafa beat three of his countrymen at the same tournament.

We can see that this happens fairly often, specifically for countries where the tournament is organized, because more players of this country appear in the draw due to wild cards and qualifications. If we exclude these cases, Federer’s streak in Dubai stands out, as does Thiem’s streak in Barcelona.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Measuring the Performance of Tennis Prediction Models

With the recent buzz about Elo rankings in tennis, both at FiveThirtyEight and here at Tennis Abstract, comes the ability to forecast the results of tennis matches. It is natural to ask which of these models performs best and, more interestingly, how they fare compared to other ‘models’, such as the ATP ranking system or betting markets.

For this admittedly limited investigation, we collected the (implied) forecasts of five models for the US Open 2016: FiveThirtyEight, Tennis Abstract, Riles, the official ATP rankings, and the Pinnacle betting market. The first three models are based on Elo. To infer forecasts from the ATP ranking, we use a specific formula [1], and for Pinnacle, one of the biggest tennis bookmakers, we calculate the implied probabilities from the provided odds (minus the overround) [2].
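The post does not spell out how the overround was removed. One common approach, assumed here purely for illustration, is to invert the decimal odds and rescale the two raw probabilities so that they sum to one. The odds in the example are hypothetical.

```python
def implied_probabilities(odds_a, odds_b):
    """Turn two decimal odds into probabilities that sum to 1.

    Assumption: the overround is removed by proportional rescaling,
    which is one common method, not necessarily the one used in the post.
    """
    if odds_a <= 1 or odds_b <= 1:
        raise ValueError("decimal odds must be greater than 1")
    raw_a, raw_b = 1 / odds_a, 1 / odds_b
    total = raw_a + raw_b
    overround = total - 1  # the bookmaker's margin
    return raw_a / total, raw_b / total, overround


# Hypothetical decimal odds for player A and player B.
p_a, p_b, margin = implied_probabilities(1.50, 2.70)
print(f"P(a) = {p_a:.3f}, P(b) = {p_b:.3f}, overround = {margin:.3f}")
```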

Next, we simply compare forecasts with reality for each model, asking: if player A was predicted to be the winner (P(a) > 0.5), did he really win the match? Doing that for each match and each model (ignoring retirements and walkovers), we arrive at the following results.

Model      % correct
Pinnacle   76.92%
538        75.21%
TA         74.36%
ATP        72.65%
Riles      70.09%

What we see here is the percentage of predictions that were actually right. The betting model (based on the odds of Pinnacle) comes out on top, followed by the Elo models of FiveThirtyEight and Tennis Abstract. Interestingly, the Elo model of Riles is outperformed by the predictions inferred from the ATP ranking. Since there are several parameters that can be used to tweak an Elo model, Riles may still have some room left for improvement.
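The "% correct" column boils down to a simple count. As a minimal sketch, assume each forecast is stored as a pair of P(a) and whether player A actually won; this layout and the four example forecasts are hypothetical.

```python
def accuracy(forecasts):
    """Share of matches in which the predicted winner actually won.

    forecasts: list of (p_a, a_won) pairs, where p_a is the forecast
    probability for player A and a_won is True if A won the match.
    A forecast of exactly 0.5 is counted as a pick for player B.
    """
    if not forecasts:
        raise ValueError("no forecasts given")
    correct = sum(1 for p_a, a_won in forecasts if (p_a > 0.5) == a_won)
    return correct / len(forecasts)


# Hypothetical forecasts: two are right, two are wrong.
example = [(0.80, True), (0.60, False), (0.30, False), (0.45, True)]
print(f"{accuracy(example):.2%}")
```

With the 117 predictions per model mentioned further down, Pinnacle's 76.92% is consistent with 90 correct calls.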

However, just looking at the percentage of correctly called matches does not tell the whole story. There are more granular metrics for investigating the performance of a prediction model. Calibration, for instance, captures the ability of a model to provide forecast probabilities that are close to the true probabilities: in an ideal model, 70% forecasts come true in exactly 70% of cases. Resolution measures how much the forecasts differ from the overall average. The rationale is that always forecasting the expected average will yield a reasonably well-calibrated set of predictions, but it will not be as useful as a method that achieves the same calibration while taking current circumstances into account. In other words, the more extreme (and still correct) the forecasts are, the better.

In the following table we categorize the set of predictions into bins of different probabilities and show how many percent of the predictions were correct per bin. This also enables us to calculate Calibration and Resolution measures for each model.

Model    50-59%  60-69%  70-79%  80-89%  90-100% Cal  Res   Brier
538      53%     61%     85%     80%     91%     .003 .082  .171
TA       56%     75%     78%     74%     90%     .003 .072  .182
Riles    56%     86%     81%     63%     67%     .017 .056  .211
ATP      50%     73%     77%     84%     100%    .003 .068  .185
Pinnacle 52%     91%     71%     77%     95%     .015 .093  .172

As we can see, the predictions are not always perfectly in line with what the corresponding bin would suggest. Some of these deviations, for instance the fact that only 67% of the Riles model’s 90-100% forecasts were correct, can be explained by small sample size (only three forecasts in that case). However, two cases with larger samples caught my attention: the 60-69% bins of the Riles and Pinnacle models. Both models seem to be strongly (and statistically significantly) underconfident there. In other words, these probabilities should have been higher, because in reality the forecasts came true 86% and 91% of the time. [3] For the betting aficionados, the fact that Pinnacle underestimates the favorites here may be really interesting, because it could reveal some value, as punters would say. For the Riles model, this could be a starting point for tweaking the model.
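Whether a bin's hit rate differs meaningfully from its forecast probability depends heavily on the number of forecasts in the bin, which the table does not list. As a rough sketch of one way to check this (not necessarily the test used for this post), the function below computes an exact one-sided binomial tail probability. The counts in the example are hypothetical.

```python
from math import comb


def binomial_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical bin: 10 of 11 forecasts came true, and the forecast
# probability is taken as the midpoint of the 60-69% bin.
p_value = binomial_tail(successes=10, trials=11, p=0.65)
print(f"{p_value:.4f}")
```

A small tail probability would suggest that the forecasts in the bin were too cautious. The result is sensitive to both the bin size and the probability assumed for the bin.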

In the last three columns Calibration (the lower the better), Resolution (the higher the better), and the Brier score (the lower the better) are shown. The Brier score combines Calibration and Resolution (and the uncertainty of the outcomes) into a single score for measuring the accuracy of predictions. The models of FiveThirtyEight and Pinnacle (for the used subset of data) perform essentially equally well. Then there is a slight gap until the model of Tennis Abstract and the ATP ranking model come in third and fourth, respectively. The Riles model performs worst in terms of both Calibration and Resolution, hence ranking fifth in this analysis.
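A standard way to relate these three numbers is the decomposition Brier = Calibration - Resolution + Uncertainty. The sketch below is one possible implementation under stated assumptions: forecasts are grouped into ten equal-width bins and each bin is represented by its mean forecast, so the identity is only approximate unless all forecasts within a bin are equal. The input layout and the example data are hypothetical.

```python
from collections import defaultdict


def brier_decomposition(forecasts, n_bins=10):
    """Return (calibration, resolution, uncertainty, brier).

    forecasts: list of (probability, outcome) pairs, outcome 0 or 1.
    """
    n = len(forecasts)
    base_rate = sum(outcome for _, outcome in forecasts) / n

    # Group forecasts into equal-width probability bins.
    bins = defaultdict(list)
    for p, outcome in forecasts:
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, outcome))

    calibration = resolution = 0.0
    for members in bins.values():
        size = len(members)
        mean_forecast = sum(p for p, _ in members) / size
        observed = sum(outcome for _, outcome in members) / size
        calibration += size * (mean_forecast - observed) ** 2
        resolution += size * (observed - base_rate) ** 2
    calibration /= n
    resolution /= n

    uncertainty = base_rate * (1 - base_rate)
    brier = sum((p - outcome) ** 2 for p, outcome in forecasts) / n
    return calibration, resolution, uncertainty, brier


# Hypothetical forecasts with identical probabilities inside each bin.
example = [(0.7, 1), (0.7, 1), (0.7, 0), (0.3, 0), (0.3, 1), (0.3, 0)]
cal, res, unc, brier = brier_decomposition(example)
print(f"{cal:.3f} - {res:.3f} + {unc:.3f} = {brier:.3f}")
```

The figures in the table above are consistent with an uncertainty term of about .25: for FiveThirtyEight, .003 - .082 + .250 = .171.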

To conclude, I would like to show a common visual representation that is used to graphically display a set of predictions. The reliability diagram compares the observed rate of forecasts with the forecast probability (similar to the above table).

The closer one of the colored lines is to the black line, the more reliable the forecasts are. If a forecast line is above the black line, the forecasts are underconfident; if it is below, they are overconfident. Given that we only investigated one tournament and therefore had to work with a low sample size (117 predictions), the big swings in the graph are somewhat expected. Still, we can see that the model based on ATP rankings does a really good job in preventing overestimations even though it is known to be outperformed by Elo in terms of prediction accuracy.

To sum up, this analysis shows how different predictive models for tennis can be compared with each other in a meaningful way. Moreover, I hope I could show some of the areas where a model is good and where it’s bad. Obviously, this investigation could go into much more detail by, for example, comparing how well the models do for different kinds of players (e.g., based on ranking), different surfaces, etc. This is something I will save for later. For now, I’ll try to get my sleeping patterns accustomed to the schedule of play for the Australian Open, and I hope you can do the same.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Footnotes

1. P(a) = a^e / (a^e + b^e) where a are player A’s ranking points, b are player B’s ranking points, and e is a constant. We use e = 0.85 for ATP men’s singles.
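As a minimal sketch, the formula in footnote 1 translates directly into code. The function name and the example ranking points are hypothetical; only the formula and e = 0.85 come from the footnote.

```python
def ranking_win_probability(points_a, points_b, e=0.85):
    """P(a) = a^e / (a^e + b^e), with a and b being ranking points."""
    if points_a <= 0 or points_b <= 0:
        raise ValueError("ranking points must be positive")
    weighted_a = points_a**e
    weighted_b = points_b**e
    return weighted_a / (weighted_a + weighted_b)


# Hypothetical ranking points for players A and B.
print(f"{ranking_win_probability(4000, 1000):.3f}")
```

Because e is smaller than 1, the forecast is less extreme than the raw ratio of ranking points would suggest.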

2. The betting market in itself is not really a model, that is, the goal of the bookmakers is simply to balance their book. This means that the odds, more or less, reflect the wisdom of the crowd, making it a very good predictor.

3. As an example, one instance where Pinnacle was underconfident and all other models were more confident is the R32 encounter between Ivo Karlovic and Jared Donaldson. Pinnacle’s implied probability for Karlovic to win was 64%. The other models (except the also underconfident Riles model) gave 72% (ATP ranking), 75% (FiveThirtyEight), and 82% (Tennis Abstract). As it turned out, Karlovic won in straight sets. One factor at play here might be that this was the US Open, where more US citizens are likely to be confident about the US player Jared Donaldson and hence place a bet on him. As a consequence, to balance the book, Pinnacle will lower the odds on Donaldson, which results in higher odds (and a lower implied probability) for Karlovic.

Andrey Kuznetsov and Career Highs of ATP Non-Semifinalists

When following this week’s ATP 250 tournament in Winston-Salem and seeing Andrey Kuznetsov in the quarterfinals the following question arose: Will he finally make it into the first ATP semifinal of his career? As shown here Andrey – with a ranking of 42 – is currently (by far) the best-ranked player who has not reached an ATP SF. And it looks as if he will stay on top of this list for some time longer after losing to Pablo Carreno Busta 4-6 3-6 on Wednesday.

With stats of 0-10 in ATP quarterfinals, he is still pretty far away from Teymuraz Gabashvili‘s streak of 0-16. Despite having lost six more quarterfinals before winning his first QF this January against a retiring Bernard Tomic, Teymuraz climbed only to a ranking of 50. Still, we could argue that the QF losing-streak of Teymuraz is not really over after having won against a possibly injured player.

Running the numbers can answer questions such as “Who could climb up highest in the rankings without having won an ATP quarterfinal?” Doing so will put Andrey’s number 42 into perspective and will possibly reveal some other statistical trivia.

Player                Rank            Date   On
Andrei Chesnokov        30      1986.11.03    1
Yen Hsun Lu             33      2010.11.01    1
Nick Kyrgios            34      2015.04.06    1
Adrian Voinea           36      1996.04.15    1
Paul Haarhuis           36      1990.07.09    1
Jaime Yzaga             40      1986.03.03    1
Antonio Zugarelli       41      1973.08.23    1
Bernard Tomic           41      2011.11.07    1
Omar Camporese          41      1989.10.09    1
Wayne Ferreira          41      1991.12.02    1
Andrey Kuznetsov        42      2016.08.22    0
David Goffin            42      2012.10.29    1
Mischa Zverev           45      2009.06.08    1
Alexandr Dolgopolov     46      2010.06.07    1
Andrew Sznajder         46      1989.09.25    1
Lukas Rosol             46      2013.04.08    1
Ulf Stenlund            46      1986.07.07    1
Dominic Thiem           47      2014.07.21    1
Janko Tipsarevic        47      2007.07.16    1
Paul Annacone           47      1985.04.08    1
Renzo Furlan            47      1991.06.17    1
Mike Fishbach           47      1978.01.16    0
Oscar Hernandez         48      2007.10.08    1
Ronald Agenor           48      1985.11.25    1
Gary Donnelly           48      1986.11.10    0
Francisco Gonzalez      49      1978.07.12    1
Paolo Lorenzi           49      2013.03.04    1
Boris Becker            50      1985.05.06    1
Brett Steven            50      1993.02.15    1
Dominik Hrbaty          50      1997.05.19    1
Mike Leach              50      1985.02.18    1
Patrik Kuhnen           50      1988.08.01    1
Teymuraz Gabashvili     50      2015.07.20    1
Blaine Willenborg       50      1984.09.10    0

The table shows career highs (up to #50) that players reached before winning their first ATP QF. A 0 in the last column indicates that the player has not won a QF yet and can therefore still climb in this table. Retired players who never got past a QF during their career are also marked with a 0.
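The underlying query can be sketched in a few lines: take a player's ranking history and keep the best rank achieved before the date of his first QF win, or over the whole history if there is no such win yet. The data layout and the example history below are hypothetical.

```python
from datetime import date


def career_high_before(ranking_history, first_qf_win=None):
    """Best (lowest) rank reached before the first ATP QF win.

    ranking_history: list of (date, rank) pairs.
    first_qf_win: date of the first QF win, or None if there is none yet.
    """
    eligible = [
        rank
        for day, rank in ranking_history
        if first_qf_win is None or day < first_qf_win
    ]
    return min(eligible) if eligible else None


# Hypothetical ranking history of a single player.
history = [
    (date(2014, 1, 6), 120),
    (date(2015, 3, 2), 48),
    (date(2016, 5, 9), 25),
]

print(career_high_before(history, first_qf_win=date(2015, 6, 1)))
print(career_high_before(history))
```

Players for whom `first_qf_win` is `None` correspond to the rows marked with a 0 in the table.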

I wonder who had Andrei Chesnokov on the radar for this. Before winning his first ATP QF he pushed his ranking as far as 30, and he later went on to a career high of 9. Nick Kyrgios also improved his ranking quickly without needing to go as deep as an SF: his Wimbledon 2014 QF, Roland Garros 2015 R32, and Australian Open 2015 QF runs helped him climb to #34 without a single win in an ATP QF. I would also like to highlight Alexandr Dolgopolov, who reached #46 before having played a single QF.

Looking only at players who are still active and able to up their ranking without an ATP SF we get the following picture:

Player                 Rank            Date
Andrey Kuznetsov         42      2016.08.22
Rui Machado              59      2011.10.03
Tatsuma Ito              60      2012.10.22
Matthew Ebden            61      2012.10.01
Kenny De Schepper        62      2014.04.07
Pere Riba                65      2011.05.16
Tim Smyczek              68      2015.04.06
Blaz Kavcic              68      2012.08.06
Alejandro Gonzalez       70      2014.06.09

Andrey is relatively alone at the top: Rui Machado, second on the list, reached his career high about five years ago. Skimming through the remainder of the table, it would be surprising if anyone came close to Andrey’s 42 soon, although a sudden streak by an up-and-coming player could always change that.

So what practical implications does this give us for analyzing tennis? Hardly any, I am afraid. Still, we can infer that it is possible to get well within the top-50 without winning more than two matches at a single tournament over a duration that can even range over a player’s whole career. Of course it would be interesting to see how long such players can stay in these ranking areas, guaranteeing direct acceptance into ATP tournaments and, hence, a more or less regular income from R32, R16, and QF prize money. Moreover, as the case of 2015-ish Nick Kyrgios shows, the question arises how one’s ranking points are composed: Performing well at the big stage of Masters or Grand Slams can be enough for a decent ranking while showing poor performance at ATP 250s. On the other hand, are there players whose ATP points breakdown reveals that they are willing to go for easier points at ATP 250s while never having deep runs at Masters or Grand Slams? These are questions which I would like to answer in a future post.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria. I would like to thank Jeff for being open-minded and allowing me to post these surface-scratching lines here.