The Tournament Simulation Reference

Italian translation at settesei.it

Among the more popular features of Heavy Topspin are my tournament forecasts, based on draw simulations.  It’s about time that I summarize how these work.

Monte Carlo simulations

To generate tournament predictions, we first need a way to predict the outcome of individual matches.  For that, I use jrank, which I’ve written about elsewhere.  With numerical estimates of a player’s skill–not unlike ATP ranking points–we can calculate the probability that each player wins the match.

Once those matchup probabilities are calculated, it’s a matter of “playing” the tournament thousands upon thousands of times.  Here, computers come in awfully handy.

My code (a version of which is publicly available) uses a random-number generator (RNG) to determine the winner of each match.  For instance, at the top of the Rogers Cup draw this week, Novak Djokovic gets a bye, after which he’ll play the winner of Bernard Tomic’s match with Michael Berrer.  My numbers give Tomic a 64% chance of beating Berrer.  To “play” that match in a simulated tournament, the RNG spits out a number between 0 and 1.  If the result is below .64, Tomic is the winner; if not, Berrer wins.

The winner advances to “play” Djokovic.  The code determines Djokovic’s probability of beating whoever advances to play him, then generates a new random number to pick the winner.  Repeat the process 47 times–once for each match–and you’ve simulated the entire tournament.
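The match-by-match procedure can be sketched in a few lines of Python.  This is only an illustration, not the published code: the function names, the tiny four-slot draw, and the skill numbers are assumptions, and win_prob() uses a simple ratio of skill numbers as a stand-in for the jrank-based calculation.  The Tomic and Berrer numbers are chosen so that the ratio reproduces the 64% from the example; Djokovic's number is made up.

```python
import random

def win_prob(skill_a, skill_b):
    """Stand-in formula (an assumption): chance that A beats B."""
    return skill_a / (skill_a + skill_b)

def play_match(a, b, skills, rng):
    """Pick a winner with one random draw. None marks a bye."""
    if a is None:
        return b
    if b is None:
        return a
    # rng.random() is a float in [0, 1); below the threshold, a wins
    return a if rng.random() < win_prob(skills[a], skills[b]) else b

def simulate_tournament(draw, skills, rng):
    """Play one bracket; returns the field at each stage, champion last."""
    rounds = [list(draw)]
    while len(rounds[-1]) > 1:
        field = rounds[-1]
        rounds.append([play_match(field[i], field[i + 1], skills, rng)
                       for i in range(0, len(field), 2)])
    return rounds

# Hypothetical skill numbers; 64 vs 36 gives Tomic the 64% from the text.
skills = {"Djokovic": 400, "Tomic": 64, "Berrer": 36}
draw = ["Djokovic", None, "Tomic", "Berrer"]   # None = Djokovic's bye

rounds = simulate_tournament(draw, skills, random.Random(2012))
print("Second round:", rounds[1])
print("Champion:", rounds[-1][0])
```

Passing in a seeded random.Random object keeps a single simulated tournament reproducible, which is handy when checking the logic by hand.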

Each simulation, then, gives us a set of results.  Perhaps Tomic reaches the second round, losing to Djokovic, who then loses in the quarters to Juan Martin Del Potro, who goes on to win the tournament.   That’s one possibility–and it’s more likely than many alternatives–but it doesn’t tell the whole story.

That’s why we do it thousands (or even millions) of times.  Over that many simulations, Delpo occasionally wins, but somewhat more often, Djokovic wins that quarterfinal showdown.  Tomic usually reaches the second round, but sometimes it’s Berrer into the second round.  All of these “usually’s” and “sometimes’s” are converted into percentages based on just how often they occur.
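Turning the “usually’s” and “sometimes’s” into percentages is just counting.  The sketch below repeats a toy four-player tournament many times and reports how often each player wins each round.  The players, skill numbers, and the ratio formula for match probabilities are all assumptions for illustration; the draw size is assumed to be a power of two with no byes.

```python
import random

def forecast(draw, skills, n_sims=10000, seed=0):
    """Share of simulations in which each player wins each round."""
    rng = random.Random(seed)
    n_rounds = len(draw).bit_length() - 1      # e.g. 4 players -> 2 rounds
    wins = {p: [0] * n_rounds for p in draw}   # wins[p][r]: times p won round r
    for _ in range(n_sims):
        field = list(draw)
        for r in range(n_rounds):
            survivors = []
            for a, b in zip(field[0::2], field[1::2]):
                # assumed stand-in formula for the match probability
                p_a = skills[a] / (skills[a] + skills[b])
                winner = a if rng.random() < p_a else b
                wins[winner][r] += 1
                survivors.append(winner)
            field = survivors
    return {p: [count / n_sims for count in counts]
            for p, counts in wins.items()}

# Hypothetical field: A is a heavy favorite over B; C and D are even.
draw = ["A", "B", "C", "D"]
skills = {"A": 80, "B": 20, "C": 50, "D": 50}

result = forecast(draw, skills)
for player, shares in result.items():
    print(player, [f"{s:.1%}" for s in shares])   # [reach final, win title]
```

With more simulations the percentages settle down; the last column always sums to 100% across the field because every simulated tournament has exactly one champion.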

Probability adjustments

For any given pair of players, we don’t always expect the same outcome.  Pablo Andujar is almost always the underdog on hardcourts, but we expect him to beat most mid-packers on clay.  Players perform (a bit) better in their home country.  Qualifiers do worse than equivalent players who didn’t have to qualify.

Thus, if we take last week’s Washington field and transplant it to the clay courts of Vina Del Mar, the numbers would change a great deal.  Americans and hard-court specialists would see their chances decrease, while Chileans and clay-courters would see theirs increase–just as conventional wisdom suggests would happen.

Simulation variations: Draw-independence

Some of the more interesting results come from messing around with the draw.  Every time a field is arranged into a bracket, there are winners and losers.  Whoever is drawn to face the top seed in the first round (or second, as Berrer and Tomic can attest) is probably unlucky, while somewhere else in the draw, a couple of lucky qualifiers get to play each other for a spot in the second round.

That’s one of the reasons I sometimes run draw-independent simulations (DIS).  If we want to know how much the draw helped or hurt a player, we need to know how successful he was likely to be before he was placed in the draw.  (DISs are also handy if you know the likely field, but the draw isn’t yet set.)

To run a draw-independent sim, we have to start one step earlier.  Instead of taking the draw as a given, we take the field as a given, including the seedings if we know them.  Then we use the same logic as tournament officials will use in constructing the draw.  The #1 seed goes at the top, #2 at the bottom.  #3 and #4 are randomly placed in the remaining quarters.  #5 through #8 are randomly placed in the remaining eighths, and so on.

(Update: I’ve published a python function, reseeder(), which generates random draws for any combination of number of seeds and field size that occurs on the ATP tour.)
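For readers who want to see the idea without digging into reseeder(), here is a separate, simplified sketch of the same logic.  It is not the published function: the name random_seeded_draw() is hypothetical, it assumes a power-of-two draw with no byes, and it always puts a seed in the top slot of its section, which is a simplification of where officials place seeds within a quarter or eighth.

```python
import random

def random_seeded_draw(seeds, unseeded, rng):
    """seeds: players in seed order; unseeded: everyone else.
    Returns the bracket as a list of slots, top to bottom."""
    size = len(seeds) + len(unseeded)
    draw = [None] * size
    draw[0] = seeds[0]       # #1 at the top
    draw[-1] = seeds[1]      # #2 at the bottom
    placed = 2
    while placed < len(seeds):
        n_sections = placed * 2          # quarters for #3-4, eighths for #5-8
        width = size // n_sections
        # starting slots of the sections that do not hold a seed yet
        open_starts = [s for s in range(0, size, width)
                       if all(slot is None for slot in draw[s:s + width])]
        rng.shuffle(open_starts)
        for player, start in zip(seeds[placed:placed * 2], open_starts):
            draw[start] = player         # simplification: top slot of section
        placed *= 2
    # everyone else fills the remaining slots in random order
    rest = list(unseeded)
    rng.shuffle(rest)
    fillers = iter(rest)
    return [slot if slot is not None else next(fillers) for slot in draw]

# Hypothetical 32-player field with 8 seeds.
seeds = [f"S{i}" for i in range(1, 9)]
unseeded = [f"U{i}" for i in range(1, 25)]
draw = random_seeded_draw(seeds, unseeded, random.Random(7))
print(draw)
```

Generating a fresh draw like this before each simulated tournament, rather than reusing the real bracket, is what makes the forecast draw-independent.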

Simulation variations: Seed-independence

We can take this even further to measure the beneficial effect of seeding.  Most of the time we take seeding for granted–we want the top two players in the world to clash only in the final, and so on.  But it can have a serious effect on a player’s chances of winning a tournament.  In Toronto this week, the top 16 seeds (along with, in all likelihood, a very lucky loser or two) get a bye straight into the second round.  That helps!

Even when there are no byes, seedings guarantee relatively easy matches for the first couple of rounds.  That may not make a huge difference for someone like Djokovic–he’ll cruise whether he draws a seeded Florian Mayer or an unseeded Jeremy Chardy.  But if you are Mayer, consider the benefits.  You’re barely better than some unseeded players, but you’re guaranteed to miss the big guns until the third round.

This is why we talk so much about getting into the top 32 in time for slams.  When the big points and big money are on the line, you want those easy opening matches even more than usual.  There isn’t much separating Kevin Anderson from Sam Querrey, but if the US Open draw were held today, Anderson would get a seed and Querrey wouldn’t.  Guess who we’d be more likely to see in the third round!

To run a seed-independent simulation: Instead of generating a logical draw, as we do with a DIS, generate a random draw, in which anyone can face anyone in the first round.
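In code, a random draw is nothing more than a shuffle.  The sketch below (function names and the 32-player field are hypothetical) also counts how often two particular players land in the same first-round match, which can never happen to the top two seeds in a seeded draw.  In a fully random 32-player draw it should happen about once in every 31 draws.

```python
import random

def random_draw(players, rng):
    """Seed-independent draw: every arrangement is equally likely."""
    draw = list(players)
    rng.shuffle(draw)
    return draw

def first_round_meeting_rate(players, a, b, n_draws, rng):
    """Fraction of random draws in which a and b meet in round one."""
    meetings = 0
    for _ in range(n_draws):
        draw = random_draw(players, rng)
        # slots (0,1), (2,3), ... are first-round opponents
        if draw.index(a) // 2 == draw.index(b) // 2:
            meetings += 1
    return meetings / n_draws

players = [f"P{i}" for i in range(1, 33)]
rate = first_round_meeting_rate(players, "P1", "P2", 20000, random.Random(1))
print(f"{rate:.1%}")   # should land near 1/31
```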

Measuring variations

If we compare forecasts based on the actual draw to draw-independent or seed-independent forecasts, we want to quantify the difference.  To do so, I’ve used two metrics: Expected Ranking Points (ERP) and Expected Prize Money (EPM).

Both reduce an entire tournament’s worth of forecasts to one number per player.  If Djokovic has a 30% chance of winning this week in Toronto, that’s the probability he’ll take home 1,000 points.  If those were the only points on offer, his ERP would be 30% of 1,000, or 300.

Of course, if Djokovic loses, he’ll still get some points.  To come up with his overall ERP, we consider his probability of losing the finals and the number of points awarded to the losing finalist, his probability of losing in the semis and the number of points awarded to semifinalists, and so on.  To calculate EPM, we use the same process, but with–you guessed it–prize money instead of ranking points.
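Here is a small worked example of that calculation.  Only the 30% title chance and the 1,000 points for the winner come from the text above; the other probabilities and the points for earlier rounds are illustrative assumptions, as are the function names.  The forecast gives the chance of reaching each round, so the chance of going out in a round is the chance of reaching it minus the chance of reaching the next one.

```python
def exit_probs(reach, rounds):
    """Convert P(reach round) into P(tournament ends in that round).
    The last entry in rounds is the title itself."""
    probs = {}
    for this_round, next_round in zip(rounds, rounds[1:]):
        probs[this_round] = reach[this_round] - reach[next_round]
    probs[rounds[-1]] = reach[rounds[-1]]
    return probs

def expected_value(exits, payouts):
    """Probability-weighted payout: works for points (ERP) or money (EPM)."""
    return sum(p * payouts[r] for r, p in exits.items())

rounds = ["R32", "R16", "QF", "SF", "F", "W"]
# Hypothetical forecast for a seed with a first-round bye.
reach = {"R32": 1.00, "R16": 0.90, "QF": 0.75, "SF": 0.60, "F": 0.45, "W": 0.30}
# Illustrative points for losing in each round; "W" is the champion's 1,000.
points = {"R32": 10, "R16": 90, "QF": 180, "SF": 360, "F": 600, "W": 1000}

erp = expected_value(exit_probs(reach, rounds), points)
print(round(erp, 1))
```

With these made-up inputs the total comes to about 485.5, of which 300 is the title-only share described above.  Swapping the points table for a prize-money table gives EPM with no other changes.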

Both numbers allow us to see how much the draw helps or hurts a player.  For instance, before the French Open, I calculated that Richard Gasquet’s EPM rose by approximately 25% thanks to a very lucky draw.

These numbers also help us analyze a player’s scheduling choices.  The very strong Olympics field and the much weaker Washington field last week created an odd situation: Lesser players were able to rack up far more points than their more accomplished colleagues. Even before the tournament, we could use the ERP/EPM approach to see that Mardy Fish could expect 177 points in Washington while the far superior David Ferrer could expect only 159 in London.

If you’ve read this far, you will probably enjoy the newest feature on TennisAbstract.com–live-ish forecast updates for all ATP events.  Find links on the TA.com homepage, or click straight to the Rogers Cup page.

Serving First in Marathon Sets

Italian translation at settesei.it

Last night, when Jo-Wilfried Tsonga finally defeated Milos Raonic, it was on a match-ending break of serve.  Conventional wisdom suggests that’s often how it goes.  Whoever serves first in a long set seems to have the advantage.  There’s less pressure to hold serve at 7-7 (or 47-47) than there is at 7-8.

Tsonga won his contest with a match-ending break point; Isner finished off his 70-68 set on Mahut’s serve; and when Federer and Roddick went to 14-14 in the 2009 Wimbledon final, Roger held for 15-14 before breaking the American.  Is it a trend?

As it turns out, those three high-profile matches have misled us.  Based on the limited data available, the first server in fifth-set epics has little or no advantage.

(Third-set epics are so rare that we might as well ignore them–the Olympics is the only tournament where men play best-of-three with no tiebreak in the final set.)

We don’t know who served first for every marathon fifth set in tennis history, but we can figure it out for some.  The ATP has limited stats for most matches back to 1991, and those stats include numbers of service games.  When the number of service games is equal for both players, we’re stuck at square one.  When one player has more than the other, that guy must have served the first game of the match–and the last.  Since marathon sets must contain an even number of games, the man who served the last game served second in the final set, so we know his opponent served first in it.
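That inference is simple enough to write down.  The sketch below takes the reasoning above at face value; the function name and the service-game counts in the example are hypothetical, and it assumes the final set was played without a tiebreak.

```python
def first_server_of_final_set(service_games_a, service_games_b):
    """Return "A" or "B" for who served first in a no-tiebreak final set,
    or None when the service-game counts do not settle it."""
    if service_games_a == service_games_b:
        return None                      # stuck at square one
    # Per the reasoning in the text: the player with more service games
    # served the last game of the match.
    last_server = "A" if service_games_a > service_games_b else "B"
    # A no-tiebreak set has an even number of games, so whoever served
    # the last game served second in that set.
    return "B" if last_server == "A" else "A"

# Hypothetical counts: A served 45 games, B served 44.
print(first_server_of_final_set(45, 44))   # B served first in the final set
print(first_server_of_final_set(40, 40))   # None: cannot tell
```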

The result is a pool of 138 matches in which the fifth set ended at 8-6 or higher and we know who served first.  Of those, the guy who served first–at 0-0, 1-1, 6-6, and so on–won the match 67 times (48.6%).  It’s a coin toss.

If we take pressure out of the equation, this makes perfect sense.  If two guys have gotten to 6-6 in the fifth set, they’re playing as equally as two tennis players can play.  It’s only when we consider the stress of serving to stay in the match that we start to suspect that one player–but not the other–won’t be able to hold up his end.

For a bigger dataset, we can look to similar situations.  Consider 5-setters that end 7-5 in the fifth.  Those don’t have the cachet of matches that go farther, but they are quite epic in their own right.  We know who served first in 86 such matches, and of those, the man who served first won only 38 (44.2%).  It’s not exactly proof that the first server has a disadvantage, but it does cast more doubt on the conventional wisdom.

If we want more than 200 or so matches, we need to weaken our definition of “epic.”  Tiebreaks aren’t relevant here, since we’re looking for instances where one player was broken under pressure.  But we can use best-of-three contests that ended 7-5.

With so many more best-of-three matches on the schedule, our dataset is now much bigger.  We know who served first for 753 tour-level matches that ended 7-5 in the third.  Of these, the player who served first went 412-341, winning nearly 55% of matches.

If you want evidence that the conventional wisdom is correct, there you go.  If a match reaches 5-5 in the deciding set and ends with a break, there is, altogether, a 53% chance that the first server wins.
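The arithmetic behind these percentages is easy to check.  The counts below are the ones quoted in this post.  Pooling all three samples is one reading that reproduces the 53% figure; treat that as an assumption about how the number was put together.

```python
# (first-server wins, matches with a known first server), from the text
samples = {
    "fifth set, 8-6 or longer": (67, 138),
    "fifth set, 7-5": (38, 86),
    "third set, 7-5": (412, 753),
}

for label, (wins, matches) in samples.items():
    print(f"{label}: {wins}/{matches} = {wins / matches:.1%}")

# Assumed pooling: every decider that reached 5-5 and ended with a break.
pooled_wins = sum(wins for wins, _ in samples.values())
pooled_matches = sum(matches for _, matches in samples.values())
print(f"pooled: {pooled_wins}/{pooled_matches} = "
      f"{pooled_wins / pooled_matches:.1%}")
```

The three samples come out to 48.6%, 44.2%, and 54.7%, and the pooled rate to 52.9%.  The best-of-three matches dominate the pooled number simply because there are so many more of them.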

But with our more limited data, it’s impossible to draw the same conclusion about five-setters once they head into the barely-charted territory beyond 6-6.

2012 Olympics Round of 16 Forecasts

Here are my forecasts for the remaining 16 players in both Olympics singles draws.  Note that Djokovic has opened up a bigger gap over Federer.  Novak is aided by Berdych’s upset, while Federer is still likely to play the top seeds in his half.

On the women’s side, the third quarter is a crowded one, with Clijsters, Sharapova, and two dangerous floaters in Ivanovic and Lisicki.

For more background, you can see my initial forecasts, (almost) current rankings, and methodology.

Men:

Player                       QF     SF      F      W  
(1)Roger Federer          85.3%  64.5%  45.1%  25.7%  
Denis Istomin             14.7%   5.0%   1.5%   0.3%  
(10)John Isner            53.5%  16.9%   7.5%   2.4%  
(7)Janko Tipsarevic       46.5%  13.5%   5.6%   1.7%  
(4)David Ferrer           63.3%  36.3%  16.2%   6.7%  
(15)Kei Nishikori         36.7%  16.0%   5.2%   1.6%  
(12)Gilles Simon          32.3%  11.7%   3.3%   0.8%  
(8)Juan Martin Del Potro  67.7%  36.0%  15.5%   6.2%  

Player                       QF     SF      F      W  
Steve Darcis              39.5%   8.9%   1.5%   0.3%  
(11)Nicolas Almagro       60.5%  18.1%   4.2%   1.3%  
Marcos Baghdatis          22.7%  11.9%   2.7%   0.7%  
(3)Andy Murray            77.3%  61.1%  29.8%  16.4%  
(5)Jo-Wilfried Tsonga     67.5%  23.3%  12.0%   5.4%  
Feliciano Lopez           32.5%   6.9%   2.4%   0.7%  
(WC)Lleyton Hewitt         4.6%   0.6%   0.1%   0.0%  
(2)Novak Djokovic         95.4%  69.3%  47.3%  29.7%

Women:

Player                 QF     SF      F      W  
Victoria Azarenka   78.9%  53.3%  28.2%  18.0%  
Nadia Petrova       21.1%   7.9%   1.9%   0.6%  
Venus Williams      16.8%   2.5%   0.3%   0.1%  
Angelique Kerber    83.2%  36.3%  14.8%   7.6%  
Serena Williams     75.9%  56.2%  36.9%  26.2%  
Vera Zvonareva      24.1%  11.5%   4.4%   1.9%  
Daniela Hantuchova  36.2%   9.1%   2.9%   1.1%  
Caroline Wozniacki  63.8%  23.2%  10.6%   5.3%  

Player                 QF     SF      F      W  
Kim Clijsters       62.5%  33.2%  20.3%   8.9%  
Ana Ivanovic        37.5%  15.4%   7.4%   2.5%  
Sabine Lisicki      36.8%  15.7%   7.7%   2.5%  
Maria Sharapova     63.2%  35.6%  22.2%  10.0%  
Petra Kvitova       65.5%  45.7%  23.9%  10.2%  
Flavia Pennetta     34.5%  18.9%   7.0%   1.9%  
Maria Kirilenko     47.5%  16.2%   5.0%   1.2%  
Julia Goerges       52.5%  19.3%   6.6%   1.8%