# The Tournament Simulation Reference

Among the more popular features of Heavy Topspin are my tournament forecasts, based on draw simulations.  It’s about time that I summarize how these work.

## Monte Carlo simulations

To generate tournament predictions, we first need a way to predict the outcome of individual matches.  For that, I use jrank, which I’ve written about elsewhere.  With numerical estimates of each player’s skill (not unlike ATP ranking points), we can calculate the probability that either player wins a given match.

Once those matchup probabilities are calculated, it’s a matter of “playing” the tournament thousands upon thousands of times.  Here, computers come in awfully handy.

My code (a version of which is publicly available) uses a random-number generator (RNG) to determine the winner of each match.  For instance, at the top of the Rogers Cup draw this week, Novak Djokovic gets a bye, after which he’ll play the winner of Bernard Tomic’s match with Michael Berrer.  My numbers give Tomic a 64% chance of beating Berrer.  To “play” that match in a simulated tournament, the RNG spits out a number between 0 and 1.  If the result is below 0.64, Tomic is the winner; if not, Berrer wins.
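That single coin flip can be sketched in a few lines of Python. This is an illustration only, not the published code: `play_match` is a hypothetical name, and the 0.64 is the Tomic-Berrer probability quoted above.

```python
import random

def play_match(player_a, player_b, p_a_wins, rng=random):
    """Return player_a with probability p_a_wins, otherwise player_b."""
    # rng.random() returns a float in [0.0, 1.0)
    return player_a if rng.random() < p_a_wins else player_b

rng = random.Random(2012)  # seeded only so the run is repeatable
results = [play_match("Tomic", "Berrer", 0.64, rng) for _ in range(100_000)]
tomic_share = results.count("Tomic") / len(results)
print(f"Tomic won {tomic_share:.1%} of the simulated matches")
```

Over that many repetitions, Tomic's share of wins should land very close to 64%.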

The winner advances to “play” Djokovic.  The code determines Djokovic’s probability of beating whoever advances to play him, then generates a new random number to pick the winner.  Repeat the process 47 times–one for each match–and you’ve simulated the entire tournament.
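Here is a rough sketch of one full pass through a bracket, byes included. Everything named here is an assumption made for illustration: `simulate_tournament` is a hypothetical function, the ratings are invented, and the "share of ratings" formula is only a stand-in for whatever model supplies the match probabilities (it is not jrank).

```python
import random

def simulate_tournament(draw, win_prob, rng=random):
    """Play one bracket and return (champion, matches_played).

    draw: players in bracket order, length a power of two; None marks a bye.
    win_prob(a, b): probability that a beats b (any model you like).
    """
    field, matches = list(draw), 0
    while len(field) > 1:
        next_round = []
        for a, b in zip(field[0::2], field[1::2]):
            if a is None or b is None:
                next_round.append(a if b is None else b)  # bye: no match played
            else:
                matches += 1
                next_round.append(a if rng.random() < win_prob(a, b) else b)
        field = next_round
    return field[0], matches

# Hypothetical ratings, made up for this example (they are not jrank values).
RATINGS = {"Djokovic": 9000, "Tomic": 1100, "Berrer": 620}

def share_of_ratings(a, b):
    """Placeholder model: a's share of the two ratings."""
    return RATINGS[a] / (RATINGS[a] + RATINGS[b])

champion, matches = simulate_tournament(
    ["Djokovic", None, "Tomic", "Berrer"], share_of_ratings, random.Random(1))
print(champion, "won after", matches, "matches")
```

A 64-slot bracket with 48 players and 16 byes run through the same function plays exactly 47 matches, which is where the number above comes from.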

Each simulation, then, gives us a set of results.  Perhaps Tomic reaches the second round, losing to Djokovic, who then loses in the quarters to Juan Martin Del Potro, who goes on to win the tournament.   That’s one possibility–and it’s more likely than many alternatives–but it doesn’t tell the whole story.

That’s why we do it thousands (or even millions) of times.  Over that many simulations, Delpo occasionally wins, but somewhat more often, Djokovic wins that quarterfinal showdown.  Tomic usually reaches the second round, but sometimes it’s Berrer into the second round.  All of these “usually’s” and “sometimes’s” are converted into percentages based on just how often they occur.
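The tallying step might look something like the sketch below. The function names are hypothetical, the 0.64 is the Tomic-Berrer figure from earlier, and the 0.90 for Djokovic is a made-up placeholder so the example runs.

```python
import random
from collections import Counter

def simulate_once(draw, win_prob, rng):
    """Play one bracket (None = bye); return {player: rounds survived}."""
    survived = {p: 0 for p in draw if p is not None}
    field = list(draw)
    while len(field) > 1:
        next_round = []
        for a, b in zip(field[0::2], field[1::2]):
            if a is None or b is None:
                winner = a if b is None else b
            else:
                winner = a if rng.random() < win_prob(a, b) else b
            if winner is not None:
                survived[winner] += 1
            next_round.append(winner)
        field = next_round
    return survived

def forecast(draw, win_prob, n_sims=10_000, seed=None):
    """Share of simulations in which each player survives each round.
    Key (player, r) means "got through round r"; the last round is the title."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        for player, rounds in simulate_once(draw, win_prob, rng).items():
            for r in range(1, rounds + 1):
                counts[player, r] += 1
    return {key: n / n_sims for key, n in counts.items()}

def win_prob(a, b):
    # 0.64 is the article's Tomic-Berrer number; 0.90 for Djokovic is made up.
    table = {("Tomic", "Berrer"): 0.64,
             ("Djokovic", "Tomic"): 0.90, ("Djokovic", "Berrer"): 0.90}
    return table[a, b] if (a, b) in table else 1 - table[b, a]

odds = forecast(["Djokovic", None, "Tomic", "Berrer"], win_prob, n_sims=20_000, seed=7)
for (player, r), share in sorted(odds.items()):
    print(f"{player:9s} through round {r}: {share:.1%}")
```

Counting a bye as a round survived keeps the bookkeeping simple: Djokovic "gets through" round one in every simulation, while Tomic does so in roughly 64% of them.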

For any given pair of players, we don’t always expect the same outcome.  Pablo Andujar is almost always the underdog on hardcourts, but we expect him to beat most mid-packers on clay.  Players perform (a bit) better in their home country.  Qualifiers do worse than equivalent players who didn’t have to qualify.

Thus, if we take last week’s Washington field and transplant it to the clay courts of Vina Del Mar, the numbers would change a great deal.  Americans and hard-court specialists would see their chances decrease, while Chileans and clay-courters would see theirs increase–just as conventional wisdom suggests would happen.

## Simulation variations: Draw-independence

Some of the more interesting results come from messing around with the draw.  Every time a field is arranged into a bracket, there are winners and losers.  Whoever is drawn to face the top seed in the first round (or second, as Berrer and Tomic can attest) is probably unlucky, while somewhere else in the draw, a couple of lucky qualifiers get to play each other for a spot in the second round.

That’s one of the reasons I sometimes run draw-independent simulations (DIS).  If we want to know how much the draw helped or hurt a player, we need to know how successful he was likely to be before he was placed in the draw.  (DISs are also handy if you know the likely field, but the draw isn’t yet set.)

To run a draw-independent sim, we have to start one step earlier.  Instead of taking the draw as a given, we take the field as a given, including the seedings if we know them.  Then we use the same logic as tournament officials will use in constructing the draw.  The #1 seed goes at the top, #2 at the bottom.  #3 and #4 are randomly placed in the remaining quarters.  #5 through #8 are randomly placed in the remaining eighths, and so on.

(Update: I’ve published a python function, reseeder(), which generates random draws for any combination of number of seeds and field size that occurs on the ATP tour.)
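For readers who want to see the placement logic spelled out, here is a simplified sketch. It is not reseeder(): `random_seeded_draw` is a hypothetical helper, it assumes a power-of-two draw size, it only hands byes to the top seeds, and the exact line each seed occupies inside its section is a simplification that preserves who can meet whom in which round.

```python
import random

def random_seeded_draw(seeds, unseeded, draw_size, rng=random):
    """Hypothetical helper, not the author's reseeder().

    seeds: seeded players, best first. unseeded: everyone else.
    draw_size: a power of two, at least twice the number of seeds.
    Unfilled first-round slots become byes (None) beside the top seeds.
    """
    n_byes = draw_size - len(seeds) - len(unseeded)
    if not 0 <= n_byes <= len(seeds):
        raise ValueError("this sketch only hands byes to seeded players")
    draw = [None] * draw_size
    seed_slots = []   # slot index of each seed, best seed first
    parts = 1         # the draw is cut into 1, 2, 4, 8, ... sections
    while len(seed_slots) < len(seeds):
        size = draw_size // parts
        # sections at this level that hold no seed yet
        empty = [p for p in range(parts)
                 if not any(p * size <= s < (p + 1) * size for s in seed_slots)]
        rng.shuffle(empty)
        tier = seeds[len(seed_slots):len(seed_slots) + len(empty)]
        for player, p in zip(tier, empty):
            # simplification: upper section of a pair uses its first line,
            # lower section its last line, so #1 is on top and #2 at the bottom
            slot = p * size if p % 2 == 0 else (p + 1) * size - 1
            draw[slot] = player
            seed_slots.append(slot)
        parts *= 2
    bye_slots = {s ^ 1 for s in seed_slots[:n_byes]}  # first-round neighbours
    open_slots = [i for i in range(draw_size)
                  if draw[i] is None and i not in bye_slots]
    others = list(unseeded)
    rng.shuffle(others)
    for slot, player in zip(open_slots, others):
        draw[slot] = player
    return draw

seeds = ["Seed1", "Seed2", "Seed3", "Seed4"]
others = [f"Player{i}" for i in range(1, 11)]   # 10 unseeded players
draw = random_seeded_draw(seeds, others, 16, random.Random(42))
for a, b in zip(draw[0::2], draw[1::2]):
    print(a or "bye", "vs", b or "bye")
```

Generating a fresh draw like this at the start of every simulated tournament, then playing it out, gives the draw-independent forecast.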

## Simulation variations: Seed-independence

We can take this even further to measure the beneficial effect of seeding.  Most of the time we take seeding for granted–we want the top two players in the world to clash only in the final, and so on.  But it can have a serious effect on a player’s chances of winning a tournament.  In Toronto this week, the top 16 seeds (along with, in all likelihood, a very lucky loser or two) get a bye straight into the second round.  That helps!

Even when there are no byes, seedings guarantee relatively easy matches for the first couple of rounds.  That may not make a huge difference for someone like Djokovic–he’ll cruise whether he draws a seeded Florian Mayer or an unseeded Jeremy Chardy.  But if you are Mayer, consider the benefits.  You’re barely better than some unseeded players, but you’re guaranteed to miss the big guns until the third round.

This is why we talk so much about getting into the top 32 in time for slams.  When the big points and big money are on the line, you want those easy opening matches even more than usual.  There isn’t much separating Kevin Anderson from Sam Querrey, but if the US Open draw were held today, Anderson would get a seed and Querrey wouldn’t.  Guess who we’d be more likely to see in the third round!

To run a seed-independent simulation, skip the seeding logic altogether.  Instead of generating a logical draw, as we do with a DIS, generate a fully random draw, in which anyone can face anyone in the first round.
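A fully random draw is little more than a shuffle. The sketch below is one possible way to do it; `random_draw` is a hypothetical name, and handing out byes at random (one per first-round match at most) is my assumption for fields that don't fill the bracket.

```python
import random

def random_draw(field, draw_size, rng=random):
    """Place every player at random, ignoring seedings; None marks a bye."""
    n_byes = draw_size - len(field)
    if not 0 <= n_byes <= draw_size // 2:
        raise ValueError("field does not fit this draw size")
    players = list(field)
    rng.shuffle(players)
    # pick which first-round matches contain a bye, so no bye meets a bye
    bye_matches = set(rng.sample(range(draw_size // 2), n_byes))
    draw = []
    for match in range(draw_size // 2):
        if match in bye_matches:
            draw += [players.pop(), None]
        else:
            draw += [players.pop(), players.pop()]
    return draw

field = [f"Player{i}" for i in range(1, 49)]        # 48 players
draw = random_draw(field, 64, random.Random(5))     # 16 random byes
print(draw[:8])
```

With this in place of the seeded draw, the top two players can land in the same first-round match, which is exactly the situation seeding exists to prevent.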

## Measuring variations

If we compare forecasts based on the actual draw to draw-independent or seed-independent forecasts, we want to quantify the difference.  To do so, I’ve used two metrics: Expected Ranking Points (ERP) and Expected Prize Money (EPM).

Both reduce an entire tournament’s worth of forecasts to one number per player.  If Djokovic has a 30% chance of winning this week in Toronto, that’s the probability he’ll take home 1,000 points.  If those were the only points on offer, his ERP would be 30% of 1,000, or 300.

Of course, if Djokovic loses, he’ll still get some points.  To come up with his overall ERP, we consider his probability of losing the finals and the number of points awarded to the losing finalist, his probability of losing in the semis and the number of points awarded to semifinalists, and so on.  To calculate EPM, we use the same process, but with–you guessed it–prize money instead of ranking points.
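As a worked example, the sum can be written out as below. Only the 30% title chance and the 1,000 winner's points come from the text above; the other probabilities and the rest of the points table are illustrative placeholders, and `expected_payout` is a hypothetical name.

```python
def expected_payout(reach, payout):
    """Expected points (or prize money) for one player.

    reach[i]:  probability of reaching stage i, earliest stage first,
               with the last entry being the probability of winning the title.
    payout[i]: what a player earns for finishing at stage i (losing there),
               with the last entry being the champion's payout.
    """
    total = 0.0
    for i, pay in enumerate(payout):
        p_next = reach[i + 1] if i + 1 < len(reach) else 0.0
        total += (reach[i] - p_next) * pay  # P(finish exactly at stage i) * payout
    return total

# Stages for a seed with a bye: R2, R3, QF, SF, F, title.
reach = [1.00, 0.85, 0.70, 0.55, 0.42, 0.30]   # placeholders except the 0.30
points = [10, 90, 180, 360, 600, 1000]         # illustrative table; 1000 for the title

erp = expected_payout(reach, points)
print(f"ERP: {erp:.1f}")
```

Swapping the points table for a prize-money table gives EPM with no other changes.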

Both numbers allow us to see how much the draw helps or hurts a player.  For instance, before the French Open, I calculated that Richard Gasquet’s EPM rose by approximately 25% thanks to a very lucky draw.

These numbers also help us analyze a player’s scheduling choices.  The very strong Olympics field and the much weaker Washington field last week created an odd situation: Lesser players were able to rack up far more points than their more accomplished colleagues. Even before the tournament, we could use the ERP/EPM approach to see that Mardy Fish could expect 177 points in Washington while the far superior David Ferrer could expect only 159 in London.

If you’ve read this far, you will probably enjoy the newest feature on TennisAbstract.com–live-ish forecast updates for all ATP events.  Find links on the TA.com homepage, or click straight to the Rogers Cup page.

## 4 thoughts on “The Tournament Simulation Reference”

1. jalnichols says:

Have you ever referenced betting odds compared to your projections? I’ve noticed in the past your odds for the better players understate their advantage compared to the no-vig Pinnacle Sports betting line.

1. I have, but not thoroughly enough to come to any clear conclusions. You’re right, I’m much more bearish on top players than betting odds are, and I know that I’m too bearish. There’s a bit on that here:
http://heavytopspin.com/2011/12/27/grand-slam-forecasting-for-dummies/

I suspect betting odds are too bullish on the same players, but I’m sure they are closer to correct than I am!

2. Philip says:

Does each player’s JRank remain constant throughout the tournament, or is it updated after each round?

1. Stays constant. I haven’t tested whether updating after each round would increase accuracy.