Italian translation at settesei.it
At HeavyTopspin, I frequently post references to “my rankings” which power my tournament projections. (For instance, 2012 French Open men and women.) My system is unofficially called “JRank”–in other words, it needs a new name. The rankings it generates are superior to the ATP (and presumably WTA) rankings in the sense that they better predict the outcome of tour- and challenger-level matches.
The algorithm is complex but the ideas behind it are not. The fundamental difference between JRank and the ATP system is how it values individual matches.
The ATP system awards points based on tournament and round. (A first round win at Wimbledon is worth more than a first round win at Halle; a third round win at Roland Garros is worth more than a second round win.) JRank, by contrast, awards points based on opponent and recency. In my system, a win against Rafael Nadal is worth much more than a defeat of Igor Kunitsyn, even if both take place in the same round at the same tournament. And a defeat of Kunitsyn is worth more if it took place last week than if it took place eight months ago. A recent win tells you more about a player’s current ability level than an older one does.
The advantage of giving recent matches more weight is that it lets us take into account matches more than one year old without the veteran-favoring disadvantages of Nadal’s proposed two-year ranking plan. JRank uses all matches from the last two years, but a match one year ago is worth only half as much as a match last week, and a match two years ago only a quarter as much. That way, we get the benefit of that much more data without unduly favoring vets. There is the added benefit that JRank is “smoother” from week to week–none of the bizarre effects of a tournament “falling off” from last year, as if a player’s results 51 weeks ago counted in full while his results 54 weeks ago counted for nothing!
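To make the recency weighting concrete, here is a minimal Python sketch. It is an illustration, not the actual JRank code: the text only fixes three points (full weight last week, half at one year, a quarter at two years), and the smooth exponential curve between them, the hard cutoff after 104 weeks, and the name `recency_weight` are all assumptions.

```python
def recency_weight(age_weeks: float,
                   half_life_weeks: float = 52.0,
                   max_age_weeks: float = 104.0) -> float:
    """Weight of a match result that is `age_weeks` old.

    Assumption: the weight halves every `half_life_weeks`, and matches
    older than `max_age_weeks` are ignored entirely.
    """
    if age_weeks < 0:
        raise ValueError("age_weeks must be non-negative")
    if age_weeks > max_age_weeks:
        return 0.0
    return 0.5 ** (age_weeks / half_life_weeks)


# The weight changes only slightly from one week to the next, so no
# result suddenly "falls off" at the one-year mark.
for weeks in (0, 51, 54, 104):
    print(weeks, round(recency_weight(weeks), 3))
```

Under these assumptions, results from 51 and 54 weeks ago receive nearly the same weight, which is the smoothness described above.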
JRank’s value is even greater because it generates separate rankings for clay and hard surfaces. Everyone knows that surface matters, but the ATP ranking system ignores it completely. If you want to know who should be favored at the French, it seems silly to weight Bercy (indoor hard court) as heavily as Monte Carlo (clay). JRank gives more weight to a player’s clay record for his clay ranking, and so on. Further, beating a clay court specialist is worth more on clay than it is on a hard court.
Creating projections
Armed with rankings, it’s a few small steps to generating a forecast for any tournament. For each match, the projection is based almost entirely on the rankings of the two players. (The formula is a slightly more complicated version of A divided by A+B, where A is one player’s ranking points and B is the other’s. It works–approximately–with ATP ranking points as well.)
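A rough Python sketch of those steps follows. It uses only the simple A/(A+B) form, not the slightly more complicated formula JRank actually uses, and the point totals are made up. The function names `win_probability` and `title_odds` are hypothetical, and the draw is assumed to be a single-elimination bracket whose size is a power of two.

```python
def win_probability(points_a: float, points_b: float) -> float:
    """Chance that player A beats player B, using the simple A / (A + B) form."""
    if points_a <= 0 or points_b <= 0:
        raise ValueError("ranking points must be positive")
    return points_a / (points_a + points_b)


def title_odds(draw: list[float]) -> list[float]:
    """Title probability for each player in a single-elimination draw.

    `draw` lists ranking points in bracket order (slot 0 plays slot 1,
    slot 2 plays slot 3, and so on). Its length must be a power of two.
    """
    n = len(draw)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("draw size must be a power of two")
    # alive[i] = probability that player i has survived the rounds so far
    alive = [1.0] * n
    size = 1  # number of players in each section feeding the current round
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            opp_section = (i // size) ^ 1  # the neighbouring section of the draw
            for j in range(opp_section * size, (opp_section + 1) * size):
                nxt[i] += alive[i] * alive[j] * win_probability(draw[i], draw[j])
        alive = nxt
        size *= 2
    return alive


# Hypothetical four-player draw: one strong player and three equal ones.
print(win_probability(3000, 1000))          # 0.75
print(title_odds([3000, 1000, 1000, 1000]))
```

Each round multiplies a player's chance of still being alive by his chance of beating every possible opponent from the neighbouring section, weighted by how likely that opponent is to get there.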
There are a few tweaks, though. First, my research has indicated that qualifiers, lucky losers, and wild cards all perform slightly below expectations. It is unclear why, though with qualifiers I suspect it is due to fatigue–while their opponents rested, they played two or three tough matches to qualify.
Second, I’ve established that there is a slight home court advantage. When surface is accounted for, home court advantage is minimal, but it is still there–the “home” player performs about 2% better than expected. Perhaps it’s referee bias, home cooking, fan support, or some combination of the above.
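One way such a tweak could be applied is sketched below in Python. The text does not say how the 2% enters the formula, so this assumes it scales the home player's win probability by 1.02. The function name `home_adjusted` and the `"a"`/`"b"` labels are hypothetical.

```python
def home_adjusted(p_a: float, home: str, boost: float = 0.02) -> float:
    """Adjust player A's win probability for home-court advantage.

    Assumption: the home player's win probability is multiplied by
    (1 + boost) and capped at 1. `home` is "a", "b", or anything else
    when neither player is at home.
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must be a probability")
    if home == "a":
        return min(1.0, p_a * (1.0 + boost))
    if home == "b":
        # Boost B's chance, then take the complement to get A's chance.
        return 1.0 - min(1.0, (1.0 - p_a) * (1.0 + boost))
    return p_a


# An evenly matched pair: the home player moves from 50% to about 51%.
print(home_adjusted(0.5, home="a"))
print(home_adjusted(0.5, home="b"))
```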
A frequent suggestion is to incorporate head-to-head records into match projections. It’s a tempting idea–so tempting that I’ve tried it. However, it doesn’t seem to make much difference, at least for any broad cross-section of matches. (Perhaps when a pair of players have, say, 10 or more head-to-head matches in the books, stronger patterns emerge.) For the most part, it seems that if a ranking system represents a good approximation of each player’s ability level, head-to-head results are superfluous.
There may be other variables worth looking at, including the importance of the tournament, the player’s fatigue level or recent injury history, or each player’s experience at a particular event. For now, those are among the influences I haven’t even tested.