Why Do My Forecasts (Sometimes) Sharply Differ From Betting Markets?

I got an email this week, from Peter S., asking this question. There are a few reasons.

Peter zeroed in on a first-rounder at the Temuco Challenger, between Milledge Cossu and Alafia Ayeni. Ayeni is ranked 731, Cossu in the 1600s. Betting markets had Ayeni as a heavy favorite; my forecast gave Cossu the edge. Sportsbooks, unsurprisingly, had this one right, as Ayeni needed just 68 minutes to advance.

My forecasts are based on my Elo ratings, which take into account all tour-level and tour-level qualifying matches, plus all Challenger main-draw results. (For women, I consider ITFs down to the $50K level.) For most players that you’ve heard of, that means that my Elo ratings are looking at every match they play. But for the likes of Ayeni and Cossu, it’s the opposite. The majority of Ayeni’s results this year have come at the ITF level. Cossu doesn’t have many pro results, period.

Point being, my forecast for a match like that is based on too little information to be anywhere near reliable. It might agree with the betting market, but only by happenstance. Ayeni has a poor recent record in Challengers, while Cossu hasn’t played any. By the logic of Elo, even starting a newbie at a fairly low rating, that makes Cossu the favorite. Give them both a dozen more matches, and the kinks would be ironed out, but we don’t get to do that simply for the sake of science.
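The "newbie beats the struggler" logic can be sketched with the standard Elo formulas. The starting rating, K-factor, and opponent ratings below are illustrative assumptions, not the actual parameters behind these ratings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo win probability for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Return the player's new rating after one match (standard Elo update)."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent))


START = 1200.0  # assumed starting rating for a newcomer (illustrative only)

# A newcomer with no rated results simply sits at the starting rating.
newcomer = START

# A player who started at the same place, then lost four straight rated
# matches to (hypothetical) 1300-rated opponents.
struggler = START
for _ in range(4):
    struggler = update(struggler, 1300.0, won=False)

newcomer_win_prob = expected_score(newcomer, struggler)
print(f"struggler rating: {struggler:.0f}")
print(f"newcomer win probability: {newcomer_win_prob:.2f}")
```

With so few matches on either side, the "edge" here reflects nothing more than a handful of losses against an untouched starting number.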

Maybe I should indicate that more clearly on the forecast pages. (Or maybe I should include ITF results, too. Lots of stuff I should probably do.) In the meantime, you can check my Elo ratings leaderboard. Neither Ayeni nor Cossu even appears, indicating that neither has played ten matches in the last 52 weeks that contribute to their rating. Ayeni is close to that threshold, so his rating (1166) is probably in the ballpark. Cossu is an unknown quantity. If players aren’t on that list, their forecasts aren’t going to be as accurate as those for players who are.
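If you keep your own list of a player's rated matches, the leaderboard cutoff is easy to approximate. This is a rough sketch of the ten-matches-in-52-weeks rule only; the function name and the sample dates are hypothetical, and which matches count as "rated" is up to you to filter beforehand.

```python
from datetime import date, timedelta

MIN_MATCHES = 10              # ten rated matches...
WINDOW = timedelta(weeks=52)  # ...within the last 52 weeks


def on_leaderboard(match_dates: list[date], today: date) -> bool:
    """True if enough rated matches fall inside the look-back window."""
    cutoff = today - WINDOW
    recent = [d for d in match_dates if cutoff <= d <= today]
    return len(recent) >= MIN_MATCHES


# Hypothetical player: nine rated matches this year, plus one too old to count.
today = date(2025, 12, 1)
matches = [today - timedelta(days=30 * i) for i in range(9)]
matches.append(today - timedelta(days=400))

print(on_leaderboard(matches, today))
```

A player who fails this check can still get a forecast, but it should be read with the caution described above.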

Outside the model

For extreme gaps between my forecasts and betting odds, limited data is usually the answer. You’ll often find smaller–but still puzzling–gaps, even between players with extensive track records.

It’s worth considering what’s “in” the model. Elo looks at match results–period. Has the player won or lost lately, and against whom? My single substantial tweak to that is an injury/absence penalty, so if someone misses a lot of time (minimum eight weeks during the season), they get docked. The assumption is that they’ll come back rusty or still physically compromised. The size of the penalty is based on player results after past absences. Though of course not all absences are alike, and players differ in how they handle them.
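In rough terms, the absence tweak works like the sketch below. Only the eight-week trigger comes from the description above; the penalty size is a caller-supplied placeholder, the dates are made up, and this simplified version does not exclude off-season weeks the way an "during the season" rule would.

```python
from datetime import date

MIN_ABSENCE_DAYS = 8 * 7  # eight-week minimum before any penalty applies


def rating_after_absence(
    rating: float,
    last_match: date,
    next_match: date,
    penalty_points: float,  # hypothetical size; the real value is not given here
) -> float:
    """Dock the rating if the gap between matches reaches the minimum absence."""
    gap_days = (next_match - last_match).days
    if gap_days >= MIN_ABSENCE_DAYS:
        return rating - penalty_points
    return rating


# Hypothetical 1800-rated player returning after roughly three months away.
long_layoff = rating_after_absence(1800.0, date(2025, 2, 9), date(2025, 5, 5), 100.0)

# The same player after a routine two-week break: no penalty.
short_break = rating_after_absence(1800.0, date(2025, 5, 5), date(2025, 5, 20), 100.0)

print(long_layoff, short_break)
```

A flat penalty like this treats every layoff the same, which is exactly why the adjusted rating can miss in either direction.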

For players who haven’t missed time lately, any “news” isn’t going to show up in the forecast. If word leaks out that Alcaraz is dealing with a bum ankle, betting markets will adjust, but my forecast will not.

If players have missed enough time to trigger the penalty, their Elo ratings are less reliable until they’ve gotten several matches under their belt. When Sinner came back from his doping ban, he was surely in better shape than the typical guy who had just sat out three months. Same story with Djokovic’s layoffs-by-choice this season. On the flip side, a player who comes back too soon, perhaps treating a 250 as a mere trial run, is less likely to win than his adjusted Elo rating suggests.

Surfaces

Another major factor outside my Elo model is the specifics of surface. Not all hard courts are created equal, and I don’t even differentiate between indoor and outdoor. (I know, another thing you want me to do.) My forecasts probably underestimated Sinner’s chances of breezing through the last several weeks of the season because they did not recognize that indoor Sinner is reliably better than outdoor-hard-court Sinner.

Even among outdoor courts, speed varies enormously. Some players are considerably better on faster or slower courts, even if they are the same type. When Rafael Nadal was winning 1.2 million consecutive matches at Roland Garros, my model always considered him the favorite, but not by the overwhelming margin that bettors (rightly) did. Part of the reason was that the Paris clay is reliably slow, while Nadal was more vulnerable at, say, Madrid. So my Elo ratings, tossing all of his clay results into the same bucket, saw Rafa as (barely) beatable, even though it took an act of God to dislodge him at the French.

In short, if a player is particularly well-suited to the conditions at a certain tournament, Elo isn’t going to pick that up. He’ll be underrated in the forecasts. The degree depends on the player, and on just how well-suited he is.

There can also be an issue with limited surface data, most often–but not always–during grass season. Young players on the rise might show up at Wimbledon qualies without ever having played a professional match on grass. Their overall (surface-agnostic) rating will tell us something about their level, and my model makes an adjustment for their grass inexperience. But that sort of player could be anything from a grass natural to a hopeless case. Some of that might be predictable to a savvy fan, but Elo doesn’t have a clue.

Financial advice

If you’re using my forecasts as betting advice, stop doing that. C’mon, man.

If you check my forecasts for entertainment purposes, it’s good to know exactly what you’re looking at, and what the numbers are based on. Hope this helps!

4 thoughts on “Why Do My Forecasts (Sometimes) Sharply Differ From Betting Markets?”

  1. Hey,
    I can’t find your email anywhere, so I’ll just ask here in the comments. I was wondering if there is any access to past ELO rankings at the time of a particular date in the past? I have taken to capturing the new ranking whenever they update. So I have them to do research if needed. Let me add my appreciation and thanks for tennis abstract. Happy Holidays

  2. Hey,
    thanks a lot for another very interesting blog post! Love to read those posts.
    I have to admit I am using your ELO ratings (at least partly) to make my bet decisions.
    May I ask if you are betting in general on tennis matches as well? Your knowledge of the sport and obviously of data work as well is extensive.
    Thanks in advance!
    Flo

