{"id":603,"date":"2011-12-27T11:26:26","date_gmt":"2011-12-27T16:26:26","guid":{"rendered":"http:\/\/heavytopspin.com\/?p=603"},"modified":"2011-12-27T11:26:26","modified_gmt":"2011-12-27T16:26:26","slug":"grand-slam-forecasting-for-dummies","status":"publish","type":"post","link":"https:\/\/www.tennisabstract.com\/blog\/2011\/12\/27\/grand-slam-forecasting-for-dummies\/","title":{"rendered":"Grand Slam Forecasting for Dummies"},"content":{"rendered":"<p>It&#8217;s one thing to predict a winner&#8211;it&#8217;s another thing to quantify how likely a player is to become that winner.<\/p>\n<p>In most tennis tournaments, it&#8217;s not hard to pick a favorite. \u00a0For most of the last year, it was Novak Djokovic, no matter the surface or who he might face. \u00a0Before that, it was Federer on hard courts, Nadal on clay courts. \u00a0While everyone likes to identify a dark horse, there&#8217;s rarely much debate at the top.<\/p>\n<p>Given that agreement, though, what odds would you have placed on Novak Djokovic winning Wimbledon? \u00a0Or the French? \u00a0Or an in-form Federer winning the tour finals over an injured Djokovic and a tired Nadal? \u00a0Usually, my numbers spit out something between 20 and 30 percent&#8211;in theory, even the best player in the tournament has a better than two-thirds chance of going home a loser.<\/p>\n<p>Intuitively, this is difficult to believe. \u00a0Djokovic seemed so dominant for much of the year that his slam victories felt like foregone conclusions. \u00a0Anyone who watched Novak on a good day found it impossible to imagine anyone outplaying him. \u00a0When Carl Bialik wrote a column asking whether Djokovic could keep up his dominance for the entire season, most responses were some variation of <a href=\"http:\/\/blogs.wsj.com\/dailyfix\/2011\/08\/16\/can-djokovic-craft-best-season-ever\/tab\/comments\/\">&#8220;What are you, stupid? 
Numbers are irrelevant when someone is so good.&#8221;<\/a><\/p>\n<p>But all good things must come to an end, and a combination of injuries and good opponents proved that even Djokovic is human.<\/p>\n<p>That said, Djokovic&#8217;s dominance&#8211;and Nadal&#8217;s before him, and Federer&#8217;s before him&#8211;raises questions about forecasting tennis matches. \u00a0 The questions are complicated, but rest easy: today&#8217;s attempt at an answer will be simple.<\/p>\n<p><strong>Do the rules apply to the very best?<\/strong><\/p>\n<p>My <a href=\"http:\/\/tennisabstract.com\/blog\/2011\/03\/11\/indian-wells-projections\/\">ranking and forecasting system<\/a> starts by assigning a number to every player, not unlike ATP ranking points. \u00a0To keep things simple, let&#8217;s use ranking points. \u00a0If we want to predict the outcome of, say, Mardy Fish against Feliciano Lopez, we take their point totals (2965 and 1755) and divide one by the sum of the two: 2965\/(2965+1755) = 62.8%. \u00a0(It&#8217;s a little more complicated than that, but not much.) \u00a0Setting aside concerns like home court advantage and surface, that sounds about right to me.<\/p>\n<p>Do the same with Djokovic and Lopez, and you get 88.6%. \u00a0Work the numbers with Djokovic and world #100 Michael Berrer, and you get 96.0%. \u00a0That&#8217;s pretty dominant, suggesting that Berrer would win only 1 in 25 matchups, but wait a minute&#8211;we&#8217;re saying Berrer&#8217;s going to beat Djokovic, <em>ever<\/em>?<\/p>\n<p>And therein lies the problem. \u00a0The formulas I use to generate points and generate predictions are reasonably accurate, tested against years of ATP results. \u00a0And in the aggregate, individual match percentages pass the smell test. \u00a0But at the extremes, the numbers seem questionable.<\/p>\n<p>And it is at the extremes where the exact percentages matter the most. 
\u00a0Consider <a href=\"http:\/\/tennisabstract.com\/blog\/2011\/06\/19\/wimbledon-mens-draw-predictions\/\">my pre-tournament predictions for Wimbledon this year<\/a>. \u00a0While Nadal was the top seed, I picked Djokovic as the favorite, giving him a 21.6% chance of winning. \u00a0But look at those first few rounds: I gave him only an 87% chance of getting past Jeremy Chardy (<em>Jeremy Chardy!<\/em>)\u00a0in the first round, then only an 88% chance of beating Kevin Anderson or Ilya Marchenko, then only an 85% chance of winning against (probably) Marcos Baghdatis.<\/p>\n<p>Only the last of those three numbers is plausible. \u00a0And when combined, they meant that I gave Djokovic less than a 65% chance of reaching the round of 16. \u00a0With all due respect to myself, that was almost as ridiculous then as it sounds now.<\/p>\n<p>It&#8217;s those early-round numbers that result in such minute chances that the favorite will win the tournament. \u00a0Even if we give a player a 90% chance of winning <em>all<\/em>\u00a0his matches, he&#8217;ll still only win the seven consecutive matches required for a grand slam 48% of the time. \u00a0Lower it to 80%, and we&#8217;re down to 21% for the tournament. \u00a0Since the odds of winning a semifinal match against the likes of Murray, Federer, or Nadal are probably much lower, it seems that early round odds should be much more favorable.<\/p>\n<p>To summarize, one of two things is going on here. \u00a0Either (1) my numbers underestimate the likelihood that the pre-tournament favorite wins a grand slam; or (2) our intuition overestimates the likelihood that the favorite takes home the trophy.<\/p>\n<p><strong>Forecasting for dummies<\/strong><\/p>\n<p>One way to pick between the two is to look at the recent past. 
\u00a0Are pre-tournament favorites winning more or less than expected?<\/p>\n<p>For now, let&#8217;s set aside the question of the likelihood that Djokovic beats Chardy or Marchenko, and look only at winning the tournament. \u00a0We&#8217;re going to make two major assumptions here: (1) it&#8217;s possible to identify the pre-tournament favorite years later, and (2) favorites are generally created equal&#8211;Djokovic towers over his competitors to the same degree that Courier, or Lendl, or Sampras, or Federer towered over his. \u00a0As usual, both of these assumptions probably aren&#8217;t true, but they aren&#8217;t so hideously wrong that they&#8217;ll stop us from reaching some worthwhile conclusions.<\/p>\n<p>There are three easy ways of picking the pre-tournament favorite for a grand slam: using (a) the winner of the last slam; (b) the defending champion; and (c) the top seed&#8211;almost always the world #1. \u00a0The top seed is probably best, while the defending champion might identify a player who is particularly good on the surface, and the winner of the last slam might pick out someone who is riding a hot streak.<\/p>\n<p>The last 21 years (back to 1991, inclusive) give us 84 slams to work with. \u00a0Our sample is a bit smaller than that, because occasionally the winner of the last slam or the defending champion did not play, and on three occasions, the top seed pulled out before the tournament began. \u00a0Here is how the favorites did:<\/p>\n<ul>\n<li>Of the 75 players who had won the previous slam, 18 (24%) won the tournament.<\/li>\n<li>Of the 76 defending champions, 26 (34%) won the tournament.<\/li>\n<li>Of the 81 top seeds, 29 (36%) won the tournament. \u00a0If we exclude the French (where the top seed is often #1 on the basis of hard court performance), we get a more dramatic result here&#8211;26 of 60 (43.3%) won the tournament.<\/li>\n<\/ul>\n<p>All of these measures are much higher than the 21.6% shot I gave Djokovic at Wimbledon. 
\u00a0And most are higher than the 27-28% chances I gave him at the <a href=\"http:\/\/tennisabstract.com\/blog\/2011\/05\/20\/french-open-predictions\/\">French<\/a> and <a href=\"http:\/\/tennisabstract.com\/blog\/2011\/08\/27\/us-open-mens-draw-predictions\/\">US Open<\/a>. \u00a0The 43.3% likelihood that the top seed wins a non-clay slam (thank you, Pete and Roger!) suggests that a more sophisticated method of identifying the favorite might allow us to predict slam champions with, say, 40% accuracy.<\/p>\n<p>40% is considerably higher than my models are spitting out right now, but I suspect it is much lower than many fans imagine for their favorite. \u00a0It suggests that, at the extremes, my predictions aren&#8217;t quite one-sided enough. \u00a0It might take Michael Berrer more than 25 chances before he finally catches Djokovic on a bad day.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s one thing to predict a winner&#8211;it&#8217;s another thing to quantify how likely a player is to become that winner. In most tennis tournaments, it&#8217;s not hard to pick a favorite. \u00a0For most of the last year, it was Novak Djokovic, no matter the surface or who he might face. 
\u00a0Before that, it was Federer &hellip; <a href=\"https:\/\/www.tennisabstract.com\/blog\/2011\/12\/27\/grand-slam-forecasting-for-dummies\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Grand Slam Forecasting for Dummies<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[40,96],"tags":[],"class_list":["post-603","post","type-post","status-publish","format-standard","hentry","category-forecasting","category-research"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/posts\/603","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/comments?post=603"}],"version-history":[{"count":0,"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/posts\/603\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/media?parent=603"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/categories?post=603"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tennisabstract.com\/blog\/wp-json\/wp\/v2\/tags?post=603"}],"curies":[{"name":"wp","href":"https:\
/\/api.w.org\/{rel}","templated":true}]}}