The Power of One Point Per Thousand

Last week, I offered a method to rank smash-hitting skill. I measured the results in “points per 100”–the number of points a player could expect to gain or lose, relative to tour average, thanks to their ability hitting that one shot. The resulting figures were quite small: My calculations showed that Jo-Wilfried Tsonga has the game’s best smash, a shot worth 0.17 points per 100 above average, and 0.27 points per 100 above the weakest smash-hitting player I found, Pablo Cuevas.

That gap between best and worst of 0.27 per 100 gives us a rough maximum of how much difference a good or bad smash can make in a player’s game. The rate is roughly equivalent to one point out of 370. It sounds tiny, and since most players are closer to the average than they are to either of those extremes, the typical smash effect is smaller still.

However, it’s difficult to have any intuitive sense of how much one point is worth. In any given match, a single point, or even five points, isn’t going to make the difference. On the other hand, plenty of matches are so close that one or two points would flip the result. If an average player could train really hard in the offseason and develop a smash just as good as Tsonga’s, what would that extra 0.17 points per 100 mean for him in the win column? What about in the rankings?

This is a relatively straightforward question to answer once we’ve posed it. Over the course of a season, the best players win more points than their peers–obviously. Yet the margin isn’t that great. In 2017, no man won points at a higher clip than Rafael Nadal, who came out on top 55.7% of the time. That’s less than seven percentage points higher than the worst player in the top 50, Paolo Lorenzi, who won 49.1% of points. Nearly half of top 50 players–22 of them–won between 49.0% and 51.0% of total points, and another 15 fell between 51.0% and 52.0%.

Fixing total points won

These numbers are slightly misleading, though only slightly. The total points won stat (TPW) tends to cluster very close to the 50% mark because competitors face what, in other sports, we would call unbalanced schedules. If you win, you usually have to play someone better in the next round; win again, and an even stronger opponent awaits. This means that the 6.6-percentage-point gap between Nadal and Lorenzi is a bit wider than it sounds: Had the Italian faced the same set of opponents that Rafa did, he wouldn’t have managed to win 49.1% of points.

That problem, however, is possible to resolve. Earlier this year I shared an algorithm that analyzed return points won by controlling for opponent, by comparing how each pair of players fared in equivalent matchups. (That analysis hinted at the second-half breakthrough of return wizard Diego Schwartzman.) While we don’t know exactly what would happen if Lorenzi played Nadal’s exact schedule, we can use this common-opponent approach to approximate it. When we do so, we find that the 1st-to-50th, Nadal-to-Lorenzi spread is almost 10 percentage points; setting Rafa’s rate at a constant 55.7%, Lorenzi’s works out to a less neutral-sounding 46.2%. Many players remain packed in the 49%-to-51% range, but the overall spread is wider, because we control for tennis’s natural tendency to cancel out players’ wins with subsequent losses.
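To make the common-opponent idea concrete, here is a minimal sketch. It is not the algorithm referenced above, whose details aren't spelled out here; it only compares two players' point-winning rates against the opponents they share. The function name, the data layout, and the toy numbers are all assumptions for illustration.

```python
def common_opponent_gap(results_a, results_b):
    """Average difference in point-win rate (A minus B) against shared opponents.

    Each results dict maps an opponent name to (points_won, points_played).
    Opponents faced by only one of the two players are ignored.
    """
    shared = sorted(set(results_a) & set(results_b))
    if not shared:
        raise ValueError("no common opponents")
    gaps = []
    for opponent in shared:
        won_a, played_a = results_a[opponent]
        won_b, played_b = results_b[opponent]
        gaps.append(won_a / played_a - won_b / played_b)
    return sum(gaps) / len(gaps)


# Hypothetical toy numbers, not real match data.
player_a = {"Opp1": (60, 100), "Opp2": (110, 200), "Opp3": (52, 100)}
player_b = {"Opp1": (50, 100), "Opp2": (90, 200), "Opp4": (48, 100)}

gap = common_opponent_gap(player_a, player_b)
print(f"{gap:.3f}")  # 0.100, i.e. A wins ten more points per 100 than B
```

Pinning one player's rate (Rafa's 55.7%, say) and subtracting gaps like this one is the spirit of the adjustment, though a real version would need to weight matchups and handle pairs with few shared opponents.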

Even when we widen the pool of players to 71–everyone who played at least 35 tour-level matches this season–the ten-percentage-point spread remains. Lorenzi remains close to the bottom, a few places above Mikhail Youzhny, whose competition-adjusted rate of points won, 45.7%, ranks last, exactly ten points below Rafa.

Think about what that means: In a typical ATP match, for every hundred points played, only ten are really up for grabs. That isn’t literally true, of course: There are plenty of matches in which one player wins 60% or more of total points. But on average, you can expect even the weakest tour regular to win 45 out of 100 points. In team sports analytics, this is what we might call “replacement level”–the skill level of a freely available minor leaguer or bench player. I don’t like importing the concept of replacement level for tennis, because in an individual sport you’re never really replacing one player with another. But at the most general level, it’s a useful way of thinking about this subject–just as even a minor league batter could hit .230 in the major leagues (as opposed to .000), so a fringey ATP player will win 45% of points, not 0%.

Points to wins

In team sports analytics, it’s common to say that some number of runs, or goals, or points is equal to one win. Thinking in terms of wins is a good way to value players: If you can say that upgrading your goalkeeper is worth two wins over your current option, it makes very clear what he brings to the table. Again, the metaphor is a bit strained when we apply it to tennis, but we can start thinking about things in the same way.

Another oddity in tennis is that players not only face very unequal competition, they also play widely different numbers of matches. The year-end top 50 contested anywhere from 35 matches up to more than 80; part of the variation is due to injury, but much is structural: The more matches you win, the more you play. Rafa managed his schedule by entering only a handful of optional events, yet only David Goffin played more matches. So we have another quirk to handle: In this case, let’s adopt the fiction that a tennis season is exactly 50 matches long. Rafa’s actual record was 67-11; scaled to a 50-match season, that’s roughly 43-7.
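The rescaling is simple enough to sketch in a few lines. The function name is my own, and rounding wins to the nearest whole match is an assumption; the 67-11 record is the one quoted above.

```python
def scale_record(wins, losses, season_length=50):
    """Rescale a win-loss record to a fixed-length season.

    Wins are rounded to the nearest whole match; losses fill out the season.
    """
    matches = wins + losses
    scaled_wins = round(wins / matches * season_length)
    return scaled_wins, season_length - scaled_wins


print(scale_record(67, 11))  # (43, 7)
```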

Finally, we can look at the relationship between points and wins. Points, here, means the rate of total points won adjusted for competition. And wins is the number of victories in our hypothetical 50-match season. The relationship between points and wins is quite strong (r^2 = 0.75), though of course not exact. Roger Federer won matches at a higher rate than Nadal did, but by competition-adjusted total points won, Rafa trounced him, 55.7% to 53.5%. And as we’ve seen, Lorenzi is close to the bottom of our 71-player sample, despite hanging on to a ranking in the mid-40s. Luck, clutch play, and a host of other factors make the points-to-wins relationship imperfect, but it is nonetheless a healthy one.

It doesn’t take many points to boost one’s win total. An increase of only 0.367 points per 100 translates into one more win in a 50-match season. The average player contests 8,000 points per season, so we’re talking about only 29 more points per year. This puts my smash-skill conclusions in a new light: The spread between the best and the worst of 0.27 points per 100 seemed tiny, but now we see it’s worth almost a full win over the course of a 50-match season.
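The arithmetic behind those figures, as a quick sketch. Both constants come straight from the paragraph above; the function names are just labels I've chosen.

```python
POINTS_PER_100_PER_WIN = 0.367  # extra points per 100 worth one win in 50 matches
POINTS_PER_SEASON = 8000        # points contested by the average player


def extra_points_for_one_win(points_per_season=POINTS_PER_SEASON):
    """Extra points per season needed to add one win."""
    return POINTS_PER_100_PER_WIN / 100 * points_per_season


def wins_from_edge(edge_per_100):
    """Wins per 50-match season gained from an edge in points per 100."""
    return edge_per_100 / POINTS_PER_100_PER_WIN


print(round(extra_points_for_one_win()))  # 29
print(round(wins_from_edge(0.27), 2))     # 0.74, the best-to-worst smash gap
print(round(wins_from_edge(0.17), 2))     # 0.46, Tsonga's smash versus average
```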

Wins to ranking places

Unless you’re nearing a round number and have a hankering for cake, even wins aren’t the currency that really matters in tennis. What counts is position on the ranking table. The relationship between wins and ranking position is another strong but imperfect one (r^2 = 0.63).

As we’ve seen, the middle of the ATP pack is tightly grouped together in total points won, with so many players hovering around the 50% mark, even when adjusted for competition. There’s not much to distinguish between these men in the win column, either: On average, an increase of 0.26 wins per 50 matches translates into a one-spot jump on the ranking computer. Put another way: If you win one more match, your ranking will improve by four places. Again, these are not iron laws–in reality, it depends when and where that extra win occurs, and the corresponding ranking improvement could be anywhere from zero spots to 30. Still, knowing the typical result allows us to understand better the impact of each marginal win and, by extension, the value of winning a few more points.

One point per thousand

Combine these two relationships, and we get a new, conveniently round-numbered rule of thumb. If an increase in one ranking place requires 0.26 additional wins per 50 matches, and one additional win requires 0.367 extra points per 100, a little tapping at the calculator demonstrates that one ranking place is equal to about 0.095 points per 100. Round up a bit to 0.1 per 100, and we’re looking at one point per thousand.
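For anyone who wants to skip the calculator, the chain of conversions looks like this. The two constants are the ones quoted above; treating both relationships as linear and simply multiplying them is the same simplification the rule of thumb makes.

```python
WINS_PER_RANKING_PLACE = 0.26   # wins per 50 matches for a one-spot ranking jump
POINTS_PER_100_PER_WIN = 0.367  # extra points per 100 worth one win


def points_per_100_per_place():
    """Extra points per 100 needed to climb one ranking place."""
    return WINS_PER_RANKING_PLACE * POINTS_PER_100_PER_WIN


def ranking_places_from_edge(edge_per_100):
    """Ranking places gained from an edge expressed in points per 100."""
    return edge_per_100 / points_per_100_per_place()


print(round(points_per_100_per_place(), 3))      # 0.095
print(round(ranking_places_from_edge(0.27), 1))  # 2.8, the smash-skill spread
```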

One extra point per thousand is a minuscule amount, the sort of difference we could never dream of spotting with the naked eye. Players regularly win entire tournaments without contesting so many points; even for Goffin, who served or returned more than 12,000 times this year, we’re talking about a dozen points. Yet think back to all of those players clustered between 49% and 52% of total points won; even when adjusted for competition, three men ended the 2017 season tied at exactly 50.4%, with less than one point per thousand separating the three of them.

The one part of the ranking table where one point per thousand is no more than a rounding error is the very top. Usually one player separates himself from the pack, and the top few distance themselves from the rest. This year is no different: The competition-adjusted gap between Nadal and Federer is a whopping 2.2% (22 points per thousand), while the next 2.2% takes us all the way from Fed through the entire top 10. The 2.2% after that, extending from 51.3% to 49.1%, covers another 20 players: spaced, on average, one point per thousand apart. For a player seeking to improve from 30th to 20th, the path is largely linear; from 5th to 3rd it is much less predictable–and probably steeper.

If this all sounds unnecessarily abstruse, I can only mention once again the example of my smash-skill findings. Now we know that the range of overhead-hitting ability among the game’s regulars is worth close to three places in the rankings. Imagine a similar type of conclusion for forehands, backhands, net approaches… it’s exciting stuff. While plenty of work lies ahead, this framework allows us to measure the impact of individual shots–perhaps even tactics–and translate that impact into ranking places, the ultimate currency of tennis.

Overperforming in Davis Cup

This is a guest post by Peter Wetz.

With the help of weighted surface specific Elo ratings we have a powerful new tool to measure player performance. The traditional conclusion of the tennis season, the Davis Cup final, provides us with an opportunity once again to examine which players thrive when competing for their nation and which players seem to suffer from the pressure. While we are at it, I don’t like the sound of the word offseason. After all, there are still ITF tournaments, not to mention the Australian Open Asia-Pacific Wildcard Play-offs.

As already hinted, Elo ratings have proven to represent a better picture of player quality than traditional ATP rankings. Hence, comparing expected wins based on Elo with actual wins provides us with a clearer picture of who outperforms expectations and who does not.

In this evaluation, I consider completed World Group and Group 1 Davis Cup live rubbers played since 1980. The data set contains around 5000 matches through this year’s World Group Quarterfinals, and I’ve limited my focus to players having played 15 or more matches.

Let’s first take a glance at the obvious stat, win-loss percentage. The following table shows the top ten win-loss records of all players under consideration. (The Active column indicates whether the player is still active.)

Name	        W	L	Perc	Active
Rafael Nadal	20	1	95%	1
Boris Becker	31	2	94%	0
Andy Murray	25	3	90%	1
Balazs Taroczy	23	3	89%	0
David Ferrer	20	3	87%	1
Andre Agassi	23	4	85%	0
Roger Federer	40	7	85%	1
Novak Djokovic	27	5	84%	1
Guillermo Vilas	16	3	84%	0
Andrei Medvedev	16	3	84%	0

As one would expect, the big four and other all time greats are included. However, this obviously does not tell the whole story. Rafael Nadal is expected to win most of the time and that is what he does. For such a player, it is hard to outperform expectations.

If we compute how much a player outperforms his expectations, we get a clearer picture of who does especially well in Davis Cup. Expected wins are calculated from a half-and-half mix of surface-specific Elo and overall Elo, which, as pointed out in a previous article, generally gives close to the best results.
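As a rough sketch of that calculation: the win probability below is the standard Elo expected-score formula, while blending the two ratings before computing the probability (rather than blending two probabilities) is an assumption, as are the function names and the toy ratings. The over/underperformance figure is taken as actual wins divided by expected wins, minus one.

```python
def elo_win_probability(rating, opponent_rating):
    """Standard Elo expected score for a player against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def blended_rating(overall, surface):
    """Half-and-half mix of overall and surface-specific Elo (assumed blend)."""
    return 0.5 * overall + 0.5 * surface


def overperformance(matches):
    """Return (actual wins, expected wins, relative over/underperformance).

    Each match is a tuple:
    (player_overall, player_surface, opp_overall, opp_surface, won)
    """
    expected = 0.0
    actual = 0
    for p_all, p_surf, o_all, o_surf, won in matches:
        expected += elo_win_probability(
            blended_rating(p_all, p_surf), blended_rating(o_all, o_surf)
        )
        actual += 1 if won else 0
    return actual, expected, actual / expected - 1.0


# Hypothetical ratings and results, for illustration only.
rubbers = [
    (1800, 1750, 1900, 1950, True),
    (1800, 1750, 1700, 1700, True),
    (1800, 1750, 1800, 1750, False),
]
wins, expected_wins, plus_minus = overperformance(rubbers)
print(f"W={wins} eW={expected_wins:.2f} +/-={plus_minus:+.0%}")
```

Because expected wins are a sum of probabilities, a heavy favorite has little room to beat his expectation, which is why the biggest overperformers tend to be players who score upsets.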

The tables below show the top and bottom five among all players (first table) and active players (second table) in terms of over- and underperforming expected wins. They show actual wins (W), expected wins (eW), the percentage of over- or underperformance (+/-), and whether a player is still active.

Name	W	eW	+/-	Active
Francisco Maciel	11	6	72%	0
Slobodan Zivojinovic	20	11	72%	0
Vasek Pospisil	9	5	71%	1
Adrian Ungur	6	3	56%	1
Mahesh Bhupathi	5	3	55%	0
...
Wally Masur	7	10	-31%	0
Sebastien Lareau	7	10	-31%	0
James Blake	7	10	-36%	0
Nicolas Kiefer	6	10	-40%	0
Aqeel Khan	2	4	-57%	0

Name	W	eW	+/-	Active
Vasek Pospisil	9	5	71%	1
Adrian Ungur	6	3	56%	1
Andrey Golubev	13	8	46%	1
Di Wu	14	9	45%	1
Steve Darcis	15	11	35%	1
...
Florian Mayer	7	8	-14%	1
Gilles Muller	9	10	-15%	1
Alejandro Falla	8	9	-17%	1
John Isner	9	11	-19%	1
Jurgen Melzer	20	25	-22%	1

The tables seem to overlap with some conventional wisdom floating through the tennis sphere, namely that Steve Darcis, despite his recent losses in the Davis Cup final, plays above expectations. Also, Jurgen Melzer is known for regularly disappointing Austrian Davis Cup fans. (In his defense, he created several moments of joy, too.)

If we were to pick a Davis Cup hero for the active and inactive group of players, Slobodan Zivojinovic and Andrey Golubev seem to be good choices. Golubev has a record of 13-6 (68%) and outperforms expected wins by 46%. He provides a good combination of consistently beating players he should beat and scoring more than his share of exceptional upsets (Wawrinka 2014, Goffin 2014, Melzer 2013 and Berdych 2011).

Zivojinovic shows a similar pattern with a record of 20-8 (71%), 72% better than expected. He tallied six wins out of ten matches in which Elo assigned him a win probability of less than 25%. Further, he lost only one match when his pre-match odds of winning were greater than 35%.

This post provides insight into how Elo ratings help in quantifying a player’s performance. We identified players who have outperformed, or fallen short of, what the algorithm expected based on results from the regular tour. For future research it would be interesting to delve into Davis Cup doubles heroes: Where there are no dead rubbers, the stakes are always high.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.