Early Round Dominance and Women’s Semifinal Outcomes

Today’s women’s semifinals have at least one thing in common.  In both, one player has yet to drop a set at the US Open and has lost many fewer games than her opponent.

Does it matter?

The differences are particularly glaring in today’s second semifinal, between Serena Williams and Na Li.  Serena has not lost a set, and has dropped only 13 games.  Li has been pushed quite a bit further, losing 31 games in her first five matches.

Out of 714 Open-era Grand Slam semifinal matches, 30 have featured two players with such a wide gap.  To quantify it, we’ll note that Li has lost 2.38 times as many games as Serena has.  Of those 30 matches, the player who had displayed more dominance in the early rounds won 25.

Strangely, though, the connection has been much weaker in recent years.  Most of the winners of those 25 super-dominant semifinals were the usual suspects in WTA history: Margaret Court, Chris Evert, and Steffi Graf.  Only five of these lopsided pairings have taken place since 1994, and of those five, the less dominant player has won three.  The most recent example was Li’s semifinal in Australia.  She went into her match having lost 31 games, while her opponent, Maria Sharapova, had lost only 9.  Despite showing so much more weakness in the early rounds, Li won her semifinal 6-2 6-2.

In general, however, the more dominant the early rounds, the better chance a player has of reaching the final.  In the 349 Slam semifinals in which one player had lost fewer games in her first five rounds, that player advanced to the final 228 times (65.3%).  The same percentage applies to the player who lost fewer sets en route to her semifinal.

Despite her low ranking and her buzzsaw of an opponent, this bodes well for Flavia Pennetta, right?

Well, not exactly.  As hardly needs mention, there are other factors involved here.  A great player might have a sloppy early-round match or suffer an unlucky draw.  That doesn’t mean she’s any less great, or less likely to show her top form in the semis.  Victoria Azarenka has certainly had a more challenging tournament so far than Pennetta has, but it would be a mistake to read too much into that.

For the most part, early-round dominance and superior WTA rankings go hand in hand.  Of 228 semifinal matches where I have ranking data, just over half (117) were won by the player who had dropped fewer games–who just happened to be the player with the better ranking.  No surprises here–if someone is going to play like Serena has so far, she’s probably #1.

The remaining 111 matches are where things get interesting.  In 75 of them, one player had the higher ranking (like Azarenka) and the other had been more dominant in the early rounds (like Pennetta).  The results favor the higher-ranked player, but not as much as you might expect: 30 of those 75 (40%) went in favor of the lower-ranked player.
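To make the arithmetic in this breakdown explicit, here is a minimal Python sketch that uses only the counts quoted above.  The variable names are mine, and the sketch makes no claim about how the remaining matches outside the 75 split up, since the text doesn't say.

```python
# Counts quoted in the text; the variable names are illustrative only.
matches_with_rankings = 228
won_by_dominant_and_higher_ranked = 117
remaining = matches_with_rankings - won_by_dominant_and_higher_ranked

conflicting = 75        # ranking and early-round dominance favor different players
lower_ranked_wins = 30  # conflicts won by the lower-ranked, more dominant player

share_aligned = won_by_dominant_and_higher_ranked / matches_with_rankings
share_upsets = lower_ranked_wins / conflicting

print(f"remaining matches: {remaining}")
print(f"won by the dominant, higher-ranked player: {share_aligned:.1%}")
print(f"conflicts won by the lower-ranked player: {share_upsets:.1%}")
```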

Of course, most of those lower-ranked players aren’t quite the underdogs that Pennetta is.  As we saw yesterday, Flavia is one of the lowest ranked semifinalists in women’s Slam history.  Only two players outside of the top 32 have ever advanced to a final–Venus Williams at the 1997 US Open, and Serena at the 2007 Australian.  Whatever else you might say about the Italian, she’s not a Williams sister.

Using these two variables, though, it is Na Li who faces the tougher challenge today.  She’ll need to beat a higher-ranked player who has been untouchable through five rounds.  Keep the faith: That’s exactly what she did in Melbourne this year.

Wawrinka d. Murray: Recap and Detailed Stats

The narrative felt familiar.  A flashy player from the fringes of the top ten takes on an established top-five guy, a great defender who would be sure to outlast his opponent in the end.

Yesterday, it was Gasquet and Ferrer.  Today, Stanislas Wawrinka and Andy Murray.  Even after Wawrinka took the first set, the same talking points reappeared: Surely Wawrinka would press, or tire, or Murray would wake up and play better tennis.  Fortunately for Stan, he didn’t have to fight off as spirited a comeback as Gasquet did; he simply kept employing the same successful strategies while Murray, passive and error-ridden, let him run away with the match.

While Murray’s impotence will be the story of this match–he hit only 15 winners in the entire match, and that includes six aces–much must be said about Wawrinka’s game plan.

The Swiss is known for his backhand, but unlike Gasquet, he doesn’t unduly favor it.  Roughly 40% of his groundstrokes are backhands (including slices), meaning he is willing to move around it and attack with the forehand.  The Wawrinka forehand is a weapon that is known to break down, but when it’s working, it can be just as deadly as the backhand.  It didn’t falter today: Stan earned 27 winners and induced five additional forced errors with shots from that side.

But the forehand was only a complementary part of the attack.  What continued to surprise throughout the match was Wawrinka’s willingness–sometimes over-eagerness–to come to net.  His transition game is a little awkward, and many of his errors came from failed approach shots, but by continually putting more pressure on Murray, he closed out points when Andy would’ve been content to let them go on for ten more shots.

Another underrated part of Wawrinka’s game is the serve.  While Stan will never post eye-popping ace numbers, it’s an effective shot that sets up the rest of his game well.  Today, he only tallied four aces and one unreturnable, but of 76 total serve points, Wawrinka won 29 of them with or before his second shot.  That isn’t as foolproof as an Isner-like ace tally, but the end result is the same.

And sure enough, it prevented Murray from even sniffing an opportunity.  Murray didn’t earn a single break point in the match, the first time he has failed to generate one since his loss to Roger Federer in the 2010 World Tour Finals.

Wawrinka, on the other hand, pushed Murray to 30-30 in almost every one of his service games, and after suffering through a marathon game at the end of the first set, in which he needed seven opportunities to seal the break and the set, he didn’t waste nearly so much time again.  The Swiss converted three of five break point opportunities after that first set.

It was a bad day for Murray, that’s for sure.  It represented a step back to before his days as an Olympic and Grand Slam champion, and it may be a tough one to bounce back from.  Wawrinka, on the other hand, forces us to consider him as one of the “next four,” perhaps the Swiss #1 sooner rather than later.  He won’t always beat Murray with today’s game plan, but he’ll do more damage against higher-ranked players.

In Saturday’s semifinal against Djokovic? That’ll be a big ask, even playing the way he did today.  Novak has reeled off eleven victories in a row in their head-to-head, though their last match was the marathon fourth-rounder in Australia, when Stan pushed him to 12-10 in the fifth set.  The semi won’t have the star power it would’ve with Murray, but we can expect some great tennis.

Here are my detailed serve, return, and shot-type stats for today’s match.

Stubborn Richard and Fighting Flavia

We all know how great Richard Gasquet’s backhand is.  It’s arguably the best one-hander in the game, and the down-the-line version is right up there with any other men’s backhand weapon, one- or two-handed.

What has struck me in his last two matches is that, unlike virtually every other top player, he never runs around it.  Even Stanislas Wawrinka, another man with a claim on the “best one-hander” title, will frequently take several steps to get in position to hit a forehand from the backhand corner.

Gasquet doesn’t do that.  In 277 points yesterday, he ran all the way around a backhand once, and there were two or three other shots when he took a couple of steps to hit a forehand when he might have taken one to hit a backhand.  In other words, he’s totally comfortable hitting his backhand from anywhere on the court, against any spin, at any height, and he trusts it as his go-to offensive shot.

In my detailed stats tables, I added a chart last night showing shot types–how many each player hit, grouped into various categories.  Against David Ferrer, Gasquet hit 296 backhands (excluding slices) to 222 forehands, a ratio of 1.33.  Ferrer hit 274 to 297, a 0.923 ratio.  Ferrer is more typical.  He can hit solid crosscourt backhands all day long–even crush a down-the-line winner on occasion–but given the opportunity, he’ll move around it and hit a more powerful inside-out forehand.

Of the last five men’s matches I’ve charted, Gasquet’s backhand preference stands out.  Marcos Baghdatis vs Kevin Anderson: 0.58 for Baghdatis, 0.36 for Anderson.  Lleyton Hewitt vs Brian Baker?  0.72 for Hewitt, 0.86 for Baker. Tomas Berdych, 0.65, against Julien Benneteau, 0.73.  Against Denis Istomin, Andy Murray’s ratio was 0.56.  Only Istomin is anywhere near Gasquet’s category, with a ratio of 1.15, and that may be more a testament to Murray’s ability to find his opponent’s backhand than anything else.
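For anyone who wants to reproduce these ratios from their own charting, the calculation is just backhands (excluding slices) divided by forehands.  Here is a minimal Python sketch using the Gasquet-Ferrer counts quoted above; the function name and the dictionary layout are my own illustration, not part of any charting tool.

```python
def backhand_forehand_ratio(backhands: int, forehands: int) -> float:
    """Backhands (excluding slices) divided by forehands."""
    if forehands <= 0:
        raise ValueError("forehands must be positive")
    return backhands / forehands


# (backhands, forehands) as quoted for the Gasquet-Ferrer match.
shot_counts = {
    "Gasquet": (296, 222),
    "Ferrer": (274, 297),
}

ratios = {
    name: backhand_forehand_ratio(backhands, forehands)
    for name, (backhands, forehands) in shot_counts.items()
}

for name, ratio in ratios.items():
    # A ratio above 1.0 means more backhands than forehands.
    print(f"{name}: {ratio:.3f}")
```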

For all the beauty of Gasquet’s backhand, much of the time it is a simple rallying shot.  Move him deep into that corner, and he generally won’t hurt you. I’m not convinced all those backhands make up a wise tactical decision–perhaps more inside-out forehands would be in order.  Certainly, he’ll need to come up with something out of the ordinary when he faces Rafael Nadal on Saturday.

From the day the draw was announced, Flavia Pennetta’s quarter was considered the wide-open section of the field.  Except, until yesterday, nobody thought of it as Pennetta’s quarter.  Technically it was fourth-seed Sara Errani’s to lose, which she promptly did, to Pennetta in the second round.  It was also considered fair game for Caroline Wozniacki … who lost in the third round.  Then it was the domain of rising star Simona Halep … another Pennetta victim.

Surely Flavia’s run ends tomorrow at the hands of Victoria Azarenka.  In the meantime, let’s take a moment to celebrate a few amazing aspects of her accomplishment thus far.

Ranked 83rd–and ranked outside of the top 100 only six weeks ago–she needed a late injury withdrawal to get into the main draw.  Now, she is only the 10th woman in the Open era to reach a Grand Slam semifinal while ranked outside of the top 80.  Just one previous US Open semifinalist–Angelique Kerber two years ago–was ranked so low.

Another remarkable aspect of Pennetta’s run is that she has reached her first Slam semifinal at the age of 31.  Only three women–Gigi Fernandez, Nathalie Tauziat, and Wendy Turnbull–reached their first Slam semi after turning 30.  (Fernandez did it while ranked outside the top 80, making her the proto-Flavia.)  Turnbull is the only first-time semifinalist to have done so while older than Pennetta is now, by a couple of months.  She accomplished that feat at the 1984 US Open.  Amazingly, it wasn’t Turnbull’s only moment in the spotlight–she reached the semis of the Australian a few months later, beating a young Steffi Graf along the way.  She even reached the quarters at the following year’s US Open.

Finally, we may marvel at the fact that Pennetta, once a top-ten player, did not reach a semifinal until this, her 41st slam.  That is near the top of the all-time leaderboard, but not a record.  Francesca Schiavone had played 41 slams before reaching her first semi in the French Open a few years ago.  Tauziat makes another appearance here; she needed 44 tries before winning five straight matches.  The most dogged of all WTA players must be Elena Likhovtseva, who played 56 career Slams, not making it to the semifinal until her 46th try.

Most of these precedents jibe with our intuition that, no matter how hot she is, Flavia doesn’t stand much of a chance against Vika.  But a couple of these cases–Schiavone with her two deep French Open runs, and Turnbull with her pair of late-career semifinals–suggest that this could be more than a one-off for the Italian.

Rafael Nadal has yet to lose serve at the US Open, and has a string of 82 consecutive service holds going back to Cincinnati.  I plan to have more on this before his semifinal match.

Here’s a win-probability graph for yesterday’s Gasquet-Ferrer five-setter. And if you somehow missed it the last five times I linked to it, here are my detailed stats from that match.

I’ll chart one of the two men’s quarters today, though I’m not yet sure which one.  Keep an eye on my Twitter account, as I’ll post those stats after each set.

And last for today, here’s an example of thorough data collection that tennis organizations will almost certainly fail to follow.

Gasquet d. Ferrer: Recap and Detailed Stats

The knock on Richard Gasquet has long been his inability to play the big matches, to overcome higher-ranked opponents, even when he has the weapons to defeat them.  David Ferrer is the sort of guy who eats such players for lunch.  One might figure the Frenchman would win a set, but not that he would find his way into the semifinals of a Grand Slam.

For two sets today, Gasquet played as well as I’ve ever seen him play. He combined patience with his devastating down-the-line backhand, waiting eight, ten, or more shots before the opportunities arose to unleash the monster.

What’s remarkable is that most conventional stats don’t bear this out. He barely got half of his first serves in. He hit a mere seven winners in the first set, against a dozen unforced errors. But he coaxed plenty of mistakes out of an opponent who doesn’t often make many.

Gasquet was able to race to his two-set lead in large part because Ferrer wasn’t playing his best tennis. The tactics looked familiar, but Ferru wasn’t quite as aggressive as usual, letting Gasquet earn those opportunities to strike. Ferrer hit only three winners in the entire second set.

The next two sets fulfilled everyone’s expectations. Despite his five-set triumph over Milos Raonic, Gasquet’s history suggested he would mentally fade, and perhaps physically give out long before Ferrer would. As the Spaniard piled on the breaks, those forecasts appeared to come true.

Ferrer’s success against Gasquet’s serve tells the story. While failing to win more than 30% of return points in the first two sets, suddenly he won half of Gasquet’s service points. With Gasquet playing more listlessly, settling in further back in the court, a couple of breaks were plenty.

It would have been easy for the Frenchman to go away in the fifth set; he’s done it before. Ferrer’s reputation precedes him, and certainly, he showed no signs of physically weakening as the match went into its fifth hour.

But Gasquet dug out of a 15-30 hole to win his opening service game; he fought past two deuces and a break point to win his second. With both players settling in for a grind, the turning point came on Gasquet’s only break point of the deciding set, when Ferrer double-faulted to give his opponent a 5-2 advantage.

Thanks to a couple of errors from Ferrer in the final game and a big serve on match point, the Spaniard never had another opportunity. Gasquet moves to the semifinals and a probable date with Rafael Nadal.

It was only Gasquet’s second win against Ferrer in nine meetings, and only his seventh career win in a five-setter. His only previous five-set win against a higher-ranked opponent was in his only prior Grand Slam quarterfinal, in 2007 at Wimbledon against Andy Roddick.

And now, after winning his second Grand Slam fourth-round match in 17 tries, he moves to a perfect 2-0 in quarterfinals.

Here are my complete serve, return, and rally length stats for the match.

Number One Bagels and Clutch Break Points

The big story from yesterday’s action at the US Open was the dominance of the world #1s.  Both Novak Djokovic and Serena Williams dished out two 6-0 sets, making one wonder if we’d been transported back in time to the first Tuesday, when top players are more likely to face opponents who don’t challenge them.

Djokovic’s drubbing of Marcel Granollers was only the 146th men’s Grand Slam match of the Open era in which one player won two bagel sets.  That’s a little less than once per Slam for that time period.

Only 15 of those double-bagels have come in the fourth round or later, and such final-16 drubbings have gotten more rare over time–only 5 of the 15 have taken place since 1983.  The most recent was Rafael Nadal’s defeat of Juan Monaco at last year’s French Open, 6-2 6-0 6-0.  Roger Federer shows up on the list as well, twice: His quarterfinal win over Juan Martin del Potro at the 2009 Australian, 6-3 6-0 6-0, and the final in his 2004 US Open title over Lleyton Hewitt, 6-0 7-6 6-0.

Double bagels are a bit more common in the women’s game, though not as frequent for Serena at Slams as you might expect.  While there have been over 180 in the Open era, yesterday’s defeat of Carla Suarez Navarro was only her fourth.  Several of the game’s greats tallied more than that, notably Chris Evert with 13, Margaret Court with 8, and Steffi Graf with 7.

Where Serena stacks up more impressively is in her record of 6-0 sets this year.  She has now served a bagel in ten different Grand Slam matches in 2013, including two double bagels.  Only Court in 1969 and Graf in 1988 won a 6-0 set in more Slam matches in a single year, and only Graf won more 6-0 sets at Slams in a single year.

Of course, Serena isn’t done yet.  However, in nine career matches against her semifinal opponent, Na Li, she has only won a single set 6-0.  She might not want to do it again: After serving a bagel set to open their 2008 match in Stuttgart, Serena lost the next two sets for her only career loss against Li.

As we all mulled over Roger Federer’s future yesterday, Carl Bialik outlined a useful way of thinking about break point conversions.  As I noted yesterday, while Federer has played horribly on such key points in his last several slam losses, it’s not clear how much we should read into those numbers.  Yes, he probably would’ve won the match had he converted more break points, but does a dreadful 2-for-16 showing (or several) mean he is a fundamentally different player than he used to be?

Carl’s algorithm involves comparing performance on break points to performance on all other points.  If tennis players were robots, we would expect them to perform exactly as well at 30-40 as they do at 30-0.  The only slight difference is that most break points take place in the ad court, and lefties have an advantage there.  For now, let’s ignore that.

Thus, a player who wins 44% of break point opportunities against only 40% of other return points is playing 10% better in those pressure situations.  We might even say he is performing well in the clutch.
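As a rough illustration of that comparison, here is a minimal Python sketch.  It reflects my reading of the ratio described above, not necessarily Carl's exact method; the function name is hypothetical, and the point counts are invented purely to reproduce the 44% and 40% rates in the example.

```python
def clutch_edge(bp_won: int, bp_total: int, other_won: int, other_total: int) -> float:
    """Relative gap between the break-point win rate and the win rate on other return points.

    Returns 0.10 for a player who is 10% better on break points, -0.05 for 5% worse.
    """
    if bp_total <= 0 or other_total <= 0 or other_won <= 0:
        raise ValueError("need positive point counts and a nonzero baseline rate")
    bp_rate = bp_won / bp_total
    other_rate = other_won / other_total
    return bp_rate / other_rate - 1.0


# Hypothetical counts: 44 of 100 break points won, 400 of 1000 other return points won.
edge = clutch_edge(bp_won=44, bp_total=100, other_won=400, other_total=1000)
print(f"{edge:+.0%}")
```

Note that the result is a relative difference (44% is 10% more than 40%), not a gap of four percentage points.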

I ran these numbers for every member of the top 50 in 2013.  As is so often the case, the results don’t offer a lot of confidence in the connection between break point results and clutch skills.

The four players who have performed the best this year on break points, relative to other points in the same matches, are Jo-Wilfried Tsonga (+14%), Martin Klizan (+12%), Nicolas Almagro (+10%), and Ernests Gulbis (+10%).  Of the big four (or five, or seven), tops is Rafael Nadal, at +5%.

At the other end of the spectrum are Tommy Robredo (-5%), Sam Querrey (-6%), Kei Nishikori (-6%), Michael Llodra (-7%), and David Ferrer (-7%).

(These numbers don’t include the US Open.  If they did, presumably Robredo would move up a few spots.)

Federer ranks 38th among the top 50, winning 2.6% fewer break points than non-break points.  That’s certainly nothing to be proud of, but it’s only two spots behind Novak Djokovic, at -1.7%.

Another approach that matches our intuition a little better is to look only at break point opportunities–that is, clutch return points.  Here, Federer is -7.8%, worse than 40 members of the top 50.  Djokovic and Andy Murray are still in the bottom half, but a full 10 spots ahead of Roger, at -3.2% and -3.7%, respectively.  Nadal is +2.1%.

If nothing else, these numbers show us how thin the margins are in top-level men’s tennis.  A few percentage points differentiate the very best from a fading player having a disappointing season.

The presence of Djokovic so far down these lists serves as another reminder.  Converting break points is a numbers game.  Look through Novak’s season and you’ll find a couple 3-for-11s, a 2-for-12, and a 4-for-18 (against Bobby Reynolds!).  You only need to convert a few to win a match, and the best way to convert a few is to earn as many as possible.

In other words, break point conversion rates represent only a small part of a player’s performance on any given day.  Earning those break opportunities can be every bit as important, and that’s one category in which Federer remains strong.
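To see why earning chances matters so much, consider a simple binomial sketch in Python.  It assumes every break point is converted independently at the same rate, which is a simplification, and the 25% rate and the two-break target are hypothetical values chosen for illustration, not figures from the data above.

```python
from math import comb


def prob_at_least(k: int, n: int, rate: float) -> float:
    """P(at least k conversions in n break points), each converted independently at `rate`."""
    return sum(
        comb(n, i) * rate**i * (1 - rate) ** (n - i)
        for i in range(k, n + 1)
    )


ASSUMED_RATE = 0.25   # hypothetical conversion rate, not taken from the text
BREAKS_NEEDED = 2     # hypothetical target for the match

# Same conversion rate, different numbers of chances earned.
for chances in (4, 11, 18):
    p = prob_at_least(BREAKS_NEEDED, chances, ASSUMED_RATE)
    print(f"{chances:2d} chances: P(at least {BREAKS_NEEDED} breaks) = {p:.3f}")
```

Under these assumptions, the conversion rate never changes, yet the chance of getting the breaks you need climbs steadily with the number of opportunities.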

If you missed it last night, check out my recap and detailed stats for Murray vs. Istomin.

Here’s another interesting graph from Betting Market Analytics, showing win probability throughout yesterday’s Ivanovic-Azarenka match.  Because Vika was so heavily favored yesterday, she retained a better than 50/50 chance of winning the match even after Ana took the first set.

Unexpected Quarterfinalists: Gasquet, Hantuchova, and Not Fed

Yesterday, Richard Gasquet won a fourth-round match at a Grand Slam.

If that doesn’t surprise you, you haven’t been paying much attention to Gasquet for, say, the last eight years.  The Frenchman with the stunning backhand has advanced to the fourth round at a Slam 17 times now, making him only the 35th man in the Open era to do so.  The problem is what happens next.

Entering yesterday’s match, Gasquet was 1-15 in round-of-16 matches at majors, his one victory coming at 2007 Wimbledon over Jo-Wilfried Tsonga.  Since then, he’s lost his last eleven tries, including one to Tsonga and two to David Ferrer, his quarterfinal opponent this week.  No player has lost more than 15 fourth-round Slam matches; only Wayne Ferreira reached the same plateau, and Lleyton Hewitt will match it if he loses today.

One thing that has held him back is an inability to beat higher-ranked players, as Carl Bialik noted earlier this year.  At slams, he has played 28 matches against players with superior ATP rankings, and won only four of them.  Against lower-ranked players, he is 62-11.  Since Gasquet’s ranking has rarely reached the top eight, that mark hasn’t helped him in the fourth round, where players outside of the top eight generally meet a higher-ranked opponent.

Now that Gasquet has broken through with his second Grand Slam quarterfinal appearance, history suggests he’ll go no further.  He has beaten Ferrer only one time in nine tries, and that was five years ago.  And Ferrer’s ranking puts him firmly in the category of guys Gasquet doesn’t beat at majors.

There’s one reason for hope, though.  Despite all the disappointment in the fourth round, he has never lost a Grand Slam quarterfinal.

Daniela Hantuchova’s appearance in the quarterfinals of this year’s US Open is surprising for a different reason.  When the tournament began, her spot was pegged for Petra Kvitova, before an ailing Kvitova was upset by Alison Riske.  For all my talk recently about easy brackets on the men’s side, no one in either singles draw has faced such lowly-ranked competition.

Hantuchova’s four opponents thus far include two qualifiers and two wild cards.  Among them, only Riske is ranked inside the top 100, and she’s #81.  By contrast, Hantuchova’s presumptive quarterfinal opponent, Victoria Azarenka, will have faced #13 and #28.

Of over 850 women’s Slam quarterfinalists since 1987, only six have reached the quarters without playing someone in the top 80.  The luckiest path was that of Claudia Kohde-Kilsch, who reached the 1989 Wimbledon quarterfinals by beating #126, #246, #247, and #131.  Then her luck ran out: Steffi Graf ended her run in the quarters.  Steffi herself is one of the six, having won her first four rounds at 1993 Wimbledon without playing anyone ranked better than #87.

These lucky draws have become less common in recent years.  Of the six, only one has occurred since Steffi’s run in 1993.  Nadia Petrova reached the quarterfinals at the 2006 Australian Open without having to beat anyone ranked better than #100.

After four easy matches, there’s little pattern to how these players fare in the quarters.  As we might expect, the success rate in their fifth matches has much more to do with their quarterfinal opponents than the women they faced to get there.

And perhaps you’ve heard: Tommy Robredo defeated Roger Federer in straight sets.

It was the first time in twelve meetings that Robredo beat Fed.  It’s the Spaniard’s first quarterfinal appearance in New York, despite seven previous fourth-round showings (including one against Roger, in 2009).  Even Gasquet hasn’t been that bad, losing in the US Open round of 16 a mere four times.  And Robredo pulled off the upset while winning fewer return points than his opponent did–something that happens in only one of 15 US Open men’s matches.

When oddities like this occur–Gasquet’s match is another, as he won only 48.5% of total points–it is almost always because the winner played much better on high-leverage points.  In many matches, those important moments are at the back end of tiebreaks, when two points can make or break a set.  In Federer’s loss, the finger-pointing is directed at break points.  Roger barely converted any of them. It’s been a problem for Fed for years, particularly in his last several Slam losses.

It’s difficult to know how to evaluate poor break point performances.  In one sense, it’s obvious: If Fed was going to win the match, he needed to win more.  A failure to convert break points is a good explanation for any loss.

But what does it say about Fed’s current level, or about what we can expect from him going forward?  Is he suddenly weak on break points?  When I ran the numbers a couple of years ago, he was winning slightly fewer return points in the ad court, but the difference isn’t nearly extreme enough to explain a 2-for-16 performance on break points.

What’s particularly frustrating about squandering so many break points is that he earned them with good play on other return points. And, of course, there’s no difference between a typical ad-court point and a break point except for the pressure.

So, if Federer is still generating all those break-point opportunities, is he simply suffering through a run of bad luck?  Has he lost his clutch superpowers?  Have other players ceased to fear him in big moments?  Judging from the growing number of surprising defeats in Roger’s record, it certainly seems to be something more than bad luck.

Finally, a couple of notes.

Don’t miss this win probability graph of the Raonic-Gasquet match.  Mike says it’s “almost too interesting.”

In the New York Times Straight Sets blog (known for its coverage of the United States Open), Clayton Chin gives a brief overview of a forecasting method.  He emphasizes his reliance on the Monte Carlo method–a technique that utilizes thousands or even millions of simulations–which isn’t necessary here.

If you estimate each player’s serve and return points won, it’s straightforward to calculate each player’s chances of winning a game, set, or match.  Generally speaking, Monte Carlo techniques are useful when such closed-form solutions aren’t available.
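As an illustration of what a closed-form calculation looks like at the game level, here is a minimal Python sketch.  It assumes every point on serve is won independently with the same probability p, which is a simplification, and it is not Chin's method; the function names are mine.  A small simulation is included only to show that the two approaches agree.

```python
import random


def game_win_prob(p: float) -> float:
    """Probability the server wins an ad-scoring game, given point-win probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    # Win to love, to 15, or to 30: the server takes the last point in each case.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach 3-3 (deuce): 20 orderings of three points each.
    reach_deuce = 20 * p**3 * q**3
    # From deuce, win two straight before the opponent does.
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


def simulated_game_win_rate(p: float, games: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, for comparison only."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        server, returner = 0, 0
        while True:
            if rng.random() < p:
                server += 1
            else:
                returner += 1
            # A game ends once someone has 4+ points and leads by 2.
            if server >= 4 and server - returner >= 2:
                wins += 1
                break
            if returner >= 4 and returner - server >= 2:
                break
    return wins / games


print(f"closed form: {game_win_prob(0.6):.4f}")
print(f"simulated:   {simulated_game_win_rate(0.6, 20000):.4f}")
```

The same idea extends to sets and matches by chaining the game-level probabilities, though the bookkeeping gets longer.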

The most important part of Chin’s approach is one he doesn’t shed any light on.  If Serena is holding serve at a certain rate and breaking serve at a certain rate over the course of the year, how do you generate hold and break rates for an individual match?  It can be done, and many have tried, but that’s much more challenging than simulating outcomes at the match or tournament level.  Without that glimpse under the hood, it’s tough to know how much weight to give his results.

Doubles Chaos, R2 Rigging, and the Threat of Watson

Today Bob Bryan and Mike Bryan open up their title defense in Flushing.  They’ve won four Grand Slams in a row, so winning this one would give them a calendar-year Slam, one of the few accomplishments they don’t already have in their pockets.

What makes this so impressive to me is the unpredictability of men’s doubles results, not to mention the utter chaos that reigns these days in the sport.  As I wrote after last year’s surprise Wimbledon results, men’s doubles is so heavily serve-oriented that it often comes down to a tiebreak or two.  For most teams, that means that winning a tournament is roughly equivalent to guessing right on a series of coin flips.

For the Bryans to remain so dominant, they need to break serves that are rarely broken and win plenty of the tiebreaks that ensue when they don’t.  Roughly speaking, it’s as if John Isner stopped getting broken and improved his already impressive record in 7-6 sets.

Before the rain struck, yesterday provided a case in point of how good teams can easily suffer a bad loss.  Max Mirnyi and Horia Tecau make up one of the few teams that has remained together lately.  They aren’t unbeatable, but both are very good doubles players.  In their first-rounder yesterday, they lost in straight sets to Pablo Cuevas and Horacio Zeballos.  Yes, both of their opponents have strong doubles resumes, but Cuevas has been injured for what seems like years, and Zeballos was sick.  And neither plays nearly as much doubles as Mirnyi and Tecau do.

That sort of thing happens at every tournament.  We’ll see more of it in the next two days.  Somehow, it seems only the Bryans are immune.

Remember a couple years ago, when ESPN thought they discovered that the US Open was rigging the draw in favor of the top two seeds?  They weren’t, but tournament favorites have gotten a lot of easy first-round matches over the years.

While it’s surely just an accident, one can’t help but think about it when looking at the men’s second-round draw.  Each of the original big four is playing a virtual non-threat, as is David Ferrer.  Djokovic gets Benjamin Becker, Murray drew Leonardo Mayer, Federer gets Carlos Berlocq, and Nadal drew Rogerio Dutra Silva.

To find a second-round match with some interest, you have to look to sixth-seed Juan Martin del Potro, who drew Lleyton Hewitt.  Even eighth-seed Richard Gasquet gets off easy, drawing qualifier Stephane Robert.

Sure, Slam second rounds aren’t always filled with interest.  But there are plenty of unseeded players–like Hewitt, or even Lleyton’s victim yesterday, Brian Baker–who could make things interesting for a top seed.  Ivo Karlovic, Gael Monfils, and Marcos Baghdatis, frequently cited as floaters, will face lower-ranked seeds, while Bernard Tomic and Jack Sock have clear paths to the third round.

In other words, we can look forward to some more blowouts on the show courts.

Could IBM’s contribution to the US Open get any worse?  It seems that the corporate giant has a team working hard on just that.

For those hardy enough to venture to the company’s website, there is a blog post called–I kid you not–“What if Watson Showed Up at the US Open Tennis Championships?”

(They’re not talking about Heather.)

The answer is predictable: A bunch of amazing stuff will happen, what with the leveraging and the analytics and undoubtedly some synergies.  And predictive.

Aaron believes that cognitive technologies could utterly transform the US Open, from the way the technology responds to changes in demand for computing resources to the experiences of the fans, commentators and players. “Watson could bring a whole new level of engagement. It’s a cognitive agent that can improve the interactions between all of the people involved and between them and the event itself,” he says.

Ooh, cognitive agent!

He envisions augmenting Watson with predictive analytics technologies the sports events team has created for the US Open. In this future scenario, that technology would help commentators analyze and offer insights about matches with a level of accuracy never possible before.

We can only hope that IBM’s Watson team is completely different from IBM’s current tennis group.

On the subject of analytics–but I hope not embarrassingly bad ones–please check out my post last night with extremely detailed return profiles for Brian Baker and Lleyton Hewitt.  Return stats like you’ve never seen them before.

Contrasting Serves, Futile Slams, and (More) IBM Shortcomings

In most of his matches, John Isner makes his opponents look short and their serves look weak.  What happens, then, when his opponent really is short, with one of the weakest serves in the game?

Third up on grandstand today, Isner takes on Filippo Volandri, the man who sets records Isner will never reach.  Three years ago, the Italian failed to hit a single ace for 19 straight matches.  Volandri may not be as short as some players on tour–the ATP site lists him at six feet–but it’s more common for him to fail to hit an ace in a match than it is for him to hit one.

In the last year, Isner has hit nearly 19% of his first serves for aces, good for best among tour regulars.  In the top 50, the other extreme is represented by Nikolay Davydenko, whose rate is just under 3%.  Volandri–despite playing many weaker opponents on the Challenger tour–sits at 0.8%.

The good news for Big John is that the 31-year-old Volandri is a nonentity on hard courts, having not played on the surface since losing in the first round of the Australian. The bad news? He’ll have to hit a lot of returns today.

As my forecast very delicately predicted, Fernando Verdasco didn’t live up to his seed, losing to the barely-unseeded Ivan Dodig yesterday in five sets.  That’s the fourth slam this year in which he’s lost in a five-setter.

Verdasco, with his flashy talent and underwhelming results, comes in for his share of fan mockery.  But this is one time he doesn’t deserve it.  Out of the several dozen players who enter all four slams each year, almost all will lose four matches.  While it may be frustrating to lose in five, losing in five, all else equal, says better things about your game than losing in three.

One of those five-set losses this year was to Andy Murray at Wimbledon; the other two came against Janko Tipsarevic and Kevin Anderson.  Perhaps Fernando should have finished off at least one of those matches, but none of his four slam losses this year are nearly as groan-inducing as, say, Ernests Gulbis’s disaster yesterday against Andreas Haider-Maurer.  And his record is nothing compared to Marinko Matosevic’s streak of 11 losses in 11 slam appearances.

Verdasco is the sixth man in the Open era to complete this distinctive slam feat, and he’s not in bad company. Last year, Isner did it–and added an exclamation point with a five-set loss in Davis Cup.  Before that, the most recent were Fernando Gonzalez in 2006 and Tim Henman in 2000.

Anyway, if you’re drawn to this unusual feat, don’t miss Steve Johnson’s first-round match with Tobias Kamke. It’s last on Court 13 today. Johnson is three-quarters of the way to the Fernando slam, losing all three of his matches at majors this year in five sets.  If he completes the set, it will be particularly impressive for at least one man: Kamke has won only two five-setters in his career.

As part of IBM’s ham-handed PR push leading up to another slam, the company gave analyst and coach Craig O’Shannessy some data.  He reported some results on both the ATP site and the New York Times Straight Sets blog.

This is a huge step up from the thinly-veiled advertisement I highlighted yesterday.  But it still, frustratingly, falls short.

One of the major points of Craig’s ATP piece is summarized at the beginning: “Most baseline points are a losing proposition,” and “Approaching the net is a goldmine.”  Later, he continues, “It seems amazing that players don’t venture forward more often to capitalize on the far higher winning percentage approaching offers over baseline play.”

Is this the data-driven, actionable advice I pleaded for last week? Not quite.

As I’m sure Craig would agree, opportunities to come to net aren’t always available, and they don’t arise in a vacuum.  Especially in today’s baseline-focused game, net points tend to occur when one player hits a particularly weak shot.  So if most net points end in victory for the player who approaches, is that because of the choice to come to net, or the weak shot that generated that opportunity?

Think about it probabilistically.  When Djokovic serves against Tsonga, let’s say he has a 75% chance of winning a first serve point.  If Tsonga hits a weak chip return in the middle of the court, allowing Novak to take several steps forward, we could figure that Djokovic’s chance of winning the point increases to 95%–perhaps higher.  When Novak puts away his second shot, he wins the point.  Formally speaking, his chance of winning jumps to 100%.

Now, in that example, what do you credit as the reason for Djokovic winning the point?  Landing a solid first serve, which gives him a 75% chance of winning instead of, say, 60%? A particularly good first serve, which forced the weak return?  Tsonga’s poor return? Or Novak’s “choice” to approach the net?

That final choice is laughable.  And this is the data he’s drawing from.  Aside from a few particularly aggressive players on tour, that’s the profile of a net point in 2013.
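
To make the selection effect concrete, here is a minimal Python sketch.  Every number in it is invented for illustration; none comes from IBM’s data or from Craig’s article.  It assumes a player approaches only after his opponent hits a weak shot, and that coming forward adds nothing to his chances once that weak shot has been hit.

```python
# Hypothetical numbers, chosen only to illustrate the selection effect.
P_WEAK_SHOT = 0.20        # share of rallies in which the opponent hits a weak ball
P_WIN_AFTER_WEAK = 0.85   # chance of winning once the weak ball arrives
P_WIN_OTHERWISE = 0.45    # chance of winning an ordinary baseline rally


def observed_win_rates(approach_bonus=0.0):
    """Return (net_win_rate, baseline_win_rate) as raw stats would show them.

    The player comes to net only after a weak ball, so every net point
    is also a weak-ball point.  approach_bonus is the true causal effect
    of coming forward; the default of zero means approaching changes nothing.
    """
    net_win_rate = min(1.0, P_WIN_AFTER_WEAK + approach_bonus)
    baseline_win_rate = P_WIN_OTHERWISE
    return net_win_rate, baseline_win_rate


net, baseline = observed_win_rates()
overall = P_WEAK_SHOT * net + (1 - P_WEAK_SHOT) * baseline
print(f"Net points won:      {net:.0%}")
print(f"Baseline points won: {baseline:.0%}")
print(f"All points won:      {overall:.0%}")
```

Even with the bonus set to zero, the raw numbers show net points as far more successful than baseline points.  That is the same pattern the article treats as a reason to come forward more often, and in this toy setup it is produced entirely by the weak shot that came first.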

So, what’s the actionable advice here?  You probably shouldn’t approach the net without a reasonable opening, so … hit bigger serves to get more weak returns? Hit deep groundstrokes into corners? Take advantage of short balls?

These are the benefits we reap from “Big Data?”

IBM clearly wants to wow us with this stuff.  Yet the “findings” are so elementary as to be useless.  The solution is so simple: release the data, let fans and analysts innovate, and watch the quality of this work go through the roof.

Dodig’s Consistency, IBM’s Offensive, and Hopeless Wild Cards

Ivan Dodig just missed out on a seeding at this year’s US Open.  Ranked 37th when seeds were assigned, he had ascended as high as #35, largely on the strength of his fourth-round showing at Wimbledon.

While the Croatian could have drawn any seed as early as the first round, he got lucky, pulling 27th-seeded Fernando Verdasco.  My forecast underlines his fortune, giving him a 51% chance to advance to the round of 64, then roughly even odds again to make the round of 32 against (probably) Nikolay Davydenko–another player who fell just outside the seed cut.

Making the Dodig-Verdasco comparison more interesting is that in the last 52 weeks, the unseeded player has won more matches (38 to 29) with a higher winning percentage (58% to 56%).  What the Spaniard has done, however, is bunch his wins much more effectively than his first-round opponent.  While Dodig achieved a career highlight with his R16 showing in London, Verdasco made the quarters.  Fernando reached the final in Bastad, and earlier in the year, won two matches at the Madrid Masters.

A telling comparison is that while Dodig has lost five opening-round matches in the last year, Verdasco has lost nine.  As Carl Bialik explained two years ago, consistency isn’t such a great thing in tennis.  Certainly, the ATP rankings–and the seedings that utilize them–prefer inconsistency.
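
A rough way to see why the rankings favor inconsistency: points grow faster than linearly with each round won.  The sketch below is only an illustration.  It uses a made-up points table that doubles with each additional win (the real ATP tables differ by event) and two hypothetical players who win the same number of matches across four tournaments.

```python
# Hypothetical points table: points earned for a given number of
# match wins at one event.  Real ATP tables vary by tournament level.
POINTS_FOR_WINS = {0: 0, 1: 20, 2: 40, 3: 80, 4: 160, 5: 320}


def season_points(wins_per_event):
    """Total ranking points for a list of per-event win counts."""
    return sum(POINTS_FOR_WINS[wins] for wins in wins_per_event)


steady = [2, 2, 2, 2]    # eight wins, spread evenly
streaky = [0, 0, 3, 5]   # eight wins, bunched into two deep runs

print("steady: ", sum(steady), "wins,", season_points(steady), "points")
print("streaky:", sum(streaky), "wins,", season_points(streaky), "points")
```

Under these assumed values, the streaky player finishes well ahead despite two opening-round losses, because the two deep runs are worth more than four second-round exits.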

You know there’s a Grand Slam in the offing when the PR pieces from IBM start to appear.  Last week, a particularly bald-faced plant showed up in the New York Times, a publication that–one fervently hopes–should know better.

This particular piece includes such hard-hitting journalism as, “The keys are updated during matches to track any shift in momentum, and they correlate well with the final outcome,” and “These extra features are likely to drive traffic to the event’s Web site, USOpen.org, and its various mobile versions.”

The Times should be embarrassed.  What makes this particularly frustrating to the statistically-oriented fan is that while IBM speaks the right language, the results of this effort to “fulfill fans’ desire for deeper knowledge” are so disappointing.

The much-vaunted Keys to the Match are frequently arbitrary, often bizarre.  In Kei Nishikori’s second-round match at Wimbledon, one of his “Keys” was to “Win between 71 and 89 of winners on the forehand side.”  He didn’t do that–whatever it means, exactly. He didn’t meet the goals set by his two other Keys, either, yet he won the match in straight sets.

Most frustrating to those of us who want actual analysis, the underlying data–to the extent it is available at all–is buried almost beyond the possibility of a fan’s use.  IBM–like Hawkeye–is collecting so much data, yet doing so little with it.

Lots of fans do desire more statistical insight. Much more. The raw material is increasingly collected, yet the deeper knowledge remains elusive.

Stay with me as I leap from one hobby-horse to another.

Wild cards cropped up as a topic of conversation last weekend, largely thanks to Lindsay Gibbs’s piece for Sports on Earth, in which Jose Higueras said, “If it was up to me, there would be no wild cards. Wild cards create entitlement for the kids. I think you should be in the draw if you actually are good enough to get in the draw.”

I don’t object to wild cards used as rewards, like the one that goes to the USTA Boys’ 18s champion, or the ones that the USTA awards based on Challenger performance in a set series of events.  There’s even a place for WCs as a way to get former greats into the draw. James Blake shouldn’t have gotten the deluge of free passes that he has received in the last few years, but it’s probably good for the sport to have him in more top-level events than he strictly deserves.

The problem stems from all the other wild cards, and not just from a player development perspective.  Are fans going to get that much enjoyment out of one or two matches from the likes of Rhyne Williams and Ryan Harrison, Americans who didn’t have a high enough ranking to make the cut?  Of the fourteen Americans in the men’s main draw, six were wild cards, and it would shock no one if those six guys failed to win a single match.

There are further effects, as well.  Because Williams, Harrison, Tim Smyczek, and Brian Baker were exempted from the qualifying tournament, fans seeking quality American tennis last week barely got to see any.  Donald Young–who has received far too many wild cards himself–was the only American to qualify, largely because the US players at the same level as the other would-be qualifiers didn’t have to compete.  The remaining Americans were in over their heads.

This leads me to a great alternative suggested by Juan José Vallejo on Twitter: Be liberal with free passes in qualifying, and take the opportunity to promote those early rounds much more.  At the Citi Open a few weeks ago, the crowds on Saturday and Sunday for qualifying were comparable to those Monday and Tuesday.  Because qualifying often falls on the weekend, the crowds are there.  But if they want to see Jack Sock play, they’ve got to come back Tuesday night (and spend a lot more money), and they’re much more likely to see him overmatched by a better, more experienced player.

Cut the entitlement, improve the quality of main draw play, and give the fans more chances to watch up-and-coming stars.  I wish there were a chance this would happen.