Wednesday, February 20, 2013

Creating a Formula for "Expected Wins"...or How Brad Keselowski Won the Championship

Guess who is the all-time leader in statistically winning more than expected?

Your answer: 2012 Champion Brad Keselowski.

Of all modern drivers, he has won the most races relative to what his laps-led performance would predict. Consider his very first win at Talladega: he led only one lap in that race, the final one. It was his first career lap led, and it got him a race win. That's the most extreme way of outperforming expectations.

The Question
"How many race wins should I expect from any given driver?" Based on how a driver performs during a race, how likely should we expect them to be to win it? And can we quantify this over an entire career? Could I use this number to better predict long-term performance, and to measure who is over-performing or under-performing?

The Inspiration
Bill James's "Pythagorean expectation", originally applied to baseball, is a way to predict how many wins a team would get based on how many runs it scored vs. how many runs it gave up. This equation was later modified for sports like football and basketball.

To adapt this approach from stick-and-ball sports (where each game has one winner and one loser) to racing (where there is only one winner but multiple losers), we have to make a couple of tweaks.

The Racing Formula
Expected Win Percentage = Laps Led / Laps Competed

The idea here is that if a driver leads 5% of their career laps, then we expect them to win 5% of their races. If you take this example to the extremes, a driver who leads 0 laps would win 0 races, and a driver who leads 100% of their laps would win 100% of their races.
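As a rough illustration, the formula can be sketched in a few lines of Python. The function names and the driver totals below are made up for the example, and multiplying the expected win percentage by career starts to get an expected win count is my reading of the 5% example above, not a separate formula.

```python
def expected_win_pct(laps_led, laps_competed):
    """Expected Win Percentage = Laps Led / Laps Competed."""
    if laps_competed <= 0:
        raise ValueError("laps_competed must be positive")
    return laps_led / laps_competed


def expected_wins(laps_led, laps_competed, starts):
    """Assumption: expected wins = expected win percentage * career starts."""
    return expected_win_pct(laps_led, laps_competed) * starts


# Hypothetical driver (not real data): leads 3,000 of 60,000 career laps
# across 200 starts, i.e. 5% of laps led.
pct = expected_win_pct(3_000, 60_000)
wins = expected_wins(3_000, 60_000, 200)
print(f"expected win pct: {pct:.1%}, expected wins: {wins:.1f}")
```

A driver who leads no laps gets an expectation of zero wins, and one who leads every lap is expected to win every start, matching the two extremes described above.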

When we test this formula going back nearly 50 years to all race winners who had a minimum of 25 career starts, we get a very good r-squared of 83% (across 93 qualifying drivers).
In the chart above, points above the red line are drivers who won more races than expected. Points below the red line are drivers who won fewer races than their laps led would have suggested. These are the drivers who led a ton of laps but didn't have the right stuff at the end of the race.
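Here is a minimal sketch of how that comparison could be reproduced. It makes several assumptions: the red line marks where actual wins equal expected wins, r-squared is taken as the squared Pearson correlation between expected and actual wins, and the three drivers in the sample list are invented placeholders, not real career totals.

```python
def wins_above_expected(wins, laps_led, laps_competed, starts):
    """Actual wins minus expected wins (positive = above the line)."""
    expected = laps_led / laps_competed * starts
    return wins - expected


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Made-up drivers: (name, wins, laps_led, laps_competed, starts)
drivers = [
    ("Driver A", 12, 3_000, 60_000, 200),
    ("Driver B", 4, 4_500, 45_000, 150),
    ("Driver C", 20, 9_000, 90_000, 300),
]

expected = [led / competed * starts for _, _, led, competed, starts in drivers]
actual = [wins for _, wins, *_ in drivers]

for name, wins, led, competed, starts in drivers:
    diff = wins_above_expected(wins, led, competed, starts)
    label = "over-performing" if diff > 0 else "under-performing"
    print(f"{name}: {diff:+.1f} wins vs expectation ({label})")

print(f"r-squared: {r_squared(expected, actual):.2f}")
```

Sorting drivers by the difference between actual and expected wins is one way to build a best-and-worst ranking. Dividing actual wins by expected wins instead would favor drivers with short careers, so the choice of metric matters.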

The Implications
Drivers who win more races than their laps-led expectation are finding ways to win without dominating races. They might lead only the final few laps, thanks to smart strategy, patience, working well with their crews, or not burning up their equipment with unnecessarily hard racing earlier in the race.

Drivers who lead many laps but don't win as many races are too fast, too furious. They have the speed but don't know how to consistently convert it into an actual win.

People should consider this wins expectation statistic when looking for the drivers who can make the most out of nothing. The over-performing drivers might be better picks for fantasy racing, as they find ways to win when others burn out. In the actual (non-fantasy) racing business, these drivers might be better picks for owners looking to hire new talent, since they can help make your team better rather than burn out your equipment. Additionally, you could measure crew chief performance by calculating their rate of squeezing out unexpected wins, no matter who drives for them.

Recent examples
As I mentioned above, Brad Keselowski is the all-time leader in winning more than expected. See Table 1 below.

We also see Jimmie Johnson and Kevin Harvick on that list, two more drivers who seem to have a knack for coming out of nowhere to get wins.

And on the flip side, Kyle Busch is the active driver who has most underperformed his in-race abilities. This is no surprise to anybody who watches the races: he dominates the early and middle parts of races, but so often loses at the end.

Table 1: All-time best and worst drivers at winning races vs laps expectation