Thursday, April 25, 2013

Video discussing the Penalty article and Previewing Richmond

You can also find this at StatsInsights.com

Wednesday, April 24, 2013

Do Penalties even matter in NASCAR? When and where cheating might pay off

(note: this was written before the Matt Kenseth penalty of April 24 was announced)

(This is a cross-post with StatsInsights.com)

The big news of the last few days is NASCAR levying harsh penalties against Penske drivers Brad Keselowski and Joey Logano. Each was docked 25 points because his car's rear axle housing was skewed rather than perpendicular to the chassis. It appears the penalties were applied because NASCAR felt the team modified this part to gain an unfair advantage.

What are the implications of these penalties?

First, let's take a look at what 25 points means in the standings. Keselowski would be one rank higher, and Logano would be six ranks higher if not for the penalties:

Table 1: Effect of Penske penalties in 2013 Standings

But what are the longer-term effects? Before we get into the data, there are three basic hypotheses for what could happen:
  1. The penalty overwhelmingly hurts the team's performance for that year. This suggests the penalties are too harsh for the infraction.
  2. The penalty is too lenient, and doesn't give teams an incentive to play by the rules. Teams keep taking advantage of the system, because they gain more points by outperforming their competitors than they give up in penalties (assuming they are even caught).
  3. The penalty has no overall effect: presumably this means that the performance gained was equal to the points lost once caught. This suggests a fair penalty process.

Let's go back through 2007 and look at all the points penalties in Sprint Cup:

Table 2: Recent Points Penalties
Data notes:

  • Source is Jayski.com
  • We are ignoring Carl Long's penalty (because he didn't compete in any Sprint Cup races in those years - his was an All-Star race penalty)
  • The blue zones at top are based on 2013 standings, and not finalized


What does the data teach us?

1) The average points finish of a penalized team is 21. This is right in the middle of the pack (of 43 cars each week), suggesting that NASCAR is penalizing teams equally up and down the performance spectrum (good, average, and poor teams are all getting penalized).

2) There is little evidence to suggest that the penalty makes any difference in performance. When comparing points in the penalized year with both the year prior and the year after, the average difference is basically 0. Notice in this table the rankings are all about 21/22.


Table 3: Yearly Summary of Penalties

Figure 1: The difference in Season Rank averages 0

3) If you ONLY look at the top 25 teams, there is some evidence that the penalties have an overall negative effect on them. Notice in this table the rankings decline from 11 to 12 to 13 in successive years.

Table 4: Avg Season Rank of Top 25 Teams

4) If you look ONLY at teams in the top 10, the penalty does not offset the gain in the year the team is caught: these teams still improve by 2 ranking spots over the prior year. The following year, however, they drop an average of 5 ranking spots. This suggests that cheating is worth a good short-term gain, but is not worth it in the long run.

Table 5: Avg Season Rank of Top 10 Teams

What Relationships Can We Infer from the Data?

The effect of cheating and being penalized for it is negligible on the average NASCAR team. But the more competitive teams see the biggest gain from cheating, and conversely, the biggest drop in performance in the year following their penalty. This brings up a good short-term vs. long-term tradeoff question that each team can answer for themselves: is next year's loss from being penalized worth the short-term gains from bending the rules now?

Friday, April 19, 2013

Latest BSports video, previewing Kansas this weekend

Thursday, April 18, 2013

A Smarter Way to Predict the Chase Cutoff: Why Making the Chase is about Racing Against History, Not Other Drivers

(This is a cross-post with StatsInsights.com)

After 26 races, the top ten drivers in the points standings qualify for the Chase (plus two wildcard drivers).

The key obviously is making it into the top 10 in points. That tenth position takes on supreme importance in discussions throughout the season, every week from now through the summer.

Here's the fascinating thing: cracking the top 10 is not really about beating the other drivers, but about racing against a historically consistent benchmark. If you focus on making that benchmark, it doesn't matter what the other drivers do.

Let's take a closer look.

We know that after each race, the winner gets 43 points, and last place gets 1 point (each position is worth one point), plus some bonus points.

Here's what happens, though: after just a few races, the number of points it takes to hold any given position is very consistent.

I have highlighted 10th place: notice that the average points per race required to be in 10th place is very consistent throughout the season, hovering between 30 and 31. No matter how many races into the season we are, 30 points per race will keep you in the 10th spot. For example, right now 10th place in points is Paul Menard. His 206 points after 7 races is an average of 29.4 points per race, very close to the predicted average.

What does this mean in terms of qualifying for the Chase? After 26 races, you would need about 26*30 = 780 points to make the top 10. Using 800 points as a round number for where a driver would need to be after 26 races, it's very easy to predict now what each driver needs to do in the next 19 races to make it in.


In the table above, you see that Kyle Busch could average just a 15th place finish from here on out and still qualify. Jeff Gordon would need to do significantly better, averaging better than 11th in every race to make it in. Here you see how important every position is each week. Paul Menard, for example, needs to average 12th each week to make the Chase, but averaging 13th will mean he probably won't make the cut.
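
The benchmark arithmetic can be sketched in a few lines of Python. This is only a rough sketch, not the method used to build the table: it assumes the 800-point round-number target, ignores bonus points (so a finish in position p is worth 44 - p points in a 43-car field), and the function names are made up for illustration.

```python
import math

TARGET_POINTS = 800   # round-number benchmark from the article
TOTAL_RACES = 26      # races run before the Chase field is set
FIELD_SIZE = 43       # winner earns 43 points, last place earns 1


def points_needed_per_race(current_points, races_run):
    """Average points per remaining race needed to reach the benchmark."""
    remaining = TOTAL_RACES - races_run
    if remaining <= 0:
        raise ValueError("no races left before the Chase cutoff")
    return (TARGET_POINTS - current_points) / remaining


def worst_qualifying_finish(points_per_race):
    """Worst whole-number average finish that still earns enough points.

    Assumption: no bonus points, so position p is worth FIELD_SIZE + 1 - p.
    """
    return math.floor(FIELD_SIZE + 1 - points_per_race)


# Paul Menard: 206 points after 7 races (figures quoted in the article)
needed = points_needed_per_race(206, 7)
print(round(needed, 1), worst_qualifying_finish(needed))
```

For Menard this works out to roughly 31.3 points per race, which corresponds to an average finish of 12th: 19 races at 32 points each would bring him to 814, while 19 races at 31 points each would leave him at 795, just short of 800.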

The most important takeaway from this table is that a driver just has to focus on hitting a target; the Chase berth will naturally follow. Using this thought process may be easier, less stressful, and more predictable for teams and their drivers: focus on staying consistent, executing well, and hitting the benchmark rather than worrying about how everybody else is doing.  

(If you like this article, check out this post in 2009 where I described Chase qualifying levels, based on the old points system)

Wednesday, April 10, 2013

What Makes a Race Competitive? A mathematical look at the factors that cause more lead changes in a race.

(This is a cross-post with StatsInsights.com)

Lead changes in any NASCAR race are always a big topic in the media. Lead changes are used as a barometer for how entertaining or competitive a race is from a product and TV spectator standpoint. To give you a reference point, on average there were about 20 lead changes per race in the Chase Era.

What if we could predict lead changes? Let's look at the factors that cause lead changes to vary in a race.

For this exercise, we look at all the races in the 9 full seasons of the Chase Era (2004-2012), or 324 total races.

After running many linear regressions with various combinations of factors, here are the four that dominate how many lead changes a race will have:

Length of Race (in miles, not laps): Every 100 miles adds 4.7 lead changes.
Number of Caution Laps: Every 20 laps of caution adds 1.5 lead changes to the race.
Restrictor-plate race: Being a plate race itself adds 22 lead changes on average.
Saturday race: The effect of racing on Saturday decreases lead changes by 2.6 on average.
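
To make the four coefficients concrete, here is a small Python sketch that compares two races. The regression intercept is not reported here, so the sketch deliberately returns only the expected DIFFERENCE in lead changes between two races rather than an absolute prediction. The `Race` class and function names are hypothetical, and treating the four effects as simply additive is an assumption.

```python
from dataclasses import dataclass

# Coefficients quoted in the article
PER_100_MILES = 4.7         # lead changes per 100 miles of race length
PER_20_CAUTION_LAPS = 1.5   # lead changes per 20 caution laps
PLATE_EFFECT = 22.0         # restrictor-plate race
SATURDAY_EFFECT = -2.6      # Saturday race


@dataclass
class Race:
    miles: float
    caution_laps: int
    plate: bool
    saturday: bool


def slope_contribution(race):
    """Sum of the four factor effects (the unreported intercept is omitted)."""
    return (
        PER_100_MILES * race.miles / 100
        + PER_20_CAUTION_LAPS * race.caution_laps / 20
        + PLATE_EFFECT * race.plate
        + SATURDAY_EFFECT * race.saturday
    )


def expected_difference(race_a, race_b):
    """Expected lead changes in race_a minus race_b; the intercept cancels."""
    return slope_contribution(race_a) - slope_contribution(race_b)


# Same cautions and same day, but 500 miles instead of 400
long_race = Race(miles=500, caution_laps=40, plate=False, saturday=False)
short_race = Race(miles=400, caution_laps=40, plate=False, saturday=False)
print(round(expected_difference(long_race, short_race), 1))
```

Under these assumptions, stretching a race from 400 to 500 miles is worth about 4.7 extra lead changes, and a Saturday plate race would be expected to have about 19.4 more lead changes than an otherwise identical Sunday non-plate race.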

What Can We Infer from These Factors?
  • Obviously, the longer the race, the more opportunities there are for lead changes.
  • The more caution laps in a race, the more chances there are for different pit strategies, problems on pit road for the leader, drivers staying out during the pit sequence to pick up the bonus for leading a lap, and cars bunching up on restarts so that somebody else can take the lead.
  • We know plate racing has always resulted in more lead changes (because the cars are all bunched up together, racing in a large pack); now we have quantified that factor.
  • The one factor that sticks out is Saturday racing. There are fewer lead changes for a Saturday race. Why is that? Does the shorter week make it harder for teams in the shop to work on their cars and get them ready for the weekend? Do the Saturday races have compressed practice schedules, making it harder for the lagging teams to improve their cars? Is it because Saturday races are usually at night, and the cooler temperatures make it harder to compete? This is a very interesting factor to focus on in the future.

 
These four factors explain almost 60% of the lead changes in any race.

Now let's focus on the factors that DO NOT have a statistically significant effect on lead changes in a race, even though a lot of these might get discussed in the media as having an effect:

  • Month of the year: The number of lead changes doesn't vary over the course of the year.
  • Number of LAPS in a race: As we saw above, it's the race length measured in MILES that matters, but not the number of laps.
  • Road course: Racing on a road course (Sonoma or Watkins Glen) has no significant effect on lead changes.
  • Track Length: Whether the track is half-mile Bristol or 2.5-mile Pocono, this doesn't affect how many lead changes will occur.
  • Race speed: Some tracks are faster than others, and average speeds can vary from track to track. But this is another area where there is no effect on lead changes. This is interesting because it suggests that you don't need a fast track to have a more competitive race.
  • Winner's Starting Position: One factor considered was the original starting position of the winner. If a winner started on the pole, versus if the winner started way in the back of the pack, would that make an impact on lead changes? Nope; theoretically, this suggests that inverting the field (starting the fastest cars in the back) would not increase lead changes. (This finding is not terribly surprising: remember my earlier post showing how starting position often has little bearing on finishing position.)
  • Race purse: Here's one more interesting, unexpected quirk from the data: the amount of money at stake does NOT increase the number of lead changes! If anything, the estimate (which is not statistically significant) points the other way: one FEWER lead change for every extra $734,000 added to the race purse.


In conclusion, what have we learned today? We learned that most factors don't move the number of lead changes. Even factors that you would think might make a difference (namely race speed, track length, and the amount of money at stake) don't actually contribute much at all. Ironically, many of the changes suggested to make races more viewer-friendly (shorter distances, fewer cautions, more Saturday night races) would actually reduce the number of lead changes!


Friday, April 5, 2013

Dale Earnhardt Jr.’s Sizzling Start to 2013 and what it might mean for the record books


(This article originally appeared at Bloomberg Sports)

The media is abuzz with the fantastic start Dale Earnhardt, Jr. is having this season. What do the numbers tell us about his performance so far? Let's take a look at NASCAR's Loop Data and see what we can learn.
Through the first 5 races of the 2013 season, Earnhardt’s average finishing position is 4.4. That’s not a secret.

Here’s the twist, though: his “best position” in each race (i.e. the best position he reaches at any point during a race) is averaging 2.6. The difference between his average Finish (4.4) and his average Best Position (2.6) is 1.8. How good is a 1.8 compared to everybody else? Earnhardt’s ability to finish strong is ahead of everybody this year, including champions like Brad Keselowski and Jimmie Johnson.


On average, Earnhardt finishes only 1.8 spots worse than his best position in a race. Consider this number a measure of a driver’s ability to get the most from his car: a larger value means that a driver can get to a good position during a race, but cannot maintain it. A smaller number means the driver finishes as close as possible to his best place during the race. Sometimes a driver doesn't have the fastest car in a given week, but a championship-level season is based on squeezing the most out of what's available.
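
The Finish-Best spread is simple enough to compute by hand, but a short Python sketch makes the definition explicit. The function name and the sample values are hypothetical (the sample is made up for illustration and is not real Loop Data). Because the average of the per-race differences equals the difference of the averages, applying the same calculation to Earnhardt's figures gives 4.4 - 2.6 = 1.8.

```python
from statistics import mean


def finish_best_spread(races):
    """Average of (finishing position - best position) across races.

    `races` is a list of (finish, best_position) pairs, one pair per race.
    A smaller spread means the driver finishes close to his best position.
    """
    for finish, best in races:
        if best > finish:
            # The best position reached during a race can never be worse
            # than where the driver ended up.
            raise ValueError("best position cannot be worse than the finish")
    return mean(finish - best for finish, best in races)


# Made-up values for illustration only, not real results
example_races = [(5, 2), (3, 3), (10, 4)]
print(finish_best_spread(example_races))
```

In this made-up sample the driver finishes an average of 3 spots worse than his best position.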

Earnhardt is doing a fantastic job of maxing out his potential in 2013.

In fact, it’s so good that if he keeps up this pace, he would shatter the best mark recorded since Loop Data became available in 2005. Here are the 15 best seasons for maximizing potential (the lowest Finish-Best spreads):

Notice the two best seasons are in the 4-5 range. Even a number in the 7s reflects a fantastic ability to get the most out of one’s car.

As we see in Figure 3, the vast majority of drivers finish about 10-15 spots worse than their best position in a race.

We'll keep an eye on this number over the course of the season to see if he can keep up the good work and make history in the process.