Tuesday, September 21, 2010

Each Turnover Worth a Touchdown

NOTE: I know I'm getting ahead of myself, as this post corresponds to 2nd order win%, but this is what I've been thinking about for days.

In our quest to identify true team talent, one of the questions I have been thinking about is: how much is a turnover worth? How many points can a team expect to lose by turning the ball over? That means not just the points scored off the turnover thanks to good field position, but also the lost scoring opportunity, the game-changing effect, and the momentum shift. These all contribute to lost points.

To estimate the value of each of these turnovers, I took data from my new favorite site http://pro-football-reference.com/ for each of the past 5 years. I used each team's full-season data, focusing on Point Margin (Pts Scored - Pts Allowed) and Turnover Margin (Takeaways - Turnovers).

Then I plotted the two values on a scatter plot. The regression line gives us the answer we're looking for:
There! Over the last 5 years, a turnover has been worth 7.615 points over a full season. That's almost a point every two games (7.615 / 16 games is about 0.48 points per game)! It's also more than most touchdowns. This gives us a starting point to move forward in our analysis of individual NFL contributions. If a running back fumbles the ball, that's -7.615 points immediately... boom baby.

I'll call this value TOp (Turnovers in points), thus TOp = -7.615.
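For anyone who wants to reproduce the fit, here is a minimal sketch of the regression step, assuming each team-season has been reduced to a (turnover margin, point margin) pair. The numbers below are made up for illustration and are not the pro-football-reference data, and the function name fit_line is just a placeholder. The slope of the fitted line is the "points per turnover" estimate.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical team-seasons (NOT real data): (turnover margin, point margin).
seasons = [(-10, -80), (-5, -30), (0, 5), (4, 25), (12, 95)]
turnover_margin = [s[0] for s in seasons]
point_margin = [s[1] for s in seasons]

slope, intercept = fit_line(turnover_margin, point_margin)
print("points per turnover (slope):", round(slope, 3))
print("intercept:", round(intercept, 3))
```

With real data you would feed in one pair per team per season (32 teams times 5 seasons), and the slope plays the role of the 7.615 figure above.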

In defense of TOp, I believe that margins are the best way to estimate this because a turnover affects both the opponent's score and your own. I also believe in including non-offensive scoring because momentum is also part of this analysis. Turnovers affect the very mindset of the football players. Thus we must use all-encompassing values that are expected to go towards the mean over time. I believe 5 years is enough time, although we can add that to the list of to-do posts: when do these stats stabilize?

Interestingly, the 4 years preceding 2009 saw a relatively stable value for TOp, but in 2009 that value jumped to 10.123:
I wonder what caused this. It seems turnovers were especially important last year. This is interesting, but a story for another time...

The bigger picture:
This value of TOp will help us get ePM (Expected Point Margin) which is the key to 2W%, the 2nd order winning percentage.

My NFL Quest

I've got a new quest in mind. The NFL has been on my mind a lot recently, for obvious reasons. Looking around, advanced NFL stats are rather hard to come by. At first I wanted to gain an understanding of an individual's contribution to the overall performance of his team. But this has many complications, mainly things like: how do you separate a quarterback's throws from the receivers' catches? After finding myself unable to answer this question, I started down a different path, one that, when complete, should help me get to the overall goal: individual contribution.

My quest, as it stands now, is to estimate the true talent level on a team basis. I'll try to stick to sabermetric methods as much as possible, deviating only when I have to. By working on a team basis, we don't have to think about individuals, and that makes it easier. As answers arise and a team talent picture is formed, we can formulate ways to estimate individual contributions to a team.

So here's my quick outline of the direction I'm going in:
Goal 1: Give estimates of 1st, 2nd, and 3rd order win percentage.
Each successive order takes out another part of luck.

  • First order is based on points margin, trying to take out the randomness of when the team scores.
  • Second order is based on a statistic called expected points. Once point margin is estimated, we can use our first order expectation to find a second order win percentage. This takes out the randomness of whether the team scores when they should or scores when they shouldn't.
  • Third order aims at taking out other forms of randomness, such as strength of schedule and park factors. Park factors are interesting, because as I was watching the Week 2 MNF game Saints @ 49ers, I wondered how much Candlestick Park had to do with the Saints not throwing the ball as well as they might have in the domes. Also, as it gets colder, does Lambeau Field get less pass-friendly? These are all park factors that will be included.

Some notes:

  • The first round of "stabs" at these expectations will be rough, and I will definitely need to come back to them. I'll keep a list of things I need to revisit and fiddle with. This way, we can get a rough clue and tweak it to make it right.
  • I won't stick with theory the whole way. As Bill James said, sabermetrics isn't supposed to teach us anything new, so empirically fudging with equations might happen.
  • Notation: Xp will refer to X represented in terms of points, and Xw will be X in terms of wins.
  • Notation: For each winning percentage order, OW% will be my notation (O = order), thus 0W% = W/G (the actual winning percentage), 1W% is first order, etc.
I hope you enjoy my findings as much as I enjoyed thinking about them!

Thursday, August 26, 2010

The most underrated (and overrated) NCAA College Football teams

Welcome to The Eisen Estimation! I am Jon Eisen, and this is a space for my ramblings, thoughts, comments, and complaints about all things sports. In particular, I will follow one simple rule for all posts: All analysis must be objective, even if the underlying opinions are not. I will use statistical evidence to answer questions and develop insight into all things sports.

Our first topic: NCAA College Football polls. First, my opinion of the polls is quite low. They start with an arbitrary opinion favoring only the most celebrity-like programs, and then use filtering algorithms to correlate their previous rankings with the next week, until they arrive at a conclusion that is mostly based on the original (and arbitrary) preseason rankings! My focus this week is on the absurdity of the preseason rankings.

So, what's the underlying question? Who do the polls favor in the preseason rankings? And are these programs "celebrity-like"?

To answer this question, I found the change in rankings of every team from the preseason to the final AP poll. (Thank you, http://preseason.stassen.com/over-under/.) This data covers the last 21 years. It takes the final rankings minus the preseason rankings (unranked = 26th), then sums that value over all 21 years.
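The calculation just described can be sketched as follows. This is only an illustration: the team names and rankings are invented, the function names are placeholders, and None stands in for "unranked". The sign convention follows the wording above literally (final minus preseason), which is an assumption; under it, a negative total means a team finished better than its preseason rank, so flip the subtraction if you want underrated teams to come out positive.

```python
UNRANKED = 26  # an unranked team is treated as 26th, per the description above


def rank_change(preseason, final):
    """Final rank minus preseason rank; None means the team was unranked."""
    pre = UNRANKED if preseason is None else preseason
    fin = UNRANKED if final is None else final
    return fin - pre


def total_change(seasons):
    """Sum the per-season rank changes for one team."""
    return sum(rank_change(pre, fin) for pre, fin in seasons)


# Hypothetical (made-up) history: team -> list of (preseason rank, final rank).
history = {
    "Team A": [(None, 5), (20, 8), (12, 10)],   # usually finishes higher
    "Team B": [(3, None), (5, 15), (2, 4)],     # usually finishes lower
}

totals = {team: total_change(seasons) for team, seasons in history.items()}
for team, total in sorted(totals.items(), key=lambda item: item[1]):
    print(team, total)
```

Treating unranked as 26th caps how much a single season can move the total, which keeps one disastrous year from swamping everything else, though it can still dominate a team's sum.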

First, the initial results...
The most underrated teams in the past 21 years:


Boise State 75
Oregon 72
Washington State 70
Utah 64


Those four make up the top echelon of underrated teams, and they are all teams that have been good in the past 21 years, but not (preseason) ranked so highly.

Now the most overrated teams:


Oklahoma 74
Notre Dame 76
Southern Cal 77
Michigan 78


These are all very good teams, but each, it seems, has been overrated to a significant degree. How does the distribution look for all the teams?

On the right, we see the top 4 underrated teams, and on the left, we see many overrated teams. This implies there are more consistently overrated teams than consistently underrated. As teams get better consistently, the voters get smart, and stop underrating them. BUT, it seems that they don't stop overrating teams. Consider this year's preseason poll. Of the top 10 most overrated teams of the past 21 years, 3 are in the preseason Top 10, and 7 are in the Top 25. I would bet that more than half of these finish below their preseason position. On the other side, of the top 10 most underrated teams, 3 are in the top 10, and 4 are in the top 25.

We can definitely see that the overrated teams are the more celebrity-like programs: USC, Michigan, and Notre Dame definitely fall into that category. Interestingly, Alabama (preseason #1) is one of the more underrated teams.

We have uncovered evidence that supports the claim: the AP poll consistently overrates celebrity programs and underrates smaller programs. But more questions were revealed than answered. For instance, there are some interesting outliers. What are the trends? USC has been good for a long time; could they just be overrated slightly each year? What is the standard deviation? Could Michigan simply be affected by one bad (really, really bad) year after being chosen for a top 5 preseason ranking? Do voters ever learn their lesson?

These answers and more... next time!