Tradition, one year tradition


So with schedules to be released in 2 days, the inevitable question is “who has the toughest schedule?” To answer this question, an estimate of all team strengths is required. However, no games have been played yet, so what should be done?

Possibilities:

1. Use last year’s team strengths
2. Use a different year’s team strengths
3. Use a combination of previous years’ team strengths
4. Use a combination of previous years’ team strengths, with a correction for returning starters

I really like #4, but there is no complete database of returning starters. For now I would like to announce some results regarding choice #1.

year strengths, year data, correct / total = ratio, notes
1999 strengths, 2000 data, 1188 / 1695 =   0.701, 7 new teams set to 0.00
2000 strengths, 2001 data, 1220 / 1699 =   0.718
2001 strengths, 2002 data, 1182 / 1702 =   0.694, 2 new teams set to 0.00
2002 strengths, 2003 data, 1228 / 1716 =   0.716, 2 new teams set to 0.00

Details for line 1:

1999 strengths: these are the unbiased estimates of team strengths obtained from the 1999 analysis
2000 data: the data set of all games played in 2000
correct: number of games picked correctly by using 1999 strengths
total: total number of games played in 2000
ratio: correct/total
notes: in 2000 there were 7 new teams that did not exist in the 1999 data set; their strengths were set to 0.00
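
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code behind the table: the function name pick_accuracy, the (winner, loser) game layout, the toy team names, and the rule that a tie in strength counts as a miss are all hypothetical. No home-field adjustment is applied, since the post does not mention one.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: each game is stored as (winner, loser).
Game = Tuple[str, str]


def pick_accuracy(
    strengths: Dict[str, float],
    games: Iterable[Game],
    default: float = 0.0,
) -> Tuple[int, int, float]:
    """Count how many games last year's strengths would have picked correctly.

    Teams missing from `strengths` (new teams) get `default`, mirroring the
    "set to 0.00" note in the table. A tie in strength is counted as a miss,
    which is an assumption made for this sketch.
    """
    correct = 0
    total = 0
    for winner, loser in games:
        total += 1
        winner_strength = strengths.get(winner, default)
        loser_strength = strengths.get(loser, default)
        # The pick is the team with the higher prior-year strength.
        if winner_strength > loser_strength:
            correct += 1
    ratio = correct / total if total else 0.0
    return correct, total, ratio


# Toy data: "New" did not exist last year, so it falls back to 0.0.
toy_strengths = {"A": 1.2, "B": 0.4, "C": -0.3}
toy_games = [("A", "B"), ("C", "B"), ("A", "New"), ("New", "C")]
print(pick_accuracy(toy_strengths, toy_games))
```

In the toy data, the only miss is the upset where C beats B.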

So in general, a person can predict about 70% of games correctly by using last year’s analysis.
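
As a quick check on the "about 70%" figure, the four rows of the table can be pooled. The counts below are copied from the table; the names results and pooled_ratio are just labels chosen for this sketch.

```python
# (correct, total) for each data year, copied from the table above.
results = {
    2000: (1188, 1695),
    2001: (1220, 1699),
    2002: (1182, 1702),
    2003: (1228, 1716),
}

# Per-year ratio, rounded the same way as in the table.
per_year = {year: round(c / t, 3) for year, (c, t) in results.items()}

# Pooled ratio: all correct picks divided by all games across the four years.
total_correct = sum(c for c, _ in results.values())
total_games = sum(t for _, t in results.values())
pooled_ratio = total_correct / total_games

print(per_year)
print(total_correct, total_games, round(pooled_ratio, 3))  # 4818 6812 0.707
```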

This is quite a large number in my opinion. Flipping a coin gives you 50% correct. Using last year’s data gets you to 70%. Doing a full regression analysis gets you to 80%. Tradition, in the sense of 1 year tradition, adds value, quite a bit of value, in fact.

I will continue to update weekly reports for 2004-2009. I will then do some research on options #2 and #3. And if I have time, I will dream longingly about #4.

2 days until fall schedules are released, July 15, 2010
27 days until the first day of practice, August 9, 2010