Predicting a decade in the future


game, pred, corr/totl, ratio, notes
2009, 2008, 1292/1800, 0.718, 2 teams set to 0.00
2009, 2007, 1157/1800, 0.643, 14 teams set to 0.00
2009, 2006, 1125/1800, 0.625, 17 teams set to 0.00
2009, 2005, 1094/1800, 0.608, 20 teams set to 0.00
2009, 2004, 1075/1800, 0.597, 23 teams set to 0.00
2009, 2003, 1023/1800, 0.568, 32 teams set to 0.00
2009, 2002, 1044/1800, 0.580, 34 teams set to 0.00
2009, 2001, 1047/1800, 0.582, 33 teams set to 0.00
2009, 2000, 1075/1800, 0.597, 33 teams set to 0.00
2009, 1999,  992/1800, 0.551, 38 teams set to 0.00

Explanation of last entry:
Use the 1999 analysis to predict 2009 games.
992 games picked correctly out of 1800, or 55.1% correct.
38 teams that existed in 2009 have no strength data in the 1999 analysis, so their ratings were set to 0.00.
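
The bookkeeping behind one table row can be sketched roughly as follows. This is a minimal illustration, not the code used for the table: the function name `score_predictions`, the (winner, loser) game layout, the rule "pick the team with the higher old rating", and counting ties as misses are all assumptions, and the toy data is invented.

```python
def score_predictions(games, ratings, default=0.0):
    """Count games whose winner had the higher rating from the earlier season.

    games:   list of (winner, loser) team-name pairs (assumed layout)
    ratings: dict mapping team name -> strength from the earlier season
    default: rating assigned to teams that are missing from `ratings`
    """
    correct = 0
    for winner, loser in games:
        w = ratings.get(winner, default)
        l = ratings.get(loser, default)
        if w > l:  # assumption: a tie in ratings counts as a miss
            correct += 1
    # Teams that played but have no earlier rating ("set to 0.00")
    teams = {team for game in games for team in game}
    missing = sorted(teams - set(ratings))
    return correct, len(games), missing


# Toy data, invented purely for illustration
old_ratings = {"A": 1.5, "B": -0.5, "C": 0.3}
games = [("A", "B"), ("B", "C"), ("D", "C"), ("A", "D")]

correct, total, missing = score_predictions(games, old_ratings)
print(f"{correct}/{total}, {correct / total:.3f}, {len(missing)} teams set to 0.00")
```

Team "D" has no old rating here, so it is treated as 0.00, which is exactly the situation the notes column counts.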

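In general, the prediction ratio goes down as the age of the explanatory variable goes up: old data produce poorer results than new data. The decline is not strictly monotonic, though; the 2000 through 2002 ratings (0.580 to 0.597) do slightly better than the 2003 ratings (0.568). In absolute terms, nine-year-old data still produce about 60% correct predictions (0.597 from the 2000 ratings), which is amazingly high in my opinion. Tradition rules?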

A criticism of this approach, and one I strongly agree with, is the large number of teams whose ratings are set to zero. If a strong team is newly formed and gets set to 0.00, the prediction ratio will almost certainly go down. One example is IKM-Manning, formed in fall 2008. Both predecessor teams were quite strong, finishing 1-2 after the 2006 season. In this case IKM-Manning could be rated as the average of the two schools' previous ratings, which would very likely improve the predictions for this new school.
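
The averaging idea for merged schools might look something like this. It is only a sketch: the helper `merged_rating`, the `mergers` mapping, the predecessor names "IKM" and "Manning" (guessed from the merged name), and the rating values are all hypothetical, since the actual ratings are not given here.

```python
def merged_rating(ratings, predecessors):
    """Average the predecessors' ratings; return None if any one is unrated."""
    values = [ratings[p] for p in predecessors if p in ratings]
    if not values or len(values) != len(predecessors):
        return None  # fall back to whatever default the caller uses
    return sum(values) / len(values)


# Hypothetical ratings and merger map, for illustration only
old_ratings = {"IKM": 1.0, "Manning": 0.5, "Other": -0.2}
mergers = {"IKM-Manning": ["IKM", "Manning"]}

for new_team, parents in mergers.items():
    rating = merged_rating(old_ratings, parents)
    if rating is not None:
        old_ratings[new_team] = rating

print(old_ratings["IKM-Manning"])
```

A merger map like this would have to be maintained by hand, because nothing in the game results says which schools combined.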

If a team splits into two teams or a new school begins to compete, the zero rating seems reasonable.

So in a future try, I will remove games involving any team for which I have no past strength estimate. The totl column in the table above will then differ from year to year (except 2000 and 2001, which both have 33 unrated teams), but the overall percentage should be more meaningful.
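
A rough sketch of that filtering step, under the same assumptions as before (games stored as (winner, loser) pairs, ratings in a dict, the higher-rated team is the pick); the function name `rated_games` and the toy data are made up for illustration.

```python
def rated_games(games, ratings):
    """Keep only games in which both teams have an earlier rating."""
    return [(w, l) for w, l in games if w in ratings and l in ratings]


# Toy data, invented purely for illustration
old_ratings = {"A": 1.5, "B": -0.5, "C": 0.3}
games = [("A", "B"), ("B", "C"), ("D", "C"), ("A", "D")]

kept = rated_games(games, old_ratings)
correct = sum(1 for w, l in kept if old_ratings[w] > old_ratings[l])
total = len(kept)  # this is the per-year "totl" after filtering
ratio = correct / total if total else float("nan")
print(f"{correct}/{total}, {ratio:.3f}")
```

Both games involving the unrated team "D" drop out, so the denominator shrinks from 4 to 2 and the ratio reflects only games where an actual prediction was possible.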