Week 3 predictions were less than ideal by my standards. For spread, moneyline (ML), and over/under betting, the model performed no better than flipping a coin on each pick: 50/50.
Because of this, I decided to make major adjustments to the model.
Initially, the model used historical data, including team statistics going back to 1970, for every game. It attempted to predict the score of each team independently and then used those scores to construct the betting picks. On a team-by-team level, this actually performed pretty well! The average absolute error was about 6 points! That’s not too far off for NFL score predictions. However, the scores of two teams playing against each other are not independent events. So when two teams matched up and the predictions were 27 to 24, it really meant the prediction was 27 ± 6 and 24 ± 6. In other words, the two scores were expected to fall in the ranges [21, 33] and [18, 30]. This doesn’t actually create a valuable prediction. There is so much overlap between these ranges that almost any outcome is possible.
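To make the overlap concrete, here is a minimal sketch of the arithmetic. It treats the roughly 6-point average absolute error as a hard plus-or-minus band around each prediction, which is a simplification; the function names are made up for illustration.

```python
def score_range(predicted, error):
    """Return the (low, high) band around a single team's predicted score."""
    return predicted - error, predicted + error


def margin_range(home_pred, away_pred, error):
    """Return the (worst, best) home margin implied by two independent bands."""
    home_low, home_high = score_range(home_pred, error)
    away_low, away_high = score_range(away_pred, error)
    # Worst case: home at its low end, away at its high end (and vice versa).
    return home_low - away_high, home_high - away_low


home_band = score_range(27, 6)    # (21, 33)
away_band = score_range(24, 6)    # (18, 30)
margin = margin_range(27, 24, 6)  # (-9, 15)
print(home_band, away_band, margin)
```

A predicted 3-point home win is consistent with anything from a 9-point loss to a 15-point win, so the pick says very little about the spread.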
For these reasons, instead of treating each team’s score as a separate, independent event, I changed the output variable to be the difference between the home team’s score and the away team’s score. This change treats the game as one event with a single defined outcome. Additionally, I ran a separate model to predict the total score of the game. From these two numbers, we can back out a predicted score for each team. I also chose to use only data since 2000. This was a judgment call on my part as the data scientist: the game of football has changed too much since the ‘70s to use the entire dataset. See photos below of the frequency distribution of the spreads of the games in the past 20 years as well as the total scores.
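Turning the two model outputs back into a score is simple algebra: if margin = home − away and total = home + away, then home = (total + margin) / 2 and away = (total − margin) / 2. A short sketch, with a hypothetical function name:

```python
def implied_scores(margin, total):
    """Recover (home, away) scores from a predicted margin and total.

    margin is home score minus away score; total is their sum.
    """
    home = (total + margin) / 2
    away = (total - margin) / 2
    return home, away


# A predicted home win by 3 with 51 total points implies a 27-24 game.
print(implied_scores(3, 51))  # (27.0, 24.0)
```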
Both photos make a lot of sense: spreads and total scores both look roughly normally distributed. Interestingly enough, the mode of the spreads is a close home loss! That is something I was not expecting at all.
Coupled with these changes, I also engineered several new variables to capture a team’s strength within the season. These variables hold season averages up to, but excluding, the week of the game whose spread and total score I am trying to predict. For example, in Week 4, we would use an average of points scored through the first three weeks. This added information helps the model get a full picture of the team from a season perspective and also gives the team the benefit of the doubt if they have an off week.
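Here is one way such a season-to-date feature could be built so that the current week never leaks into its own average. This is a sketch only: the record layout (`team`, `season`, `week`, `points`) and the sample numbers are invented for illustration, not taken from the real dataset.

```python
from collections import defaultdict


def season_to_date_average(games, stat="points"):
    """Map (team, season, week) to the average of `stat` over earlier weeks.

    The first game of a season has no history, so its value is None.
    """
    history = defaultdict(list)  # (team, season) -> values from prior weeks
    features = {}
    for game in sorted(games, key=lambda g: (g["season"], g["week"])):
        prior = history[(game["team"], game["season"])]
        key = (game["team"], game["season"], game["week"])
        features[key] = sum(prior) / len(prior) if prior else None
        # Only add this week's value after its own feature is computed.
        prior.append(game[stat])
    return features


# Made-up results for a hypothetical team "AAA".
games = [
    {"team": "AAA", "season": 2020, "week": 1, "points": 20},
    {"team": "AAA", "season": 2020, "week": 2, "points": 30},
    {"team": "AAA", "season": 2020, "week": 3, "points": 25},
    {"team": "AAA", "season": 2020, "week": 4, "points": 10},
]
features = season_to_date_average(games)
print(features[("AAA", 2020, 4)])  # 25.0, the average of weeks 1 to 3
```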
Further, I changed the k-fold cross-validation from k = 5 to k = 10 for more robust validation. I also changed the metric used to compare models, selecting for the lowest mean squared error instead of mean residual deviance. I also set aside the 2020 data and tested the model against 2020 games it had not seen (Weeks 2 and 3), and it actually performed better than the original model.
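For readers unfamiliar with the idea, here is a bare-bones sketch of k-fold cross-validation scored by mean squared error. It is not the actual modeling code: the "model" is a stand-in that always predicts the training average, the rows are invented, and the folds are contiguous with no shuffling.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most 1."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def cross_validated_mse(rows, k, fit, predict):
    """Average held-out MSE over k folds; rows are (features, target) pairs."""
    scores = []
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train = [row for i, row in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        model = fit(train)
        predicted = [predict(model, x) for x, _ in test]
        scores.append(mean_squared_error([y for _, y in test], predicted))
    return sum(scores) / len(scores)


# Stand-in model: always predict the average target seen in training.
def fit_mean(train):
    return sum(y for _, y in train) / len(train)


def predict_mean(model, x):
    return model


rows = [(i, float(i % 5)) for i in range(20)]  # invented (feature, margin) rows
print(cross_validated_mse(rows, 10, fit_mean, predict_mean))
```

With k = 10, each model trains on 90% of the rows and is scored on the remaining 10%, so every game is held out exactly once.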
Finally, I decided to add a way to measure probabilities for each of the bets that the model predicts. This gives bettors an idea of how likely each prediction is to hit, and it gives me the confidence to choose which predictions are more likely to come true. I do this by first running the model to generate the predicted spread and total score. Then I look at predictions across all 20 years of data and find those similar to the one in question. From there, I examine how often, historically, the model would have correctly predicted the moneyline, the spread cover, and the over or under set by the sportsbook.
For example, for the Packers-Falcons game coming up on Monday, the model predicts that the Packers will win by 10 and that the total score will be 72. To calculate the probabilities, I look at historical predictions where a team was predicted to win by 10 and examine how often that team won outright and how often it covered the 8-point spread that the Packers will need to cover on Monday. Then I do the same for the total score. By doing this, I not only see how well the model has performed historically, I also get an idea of what to expect in the upcoming game.
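The lookup can be sketched like this. Two things here are assumptions for illustration: "similar" is taken to mean a predicted margin within one point of the new prediction (the `tolerance` parameter), and the history list is invented rather than real model output.

```python
def similar_prediction_rates(history, predicted_margin, spread_to_cover,
                             tolerance=1.0):
    """Summarize how past predictions near `predicted_margin` turned out.

    history holds (predicted_margin, actual_margin) pairs for the favored team.
    Returns None when no past prediction is close enough to compare against.
    """
    similar = [(p, a) for p, a in history
               if abs(p - predicted_margin) <= tolerance]
    if not similar:
        return None
    n = len(similar)
    wins = sum(1 for _, actual in similar if actual > 0)
    covers = sum(1 for _, actual in similar if actual > spread_to_cover)
    return {"n": n, "win_rate": wins / n, "cover_rate": covers / n}


# Invented past predictions: (predicted margin, actual margin).
history = [(10, 14), (9.5, 3), (10.5, -2), (10, 9), (3, 1), (11, 21)]
print(similar_prediction_rates(history, predicted_margin=10, spread_to_cover=8))
```

The same function works for totals by passing predicted and actual total scores and using the sportsbook's over/under line as the threshold. The sample count `n` matters too: a rate computed from a handful of similar games deserves less trust than one computed from hundreds.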
As an addition to the Instagram prediction posts, I have also added probability icons. These make it quick and easy to see how likely the predictions are to come true, based solely on historically similar predictions. I quite like this method, as it gives me hope that these slight analytical edges will lead to a better record of predictions.
I’m feeling much more confident about the model predictions this week, but we’ll have to wait and see!