I recently wrote a post about developing an algorithm to predict the NBA playoffs, and I concluded it with two predictions. Miami beat the Celtics, which put the algorithm at 1-0, but it fell to 1-1 when the Thunder beat the Spurs. So we are now at .500. Considering that the algorithm was about 61.5% accurate over the whole season, this is to be expected.
I have since made some changes to the algorithm that improved its accuracy, and I used the updated version to make new predictions for the next game in each of the current series (Spurs vs. Thunder and Celtics vs. Heat). You can scroll all the way down to see the predictions, or read through to see what I did.
Improvements in Variables
I created rough code initially, and I didn't fully utilize all of the information that I had. The first step in making improvements was to add some variables relating to different player positions and bench vs. starter performance.
For example, this plot shows that the average number of seconds bench players played over the last 10 games has a decent correlation with winning percentage:
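To make the idea of a "last 10 games" variable concrete, here is a minimal Python sketch of how such a feature could be computed and correlated with an outcome. It is not the code behind this post: the function names, the choice to exclude the current game from the average, and the sample numbers are all assumptions made for illustration.

```python
from collections import deque


def rolling_mean_prior(values, window=10):
    """Mean of up to `window` previous values for each game.

    The current game is excluded (an assumption) so the feature only
    uses information that was available before the game was played.
    """
    recent = deque(maxlen=window)
    means = []
    for value in values:
        means.append(sum(recent) / len(recent) if recent else None)
        recent.append(value)
    return means


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up bench seconds for one team's games, in chronological order.
bench_seconds = [3600, 4200, 3900, 4500]
print(rolling_mean_prior(bench_seconds, window=2))
```

With real data, `pearson` would be called on one rolling average per team and that team's winning percentage.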
Improvements in Machine Learning
After adding some variables, I moved on to adjusting the models that I used. Initially, I spent very little time on the machine learning framework and most of my time on the data. That did not change here, but I was able to tweak what I was predicting. Originally, I was predicting a binary value: whether a team won or not. I changed this to predict the ratio of a team's score to its opponent's score. This gave the machine learning algorithms a lot more information than a 1/0 target, and the ratio also has the benefit of being approximately normally distributed, as this quantile-quantile plot shows:
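The change of target can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used for this post: I am assuming the ratio is the team's score divided by the opponent's score, that a predicted ratio above 1 is read as a predicted win, and the helper names (`score_ratio`, `predicted_win`, `qq_points`) are hypothetical.

```python
from statistics import NormalDist


def score_ratio(team_score, opponent_score):
    """Regression target: team score divided by opponent score."""
    return team_score / opponent_score


def predicted_win(predicted_ratio):
    """Turn a predicted ratio back into a win/loss call (assumed cutoff: 1)."""
    return predicted_ratio > 1.0


def qq_points(sample):
    """Pair standard normal quantiles with the sorted sample values.

    Plotting these pairs gives a quantile-quantile plot; points that
    fall near a straight line suggest an approximately normal sample.
    """
    ordered = sorted(sample)
    n = len(ordered)
    standard_normal = NormalDist()
    return [
        (standard_normal.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]


# Made-up final scores, not taken from any real game.
ratio = score_ratio(110, 100)
print(ratio, predicted_win(ratio))
```

One appeal of this target is that a narrow win and a blowout get different values, while a 1/0 label treats them the same.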
Updated Season Accuracy Results
With these improvements, season accuracy now comes to 63.6%, up from about 61.5%, a reasonable improvement over the previous results. This produces the following confusion matrix:
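For readers who want to see how accuracy and a confusion matrix relate, here is a small Python sketch. The function names, the cell labels, and the toy win/loss lists are assumptions for illustration only; they are not the season's results.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of games where the predicted outcome matched the real one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion_counts(actual, predicted):
    """Count the four combinations of actual and predicted outcomes."""
    counts = Counter(zip(actual, predicted))
    return {
        "actual win, predicted win": counts[(True, True)],
        "actual win, predicted loss": counts[(True, False)],
        "actual loss, predicted win": counts[(False, True)],
        "actual loss, predicted loss": counts[(False, False)],
    }


# Toy outcomes (True = win), made up purely to exercise the functions.
actual = [True, True, False, False]
predicted = [True, False, False, True]
print(accuracy(actual, predicted))
print(confusion_counts(actual, predicted))
```

Accuracy is simply the sum of the two matching cells divided by the total number of games.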
Predictions for Upcoming Games
As before, I will leave you with predictions for the two upcoming games.
I can do a decent amount of analysis on the data, so please let me know if you want to see something specific next time. I'm going to make posts predicting all of the remaining games in these series and in the finals.