Down and Dirty Forecasting: Part 2

[This article was first published on OutLie..R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This is the second part of the forecasting exercise, where I am looking at a multiple regression. To keep it simple I chose the states that boarder WI and the US unemployment information for the regression. Again this is a down and dirty analysis, I would not call this complete in any sense.

#getting the data for WI, US, IL, MI, MN, IO
#Using Quandl data, great little site
wi<-read.csv('', colClasses=c('Date'='Date'))
us<-read.csv('', colClasses=c('Date'='Date'))
il<-read.csv('', colClasses=c('Date'='Date'))
mi<-read.csv('', colClasses=c('Date'='Date'))
mn<-read.csv('', colClasses=c('Date'='Date'))
io<-read.csv('', colClasses=c('Date'='Date'))
#merging the data into one dataframe
#I started with WI so that I could isolate the rest of the variables better
unemp<-merge(wi, io, by='Date'); colnames(unemp)<-c('Date', 'wi', 'io')
unemp<-merge(unemp, mi, by='Date'); colnames(unemp)<-c('Date', 'wi', 'io', 'mi')
unemp<-merge(unemp, mn, by='Date'); colnames(unemp)<-c('Date', 'wi', 'io', 'mi', 'mn')
unemp<-merge(unemp, us, by='Date'); colnames(unemp)<-c('Date', 'wi', 'io', 'mi', 'mn', 'us')
unemp<-merge(unemp, il, by='Date'); colnames(unemp)<-c('Date', 'wi', 'io', 'mi', 'mn', 'us', 'il')
save(unemp, file='Unemployment WI.RData')
load('Unemployment WI.RData')
plot(unemp$Date, unemp$us, col='black', 
main='Unemployment WI Region', type='l', xlab='Date', 
ylab='Unemp. Rate', ylim=c(2, 17))
lines(unemp$Date, unemp$il, col='blue')
lines(unemp$Date, unemp$io, col='dark green')
lines(unemp$Date, unemp$mi, col='purple')
lines(unemp$Date, unemp$mn, col='gray')
lines(unemp$Date, unemp$wi, col='red')
leg.txt<-c('US', 'IL', 'IO', 'MI', 'MN', "WI")
leg.col<-c('black','blue', 'dark green', 'purple', 'gray', 'red')
legend('topleft', leg.txt, col=leg.col, pch=15)
#correlation matrix
#From the graph and correlation matrix, everything 
#is highly correlated, the lowest r value is 0.7879 
#between Iowa and US.
#plots of each of the data, then each individual, 
#everything looks really good, nice linear trends, 
#although Michigan has quite bit of volatility 
#compared to the other variables.
plot(unemp.b, pch='.')
model<-lm(wi~., data=unemp.b[,-1])
step.m<-step(model, direction='backward')
#testing for outliers
# After conducting the outlier test, looks like it is michigan 
#Sept 1980 where unemployment was 13.2 percent, point 57 is 
#close at 12.9 those are the two biggest, but the outlierTest 
#isolated row 56
dwtest(step.m, alternative='two.sided')
#yep lots of autocorrelation, so a multiple ARAIMA might be in order
mult.f<-predict(step.m, newdata=pred, interval='prediction')
a1<-accuracy(mult.f, unemp.p$wi)
#because the MASE is missing I needed to add the following to compare later
names(a1)<-c("ME",  "RMSE", "MAE", "MPE", "MAPE", "MASE")
auto.r<-auto.arima(unemp.b$wi, xreg=unemp.b[,3:7])
auto.f<-forecast(auto.r, xreg=unemp.p[,3:7], h=12)
a2<-accuracy(auto.f, unemp.p$wi)
#Just remember that the 0 is just a place holder
r.table<-rbind(a1, a2)
row.names(r.table)<-c('Multiple Regression', 'Reg. with ARIMA Errors')

At this point I combined the two accuracy tables to see which one was the best.
b.table<-rbind(a.table[,1:6], r.table)

The winner is: Regression with ARIMA Errors, for most of the analysis the simple mean like predictions dominated, even the multiple regression is considerably low on the list.

To leave a comment for the author, please follow the link and comment on their blog: OutLie..R. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)