Down and Dirty Forecasting: Part 1

May 24, 2013
By

(This article was first published on OutLie..R, and kindly contributed to R-bloggers)

I wanted to see what I could do in a hurry using the commands found at Forecasting: Principles and Practice . I chose a simple enough data set of Wisconsin Unemployment from 1976 to the present (April 2013). I kept the last 12 months worth of data to test the accuracy of the models. The next blog post will include a multiple regression analysis. The analysis is lacking many important steps, particularly the ARIMA, but this is a down and dirty exercise.

library(forecast)
library(lmtest)
library(caret)
#State Unemployment seasonally adjusted
#http://www.quandl.com/FRED-Federal-Reserve-Economic-Data/WIUR-Unemployment-Rate-in-Wisconsin
 
#Using Quandl data, great little site
wi<-read.csv('http://www.quandl.com/api/v1/datasets/FRED/WIUR.csv?&auth_token=gigXwpxd6Ex91cjgz1B7&trim_start=1976-01-01&trim_end=2013-04-01&sort_order=desc', colClasses=c('Date'='Date'))
 
#some minor clean up
colnames(wi)<-c('date', 'rate')
wi$date<-as.Date(wi$date)
summary(wi)
 
#base data, 1-436, test data 437-448
 
wi.b<-wi[1:436,]
wi.p<-wi[437:448,]
wi.ts<-ts(wi.b$rate, start=c(1976, 1), frequency=12)
wi.p.ts<-ts(wi.p$rate, start=c(2012, 5), frequency=12)
plot.ts(wi.ts)
 
#Lets test some models
mean<-meanf(wi.ts, 12)
naive<-rwf(wi.ts, 12)
s.naive<-snaive(wi.ts, 12)
drift<-rwf(wi.ts, 12, drift=T)
 
#linear fit
m1<-tslm(wi.ts~trend)
m2<-tslm(wi.ts~trend+season)
 
#checking for autocorrelation
res1 <- residuals(m1)
par(mfrow=c(1,2))
plot(res1, ylab="Residuals",xlab="Year")
Acf(res1, main="ACF of residuals")
 
res2 <- residuals(m2)
par(mfrow=c(1,2))
plot(res2, ylab="Residuals",xlab="Year")
Acf(res2, main="ACF of residuals")
par(mfrow=c(1,1))
 
#Durbin-Watson Test
dwtest(m1, alt="two.sided")
dwtest(m2, alt="two.sided")
#yep autocorrelation city! No surprize here, due to the nature of unemployment
 
#STL ETS Decomposition
m3<-stl(wi.ts, s.window='periodic')
plot(m3)
m4<-ets(wi.ts, model='ZZZ')
plot(m4)
 
#ARIMA
m5<-auto.arima(wi.ts)
plot(forecast(m5, h=12))
#neural networks
m6<-nnetar(wi.ts)
m6
plot(forecast(m6, h=12))
 
#Testing for accuracy the first 4 models
a1<-accuracy(mean, wi.p.ts)
a2<-accuracy(naive, wi.p.ts)
a3<-accuracy(s.naive, wi.p.ts)
a4<-accuracy(drift, wi.p.ts)
 
a.table<-rbind(a1, a2, a3, a4)
 
#Creating the forecast and accuracy for the next 6 models
f1<-forecast(m1, h=12)
f2<-forecast(m2, h=12)
f3<-forecast(m3, h=12)
f4<-forecast(m4, h=12)
f5<-forecast(m5, h=12)
f6<-forecast(m6, h=12)
 
a5<-accuracy(f1, wi.p.ts)
a6<-accuracy(f2, wi.p.ts)
a7<-accuracy(f3, wi.p.ts)
a8<-accuracy(f4, wi.p.ts)
a9<-accuracy(f5, wi.p.ts)
a10<-accuracy(f6, wi.p.ts)
 
#Combining into a table with row names
a.table<-rbind(a.table, a5, a6, a7, a8, a9, a10)
row.names(a.table)<-c('Mean', 'Naive', 'S. Naive', 'Drift', 'Lm~Trend',
'Lm~Trend+Sea', 'STL', 'ETS', 'ARIMA', 'Neuro')
 
#make into a data frame so the best model is first, according to MAPE
a.table<-as.data.frame(a.table)
a.table<-a.table[order(a.table$MAPE),]
a.table
Created by Pretty R at inside-R.org

Results so far: Looks like the mean like forecasts are doing the best, the fancy models are not doing very well.


To leave a comment for the author, please follow the link and comment on his blog: OutLie..R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.