# Down and Dirty Forecasting: Part 1

May 24, 2013
By

(This article was first published on OutLie..R, and kindly contributed to R-bloggers)

I wanted to see what I could do in a hurry using the commands found at Forecasting: Principles and Practice . I chose a simple enough data set of Wisconsin Unemployment from 1976 to the present (April 2013). I kept the last 12 months worth of data to test the accuracy of the models. The next blog post will include a multiple regression analysis. The analysis is lacking many important steps, particularly the ARIMA, but this is a down and dirty exercise.

`library(forecast)library(lmtest)library(caret)#State Unemployment seasonally adjusted#http://www.quandl.com/FRED-Federal-Reserve-Economic-Data/WIUR-Unemployment-Rate-in-Wisconsin #Using Quandl data, great little sitewi<-read.csv('http://www.quandl.com/api/v1/datasets/FRED/WIUR.csv?&auth_token=gigXwpxd6Ex91cjgz1B7&trim_start=1976-01-01&trim_end=2013-04-01&sort_order=desc', colClasses=c('Date'='Date')) #some minor clean upcolnames(wi)<-c('date', 'rate')wi\$date<-as.Date(wi\$date)summary(wi) #base data, 1-436, test data 437-448 wi.b<-wi[1:436,]wi.p<-wi[437:448,]wi.ts<-ts(wi.b\$rate, start=c(1976, 1), frequency=12)wi.p.ts<-ts(wi.p\$rate, start=c(2012, 5), frequency=12)plot.ts(wi.ts) #Lets test some modelsmean<-meanf(wi.ts, 12)naive<-rwf(wi.ts, 12)s.naive<-snaive(wi.ts, 12)drift<-rwf(wi.ts, 12, drift=T) #linear fitm1<-tslm(wi.ts~trend)m2<-tslm(wi.ts~trend+season) #checking for autocorrelationres1 <- residuals(m1)par(mfrow=c(1,2))plot(res1, ylab="Residuals",xlab="Year")Acf(res1, main="ACF of residuals") res2 <- residuals(m2)par(mfrow=c(1,2))plot(res2, ylab="Residuals",xlab="Year")Acf(res2, main="ACF of residuals")par(mfrow=c(1,1)) #Durbin-Watson Testdwtest(m1, alt="two.sided")dwtest(m2, alt="two.sided")#yep autocorrelation city! No surprize here, due to the nature of unemployment #STL ETS Decompositionm3<-stl(wi.ts, s.window='periodic')plot(m3)m4<-ets(wi.ts, model='ZZZ')plot(m4) #ARIMAm5<-auto.arima(wi.ts)plot(forecast(m5, h=12))`
`#neural networksm6<-nnetar(wi.ts)m6plot(forecast(m6, h=12)) #Testing for accuracy the first 4 modelsa1<-accuracy(mean, wi.p.ts)a2<-accuracy(naive, wi.p.ts)a3<-accuracy(s.naive, wi.p.ts)a4<-accuracy(drift, wi.p.ts) a.table<-rbind(a1, a2, a3, a4) #Creating the forecast and accuracy for the next 6 modelsf1<-forecast(m1, h=12)f2<-forecast(m2, h=12)f3<-forecast(m3, h=12)f4<-forecast(m4, h=12)f5<-forecast(m5, h=12)f6<-forecast(m6, h=12) a5<-accuracy(f1, wi.p.ts)a6<-accuracy(f2, wi.p.ts)a7<-accuracy(f3, wi.p.ts)a8<-accuracy(f4, wi.p.ts)a9<-accuracy(f5, wi.p.ts)a10<-accuracy(f6, wi.p.ts) #Combining into a table with row namesa.table<-rbind(a.table, a5, a6, a7, a8, a9, a10)row.names(a.table)<-c('Mean', 'Naive', 'S. Naive', 'Drift', 'Lm~Trend', 'Lm~Trend+Sea', 'STL', 'ETS', 'ARIMA', 'Neuro') #make into a data frame so the best model is first, according to MAPEa.table<-as.data.frame(a.table)a.table<-a.table[order(a.table\$MAPE),]a.table`

Created by Pretty R at inside-R.org

Results so far: Looks like the mean like forecasts are doing the best, the fancy models are not doing very well.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...