Batch Forecasting in R

February 29, 2016
By

(This article was first published on R – Mathew Analytics, and kindly contributed to R-bloggers)

Given a data frame with multiple columns which contain time series data, let’s say that we are interested in executing an automatic forecasting algorithm on a number of columns. Furthermore, we want to train the model on a particular number of observations and assess how well they forecast future values. Based upon those testing procedures, we will estimate the full model. This is a fairly simple undertaking, but let’s walk through this task. My preference for such procedures is to loop through each column and append the results into a nested list.

First, let’s create some data.

ddat <- data.frame(date = c(seq(as.Date("2010/01/01"), as.Date("2010/03/02"), by=1)),
                      value1 = abs(round(rnorm(61), 2)),
                      value2 = abs(round(rnorm(61), 2)),
                      value3 = abs(round(rnorm(61), 2)))
head(ddat)
tail(ddat)

We want to forecast future values of the three columns. Because we want to save the results of these models into a list, lets begin by creating a list that contains the same number of elements as our data frame.

lst.names <- c(colnames(data))
lst <- vector("list", length(lst.names))
names(lst) <- lst.names
lst

I’ve gone ahead and written a user defined function that handles the batch forecasting process. It takes two arguments, a data frame and default argument which specifies the number of observations that will be used in the training set. The model estimates, forecasts, and diagnostic measures will be saved as a nested list and categorized under the appropriate variable name.

batch <- function(data, n_train=55){
 
  lst.names <- c(colnames(data))
  lst <- vector("list", length(lst.names))
  names(lst) <- lst.names    
 
  for( i in 2:ncol(data) ){  
 
    lst[[1]][["train_dates"]] <- data[1:(n_train),1]
    lst[[1]][["test_dates"]] <- data[(n_train+1):nrow(data),1]
 
    est <- auto.arima(data[1:n_train,i])
    fcas <- forecast(est, h=6)$mean
    acc <- accuracy(fcas, data[(n_train+1):nrow(data),i])
    fcas_upd <- data.frame(date=data[(n_train+1):nrow(data),1], forecast=fcas,                           actual=data[(n_train+1):nrow(data),i])
 
    lst[[i]][["estimates"]] <- est
    lst[[i]][["forecast"]] <- fcas
    lst[[i]][["forecast_f"]] <- fcas_upd
    lst[[i]][["accuracy"]] <- acc
 
    cond1 = diff(range(fcas[1], fcas[length(fcas)])) == 0
    cond2 = acc[,3] >= 0.025
 
    if(cond1|cond2){
 
      mfcas = forecast(ma(data[,i], order=3), h=5)        
      lst[[i]][["moving_average"]] <- mfcas
 
    } else {
 
      est2 <- auto.arima(data[,i])
      fcas2 <- forecast(est, h=5)$mean
 
      lst[[i]][["estimates_full"]] <- est2
      lst[[i]][["forecast_full"]] <- fcas2
 
    }  
  }  
  return(lst)
}
 
batch(ddat)

This isn’t the prettiest code, but it gets the job done. Note that lst was populated within a function and won’t be available in the global environment. Instead, I chose to simply print out the contents of the list after the function is evaluated.

To leave a comment for the author, please follow the link and comment on their blog: R – Mathew Analytics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)