# Forecasting: Multivariate Regression Exercises (Part-4)

May 1, 2017

(This article was first published on R-exercises, and kindly contributed to R-bloggers)

In the previous exercises of this series, forecasts were based only on an analysis of the forecast variable. Another approach to forecasting is to use external variables, which serve as predictors. This set of exercises focuses on forecasting with the standard multivariate linear regression.
Running regressions may appear straightforward, but this method of forecasting is subject to some pitfalls:
(1) a basic difficulty is the selection of predictor variables (which is more of an art than a science),
(2) a possible problem is the dependence of a forecast on assumptions about the expected values of predictor variables,
(3) another problem can arise if autocorrelation is present in the regression residuals (it implies, among other things, that not all of the information that could be used for forecasting was extracted from the forecast variable).
This set of exercises provides practice in using the `regsubsets` function from the `leaps` package to run sets of regressions, making and plotting forecasts from a multivariate regression, and testing residuals for autocorrelation (which requires the `lmtest` package to be installed). Model selection is based on the Bayesian information criterion (BIC).
The exercises use quarterly data on light vehicle sales (in thousands of units), real disposable personal income (per capita, in chained 2009 dollars), the civilian unemployment rate (in percent), and the finance rate on personal loans at commercial banks (24-month loans, in percent) in the USA for 1976-2016 from FRED, the Federal Reserve Bank of St. Louis database (download here).
For other parts of the series follow the tag forecasting.
Answers to the exercises are available here.

Exercise 1
Load the dataset, and plot the `sales` variable.

Exercise 2
Create the `trend` variable (by assigning a successive number to each observation), and lagged versions of the variables `income`, `unemp`, and `rate` (lagged by one period). Add them to the dataset.
(Note that the base R libraries do not include functions for creating lags for non-time-series data, so the variables can be created manually.)
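
The manual approach mentioned in the note can be sketched in base R as follows. This is only an illustration on made-up numbers, not the exercise answer: the data frame `df`, the helper `lag1`, and the `_lag` column names are assumptions introduced here.

```r
# Made-up stand-in for the exercise data (column names follow the text)
df <- data.frame(
  sales  = c(3900, 4100, 4050, 4200, 4150, 4300, 4250, 4400),
  income = seq(30000, 30700, by = 100),
  unemp  = c(5.0, 5.1, 5.3, 5.2, 5.0, 4.9, 4.8, 4.7),
  rate   = c(10.1, 10.0, 9.9, 9.8, 9.9, 10.0, 10.2, 10.1)
)

# Trend: a successive number for each observation
df$trend <- seq_len(nrow(df))

# Hypothetical helper: lag a vector by one period by dropping its last
# value and padding the front with NA
lag1 <- function(x) c(NA, head(x, -1))

# Lagged predictors (the "_lag" suffix is an assumed naming convention)
df$income_lag <- lag1(df$income)
df$unemp_lag  <- lag1(df$unemp)
df$rate_lag   <- lag1(df$rate)

print(df)
```

The first row of each lagged column is `NA`, so a regression on the lagged variables will silently drop that observation.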

Exercise 3
Run all possible linear regressions with `sales` as the dependent variable and the others as independent variables using the `regsubsets` function from the `leaps` package (pass a formula with all possible independent variables, and the dataset as inputs to the function).
Plot the output of the function.

Exercise 4
Note that `regsubsets` returns only one “best” model (in terms of BIC) for each possible number of independent variables. Run all regressions again, but increase the number of returned models for each size to 2.
Plot the output of the function.
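
As a rough sketch of how the number of returned models per size can be controlled, the snippet below calls `regsubsets` with and without the `nbest` argument. It runs on synthetic data (the data frame `d` and its coefficients are invented for illustration), and the call is wrapped in a check so that it is skipped if the `leaps` package is not installed.

```r
# Synthetic data with four candidate predictors (values are invented)
set.seed(3)
n <- 80
d <- data.frame(trend = seq_len(n))
d$income <- 30000 + 50 * d$trend + rnorm(n, sd = 300)
d$unemp  <- runif(n, min = 4, max = 10)
d$rate   <- runif(n, min = 8, max = 14)
d$sales  <- 4000 + 0.02 * d$income - 150 * d$unemp + rnorm(n, sd = 100)

if (requireNamespace("leaps", quietly = TRUE)) {
  # Default: one best model for each number of predictors
  one_best <- leaps::regsubsets(sales ~ ., data = d)
  # nbest = 2: up to two best models for each number of predictors
  two_best <- leaps::regsubsets(sales ~ ., data = d, nbest = 2)

  # BIC values of the returned models (lower is better)
  print(summary(one_best)$bic)
  print(summary(two_best)$bic)
}
```

Calling `plot(two_best, scale = "bic")` afterwards should draw the usual grid of models ordered by BIC, with one row per returned model.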

Exercise 5
Look at the plots from the previous exercises and find the model with the lowest value of BIC. Run a linear regression for the model, save the result in a variable, and print its summary.
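
The idea of picking the lowest-BIC model and refitting it with `lm` can be illustrated in base R. The sketch below compares a few hand-picked candidate formulas with `BIC()` on synthetic data; the data frame `d`, the `noise` column, and the candidate list are assumptions made for the example. The absolute BIC values need not match those reported by `regsubsets`, since the two may be computed on different scales.

```r
# Synthetic data: sales depends on trend and unemp, but not on noise
set.seed(7)
n <- 100
d <- data.frame(trend = seq_len(n),
                unemp = runif(n, min = 4, max = 10),
                noise = rnorm(n))
d$sales <- 4000 + 5 * d$trend - 150 * d$unemp + rnorm(n, sd = 100)

# A few candidate specifications (chosen by hand for illustration)
candidates <- list(
  sales ~ trend + unemp,
  sales ~ trend + unemp + noise,
  sales ~ trend
)

# BIC of each candidate; the smallest value marks the preferred model
bics <- sapply(candidates, function(f) BIC(lm(f, data = d)))
print(bics)

# Refit the preferred model, save it, and print its summary
best_formula <- candidates[[which.min(bics)]]
best <- lm(best_formula, data = d)
print(summary(best))
```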

Exercise 6
Load an additional dataset with assumptions on future values of the independent variables. Use the dataset and the model obtained in the previous exercise to make a forecast for the next 4 quarters with the `forecast` function (from the package with the same name). Note that the names of the lagged variables in the assumptions data have to be identical to the names of the corresponding variables in the main dataset.
Print the summary of the forecast.
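
The requirement that the assumption data use the same column names as the model can be seen in a small base R sketch. It uses `predict` as a stand-in for `forecast` so that it runs without extra packages; the data frames `past` and `future`, the column `unemp_lag`, and all numbers are invented for illustration.

```r
# Synthetic history with a trend and one lagged predictor
set.seed(11)
n <- 60
past <- data.frame(trend = seq_len(n),
                   unemp_lag = runif(n, min = 4, max = 10))
past$sales <- 4000 + 5 * past$trend - 150 * past$unemp_lag + rnorm(n, sd = 80)

fit <- lm(sales ~ trend + unemp_lag, data = past)

# Assumed future values for the next 4 quarters; the column names must
# match the names used in the model formula exactly
future <- data.frame(trend = (n + 1):(n + 4),
                     unemp_lag = c(5.0, 5.1, 5.2, 5.2))

# Point forecasts with 95% prediction intervals
fc <- predict(fit, newdata = future, interval = "prediction", level = 0.95)
print(fc)
```

If a column in `future` were named differently (for example `unemp` instead of `unemp_lag`), `predict` would fail because it could not find the variable.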

Exercise 7
The `plot` function does not automatically draw plots for forecasts obtained from regression models with multiple predictors, but such plots can be created manually. As the first step, create a vector from the `sales` variable, and append the forecast (mean) values to this vector. Then use the `ts` function to transform the vector to a quarterly time series that starts in the first quarter of 1976.
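
A minimal sketch of this step, assuming 164 quarterly observations (1976 Q1 to 2016 Q4) and four forecast values; both the `sales` vector and the `fc_mean` values below are placeholders, not the real data or the real forecast.

```r
# Placeholder history: 164 quarters, 1976 Q1 to 2016 Q4
set.seed(5)
sales <- round(rnorm(164, mean = 3800, sd = 400))

# Placeholder forecast means for the four quarters of 2017
fc_mean <- c(4300, 4320, 4310, 4330)

# Append the forecast to the history and convert to a quarterly series
sales_ts <- ts(c(sales, fc_mean), start = c(1976, 1), frequency = 4)

# The series should now end in the fourth quarter of 2017
print(end(sales_ts))
print(window(sales_ts, start = c(2017, 1)))
```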

Exercise 8
Plot the forecast in the following steps:
(1) create an empty plot for the period from the first quarter of 2000 to the fourth quarter of 2017,
(2) plot a black line for the sales time series for the period 2000-2016,
(3) plot a thick blue line for the sales time series for the fourth quarter of 2016 and all quarters of 2017.
Note that a line can be plotted using the `lines` function, and a subset of a time series can be obtained with the `window` function.
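
The three steps can be sketched as a small function. The function name `plot_sales_forecast`, the axis labels, and the placeholder series are assumptions made for this example; the series is random and only mimics the shape of the data (1976 Q1 to 2017 Q4, with the last four values standing in for the forecast).

```r
# Placeholder quarterly series, 1976 Q1 to 2017 Q4
set.seed(9)
sales_ts <- ts(3800 + cumsum(rnorm(168, sd = 60)),
               start = c(1976, 1), frequency = 4)

plot_sales_forecast <- function(x) {
  # (1) empty plot for 2000 Q1 to 2017 Q4 (type = "n" draws axes only)
  plot(window(x, start = c(2000, 1), end = c(2017, 4)), type = "n",
       xlab = "Year", ylab = "Sales (thousands of units)")
  # (2) black line for the observed values, 2000 Q1 to 2016 Q4
  lines(window(x, start = c(2000, 1), end = c(2016, 4)), col = "black")
  # (3) thick blue line from 2016 Q4 through 2017 Q4, so that the
  #     forecast joins the last observed point
  lines(window(x, start = c(2016, 4), end = c(2017, 4)),
        col = "blue", lwd = 3)
  invisible(NULL)
}
```

Calling `plot_sales_forecast(sales_ts)` draws the chart on the active graphics device. Starting the blue segment at the fourth quarter of 2016 is what keeps the two lines connected.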

Exercise 9
Perform the Breusch-Godfrey test (the `bgtest` function from the `lmtest` package) to test the linear model obtained in Exercise 5 for autocorrelation of residuals. Set the maximum order of serial correlation to be tested to 4.
Is autocorrelation present?
(Note that the null hypothesis of the test is the absence of autocorrelation of the specified orders.)
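
To make the logic of the test more concrete, here is a hand-rolled base R sketch of the LM form of the statistic: residuals are regressed on the original regressors plus their own lags, and n times the R-squared of that auxiliary regression is compared with a chi-squared distribution. The function `bg_by_hand` and the synthetic data are assumptions made for illustration; in practice `bgtest(model, order = 4)` from `lmtest` should be preferred.

```r
# Synthetic regression with AR(1) errors, so autocorrelation is present
set.seed(42)
n <- 120
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 2 * x + e
fit <- lm(y ~ x)

# Hypothetical helper: LM version of the Breusch-Godfrey statistic
bg_by_hand <- function(model, order = 4) {
  res <- residuals(model)
  n_obs <- length(res)
  X <- model.matrix(model)
  # Lagged residuals; the missing initial values are set to zero
  lagged <- sapply(seq_len(order), function(k) c(rep(0, k), head(res, -k)))
  # Auxiliary regression of the residuals on regressors and lagged residuals
  aux <- lm.fit(cbind(X, lagged), res)
  r2 <- 1 - sum(aux$residuals^2) / sum(res^2)
  stat <- n_obs * r2
  p_value <- pchisq(stat, df = order, lower.tail = FALSE)
  c(statistic = stat, p.value = p_value)
}

result <- bg_by_hand(fit, order = 4)
print(result)
```

A small p-value leads to rejecting the null hypothesis of no autocorrelation up to the chosen order.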

Exercise 10
Use the `Pacf` function from the `forecast` package to explore autocorrelation of residuals of the linear model obtained in Exercise 5. Find at which lags partial correlation between lagged values is statistically significant at the 5% level. Residuals can be obtained from the model using the `residuals` function.
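
One way to read off the significant lags without looking at a plot is sketched below. It uses base R's `pacf` as a stand-in for `Pacf` and the usual approximate 5% bounds of ±1.96/√n; the synthetic data and the choice of 12 lags are assumptions made for the example.

```r
# Synthetic regression with AR(1) errors
set.seed(42)
n <- 120
x <- rnorm(n)
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 2 * x + e
fit <- lm(y ~ x)

# Partial autocorrelations of the residuals, without drawing the plot
res <- residuals(fit)
p <- pacf(res, lag.max = 12, plot = FALSE)

# Approximate 5% significance bounds for a white-noise series
bound <- qnorm(0.975) / sqrt(length(res))

# Lags whose partial autocorrelation falls outside the bounds
sig_lags <- which(abs(as.numeric(p$acf)) > bound)
print(sig_lags)
```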
