Forecasting Many Time Series (Using NO For-Loops)
Spending too much time on making iterative forecasts? I'm super excited to introduce the new panel data forecasting functionality in modeltime. It's perfect for forecasting many time series at once without for-loops, saving you time and aggravation. Just say NO to for-loops for forecasting.
Fitting many time series can be an expensive process. The most widely accepted technique is to iteratively run an ARIMA model on each time series in a for-loop.
Times are changing. Organizations now need 1000s of forecasts. Think 1000s of customers, products, and complex hierarchical data.
In this tutorial:

We'll explain new techniques involving Global Models and Panel Data for dealing with many time series.

We'll then provide an introductory tutorial using the new features in modeltime 0.7.0 (now available on CRAN) for modeling time series as Panel Data to make forecasts without for-loops.
Before we move on: what if you want more in-depth forecasting training with Modeltime?
Free Forecasting Training!
I can't possibly go over all of the new modeltime features released in 0.7.0 in this tutorial. If you would like to learn more about the new features in modeltime 0.7.0, I'm hosting a free live webinar on Wednesday, July 28th at 2PM EST. I'll cover:
 Deep Learning (Torch Integration)
 Global Models
 Panel Data Forecasting
 New Modeltime Features
 And, a lot of code
What is Modeltime?
A growing ecosystem for tidymodels forecasting
Modeltime is part of a growing ecosystem of forecasting packages. Modeltime integrates tidymodels for forecasting at scale. The ecosystem contains:
And several new community-contributed modeltime extension packages have emerged, including: boostime, bayesmodels, garchmodels, and sknifedatar.
Problem: Forecasting with For-Loops is Not Scalable
Time series data is increasing at an exponential rate. Organization-wide forecasting demands have changed from top-level to bottom-level forecasting, which has increased the number of forecasts that need to be made from the range of 1-100 to the range of 1,000-10,000.
Think of forecasting by customer for an organization that has 10,000 customers. It becomes a challenge to make these forecasts one at a time in an iterative approach. As that organization grows, moving from 10,000 to 100,000 customers, forecasting with an iterative approach is not scalable.
Modeltime has been designed to take a different approach using Panel Data and Global Models (more on these concepts shortly). Using these approaches, we can dramatically increase the scale at which forecasts can be made. Prior limitations in the range of 1,000 to 10,000 forecasts become manageable. Going beyond that is also possible with clustering techniques and several panel models. We are only limited by RAM, not modeling time.
Before we move on, we need to cover two key concepts:
 Panel Data
 Global Models
What are Panel Data and Global Models?
In its simplest form, Panel Data is a time series dataset that has more than one series. Each time series is stacked row-wise on top of the others.
The Panel Data Time Series Format
Traditional modeling techniques like ARIMA can only be used on one time series at a time. The widely accepted forecasting approach is to iterate through each time series, producing a unique model and forecast for each time series identifier. The downside of this approach is that it's expensive when you have many time series. Think of the number of products in a database. As the number of time series approaches the range of 1,000-10,000, the iterative approach becomes unscalable: for-loops run endlessly and errors can grind your analysis to a halt.
Problem: 1000 ARIMA Models Needed for 1000 Time Series
Global Models are alternatives to the iterative approach. A Global Model is a single model that forecasts all time series at once. Global Models are highly scalable, which solves the problem of 1-10,000 time series. An example is an XGBoost Model, which can determine relationships for all 1000 time series panels with a single model. This is great: No For-Loops!
Solution: A Single XGBOOST Model can Model 1000 Time Series
The downside is that a global model approach can be less accurate than the iterative approach. To improve accuracy, feature engineering and localized model selection by time series identifier become critical to large-scale forecasting success. If interested, I teach proven feature engineering techniques in my Time Series Forecasting Course.
Say No to For-Loops
If you're tired of waiting for ARIMA models to finish, then maybe it's time to say NO to for-loops and give modeltime a try.
Forecast using Global Models and Panel Data with Modeltime for a 1000X Speedup
While Modeltime can perform iterative modeling, Modeltime excels at forecasting at scale without for-loops using:

Global Modeling: Global model Machine Learning and Deep Learning strategies using the Modeltime Ecosystem (e.g. modeltime, modeltime.h2o, and modeltime.gluonts).
Panel Data: Tidy data that is easy to work with if you are familiar with the tidyverse and tidymodels.
Feature Engineering: Developing calendar features, lagged features, and other time-based, window-based, and sequence-based features using timetk.
Multi-Forecast Visualization: Visualizing multiple local time series forecasts at once.

Global and Localized Accuracy Reporting: Generating out-of-sample accuracy both globally and at a local level by time series identifier (available in modeltime >= 0.7.0).
Global and Localized Confidence Interval Reporting: Generating out-of-sample confidence intervals both globally and at a local level by time series identifier (available in modeltime >= 0.7.0).
Once you learn these concepts, you can achieve speedups of 1000X or more. We'll showcase several of these features in our tutorial on forecasting many time series without for-loops.
Tutorial on Forecasting Many Time Series (Without For-Loops)
Let's work through a short tutorial on forecasting many time series without for-loops.
Load Libraries
First, load the following libraries.
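A sketch of the library-loading step (the original post's code chunk was not preserved here; the tidyverse meta-package is assumed for data wrangling):

```r
library(tidymodels)   # modeling framework: recipes, parsnip, workflows, rsample
library(modeltime)    # time series modeling and the modeltime workflow
library(timetk)       # time series feature engineering and visualization
library(tidyverse)    # data wrangling and plotting
```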
Collect data
Next, collect the walmart_sales_weekly dataset. The dataset consists of 1001 observations of revenue generated by a store-department combination in a given week. It contains:
 7 Time Series Groups denoted by the "ID" column
 The data is structured in Panel Data format
 The time series groups will be modeled with a single Global Model
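A sketch of the data collection step, assuming the walmart_sales_weekly dataset shipped with timetk and renaming the columns to ID, date, and value:

```r
# Select the series identifier, the date, and the revenue to forecast
data_tbl <- walmart_sales_weekly %>%
  select(id, Date, Weekly_Sales) %>%
  set_names(c("ID", "date", "value"))

data_tbl
# 1,001 rows; 7 time series groups stacked in Panel Data format
```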
Visualize the Data
From the visualization, the weekly department revenue patterns emerge. Most of the series have yearly seasonality and long-term trends.
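One way to produce this visualization, assuming the data_tbl panel data tibble from the previous step:

```r
# Facet the 7 series by their ID for a quick visual inspection
data_tbl %>%
  group_by(ID) %>%
  plot_time_series(
    date, value,
    .facet_ncol  = 2,      # arrange the series in 2 columns of facets
    .interactive = FALSE   # return a static ggplot instead of plotly
  )
```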
Train/Test Splitting
We can split the data into training and testing sets using time_series_split(). We'll hold out the last 3 months of the year to test the global model on a 3-month forecast. The message about overlapping dates lets us know that multiple time series are being processed using the last 3-month window for testing.
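A sketch of the split, assuming data_tbl from the data collection step:

```r
splits <- data_tbl %>%
  time_series_split(
    assess     = "3 months",  # hold out the last 3 months for testing
    cumulative = TRUE         # use all remaining history for training
  )
# A message about overlapping timestamps is expected when splitting panel data
```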
Feature Engineering (Recipe)
Next, we preprocess the data. We will use the recipes workflow for generating time series features.
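A sketch of the preprocessing recipe, assuming the splits object from the previous step; step_timeseries_signature() from timetk expands the date column into calendar features:

```r
rec_obj <- recipe(value ~ ., training(splits)) %>%
  step_mutate_at(ID, fn = droplevels) %>%   # drop unused factor levels in the ID
  step_timeseries_signature(date) %>%       # derive calendar features from the date
  step_rm(date) %>%                         # remove the raw date column
  step_zv(all_predictors()) %>%             # drop zero-variance predictors
  step_dummy(all_nominal_predictors(), one_hot = TRUE)  # one-hot encode factors
```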

This results in 37 derived features for modeling.

We can certainly include more features, such as lags and rolling features, which are covered in the High-Performance Time Series Course.
Machine Learning
We'll create an xgboost workflow by fitting the default xgboost model to our derived features from our in-sample training data set.
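A sketch of the workflow, assuming rec_obj and splits from the previous steps:

```r
wflw_xgb <- workflow() %>%
  add_model(
    boost_tree("regression") %>% set_engine("xgboost")  # default xgboost spec
  ) %>%
  add_recipe(rec_obj) %>%
  fit(training(splits))  # fits one global model across all 7 time series
```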

We create a Global XGBOOST Model, a single model that forecasts all of our time series.

Training the global xgboost model takes approximately 50 milliseconds.

Conversely, an ARIMA model might take several minutes to iterate through possible parameter combinations for each of the 7 time series.

Global modeling is a 1000X speedup.
Modeltime Workflow
We'll step through the modeltime workflow, which is used to test many different models on the time series and organize the entire process.
Create a Modeltime Table
First, we create a Modeltime Table using modeltime_table(). The Modeltime Table organizes our model(s). We can even add more models if we'd like, and each model will get an ID (.model_id) and a description (.model_desc).
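For example, assuming the fitted workflow wflw_xgb from the machine learning step:

```r
model_tbl <- modeltime_table(
  wflw_xgb  # additional fitted models could be added here, comma-separated
)
```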
Calibrate by ID
Next, we calibrate. Calibration calculates the out-of-sample residual error. A new feature in modeltime 0.7.0 is the ability to calibrate by each time series.
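A sketch of calibrating by the ID column, assuming model_tbl and splits from the previous steps:

```r
calib_tbl <- model_tbl %>%
  modeltime_calibrate(
    new_data = testing(splits),
    id       = "ID"   # new in modeltime 0.7.0: track residuals per time series
  )
```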
Measure Accuracy
Next, we measure the global and local accuracy on the global model.
Global Accuracy
Global Accuracy is the overall accuracy of the test forecasts, which simply returns an aggregated error without taking into account that there are multiple time series. The default is modeltime_accuracy(acc_by_id = FALSE), which returns the global model accuracy.
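For example, assuming calib_tbl from the calibration step:

```r
calib_tbl %>%
  modeltime_accuracy(acc_by_id = FALSE) %>%  # the default: one aggregated row
  table_modeltime_accuracy(.interactive = FALSE)
```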
Accuracy Table

| .model_id | .model_desc | .type | mae     | mape | mase | smape | rmse    | rsq  |
|-----------|-------------|-------|---------|------|------|-------|---------|------|
| 1         | XGBOOST     | Test  | 3254.56 | 7.19 | 0.1  | 7     | 4574.52 | 0.98 |
Local Accuracy
The drawback with the global accuracy is that the model may not perform well on specific time series. By toggling modeltime_accuracy(acc_by_id = TRUE), we can obtain the Local Accuracy, which is the accuracy the model has on each of the time series groups. This can be useful for identifying which time series the model does well on (and which it does poorly on). We can then apply model selection logic to select specific global models for specific IDs.
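For example, assuming calib_tbl from the calibration step:

```r
calib_tbl %>%
  modeltime_accuracy(acc_by_id = TRUE) %>%   # one accuracy row per time series ID
  table_modeltime_accuracy(.interactive = FALSE)
```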
Accuracy Table

| .model_id | .model_desc | .type | ID   | mae     | mape  | mase | smape | rmse    | rsq  |
|-----------|-------------|-------|------|---------|-------|------|-------|---------|------|
| 1         | XGBOOST     | Test  | 1_1  | 1138.25 | 6.19  | 0.85 | 5.93  | 1454.25 | 0.95 |
| 1         | XGBOOST     | Test  | 1_3  | 3403.81 | 18.47 | 0.57 | 16.96 | 4209.29 | 0.91 |
| 1         | XGBOOST     | Test  | 1_8  | 1891.35 | 4.93  | 0.86 | 5.07  | 2157.43 | 0.55 |
| 1         | XGBOOST     | Test  | 1_13 | 1201.11 | 2.92  | 0.53 | 2.97  | 1461.49 | 0.60 |
| 1         | XGBOOST     | Test  | 1_38 | 8036.27 | 10.52 | 0.99 | 10.64 | 8955.32 | 0.02 |
| 1         | XGBOOST     | Test  | 1_93 | 3493.69 | 4.50  | 0.34 | 4.64  | 4706.68 | 0.78 |
| 1         | XGBOOST     | Test  | 1_95 | 3617.45 | 2.83  | 0.46 | 2.83  | 4184.46 | 0.72 |
Forecast the Data
The last step we'll cover is forecasting the test dataset. This is useful for evaluating the model using a sampling of the time series within the panel dataset. In modeltime 0.7.0, we now have modeltime_forecast(conf_by_id = TRUE), which allows the confidence intervals (prediction intervals) to be calculated by time series identifier. Note that modeltime_calibrate() must have been performed with an id specified.
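A sketch of the forecast, assuming calib_tbl, splits, and data_tbl from the previous steps:

```r
calib_tbl %>%
  modeltime_forecast(
    new_data    = testing(splits),
    actual_data = data_tbl,
    conf_by_id  = TRUE   # new in 0.7.0: confidence intervals per time series ID
  ) %>%
  group_by(ID) %>%
  plot_modeltime_forecast(
    .facet_ncol  = 2,
    .interactive = FALSE
  )
```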
Summary
We just showcased the Modeltime Workflow for Panel Data using a Global XGBOOST Model. But this is a simple problem. And there's a lot more to learning time series:
 Many more algorithms
 Feature Engineering for Time Series
 Ensembling
 Machine Learning
 Deep Learning
 Scalable Modeling: 10,000+ time series
You're probably thinking: how am I ever going to learn time series forecasting? Here's the solution that will save you years of struggling.
It gets better
You've just scratched the surface. Here's what's coming…
The Modeltime Ecosystem functionality is much more feature-rich than what we've covered here (I couldn't possibly cover everything in this post).
Hereβs what I didnβt cover:

Feature Engineering: We can make this forecast much more accurate by including features from competition-winning strategies.

Ensemble Modeling: We can stack models together to make super-learners that stabilize predictions.

Deep Learning: We can use GluonTS Deep Learning to develop high-performance, scalable forecasts.
So how are you ever going to learn time series analysis and forecasting?
You're probably thinking:
 There's so much to learn
 My time is precious
 I'll never learn time series
I have good news that will put those doubts behind you.
You can learn time series analysis and forecasting in hours with my state-of-the-art time series forecasting course.
Advanced Time Series Course
Become the time series expert in your organization.
My Advanced Time Series Forecasting in R course is available now. You'll learn timetk and modeltime, plus the most powerful time series forecasting techniques available, like GluonTS Deep Learning. Become the time series domain expert in your organization.
Advanced Time Series Course.
You will learn:
 Time Series Foundations  Visualization, Preprocessing, Noise Reduction, & Anomaly Detection
 Feature Engineering using lagged variables & external regressors
 Hyperparameter Tuning  For both sequential and non-sequential models
 Time Series CrossValidation (TSCV)
 Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
 Deep Learning with GluonTS (Competition Winner)
 and more.
Unlock the High-Performance Time Series Course
Have questions about Modeltime?
Make a comment in the chat below.
And, if you plan on using modeltime for your business, it's a no-brainer: join my Time Series Course.