Time Series in 5-Minutes, Part 2: Autocorrelation and Cross Correlation
Have 5-minutes? Then let’s learn time series. In this short article series, I highlight how you can get up to speed quickly on important aspects of time series analysis. Today we are focusing on a critical visualization technique: Autocorrelation and Cross Correlation. Learn how to make interactive (plotly) and static (ggplot2) visualizations easily with timetk.
Updates
This article has been updated. View the updated Time Series in 5-Minutes article at Business Science.
Time Series in 5-Minutes
Articles in this Series
I just released timetk 2.0.0 (read the release announcement). A ton of new functionality has been added. We’ll discuss some of the key pieces in this article series:
 Part 1, The Time Plot
 Part 2, Autocorrelation
 Part 3, Seasonality
 Part 4, Anomalies and Anomaly Detection
 Part 5, Dealing with Missing Time Series Data
👉 Register for our blog to get new articles as we release them.
Have 5-Minutes?
Then let’s learn Autocorrelation and Cross Correlation
This tutorial focuses on plot_acf_diagnostics(), a workhorse time-series plotting function that makes:
 ACF and PACF Plots (Autocorrelation and Partial Autocorrelation)
 CCF Plots (Cross Correlation)
in interactive (plotly) and static (ggplot2) visualization formats.
Time Series Course (Coming Soon)
I teach Time Series (timetk, more) in my Time Series Analysis & Forecasting Course. If interested in learning Pro-Forecasting Strategies then join my waitlist. The course is coming soon.
You will learn:
 Time Series Preprocessing, Noise Reduction, & Anomaly Detection
 Feature engineering using lagged variables & external regressors
 Hyperparameter tuning
 Time series cross-validation
 Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
 NEW – Deep Learning with RNNs (Competition Winner)
 and more.
Signup for the Time Series Course waitlist
Libraries
Load the following libraries. For the purposes of this tutorial, I’m setting all plots to static ggplot2 using interactive <- FALSE, but I encourage you to switch this to TRUE to see how easy it is to make interactive plotly plots.
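A minimal setup sketch, assuming dplyr (for the pipe and grouping verbs) alongside timetk; swap in library(tidyverse) if you prefer to load everything at once:

```r
library(dplyr)   # %>% pipe, group_by(), select()
library(timetk)  # plotting functions and the example datasets

# FALSE returns static ggplot2 plots; TRUE returns interactive plotly plots
interactive <- FALSE
```

Every plotting call in this tutorial can then take .interactive = interactive, so one flag flips all the charts.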
Part 1: Autocorrelation
Autocorrelation is the correlation between a time series and lagged versions of itself. In layman’s terms, this means that past values are related to future values. We can visualize this relationship with an ACF plot.
First, plot the time series we’ll be looking at, taylor_30_min, using plot_time_series(). We learned how to plot time series with the Time Plot in Part 1 of this series.
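A sketch of that time plot, assuming the bundled taylor_30_min dataset exposes its timestamp and measurement as columns named date and value:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

# Assumed column names: date (timestamp) and value (electricity demand)
p_time <- taylor_30_min %>%
    plot_time_series(date, value, .interactive = interactive)

p_time
```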
This series represents electricity demand recorded at 30-minute intervals for about 3 months. We can visualize the autocorrelation in the series using a new function, plot_acf_diagnostics().
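A sketch of the call, again assuming the date and value column names and leaving the number of lags at the function's default:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

# One call returns both the ACF and PACF panels for the series
p_acf <- taylor_30_min %>%
    plot_acf_diagnostics(date, value, .interactive = interactive)

p_acf
```

Because the data is sampled every 30 minutes, a daily cycle would show up around lag 48 and a weekly cycle around lag 336.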
Why are ACF and PACF important?
From plot_acf_diagnostics() we get:

ACF Plot: The autocorrelation (y-axis), which is the relationship between the series and each progressive lag (x-axis) of the series.

PACF Plot: The partial autocorrelation vs lags. The partial autocorrelation shows how much each progressive lag adds to the predictability once the shorter lags have been accounted for. In other words, lags that are correlated with each other are de-weighted so the most important lags stand out.
These 2 visualizations help us model relationships and develop predictive forecasts:
 Seasonality: Possible Fourier Series we can use to model a relationship
 Lags as Predictors: We can find important lags to include in our models
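To make the ACF vs PACF distinction concrete, here is a small base-R sketch on a simulated AR(1) series (an assumption for illustration, not data from this article). In an AR(1) process each value depends directly only on the previous one, so the ACF should decay gradually across lags while the PACF should show one large spike at lag 1 and values near zero afterwards.

```r
set.seed(123)

# Simulated AR(1) process: x[t] = 0.8 * x[t-1] + noise (illustrative only)
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = 2000))

# acf() includes lag 0 (always 1), so drop it; pacf() starts at lag 1
acf_vals  <- as.numeric(acf(x,  lag.max = 5, plot = FALSE)$acf)[-1]
pacf_vals <- as.numeric(pacf(x, lag.max = 5, plot = FALSE)$acf)

# Side-by-side comparison for lags 1 to 5
round(data.frame(lag = 1:5, acf = acf_vals, pacf = pacf_vals), 2)
```

The lag-2 autocorrelation is still sizeable only because lag 2 is correlated with lag 1; the PACF removes that carry-over, which is why it points to lag 1 as the one predictor worth including.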
If you want to learn Time Series Forecasting for Business, it’s a no-brainer - Join my Time Series Course Waitlist (It’s coming, it’s really insane).
Grouped ACF and PACFs
Often in time series we are dealing with more than one series; these are called groups. Let’s switch to a different hourly dataset, m4_hourly, that contains 4 groups.
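A sketch of how the four groups might be plotted, assuming m4_hourly has id, date and value columns; the two-column facet layout and free scales are my choices for readability:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

# Grouping by id gives one facet per series
p_m4 <- m4_hourly %>%
    group_by(id) %>%
    plot_time_series(
        date, value,
        .facet_ncol   = 2,       # 2 x 2 grid of panels
        .facet_scales = "free",  # each series gets its own axes
        .interactive  = interactive
    )

p_m4
```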
We can get the ACF and PACF plots easily using plot_acf_diagnostics(). We can isolate 14 days of lags using .lags = "14 days".
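A sketch of the grouped call, assuming the id, date and value column names:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

# group_by() makes plot_acf_diagnostics() compute ACF and PACF per series
p_acf_grouped <- m4_hourly %>%
    group_by(id) %>%
    plot_acf_diagnostics(
        date, value,
        .lags        = "14 days",  # time-based phrase instead of a lag count
        .interactive = interactive
    )

p_acf_grouped
```

With hourly data, 14 days works out to 336 lags, so a daily pattern appears at multiples of 24 and a weekly pattern at lag 168.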
Why use time series groups?

Using groups helps us to evaluate time series much faster than analyzing every time series individually. We’re able to quickly evaluate 4 time series.

Grouped analysis can highlight similarities and differences between time series. We can see H150 and H410 have spikes at 1 week in addition to the daily frequency.
Part 2: Cross Correlation
The last example here is Cross Correlation, an important technique for finding external predictors. We start with a new time series, walmart_sales_weekly, which contains weekly sales for Walmart, time series groups consisting of various departments, and several (potential) predictors including temperature and fuel price.
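A sketch for a first look at the sales series by department, assuming the dataset uses the column names id, Date, Weekly_Sales, Temperature and Fuel_Price:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

# Keep the group id, the date, the target and the two candidate predictors
walmart_tbl <- walmart_sales_weekly %>%
    select(id, Date, Weekly_Sales, Temperature, Fuel_Price)

# One panel per department
p_walmart <- walmart_tbl %>%
    group_by(id) %>%
    plot_time_series(Date, Weekly_Sales, .facet_ncol = 2, .interactive = interactive)

p_walmart
```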
Note that you will need the development version of timetk for this functionality until timetk 2.0.1 is released. You can upgrade using devtools::install_github("business-science/timetk").
We can visualize Cross Correlations between Weekly Sales and Temperature and Fuel Price using the .ccf_vars argument.
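A sketch of the cross correlation call, assuming the same column names as above; the "3 months" lag window is my choice, not a recommendation:

```r
library(dplyr)
library(timetk)

interactive <- FALSE  # static ggplot2 output

p_ccf <- walmart_sales_weekly %>%
    select(id, Date, Weekly_Sales, Temperature, Fuel_Price) %>%
    group_by(id) %>%
    plot_acf_diagnostics(
        Date, Weekly_Sales,                         # ACF and PACF of the target
        .ccf_vars    = c(Temperature, Fuel_Price),  # adds one CCF panel per predictor
        .lags        = "3 months",                  # roughly 13 weekly lags
        .interactive = interactive
    )

p_ccf
```

The CCF panels sit alongside the ACF and PACF for each department, so you can check whether a predictor lines up with sales at any lag before adding it to a model.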
Have questions on using Timetk for time series?
Make a comment in the chat below. 👇
And, if you plan on using timetk for your business, it’s a no-brainer - Join my Time Series Course Waitlist (It’s coming, it’s really insane).