Calibrate and Plot a Time Series with {healthyR.ts}

[This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers].

Introduction

In time series analysis, it is common to split the data into training and testing sets to evaluate the accuracy of a model. However, it is important to ensure that the model is calibrated on the training set before evaluating its performance on the testing set. The {healthyR.ts} library provides a function called calibrate_and_plot() that simplifies this process.
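As a rough illustration of the split itself, here is a minimal base R sketch, independent of {healthyR.ts}, that holds out the last 12 months of AirPassengers as a testing set. The names full_ts, train_ts and test_ts are chosen here for illustration only.

```r
# Base R only: chronological train/test split of a monthly series.
# AirPassengers runs from Jan 1949 to Dec 1960 (144 monthly observations).
full_ts <- AirPassengers

# Hold out the final 12 months for testing; everything earlier is training.
train_ts <- window(full_ts, end = c(1959, 12))
test_ts  <- window(full_ts, start = c(1960, 1))

length(train_ts)
length(test_ts)
```

The split is chronological rather than random, so the model never sees observations that come after the period it is evaluated on.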

Function

Here is the full function call:

calibrate_and_plot(
  ...,
  .type = "testing",
  .splits_obj,
  .data,
  .print_info = TRUE,
  .interactive = FALSE
)

Here are the arguments:

  • ... – The workflow(s) you want to add to the function.
  • .type – Either "training" or "testing"; this selects whether the training(splits) or testing(splits) data is used. The default is "testing".
  • .splits_obj – The splits object.
  • .data – The full data set.
  • .print_info – The default is TRUE, which prints the calibration accuracy tibble and the resulting plot.
  • .interactive – The default is FALSE. This controls whether the forecast plot is interactive (via plotly) or not.

Example

By default, calibrate_and_plot() prints a calibration accuracy tibble and the resulting forecast plot. Printing is controlled by the .print_info argument, which is TRUE by default. The forecast plot is static by default; set the .interactive argument to TRUE if you prefer an interactive plotly plot.

Here’s an example of how to use the calibrate_and_plot() function:

library(healthyR.ts)
library(dplyr)
library(timetk)
library(parsnip)
library(recipes)
library(workflows)
library(rsample)

# Get the Data
data <- ts_to_tbl(AirPassengers) |>
  select(-index)

# Split the data into training and testing sets
splits <- time_series_split(
   data
  , date_col
  , assess = 12
  , skip = 3
  , cumulative = TRUE
)

# Make the recipe object
rec_obj <- recipe(value ~ ., data = training(splits))

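# Make the Model
# Note: the "lm" engine fits ordinary least squares, so penalty and
# mixture are not used by it; they only matter for regularized
# engines such as "glmnet".
model_spec <- linear_reg(
   mode = "regression"
   , penalty = 0.5
   , mixture = 0.5
) |>
   set_engine("lm")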

# Make the workflow object
wflw <- workflow() |>
   add_recipe(rec_obj) |>
   add_model(model_spec) |>
   fit(training(splits))

# Get our output
output <- calibrate_and_plot(
  wflw
  , .type = "training"
  , .splits_obj = splits
  , .data = data
  , .print_info = FALSE
  , .interactive = TRUE
 )

Because .print_info is FALSE here, nothing is printed; the results are returned in output instead. It includes a calibration table, a model accuracy tibble, and a plotly plot showing the original time series data along with the fitted values for the training set.
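To make "fitted values for the training set" concrete, here is a small base R sketch that fits a straight-line trend to the training window and stores the fitted values next to the observed ones. It is only an analogy for the workflow above, not what calibrate_and_plot() does internally; the names train_df and fit, and the use of a decimal-year time index t as the predictor, are assumptions made for this illustration.

```r
# Base R only: fit a linear trend on the training window (Jan 1949 - Dec 1959)
train_ts <- window(AirPassengers, end = c(1959, 12))

train_df <- data.frame(
  t     = as.numeric(time(train_ts)),  # time index as a decimal year
  value = as.numeric(train_ts)         # observed passenger counts
)

# Ordinary least squares, the same family of model as the "lm" engine
fit <- lm(value ~ t, data = train_df)

# In-sample (training) fitted values, one per training observation
train_df$fitted <- as.numeric(fitted(fit))

head(train_df)
```

Plotting value and fitted against t gives a static counterpart to the training-set view described above.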

Let’s take a look at the output.

output$calibration_tbl
# Modeltime Table
# A tibble: 1 × 5
  .model_id .model     .model_desc .type .calibration_data 
      <int> <list>     <chr>       <chr> <list>            
1         1 <workflow> LM          Test  <tibble [132 × 4]>

output$model_accuracy
# A tibble: 1 × 9
  .model_id .model_desc .type   mae  mape  mase smape  rmse   rsq
      <int> <chr>       <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1         1 LM          Test   31.4  12.0  1.31  11.9  41.7 0.846
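For readers less familiar with the columns in the accuracy tibble, the sketch below shows textbook definitions of three of them (mae, rmse and mape) in base R. The numbers in the table above come from the package itself; these helper functions and the toy vectors actual and predicted are illustrative assumptions, not the package's implementation.

```r
# Textbook error metrics, written in base R for illustration only.
mae  <- function(actual, predicted) mean(abs(actual - predicted))
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mape <- function(actual, predicted) mean(abs((actual - predicted) / actual)) * 100

# Toy data (made up for this example)
actual    <- c(100, 200, 300)
predicted <- c(110, 190, 330)

mae(actual, predicted)   # mean absolute error, in the units of the series
rmse(actual, predicted)  # root mean squared error, penalizes large misses more
mape(actual, predicted)  # mean absolute percentage error, in percent
```

Lower values are better for all three; rsq, by contrast, is better when it is closer to 1.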

And…

output$plot

Overall, the calibrate_and_plot() function is a useful tool for simplifying the process of calibrating time series models on a training set and evaluating their performance on a testing set.

Voila!
