{healthyR.ts} New Features: Unlocking More Power

[This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

New Features: Unlocking More Power

My R package {healthyR.ts} has been updated to version 0.3.0; you can install it from CRAN, r-universe, or GitHub. Let’s go over some of the changes and improvements.

News

1. util_log_ts() – Logging Time Series Data

One of the standout additions is util_log_ts(), which log-transforms a time series. Taking logs is a common way to stabilize a variance that grows with the level of the series. This is a helper function for auto_stationarize().

2. util_singlediff_ts() – Single Differences for Time Series

The addition of util_singlediff_ts() expands the toolkit, offering a function dedicated to handling single differences in time series data. This is valuable for various applications, such as identifying trends or preparing data for further analysis. This is a helper function for auto_stationarize().

3. util_doublediff_ts() – Double Differences for Time Series

Building on the concept of differencing, util_doublediff_ts() lets users take double differences of time series data. This can help in cases where a single difference still leaves a trend in the series. This is a helper function for auto_stationarize().

4. util_difflog_ts() – Combining Differences and Log Transformation

util_difflog_ts() combines differencing and log transformation in one function. This is useful when both operations are needed, for example when a series has both a trend and a variance that grows with its level. This is a helper function for auto_stationarize().

5. util_doubledifflog_ts() – Double Differences with Log Transformation

util_doubledifflog_ts() takes things a step further by combining double differences and log transformation, for series where the simpler transforms are not enough. This is a helper function for auto_stationarize().
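
To make the five transformations concrete, here is a minimal base R sketch of the operations their names describe. It is not the package's code: the util_ helpers wrap these ideas for auto_stationarize(), and the object names below are only for illustration.

```r
# Base R illustration of the five transformations (not the package's code).
x <- AirPassengers  # strictly positive, so log() is safe here

log_x           <- log(x)                         # log transform
single_diff     <- diff(x)                        # x[t] - x[t-1]
double_diff     <- diff(x, differences = 2)       # difference of the differences
diff_log        <- diff(log(x))                   # difference of the logged series
double_diff_log <- diff(log(x), differences = 2)  # double difference of the logs

# Each difference drops one observation from the start of the series
sapply(
  list(
    log = log_x,
    single_diff = single_diff,
    double_diff = double_diff,
    diff_log = diff_log,
    double_diff_log = double_diff_log
  ),
  length
)
```

Because each difference drops one observation, a single-differenced series starts one period later than the original and a double-differenced one starts two periods later.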

Minor Fixes and Improvements: Polishing the Experience

1. Attributes Enhancement in ts_growth_rate_vec()

The attention to detail is evident with the addition of attributes to the output of ts_growth_rate_vec(). This enhancement not only improves the clarity of results but also contributes to a more informative and user-friendly experience.

2. Refinement of auto_stationarize() in Response to User Feedback

Updates to auto_stationarize() based on user feedback (Fix #481 #483) demonstrate a commitment to refining existing features. This responsiveness to the community’s needs ensures that the package evolves in sync with user expectations. The function now uses all of the util_ transforms mentioned above to improve its functionality.

3. Integration with auto_arima Engine in ts_auto_arima()

The integration of ts_auto_arima() with the parsnip engine of auto_arima is a notable improvement. This update, triggered when .tune is set to FALSE, aligns the package with cutting-edge tools, potentially enhancing the efficiency and accuracy of time series modeling.

In conclusion, the release of healthyR.ts version 0.3.0 is an exciting leap forward. The new features introduce powerful capabilities, while the minor fixes and improvements showcase a commitment to providing a robust and user-friendly package. Users can look forward to a more versatile and refined experience in time series analysis. Great job on this release, and I’m sure the community is eager to explore these enhancements!

Examples

Let’s see how the main functions now behave.

auto_stationarize()

library(healthyR.ts)

auto_stationarize(AirPassengers)
The time series is already stationary via ts_adf_test().
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
1957 315 301 356 348 355 422 465 467 404 347 305 336
1958 340 318 362 348 363 435 491 505 404 359 310 337
1959 360 342 406 396 420 472 548 559 463 407 362 405
1960 417 391 419 461 472 535 622 606 508 461 390 432
auto_stationarize(BJsales)
The time series is not stationary. Attempting to make it stationary...
$stationary_ts
Time Series:
Start = 3 
End = 150 
Frequency = 1 
  [1]  0.5 -0.4  0.6  1.1 -2.8  3.0 -1.1  0.6 -0.5 -0.5  0.1  2.0 -0.6  0.8  1.2
 [16] -3.4 -0.7 -0.3  1.7  3.0 -3.2  0.9  2.2 -2.5 -0.4  2.6 -4.3  2.0 -3.1  2.7
 [31] -2.1  0.1  2.1 -0.2 -2.2  0.6  1.0 -2.6  3.0  0.3  0.2 -0.8  1.0  0.0  3.2
 [46] -2.2 -4.7  1.2  0.8 -0.6 -0.4  0.6  1.0 -1.6 -0.1  3.4 -0.9 -1.7 -0.5  0.8
 [61]  2.4 -1.9  0.6 -2.2  2.6 -0.1 -2.7  1.7 -0.3  1.9 -2.7  1.1 -0.6  0.9  0.0
 [76]  1.8 -0.5 -0.4 -1.2  2.6 -1.8  1.7 -0.9  0.6 -0.4  3.0 -2.8  3.1 -2.3 -1.1
 [91]  2.1 -0.3 -1.7 -0.8 -0.4  1.1 -1.5  0.3  1.4 -2.0  1.3 -0.3  0.4 -3.5  1.1
[106]  2.6  0.4 -1.3  2.0 -1.6  0.6 -0.1 -1.4  1.6  1.6 -3.4  1.7 -2.2  2.1 -2.0
[121] -0.2  0.2  0.7 -1.4  1.8 -0.1 -0.7  0.4  0.4  1.0 -2.4  1.0 -0.4  0.8 -1.0
[136]  1.4 -1.2  1.1 -0.9  0.5  1.9 -0.6  0.3 -1.4 -0.9 -0.5  1.4  0.1

$ndiffs
[1] 1

$adf_stats
$adf_stats$test_stat
[1] -6.562008

$adf_stats$p_value
[1] 0.01


$trans_type
[1] "double_diff"

$ret
[1] TRUE
plot.ts(auto_stationarize(BJsales)$stationary_ts)
The time series is not stationary. Attempting to make it stationary...

auto_stationarize(BJsales.lead)
The time series is not stationary. Attempting to make it stationary...
$stationary_ts
Time Series:
Start = 2 
End = 150 
Frequency = 1 
  [1]  0.06  0.25 -0.57  0.58 -0.20  0.23 -0.04 -0.19  0.03  0.42  0.04  0.24
 [13]  0.34 -0.46 -0.18 -0.08  0.29  0.56 -0.37  0.20  0.54 -0.31  0.03  0.52
 [25] -0.70  0.35 -0.63  0.44 -0.38 -0.01  0.22  0.10 -0.50  0.01  0.30 -0.76
 [37]  0.52  0.15  0.06 -0.10  0.21 -0.01  0.70 -0.22 -0.76  0.06  0.02 -0.17
 [49] -0.08  0.01  0.11 -0.39  0.01  0.50 -0.02 -0.37 -0.13  0.05  0.54 -0.46
 [61]  0.25 -0.52  0.44  0.02 -0.47  0.11  0.06  0.25 -0.35  0.00 -0.06  0.21
 [73] -0.09  0.36  0.09 -0.04 -0.20  0.44 -0.23  0.40 -0.01  0.17  0.08  0.58
 [85] -0.27  0.79 -0.21  0.02  0.30  0.28 -0.27 -0.01  0.03  0.16 -0.28  0.15
 [97]  0.26 -0.36  0.32 -0.11  0.22 -0.65  0.00  0.47  0.16 -0.19  0.48 -0.26
[109]  0.21  0.00 -0.20  0.35  0.38 -0.48  0.20 -0.32  0.43 -0.50  0.12 -0.17
[121]  0.15 -0.36  0.35 -0.03 -0.18  0.16  0.07  0.21 -0.50  0.23 -0.13  0.14
[133] -0.15  0.19 -0.24  0.26 -0.22  0.17  0.37 -0.06  0.29 -0.34 -0.12 -0.16
[145]  0.25  0.08 -0.07  0.26 -0.37

$ndiffs
[1] 1

$adf_stats
$adf_stats$test_stat
[1] -4.838625

$adf_stats$p_value
[1] 0.01


$trans_type
[1] "diff"

$ret
[1] TRUE
plot.ts(auto_stationarize(BJsales.lead)$stationary_ts)
The time series is not stationary. Attempting to make it stationary...
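
The outputs above suggest a "try a transform, test, and return the first one that works" pattern. The sketch below illustrates that pattern in base R under loud assumptions: stationarize_sketch() and low_lag1_acf() are hypothetical names, the order of the candidates and the labels other than "diff" and "double_diff" are guesses, and the lag-1 autocorrelation check is only a crude stand-in for the ADF test that the package uses via ts_adf_test().

```r
# Hypothetical sketch: try transformations in a fixed order and return the
# first one that passes a user-supplied stationarity check.
stationarize_sketch <- function(x, is_stationary) {
  if (is_stationary(x)) {
    return(list(stationary_ts = x, trans_type = "none", ret = TRUE))
  }

  candidates <- list(
    diff        = function(s) diff(s),
    double_diff = function(s) diff(s, differences = 2)
  )

  # Log-based candidates only make sense for strictly positive data
  if (isTRUE(all(x > 0))) {
    candidates <- c(candidates, list(
      log             = function(s) log(s),
      diff_log        = function(s) diff(log(s)),
      double_diff_log = function(s) diff(log(s), differences = 2)
    ))
  }

  for (type in names(candidates)) {
    candidate <- candidates[[type]](x)
    if (is_stationary(candidate)) {
      return(list(stationary_ts = candidate, trans_type = type, ret = TRUE))
    }
  }

  # Nothing passed the check: hand back the original series
  list(stationary_ts = x, trans_type = "none", ret = FALSE)
}

# Crude stand-in check (an assumption, NOT an ADF test):
# treat a series as stationary when its lag-1 autocorrelation is small.
low_lag1_acf <- function(s) {
  r1 <- acf(s, lag.max = 1, plot = FALSE)$acf[2]
  abs(r1) < 0.5
}

set.seed(123)
random_walk <- ts(cumsum(rnorm(200)))  # a textbook non-stationary series
res <- stationarize_sketch(random_walk, low_lag1_acf)
res$trans_type
```

Passing the check in as a function keeps the sketch independent of any particular test; with {healthyR.ts} loaded you would simply call auto_stationarize() instead.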

ts_auto_arima()

This used to use only the Arima engine when the .tune parameter was set to FALSE, so it would often give a simple straight-line forecast. It now uses the auto_arima engine when .tune is set to FALSE.
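
To see why a fixed ARIMA specification can produce a straight line, here is a small base R sketch using stats::arima(). It assumes the straight line came from an ARIMA with no AR, differencing, or MA terms, i.e. ARIMA(0,0,0), and it contrasts that with the ARIMA(1,1,0)(0,1,0)[12] order that appears in the fitted workflow output in this post. It fits on the full AirPassengers series rather than the training split, so the numbers will not match that output exactly.

```r
# ARIMA(0,0,0): no AR, differencing, or MA terms, only a mean.
# Every forecast step is the estimated mean, which plots as a flat line.
fit_default <- arima(AirPassengers, order = c(0, 0, 0))
fc_default  <- predict(fit_default, n.ahead = 12)$pred

# A seasonal specification with the same orders as the model shown in this
# post: ARIMA(1,1,0)(0,1,0)[12]. Its forecast follows the seasonal pattern.
fit_seasonal <- arima(
  AirPassengers,
  order    = c(1, 1, 0),
  seasonal = list(order = c(0, 1, 0), period = 12)
)
fc_seasonal <- predict(fit_seasonal, n.ahead = 12)$pred

range(fc_default)   # both ends are the same value
range(fc_seasonal)  # a wide spread across the 12 months
```

Letting an automatic search choose the orders avoids having to supply them by hand when .tune is FALSE.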

library(timetk)
library(dplyr)
library(modeltime)

data <- AirPassengers |>
  ts_to_tbl() |>
  select(-index)

splits <- time_series_split(
  data
  , date_col
  , assess = 12
  , skip = 3
  , cumulative = TRUE
)

ts_aa <- ts_auto_arima(
  .data = data,
  .num_cores = 2,
  .date_col = date_col,
  .value_col = value,
  .rsamp_obj = splits,
  .formula = value ~ .,
  .grid_size = 5,
  .cv_slice_limit = 2,
  .tune = FALSE
)

ts_aa$recipe_info
$recipe_call
recipe(.data = data, .date_col = date_col, .value_col = value, 
    .formula = value ~ ., .rsamp_obj = splits, .tune = FALSE, 
    .grid_size = 5, .num_cores = 2, .cv_slice_limit = 2)

$recipe_syntax
[1] "ts_arima_recipe <-"                                                                                                                                                                           
[2] "\n  recipe(.data = data, .date_col = date_col, .value_col = value, .formula = value ~ \n    ., .rsamp_obj = splits, .tune = FALSE, .grid_size = 5, .num_cores = 2, \n    .cv_slice_limit = 2)"

$rec_obj
ts_aa$model_info
$model_spec
ARIMA Regression Model Specification (regression)

Computational engine: auto_arima 


$wflw
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: arima_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
ARIMA Regression Model Specification (regression)

Computational engine: auto_arima 


$fitted_wflw
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: arima_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Series: outcome 
ARIMA(1,1,0)(0,1,0)[12] 

Coefficients:
          ar1
      -0.2431
s.e.   0.0894

sigma^2 = 109.8:  log likelihood = -447.95
AIC=899.9   AICc=900.01   BIC=905.46

$was_tuned
[1] "not_tuned"
ts_aa$model_calibration
$plot

$calibration_tbl
# Modeltime Table
# A tibble: 1 × 5
  .model_id .model     .model_desc             .type .calibration_data
      <int> <list>     <chr>                   <chr> <list>           
1         1 <workflow> ARIMA(1,1,0)(0,1,0)[12] Test  <tibble [12 × 4]>

$model_accuracy
# A tibble: 1 × 9
  .model_id .model_desc             .type   mae  mape  mase smape  rmse   rsq
      <int> <chr>                   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1         1 ARIMA(1,1,0)(0,1,0)[12] Test   18.5  4.18 0.384  4.03  23.9 0.955
ts_aa$model_calibration$plot

Finally, the enhancement that adds attributes to the output of ts_growth_rate_vec():

ts_growth_rate_vec(AirPassengers)
  [1]          NA   5.3571429  11.8644068  -2.2727273  -6.2015504  11.5702479
  [7]   9.6296296   0.0000000  -8.1081081 -12.5000000 -12.6050420  13.4615385
 [13]  -2.5423729   9.5652174  11.9047619  -4.2553191  -7.4074074  19.2000000
 [19]  14.0939597   0.0000000  -7.0588235 -15.8227848 -14.2857143  22.8070175
 [25]   3.5714286   3.4482759  18.6666667  -8.4269663   5.5214724   3.4883721
 [31]  11.7977528   0.0000000  -7.5376884 -11.9565217  -9.8765432  13.6986301
 [37]   3.0120482   5.2631579   7.2222222  -6.2176166   1.1049724  19.1256831
 [43]   5.5045872   5.2173913 -13.6363636  -8.6124402  -9.9476440  12.7906977
 [49]   1.0309278   0.0000000  20.4081633  -0.4237288  -2.5531915   6.1135371
 [55]   8.6419753   3.0303030 -12.8676471 -10.9704641 -14.6919431  11.6666667
 [61]   1.4925373  -7.8431373  25.0000000  -3.4042553   3.0837004  12.8205128
 [67]  14.3939394  -2.9801325 -11.6040956 -11.5830116 -11.3537118  12.8078818
 [73]   5.6768559  -3.7190083  14.5922747   0.7490637   0.3717472  16.6666667
 [79]  15.5555556  -4.6703297 -10.0864553 -12.1794872 -13.5036496  17.2995781
 [85]   2.1582734  -2.4647887  14.4404332  -1.2618297   1.5974441  17.6100629
 [91]  10.4278075  -1.9370460 -12.3456790 -13.8028169 -11.4379085  12.9151292
 [97]   2.9411765  -4.4444444  18.2724252  -2.2471910   2.0114943  18.8732394
[103]  10.1895735   0.4301075 -13.4903640 -14.1089109 -12.1037464  10.1639344
[109]   1.1904762  -6.4705882  13.8364780  -3.8674033   4.3103448  19.8347107
[115]  12.8735632   2.8513238 -20.0000000 -11.1386139 -13.6490251   8.7096774
[121]   6.8249258  -5.0000000  18.7134503  -2.4630542   6.0606061  12.3809524
[127]  16.1016949   2.0072993 -17.1735242 -12.0950324 -11.0565111  11.8784530
[133]   2.9629630  -6.2350120   7.1611253  10.0238663   2.3861171  13.3474576
[139]  16.2616822  -2.5723473 -16.1716172  -9.2519685 -15.4013015  10.7692308
attr(,"vector_attributes")
attr(,"vector_attributes")$tsp
[1] 1949.000 1960.917   12.000

attr(,"vector_attributes")$class
[1] "ts"

attr(,"name")
[1] "AirPassengers"
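
The shape of that output can be reproduced with a short base R sketch. growth_rate_sketch() is a hypothetical name and this is not the package's implementation; it only assumes a period-over-period percentage change, which is consistent with the first values printed above, and it attaches the same two attributes.

```r
# Hypothetical sketch: percentage growth rate that keeps the input's
# attributes and name on the result.
growth_rate_sketch <- function(x) {
  x_name  <- deparse(substitute(x))  # name of the object passed in
  x_attrs <- attributes(x)           # for a ts object: tsp and class

  v  <- as.numeric(x)
  gr <- (v / c(NA, head(v, -1)) - 1) * 100  # (x[t] / x[t-1] - 1) * 100

  attr(gr, "vector_attributes") <- x_attrs
  attr(gr, "name") <- x_name
  gr
}

gr <- growth_rate_sketch(AirPassengers)
head(gr, 3)
attr(gr, "name")
attr(gr, "vector_attributes")$tsp
```

Keeping the original tsp and class as attributes means the time index of the input can be recovered later, even though the result is a plain numeric vector.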