Forecasting benchmark: Dynrmf (a new serious competitor in town) vs Theta Method on M-Competitions and Tourism competitition

[This article was first published on T. Moudiki's Webpage - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the world of time series forecasting, benchmarking methods against established competitions is crucial. Today, I’m sharing results from a comprehensive comparison between two forecasting approaches: the classical Theta (notoriously difficult to beat on the chosen benchmark datasets) method and the newer Dynrmf (from R and Python packages ahead) algorithm, tested across three major forecasting competitions.

The Setup

I evaluated both methods on:

  • M3 Competition: 3,003 diverse time series
  • M1 Competition: The original M-competition dataset (1001 time series)
  • Tourism Competition: Tourism-specific time series data (1311 time series)

For each dataset, I trained both models on historical data and evaluated their forecasts against held-out test sets using standard accuracy metrics (ME, RMSE, MAE, MPE, MAPE, MASE, ACF1). Each model uses its default parameters.

Implementation Details

The analysis used parallel processing with 2 cores to speed up computations across thousands of series. Both methods received identical train/test splits for fair comparison – complete code at the end of the post.

# Example forecasting loop structure
metrics <- foreach(i = 1:n, .combine = rbind)%dopar%{
  series <- dataset[[i]]
  train <- series$x
  test <- series$xx
  fit <- method(y = train, h = length(test))
  forecast::accuracy(fit)
}

Results

It’s worth mentioning that Theta was the clear winner of M3 competition (and it’s been enhanced in many ways), and, of course, it’s always easier to beat the winner a posteriori.

I performed paired t-tests on the difference in metrics (Theta - Dynrmf) across all series in each competition. Here’s what the numbers reveal:

M3 Competition Results

Metric t-statistic p-value Interpretation
ME 16.96 <0.001 Theta significantly more biased
RMSE 10.17 <0.001 Theta significantly worse
MAE 7.65 <0.001 Theta significantly worse
MPE -5.59 <0.001 Dynrmf more biased
MAPE 1.29 0.197 No significant difference
MASE 14.09 <0.001 Theta significantly worse
ACF1 -1.37 0.171 No significant difference

Verdict: Dynrmf demonstrates substantially better accuracy on M3 data for most error metrics.

M1 Competition Results

Metric t-statistic p-value Interpretation
ME 2.56 0.011 Theta more biased
RMSE 2.94 0.003 Theta significantly worse
MAE 2.28 0.023 Theta significantly worse
MPE -0.61 0.540 No significant difference
MAPE 0.38 0.703 No significant difference
MASE -0.74 0.461 No significant difference
ACF1 -6.50 <0.001 Dynrmf residuals better

Verdict: Dynrmf shows moderate but statistically significant improvements, particularly in RMSE and MAE.

Tourism Competition Results

Metric t-statistic p-value Interpretation
ME 1.62 0.105 No significant difference
RMSE 1.88 0.060 Marginal advantage to Dynrmf
MAE 1.16 0.245 No significant difference
MPE 4.52 <0.001 Theta less biased
MAPE -5.03 <0.001 Dynrmf significantly better
MASE -5.17 <0.001 Dynrmf significantly better
ACF1 -8.65 <0.001 Dynrmf residuals better

Verdict: On tourism data, Dynrmf excels at scaled metrics (MAPE, MASE) and produces better-behaved residuals.

Key Takeaways

  1. Dynrmf consistently outperforms Theta on accuracy metrics like RMSE, MAE, and MASE across competitions
  2. The advantage is strongest on M3, suggesting Dynrmf handles diverse series types well
  3. Tourism data shows Dynrmf’s strength in percentage and scaled error metrics
  4. Residual autocorrelation (ACF1) is consistently better with Dynrmf, indicating it captures patterns more completely
  5. Model-agnostic flexibility: It’s important to mention that unlike method-specific approaches, Dynrmf would accept any fitting function (and the default here is automatic Ridge Regression minimizing Generalized Cross-Validation errror) through its fit_func parameter (see here), enabling use of Ridge regression (default), Random Forest, XGBoost, Support Vector Machines, or any other model with standard fit/predict interfaces

Code Availability

The full R implementation is available below. The analysis leverages parallel processing for efficiency on large competition datasets.

library(ahead)
library(Mcomp)
library(Tcomp)
library(foreach)
library(doParallel)
library(doSNOW)

setwd("~/Documents/Papers/to_submit/2026-01-01-dynrmf-vs-Theta-on-M1-M3-Tourism")

# 1. M3 comp. results --------------

data(M3)

n <- length(Mcomp::M3)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                 .combine = rbind, .verbose = TRUE)%dopar%{
  series <- M3[[i]]
  train <- series$x
  test <- series$xx
  fit <- forecast::thetaf(
    y = train,
    h = length(test))
  utils::setTxtProgressBar(pb, i)
  forecast::accuracy(fit)
}
close(pb)

print(metrics_theta)

# Dynrmf -----------


metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- M3[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- ahead::dynrmf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

M3_results <- apply(diff_metrics, 2, t.test)

M3_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(M3_results) <- c("statistic", "p-value")



# 2. M1 comp. results -----------------

data(M1)

n <- length(Mcomp::M1)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- M1[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- forecast::thetaf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_theta)

# Dynrmf -----------

metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                   .combine = rbind, .verbose = TRUE)%dopar%{
                                     series <- M1[[i]]
                                     train <- series$x
                                     test <- series$xx
                                     fit <- ahead::dynrmf(
                                       y = train,
                                       h = length(test))
                                     utils::setTxtProgressBar(pb, i)
                                     forecast::accuracy(fit)
                                   }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

M1_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(M1_results) <- c("statistic", "p-value")


# 2. Tourism comp. results -----------------

data(tourism)

n <- length(Tcomp::tourism)

training_indices <- seq_len(n)

# Theta -----------

metrics_theta <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_theta <- foreach::foreach(i = 1:n, 
                                  .combine = rbind, .verbose = TRUE)%dopar%{
                                    series <- tourism[[i]]
                                    train <- series$x
                                    test <- series$xx
                                    fit <- forecast::thetaf(
                                      y = train,
                                      h = length(test))
                                    utils::setTxtProgressBar(pb, i)
                                    forecast::accuracy(fit)
                                  }
close(pb)

print(metrics_theta)

# Dynrmf -----------

metrics_dynrmf <- rep(NA, n)

pb <- utils::txtProgressBar(max = n, style = 3)

doParallel::registerDoParallel(cl=2)

# looping on all the training set time series
metrics_dynrmf <- foreach::foreach(i = 1:n, 
                                   .combine = rbind, .verbose = TRUE)%dopar%{
                                     series <- tourism[[i]]
                                     train <- series$x
                                     test <- series$xx
                                     fit <- ahead::dynrmf(
                                       y = train,
                                       h = length(test))
                                     utils::setTxtProgressBar(pb, i)
                                     forecast::accuracy(fit)
                                   }
close(pb)

print(metrics_dynrmf)

diff_metrics <- metrics_theta - metrics_dynrmf

diff_metrics <- diff_metrics[is.finite(diff_metrics[, 4]), ]

Tourism_results <- apply(diff_metrics, 2, function(x) {z <- t.test(x); return(c(z$statistic, z$p.value))})
rownames(Tourism_results) <- c("statistic", "p-value")

# # Theta - dynrmf+default
# 
# ```R
# kableExtra::kable(M3_results)
# kableExtra::kable(M1_results)
# kableExtra::kable(Tourism_results)
# 
# |          |       ME|     RMSE|      MAE|       MPE|      MAPE|    MASE|       ACF1|
#   |:---------|--------:|--------:|--------:|---------:|---------:|-------:|----------:|
#   |statistic | 16.96265| 10.16816| 7.652445| -5.586289| 1.2900974| 14.0924| -1.3680305|
#   |p-value   |  0.00000|  0.00000| 0.000000|  0.000000| 0.1971162|  0.0000|  0.1714049|
# 
# |          |        ME|      RMSE|       MAE|        MPE|      MAPE|       MASE|      ACF1|
#   |:---------|---------:|---------:|---------:|----------:|---------:|----------:|---------:|
#   |statistic | 2.5575962| 2.9406984| 2.2815166| -0.6129693| 0.3808239| -0.7378457| -6.497644|
#   |p-value   | 0.0106864| 0.0033502| 0.0227274|  0.5400360| 0.7034148|  0.4607813|  0.000000|
# 
# |          |        ME|      RMSE|       MAE|       MPE|       MAPE|       MASE|      ACF1|
#   |:---------|---------:|---------:|---------:|---------:|----------:|----------:|---------:|
#   |statistic | 1.6239829| 1.8845405| 1.1628106| 4.5207962| -5.0276196| -5.1743478| -8.648504|
#   |p-value   | 0.1046342| 0.0597262| 0.2451306| 0.0000068|  0.0000006|  0.0000003|  0.000000|
#   
# ```

What forecasting methods have you found most reliable in your work? Share your experiences in the comments below.

Please sign and share this petition https://www.change.org/stop_torturing_T_Moudiki – after seriously researching my background and contributions to the field.

image-title-here

To leave a comment for the author, please follow the link and comment on their blog: T. Moudiki's Webpage - R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)