`mlS3` — A Unified S3 Machine Learning Interface in R
Overview
Following the R6-based package unifiedml introduced last week, this post presents the mlS3 R package, which provides a unified, consistent S3 interface for training and predicting with a variety of popular machine learning models. Rather than learning the idiosyncratic API of each package (you will still need their docs for model-specific parameters), mlS3 wraps them under a common wrap_* / predict() pattern.
What You’ll Learn
- How to install and load mlS3 (for now, from GitHub)
- How to apply a consistent API across multiple ML algorithms for both classification and regression tasks
Models Covered
| Wrapper | Underlying Package | Task(s) |
|---|---|---|
| `wrap_glmnet()` | glmnet (generalized linear models) | Classification, Regression |
| `wrap_lightgbm()` | lightgbm (gradient boosting) | Classification, Regression |
| `wrap_ranger()` | ranger (random forest) | Classification, Regression |
| `wrap_svm()` | e1071 (support vector machines) | Classification, Regression |
| `wrap_caret()` | caret (200+ models) | Classification, Regression |
Datasets Used
- iris — binary and multiclass classification (setosa/versicolor, then all three species)
- mtcars — regression to predict miles per gallon (mpg)
Key Design Principle
All models follow the same two-step workflow:
mod <- wrap_*(X_train, y_train, ...)        # Train
pred <- predict(mod, newx = X_test, ...)    # Predict
This makes it easy to swap algorithms and compare results without rewriting your pipeline.
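For example, here is a minimal sketch of such a swap, assuming the generic X_train / y_train / X_test / y_test split above, a classification task, and the (X, y) wrapper signature used throughout this post:

# Hypothetical comparison loop: fit each wrapper on the same split
# and collect test-set accuracy under the common interface
wrappers <- list(ranger = wrap_ranger, svm = wrap_svm)
accuracies <- sapply(wrappers, function(wrap) {
  mod <- wrap(X_train, y_train)
  mean(predict(mod, newx = X_test, type = "class") == y_test)
})
print(accuracies)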
Prerequisites
- R with the following packages: remotes, caret, randomForest, ggplot2
- mlS3 installed from GitHub (for now) via remotes::install_github("Techtonique/mlS3")
Code
Install packages
install.packages(c("remotes"))
install.packages(c("caret"))
install.packages(c("randomForest"))
remotes::install_github("Techtonique/mlS3")
Predefined wrappers
# Classification
library(mlS3)
# =============================================================================
# Classification examples (no leakage)
# =============================================================================
set.seed(123)
# --- Binary classification: iris setosa vs versicolor ---
iris_bin <- iris[iris$Species != "virginica", ]
X_bin <- iris_bin[, 1:4]
y_bin <- droplevels(iris_bin$Species)
# Split into train/test
idx_bin <- sample(nrow(X_bin), 0.7 * nrow(X_bin))
X_bin_train <- X_bin[idx_bin, ]
y_bin_train <- y_bin[idx_bin]
X_bin_test <- X_bin[-idx_bin, ]
y_bin_test <- y_bin[-idx_bin]
# glmnet
mod <- wrap_glmnet(X_bin_train, y_bin_train, family = "binomial")
pred_bin_glmnet <- predict(mod, newx = X_bin_test, type = "class")
acc_glmnet <- mean(pred_bin_glmnet == y_bin_test)
cat("Accuracy (glmnet): ", acc_glmnet, "\n")
# --- Multiclass classification: iris all species ---
X_multi <- iris[, 1:4]
y_multi <- iris$Species
# Split into train/test
idx_multi <- sample(nrow(X_multi), 0.7 * nrow(X_multi))
X_multi_train <- X_multi[idx_multi, ]
y_multi_train <- y_multi[idx_multi]
X_multi_test <- X_multi[-idx_multi, ]
y_multi_test <- y_multi[-idx_multi]
# lightgbm
mod <- wrap_lightgbm(X_multi_train, y_multi_train,
params = list(objective = "multiclass",
num_class = 3, verbose = -1),
nrounds = 150)
pred_multi_lightgbm <- predict(mod, newx = X_multi_test, type = "class")
acc_lightgbm <- mean(pred_multi_lightgbm == y_multi_test)
# ranger
mod <- wrap_ranger(X_multi_train, y_multi_train, num.trees = 100L)
pred_multi_ranger <- predict(mod, newx = X_multi_test, type = "class")
acc_ranger <- mean(pred_multi_ranger == y_multi_test)
# svm
mod <- wrap_svm(X_multi_train, y_multi_train, kernel = "radial")
pred_multi_svm <- predict(mod, newx = X_multi_test, type = "class")
acc_svm <- mean(pred_multi_svm == y_multi_test)
cat("Accuracy (lightgbm): ", acc_lightgbm, "\n")
cat("Accuracy (ranger): ", acc_ranger, "\n")
cat("Accuracy (svm): ", acc_svm, "\n")
# Regression
# =============================================================================
# Regression examples (mtcars)
# =============================================================================
X_reg <- mtcars[, -1]
y_reg <- mtcars$mpg
# Split into train/test
set.seed(123)
idx_reg <- sample(nrow(X_reg), 0.7 * nrow(X_reg))
X_reg_train <- X_reg[idx_reg, ]; y_reg_train <- y_reg[idx_reg]
X_reg_test <- X_reg[-idx_reg, ]; y_reg_test <- y_reg[-idx_reg]
# lightgbm
mod <- wrap_lightgbm(X_reg_train, y_reg_train,
params = list(objective = "regression", verbose = -1),
nrounds = 50)
pred_reg_lightgbm <- predict(mod, newx = X_reg_test)
rmse_lightgbm <- sqrt(mean((pred_reg_lightgbm - y_reg_test)^2))
# glmnet
mod <- wrap_glmnet(X_reg_train, y_reg_train, alpha = 0)
pred_reg_glmnet <- predict(mod, newx = X_reg_test)
rmse_glmnet <- sqrt(mean((pred_reg_glmnet - y_reg_test)^2))
# svm
mod <- wrap_svm(X_reg_train, y_reg_train)
pred_reg_svm <- predict(mod, newx = X_reg_test)
rmse_svm <- sqrt(mean((pred_reg_svm - y_reg_test)^2))
# ranger
mod <- wrap_ranger(X_reg_train, y_reg_train, num.trees = 100L)
pred_reg_ranger <- predict(mod, newx = X_reg_test)
rmse_ranger <- sqrt(mean((pred_reg_ranger - y_reg_test)^2))
cat("RMSE (lightgbm): ", rmse_lightgbm, "\n")
cat("RMSE (glmnet): ", rmse_glmnet, "\n")
cat("RMSE (svm): ", rmse_svm, "\n")
cat("RMSE (ranger): ", rmse_ranger, "\n")
Accuracy (glmnet): 1
Accuracy (lightgbm): 0.2444444 # I'm probably doing something wrong here
Accuracy (ranger): 0.9333333
Accuracy (svm): 0.9333333
RMSE (lightgbm): 4.713678
RMSE (glmnet): 2.972557
RMSE (svm): 2.275837
RMSE (ranger): 2.067692
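Because every model follows the same two-step workflow, the RMSEs computed above can be gathered into a single comparison table, for example:

# Collect the regression RMSEs computed above and sort them
rmse_tbl <- data.frame(
  model = c("lightgbm", "glmnet", "svm", "ranger"),
  rmse = c(rmse_lightgbm, rmse_glmnet, rmse_svm, rmse_ranger)
)
print(rmse_tbl[order(rmse_tbl$rmse), ])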
caret wrapper
For this part, you need to install the caret and randomForest packages. The models available through caret (and their tuning parameters) are listed at https://topepo.github.io/caret/available-models.html. A sketch showing how to swap in a different caret method appears after the example below.
library(mlS3)
library(caret)
# ============================================================================
# Regression with mtcars dataset
# ============================================================================
data(mtcars)
# Prepare data
X_reg <- mtcars[, -1] # All except mpg
y_reg <- mtcars$mpg # Target variable
# Split into train/test
set.seed(123)
idx_reg <- sample(nrow(X_reg), 0.7 * nrow(X_reg))
X_reg_train <- X_reg[idx_reg, ]
y_reg_train <- y_reg[idx_reg]
X_reg_test <- X_reg[-idx_reg, ]
y_reg_test <- y_reg[-idx_reg]
# ----------------------------------------------------------------------------
# Example 1: Random Forest with specific parameters
# ----------------------------------------------------------------------------
cat("\n=== Example 1: Random Forest Regression ===\n")
mod_rf <- wrap_caret(X_reg_train, y_reg_train,
method = "rf",
mtry = 3) # Number of variables sampled at each split
print(mod_rf)
# Predictions
pred_rf <- predict(mod_rf, newx = X_reg_test)
rmse_rf <- sqrt(mean((pred_rf - y_reg_test)^2))
r2_rf <- 1 - sum((y_reg_test - pred_rf)^2) / sum((y_reg_test - mean(y_reg_test))^2)
cat("RMSE:", round(rmse_rf, 3), "\n")
cat("R-squared:", round(r2_rf, 3), "\n")
=== Example 1: Random Forest Regression ===
$model
Random Forest
22 samples
10 predictors
No pre-processing
Resampling: None
$task
[1] "regression"
$method
[1] "rf"
$parameters
$parameters$mtry
[1] 3
attr(,"class")
[1] "wrap_caret"
RMSE: 2.007
R-squared: 0.681
library(ggplot2)
df <- data.frame(
pred = pred_rf,
actual = y_reg_test
)
ggplot(df, aes(x = pred, y = actual)) +
geom_point() +
geom_abline(slope = 1, intercept = 0, color = "red") +
theme_minimal() +
labs(x = "Predicted", y = "Actual")
(Figure: predicted vs. actual mpg; points near the red identity line indicate accurate predictions.)
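Since wrap_caret() forwards the method string and tuning parameters to caret (as with mtry above), swapping in another of the 200+ available models should only require changing those arguments. A hedged sketch, assuming caret's "glmnet" method with its alpha/lambda tuning parameters and that the glmnet package is available:

# Hypothetical: same workflow with a different caret method string.
# Assumes wrap_caret() passes tuning parameters through by name, as with
# mtry above, and that the glmnet package is installed.
mod_en <- wrap_caret(X_reg_train, y_reg_train,
                     method = "glmnet",
                     alpha = 0.5, lambda = 0.1)
pred_en <- predict(mod_en, newx = X_reg_test)
cat("RMSE (caret/glmnet):", sqrt(mean((pred_en - y_reg_test)^2)), "\n")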