Creating and Predicting Fast Regression Parsnip Models with {tidyAML}

This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers.

Introduction

I am almost ready for a first release of my R package {tidyAML}. Its purpose is to quickly generate models using the parsnip package while keeping everything inside the tidymodels framework, so that users can create models in tidyAML and then pluck them out and move them over to tidymodels should they prefer. I believe that software should be interchangeable and work well with other libraries. Today I am going to showcase how the function fast_regression() works.

Function

Let’s take a look at the function.

fast_regression(
  .data,
  .rec_obj,
  .parsnip_fns = "all",
  .parsnip_eng = "all",
  .split_type = "initial_split",
  .split_args = NULL
)

Here are the arguments to the function:

  • .data – The data being passed to the function for the regression problem.
  • .rec_obj – The recipe object being passed.
  • .parsnip_fns – The parsnip model functions to use, for example ‘linear_reg’. The default is ‘all’, which will create all possible regression model specifications supported.
  • .parsnip_eng – The parsnip engines to use, for example ‘lm’ or ‘glm’. The default is ‘all’, which places no restriction on the engine.
  • .split_type – The default is ‘initial_split’; you can pass any type of split supported by rsample.
  • .split_args – The default is NULL, in which case the default parameters of the chosen rsample split type are used.
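
To make these arguments concrete, here is a rough sketch of the plain tidymodels steps that a single model corresponds to: a split, a recipe, a model specification with an engine, a workflow, a fit and a set of predictions. It is only an illustration written directly with rsample, recipes, parsnip and workflows, not the internal code of fast_regression(). The seed and the object names are arbitrary choices, and predicting on the training rows is an assumption made to keep the sketch short.

```r
library(rsample)
library(recipes)
library(parsnip)
library(workflows)

set.seed(123)  # arbitrary seed, only so the split is reproducible

# Counterpart of .split_type = "initial_split" with its default arguments
splits_obj <- initial_split(mtcars)
train_tbl  <- training(splits_obj)

# Counterpart of .rec_obj: the recipe naming the outcome and predictors
rec_obj <- recipe(mpg ~ ., data = mtcars)

# Counterpart of .parsnip_fns = "linear_reg" and .parsnip_eng = "lm"
model_spec <- set_engine(linear_reg(), "lm")

# Bundle recipe and model into a workflow, fit it, then predict
wflw        <- add_model(add_recipe(workflow(), rec_obj), model_spec)
fitted_wflw <- fit(wflw, data = train_tbl)
pred_wflw   <- predict(fitted_wflw, new_data = train_tbl)

nrow(pred_wflw)
```

The default initial_split() keeps three quarters of the rows for training, so with the 32 rows of mtcars the prediction tibble here has 24 rows.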

Example

Let’s take a look at an example.

library(tidyAML)
library(dplyr)
library(recipes)
library(purrr)

rec_obj <- recipe(mpg ~ ., data = mtcars)
fast_reg_tbl <- fast_regression(
  .data = mtcars,
  .rec_obj = rec_obj,
  .parsnip_eng = c("lm","glm"),
  .parsnip_fns = "linear_reg"
)

glimpse(fast_reg_tbl)
Rows: 2
Columns: 8
$ .model_id       <int> 1, 2
$ .parsnip_engine <chr> "lm", "glm"
$ .parsnip_mode   <chr> "regression", "regression"
$ .parsnip_fns    <chr> "linear_reg", "linear_reg"
$ model_spec      <list> [~NULL, ~NULL, NULL, regression, TRUE, NULL, lm, TRUE]…
$ wflw            <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ fitted_wflw     <list> [cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb, mp…
$ pred_wflw       <list> [<tbl_df[24 x 1]>], [<tbl_df[24 x 1]>]

Let’s take a look at the model spec.

fast_reg_tbl %>% slice(1) %>% pull(model_spec) %>% pluck(1)
Linear Regression Model Specification (regression)

Computational engine: lm 

Now the wflw column.

fast_reg_tbl %>% slice(1) %>% pull(wflw) %>% pluck(1)
══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm 

Next, the fitted workflow.

fast_reg_tbl %>% slice(1) %>% pull(fitted_wflw) %>% pluck(1)
══ Workflow [trained] ══════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
0 Recipe Steps

── Model ───────────────────────────────────────────────────────────────────────

Call:
stats::lm(formula = ..y ~ ., data = data)

Coefficients:
(Intercept)          cyl         disp           hp         drat           wt  
 -15.077267     1.107474     0.001161    -0.001014     4.010199    -1.280324  
       qsec           vs           am         gear         carb  
   0.512318    -0.488014     2.430052     4.353568    -2.546043  

And lastly the predicted workflow column.

fast_reg_tbl %>% slice(1) %>% pull(pred_wflw) %>% pluck(1)
# A tibble: 24 × 1
   .pred
   <dbl>
 1  24.7
 2  28.2
 3  18.9
 4  12.0
 5  14.8
 6  15.4
 7  14.7
 8  20.0
 9  11.2
10  19.1
# … with 14 more rows
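
Since each element of fitted_wflw prints as an ordinary trained workflow, it can be handed straight to the usual tidymodels functions. Here is a small sketch of that hand-off, assuming the workflows and broom packages are installed (neither is loaded in the example above). It rebuilds fast_reg_tbl so that it runs on its own, and filtering on .parsnip_engine is just one possible way to pick a model.

```r
library(tidyAML)
library(dplyr)
library(recipes)
library(purrr)
library(workflows)  # assumption: attached for extract_fit_parsnip() and predict()
library(broom)      # assumption: used only for tidy()

rec_obj <- recipe(mpg ~ ., data = mtcars)
fast_reg_tbl <- fast_regression(
  .data = mtcars,
  .rec_obj = rec_obj,
  .parsnip_eng = c("lm", "glm"),
  .parsnip_fns = "linear_reg"
)

# Pick the model by engine name rather than by row position
glm_wflw <- fast_reg_tbl %>%
  filter(.parsnip_engine == "glm") %>%
  pull(fitted_wflw) %>%
  pluck(1)

# Predict on new rows with the regular predict() method for workflows
new_preds <- predict(glm_wflw, new_data = head(mtcars))

# Pull out the underlying parsnip fit and tidy its coefficients
coef_tbl <- glm_wflw %>%
  extract_fit_parsnip() %>%
  tidy()
```

new_preds holds one .pred value per row of head(mtcars), and coef_tbl has one row per model term.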

Voila!
