
Tuning

< section id="goal" class="level1">

Goal

After this exercise, you should be able to define search spaces for learning algorithms and apply different hyperparameter (HP) optimization (HPO) techniques to search such a space for a well-performing hyperparameter configuration (HPC).

< section id="exercises" class="level1">

Exercises

Again, we are looking at the german_credit data set and the corresponding task (you can quickly load the task with tsk("german_credit")). We want to train a k-NN model but ask ourselves what the best choice of k might be. Furthermore, we are not sure how to set other HPs of the learner, e.g., whether we should scale the data or not. In this exercise, we conduct HPO for k-NN to automatically find a good HPC.

library(mlr3verse)
task = tsk("german_credit")
Recap: k-NN

k-NN is a machine learning method that predicts new data by averaging over the responses of the k nearest neighbors.

Parameter spaces

Define a meaningful search space for the HPs k and scale. You can check out the help page lrn("classif.kknn")$help() for an overview of the k-NN learner.

Hint 1

Each learner has a slot param_set that contains all HPs that can be used for tuning. In this use case, we tune a learner with the key "classif.kknn". The search space is defined with ps(), using p_int(), p_dbl(), p_fct(), or p_lgl() for the individual HPs.

Hint 2
library(mlr3tuning)

search_space = ps(
  k = p_int(...),
  scale = ...
)
Solution
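A possible solution (a sketch; the concrete bounds are a judgment call, here k between 1 and 100 and scale as a logical switch, matching the values used in the hints of the next exercise):

library(mlr3verse)  # attaches paradox, which provides ps(), p_int(), p_lgl()

search_space = ps(
  # number of neighbors considered for the prediction
  k = p_int(lower = 1, upper = 100),
  # whether features are scaled before computing distances
  scale = p_lgl()
)
search_space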
< section id="hyperparameter-optimization" class="level2">

Hyperparameter optimization

Now, we want to tune the k-NN model with the search space from the previous exercise. As resampling strategy, we use 3-fold cross-validation. The tuning strategy should be random search. As termination criterion, we choose 40 evaluations.

Hint 1

The elements required for the tuning are the task and the learner to be tuned, the search space from the previous exercise, a resampling strategy, a terminator that stops the tuning after 40 evaluations, and a tuner implementing random search.

Hint 2

The optimization algorithm is obtained from tnr() with the corresponding key as argument. Furthermore, we allow parallel computation using four cores:

library(mlr3)
library(mlr3learners)
library(mlr3tuning)

future::plan("multicore", workers = 4L)

task = tsk(...)
lrn_knn = lrn(...)

search_space = ps(
  k = p_int(1, 100),
  scale = p_lgl()
)
resampling = rsmp(...)

terminator = trm(..., ... = 40L)

instance = ti(
  task = ...,
  learner = ...,
  resampling = ...,
  terminator = ...,
  search_space = ...
)

optimizer = tnr(...)
optimizer$...(...)
Finally, the optimization is started by passing the tuning instance to the $optimize() method of the tuner.
Solution
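A sketch of how the template above could be filled in, assuming the standard mlr3 keys "cv", "evals", and "random_search" for the resampling, terminator, and tuner:

library(mlr3verse)  # attaches mlr3, mlr3learners, mlr3tuning, paradox

# allow parallel computation on four cores
future::plan("multicore", workers = 4L)

task = tsk("german_credit")
lrn_knn = lrn("classif.kknn")

search_space = ps(
  k = p_int(1, 100),
  scale = p_lgl()
)

# 3-fold cross-validation to estimate the performance of each HPC
resampling = rsmp("cv", folds = 3L)

# stop the tuning after 40 evaluated configurations
terminator = trm("evals", n_evals = 40L)

instance = ti(
  task = task,
  learner = lrn_knn,
  resampling = resampling,
  terminator = terminator,
  search_space = search_space
)

# random search as the tuning strategy
optimizer = tnr("random_search")
optimizer$optimize(instance)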
< section id="analyzing-the-tuning-archive" class="level2">

Analyzing the tuning archive

Inspect the archive of hyperparameters evaluated during the tuning process with instance$archive. Create a simple plot with the goal of illustrating the association between the hyperparameter k and the estimated classification error.

Solution
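One way this could look, assuming the default classification error measure (stored as the classif.ce column in the archive):

library(data.table)
library(ggplot2)

# the archive holds one row per evaluated HPC
arx = as.data.table(instance$archive)

# association between k and the estimated classification error
ggplot(arx, aes(x = k, y = classif.ce)) +
  geom_point() +
  labs(x = "k (number of neighbors)", y = "Classification error")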
< section id="visualizing-hyperparameters" class="level2">

Visualizing hyperparameters

To see how effective the tuning was, it is useful to look at the effect of the HPs on the performance. It also helps us to understand how important different HPs are. Therefore, access the archive of the tuning instance and visualize the effect.

Hint 1

Access the archive of the tuning instance to get all information about the tuning. You can use all known plotting techniques after transforming it into a data.table.

Hint 2
arx = as...(instance$...)

library(ggplot2)
library(patchwork)

gg_k = ggplot(..., aes(...)) + ...()
gg_scale = ggplot(..., aes(...)) + ...()

gg_k + gg_scale & theme(legend.position = "bottom")
Solution
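A sketch following the template above (again assuming classif.ce as the performance column; a scatter plot for the numeric k and a boxplot for the logical scale):

library(data.table)
library(ggplot2)
library(patchwork)

arx = as.data.table(instance$archive)

# effect of the number of neighbors on the classification error,
# colored by whether the data was scaled
gg_k = ggplot(arx, aes(x = k, y = classif.ce, color = scale)) + geom_point()

# effect of scaling on the classification error
gg_scale = ggplot(arx, aes(x = factor(scale), y = classif.ce)) + geom_boxplot()

gg_k + gg_scale & theme(legend.position = "bottom")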
< section id="hyperparameter-dependencies" class="level2">

Hyperparameter dependencies

When defining a hyperparameter search space via the ps() function, we sometimes encounter nested search spaces, also called hyperparameter dependencies. One example of this is the SVM. Here, the hyperparameter degree is only relevant if the hyperparameter kernel is set to "polynomial". Therefore, we only have to consider different configurations for degree if we evaluate candidate configurations with a polynomial kernel. Construct a search space for an SVM with hyperparameters kernel (candidates should be "polynomial" and "radial") and degree (integer ranging from 1 to 3, but only for polynomial kernels), and account for the dependency structure.

Hint 1

In the p_fct(), p_dbl(), … functions, we specify this using the depends argument, which takes an expression of the form <param> == <value> or <param> %in% <vector>.
Solution
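A sketch of such a search space; the depends argument encodes that degree is only active when kernel is "polynomial":

search_space = ps(
  kernel = p_fct(c("polynomial", "radial")),
  # degree only takes effect for the polynomial kernel
  degree = p_int(lower = 1, upper = 3, depends = kernel == "polynomial")
)
search_space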
< section id="hyperparameter-transformations" class="level2">

Hyperparameter transformations

When tuning non-negative hyperparameters with a broad range, using a logarithmic scale can be more efficient. This approach works especially well if we want to test many small values, but also a few very large ones. By selecting values on a logarithmic scale and then exponentiating them, we concentrate the exploration on smaller values while still considering very large ones, allowing for a targeted and efficient search for good hyperparameter configurations.

A simple way to do this is to pass logscale = TRUE when using to_tune() to define the parameter search space while constructing the learner:

lrn = lrn("classif.svm", cost = to_tune(1e-5, 1e5, logscale = TRUE))
lrn$param_set$search_space()
<ParamSet(1)>
       id    class     lower    upper nlevels        default  value
   <char>   <char>     <num>    <num>   <num>         <list> <list>
1:   cost ParamDbl -11.51293 11.51293     Inf <NoDefault[0]> [NULL]
Trafo is set.

To manually create the same transformation, we can pass the transformation to the more general trafo argument in p_dbl() and related functions and set the bounds using the log() function. For the following search space, implement a logarithmic transformation. The output should look exactly like the search space above.

# Change this to a log trafo:
ps(cost = p_dbl(1e-5, 1e5))
Solution
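One way to implement this manually: set the bounds on the log scale and exponentiate inside the trafo, which reproduces the search space shown above:

ps(
  # bounds are given on the log scale; the trafo maps sampled values back
  cost = p_dbl(log(1e-5), log(1e5), trafo = function(x) exp(x))
)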
< section id="summary" class="level1">

Summary

< section id="further-information" class="level1">

Further information

Other (more advanced) tuning algorithms:
