Benchmarking

April 13, 2025 | Giuseppe Casalicchio

Goal: We will go beyond resampling single learners. We will learn how to compare a large number of different models using benchmarking. In this exercise, we will not show you how to tune a learner. Instead, we will compare identical learners with...

Train Predict Evaluate Basics

April 13, 2025 | Giuseppe Casalicchio

Goal: Our goal for this exercise sheet is to learn the basics of mlr3 for supervised learning by training a first simple model on training data and by evaluating its performance on hold-out/test data. German Credit Dataset: The German credit dat...

Resampling Solution

April 13, 2025 | Giuseppe Casalicchio

Goal: You will learn how to estimate model performance with mlr3 using resampling techniques such as 5-fold cross-validation. Additionally, you will compare a k-NN model against a logistic regression model. German Credit Data: We work with the...

Tree Methods

April 13, 2025 | Giuseppe Casalicchio

Goal: The goal for this exercise is to familiarize yourself with two very important machine learning methods, the decision tree and random forest. After this exercise, you should be able to train these models and extract important information to ...

Resampling

April 13, 2025 | Giuseppe Casalicchio

Goal: You will learn how to estimate model performance with mlr3 using resampling techniques such as 5-fold cross-validation. Additionally, you will compare a k-NN model against a logistic regression model. German Credit Data: We work with the...

R Version 4.5.0 is Out!

April 13, 2025 | Blog on Credibly Curious

[Photo: Some windows. Olympus XA, Portra 800. Photo by Nicholas Tierney]
The new R version 4.5.0 is out, and you should get it! I’ve read through the NEWS file, which details every change - there are many! I would recommend having a skim. If you’...

Mastering Data Preprocessing in R with the `recipes` Package

April 13, 2025 | Nick Han

Data preprocessing is a critical step in any machine learning workflow. It ensures that your data is clean, consistent, and ready for modeling. In R, the recipes package provides a powerful and flexible framework for defining and applying preprocessing steps. In this blog post, we’ll explore how to use ...

The apply() Family of Functions in R

April 13, 2025 | Nick Han

The apply() family of functions in R is a powerful tool for applying operations to data structures like matrices, data frames, and lists. These functions help you write concise and efficient code by avoiding explicit loops. Here’s what we’ll cover: Introduction: A brief overview of the apply() family ...

Exploring a 3-D Synthetic Dataset

April 11, 2025 | John Russell

Exploring the HistData package: Over on BlueSky, I have been working through a few challenges. For the months of February and March, I participated in the DuBois Challenge, where you take a week to recreate some of the powerful visualizations that came out of the Paris Exposition from W.E....

[R] data.table’s frank()

April 11, 2025 | Zhenguo Zhang

One can use data.table::frank() to rank the rows of a data.table or simply a vector. Comp...