We are happy to announce that a new extension package has joined the CRAN family of mlr3 packages. mlr3spatiotempcv was in the works for more than a year and adds spatiotemporal resampling methods to the mlr3 ecosystem.
Such dedicated resampling methods make it possible to retrieve biased-reduced performance estimates in cross-validation scenarios when working with spatial, temporal or spatiotemporal datasets. mlr3spatiotempcv does not implement new methods but rather attempts to collect existing methods.
So far, applying such methods in both R and the mlr ecosystem was not particular easy since they were spread across various R packages. Usually every R package uses a slightly different syntax for the required objects and the returned results. This not only leads to an inconvenient single use experience but is also unpractical when working in an overarching ecosystem such as mlr3.
We hope that with the release of this package users are now able to seamlessly work with spatiotemporal data in mlr3. Please file issues and suggestions in the issues pane of the package.
For a quick and rather technial introduction please see the “Get Started” vignette. For more detailed information and a detailed walk-through, see the “Spatiotemporal Analysis” section in the mlr3book.
To finish with something visual, a simple example which showcases the visualization capabilities of mlr3spatiotempcv for different partitioning methods (random (non-spatial) partitioning (Fig.1) vs. k-means based partitioning (spatial) (Fig. 2)):
library("mlr3") library("mlr3spatiotempcv") set.seed(42) # be less verbose lgr::get_logger("bbotk")$set_threshold("warn") lgr::get_logger("mlr3")$set_threshold("warn") task = tsk("ecuador") learner = lrn("classif.rpart", maxdepth = 3, predict_type = "prob") resampling_nsp = rsmp("repeated_cv", folds = 4, repeats = 2) learner = lrn("classif.rpart", maxdepth = 3, predict_type = "prob") resampling_sp = rsmp("repeated_spcv_coords", folds = 4, repeats = 2) autoplot(resampling_nsp, task, fold_id = c(1:4), crs = 4326) * ggplot2::scale_y_continuous(breaks = seq(-3.97, -4, -0.01)) * ggplot2::scale_x_continuous(breaks = seq(-79.06, -79.08, -0.02))
autoplot(resampling_sp, task, fold_id = c(1:4), crs = 4326) * ggplot2::scale_y_continuous(breaks = seq(-3.97, -4, -0.01)) * ggplot2::scale_x_continuous(breaks = seq(-79.06, -79.08, -0.02))