mlr3spatiotempcv: Initial CRAN release

[This article was first published on r-bloggers on Machine Learning in R, and kindly contributed to R-bloggers.]

We are happy to announce that a new extension package has joined the CRAN family of mlr3 packages. mlr3spatiotempcv was in the works for more than a year and adds spatiotemporal resampling methods to the mlr3 ecosystem.

Such dedicated resampling methods make it possible to obtain bias-reduced performance estimates in cross-validation scenarios when working with spatial, temporal or spatiotemporal datasets. mlr3spatiotempcv does not implement new methods but rather aims to collect existing ones.
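To give a feel for what "spatial" partitioning means, here is a minimal base R sketch of the idea behind coordinate-based k-means partitioning. It is not the package's implementation: the point coordinates are made up, and the helper `mean_dist_to_centre()` is a hypothetical name introduced only for this illustration.

```r
set.seed(42)

# Hypothetical point coordinates, made up for illustration
coords = cbind(x = runif(200), y = runif(200))
k = 4

# Spatial partitioning: cluster the coordinates, one cluster per fold
folds_spatial = kmeans(coords, centers = k, nstart = 10)$cluster

# Non-spatial partitioning: assign folds at random, ignoring location
folds_random = sample(rep_len(seq_len(k), nrow(coords)))

# Hypothetical helper: mean distance of each point to the centre of its fold
mean_dist_to_centre = function(coords, folds) {
  # k x 2 matrix of fold centres, row i belongs to fold i
  centres = apply(coords, 2, function(v) tapply(v, folds, mean))
  mean(sqrt(rowSums((coords - centres[folds, ])^2)))
}

print(mean_dist_to_centre(coords, folds_spatial))
print(mean_dist_to_centre(coords, folds_random))
```

The k-means folds are spatially compact, so test points tend to lie farther from the training points than under random assignment, where every fold is scattered across the whole area.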

Until now, applying such methods in R, and in the mlr ecosystem in particular, was not particularly easy because they were spread across various R packages. Each package typically uses a slightly different syntax for the required objects and the returned results. This is inconvenient even in standalone use and impractical when working in an overarching ecosystem such as mlr3.

We hope that with the release of this package users are now able to work seamlessly with spatiotemporal data in mlr3. Please report issues and suggestions in the package's issue tracker.

For a quick and rather technical introduction, please see the “Get Started” vignette. For more background and a detailed walk-through, see the “Spatiotemporal Analysis” section in the mlr3book.

To finish with something visual, here is a simple example that showcases the visualization capabilities of mlr3spatiotempcv for two partitioning methods: random (non-spatial) partitioning (Fig. 1) and k-means-based spatial partitioning (Fig. 2):

library("mlr3")
library("mlr3spatiotempcv")
set.seed(42)

# be less verbose
lgr::get_logger("bbotk")$set_threshold("warn")
lgr::get_logger("mlr3")$set_threshold("warn")

task = tsk("ecuador")

learner = lrn("classif.rpart", maxdepth = 3, predict_type = "prob")

# non-spatial: random partitioning
resampling_nsp = rsmp("repeated_cv", folds = 4, repeats = 2)
# spatial: k-means clustering of the coordinates
resampling_sp = rsmp("repeated_spcv_coords", folds = 4, repeats = 2)

# Fig. 1: random (non-spatial) partitioning, folds 1 to 4
# `*` (not `+`) applies each scale to every fold panel of the combined plot
autoplot(resampling_nsp, task, fold_id = 1:4, crs = 4326) *
  ggplot2::scale_y_continuous(breaks = seq(-3.97, -4, -0.01)) *
  ggplot2::scale_x_continuous(breaks = seq(-79.06, -79.08, -0.02))

# Fig. 2: k-means based (spatial) partitioning, folds 1 to 4
autoplot(resampling_sp, task, fold_id = 1:4, crs = 4326) *
  ggplot2::scale_y_continuous(breaks = seq(-3.97, -4, -0.01)) *
  ggplot2::scale_x_continuous(breaks = seq(-79.06, -79.08, -0.02))
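The plots only show how the observations are partitioned. As a possible next step, the sketch below runs both resampling strategies with the same learner and compares the aggregated performance. This goes beyond what the post itself shows: the choice of AUC (`classif.auc`) as the measure and the variable names `rr_nsp`, `rr_sp`, `auc_nsp` and `auc_sp` are assumptions made for illustration, and it requires the mlr3, mlr3spatiotempcv and rpart packages to be installed.

```r
library("mlr3")
library("mlr3spatiotempcv")
set.seed(42)

# be less verbose
lgr::get_logger("mlr3")$set_threshold("warn")

task = tsk("ecuador")
learner = lrn("classif.rpart", maxdepth = 3, predict_type = "prob")

# Same learner and task, two different partitioning strategies
rr_nsp = resample(task, learner,
  rsmp("repeated_cv", folds = 4, repeats = 2))
rr_sp = resample(task, learner,
  rsmp("repeated_spcv_coords", folds = 4, repeats = 2))

# Aggregate AUC across all folds and repetitions (measure is an assumption)
auc_nsp = rr_nsp$aggregate(msr("classif.auc"))
auc_sp = rr_sp$aggregate(msr("classif.auc"))

print(c(non_spatial = unname(auc_nsp), spatial = unname(auc_sp)))
```

If spatial autocorrelation is present in the data, the non-spatial estimate is expected to be the more optimistic of the two, which is the bias that the spatial resampling methods aim to reduce.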
