mlr 2.10

Posted on February 12, 2017 by Janek Thomas

[This article was first published on mlr-org, and kindly contributed to R-bloggers.]

mlr 2.10 is now on CRAN. Please update your package if you haven’t done so in a while.

Here is an overview of the changes:

functions – general

fixed bug in resample when using predict = “train” (issue #1284)
update to irace 2.0 – there are algorithmic changes in irace that may affect performance
generateFilterValuesData: fixed a bug wrt feature ordering
imputeLearner: fixed a bug when data actually contained no NAs
print.Learner: fixed a bug where a learner hyperparameter set to the value “NA” was not displayed by the printer
makeLearner, setHyperPars: if you mistype a learner or hyperpar name, mlr uses fuzzy matching to suggest the 3 closest names in the message
tuneParams: tuning with irace is now also parallelized, i.e., different learner configs are evaluated in parallel.
benchmark: minor fix, the ‘learners’ argument now also accepts learner class strings
object printers: some mlr printers show head previews of data.frames; these now also print the total number of rows and columns, which makes them less confusing
aggregations: now have properties that state whether they require training set or test set evaluations
the filter methods have better R docs
filter randomForestSRC.var.select: new arg “method”
filter mrmr: fixed some smaller bugs and updated properties
generateLearningCurveData: also accepts a single learner and no longer requires a list
plotThreshVsPerf: added “measures” arg
plotPartialDependence: can create tile plots with joint partial dependence on two features for multiclass classification by faceting across the classes
generatePartialDependenceData and generateFunctionalANOVAData: expanded “fun” argument to allow for calculation of weights
new “?mlrFamilies” manual page, which lists all families and the functions belonging to each
we are converging on data.table as the internal standard; this should not change any API behavior on the outside
generateHyperParsEffectData and plotHyperParsEffect now support more than 2 hyperparameters
linear.correlation, rank.correlation, anova.test: now use Rfast instead of FSelector or a custom implementation, so performance should be much better
AUC calculation now uses our own colAUC function instead of the ROCR package to improve performance
we output resample performance messages for every iteration now
performance improvements for the auc measure
createDummyFeatures supports vectors now
removed the pretty.names argument from plotHyperParsEffect – labels can be set through normal ggplot2 functions on the returned object
Fixed a serious bug in resample affecting the “runtime” slot of a ResampleResult: when R measured the elapsed time in a unit other than seconds (e.g. minutes), mlr still reported the value as seconds.
New “dummy” learners (that disregard features completely) can be fitted now for baseline comparisons, see “featureless” learners below.
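
The runtime fix in resample comes down to how R reports elapsed time. Here is a minimal base-R sketch of the pitfall. It is not mlr's actual code, and the timestamps and variable names are made up for illustration. Subtracting two POSIXct values gives a difftime whose unit R picks automatically, so the bare number only means something together with its unit.

```r
# Hypothetical start and end times, three minutes apart.
start <- as.POSIXct("2017-02-12 10:00:00", tz = "UTC")
end   <- as.POSIXct("2017-02-12 10:03:00", tz = "UTC")

# R chooses the unit automatically; for a three-minute gap it picks "mins".
elapsed <- end - start
print(units(elapsed))

# Dropping the unit yields 3, which is wrong if it is read as seconds.
naive_runtime <- as.numeric(elapsed)

# Requesting the unit explicitly always yields seconds.
runtime_secs <- as.numeric(elapsed, units = "secs")
print(c(naive = naive_runtime, secs = runtime_secs))
```

Asking for an explicit unit when converting a difftime to a number avoids this class of bug regardless of how long the measured code runs.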

functions – new

filter: randomForest.importance
generateFeatureImportanceData: permutation-based feature importance and local importance
getFeatureImportanceLearner: new Learner API function
getFeatureImportance: top level function to extract feature importance information
calculateROCMeasures
calculateConfusionMatrix: new confusion-matrix-like function that calculates and tabulates many receiver operating characteristic (ROC) measures
makeLearners: create multiple learners at once
getLearnerId, getLearnerType, getLearnerPredictType, getLearnerPackages
getLearnerParamSet, getLearnerParVals
getRRPredictionList
addRRMeasure
plotResiduals
getLearnerShortName
mergeBenchmarkResults
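
A short sketch of how some of the new helper functions listed above fit together. It assumes mlr 2.10 or later, plus the rpart and MASS packages, are installed; the two learners are only an example.

```r
library(mlr)

# Create two learners in one call; the result is a list of Learner objects.
lrns <- makeLearners(c("classif.rpart", "classif.lda"))

# Query basic metadata with the new getter functions.
for (lrn in lrns) {
  cat(
    "id:", getLearnerId(lrn),
    "| short name:", getLearnerShortName(lrn),
    "| type:", getLearnerType(lrn),
    "| predict type:", getLearnerPredictType(lrn),
    "| packages:", paste(getLearnerPackages(lrn), collapse = ", "),
    "\n"
  )
}

# Hyperparameter values currently set on the first learner.
print(getLearnerParVals(lrns[[1]]))
```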

functions – renamed

Renamed rf.importance filter (now deprecated) to randomForestSRC.var.rfsrc
Renamed rf.min.depth filter (now deprecated) to randomForestSRC.var.select
Renamed getConfMatrix (now deprecated) to calculateConfusionMatrix
Renamed setId (now deprecated) to setLearnerId
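
For code that still uses the deprecated names, the switch might look like the following sketch. It assumes mlr 2.10 or later with rpart installed; the learner, the id "my.tree", and the built-in iris.task are only illustrative choices.

```r
library(mlr)

# setLearnerId replaces the deprecated setId.
lrn <- setLearnerId(makeLearner("classif.rpart"), "my.tree")

# Train and predict on the same task, purely for demonstration.
mod <- train(lrn, iris.task)
pred <- predict(mod, task = iris.task)

# calculateConfusionMatrix replaces the deprecated getConfMatrix.
cm <- calculateConfusionMatrix(pred)
print(cm)
```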

functions – removed

mergeBenchmarkResultLearner, mergeBenchmarkResultTask

learners – general

classif.ada: fixed some param problem with rpart.control params
classif.cforest, regr.cforest, surv.cforest: removed parameters “minprob”, “pvalue”, “randomsplits” as these are set internally and cannot be changed by the user
regr.GPfit: some more params for correlation kernel
classif.xgboost, regr.xgboost: can now properly handle NAs (property was missing and other problems), added “colsample_bylevel” parameter
adapted {classif,regr,surv}.ranger parameters for new ranger version

learners – new

multilabel.cforest
surv.gbm
regr.cvglmnet
{classif,regr,surv}.gamboost
classif.earth
{classif,regr}.evtree

learners – removed

classif.randomForestSRCSyn, regr.randomForestSRCSyn: due to continued stability issues

measures – new

ssr, qsr, lsr
rrse, rae, mape
kappa, wkappa
msle, rmsle
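
As a rough guide to several of the new regression measures, here is a base-R sketch using the usual textbook definitions. These are assumptions about what the abbreviations stand for (mean absolute percentage error, relative absolute error, root relative squared error, and (root) mean squared logarithmic error), not mlr's own implementations, which may handle edge cases differently. The function and variable names are made up for illustration.

```r
# Textbook definitions (assumed, not copied from mlr).
mape_fn <- function(truth, response) {
  mean(abs((truth - response) / truth))
}
rae_fn <- function(truth, response) {
  sum(abs(truth - response)) / sum(abs(truth - mean(truth)))
}
rrse_fn <- function(truth, response) {
  sqrt(sum((truth - response)^2) / sum((truth - mean(truth))^2))
}
msle_fn <- function(truth, response) {
  mean((log1p(response) - log1p(truth))^2)
}
rmsle_fn <- function(truth, response) {
  sqrt(msle_fn(truth, response))
}

# Tiny made-up example data.
truth    <- c(2, 4, 6)
response <- c(1, 5, 6)

print(c(
  mape  = mape_fn(truth, response),
  rae   = rae_fn(truth, response),
  rrse  = rrse_fn(truth, response),
  msle  = msle_fn(truth, response),
  rmsle = rmsle_fn(truth, response)
))
```

In mlr itself these are available as measure objects and would normally be passed to functions such as performance or resample rather than reimplemented.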
