mlr 2.10

February 12, 2017

(This article was first published on mlr-org, and kindly contributed to R-bloggers)

mlr 2.10 is now on CRAN. Please update your package if you haven’t done so in a while.

Here is an overview of the changes:

functions – general

  • fixed bug in resample when using predict = “train” (issue #1284)
  • update to irace 2.0 – there are algorithmic changes in irace that may affect
  • generateFilterValuesData: fixed a bug wrt feature ordering
  • imputeLearner: fixed a bug when data actually contained no NAs
  • print.Learner: if a learner hyperpar was set to the value “NA”, this was not
    displayed in the printed output
  • makeLearner, setHyperPars: if you mistype a learner or hyperpar name, mlr
    uses fuzzy matching to suggest the 3 closest names in the message
  • tuneParams: tuning with irace is now also parallelized, i.e., different
    learner configs are evaluated in parallel.
  • benchmark: mini fix, arg ‘learners’ now also accepts class strings
  • object printers: some mlr printers show head previews of data.frames;
    these now also print the total number of rows and columns and are less confusing
  • aggregations: have better properties now, they know whether they require training or
    test set evals
  • the filter methods have better R docs
  • filter new arg “method”
  • filter mrmr: fixed some smaller bugs and updated properties
  • generateLearningCurveData: also accepts single learner, does not require a list
  • plotThreshVsPerf: added “measures” arg
  • plotPartialDependence: can create tile plots with joint partial dependence
    on two features for multiclass classification by facetting across the classes
  • generatePartialDependenceData and generateFunctionalANOVAData: expanded
    “fun” argument to allow for calculation of weights
  • new “?mlrFamilies” manual page which lists all families and the functions
    belonging to them
  • we are converging on data.table as a standard internally, this should not
    change any API behavior on the outside, though
  • generateHyperParsEffectData and plotHyperParsEffect now support more than 2
  • linear.correlation, rank.correlation, anova.test: use Rfast instead of
    FSelector/custom implementation now, performance should be much better
  • use of our own colAUC function instead of the ROCR package for AUC calculation
    to improve performance
  • we output resample performance messages for every iteration now
  • performance improvements for the auc measure
  • createDummyFeatures supports vectors now
  • removed the pretty.names argument from plotHyperParsEffect – labels can be set
    through normal ggplot2 functions on the returned object
  • Fixed a bad bug in resample: the slot “runtime” of a ResampleResult was wrong
    when the runtime was measured not in seconds but in e.g. minutes. R then potentially
    reports minutes, but mlr claimed the value was in seconds.
  • New “dummy” learners (that disregard features completely) can be fitted now for baseline comparisons,
    see “featureless” learners below.
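
As a rough illustration of two of the items above (featureless baselines and benchmark accepting class strings), here is a minimal sketch. It assumes that mlr 2.10 or later and rpart are installed and that the baseline learner is registered under the id “classif.featureless”. The choice of iris.task, 3-fold CV and mmce is only for illustration.

```r
library(mlr)

set.seed(1)

# 3-fold cross-validation, chosen arbitrarily for this sketch
rdesc = makeResampleDesc("CV", iters = 3)

# 'learners' is given as class strings instead of Learner objects;
# "classif.featureless" (assumed id) ignores all features and serves as a baseline
bmr = benchmark(
  learners = list("classif.featureless", "classif.rpart"),
  tasks = iris.task,
  resamplings = rdesc,
  measures = mmce,
  show.info = FALSE
)

# Aggregated performance, one row per learner
perf = getBMRAggrPerformances(bmr, as.df = TRUE)
print(perf)
```

A learner that cannot clearly beat the featureless baseline is probably not extracting much signal from the features.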

functions – new

  • filter: randomForest.importance
  • generateFeatureImportanceData: permutation-based feature importance and local
  • getFeatureImportanceLearner: new Learner API function
  • getFeatureImportance: top level function to extract feature importance
  • calculateROCMeasures
  • calculateConfusionMatrix: new confusion-matrix like function that calculates
    and tables many receiver operator measures
  • makeLearners: create multiple learners at once
  • getLearnerId, getLearnerType, getLearnerPredictType, getLearnerPackages
  • getLearnerParamSet, getLearnerParVals
  • getRRPredictionList
  • addRRMeasure
  • plotResiduals
  • getLearnerShortName
  • mergeBenchmarkResults
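
A few of the new functions can be tried together as follows. This is a minimal sketch, assuming mlr 2.10 or later and rpart are installed; the learner ids and the use of iris.task are illustrative choices, not taken from the release notes.

```r
library(mlr)

# Create several learners in one call; the result is a list of Learner objects
lrns = makeLearners(c("classif.rpart", "classif.featureless"))

# Query learner metadata through the new accessor functions
info = data.frame(
  id = vapply(lrns, getLearnerId, character(1)),
  type = vapply(lrns, getLearnerType, character(1)),
  predict.type = vapply(lrns, getLearnerPredictType, character(1)),
  stringsAsFactors = FALSE
)
print(info)

# Train the first learner and tabulate its predictions on the training data
mod = train(lrns[[1]], iris.task)
pred = predict(mod, task = iris.task)
cm = calculateConfusionMatrix(pred)
print(cm)
```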

functions – renamed

  • Renamed rf.importance filter (now deprecated) to randomForestSRC.var.rfsrc
  • Renamed rf.min.depth filter (now deprecated) to
  • Renamed getConfMatrix (now deprecated) to calculateConfusionMatrix
  • Renamed setId (now deprecated) to setLearnerId
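
For code that still calls the deprecated names, the migration should mostly be a matter of swapping the function name. A minimal sketch for setId, assuming mlr 2.10 or later and rpart are installed (the id “my_tree” is made up for this example):

```r
library(mlr)

lrn = makeLearner("classif.rpart")

# Previously: lrn = setId(lrn, "my_tree")   (deprecated)
lrn = setLearnerId(lrn, "my_tree")

new.id = getLearnerId(lrn)
print(new.id)
```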

functions – removed

  • mergeBenchmarkResultLearner, mergeBenchmarkResultTask

learners – general

  • classif.ada: fixed some param problem with rpart.control params
  • classif.cforest, regr.cforest, surv.cforest:
    removed parameters “minprob”, “pvalue”, “randomsplits”
    as these are set internally and cannot be changed by the user
  • regr.GPfit: some more params for correlation kernel
  • classif.xgboost, regr.xgboost: can now properly handle NAs (property was missing and other problems), added “colsample_bylevel” parameter
  • adapted {classif,regr,surv}.ranger parameters for new ranger version

learners – new

  • multilabel.cforest
  • surv.gbm
  • regr.cvglmnet
  • {classif,regr,surv}.gamboost
  • {classif,regr}.evtree

learners – removed

  • classif.randomForestSRCSyn, regr.randomForestSRCSyn: due to continued stability issues

measures – new

  • ssr, qsr, lsr
  • rrse, rae, mape
  • kappa, wkappa
  • msle, rmsle
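
To give an idea of what some of the new regression measures compute, here is a base R sketch using common textbook definitions. These definitions and the *_sketch function names are assumptions made for illustration; they are not copied from mlr, whose implementations may differ in details such as edge-case handling.

```r
# Mean absolute percentage error (undefined if truth contains zeros)
mape_sketch = function(truth, response) {
  mean(abs((truth - response) / truth))
}

# Relative absolute error: absolute error relative to always predicting the mean
rae_sketch = function(truth, response) {
  sum(abs(truth - response)) / sum(abs(truth - mean(truth)))
}

# Root relative squared error: squared-error analogue of rae
rrse_sketch = function(truth, response) {
  sqrt(sum((truth - response)^2) / sum((truth - mean(truth))^2))
}

# Mean squared logarithmic error and its root (requires values > -1)
msle_sketch = function(truth, response) {
  mean((log1p(response) - log1p(truth))^2)
}
rmsle_sketch = function(truth, response) {
  sqrt(msle_sketch(truth, response))
}

# Toy data, made up for this example
truth = c(1, 2, 4)
response = c(1, 3, 3)

print(c(
  mape = mape_sketch(truth, response),
  rae = rae_sketch(truth, response),
  rrse = rrse_sketch(truth, response),
  rmsle = rmsle_sketch(truth, response)
))
```

In mlr itself, the measures would be passed to performance, resample or benchmark via the “measures” argument rather than computed by hand.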
