Predicting optimal number of iterations and completion time for GBM

November 20, 2013

(This article was first published on Heuristic Andrew » r-project, and kindly contributed to R-bloggers)

When choosing the hyperparameters for Generalized Boosted Regression Models, two important choices are shrinkage and the number of trees. Generally, a smaller shrinkage with more trees produces a better model, but the modeling time increases significantly. Building a model with too many trees that are heavily cut back by cross validation wastes time, while building a model with too few trees may require starting over with a larger number of trees, which also wastes time. So here I present a simple way to estimate the optimal number of trees and the modeling time for GBM as implemented in the R package gbm.
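To make the idea concrete, here is a minimal sketch of one way such an estimate could work. It is not necessarily the method from the full post. It rests on two loudly stated assumptions: that the optimal number of trees scales roughly inversely with shrinkage, and that fitting time grows roughly linearly with the number of trees. The function `estimate_gbm_run()` and all of its argument names are hypothetical. In practice the pilot inputs might come from a quick fit with a large shrinkage, using `gbm.perf()` for the best iteration and `system.time()` for the elapsed seconds.

```r
# Hypothetical helper: extrapolate from a quick pilot fit to a slower target fit.
# Base R only; the pilot numbers are supplied by the caller.
estimate_gbm_run <- function(pilot_shrinkage, pilot_best_iter, pilot_n_trees,
                             pilot_seconds, target_shrinkage, safety = 1.2) {
  stopifnot(pilot_shrinkage > 0, target_shrinkage > 0,
            pilot_n_trees > 0, safety >= 1)

  # Assumption: best iteration is roughly inversely proportional to shrinkage.
  target_best_iter <- round(pilot_best_iter * pilot_shrinkage / target_shrinkage)

  # Pad the tree count so cross validation has room to find the minimum.
  target_n_trees <- ceiling(target_best_iter * safety)

  # Assumption: fitting time is roughly linear in the number of trees.
  seconds_per_tree <- pilot_seconds / pilot_n_trees

  list(best_iter   = target_best_iter,
       n_trees     = target_n_trees,
       est_seconds = target_n_trees * seconds_per_tree)
}

# Example with made-up pilot numbers: a 500-tree pilot at shrinkage 0.1 took
# 20 seconds and its error bottomed out at iteration 150.
est <- estimate_gbm_run(pilot_shrinkage = 0.1, pilot_best_iter = 150,
                        pilot_n_trees = 500, pilot_seconds = 20,
                        target_shrinkage = 0.01)
print(est)
```

The safety factor is a judgment call: a larger value lowers the risk of starting over with more trees, at the cost of a longer run. Both scaling assumptions are approximations, so treat the result as a planning figure rather than a guarantee.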


