33 search results for "gradient boosted"

Build a Gradient Boosted Trees Model with Microsoft R Server

May 3, 2016

by Yuzhou Song, Microsoft Data Scientist

R is an open source statistical programming language with millions of users in its community. However, a well-known weakness of R is that it is both single-threaded and memory-bound, which limits its ability to process big data. With Microsoft R Server (MRS), the enterprise-grade distribution of R for advanced analytics,...

Gradient Boosting: Analysis of LendingClub’s Data

April 8, 2013

An old 5.75% CD of mine recently matured, and seeing that those interest rates are gone forever, I figured I’d take a statistical look at LendingClub’s data. Lending Club is the first peer-to-peer lending company to register its offerings as securities with the Securities and Exchange Commission (SEC). Their operational statistics are public and available for download. The latest

Generalized Boosted Regression with A Monotonic Marginal Effect for Each Predictor

December 18, 2012

In the practice of risk modeling, it is sometimes mandatory to maintain a monotonic relationship between the response and each predictor. Below is a demonstration showing how to develop a generalized boosted regression with a monotonic marginal effect for each predictor.

[Figure: Plot of Variable Importance]
[Figure: Plot of Monotonic Marginal Effects]

Hyperparameter Optimization in H2O: Grid Search, Random Search and the Future

June 15, 2016

“Good, better, best. Never let it rest. ‘Til your good is better and your better is best.” – St. Jerome

tl;dr: H2O now has random hyperparameter search with time- and metric-based early stopping.

Bergstra and Bengio write on p. 281: Compared with neural networks configured by a pure grid search, we find that random...
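As a rough illustration of random hyperparameter search with time- and metric-based early stopping, here is a minimal, library-free Python sketch. It is not H2O's API: the function `random_search`, its parameters (`max_models`, `max_seconds`, `stopping_rounds`, `tolerance`), and the toy scoring function are all assumptions made up for this example.

```python
import random
import time


def random_search(score_fn, space, max_models=50, max_seconds=5.0,
                  stopping_rounds=5, tolerance=1e-3, seed=0):
    """Sample random configurations until a budget or a stall is hit.

    Hypothetical helper, not part of any library. Stops when:
      * max_models configurations have been tried, or
      * max_seconds of wall-clock time have elapsed (time-based), or
      * the best score has not improved by more than `tolerance`
        for `stopping_rounds` consecutive trials (metric-based).
    """
    rng = random.Random(seed)
    start = time.monotonic()
    best_params, best_score = None, float("-inf")
    rounds_without_gain = 0
    tried = 0
    while tried < max_models and time.monotonic() - start < max_seconds:
        # Draw each hyperparameter independently from its candidate values.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        tried += 1
        if score > best_score + tolerance:
            best_params, best_score = params, score
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= stopping_rounds:
                break
    return best_params, best_score, tried


# Toy stand-in for "train a model and return a validation metric".
def toy_score(params):
    return -(params["max_depth"] - 5) ** 2 - (params["learn_rate"] - 0.1) ** 2


space = {
    "max_depth": [2, 3, 5, 8, 12],
    "learn_rate": [0.01, 0.05, 0.1, 0.3],
}
best_params, best_score, tried = random_search(toy_score, space)
print(best_params, best_score, tried)
```

In a real setting `score_fn` would train a model and return a validation metric, and the stopping thresholds would be tuned to the cost of a single training run.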

Improved vtreat documentation

April 17, 2016

Nina Zumel has donated some time to greatly improve the vtreat R package documentation (now available as pre-rendered HTML here). vtreat is an R data.frame processor/conditioner package that helps prepare real-world data for predictive modeling in a statistically justifiable manner. Even with modern machine learning techniques (random forests, support vector machines, neural nets, gradient boosted …

Data Mining Standard Process across Organizations

October 18, 2015

Recently I came across the term CRISP-DM, a data mining standard. Though this process is not new, I felt every analyst should know about this commonly used, industry-wide process. In this post I will explain the different phases involved in creating a data mining solution. CRISP-DM, an acronym for Cross Industry Standard...

Partial Dependence Plots

December 22, 2014

It can be difficult to understand the functional relations between predictors and an outcome when using black box prediction methods like random forests. One way to investigate these relations is with partial dependence plots. These plots are graphical visualizations of the marginal effect of a given variable (or multiple variables) on an outcome. Typically, these are restricted to only...
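To make the idea of a marginal effect concrete, here is a minimal Python sketch of how a one-variable partial dependence curve can be computed for any prediction function. The helper `partial_dependence`, the stand-in `toy_model`, and the sample rows are hypothetical and not taken from the post: for each grid value, the chosen feature is overwritten in every row and the predictions are averaged.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when column `feature` is forced to each grid value.

    Hypothetical helper: `predict` is any callable taking one row (a list
    of feature values) and returning a number.
    """
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row; keep the rest as observed.
        modified = [row[:feature] + [value] + row[feature + 1:] for row in rows]
        curve.append(mean(predict(r) for r in modified))
    return curve


# Toy stand-in for a black box model such as a random forest.
def toy_model(row):
    return 2.0 * row[0] + row[1] ** 2


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
pd_curve = partial_dependence(toy_model, rows, feature=0, grid=[0.0, 1.0, 2.0])
print(pd_curve)
```

Plotting `pd_curve` against the grid gives the partial dependence plot for that feature. Because the toy model is additive in the first feature, the curve here is a straight line with slope 2.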

Introducing H2O Lagrange (2.6.0.11) to R

September 1, 2014

From my perspective the most important event that happened at useR! 2014 was that I got to meet the 0xdata team and now, long story short, here I am introducing the latest version of H2O, labeled Lagrange (2.6.0.11), to the R and greater data science communities. Before joining 0xdata, I was working at a competitor on a rival project and was repeatedly asked...

Natural language processing tutorial

June 25, 2013

Introduction: This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. The goal is to provide a reasonable baseline on top of which more complex natural language processing can be...
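The first two steps named above, tokenization and feature extraction, might look roughly like the following standard-library Python sketch. It is not the tutorial's code: the functions `tokenize` and `bag_of_words` and the two sample sentences are assumptions for illustration, and the features are plain word counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # One column per vocabulary word, in vocabulary order.
        vectors.append([counts[tok] for tok in vocabulary])
    return vocabulary, vectors


docs = ["The cat sat.", "The dog sat on the cat."]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

The resulting count vectors are the kind of numeric input a machine learning algorithm could then be trained on.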

Edge Prediction in a Social Graph: My Solution to Facebook’s User Recommendation Contest on Kaggle

July 31, 2012

A couple weeks ago, Facebook launched a link prediction contest on Kaggle, with the goal of recommending missing edges in a social graph. I love investigating social networks, so I dug around a little, and since I did well enough to score one of the coveted prizes, I’ll share my approach here. (For some background, the contest provided...
