Max Kuhn | R-bloggers

Central Iowa R User Group Talk [Updated]

January 18, 2016 | Max Kuhn

I'll be giving a talk ("Applied Predictive Modeling") to the Central Iowa R User Group on Thursday night from 6:00 PM to 8:00 PM (CST). It looks like it will be broadcast live on YouTube. The link is http://...

[Read more...]

In Search Of…

December 13, 2015 | Max Kuhn

Rafael Ladeira asked on GitHub: I was wondering why it doesn't implement some others algorithms for search for optimal tuning parameters. What would be the caveats of using a genetic algorithm, for instance, instead of grid or random search? Do y...

[Read more...]

C5.0 Class Probability Shrinkage

September 14, 2015 | Max Kuhn

(The image above has nothing to do with this post. It does, however, show the prize that my daughter won during a recent vacation to Virginia and how I got it back home.) I was recently asked to explain a potential disconnect in C5.0 between the class probabilities shown in ...

[Read more...]

Feature Engineering versus Feature Extraction: Game On!

August 3, 2015 | Max Kuhn

"Feature engineering" is a fancy term for making sure that your predictors are encoded in the model in a manner that makes it as easy as possible for the model to achieve good performance. For example, if you have a date field as a predictor and there are larger differences ...

[Read more...]

New caret Version (6.0-52)

July 22, 2015 | Max Kuhn

A new version of caret (6.0-52) is on CRAN. Here is the news file but the Cliff Notes are: sub-sampling for class imbalances is now integrated with train and is used inside of standard resampling. There are four methods available right now: up- and... [Read more...]

Slides from recent talks

April 21, 2015 | Max Kuhn

I've been buried in work lately but thought I'd share the slides from two recent talks. The first is from the Bay Area RUG. Since someone filmed the talks, I was waiting to post the slides. The video of my t... [Read more...]

A Talk and Course in NYC Next Week

February 13, 2015 | Max Kuhn

I'll be giving a talk on Tuesday, February 17 (7:00 PM to 9:00 PM) that will be an overview of predictive modeling. It will not be highly technical, and here is the current outline: a definition of "predictive modeling"; some example applications; a short overview and example; how this is different from what statisticians already do; what ... [Read more...]

Simulated Annealing Feature Selection

January 12, 2015 | Max Kuhn

As previously mentioned, caret has two new feature selection routines based on genetic algorithms (GA) and simulated annealing (SA). The help pages for the two new functions give a detailed account of the options, syntax, etc. The package already has functions to conduct feature selection using simple filters as well ... [Read more...]

Regression Solutions Available

January 8, 2015 | Max Kuhn

The GitHub page for the APM exercises has been updated with three new files for Chapters 6-8 (the section on regression). The classification section is in progress. Here's one of our fancy-pants graphs: [Read more...]

New Version of caret on CRAN

January 5, 2015 | Max Kuhn

A new version of caret is on CRAN. Some recent features/changes: The license was changed to GPL >= 2 to accommodate new code from the GA package. New feature selection functions gafs and safs were adde... [Read more...]

Comparing the Bootstrap and Cross-Validation

December 8, 2014 | Max Kuhn

This is the second of two posts about the performance characteristics of resampling methods. The first post focused on the cross-validation techniques and this post mostly concerns the bootstrap. Recall from the last post: we have some simulations to evaluate the precision and bias of these methods. I simulated some ... [Read more...]

Comparing Different Species of Cross-Validation

December 2, 2014 | Max Kuhn

This is the first of two posts about the performance characteristics of resampling methods. I just had major shoulder surgery, but I've pre-seeded a few blog posts. More will come as I get better at one-handed typing. First, a review: Resampling methods, such as cross-validation (CV) and the bootstrap, can ... [Read more...]

Solutions on github

November 12, 2014 | Max Kuhn

See this page. We're not done with them all, but chapters 3 and 4 are there and the regression chapters are not too far behind. The Rnw files (using knitr LaTeX) are there along with the corresponding pdf files. You may have better solutions than ... [Read more...]

Some Thoughts on “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?”

November 11, 2014 | Max Kuhn

Sorry for the blogging break. I’ve got a few posts planned for the next few weeks based on some work I’ve been doing. In the meantime, you should check out “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” by Manuel Fernandez-Delgado et al. at JMLR. They took ... [Read more...]

useR! 2014 Highlights

July 3, 2014 | Max Kuhn

My talk went well; here are the slides and a link to the paper pre-print. Hadley Wickham gave an excellent tutorial on dplyr. Based on the talk I saw, I think I will take the data sets from the book and make some public visualizations on the Plotly we... [Read more...]
