Blog Archives

Getting started with the Heritage Health Prize competition

April 8, 2011

The US$3 million Heritage Health Prize competition is on, so we take a look at how to get started using the R statistical computing and analysis platform.

Read more »

Benchmarking feature selection with Boruta and caret

November 25, 2010

Feature selection is the data mining process of selecting the variables from our data set that may have an impact on the outcome we are considering. For commercial data mining, which is often characterised by having too many variables for model building, this is an important step in the analysis process. And since we often work on...

Read more »

Feature selection: Using the caret package

November 16, 2010

Feature selection is an important step for practical commercial data mining, which is often characterised by data sets with far too many variables for model building. In a previous post we looked at all-relevant feature selection using the Boruta package, while in this post we consider the same (artificial, toy) examples using the caret package. ...

Read more »

Feature selection: All-relevant selection with the Boruta package

November 15, 2010

Feature selection is an important step for practical commercial data mining, which is often characterised by data sets with far too many variables for model building. There are two main approaches to selecting the features (variables) we will use for the analysis:...

Read more »

Big data for R

August 5, 2010

Revolution Analytics recently announced their "big data" solution for R. This is great news and a lovely piece of work by the team at Revolution. However, if you want to replicate their analysis in standard R, you can absolutely do so, and we show you how.

Read more »