Monthly Archives: November 2010

Feature selection: Using the caret package

November 16, 2010
By
Feature selection: Using the caret package

Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. In a previous post we looked at all-relevant feature selection using the Boruta package while in this post we consider the same (artificial, toy) examples using the caret package. Max Kuhn kindly listed...

Read more »

Feature selection: Using the caret package

November 16, 2010
By
Feature selection: Using the caret package

Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. In a previous post we looked at all-relevant feature selection using the Boruta package while in this post we consider the same (artificial, toy) examples using the caret package. ...

Read more »

Assignment operators in R: ‘=’ vs. ‘<-’

November 16, 2010
By
Assignment operators in R: ‘=’ vs. ‘<-’

In R, you can use  both ‘=’ and ‘<-’ as assignment operators. So what’s the difference between them and which one should you use? What’s the difference? The main difference between the two assignment operators is scope. It’s easiest to see the difference with an example: ##Delete x (if it exists) > rm(x) > mean(x=1:10)

Read more »

Data Science meets Humanities

November 16, 2010
By

There's an interesting article in the NYT today about the emerging discipline of "digital humanities": extracting digital data from historical archives to answer questions from the Arts and Humanities. From the article: Members of a new generation of digitally savvy humanists argue it is time to stop looking for inspiration in the next political or philosophical “ism” and start...

Read more »

Postdoc in Wharton

November 16, 2010
By
Postdoc in Wharton

Just received this email from José Bernardo about an exciting postdoc position in Wharton: POST-DOCTORAL FELLOW – DEPARTMENT OF STATISTICS, THE WHARTON SCHOOL The Department of Statistics at The Wharton School of the University of Pennsylvania is seeking candidates for a Post-Doctoral Fellowship. This research fellowship provides full funding without any teaching requirements at a

Read more »

Loops in R: Think different

November 15, 2010
By

Especially for programmers that come to R from other languages, R sometimes gets dinged about the speed of its for loops. But a lot of the time, where you might have needed an iterative loop in another language to solve a specific task, you don't need a for loop in R at all. Often, there's a pre-build function to...

Read more »

Example 8.14: generating standardized regression coefficients

November 15, 2010
By
Example 8.14: generating standardized regression coefficients

Standardized (or beta) coefficients from a linear regression model are the parameter estimates obtained when the predictors and outcomes have been standardized to have variance = 1. Alternatively, the regression model can be fit and then standardized ...

Read more »

Feature selection: All-relevant selection with the Boruta package

November 15, 2010
By
Feature selection: All-relevant selection with the Boruta package

Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. There are two main approaches to selecting the features (variables) we will use for the analysis: the minimal-optimal feature selection which identifies a small (ideally minimal) set of variables that gives the best...

Read more »

Feature selection: All-relevant selection with the Boruta package

November 15, 2010
By
Feature selection: All-relevant selection with the Boruta package

Feature selection is an important step for practical commercial data mining which is often characterised by data sets with far too many variables for model building. There are two main approaches to selecting the features (variables) we will use for the analysis:...

Read more »

Isarithmic History of the Two-Party Vote

November 15, 2010
By
Isarithmic History of the Two-Party Vote

A few weeks ago, I shared a series of choropleth maps of U.S. presidential election returns, illustrating the relative support for Democratic, Republican, and third Party candidates since 1920. The granularity of these county level results led me to wonder whether it would be possible to develop an isarithmic map of presidential voting using the … Continue reading →

Read more »

Sponsors

Mango solutions



plotly webpage

dominolab webpage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training

datasociety

http://www.eoda.de





ODSC

ODSC

CRC R books series





Six Sigma Online Training









Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)