Monthly Archives: August 2012

Finding the Best Subset of a GAM using Tabu Search and Visualizing It in R

August 24, 2012

Finding the best subset of variables for a regression is a very common task in statistics and machine learning. There are statistical methods based on asymptotic normal theory that can help you decide whether to add or remove one variable at a time. The ...

Read more »

CRAN might get tenure at Yale?

August 24, 2012

From one of the R lists I follow: Today (2012-08-23) on CRAN: “Currently, the CRAN package repository features 4001 available packages.” These packages are maintained by approximately 2350 different folks.

Previous milestones:
2011-05-12: 3,000 packages
2009-10-04: 2,000 packages
2007-04-12: 1,000 packages
2004-10-01: 500 packages
2003-04-01: 250 packages

http://cran.r-project.org/web/packages/

Read more »

Data analysis using R – course in Essex

August 24, 2012

This course is running 1-5 October at the University of Essex. There doesn’t seem to be a website, but you can register by writing to [email protected]. Here’s what they say in their e-mail: Lecturers: Dr Werner Adler (University of Erlangen-Nuremberg; Co-author … Continue reading →

Read more »

ggplot2 Self-deprecation

August 24, 2012

I've been in China working for a few weeks (where this blog is (oddly) blocked). So, I haven't been able to post much over the summer. To kick things off for the new (academic) year, I thought I might just re-post something good I saw on the Book of Sa...

Read more »

Comparing hist() and cut() R functions

August 24, 2012

The other day a question about faceting data came up in the Dallas R Users group (link of conversation). The hist() function is more efficient and uses less memory than the cut() function. Additionally, hist() returns an object that makes...
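As a rough illustration of how the two functions relate, the base R sketch below bins the same data with hist() and with cut(). The data vector `x` and the `breaks` are made-up assumptions, not taken from the original post, and the sketch does not benchmark speed or memory. It only shows that the two approaches yield the same bin counts when cut() is told to include the lowest break.

```r
# Illustrative data and breaks (assumptions, not from the original post)
set.seed(1)
x <- runif(1000, min = 0, max = 10)
breaks <- 0:10

# hist() with plot = FALSE returns an object holding the bin counts directly
counts_hist <- hist(x, breaks = breaks, plot = FALSE)$counts

# cut() returns a factor of interval labels; table() tallies them.
# include.lowest = TRUE matches the default binning used by hist().
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
counts_cut <- as.vector(table(bins))

# Both approaches assign every observation to the same bins
all(counts_hist == counts_cut)
```

Note that hist() gives the counts, breaks, and midpoints in one object, while cut() keeps a per-observation bin label, which is what you need when grouping or faceting.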

Read more »

Momentum with R: Part 1

August 23, 2012

Time really flies… it is hard to believe that it has been over a month since my last post. Work and life in general have consumed much of my time lately and left little time for research and blog posts. Anyway, on to the post! This post will be the first in a series of … Continue reading...

Read more »

Revolution Analytics receives Top Innovator award for Data Science Technology

August 23, 2012

A big thank-you to all the R users out there who voted for Revolution R Enterprise in the DataWeek Awards. We're so pleased to be recognized by the voters and the DataWeek judging panel with the Top Innovator Award for Data Science Technology. We're looking forward to the awards ceremony next week at DataWeek SF (in San Francisco, September 24-27)....

Read more »

difference between NA and NaN in R

August 23, 2012

We often see NA and NaN in R. What's the difference between them? Here is a good post on the topic: http://stats.stackexchange.com/questions/5686/what-is-the-difference-between-nan-and-na

In summary: NaN ("Not a Number") is the result of an undefined numeric operation such as 0/0. NA ("Not Available") is generally interpreted as a missing value and has various forms - NA_integer_, NA_real_, etc. Therefore, NaN ≠ NA, and there is a need for both. is.na() returns TRUE for both NA...
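A minimal base R sketch of the distinction follows. The vector `x` is an illustrative assumption, not taken from the linked post. It shows that is.na() flags both kinds of value, while is.nan() flags only NaN.

```r
# Illustrative vector: one ordinary value, one NA, and two NaN values
x <- c(1, NA, NaN, 0/0)

# is.na() treats both NA and NaN as missing
na_flags <- is.na(x)

# is.nan() is TRUE only for NaN, so it separates the two cases
nan_flags <- is.nan(x)

# Elements that are NA but not NaN
only_na <- na_flags & !nan_flags

# NA comes in typed forms; the plain NA constant is logical
na_types <- c(typeof(NA), typeof(NA_integer_), typeof(NA_real_))

print(na_flags)
print(nan_flags)
print(only_na)
print(na_types)
```

To count genuinely missing values while ignoring NaN, one option is sum(is.na(x) & !is.nan(x)).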

Read more »

Bonds Much Sharpe -r Than Buffett

August 23, 2012

Mebane Faber’s post Buffett’s Alpha points out Warren Buffett’s 0.76 Sharpe Ratio discussed in the similarly titled paper Buffett’s Alpha. I of course immediately think about the 8th Wonder of the World – the US Bond Market, whose Sharpe ...

Read more »

R and the web (for beginners), Part III: Scraping MPs’ expenses in detail from the web

August 23, 2012

In this last post of my little series (see my latest post) on R and the web I explain how to extract data from a website (web scraping/screen scraping) with R. If the data you want to analyze are part of a web page, for example an HTML table (or hundreds of...

Read more »