Blog Archives

Beautiful Data

July 27, 2009

O'Reilly's recent publication Beautiful Data has a chapter by Jeff Jonas, which is reason enough in itself for me to recommend it. The chapter, Data Finds Data, is also available as a PDF download.

Massively parallel database for analytics

July 22, 2009

This is by far the best description of why traditional parallel databases (like Teradata, Greenplum et al.) are an evolutionary dead end. But this is much more than a theoretical discussion: the authors have built a solution, which they call HadoopDB. It is based on Hadoop, PostgreSQL, and Hive and is completely open source. Alternative, column-based, backends to PostgreSQL...

The Knapsack Problem

July 10, 2009

David posts a question about how to solve this knapsack problem using the R statistical computing and analysis platform. My reply in the comments seems to have disappeared for a while so here is my proposed solution:

OECD Statistics

July 2, 2009

I am a sucker for good-quality data. I wrote about data.gov, the US Government data site, before, and now I find OECD Statistics, which has some 300 data sets, many of which seem to be readily accessible (though some may require a subscription).

R tips: Determine if function is called from specific package

June 16, 2009

I like the "multicore" library for a particular task. I can easily write a combination of if(require("multicore",...)) that means that my function will automatically use the parallel mclapply() instead of lapply() where it is available. Which is grand 99% of the time, except when my function is called from mclapply() (or one of the lower level functions)...
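The fallback pattern described in this excerpt can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the original post: it uses the `parallel` package bundled with current R (which also provides mclapply()) rather than `multicore`, the helper name `par_lapply` is hypothetical, and it does not attempt the nested-call detection that the truncated excerpt goes on to discuss.

```r
# Hypothetical helper (the name is illustrative, not from the original post):
# use the forking mclapply() where it is available, plain lapply() otherwise.
par_lapply <- function(X, FUN, ...) {
  # Assumption: forking is only attempted on Unix-alikes with 'parallel' installed.
  can_fork <- .Platform$OS.type == "unix" &&
    requireNamespace("parallel", quietly = TRUE)
  if (can_fork) {
    parallel::mclapply(X, FUN, ...)
  } else {
    lapply(X, FUN, ...)
  }
}

# Callers use it exactly like lapply(); results come back in input order.
squares <- par_lapply(1:4, function(i) i^2)
```

Because the helper has the same calling convention as lapply(), existing code can switch to it without other changes.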

R tips: Installing Rmpi on Fedora Linux

June 12, 2009

Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions here.

Data Mashups in R from O’Reilly

June 9, 2009

O’Reilly has published Data Mashups in R as a $4.99 PDF download in their Short Cut series. In 27 pages it takes you through an example of how to combine foreclosure information with maps and geographical information to produce plots like the one here. This is all done with the R statistical computing and analysis platform.

How to win the KDD Cup Challenge with R and gbm

June 1, 2009

Hugh Miller, leader of the team that won the KDD Cup 2009 Slow Challenge (which we wrote about recently), kindly provides more information about how to win this public challenge using the R statistical computing and analysis platform on a laptop (!).

R used by KDD 2009 cup winner of slow challenge

May 31, 2009

The results from the KDD Cup 2009 challenge (which we wrote about before) are in, and the winner of the slow challenge used the R statistical computing and analysis platform for their winning submission.

R tips: Use read.table instead of strsplit to split a text column into multiple columns

May 29, 2009

Someone on the R-help mailing list had a data frame with a column containing IP addresses in dotted-quad format (e.g. 1.10.100.200). He wanted to sort by this column, and I proposed a solution involving strsplit. But Peter Dalgaard came up with a much nicer method using read.table on a textConnection object:
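The code from the original post is not reproduced in this excerpt. A minimal sketch of the technique, using made-up addresses and illustrative column names (`o1` to `o4`), might look like this:

```r
# Example data frame; the addresses are invented for illustration.
df <- data.frame(ip = c("10.0.0.2", "1.10.100.200", "9.255.0.1", "1.2.3.4"),
                 stringsAsFactors = FALSE)

# Treat the character column as a "file" and let read.table() split it on
# the dots, giving one integer column per octet.
con <- textConnection(df$ip)
octets <- read.table(con, sep = ".", col.names = paste0("o", 1:4),
                     colClasses = "integer")
close(con)  # read.table() does not close a connection that was already open

# Order numerically by each octet in turn, then reorder the original rows.
sorted <- df[order(octets$o1, octets$o2, octets$o3, octets$o4), , drop = FALSE]
```

Sorting on the parsed octets puts 9.255.0.1 before 10.0.0.2, whereas a plain character sort of the column would compare the strings digit by digit and place 10.0.0.2 first.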
