Posts Tagged ‘plyr’

A quick primer on split-apply-combine problems

December 16, 2011

I’ve just answered my hundred billionth question on Stack Overflow that goes something like “I want to calculate some statistic for lots of different groups.” Although these questions provide a steady stream of easy points, it’s such a common and basic data analysis concept that I thought it would be useful to have a document

I Work For The Internet!

December 13, 2011

UPDATE: code and figure updated at 11:50 am CST. The site I WORK FOR THE INTERNET is collecting pictures and first names (last name initials only) to show collective support against SOPA (the Stop Online Piracy Act). Please stop by their site and a...

Parallelization using plyr: loading objects and packages into worker nodes

November 14, 2011

I really love the plyr package. Apart from the progress bar and the way plyr handles a lot of the overhead, a very interesting feature is the ability to run plyr in parallel mode. Essentially, setting .parallel = TRUE runs any…

Comparison of ave, ddply and data.table

August 25, 2011

A guest post by Paul Hiemstra. Fortran and C programmers often say that interpreted languages like R are nice and all, but lack speed. How fast something runs in R depends greatly on how it is implemented, i.e. which packages and functions one uses. A prime example, which shows up regularly on

Your Data is Never the Right Shape

July 31, 2011

One of the recurring frustrations in data analytics is that your data is never in the right shape. Worst case: you are not aware of this, and every step you attempt is more expensive, less reliable and less informative than you would want. Best case: you notice this and have the tools to reshape your...

plyr’s idata.frame VS. data.frame

May 13, 2011

I had seen the function idata.frame in plyr before, but not really tested it. Here are a few comparisons of operations on normal data frames and immutable data frames. Immutable data frames don't work with the doBy package, but do work with aggregate i...

One-way ANOVAs in R – including post-hocs/t-tests and graphs

May 11, 2011

In this post, I go over the basics of running an ANOVA using R. The dataset I’ll be examining comes from this website, and I’ve discussed it previously (starting here and then here). I’ve not seen many examples where someone runs through the …

Charting the Defeat of AV using R (and some ggplot2 and merge operations on top)

May 8, 2011

In this post, I’ll be graphing some results from a recent referendum held here in the UK and combining it with the results of a set of local elections that were held at the same time. I’ll give some examples of graphing stuff using ggplot2 and will also show some info regarding merging datasets. At

Data Aggregation in R: plyr, sqldf and data.table

April 28, 2011

I’ve previously put up a couple of posts about aggregating data in R. In this post, I’m going to try some alternative methods for aggregating the dataset. Before I begin, I’d like to thank Matthew Dowle for highlighting these to me. It’s a bit daunting at first, deciding which method of aggregating data is

Article about plyr published in JSS, and the citation was added to the new plyr (version 1.5)

April 11, 2011

The plyr package (by Hadley Wickham) is one of the few R packages that I can claim to have used in all of my statistical projects. So whenever a new version of plyr comes out I tend to be excited about it (as I was when version 1.2 came out with support for parallel processing)
