180 search results for "plyr"

Text Mining to Word Cloud App with R

May 13, 2012

Here is a simple application, Text Mining to WordCloud, that transforms text into a beautiful word cloud. Its purpose is to find the highest-frequency words in a given text. The app is built with the R language, and the source code is attached at the end of...

Read more »

data.table version 1.8.1 – now allows numeric columns and big numbers (via bit64) in keys!

May 9, 2012

This is a guest post written by Branson Owen, an enthusiastic R and data.table user. Wow, a long-desired feature of data.table finally arrived in version 1.8.1! data.table now allows numeric columns and big numbers (via bit64) in …

Read more »

cumplyr: Extending the plyr Package to Handle Cross-Dependencies

May 3, 2012

For me, Hadley Wickham’s reshape and plyr packages are invaluable because they encapsulate omnipresent design patterns in statistical computing: reshape handles switching between the different possible representations of the same underlying data, while plyr automates what Hadley calls the Split-Apply-Combine strategy, in which you split up your data into several subsets, perform some computation …

Read more »
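
The Split-Apply-Combine strategy named in the excerpt above can be sketched in a few lines. This is only an illustration in plain Python, not plyr or cumplyr themselves; the function name `split_apply_combine` and the sample records are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, func):
    """Group rows by `key`, apply `func` to each group, collect results by group."""
    groups = defaultdict(list)
    for row in rows:
        # Split: route every row to the subset that shares its key value.
        groups[row[key]].append(row)
    # Apply `func` to each subset, then combine the results into one mapping.
    return {k: func(g) for k, g in groups.items()}


# Hypothetical sample data, made up for this sketch.
rows = [
    {"group": "a", "value": 1.0},
    {"group": "a", "value": 3.0},
    {"group": "b", "value": 10.0},
]

# Per-group mean of the "value" field.
result = split_apply_combine(rows, "group", lambda g: mean(r["value"] for r in g))
```

In plyr the same pattern is usually expressed with a single call such as `ddply`, which takes the splitting variables and the function to apply.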

microbenchmarking with R

April 28, 2012

I love to benchmark. Maybe I’m a bit weird, but I love to benchmark everything in R. Recently people have challenged the accuracy of the typical system.time and rbenchmark approaches to benchmarking. I saw Hadley Wickham promoting the …

Read more »

Measuring user retention using cohort analysis with R

April 27, 2012

Cohort analysis is essential if you want to know whether your service is in fact a leaky bucket despite healthy growth in absolute numbers. There’s a good write-up on the subject, “Cohorts, Retention, Churn, ARPU” by Matt Johnson. So how do you do it in R, and how do you visualize it? Inspired by examples …

Read more »

Heat map visualization of sick day trends in Finland with R, ggplot2 and Google Correlate

April 24, 2012

Inspired by Margintale’s post “ggplot2 Time Series Heatmaps” and by Google Flu Trends, I decided to use a heat map to visualize sick days logged by Finnish users of HeiaHeia.com. I got the data from our database, filtered by country (Finnish users only), in a tab-separated format with the first line as the header. Three columns …

Read more »

Visualising the Path of a Genetic Algorithm

April 23, 2012

We quite regularly use genetic algorithms to optimise over the ad-hoc functions we develop when trying to solve problems in applied mathematics. However, it’s a bit disconcerting to have your algorithm roam through a high-dimensional solution space while not being able to picture what it’s doing or how close one solution is to another. …

Read more »

Generating all subsets of a set

April 20, 2012

Recently I calculated the Banzhaf power index, which required generating all subsets of a given set. The code given there was a bit complex, so I decided to write a simple function to do it. As an example of its application I reproduce Figur...

Read more »
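
Generating all subsets of a set, the task described in the excerpt above, can be sketched as follows. This is a minimal illustration in Python using the standard `itertools` module, not the R function from the post; the name `all_subsets` is made up for the example.

```python
from itertools import chain, combinations


def all_subsets(items):
    """Return every subset of `items` as a tuple, from the empty set to the full set."""
    items = list(items)
    # For each size r from 0 to n, take all combinations of that size and chain them.
    return list(
        chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    )


subsets = all_subsets(["a", "b", "c"])
```

A set of n elements has 2^n subsets, so the output grows quickly; for large sets it is usually better to iterate lazily than to build the full list.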

An R Script to Automatically download PubMed Citation Counts By Year of Publication

April 19, 2012

Ever wanted to look at PubMed trends and make elegant graphs of them? Here’s an R script that will do it automatically for you.

Read more »

A word cloud where the x and y axes mean something

April 17, 2012

OK, so I have now done two iterations on a better way to visualize term frequencies using R, ggplot2 and plyr. The first was OK but ugly; the second was better but still ugly. How to read it: Frequency is segmented into 20% quantiles. The frequency ...

Read more »