Monthly Archives: June 2013

-omics in 2013

June 24, 2013

Just how many (bad) -omics are there anyway? Let’s find out. 1. Get the raw data: It would be nice if we could search PubMed for titles containing all -omics. However, we cannot, since leading wildcards don’t work in PubMed search. So let’s just grab all articles from 2013 and save them in a format

Visualising Crime Hotspots in England and Wales using {ggmap}

June 24, 2013

Two weeks ago, I was looking for ways to make pretty maps for my own research project. A quick search led me to some very informative blog posts by Kim Gilbert, David Smith and Max Marchi. Eventually, I Googled the excellent crime weather map exa...

Comparing the speed of pqR with R-2.15.0 and R-3.0.1

June 24, 2013

As part of developing pqR, I wrote a suite of speed tests for R. Some of these tests were used to show how pqR speeds up simple real programs in my post announcing pqR, and to show the speed-up obtained with helper threads in pqR on systems with multiple processor cores. However, most tests in

Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

Introduction: Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.) To give you
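
For readers who want the idea in concrete form before following the link, here is a minimal sketch of an empirical CDF in plain Python. It is not taken from the post (which works in R); the function name `ecdf` and the sample values are illustrative assumptions. The empirical CDF at x is simply the fraction of observations less than or equal to x.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x.

    Illustrative helper only; the name and interface are assumptions,
    not something defined in the linked post.
    """
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        # bisect_right returns how many of the sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


# Hypothetical data, chosen only to show the step behaviour
F = ecdf([3, 1, 4, 1, 5])
print(F(0), F(1), F(3), F(5))
```

The resulting function is a step function that rises by 1/n at each observation (by 2/n where a value occurs twice, as with the value 1 here).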

Merging Data — SAS, R, and Python

June 24, 2013

On analyticbridge, a question was posed about moving an inner join out of Excel (where it was taking many minutes via VLOOKUP()) to some other package. The question asked what kind of performance can be expected in other systems. Of the list ...
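
As a rough illustration of why other tools can handle such a join quickly, the sketch below performs an inner join in plain Python by indexing one table in a dictionary, so each lookup takes roughly constant time. This is not the code from the post; the function `inner_join` and the `orders` and `prices` tables are hypothetical.

```python
def inner_join(left, right, key):
    """Inner-join two lists of dicts on `key` using a hash lookup.

    Hypothetical helper for illustration; not the method used in the post.
    """
    # Index the right-hand table once: key value -> list of matching rows
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        # Left rows with no match on the right are dropped (inner join)
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Made-up example tables
orders = [
    {"id": 1, "item": "hops"},
    {"id": 2, "item": "malt"},
    {"id": 4, "item": "yeast"},
]
prices = [
    {"id": 1, "price": 3.5},
    {"id": 2, "price": 2.0},
    {"id": 3, "price": 9.9},
]

result = inner_join(orders, prices, "id")
print(result)
```

Only ids 1 and 2 appear in both tables, so only those two rows survive the join. Building the index is a single pass over one table, which avoids rescanning it for every row of the other.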

Rcpp 0.10.4

June 24, 2013

A new version of Rcpp is now on the CRAN network for GNU R; binaries for Debian have been uploaded as well. This release brings a fairly large number of fixes and improvements across a number of Rcpp features, see below for the detailed list. We a...

A beer recommendation system made with R

June 24, 2013

If you know a beer you like and want some recommendations for a style of beer to try, check out the yhat Beer Recommender. This neat little app is the product of a recommendation system built using the R language by the folks behind the yhat blog. It's based on about 1.5 million beer reviews from the Beer Advocate...

My Stat Bytes talk, with slides and code

June 24, 2013

On Thursday of last week I gave a short informal talk to Stat Bytes, the CMU Statistics department’s twice-a-month computing seminar. Quick tricks for faster R code: Profiling to Parallelism. Abstract: I will present a grab bag of …

Opel Corsa Diesel Usage

June 24, 2013

I wanted to extend my car weight distribution calculation of June 16 from the year 2000 alone to the years 2000 to 2013. Unfortunately, come Sunday afternoon the code still seemed too slow and I did not have even the beginning of a post. So, I went on to another calculation I w...

Streamline Your Mechanical Turk Workflow with MTurkR

June 24, 2013

I’ve been using Thomas Leeper’s MTurkR package to administer my most recent Mechanical Turk study, an extension of work on representative-constituent communication claiming credit for pork benefits, with Justin Grimmer and Sean Westwood. MTurkR is excellent, making it quick and easy to: test …
