Blog Archives

Playing with earthquake data

March 28, 2013

(This article was first published on Digithead's Lab Notebook, and kindly contributed to R-bloggers)

Read more »

Data analysis class

February 7, 2013

I've been writing software to help others do data analysis for a number of years and at the same time trying to work up my nerve to try my own analysis. Why let other people have all the fun? So, when I saw that Jeffrey Leek, biostatistician at Johns Hopkins and coauthor of Simply Statistics, was teaching...

Read more »

R in the Cloud

December 6, 2012

I've been having some great fun parallelizing R code on Amazon's cloud. Now that things are chugging away nicely, it's time to document my foibles so I can remember not to fall into the same pits of despair again.

The goal was to perform lots of trials of a randomized statistical simulation. The jobs were independent and fairly chunky, taking...
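To make the idea of independent, chunky trials concrete, here is a minimal local sketch using R's bundled parallel package. It is not the setup from the post: the run_trial function, the number of trials, and the two-worker cluster are all assumptions chosen for illustration, and a cloud deployment would need its own cluster configuration.

```r
library(parallel)

# Hypothetical trial: the mean of 1000 standard normal draws.
# Seeding inside the trial keeps each job independent and repeatable.
run_trial <- function(seed) {
  set.seed(seed)
  mean(rnorm(1000))
}

n_trials <- 8                 # assumed number of trials
cl <- makeCluster(2)          # assumed: two local worker processes
results <- parLapply(cl, seq_len(n_trials), run_trial)
stopCluster(cl)

# One estimate per trial, in the same order as the seeds
estimates <- unlist(results)
```

Because each trial depends only on its seed, the same function can be run sequentially with sapply to check the parallel results.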

Read more »

Feature selection and linear modeling

October 27, 2012

Read more »

Computing kook density in R

September 24, 2012

Do you ever see strange lights in the sky? Do you wonder what really goes on in Area 51? Would you like to use your R hacking skills to get to the bottom of the whole UFO conspiracy? Of course, you would! UFO data from infochimps is the focus of a dat...

Read more »

OO in R

September 13, 2012


"Is there a package for obfuscating code in #rstats?", someone asked. "The S4 object system?!" came the snarky reply. If you're smiling right now, you know that it wouldn't be funny if it weren't at least a little bit true.

Options: S3, S4 or R5?

There can be little doubt that object oriented...
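For readers who have not met all three options, the following sketch defines the same tiny "point" object in S3, S4, and reference classes (R5). The class and function names are made up for illustration and are not taken from the post.

```r
library(methods)

# S3: the class is just an attribute; dispatch goes by function name
describe <- function(x, ...) UseMethod("describe")
describe.point <- function(x, ...) sprintf("point(%g, %g)", x$x, x$y)
p3 <- structure(list(x = 1, y = 2), class = "point")

# S4: formal class definition with typed slots and explicit generics
setClass("Point4", representation(x = "numeric", y = "numeric"))
setGeneric("describe4", function(obj) standardGeneric("describe4"))
setMethod("describe4", "Point4",
          function(obj) sprintf("point(%g, %g)", obj@x, obj@y))
p4 <- new("Point4", x = 1, y = 2)

# Reference classes (R5): mutable objects with methods attached to them
Point5 <- setRefClass("Point5",
  fields = list(x = "numeric", y = "numeric"),
  methods = list(
    describe = function() sprintf("point(%g, %g)", x, y),
    shift = function(dx) {
      x <<- x + dx          # modifies the object in place
      invisible(.self)
    }
  )
)
p5 <- Point5$new(x = 1, y = 2)
p5$shift(10)
```

The main practical difference shown here is that the S3 and S4 objects are copied on modification like other R values, while the reference class object is changed in place by shift.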

Read more »

Linear regression by gradient descent

July 26, 2012

In Andrew Ng's Machine Learning class, the first section demonstrates gradient descent by using it on a familiar problem, that of fitting a linear function to data.

Let's start off by generating some bogus data with known characteristics. Let's make y just a noisy version of x. Let's also add 3 to give the intercept term something to...
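A minimal sketch of that setup and of batch gradient descent on it might look like the following. The sample size, noise level, learning rate, and iteration count are assumptions, not values from the post; the check against lm is only there to confirm that the descent lands on the least-squares fit.

```r
set.seed(42)

# Bogus data: y is a noisy version of x, shifted up by 3
n <- 100                           # assumed sample size
x <- rnorm(n)
y <- x + 3 + rnorm(n, sd = 0.5)    # assumed noise level

# Design matrix with a column of ones for the intercept term
X <- cbind(1, x)

theta <- c(0, 0)    # start from intercept = 0, slope = 0
alpha <- 0.1        # assumed learning rate
n_iter <- 1000      # assumed number of iterations

for (i in seq_len(n_iter)) {
  # Gradient of the cost (1 / (2n)) * sum((X %*% theta - y)^2)
  gradient <- drop(t(X) %*% (X %*% theta - y)) / n
  theta <- theta - alpha * gradient
}

# Closed-form least-squares fit for comparison
fit <- coef(lm(y ~ x))
```

With these settings theta should end up close to an intercept of 3 and a slope of 1, and should agree with the coefficients in fit.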

Read more »

Long-vector kludge in R

July 25, 2012

Just recently, I found out that R is limited to 32-bit integers, even on 64-bit hardware. Bummer, huh? As a consequence, the maximum size of a vector is 2^31-1. To be fair, dealing with numeric types across machine architectures is hard. A fixed repr...
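The integer limit is easy to check from the console. This short sketch uses only base R. Note that R 3.0.0 and later support long vectors on 64-bit builds, so the vector length limit described here applies to earlier versions; the integer type itself is still 32-bit.

```r
# Largest value an R integer can hold
max_int <- .Machine$integer.max

# It matches the 2^31 - 1 figure quoted above
matches_limit <- max_int == 2^31 - 1

# Going one past the limit in integer arithmetic overflows to NA
# (R also emits a warning, suppressed here to keep the output clean)
overflow <- suppressWarnings(max_int + 1L)

# Doubles can represent the next value exactly, which is the usual workaround
as_double <- as.numeric(max_int) + 1
```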

Read more »

Sage Bionetworks Synapse

April 27, 2012

Michael Kellen, Director of Technology at Sage Bionetworks, is trying to build a GitHub for science. It's called Synapse and Kellen described it in a talk at the Sage Bionetworks Commons Congress 2012, this past weekend: 'Synapse' Pilot for Building an...

Read more »

International Open Data Hackathon

December 5, 2011

This past Saturday, I hung out at the Seattle branch of the International Open Data Hackathon. The event was hosted at the Pioneer Square office of Socrata, a small company that helps governments provide public open data.

A pair of data analysts from Tableau were showing off a visualization for the Washington...

Read more »