Posts Tagged ‘data mining’

Iris Data Set Visualization Web App in < 100 LOC

August 7, 2010

The iris data set pops up pretty regularly in statistical literature. It consists of 50 records from each of three species of Iris flowers (Iris setosa, Iris virginica and Iris versicolor), 150 records in total. I came across it recently while reading

How Google and Facebook are using R

August 4, 2010

This is an older (2009) video from the kickoff meeting of the San Francisco Bay Area R Users Group. It was a panel discussion within the Predictive Analytics World conference. Video courtesy of Ron Fredericks of LectureMaker (click on the …

An experiment in A/B Testing my Résumé

July 1, 2010

Objective: I’ll admit it, my résumé doesn’t stand out. I’ve had some great internships, but also a tendency to work for companies that aren’t (yet!) household names. And though I’m doing fine academically, it’s not well enough to stand out …

My Experience at Hadoop Summit 2010 #hadoopsummit

June 30, 2010

This week I had the opportunity to trek up north to Silicon Valley to attend Yahoo’s Hadoop Summit 2010. I love Silicon Valley. The few times I’ve been there, the weather was perfect (often warmer than LA), there was little to no traffic or road rage, and people overall seemed friendly and happy. Not to mention there are so many trees...

Why R doesn’t suck

June 19, 2010

I first encountered the R programming language a few years ago when I needed to make some plots. Although I’ve used it occasionally since, I always considered it a sort of “Perl for statisticians”: a useful Swiss Army knife with …

Data Mining with WEKA example implemented in R

June 9, 2010

IBM developerWorks has several new articles on Data Mining with WEKA by Michael Abernethy. I decided to implement the example provided in the first article in the series using R. I realize that I could have used WEKA through R (using the RWeka packa...

Some Code for Dumping Data from Twitter Gardenhose

March 30, 2010

Gardenhose is a Streaming API feed that continuously sends a sample (roughly 15% according to Ryan Sarver at the 140tc in September 2009) of all tweets to feed recipients. This is some code for dumping the tweets to files named by date and hour. It is in PHP, which is not my favorite language, but it works nonetheless. I received...
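
To make the "files named by date and hour" idea concrete, here is a minimal sketch in Python rather than the post's PHP. It is not the author's code: the function names, the `.json` extension, the `YYYY-MM-DD-HH` file name pattern and the use of UTC are all assumptions chosen for illustration, and `lines` stands in for whatever delivers one tweet per line from the feed.

```python
import os
from datetime import datetime, timezone


def hourly_path(directory, when):
    """Return a path such as <directory>/2010-03-30-17.json (hypothetical naming)."""
    return os.path.join(directory, when.strftime("%Y-%m-%d-%H") + ".json")


def dump_stream(lines, directory, now=lambda: datetime.now(timezone.utc)):
    """Append each non-empty line to the file for the hour in which it arrived.

    `lines` is any iterable of strings; `now` is injectable so the rollover
    from one hour's file to the next can be exercised without waiting.
    """
    os.makedirs(directory, exist_ok=True)
    written = 0
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines, which a streaming feed may send
            continue
        # Re-open per line so the target file follows the clock automatically.
        with open(hourly_path(directory, now()), "a", encoding="utf-8") as fh:
            fh.write(line + "\n")
        written += 1
    return written
```

Re-opening the file for every line keeps the sketch short; a long-running dumper would more likely hold the file handle open and swap it only when the hour changes.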

Data Visualization and R Programming Books (Updated)

January 7, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (pdf, 1.25 megabytes) Download the Social Media Analytics Research Toolkit Download Code Like A Pirate - The #rstats Appliance from the SUSE Gallery Disclosure As you probably ...
