Posts Tagged ‘data mining’

My Experience at Hadoop Summit 2010 #hadoopsummit

June 30, 2010

This week I had the opportunity to trek up north to Silicon Valley to attend Yahoo’s Hadoop Summit 2010. I love Silicon Valley. The few times I’ve been there, the weather was perfect (often warmer than LA), there was little to no traffic and no road rage, and people overall seemed friendly and happy. Not to mention there are so many trees...

Read more »

Why R doesn’t suck

June 19, 2010

I first encountered the R programming language a few years ago when I needed to make some plots. Although I’ve used it occasionally since, I always considered it a sort of “Perl for statisticians”: a useful Swiss Army knife with …

Read more »

Data Mining with WEKA example implemented in R

June 9, 2010

IBM developerWorks has several new articles on Data Mining with WEKA by Michael Abernethy. I decided to implement the example provided in the first article in the series using R. I realize that I could have used WEKA through R (using the RWeka packa...

Read more »

Opening Statements on Markov Chain Monte Carlo

April 1, 2010

This quarter I am TAing UCLA’s Statistics 102C, Introduction to Monte Carlo Methods, for Professor Qing Zhou. This course did not exist when I was an undergraduate, and I think it is pretty rare to teach Monte Carlo (minus the bootstrap if you count that) or MCMC to undergrads. I am excited about this class because to me, MCMC...

Read more »

Some Code for Dumping Data from Twitter Gardenhose

March 30, 2010

Gardenhose is a Streaming API feed that continuously sends a sample (roughly 15% according to Ryan Sarver at the 140tc in September 2009) of all tweets to feed recipients. This is some code for dumping the tweets to files named by date and hour. It is in PHP, which is not my favorite language, but it works nonetheless. I received...

Read more »
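
The dumping scheme mentioned in the Gardenhose excerpt above (files named by date and hour) can be sketched roughly as follows. This is only an illustration in Python, not the post's PHP code: the function names, the `.json` extension, and the `YYYY-MM-DD-HH` naming pattern are all assumptions, and the sketch does not connect to the Streaming API; it only shows how an incoming line might be routed to an hourly file.

```python
import os
from datetime import datetime, timezone


def dump_path(received_at, out_dir="."):
    """Build the path of the hourly file for a given timestamp.

    The YYYY-MM-DD-HH.json pattern is an assumption for illustration.
    """
    return os.path.join(out_dir, received_at.strftime("%Y-%m-%d-%H") + ".json")


def dump_line(raw_line, received_at, out_dir="."):
    """Append one raw line from the feed to the file for its date and hour."""
    path = dump_path(received_at, out_dir)
    with open(path, "a", encoding="utf-8") as handle:
        # Normalize the line ending so each record sits on its own line.
        handle.write(raw_line.rstrip("\r\n") + "\n")
    return path


if __name__ == "__main__":
    # Hypothetical usage: in a real consumer, raw_line would come from the feed.
    now = datetime.now(timezone.utc)
    print(dump_line('{"text": "example tweet"}', now))
```

Opening the file in append mode on every write keeps the sketch simple; a long-running consumer would more likely keep the current hour's file open and rotate it when the hour changes.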

My Experience at ACM Data Mining Camp #DMcamp

March 21, 2010

My parents and I made plans to visit San Jose and Saratoga on my grandmother’s birthday, March 19, since that is where she grew up. I randomly saw someone tweet about the ACM Data Mining Camp unconference that happened to be the next day, March 20, only a couple of miles from our hotel in Santa Clara. This was...

Read more »

Mining Tuition Data for US Colleges and Universities, and a Tangent

January 30, 2010

I wrote this script for the UCLA Statistical Consulting Center. I don’t know all of the specifics, but one of our faculty members has this idea that we can help our paper, The Daily Bruin, with their graphics or something to that effect. I don’t quite understand because our paper has never really been big on graphics for data,...

Read more »

Data Visualization and R Programming Books (Updated)

January 7, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (PDF, 1.25 MB)
Download the Social Media Analytics Research Toolkit
Download Code Like A Pirate - The #rstats Appliance from the SUSE Gallery
Disclosure
As you probably ...

Read more »

Data Mining and R

October 24, 2009

This post lists a few data mining resources in R. I also provide a few observations on the distinction between data mining, data analysis, and statistics as it pertains to the analysis work that I do in psychology. Online Resources: The classic book The E...

Read more »

Data Mining and Statistics Video Course

October 3, 2009

David Mease has an online course presented with complete videos (Statistics 202: Statistical Aspects of Data Mining). The course uses Excel and R. I might update this post with a few notes below on what is covered as I get a chance to watch t...

Read more »