Blog Archives

My Day at ACM Data Mining Camp III

November 13, 2010

My first time at ACM Data Mining Camp was so awesome that I was thrilled to make the trip up to San Jose for the November 2010 version. In July, I gave a talk at the Emerging Technologies for Online Learning Symposium, held at the Fairmont, with a faculty member in the Department of Statistics. The place was amazing,...

Read more »

UCLA Statistics: Analyzing Thesis/Dissertation Lengths

September 29, 2010

As I am working on my dissertation and piecing together a mess of notes, code and output, I am wondering to myself, “how long is this thing supposed to be?” I am definitely not in this to win the prize for longest dissertation. I just want to say my piece, make my point and move on. I’ve heard that...

Read more »

Taking R to the Limit, Part II – Large Datasets in R

August 20, 2010

For Part I, Parallelism in R, click here. Tuesday night I again had the opportunity to present on high performance computing in R at the Los Angeles R Users’ Group. This was the second part of a two-part series called “Taking R to the Limit: High Performance Computing in R.” Part II discussed ways to work with large datasets...

Read more »

Hitting the Big Data Ceiling in R

May 16, 2010

As a true R fan, I like to believe that R can do anything, no matter how big, how small or how complicated: there is some way to do it in R. I decided to approach my large, sparse matrix problem with this attitude. But here I sit, a broken man. There is no “native” big data support built into...

Read more »

Opening Statements on Markov Chain Monte Carlo

April 1, 2010

This quarter I am TAing UCLA’s Statistics 102C, Introduction to Monte Carlo Methods, for Professor Qing Zhou. This course did not exist when I was an undergraduate, and I think it is pretty rare to teach Monte Carlo (minus the bootstrap, if you count that) or MCMC to undergrads. I am excited about this class because to me, MCMC...

Read more »

My Experience at ACM Data Mining Camp #DMcamp

March 21, 2010

My parents and I made plans to visit San Jose and Saratoga on my grandmother’s birthday, March 19, since that is where she grew up. I randomly saw someone tweet about the ACM Data Mining Camp unconference that happened to be the next day, March 20, only a couple of miles from our hotel in Santa Clara. This was...

Read more »

Exact Complexity of Mergesort, and an R Regression Oddity

February 13, 2010

It’s nice to be back after a pretty crazy two weeks or so. Let me start off by stating that this blog post is simply me pondering and may not be correct. Feel free to comment on inaccuracies or improvements! In preparation for an exam, and given my natural tendency toward masochism, I am forcing myself to find the exact...

Read more »

Mining Tuition Data for US Colleges and Universities, and a Tangent

January 30, 2010

I wrote this script for the UCLA Statistical Consulting Center. I don’t know all of the specifics, but one of our faculty members has this idea that we can help our paper, The Daily Bruin, with their graphics or something to that effect. I don’t quite understand because our paper has never really been big on graphics for data,...

Read more »

Advanced Graphics in R

January 27, 2010

Each quarter the UCLA Statistical Consulting Center hosts minicourses twice per week in R and LaTeX. Tonight was my turn to present. I presented Advanced Graphics in R. This was the same presentation I gave at the LA R Users’ Group in August with a fellow consultant. She and I had trouble coming together to make one presentation, so we...

Read more »