PCA or Polluting your Clever Analysis

August 31, 2012 | Christoph Molnar

When I learned about principal component analysis (PCA), I thought it would be really useful in big data analysis, but that's not true if you want to do prediction. I tried PCA in my first competition at Kaggle, but it delivered bad results. This post illustrates how PCA can pollute ...

Publishing in Veterinary Academic Journals

May 10, 2011 | denishaine

Following the post by Arthur Charpentier (Freakonometrics), I wondered what the outcome would be for my own field (veterinary medicine, epidemiology, bovine mastitis). Briefly, Arthur Charpentier’s post looked at clusters of journals publishing the same kind of papers. So I looked at 25 journals (Journal of Dairy Science, Canadian Journal ...


March 13, 2011 | Edwin Chen

Aaron Koblin’s Sheep Market visualization is an awesome use of Mechanical Turk. But it’d be even more awesome if the grid were ordered, so inspired by the use of eigenfaces in facial recognition, I decided to try projecting the sheep …

Clustering NHL Skaters

February 6, 2011 | --

I have been sitting on this post for some time now and wanted to get it out there. The goal is simply to show how easy it is to pull live data from the web into R, massage it, and perform some analytics on it. I am not sure how ...
