## Forbes: Top 20 influencers in Big Data

February 3, 2012
Haydn Shaughnessy at the Forbes blog provides a list of the "Top 20 Influencers in Big Data", and I'm humbled to report that yours truly is listed there at #2. It's an instantaneous ranking based on the social-media tracking tool Traackr, but it's still great to be listed alongside writers for SiliconAngle, GigaOM, and KDNuggets (and even Mashable!). I...

## New R User Groups in Austin, Adelaide

February 3, 2012
It's awesome to see so many local R user groups kicking off in 2011! Yet another is the Austin R User Group in Austin, Texas. They've already held their first informal get-together, and the first formal meeting on February 23 will be devoted to data management techniques in R. Props to Sandy Donlon for organizing the group! And I'm...

## Why don’t we hear more about Adrian Dantley on ESPN? This graph makes me think he was as good an offensive player as Michael Jordan.

February 3, 2012
In my last post I complained about efficiency not being discussed enough by NBA announcers and commentators. I pointed out that some of the best scorers have relatively low FG% or TS%. However, via the comments it was pointed out that top scorers need ...

## Large search spaces using R

February 3, 2012
I'm working on some really interesting stuff at the moment, the details of which I can't discuss for reasons of national security (not really). However, one of the things I've been doing a lot of is searching through lots of different combina...

## How many pages in Scott Walker Recall Petition PDF files?

February 3, 2012
Computer Assisted Reporting: In an online press release on Tuesday, the Wisconsin Government Accountability Board announced they would put all 153,335 pages of PDF copies of the Scott Walker recall petition online later that day. The GAB announced the PD...

## Green Disk Sizing

February 3, 2012
I finally got around to completing item 5 on my 2011 list concerning electrical power consumed by a magnetic hard disk drive (HDD). The semi-empirical statement is:

$$\text{Power} \propto N_{\text{platters}} \times \Omega^{2.8} \times D^{4.6} \tag{1}$$

where $N_{\text{platters}}$ is the number of platters on the spindle, $\Omega$ is the rotational speed in revolutions per minute (RPM) and $D$...
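
Because the statement above is a proportionality, it only supports comparing one drive against another; the unknown constant cancels in the ratio. A minimal Python sketch of that comparison follows. It assumes D is the platter diameter (the excerpt is cut off before defining it), and the function name `relative_power` and the drive figures are hypothetical, chosen only for illustration.

```python
def relative_power(platters, rpm, diameter, ref_platters, ref_rpm, ref_diameter):
    """Power of one drive relative to a reference drive under the scaling law.

    Only ratios are meaningful here: the law is a proportionality,
    so the unknown constant of proportionality cancels out.
    """
    return (
        (platters / ref_platters)
        * (rpm / ref_rpm) ** 2.8
        * (diameter / ref_diameter) ** 4.6
    )


# Hypothetical comparison: same platter count and diameter, 7200 vs 5400 RPM.
ratio = relative_power(4, 7200, 3.5, 4, 5400, 3.5)
print(f"7200 RPM drive draws about {ratio:.2f}x the power of the 5400 RPM drive")
```

With everything else held equal, the exponent of 2.8 on rotational speed means the faster spindle in this made-up comparison draws a bit more than twice the power.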

## Japan Quake Map 2010-2011

February 2, 2012
Introduction: The 3.11 Tohoku Earthquake did serious damage to Japan. I have attempted gaining

## Commonly used R commands (statistics)

February 2, 2012
When I say Ease of Use Improved, I mean you can simply copy, paste and run the code in this post, without referring to other places, and without downloading a data file and reading it into R. This is how I like a blog article to be. You don’t need to read the whole article. You

## speed of R, C, &tc.

February 2, 2012
My Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that matlab is a lot

## Confirming SSR, SSE, and SST using matrix in R

February 1, 2012
The code below was written in our regression laboratory class. Here, we first run the data in SPSS and take the ANOVA output, where we can find the computed values of SSR, SSE, and SST. ANOVAb Model Sum of Squares df Mean Square F Sig. 1 Regress...
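
As a rough illustration of the matrix approach the title refers to, here is a minimal sketch in plain Python rather than the post's R. It is not the post's own code: the data are made up and the helper names (`transpose`, `matmul`, `inverse_2x2`) are hypothetical. It uses the standard matrix forms SST = y'y − nȳ², SSR = b'X'y − nȳ², and SSE = y'y − b'X'y for a simple linear regression with an intercept.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Invert a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)

X = [[1.0, xi] for xi in x]  # design matrix with an intercept column
Y = [[yi] for yi in y]       # response as a column vector

Xt = transpose(X)
XtY = matmul(Xt, Y)
beta = matmul(inverse_2x2(matmul(Xt, X)), XtY)  # b = (X'X)^{-1} X'y

yty = matmul(transpose(Y), Y)[0][0]      # y'y
bxty = matmul(transpose(beta), XtY)[0][0]  # b'X'y
correction = n * (sum(y) / n) ** 2       # n * ybar^2

sst = yty - correction   # total sum of squares
ssr = bxty - correction  # regression sum of squares
sse = yty - bxty         # error sum of squares

print(f"SSR={ssr:.4f}  SSE={sse:.4f}  SST={sst:.4f}")
```

The check that matters is the identity SST = SSR + SSE; values computed this way can then be compared against the Sum of Squares column of an ANOVA table from another package.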