Articles by Capehart

Data Science Undefined

April 4, 2012 | Capehart

One of the favorite bar room discussions of statisticians, machine learners, and computer scientists is – what is data science? (And I don’t care whether it happens in a bar or not, it’s a “bar room” discussion by virtue of... [Read more...]

Radical Education Reform? Think Bigger.

April 2, 2012 | Capehart

“My job is to teach you how to think.” –Hugh Young

A few days ago John Naughton published an article summarizing his manifesto on how to reform computer science education. I agree computer science education is in need of drastic... [Read more...]

Missing Data Club

April 1, 2012 | Capehart

Welcome to Missing Data Club. There are only three rules. Rule #1 is: There is no missing data. Rule #2 is: THERE IS NO MISSING DATA! Rule #3: If you’ve never built a model using missing data – you must do it... [Read more...]

A Crash Course in git for Data Scientists

March 10, 2012 | Capehart

I really like git. It’s the first versioning tool I’ve ever used so I have nothing else to compare it to, but in the world of statistical model building where iteration is constant (and almost never a strict linear progression)... [Read more...]

Get ROAuth to work on Windows 7

March 10, 2012 | Capehart

Jeff Gentry has created a couple of really fun and handy R packages for working with Twitter data, called twitteR and ROAuth. He’s also written an easy-to-read vignette on how to get started. As of right now (March... [Read more...]

Thoughts on SPSS and R Integration

March 10, 2012 | Capehart

As part of considering SPSS as a platform for modeling, I wanted to test SPSS’ integration with R. What I found out is that getting SPSS to work with R isn’t embarrassingly obvious. What’s worse, I found it quite difficult to... [Read more...]

