Blog Archives

Code for Machine Learning for Hackers

February 16, 2012

With the release of the eBook version of Machine Learning for Hackers this week, many people have been asking for the code. With good reason—as it turns out—because O’Reilly still (at the time of this writing) has not updated the book page to include a link to the code. For those interested, my co-author John Myles

Read more »

Create an animated clock in R with ggplot2 (and ffmpeg)

August 12, 2011

Because it’s Friday, and I needed to create this for a separate visualization, here is how to create an animated clock in R using ggplot2, in just about 20 lines of code! And here is the clock… I think this is a nifty way to show elapsed time, rather than the windowed timelines I had used previously.

Read more »

Clustering U.S. Senators using roll call voting data

July 22, 2011

For our forthcoming book on machine learning for hackers, John Myles White and I will discuss clustering, and various methods for doing so. One common method for clustering observations

Read more »

A year of Chicago’s crime, in 30 seconds

June 21, 2011

Yesterday Brett Goldstein, the Chief Data Officer for the City of Chicago, announced on Twitter the release of Chicago’s crime data for the past year. The data is very detailed, and a wonderful resource for criminologists and social scientists alike. I have been playing around with the data a bit, and have produced an animation

Read more »

stalkR: R functions for exploring iPhone and iPad (OS X only)

April 21, 2011

Yesterday Alasdair Allan and Pete Warden shocked the world by revealing that iPhones and iPads have been keeping track of our every move, and saving the data in obfuscated backup files. As my friend Vince Buffalo mentioned on Twitter, part of me was disgusted by the secret stalking Steve Jobs was doing, but my

Read more »

EC2 AMI for scientific computing in Python and R

April 11, 2011

Like many people who crunch numbers frequently, I have increasingly been integrating Amazon’s cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop for literally pennies on the

Read more »

Updated infochimps R package, includes several new APIs

March 21, 2011

Recently, the good folks at Infochimps.com rolled out a series of new APIs to add to their already impressive set of data resources. I have been in a perpetual state of catch-up since the new year, so I have only now got around to adding some of these new APIs to the infochimps R package. Here

Read more »

Happy Pi Day, Now Go Estimate It!

March 14, 2011

As you may know, today is Pi Day, when all good nerds take a moment to thank the geeks of antiquity for their painstaking work in estimating this marvelous mathematical constant. It is also a great opportunity to thank contemporary geeks for the wonders of modern computing, which allow us to estimate pi to near

Read more »

Amanda Cox on How The New York Times Graphics Department Uses R

March 14, 2011

Last month, Amanda Cox from The New York Times Graphics Department gave a great talk to the NYC R Statistical Programming Meetup. I’ve just got around to uploading the video, which has been broken into part one and part two. You can also view the videos embedded after the jump. Amanda made use of

Read more »

Language used by Academics with the Protection of Anonymity

March 14, 2011

Those in the political science discipline probably remember their first encounter with poliscijobrumors.com. If you are outside the discipline, you have probably never heard of this particular message board, and you would have no reason to. As the URL suggests, the board specializes in rumor, gossip, backbiting, mudslinging, and the occasional lucid thread on the political science

Read more »