October 27, 2010

(This article was first published on Zero Intelligence Agents » R, and kindly contributed to R-bloggers)

Last week I participated in bit.ly’s fourth hackabit hack-a-thon, a wonderful opportunity for NYC-area hackers to get together, eat pizza, drink energy drinks, and stay up late hacking with some of the best data geeks around. I was lucky enough to sidle up next to Hilary Mason, bit.ly’s lead scientist, an all-around awesome hacker who was recently named one of New York’s 100 Coolest Tech People by Business Insider. We thought it would be fun to plot the locations from which people were sharing links about New York City.

With her exclusive access to bit.ly’s massive data set, Hilary scraped a random sample of shared links about NYC and we parsed the geo-location data, she in Python and I in R, just for fun. With the latitude and longitude data in hand, I set off to generate some visualizations of where the shared links were coming from. Below are three of those visualizations with increasing focus on New York City.

As should be no surprise, this small sample shows that New Yorkers share the most links about New York, and I suspect that a larger sample would only reinforce this. Plotted points are sized by the relative number of links coming from each location; the modal frequency is one shared link per lat/long pair.
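The sizing described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the `links` data frame, its column names, and its coordinates are made up here and are not the bit.ly sample.

```r
# Hypothetical sample: one row per shared link (made-up coordinates).
links <- data.frame(
  lat = c(40.71, 40.71, 40.71, 37.77, 38.90, 42.36),
  lon = c(-74.01, -74.01, -74.01, -122.42, -77.04, -71.06)
)

# Count the links at each distinct lat/long pair.
counts <- aggregate(n ~ lat + lon, data = transform(links, n = 1L), FUN = sum)

# Relative number of links per location; this is the value a plot
# could map to point size.
counts$share <- counts$n / sum(counts$n)

# Modal frequency: the most common number of links per location.
freq_table <- table(counts$n)
modal_n <- as.integer(names(freq_table)[which.max(freq_table)])

print(counts)
print(modal_n)  # 1 for this made-up sample
```

In this toy sample one location accounts for three links and the other three locations for one each, so most points would be drawn at the smallest size, which is the pattern a modal frequency of one implies.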

Interestingly, it appears that people in the San Francisco Bay Area and the Washington, DC metropolitan area are the next most frequent sharers. For all the talk about how the New York tech scene will never be as good as the Bay Area’s, it seems many people there find our city interesting enough to share links about it.

Here I have zoomed in on the Northeastern/Tri-state area of the U.S. to bring New York City further into focus. It was fun playing with different map projections to literally bend the world around NYC and make it the center of this map’s universe. For this New York-centric perspective I used ggplot2’s `coord_map` with the options `projection="lambert", lat0=-75, lat1=-72`.
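For readers who want to try a projection like this, here is a minimal sketch rather than the code behind the maps in this post. It assumes the ggplot2, maps, and mapproj packages are installed. The `counts` data frame and the choice of states are hypothetical, and only the `coord_map` options are taken from the text above.

```r
library(ggplot2)  # map_data() needs the maps package; coord_map() needs mapproj

# Hypothetical per-location link counts (made-up values for illustration).
counts <- data.frame(
  lon = c(-74.01, -73.95, -75.16, -71.06),
  lat = c(40.71, 40.65, 39.95, 42.36),
  n   = c(3, 1, 1, 1)
)

# State outlines for a handful of Northeastern states (an arbitrary selection).
northeast <- map_data(
  "state",
  region = c("new york", "new jersey", "connecticut",
             "pennsylvania", "massachusetts")
)

p <- ggplot() +
  geom_polygon(data = northeast,
               aes(x = long, y = lat, group = group),
               fill = "grey95", colour = "grey60") +
  # Point size follows the number of links at each location.
  geom_point(data = counts, aes(x = lon, y = lat, size = n), alpha = 0.6) +
  # Projection options copied from the post; extra arguments are
  # passed on to mapproj as the projection's parameters.
  coord_map(projection = "lambert", lat0 = -75, lat1 = -72)

# print(p) draws the map on the active graphics device.
```

In mapproj, `lat0` and `lat1` are the two standard parallels of the Lambert projection, so changing them is an easy way to experiment with how strongly the map bends.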

At further magnification, it appears that no links are being shared from the island of Manhattan. While this is quite possible, it is also likely due to a lack of accuracy in the geo-location.

A quick and dirty data visualization project is the perfect thing to dig into at these hack-a-thons, and I encourage others to come to future events. I am sure that bit.ly will host a hackabit five, but in the meantime check out the Great Urban Hack, happening on November 6th. Hope to see you there!

