December 14, 2010

Paul Butler, an intern on Facebook's data infrastructure engineering team, was interested in visualizing the "locality of friendship". Luckily, he had some great data to work with: Facebook's social network of the friendships between its 500 million members. But visualizing that much data can be a challenge in its own right: it takes skill to draw meaning from what could easily be an incomprehensible mess of data. After drawing a sample of 10 million friend pairs through the Hive interface to Facebook's Hadoop-based database, Paul set about using R to solve this visualization problem.

As Paul describes in a post on his Facebook page, initial attempts to visualize the data resulted in a "big white blob" roughly resembling the outline of the continents. But when Paul switched from plotting every friend pair to plotting every city pair, using a great-circle line whose transparency was determined by the number of friend pairs between those cities, something beautiful emerged: a clear image of the world, with friendship bonds flowing between the continents:


(Visit Paul's post to download a super hi-res version.)

This is a beautiful image, and a testament to Paul's visualization skills (with a little help from the graphical prowess of R). Not only can you see the population centers in bright white (from the density of intra-city friendships), you can also see clear country outlines: look how visible India is, floating in the dark void of China and Russia. You can also see cultural relationships: Hawaii to the continental US; Australia to New Zealand; India to the UK. (That last one is a bit hard to see, though. It would be fascinating to see this as a 3-D globe, with relationships flowing through the middle of the globe.) Paul sums up the impact of this visualization best in his own words:

"After a few minutes of rendering, the new plot appeared, and I was a bit taken aback by what I saw. The blob had turned into a surprisingly detailed map of the world. Not only were continents visible, certain international borders were apparent as well. What really struck me, though, was knowing that the lines didn't represent coasts or rivers or political borders, but real human relationships. Each line might represent a friendship made while travelling, a family member abroad, or an old college friend pulled away by the various forces of life."
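For readers curious how the city-pair idea might look in code, here is a minimal sketch. It is not Paul's code (he worked in R, and his script isn't reproduced here); it is written in Python using only the standard library, and every function name in it is hypothetical. It shows the three ingredients described above: collapsing friend pairs into city-pair counts, generating points along a great-circle arc between two cities, and turning a count into a line transparency. The logarithmic scaling of transparency is an assumption made for illustration; the post only says that transparency depended on the number of friend pairs.

```python
import math
from collections import Counter


def aggregate_city_pairs(friend_pairs):
    """Count friendships per unordered city pair.

    friend_pairs is an iterable of (city_a, city_b) tuples, one per
    friendship. (A, B) and (B, A) are counted as the same pair.
    """
    counts = Counter()
    for city_a, city_b in friend_pairs:
        counts[tuple(sorted((city_a, city_b)))] += 1
    return counts


def great_circle_points(lat1, lon1, lat2, lon2, n=50):
    """Return n + 1 (lat, lon) points, in degrees, along the great-circle
    arc between two locations, using spherical linear interpolation.

    Exactly antipodal endpoints have no unique great circle and are not
    handled here.
    """
    def to_xyz(lat, lon):
        phi, lam = math.radians(lat), math.radians(lon)
        return (math.cos(phi) * math.cos(lam),
                math.cos(phi) * math.sin(lam),
                math.sin(phi))

    a, b = to_xyz(lat1, lon1), to_xyz(lat2, lon2)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)  # angle between the two points
    if omega < 1e-12:       # same location: nothing to interpolate
        return [(lat1, lon1)] * (n + 1)

    points = []
    for i in range(n + 1):
        t = i / n
        wa = math.sin((1 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        lat = math.degrees(math.atan2(z, math.hypot(x, y)))
        lon = math.degrees(math.atan2(y, x))
        points.append((lat, lon))
    return points


def alpha_for_count(count, max_count, min_alpha=0.02):
    """Map a friend-pair count (>= 1) to a line opacity in [min_alpha, 1].

    Log scaling is an illustrative assumption, chosen so that a few very
    busy city pairs do not wash out everything else.
    """
    if max_count <= 1:
        return 1.0
    scaled = math.log(count) / math.log(max_count)
    return min_alpha + (1 - min_alpha) * scaled
```

To draw the map, you would loop over the city-pair counts, look up each city's coordinates, and draw the arc from `great_circle_points` with the opacity from `alpha_for_count` using whatever plotting library you prefer. Drawing the faintest pairs first and the busiest last keeps the strong connections on top.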

Mashable also has a nice review of this chart … with 4,780 Facebook Likes at the time of writing. It was also featured on FlowingData and ReadWriteWeb.

Paul Butler: Visualizing Friendships


