# Anecdotal Evidence that Facebook Stores all Clicks?

April 11, 2010
(This article was first published on Byte Mining, and kindly contributed to R-bloggers)

This is not really news. A few months ago, news broke that Facebook recorded each user’s clicks and profile views in a database. Of course, I am not at all surprised. I would be more surprised if they didn’t store every single click.

By now, most people have some sense as to how Facebook’s recommendation system works. It typically performs what one of my professors called “completing the triangle.” If users $A$ and $B$ are friends, and users $B$ and $C$ are friends, Facebook may hypothesize that $A$ and $C$ should also be friends. Of course, Facebook’s algorithm is not that naive. Consider a slightly more realistic example in the graph below. I must provide a picture, otherwise I will end up using “recursive language” (i.e. “friends of a friend of a friend that’s friends with…”). The red lines represent existing friendships. This graph consists of two triangles, one containing one man and the two women, and another containing one woman and the two men. Facebook would most likely conclude that the two people with spiky hair should be friends, denoted by the green dashed line.
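The "completing the triangle" idea can be sketched in a few lines of Python. This is only an illustration of the mutual-friend heuristic, not Facebook's actual algorithm. The names (`man_1`, `woman_2`, and so on) and the helper functions `build_graph` and `suggest_friends` are hypothetical, and the edge list is my reading of the two-triangle graph described above.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(friendships):
    """Return an adjacency map {person: set of friends} from undirected pairs."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def suggest_friends(graph):
    """Rank pairs who are not yet friends by their number of mutual friends."""
    suggestions = []
    for a, b in combinations(sorted(graph), 2):
        if b in graph[a]:
            continue  # already friends, nothing to suggest
        mutual = graph[a] & graph[b]
        if mutual:
            suggestions.append((a, b, len(mutual)))
    # Most mutual friends first; ties broken alphabetically
    return sorted(suggestions, key=lambda s: (-s[2], s[0], s[1]))


# Hypothetical reconstruction of the figure: two triangles sharing one edge.
friendships = [
    ("man_1", "woman_1"), ("man_1", "woman_2"), ("woman_1", "woman_2"),
    ("man_1", "man_2"), ("woman_1", "man_2"),
]
graph = build_graph(friendships)
print(suggest_friends(graph))  # [('man_2', 'woman_2', 2)]
```

The only pair left unconnected is the one sharing two mutual friends, which corresponds to the green dashed line. A real system would presumably weigh far more than a raw count.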

On Facebook, I am a member of several different network clusters, as most people are. These include high school friends, dormmates from two different schools, classmates from my college major, graduate school friends, and coworkers from a job I had in college. I will focus on this last group. The job was very “social”: it was a lot of fun, and socializing with patrons and coworkers was essential. In other words, it was not a professional, cubicle-type position. Most of us are friends on Facebook, and the group is (or was) very tight in a social networking sense. At last check, a randomly chosen coworker and I had 50 Facebook friends in common. In this subgraph, the lack of a connection is more telling than a connection. Typically, if someone is not friends with another person in the cluster, there is a story behind it.

For me, there was one such person in particular. He was a supervisor, which is why we were not friends on Facebook. He and I shared not only the friends from this job, but also several friends among the guys I lived with my senior year in college. Of anybody out there on Facebook, he and I probably have the most friends in common. Naturally, he should have been the first person displayed to me in the “People You May Know” system; his ranking should have been sky high. Yet for a long time he never appeared there.

So what does the ranking take into account? I have no clue, though network position and common ties are surely both very important. Two important questions come to mind:

I am not blocked, so why did he not show up for so long?

• Since I communicate far more with other clusters that reflect my current life, perhaps he and other people I “possibly know” were down-weighted.
• Perhaps this cluster is partitioned into sub-clusters, and since I avoided him and his sub-cluster (other supervisors), our connection was down-weighted.
• We have little or nothing in common except friends, and so he was down-weighted.

Why did he show up on that fateful day? Or, The Plot Thickens (the point of this post)

One typical day I logged onto Facebook and saw his face in the “People You May Know” section of the screen, along with the fact that we had 60-something friends in common. I will call him Bill. I saw that he now works for the same employer as my father. I had not known this until then, and neither had my dad. My dad’s coworkers are very tight-knit despite the size of the organization. I told him about Bill.

The next day my dad called me and said, “You are never going to guess who I just met today. Bill!” In this profession, people frequently move around and work at other locations, particularly for overtime days. They typically discuss shift information with each other and know who the “supervisor” is at each location. More likely than not, Bill put two and two together. My last name isn’t terribly common (as a last name) in the US. He thought perhaps we were from the same family. He then went on Facebook and looked at my profile to see if he could find any information linking my dad and me to confirm his suspicion. Facebook then recorded this click/profile view, combined it with network position metrics, and bingo, he ended up being recommended to me.
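To make the hypothesis concrete, here is a toy scoring function in Python. Everything in it is an assumption made for illustration: the function name `recommendation_score`, the log damping, the `interaction_weight` down-weighting, and the `view_boost` multiplier are all made up, and none of it reflects how Facebook actually ranks people. It only shows how a single recorded profile view could lift a heavily down-weighted candidate past others.

```python
import math


def recommendation_score(mutual_friends, recent_profile_views=0,
                         interaction_weight=1.0, view_boost=2.0):
    """Toy ranking score (hypothetical, not Facebook's formula).

    The mutual-friend count is log-damped, scaled by how much the two
    clusters interact, then multiplied up for each recent profile view.
    """
    base = math.log1p(mutual_friends) * interaction_weight
    return base * (1.0 + view_boost * recent_profile_views)


# Bill: about 60 mutual friends, but from a cluster I rarely interact with.
bill_before = recommendation_score(60, interaction_weight=0.3)
bill_after = recommendation_score(60, recent_profile_views=1,
                                  interaction_weight=0.3)

# Some other candidate: fewer mutual friends, but from an active cluster.
other = recommendation_score(10, interaction_weight=1.0)

print(bill_before < other < bill_after)  # True
```

With these made-up weights, Bill ranks below the other candidate until one profile view is recorded, after which he jumps ahead. That is the pattern I observed, though many other explanations would fit equally well.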

This is no Earth-shattering blog post, but it was definitely an “Oh wow!” moment with respect to Facebook.

The icons used in the social graph in this post are from Ikonka.
