People are weird. And if there’s anything that’s greater collective proof of this fact than Reddit, you’d be hard pressed to find it.I tend to put reddit in the same bucket as companies like Google, Amazon and Netflix, where they have enough money, or freedom, or both, to say something like “wouldn’t it be cool if….?” and then they do it simply because they can. Enter “the button” (/r/thebutton), reddit’s great social experiment that appeared on April Fool’s Day of this year. An enticing blue rectangle with a timer that counts down from 60 to zero that’s reset when the button is pushed, with no explanation as to what happens when the time is allowed to run out. Sound familiar? The catch here being that it was an experience shared by anyone who visited the site, and each user also only got one press (though many made attempts to game the system, at least initially). Finally, the timer reached zero, the last button press being at 2015-06-05 21:49:53.069000UTC, and the game (rather anti-climactically I might offer) ended. What does this have to do with people being weird? Well, an entire mythology was built up around the button, amongst other things. Okay, maybe interesting is a better word. And maybe we’re just talking about your average redditor. Either way, what interests me is that when the experiment ended, all the data were made available. So let’s have a look shall we?
The dataset consists of simply four fields:
press time, the date and time the button was pressed
flair, the flair the user was assigned given at what the timer was at when they pushed the button
css, the flair class given to the user
and lastly outage press, a Boolean indicator as to if the press occurred during a site outage.
The data span a time period from 2015-04-01 16:10:04.468000 to 2015-06-05 21:49:53.069000, with a total of 1,008,316 rows (unique presses).
I found there was css missing for some rows, and a lot of of “non presser” flair (users who were not eligible to press the button as their account was created after the event started). For these I used a “missing” value of -1 for the number of seconds remaining when the button was pushed; otherwise it could be stripped from the css field.
With this data set, we’re looking at a pretty straightforward categorical time series.
Overall Activity in Time
First we can just look at the total number of button presses, regardless of what the clock said (when they occurred in the countdown) by plotting the raw number of presses per day:
We can see this if we plot a histogram of the button presses by using the CSS class which gives the more granular seconds remaining, and use breaks the same as above:
as far as I can tell). You can see that plotting a bar graph is similar to the above histogram with the difference being contained in the “cheater” class:
small multiples for more clarity, but it’s not the easiest to do using ggplot.
Another way to look at it would be to look at the percent of presses by class per day. I’ve written previously about how stacked area graphs are not your friend, but in this case it’s actually not too bad (plus I wanted to learn how to do it in ggplot). If anything it shows the increase presses in the 51-60 range right after the outage on May 18, and the increase in the 0-10 range toward the end (green):
Hmmm… you can see there is a massive spike at the beginning of the graph and there’s much, much fewer for the rest of the duration of the experiment. In fact, nearly 32% of all clicks occurred in the first day, and over half (51.3%) in the first two days.
I think has something to do with both the initial interest in the experiment when it first was announced, and also with the fact that the higher the counter is kept at, the more people can press the button in the same time period (more on this later).
Perhaps a logarithmic graph for the y-axis would be more suitable?
That’s better. We can see the big drop-off in the first two days or so, and also that little blip around the 18th of May is more apparent. This is likely tied to one of several technical glitches which are noted in the button wiki,For a more granular look, let’s do the hourly presses as well (with a log scale): Universal Time. Unfortunately there is no geo data, but assuming most redditors who pushed the button are within the continental United States (a rather fair assumption) the high between 5-7 PM would be 11 AM to 1 PM (so, around your lunch hour at work). But wait, that was just the overall sum of hours over the whole time period. Is there a daily pattern? What about by hour and day of week? Are most redditors pushing the button on the weekend or are they doing it at work (or during school)? We should look into this in more detail. flair classes served as a bucketing which functioned very much like a histogram:
|Color||Have they pressed?||Can they press?||Timer number when pressed|
|Purple||Y||N||60.00 ~ 51.01|
|Blue||Y||N||51.00 ~ 41.01|
|Green||Y||N||41.00 ~ 31.01|
|Yellow||Y||N||31.00 ~ 21.01|
|Orange||Y||N||21.00 ~ 11.01|
|Red||Y||N||11.00 ~ 00.00|
The button experiment is over. In the end there was no momentous meaning to it all, no grand scheme or plan, no hatch exploding into the jungle, just an announcement that the thread would be archived. Again, somewhat anti-climactic.
But, it was an interesting experiment. This was an interesting data set, given the relationship between the amount of data that could exist in the same interval of time because of the nature of it.
And I think it really says something about what the internet allows us to do (both in terms of creating something simply for the sake of it, and collecting and analyzing data), and also about people’s desire to find patterns and create meaning in things, no matter what they are. If you’d asked me, I never would have guessed religions would have sprung up around something as simple as pushing a button. But then again, religions have sprung up around stranger things.
You can read and discuss in the button aftermath thread, and if you want to have a go at it yourself, the code and data are below. Until next time I’ll just keep pressing on.
References & Resources
the button press data (from reddit’s github)
R code for plots