Did what you write drive what I read?

[This article was first published on Timothée Poisot » R, and kindly contributed to R-bloggers].

Google Reader allows you to track your activity by showing the number of news items read and published, by day and by hour. I use it quite a lot to stay up to date with the scientific literature (I am subscribed to probably more than 30 journals) and a bunch of other feeds. Items tend to accumulate in my unread folder faster than I go through them, so I often have a lot of material to read.

I was wondering whether the two patterns (what is published vs. what I read) share a common temporal dynamic. Using the digitize package and a screenshot, I was able to get my own data for the last 30 days, and an overview of the pattern by hour (for the same period).

My big question was: are there days of the week, or hours of the day, when I read more than the others publish? Using R and ggplot2, it was really easy to visualize the results. Below are the results sorted by day of the week (where 1 is Monday).

activity-byday.png

This tends to confirm what I had already observed. I let stuff accumulate over the week, and go through it from Friday night to Sunday. The fact that the numbers of items published and read are the same on the weekend may indicate that I am more prone to read news « as they happen » on those days. Perhaps this is explained by the fact that I spent all my weekends for the last month in the lab, where my RSS feeds were a good source of entertainment while waiting for bacteria to grow.
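For readers who want to reproduce this kind of by-day comparison, here is a minimal sketch in base R. It is not the code used for the figure above: the counts are simulated placeholders, and the column names (`published`, `read`, `weekday`) are assumptions chosen for illustration. It only shows how daily counts can be summed by day of the week, with Monday coded as 1.

```r
# Simulated placeholder data: 28 days of made-up counts, NOT the real numbers.
set.seed(1)
days <- seq(as.Date("2011-01-03"), by = "day", length.out = 28)
activity <- data.frame(
  date      = days,
  published = rpois(28, lambda = 120),
  read      = rpois(28, lambda = 100)
)

# "%u" gives the ISO weekday number: 1 = Monday, ..., 7 = Sunday.
activity$weekday <- as.integer(format(activity$date, "%u"))

# Sum both series for each day of the week.
by_day <- aggregate(cbind(published, read) ~ weekday, data = activity, FUN = sum)

# A ratio near 1 means items are read about as fast as they are published.
by_day$ratio <- by_day$read / by_day$published
print(by_day)
```

The resulting `by_day` table has one row per weekday and could be passed to ggplot2 or to base `barplot()` for plotting.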

Now that it is established that I read stuff « as it happens » on the weekend, and less so during the week, let’s focus on the hourly pattern.

activity-byhour.png

As I read it, this graph sums all the events that occurred at a given hour over a 30-day period, so there could be a bias if my daily pattern differs between the week and the weekend. Anyway. I am usually asleep between 1am and 7 to 8am, which shows up as a strong decrease in my reading activity. As a matter of fact, fewer items are published during the same hours (probably because European feeds are quite numerous in my Google Reader).

My strongest period of activity is between 5pm and 9pm, and overall my reading pattern is similar to the publishing pattern. But are the residuals evenly distributed? As shown below, no, they are not.

activity-byhour-2.png

I tend to read more than expected around noon (usually a much more casual time in the lab) and after 8pm, i.e. when I’m back home.
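The post does not say how the residuals were computed. One plausible way to obtain them, sketched below under that assumption, is a simple linear regression of the hourly number of items read on the hourly number of items published. The numbers are made up for illustration, and the variable names are hypothetical.

```r
# Made-up hourly totals for hours 0 to 23 (NOT the data from the screenshot).
hours <- 0:23
published <- c(20, 12,  8,  6,  5,  7, 15, 30, 45, 55, 60, 62,
               58, 60, 63, 65, 66, 70, 68, 60, 50, 42, 35, 28)
read      <- c(10,  3,  1,  0,  0,  0,  1,  8, 25, 35, 40, 55,
               60, 38, 36, 40, 42, 60, 62, 58, 52, 45, 30, 18)

# Assumed model: reading volume as a linear function of publishing volume.
fit <- lm(read ~ published)

# Positive residual: more reading than the publication volume predicts.
res <- residuals(fit)
hourly <- data.frame(hour = hours, residual = round(res, 1))

# Hours at which reading exceeds the fitted expectation.
print(hourly[res > 0, ])
```

Plotting `hourly$residual` against `hourly$hour` gives the same kind of picture as the figure above: if the residuals cluster by time of day instead of scattering around zero, they are not evenly distributed.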

So, all in all, I guess the message is that the volume of published news does shape my reading pattern to some extent (otherwise I would spend hours going through stuff), but the divergences between what I am actually reading and what I « should » be reading based on the publication volume are not evenly distributed through time.
