Survey question biases and crowdsourcing

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

It’s common knowledge that the way you ask a question in a survey can bias the results you get. (It’s been a staple of political pollsters since the dawn of time.) But Aaron Shaw from Dolores Labs has used an interesting technique to demonstrate that bias: crowdsourcing. He asked the same question of crowdsourced respondents assigned randomly to one of two groups, and offered a different way for each group to respond:

Q: About how many hours do you spend online per day?

Group 1 selected from these responses:

(a) 0 – 1 hour

(b) 1 – 2 hours

(c) 2 – 3 hours

(d) More than 3 hours

Group 2 selected from these responses:

(a) 0 – 3 hours

(b) 3 – 6 hours

(c) 6 – 9 hours

(d) More than 9 hours

Each set of answers covers the entire range of possible hours in a day, just grouped into different buckets. In theory, you can estimate the same underlying distribution from either set of responses. Both scales share a cut point at 3 hours, for example, so the share of Group 1 answering “More than 3 hours” should roughly match the share of Group 2 choosing anything other than “0 – 3 hours”. Since respondents were assigned to the groups at random, the underlying distributions for each group should be the same. With some analysis in R, though, Aaron discovers that’s not the case. See the link below for the details.
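One way to check this kind of claim is to compare the two groups at the one boundary both scales share, 3 hours. The sketch below is not Aaron's analysis (his was done in R, and this is Python). It is a minimal illustration that assumes a plain two-proportion z-test is an acceptable comparison. The response counts are invented placeholders, and the direction and size of the gap they produce are arbitrary.

```python
import math

# Hypothetical counts, invented purely for illustration. They are NOT
# the results of the Dolores Labs experiment.
group1 = {"0-1": 10, "1-2": 20, "2-3": 30, "3+": 40}
group2 = {"0-3": 30, "3-6": 40, "6-9": 20, "9+": 10}


def share_over_three(counts, under_three_labels):
    """Return (share of respondents above the 3-hour cut, sample size)."""
    total = sum(counts.values())
    under = sum(counts[label] for label in under_three_labels)
    return (total - under) / total, total


def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for equal proportions, using the pooled estimate."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


p1, n1 = share_over_three(group1, ["0-1", "1-2", "2-3"])
p2, n2 = share_over_three(group2, ["0-3"])
z, p_value = two_proportion_z(p1, n1, p2, n2)

print(f"Group 1 share over 3 hours: {p1:.2f} (n={n1})")
print(f"Group 2 share over 3 hours: {p2:.2f} (n={n2})")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2g}")
```

If the answer scale had no effect, the two shares would differ only by sampling noise and the p-value would usually be large. A small p-value suggests the response options themselves shifted the answers.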

Dolores Labs: Ask a Stupid Question
