It’s common knowledge that the way you ask a question in a survey can bias the results you get. (It’s been a staple of political pollsters since the dawn of time.) But Aaron Shaw from Dolores Labs has used an interesting technique to *demonstrate* that bias: crowdsourcing. He asked the same question of crowdsourced respondents assigned randomly to one of two groups, and offered a different way for each group to respond:

*Q: About how many hours do you spend online per day?*

Group 1 selected from these responses:

(a) 0 – 1 hour

(b) 1 – 2 hours

(c) 2 – 3 hours

(d) More than 3 hours

Group 2 selected from these responses:

(a) 0 – 3 hours

(b) 3 – 6 hours

(c) 6 – 9 hours

(d) More than 9 hours

Each set of answers covers the entire range of possible hours in a day; they just group the hours into different buckets. In theory, you can estimate the true underlying distribution from either set of responses, and since respondents were assigned to the groups at random, the underlying distribution should be the same for both. With some analysis in R, though, Aaron discovers that's not the case. See the link below for the details.

Dolores Labs: Ask a Stupid Question
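To make the "in theory" point concrete, here is a small simulation sketch (in Python rather than R, for brevity). It assumes a hypothetical true distribution of daily online hours — exponential with mean 3, capped at 24; this is an illustration, not Aaron's data. If respondents answer honestly, any quantity that both bucket schemes can express, such as the share reporting 3 hours or less, should come out the same for both randomly assigned groups. The bias Aaron found is precisely a departure from this agreement.

```python
import random

random.seed(0)

# Hypothetical true distribution (an assumption for illustration, not
# Aaron's data): exponential daily online hours with mean 3, capped at 24.
hours = [min(random.expovariate(1 / 3.0), 24.0) for _ in range(200_000)]

# Randomly assign respondents to the two groups, as in the experiment.
group1 = hours[::2]
group2 = hours[1::2]

def bucket_shares(sample, edges):
    """Fraction of responses in each bucket; the final bucket is
    open-ended ('more than edges[-1] hours')."""
    counts = [0] * (len(edges) + 1)
    for h in sample:
        i = next((k for k, e in enumerate(edges) if h <= e), len(edges))
        counts[i] += 1
    return [c / len(sample) for c in counts]

g1 = bucket_shares(group1, [1, 2, 3])   # buckets: 0-1, 1-2, 2-3, >3
g2 = bucket_shares(group2, [3, 6, 9])   # buckets: 0-3, 3-6, 6-9, >9

# Honest responses imply agreement on overlapping quantities:
# Pr(hours <= 3) is g1[0] + g1[1] + g1[2] for group 1, and g2[0] for group 2.
p3_group1 = g1[0] + g1[1] + g1[2]
p3_group2 = g2[0]
```

With 100,000 simulated respondents per group, `p3_group1` and `p3_group2` agree to within sampling noise. In the real survey, the two groups' implied distributions diverged — evidence that the answer buckets themselves shaped the responses.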

To **leave a comment** for the author, please follow the link and comment on their blog: **Revolutions**.
