Analyzing Multiple Response Questions

[This article was first published on R | Fahim Ahmad, and kindly contributed to R-bloggers.]

There are at least two main approaches to storing multiple response questions in a data set: the indicator mode and the polytomous mode.

Polytomous mode

The polytomous mode is suitable when the possible response categories are not fixed and the responses are recorded according to their order of appearance.

For example, consider the following question.

What are your favorite statistical software packages? (up to two answers allowed)
Response 1: _____________
Response 2: _____________

The collected data would then look like the following:

polytomous <- data.frame(
response1 = c("R", "Python","R", "Stata", "SPSS","Python","Stata","SPSS","R"),
response2 = c("Python", "R", "Stata", "SPSS", "R", "R", "R", "Stata", "SPSS")
)
polytomous
## response1 response2
## 1 R Python
## 2 Python R
## 3 R Stata
## 4 Stata SPSS
## 5 SPSS R
## 6 Python R
## 7 Stata R
## 8 SPSS Stata
## 9 R SPSS

Indicator mode

The indicator mode refers to the situation where the data are stored as a set of indicator variables/columns. Consider the above question rephrased as follows:

Which of the following are your favorite statistical packages?
a) R
b) Stata
c) Python
d) SPSS

The most straightforward way to store the above data is to construct a set of indicator (dummy) variables, one variable/column for each response option.

In this case, the data would look like the following:

indicator <- data.frame(
R = c("Yes","Yes","Yes", "No", "Yes","Yes","Yes","No","Yes"),
Stata = c("No","No","Yes","Yes","No","No","Yes","Yes","No"),
Python = c("Yes","Yes","No","No","No","Yes","No","No","No"),
SPSS = c("No","No","No","Yes","Yes","No","No","Yes","Yes")
)
indicator
## R Stata Python SPSS
## 1 Yes No Yes No
## 2 Yes No Yes No
## 3 Yes Yes No No
## 4 No Yes No Yes
## 5 Yes No No Yes
## 6 Yes No Yes No
## 7 Yes Yes No No
## 8 No Yes No Yes
## 9 Yes No No Yes
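
The two storage modes carry the same information here, so one can be derived from the other. The following is a minimal base R sketch of that conversion, not part of the original analysis. The names `packages` and `derived` are introduced only for illustration, and the code assumes that no response is missing.

```r
# Same polytomous data as in the example above
polytomous <- data.frame(
  response1 = c("R", "Python", "R", "Stata", "SPSS", "Python", "Stata", "SPSS", "R"),
  response2 = c("Python", "R", "Stata", "SPSS", "R", "R", "R", "Stata", "SPSS")
)

# Answer options, in the order used by the indicator version of the question
packages <- c("R", "Stata", "Python", "SPSS")

# For each package, flag respondents who named it in either response column.
# Assumption: there are no NA responses; with NAs the comparison would need
# extra handling (for example %in% applied row by row).
derived <- as.data.frame(
  sapply(packages, function(p) {
    ifelse(polytomous$response1 == p | polytomous$response2 == p, "Yes", "No")
  })
)
derived
```

With the example data, `derived` should hold the same Yes/No pattern as the `indicator` data frame shown above.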

Analyzing multiple response questions

If the data are stored in indicator mode, the common approach is to tabulate each variable separately.

# install.packages("dplyr")
library(dplyr)
# R
# round(prop.table(table(indicator$R))*100,1)
count(indicator, R, name = "Freq") %>% mutate(Percent = round(Freq/sum(Freq)*100, 1))
# Stata
# round(prop.table(table(indicator$Stata))*100,1)
count(indicator, Stata, name = "Freq") %>% mutate(Percent = round(Freq/sum(Freq)*100, 1))
# Python
# round(prop.table(table(indicator$Python))*100,1)
count(indicator, Python, name = "Freq") %>% mutate(Percent = round(Freq/sum(Freq)*100, 1))
# SPSS
# round(prop.table(table(indicator$SPSS))*100,1)
count(indicator, SPSS, name = "Freq") %>% mutate(Percent = round(Freq/sum(Freq)*100, 1))
## R Freq Percent
## 1 No 2 22.2
## 2 Yes 7 77.8
## Stata Freq Percent
## 1 No 5 55.6
## 2 Yes 4 44.4
## Python Freq Percent
## 1 No 6 66.7
## 2 Yes 3 33.3
## SPSS Freq Percent
## 1 No 5 55.6
## 2 Yes 4 44.4
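
Repeating the same call for every column becomes tedious as the number of options grows. As a hedged alternative, the sketch below summarises all indicator columns in one step with base R. It assumes every column is coded exactly as "Yes"/"No"; `summary_tbl` is a name chosen here for illustration.

```r
# Same indicator data as in the example above
indicator <- data.frame(
  R      = c("Yes", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No", "Yes"),
  Stata  = c("No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No"),
  Python = c("Yes", "Yes", "No", "No", "No", "Yes", "No", "No", "No"),
  SPSS   = c("No", "No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes")
)

# Count the "Yes" answers in each column
freq <- colSums(indicator == "Yes")

# One row per package: frequency and percent of respondents who chose it
summary_tbl <- data.frame(
  Package = names(freq),
  Freq    = as.vector(freq),
  Percent = round(as.vector(freq) / nrow(indicator) * 100, 1)
)
summary_tbl
```

The `Freq` and `Percent` values should match the "Yes" rows of the per-variable tables above.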

If the data are stored in polytomous mode, even a simple descriptive analysis such as tabulating frequency distributions can be tricky, because the same answer may appear in any of the response columns. In this case, we can use the calc_cro() function from the expss package to tabulate multiple response questions.

# install.packages("expss")
library(expss)
# Frequency
calc_cro(polytomous, mrset(response1 %to% response2), total(label = "Freq"))
##               Freq
## Python           3
## R                7
## SPSS             4
## Stata            4
## #Total cases     9
# Percent of responses
calc_cro_cpct_responses(polytomous, mrset(response1 %to% response2), total_row_position = "none", total(label = "Percent of responses"))
##         Percent of responses
## Python                  16.7
## R                       38.9
## SPSS                    22.2
## Stata                   22.2
# Percent of cases
calc_cro_cpct(polytomous, mrset(response1 %to% response2), total_row_position = "none", total(label = "Percent of cases"))
##         Percent of cases
## Python              33.3
## R                   77.8
## SPSS                44.4
## Stata               44.4
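
To make the difference between the two percentages concrete, the three figures can also be reproduced by hand. The following base R sketch is only a cross-check under stated assumptions, not how expss computes its tables internally: it assumes that no respondent names the same package twice and that there are no missing responses. The names `responses`, `n_cases` and `result` are illustrative.

```r
# Same polytomous data as in the example above
polytomous <- data.frame(
  response1 = c("R", "Python", "R", "Stata", "SPSS", "Python", "Stata", "SPSS", "R"),
  response2 = c("Python", "R", "Stata", "SPSS", "R", "R", "R", "Stata", "SPSS")
)

# Stack both response columns into a single vector of individual responses
responses <- unlist(polytomous, use.names = FALSE)

freq    <- table(responses)   # how often each package was named
n_cases <- nrow(polytomous)   # number of respondents

result <- data.frame(
  Package      = names(freq),
  Freq         = as.vector(freq),
  # Denominator: all responses given (18 in this example)
  PctResponses = round(as.vector(freq) / sum(freq) * 100, 1),
  # Denominator: number of respondents (9 in this example)
  PctCases     = round(as.vector(freq) / n_cases * 100, 1)
)
result
```

Percent of responses sums to 100, whereas percent of cases can exceed 100 in total because each respondent may give more than one answer.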