Creating the experimental design for a max-diff experiment is easy in R. This post describes how to create and check a max-diff experimental design. If you are not sure what this is, it would be best to read A beginner’s guide to max-diff first.
Step 1: Installing the packages
The first step is to install the flipMaxDiff package and a series of dependent packages. Depending on how your R has been set up, you may need to install none of these (e.g., if using Displayr), or perhaps even more packages than are shown below.
install.packages("devtools")
library(devtools)
install.packages("AlgDesign")
install_github("Displayr/flipData")
install_github("Displayr/flipTransformations")
install.packages("Rcpp")
install_github("Displayr/flipMaxDiff")
Step 2: Creating the design
The MaxDiffDesign function is a wrapper for the optBlock function in the wonderful AlgDesign package. The following snippet can be used to create a design. The arguments used in the code snippet here are described immediately below.
library(flipMaxDiff)
MaxDiffDesign(number.alternatives = 10,
              alternatives.per.question = 5,
              number.questions = 6,
              n.repeats = 1)
- number.alternatives: The total number of alternatives considered in the study. In my technology study, for example, I had 10 brands, so I enter the number of alternatives as 10.
- alternatives.per.question: The number of alternatives shown to the respondent in each individual task. I tend to set this to 5. In studies where the alternatives are wordy, I reduce this to 4; where the alternatives are really easy to understand, I have used 6. The key trade-off here is cognitive difficulty for the respondent: the harder the questions, the less carefully people tend to consider them.
- number.questions: The number of questions (i.e., tasks or sets) to present to respondents. A rule of thumb provided by the good folks at Sawtooth Software gives the ideal number of questions as 3 * number.alternatives / alternatives.per.question. This would suggest that in the technology study, I should have used 3 * 10 / 5 = 6 questions, which is indeed the number that I used in the study. There are two conflicting factors to trade off when setting the number of questions. The more questions, the more respondent fatigue, and the worse your data becomes. The fewer questions, the less data, and the harder it is to work out the relative appeal of alternatives that have a similar level of overall appeal. I return to this topic in the discussion of checking designs, below.
- n.repeats: The algorithm includes a randomization component. Occasionally, this can lead to poor designs being found (how to check for this is described below). Sometimes this problem can be remedied by increasing n.repeats.
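The Sawtooth rule of thumb mentioned above is easy to compute directly. A quick sketch (the variable names mirror the function's arguments):

```r
# Sawtooth Software's rule of thumb: each alternative should appear
# roughly 3 times, which implies this many questions.
number.alternatives <- 10
alternatives.per.question <- 5
suggested.questions <- ceiling(3 * number.alternatives / alternatives.per.question)
suggested.questions  # 6
```

For the technology study this reproduces the 6 questions used; if the division does not come out whole, rounding up errs on the side of more data.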
Step 3: Interpreting the design
The design itself is called binary.design. Each row represents a question. Each column shows which alternatives are to be shown. Thus, in the first question, the respondent evaluates alternatives 1, 3, 5, 6, and 10. More complicated designs can contain additional information (this is discussed below).
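You can read the shown alternatives straight off a row of the binary design with which(). A small sketch, using the first question described above:

```r
# First row of the binary design: a 1 in column j means alternative j
# is shown in question 1.
question.1 <- c(1, 0, 1, 0, 1, 1, 0, 0, 0, 1)
which(question.1 == 1)  # 1 3 5 6 10
```

Applying the same idea across rows (e.g., with apply(binary.design, 1, ...)) lists the alternatives for every question.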
I tend to add one additional complication to my max-diff studies: I have the data collection randomize the order of the alternatives between respondents. One respondent, for example, had the brands shown in this order: Apple, Google, Samsung, Sony, Microsoft, Intel, Dell, Nokia, IBM, and Yahoo. So, whenever Apple appeared it was at the top; whenever Google appeared, it was below Apple if Apple appeared, but at the top otherwise; and so on. The next respondent had the brands in a different order, and so on.
If doing randomization like this, I strongly advise having this randomization done in the data collection software. You can then undo it when creating the data file, enabling you to conduct the analysis as if no randomization ever occurred.
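As a sketch of the idea only (in practice the data collection software handles this), each respondent gets a random ordering of the brands, and within each question the shown brands are presented in that respondent's order:

```r
brands <- c("Apple", "Google", "Samsung", "Sony", "Microsoft",
            "Intel", "Dell", "Nokia", "IBM", "Yahoo")
set.seed(1)                         # for a reproducible sketch only
respondent.order <- sample(brands)  # this respondent's randomized order
question.1 <- c(1, 3, 5, 6, 10)     # alternatives shown in question 1
# Present the shown brands in this respondent's order:
rank <- match(brands, respondent.order)
shown <- brands[question.1][order(rank[question.1])]
shown  # the same 5 brands, in this respondent's order
```

The design itself is unchanged; only the on-screen ordering varies, which is why the randomization can be undone afterwards with no effect on the analysis.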
There are many other ways of complicating designs, such as to deal with large numbers of alternatives, and to prevent certain pairs of alternatives appearing together. Click here for more information about this.
Step 4: Checking the design
In an ideal world, a max-diff experimental design has the following characteristics, where each alternative appears:
- At least 3 times.
- The same number of times.
- With each other alternative the same number of times (e.g., each alternative appears with each other alternative twice).
Due to a combination of maths and a desire to avoid respondent fatigue, few max-diff experimental designs satisfy these three requirements (the last one is particularly tough).
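These checks are easy to run by hand on a binary design matrix: column sums give the number of appearances of each alternative, and crossprod() gives the pairwise co-occurrence counts. The toy design below is my own hand-made illustration, not output from MaxDiffDesign:

```r
# Toy design: 6 questions (rows), 10 alternatives (columns), 5 per question.
design <- rbind(c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
                c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
                c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
                c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
                c(1, 0, 1, 0, 0, 1, 0, 1, 0, 1),
                c(0, 1, 0, 1, 1, 0, 1, 0, 1, 0))
appearances <- colSums(design)  # how often each alternative appears
appearances                     # all 3: at least 3 times, and balanced
pairs <- crossprod(design)      # co-occurrence counts between alternatives
diag(pairs) <- NA               # ignore the diagonal (self-pairs)
min(pairs, na.rm = TRUE)        # 0: some pairs never appear together
```

This toy design passes the first two criteria but fails the third, which is exactly the pattern the warnings discussed below are designed to catch.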
Above, I described a design with 10 alternatives, 5 alternatives per question, and 6 questions. Below, I show the outputs where I have changed the number of alternatives per question from 5 to 4. This small change has turned a good design into an awful one. How can we see it is awful? The first thing to note is that 6 warnings are shown at the bottom.
The first warning is telling us that we have ignored the advice about how to compute the number of questions, and we should instead have at least 8 questions. (Or, more alternatives per question.)
The second warning tells us that one alternative appears only twice, whereas good practice is for each alternative to appear at least three times.
The third warning tells us that some alternatives appear more often than others. Looking at the frequencies output, we can see that alternatives appeared either 2 or 3 times. Why does this matter? It means we have collected more information about some alternatives than others, so we may end up with different levels of precision in our estimates of the appeal of the different alternatives.
The fourth warning is a bit cryptic. To understand it we need to look at the binary correlations, reproduced below. This correlation matrix shows the correlations between each of the columns of the experimental design (i.e., the binary.design shown above). Looking at row 4 and column 8, we see a big problem. Alternatives 4 and 8 are perfectly negatively correlated. That is, whenever alternative 4 appears in the design, alternative 8 does not, and whenever 8 appears, 4 does not. One of the cool things about max-diff is that it can sometimes still work even with such a flaw in the experimental design. It would, however, be foolhardy to rely on this. The basic purpose of max-diff is to work out relative preferences between alternatives, and its ability to do this is clearly compromised if some alternatives are never shown with others.
The fifth warning tells us that there is a large range in the correlations. In most experimental designs, the ideal design results in a correlation of 0 between all the variables. Max-diff designs differ in that, on average, there will always be a negative correlation between the variables. However, the basic idea is the same: we strive for designs where the correlations are as close to 0 as possible. Correlations in the range of -0.5 to 0.5 should, in my opinion, cause no concern.
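You can compute these binary correlations yourself by running cor() on the design matrix. A minimal sketch, using a deliberately flawed toy design of my own in which alternatives 1 and 2 never appear together:

```r
# Toy 4-question design where columns 1 and 2 are mutually exclusive:
# whenever alternative 1 is shown, alternative 2 is not, and vice versa.
design <- rbind(c(1, 0, 1, 0),
                c(0, 1, 1, 1),
                c(1, 0, 0, 1),
                c(0, 1, 1, 0))
correlations <- cor(design)
correlations[1, 2]  # -1: perfectly negatively correlated
```

A value of exactly -1 in the off-diagonal cells is the pattern behind the fourth warning discussed above.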
The last warning tells us that some alternatives never appear together. We already deduced this from the binary correlations.
The first thing to do when you have a poor design is to increase the setting for n.repeats. Start by setting it to 10; then, if you have patience, try 100, and then bigger numbers. This only occasionally works, but when it does, it is a good outcome. If it does not, you need to change something else: reducing the number of alternatives and/or increasing the number of questions are usually the best places to start.