Features form the basis for much of our preference modeling. When we are asked to explain our preferences, features are typically accepted as appropriate reasons: this job paid more, that candidate supports tax reform, or it was closer to home. We believe that features must be the drivers since they so easily serve as rationales for past behavior. Choice modeling formalizes this belief by assuming that products and services are feature bundles, with the value of the bundle calculated directly from the utilities of its separate features. All that we need to know about a product or service can be represented as the intersection of its features, which is why it is called conjoint analysis.
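To make the additive assumption concrete, here is a minimal sketch in Python. The attribute names and part-worth values are hypothetical numbers chosen for illustration, not estimates from any study; the only point is that the value of a bundle is the sum of the utilities of its separate feature levels.

```python
# Hypothetical part-worth utilities for two job attributes (illustrative values only).
PART_WORTHS = {
    "pay": {"lower": 0.0, "higher": 1.2},
    "commute": {"far": 0.0, "close to home": 0.8},
}


def bundle_utility(profile, part_worths):
    """Additive rule: sum the part-worth of each feature level in the profile."""
    return sum(part_worths[attribute][level] for attribute, level in profile.items())


job_a = {"pay": "higher", "commute": "far"}
job_b = {"pay": "lower", "commute": "close to home"}

for name, job in [("job A", job_a), ("job B", job_b)]:
    print(name, bundle_utility(job, PART_WORTHS))
```

Under this rule nothing about a job matters beyond its listed feature levels, and no feature changes the value of another.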
At first, this approach seems to work, but it does not scale well. We create hypothetical products and services defined by the cells in a factorial experimental design (see the book Stated Preference Methods Using R). The number of cells increases quickly with each additional feature, so we need to turn to optimal designs in R in order to limit the number of possible combinations. Doing so reduces the number of hypothetical descriptions, but the number of estimated parameters remains unchanged. Overall preference continues to be an additive function of the values attributed to each of the separate components.
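The scaling problem is easy to see with a small count. The sketch below is in Python rather than R, and the smartwatch attributes and levels are made up for illustration. It enumerates the cells of a full factorial design and compares that count with the number of main-effect parameters in an additive model, assuming one parameter per non-reference level of each attribute.

```python
from itertools import product

# Hypothetical attributes, each with three levels (illustrative only).
LEVELS = {
    "price": ["$199", "$299", "$399"],
    "battery": ["1 day", "3 days", "7 days"],
    "display": ["small", "medium", "large"],
    "strap": ["silicone", "leather", "metal"],
}


def full_factorial(levels):
    """Every combination of one level per attribute: the cells of the design."""
    return list(product(*levels.values()))


def n_main_effect_params(levels):
    """Additive model: one parameter per level beyond each attribute's reference level."""
    return sum(len(options) - 1 for options in levels.values())


cells = full_factorial(LEVELS)
print("cells:", len(cells))
print("main-effect parameters:", n_main_effect_params(LEVELS))

# Adding one more three-level attribute multiplies the cells but only adds two parameters.
bigger = dict(LEVELS, connectivity=["bluetooth", "wifi", "cellular"])
print("cells with a fifth attribute:", len(full_factorial(bigger)))
print("parameters with a fifth attribute:", n_main_effect_params(bigger))
```

The cells multiply while the parameters only add. A fractional or optimal design shows respondents a subset of the cells, but the additive model being estimated is the same one.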
Representation learning, on the other hand, is associated with deep neural networks, such as the h2o package discussed by John Chambers at the useR! 2014 conference. According to Yoshua Bengio (see his new chapter on Distributed Representations), “a good representation is one that makes further learning tasks easy.” The process is described in his first chapter on Deep Learning. As shown in this figure from Wikipedia, the observed features are visible units and the product representation is a transformation contained in hidden units.
What do consumers learn before deciding to buy? They learn a representational structure that reduces the complexity of the purchase process. This learning comes relatively easily with so many sources telling us what to look for and what to buy (e.g., marketing communications, professional reviews, social media and, of course, friends and family). Bengio speaks of evolving culture vs. local minima as the process for “brain to brain transfer of information.” Others refer to it as a meeting of minds or shared conceptualizations.
Are you thinking about a Smart Watch? Representation learning would suggest that the first step is “getting a lay of the land” or untangling the sources of variation accounting for differences among the offerings. I outlined such an approach in my last post on precursors to preference construction. It is possible to go online and request side-by-side feature comparisons that look similar to what one might find in choice modeling. However, that step often comes late in the process, after you have decided to purchase and have narrowed your consideration set. Before that, one looks at pictures, scans specifications, reads reviews and learns from others through user comments. One discovers what is available and what benefits are delivered. As you learn what is offered, you come to understand what you might want and be willing to spend.
The purchase task is somewhat easier than language translation or facial recognition because product categories are marketing creations with a deliberately simplified structure. Products and services are simple by design with benefits and features linked together and set to music with a logo and a tagline. Product and service features are observed (red in the above figure); benefits are latent or hidden features (the blue) and can be extracted with deep neural networks or nonnegative matrix factorization. That is, we can think of representation learning as the relatively slow unsupervised learning that occurs early in the decision process and makes later learning and decision making easier and faster. Utility theory lacks the expressive power to transform the input into new ways of seeing. Both deep neural networks and nonnegative matrix factorization free us to go beyond the information given.
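As a rough illustration of extracting hidden benefits from observed features, here is a small nonnegative matrix factorization in plain Python. Everything in it is an assumption made for the sketch: the product-by-feature matrix is invented, two latent benefits are requested, and the fitting method is the standard multiplicative-update rule for squared error, which is one common way to fit NMF and not necessarily what any particular R package does by default.

```python
import random

# Hypothetical data: rows are six smartwatches, columns are six observed features
# (heart rate, GPS, water resistance, calls, notifications, voice assistant).
V = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 1],
]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate v by w times h, keeping every entry of w and h nonnegative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(n_iter):
        # Update h elementwise: h * (w'v) / (w'w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # Update w elementwise: w * (v h') / (w h h')
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


w, h, errors = nmf(V, rank=2)
print("squared error before and after:", round(errors[0], 3), round(errors[-1], 3))
for k, row in enumerate(h):
    print("latent benefit", k, [round(x, 2) for x in row])
```

Each row of h weights the observed features on one latent benefit, and each row of w places a product on those benefits. With data this blocky one would expect the two benefits to separate the fitness features from the communication features, but the labels are ours to supply; the factorization only returns the weights.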
Finally, what happens when the consumer is pulled out of the purchase context and presented with feature lists constructed according to a fractional factorial or optimal design? The norms of the marketplace are violated, yet respondents get through the task as best they can using the only information that you have provided them. Unfortunately, you do not learn much about bears in the wild when they are confined in cages.