Recommender Systems 101 – a step by step practical example in R


Every one of us is unique! …You are unique! There are so many people different from you… but at the same time, there are also a lot who are damned similar to you: exhibiting the same behavior, interacting with the same people, liking the same things…

Whether you like it or not, this makes us extremely predictable and boringly mainstream… But it’s not necessarily a bad thing… You are already enjoying the benefits of so-called collective intelligence, which is embedded in a lot of applications we use on a daily basis. Of course you are used to Facebook, Twitter or LinkedIn suggesting people you might know to expand your social media network, to Amazon pointing you to products you might also consider when purchasing a particular item, or to Last.fm, Spotify & Co. suggesting songs quite aligned with your musical taste… aren’t you?

Well, all of them have something in common… the use of recommendation techniques to filter what is statistically most relevant for a particular user. In this post (a quite long one), I’m going to cover the basics first and then proceed with a step-by-step implementation of a recommendation engine.

A few basics first

Types of recommender systems

There are basically two approaches to making a recommendation… Let’s say you want to recommend a set of additional products to a customer who purchased a product X:

  • You can try to find out what about product X was so attractive to the customer and suggest products having this “what“… These are called content-based recommender systems.
  • You can check for all other users who purchased product X as well and make a list of the other products purchased by these users… Out of this list, you take the products that repeat the most. These are called collaborative filtering recommender systems.

For example, let’s say I really liked “The Mission” and I gave this movie the highest rating… The first type of system might have modeled this movie as:

{actors: ["Robert De Niro", "Jeremy Irons"], director: "Roland Joffé", topics:["18th century","Spanish colonization","Christian Evangelization"]}.

Based on that, movies with Robert De Niro or Jeremy Irons, or directed by Roland Joffé, might be recommended, or movies like “1492: Conquest of Paradise” (Spanish colonization).
The second type of system (and the one IMDb implements) will check the database for all users who rated “The Mission” as high as I did and will retrieve all other movies rated highly by these users… the list includes titles like “Novecento”, “The Innocent” or “The Killing Fields”.

In this post we are going to implement a collaborative filtering recommender system… In spite of well-known issues like the cold-start problem, this kind of system is broadly adopted, easier to model and known to deliver good results. Many implementations, called hybrid recommender systems, combine both approaches to overcome the known issues on either side.

Tasks to be solved by recommender systems

From the perspective of a particular user (let’s call them the active user), a recommender system is intended to solve two particular tasks:

  1. To predict the rating for an item or product the user has not rated yet.
  2. To create the list of the top-N recommended items.

In the step-by-step example you are going to see that you probably need both, and that the second task relies on the first one.

Validating Recommender Systems

Understanding how well a recommender system performs the above-mentioned tasks is key when it comes to using it in a production environment.

The performance of the predictive task is typically measured by the deviation of the prediction from the true value. This is the basis for the Mean Absolute Error (MAE) and its squared counterpart, the Root Mean Square Error (RMSE).
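In standard notation, the two metrics are:

MAE = \frac{1}{|K|} \sum_{(i,j) \in K} | \hat{r}_{ij} - r_{ij} |

RMSE = \sqrt{ \frac{1}{|K|} \sum_{(i,j) \in K} ( \hat{r}_{ij} - r_{ij} )^2 }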

In the formulas, K represents the set of all user-item pairings (i, j) for which we have a predicted rating \hat{r}_{ij} and a known rating r_{ij} that was not used to learn the recommendation model. The basic idea behind these metrics is to measure the deviation between the predicted values and the real rated values over many users and items.

Implementing a Recommender System in R

Overview

One of the killer applications of recommender systems is conversion rate optimization: customers find relevant products faster, cross-selling happens in a substantiated way and, as a side effect, your image as a brand improves, since your attempt to be relevant to your customers is usually appreciated as value-adding, which also positively impacts customer loyalty. That’s why we are going to focus on this use case :)

We are not going to implement everything from scratch (thank you Captain Obvious!)… There are a few R packages implementing collaborative filtering engines, but I like recommenderlab the most.

1- Data Gathering

Sometimes the discovery of users’ affinity for certain items is not as straightforward as a database with ratings. Yet there are countless signals we can use to model this affinity.

Think of clickstream data: you can measure the referring method, pages visited, clicked items, items that were part of a comparison, items in the basket and checkout process, etc. You can even combine it with on-page indicators like time on page (see the Riveted time-spent plugin for Google Analytics) or mouse movements (e.g. with ClickTale, Mouseflow, etc.). An orthogonal dimension is the temporal aspect, for example how long ago each interaction took place.


As a result, you have a lot of sessions with a lot of events for a user with respect to an item. A handy approach is to compute the user-item affinity per session, taking frequency and recency into account, even with an overly simplified approach like the one below:

Affinity(u_i, it_j, s_k) = \frac{1}{daysAgo(s_k)} \sum_{m=1}^{n} w_m \, f_m

where f_m is the frequency of event type m in session s_k and w_m its weight.

And then aggregating it over all k sessions in which this particular user had interactions with this particular item:

Affinity(u_i, it_j) = \sum_{l=1}^{k} Affinity(u_i, it_j, s_l)
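As a rough illustration, here is a minimal R sketch of this computation; the input data frame, its column names and the weights are all hypothetical and would in practice come from your clickstream processing:

# hypothetical per-session event counts for (user, item) pairs
events <- data.frame(
  user     = c("u1", "u1", "u2"),
  item     = c("i1", "i1", "i1"),
  days.ago = c(1, 7, 3),   # recency: days since the session
  views    = c(3, 1, 2),   # f_1: item page views within the session
  basket   = c(1, 0, 1)    # f_2: item added to the basket (0/1)
)
# assumed weights w_m: a basket event counts much more than a view
w.views <- 1; w.basket <- 5
# per-session affinity: recency-discounted weighted sum of event frequencies
events$affinity <- (w.views * events$views + w.basket * events$basket) / events$days.ago
# aggregation over all sessions of each user-item pair
affinity.data <- aggregate(affinity ~ user + item, data = events, FUN = sum)
affinity.data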

2- Data Normalization

Since every user interacts on their own scale (some users click on everything, others barely at all), the collected affinities should be normalized before training a model, for example by centering each user’s values or converting them to Z-scores. recommenderlab supports this out of the box, either via its normalize() function or via the normalize parameter at model creation, as we will see below.

Our code in R looks like:

library("recommenderlab")
# load the pre-computed affinity data (one row per known user-item pair)
affinity.data <- read.csv("collected_data.csv")
# coerce it into recommenderlab's sparse user-item rating matrix
affinity.matrix <- as(affinity.data, "realRatingMatrix")
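Note that the coercion to realRatingMatrix expects affinity.data to be a data frame with one row per known user-item pair (user id, item id, numeric affinity). A quick sanity check, assuming the data loaded above:

# dimensions of the sparse matrix: number of users x number of items
dim(affinity.matrix)
# peek at a corner of it; missing entries are the affinities to be predicted
as(affinity.matrix, "matrix")[1:5, 1:5]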

3- Collaborative Model creation

We are going to create a UBCF, or U(ser) B(ased) C(ollaborative) F(iltering), model trained with 5000 users.
Alternatively, we could use a less memory-intensive approach that avoids loading the entire user database into memory, called IBCF, or I(tem) B(ased) C(ollaborative) F(iltering), by just changing the method parameter in the code below:

# creation of the model - U(ser) B(ased) C(ollaborative) F(iltering)
Rec.model <- Recommender(affinity.matrix[1:5000], method = "UBCF")
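For reference, the IBCF counterpart differs only in the method string (a sketch; the variable name is mine):

# I(tem) B(ased) C(ollaborative) F(iltering) variant of the same model
Rec.model.ibcf <- Recommender(affinity.matrix[1:5000], method = "IBCF")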

This model internally computes the cosine similarity between all users represented as vectors, which in R is as simple as:

crossprod(a,b)/sqrt(crossprod(a)*crossprod(b))
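A quick toy check with two made-up user vectors:

# cosine similarity between two hypothetical user rating vectors
a <- c(5, 3, 0, 1)
b <- c(4, 0, 0, 1)
crossprod(a, b) / sqrt(crossprod(a) * crossprod(b))  # approx. 0.86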

There are other possibilities for computing the similarity between users (Jaccard, Pearson, etc.); along with them you can also specify the normalization method, the number of neighbors nn, a minRating, and so on:

Rec.model <- Recommender(affinity.matrix[1:400], method = "UBCF",
      param = list(normalize = "Z-score", method = "Cosine", nn = 5, minRating = 1))

4- The model in action – top N items and item affinity

Now we can play with our model… for example, let’s try to obtain the top recommendations for a particular user, "u15348":

# recommended top 5 items for user u15348
recommended.items.u15348 <- predict(Rec.model, affinity.matrix["u15348",], n=5)
# to display them
as(recommended.items.u15348, "list")
# to obtain the top 3
recommended.items.u15348.top3 <- bestN(recommended.items.u15348, n = 3)
# to display them
as(recommended.items.u15348.top3, "list")

Now, for the same user "u15348", let’s have a look at the affinity values computed for all items we had no value for in the original data:

# predict the affinity for all items the user has not rated yet
predicted.affinity.u15348 <- predict(Rec.model, affinity.matrix["u15348",], type="ratings")
# to see user "u15348"'s predicted affinity for items we didn't have any value for
as(predicted.affinity.u15348, "list")
# ... and the real affinity for the items obtained from affinity.matrix
as(affinity.matrix["u15348",], "list")

5- Validation

To evaluate our Rec.model we need data, more precisely, experimentally obtained data. The only experimentally obtained data source we have is affinity.data, so we need to take one chunk of it to train our model and leave another chunk to validate whether the model produces the right output. This technique is called “split”.

# create an evaluation scheme which splits the data: 90% for training and 10% for validation/testing
e <- evaluationScheme(affinity.matrix[1:1000], method="split", train=0.9, given=15)
# creation of a recommender model based on UBCF
Rec.ubcf <- Recommender(getData(e, "train"), "UBCF")
# creation of a recommender model based on IBCF for comparison
Rec.ibcf <- Recommender(getData(e, "train"), "IBCF")
# making predictions on the test data set with UBCF
p.ubcf <- predict(Rec.ubcf, getData(e, "known"), type="ratings")
# making predictions on the test data set with IBCF
p.ibcf <- predict(Rec.ibcf, getData(e, "known"), type="ratings")
# obtaining the error metrics for both approaches and comparing them
error.ubcf<-calcPredictionAccuracy(p.ubcf, getData(e, "unknown"))
error.ibcf<-calcPredictionAccuracy(p.ibcf, getData(e, "unknown"))
error <- rbind(error.ubcf,error.ibcf)
rownames(error) <- c("UBCF","IBCF")
error

There are other validation techniques coming from the information retrieval perspective (a recommender system performs, at the end of the day, an information retrieval task). These techniques involve the creation of the so-called confusion matrix to compute precision and recall metrics.
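recommenderlab supports this kind of evaluation as well. A sketch reusing our split setup: for top-N evaluation the scheme needs a goodRating threshold to decide which withheld items count as relevant (the value 3 below is an assumption, pick one matching your affinity scale):

# evaluation scheme with a threshold separating relevant from irrelevant items
e2 <- evaluationScheme(affinity.matrix[1:1000], method="split",
                       train=0.9, given=15, goodRating=3)
# evaluate top-N recommendation lists of different lengths with UBCF
results <- evaluate(e2, method="UBCF", type="topNList", n=c(1, 3, 5, 10))
# averaged confusion matrix counts plus precision and recall per list length
avg(results)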

Taking it from here

Deploying your model

You’ve got a model, alright… what now? Here are a few points you probably need to consider:

  • You need to put the right mechanism in place on your eCommerce portal to start displaying the recommendations.
  • You might also want to define a couple of business rules on top, to select the items with the highest return out of the best N recommended to a particular user.
  • Most probably you want to prevent already purchased items from being displayed to the buyer.
  • You probably want to define a recommendation refreshing strategy to leverage the latest data available at the user level, and you might even want to go (near) real time for that.

Improving your model

Marginal improvements pay huge returns but are also complicated to accomplish (ask Mr. Pareto about the famous last 20% if you don’t believe me).
From where you are, you could take an easy step towards a hybrid approach by creating, for example, a matrix that captures the recommended items for all items in the absence of collaborative data… E.g. a new product is released and you don’t have enough data yet to apply collaborative filtering, so you manually define a set of best products to fall back to, as sketched below.
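A minimal sketch of such a fallback, with entirely hypothetical names and item ids:

# manually curated fallback recommendations for cold-start items
fallback <- list("new_product_1" = c("i42", "i7", "i13"))

# recommend for a user viewing 'item'; fall back when the model returns nothing
recommend.with.fallback <- function(model, ratings, user, item, n = 5) {
  rec <- as(predict(model, ratings[user, ], n = n), "list")[[1]]
  if (length(rec) == 0) rec <- head(fallback[[item]], n)
  rec
}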

Another point you might have to deal with is data volume. R is great, but when you deal with several GBs of log data a day, you might have to embrace a more robust big data technology. Don’t worry! Mahout, for example, from the Apache Foundation, provides scalable implementations of collaborative filtering algorithms out of the box.

In conclusion

In this post, we’ve introduced recommender systems, explained why they are a kind of game-changer in many industries, gone through a few basic concepts and implemented, step by step, a collaborative filtering recommender system in R for an eCommerce platform.

I’m aware I just covered the very basics, but as you’ve seen, even with these basics you can go a long way!
My ultimate goal, though, was to awaken your appetite for more… If I’ve made you curious about this kind of system, I’m a happy man!

PS: Think of the benefits of creating a recommender system based on your Google Analytics data, accessible through BigQuery.

PPS: Check out how these guys recommend beer in R on the yhat platform. Cool, isn’t it?
