Measuring the execution time of recommender systems in R


This post is different from all the others you’ve seen so far on this site. It is actually not a proper post, but a response to a comment on my previous post Recommender Systems 101 – a step by step practical example in R. If you haven’t read it yet, you’d better start there :). Datamaniac commented:

Where’s the limit to stop using R and start moving to Mahout or a more robust framework? Any rule of thumb?

Well… when it comes to defining what an appropriate waiting time for results is, my first answer is the dreaded “It depends…”. Of course, that is not a fully committed answer, so I decided to provide Datamaniac with a little framework to find his/her own answer.

[Image: my-system – laptop configuration]
To be more concrete, I’ve run a few experiments on my own laptop (have a look at the configuration… not a bad one, is it?) with different combinations of numbers of users and numbers of items for the affinity matrix.
For each configuration, I measure, for both the Item-Based (IBCF) and User-Based (UBCF) Collaborative Filtering approaches, the time required to train the model, the time required to obtain the top 5 recommendations for the users in the experiment, and the time required to predict the affinity for the non-rated items for the same users.
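
Each of these timings follows the same pattern with proc.time(): take a snapshot before and after the recommenderlab call and keep the difference (the “elapsed” component is the wall-clock time in seconds). A minimal sketch, assuming r.m is a realRatingMatrix as built in the full script at the end of the post:

library("recommenderlab")
t1 <- proc.time()
# train a User-Based CF model; the same pattern wraps every measured step
model <- Recommender(r.m, method="UBCF",
                     param=list(normalize = "Z-score", method="Cosine"))
t.train.ubcf <- proc.time() - t1
t.train.ubcf["elapsed"]   # wall-clock seconds spent training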

The results of my experiments can be seen in the chart below. A few comments from my side:

  • The training time (t.train.ubcf and t.train.ibcf) obviously increases with the number of items and users, but the behavior is quite different: IBCF requires a much higher training time (over 75 seconds for a 5000 users x 1000 items matrix, whereas UBCF stays under 1 second).
  • Increasing the number of items has a bigger impact on performance than increasing the number of users.
  • Once the model is created and trained, the advantages of the model-based IBCF come into the picture: predicting the item ratings (item affinity) took less than 0.4 seconds for the heaviest configuration (t.predict.ratings.ibcf), versus over 20 seconds for the same configuration with UBCF (t.predict.ratings.ubcf).
  • Likewise, for obtaining the list of the top N recommendations, IBCF clearly outperforms UBCF: 1.5 seconds (t.predict.ibcf) vs. 40 seconds (t.predict.ubcf) for the 5000×1000 configuration.

When does it make sense to adopt Mahout? My answer now would be: “When, after measuring for your particular problem, you obtain execution times that don’t comply with your performance requirements.”

At the end of the post, you’ll find the R code I wrote to run the benchmark… probably not 100% optimized, but hopefully clear enough to take you further.

@Datamaniac: a big thank you for asking the right questions and inspiring new content!

 

[Chart: recommender-systems-performance-test – elapsed time (s) per configuration, one panel per measure point]

library("recommenderlab")
library(ggplot2)
 
# measures training and prediction times for UBCF and IBCF on a random
# users.count x items.count rating matrix and returns them as a data frame
performance.test <- function(users.count, items.count)
{
  perf.df <- NULL
  records.count <- users.count * items.count
  # random rating matrix with values 0-5; roughly 40% of the entries are NA
  m <- matrix(sample(c(as.numeric(0:5), NA), records.count,
                     replace=TRUE, prob=c(rep(.9/6,6),.6)), ncol=items.count,
              dimnames=list(user=paste("u", 1:users.count, sep=''),
                            item=paste("i", 1:items.count, sep='')))
  r.m <- as(m, "realRatingMatrix")
  # 90%/10% train/test split; 15 ratings per test user are "given" to the recommender
  t1 <- proc.time()
  scheme <- evaluationScheme(r.m, method="split", train=0.9, given=15)
  t.scheme <- proc.time() - t1
  # train the User-Based CF model
  t1 <- proc.time()
  r1 <- Recommender(getData(scheme, "train"), method="UBCF",
                    param=list(normalize = "Z-score",method="Cosine"))
  t.train.ubcf <- proc.time() - t1
  # train the Item-Based CF model
  t1 <- proc.time()
  r2 <- Recommender(getData(scheme, "train"), method="IBCF",
                    param=list(normalize = "Z-score",method="Cosine"))
  t.train.ibcf <- proc.time() - t1
  # predict ratings (affinities) for the unrated items of the test users
  t1 <- proc.time()
  p1 <- predict(r1, getData(scheme, "known"), type="ratings")
  t.predict.ratings.ubcf <- proc.time() - t1
  t1 <- proc.time()
  p2 <- predict(r2, getData(scheme, "known"), type="ratings")
  t.predict.ratings.ibcf <- proc.time() - t1
  # predict the top-N recommendation lists for the test users
  t1 <- proc.time()
  p1 <- predict(r1, getData(scheme, "known"))
  t.predict.ubcf <- proc.time() - t1
  t1 <- proc.time()
  p2 <- predict(r2, getData(scheme, "known"))
  t.predict.ibcf <- proc.time() - t1
 
#  perf.df<-rbind(perf.df, cbind(measurepoint="t.scheme", as.data.frame(as.list(t.scheme))))
  perf.df<-rbind(perf.df,cbind(measurepoint="t.train.ubcf", as.data.frame(as.list(t.train.ubcf))))
  perf.df<-rbind(perf.df,cbind(measurepoint="t.train.ibcf", as.data.frame(as.list(t.train.ibcf))))
  perf.df<-rbind(perf.df, cbind(measurepoint="t.predict.ubcf", as.data.frame(as.list(t.predict.ubcf))))
  perf.df<-rbind(perf.df,cbind(measurepoint="t.predict.ibcf", as.data.frame(as.list(t.predict.ibcf))))
  perf.df<-rbind(perf.df, cbind(measurepoint="t.predict.ratings.ubcf", as.data.frame(as.list(t.predict.ratings.ubcf))))
  perf.df<-rbind(perf.df,cbind(measurepoint="t.predict.ratings.ibcf", as.data.frame(as.list(t.predict.ratings.ibcf))))
  config.name<-paste0(users.count,"x",items.count)
  config.name<-rep(config.name,times = nrow(perf.df))
  users.count<-rep(users.count,times = nrow(perf.df))
  items.count<-rep(items.count,times = nrow(perf.df))
  perf.df<-cbind(users.count,perf.df)
  perf.df<-cbind(items.count,perf.df)
  perf.df<-cbind(config.name,perf.df)
  return(perf.df)
}
 
# runs performance.test for every combination of user and item counts
perf.benchmark <- function ()
{
  users<-c(100,500,1000,2500,5000)
  items<-c(100,200,500,1000)
 
  perf.values <- NULL
  for (u in 1:length(users))
  {
    for (i in 1:length(items))
    {
      measure<-performance.test(users[u],items[i])
      perf.values<-rbind(perf.values,measure)
    }    
  }
  return(perf.values)
}
# performance measuring of different configurations for IBCF and UBCF
benchmark<-perf.benchmark()
# graphical representation  
ggplot(benchmark, aes(x=config.name,y=elapsed, group=measurepoint,color=measurepoint)) +
  geom_step() +
  facet_wrap(~measurepoint, ncol = 2, scales='free') +
  ylab('t.elapsed (s)') + xlab('configuration') +
  theme(strip.text.x = element_text(size = 12, colour = "black")) +
  theme(axis.text=element_text(size=12)) +
  theme(axis.title=element_text(size=14,face="bold")) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  theme(panel.background = element_rect(fill = 'white')) +
  theme(panel.grid.major = element_line( color="snow2")) +
  theme(legend.position = "none")
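
If you want to look at the raw numbers rather than the chart, the data frame returned by perf.benchmark() (or by performance.test() for a single configuration) can be inspected directly. A small usage sketch (not part of the original script):

# single configuration, e.g. 500 users x 200 items
single.run <- performance.test(500, 200)
# proc.time() contributes the user.self, sys.self and elapsed columns;
# "elapsed" is the wall-clock time in seconds plotted above
single.run[, c("config.name", "measurepoint", "elapsed")]
# or filter the full benchmark for a single measure point
subset(benchmark, measurepoint == "t.predict.ratings.ubcf",
       select = c("config.name", "elapsed"))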
