Predict User’s Return Visit within a day part-3

October 22, 2012

(This article was first published on Tatvic Blog » R, and kindly contributed to R-bloggers)

Welcome to the last part of the series on predicting a user’s return visit to a website. In the first part of the series, I built a logistic regression model for the prediction problem: will a user come back to the website within the next 24 hours? In the second part, I discussed how to improve the model and looked at its accuracy.

In this post, I will run logistic regression with the Google Prediction API and compare the result with our model.

When I ran the Google Prediction API on our data set, I got the following result.

Let’s understand the result first. The id of the model is “revisit_log_it”, the model type is “CLASSIFICATION” (i.e. logistic regression), the number of instances is 2555 (i.e. the data set contains 2555 rows), and the most important result is the classification accuracy, which is 0.98 (98%).
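For reference, classification accuracy is the share of instances whose predicted label matches the actual label. Here is a minimal sketch in R of how such a figure is computed; the labels are made up for illustration and are not taken from the data set used in this series.

```r
# Hypothetical actual and predicted labels (not the article's data set)
actual    <- c(1, 0, 1, 1, 0, 1, 0, 0, 1, 1)
predicted <- c(1, 0, 1, 0, 0, 1, 0, 1, 1, 1)

# Confusion matrix: rows are actual classes, columns are predicted classes
confusion <- table(actual = actual, predicted = predicted)
print(confusion)

# Accuracy = correctly classified instances / all instances
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The diagonal of the confusion matrix holds the correctly classified instances, so dividing its sum by the total count gives the accuracy.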

In our model, the accuracy was 98.43%, which is similar to the Google Prediction API result. Having compared both results, let’s try to predict whether a user will come back to the website in the next 24 hours. Suppose we have tracked the following information about a user’s last visit and want to predict whether the user will return in the next 24 hours.

  1. visitCount: 2
  2. daySinceLastVisit: 0
  3. medium: organic
  4. landingPagePath: '/features-adwords-excel-add-in/
  5. exitPagePath: '/excel-add-in-calculator/
  6. pageDepth: 2

Here, we first need to understand all the parameter values. In this record, visitCount is 2, which means this is the user’s second visit; daySinceLastVisit is 0, which means the second visit happened on the same day as the first (not some days later); medium is organic, which means the user came through a search engine; landingPagePath is “'/features-adwords-excel-add-in/”, which means the user entered the website on this page; exitPagePath is “'/excel-add-in-calculator/”, which means the user left the website from this page; and pageDepth is 2, which means the user viewed 2 pages during the visit. Let’s predict whether this user will come back to the website in the next 24 hours. The R code for predicting the above observation is below.

in_d <- data.frame(
  DaySinceLastVisit = 0,
  visitCount        = 2,
  f.medium          = "organic",
  f.landingPagePath = "'/google-analytics-excel-pricing/",
  f.exitPagepath    = "'/excel-add-in-calculator/",
  pageDepth         = 2
)
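The code above only builds the input record; the prediction itself comes from passing that record to the fitted model. The sketch below illustrates that step end to end under stated assumptions: the training data is synthetic, the names train, model and new_visit are hypothetical, and only the numeric fields are used to keep it short. It is not the model built in the earlier parts of this series.

```r
set.seed(42)

# Synthetic training data (hypothetical; stands in for the real data set)
n <- 200
train <- data.frame(
  visitCount        = sample(1:10, n, replace = TRUE),
  daySinceLastVisit = sample(0:30, n, replace = TRUE),
  pageDepth         = sample(1:8, n, replace = TRUE)
)

# Assumed relationship: frequent, recent visitors are more likely to return
p <- plogis(0.4 * train$visitCount - 0.3 * train$daySinceLastVisit)
train$revisit <- rbinom(n, size = 1, prob = p)

# Logistic regression: binomial family with the default logit link
model <- glm(revisit ~ visitCount + daySinceLastVisit + pageDepth,
             data = train, family = binomial)

# New observation, mirroring the numeric fields of the record above
new_visit <- data.frame(visitCount = 2, daySinceLastVisit = 0, pageDepth = 2)

# type = "response" returns the predicted probability that revisit is 1
prob <- predict(model, newdata = new_visit, type = "response")

# Convert the probability to a 0/1 label with a 0.5 cutoff (an assumed threshold)
label <- as.integer(prob > 0.5)
print(prob)
print(label)
```

With type = "response", predict() returns a probability rather than a class, so a cutoff is needed to turn it into the 0/1 answer; 0.5 is a common default, but the right cutoff depends on the cost of each kind of error.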

For this record, our model outputs 1, which means the user will come back to the website in the next 24 hours. Let’s make the same prediction using the Google Prediction API.

We can see from the response that outputLabel is “YES”, which means the user will come back in the next 24 hours. We have now made a prediction for a user using both models (i.e. our model and the Prediction API model).

Feel free to share your feedback about this series of posts, and let us know if you would like to do this kind of predictive analysis.

Would you like to understand the value of predictive analysis applied to web analytics data, and how it can improve your understanding of the relationships between different variables? We think you may like to watch our webinar, “How to perform predictive analysis on your web analytics tool data”. Watch the replay now!

Amar Gondaliya

Amar is a data modeling engineer at Tatvic. He focuses on building predictive models from available data using R, Hadoop and the Google Prediction API.
Google Plus Profile: Amar Gondaliya
