style="text-align: justify;">Welcome to the last part of the series on predicting a user’s return visit to a website. In the
title="Predict User's Return Visit within a day part-1" href="http://www.tatvic.com/blog/predict-users-return-visit-within-a-day-part-1/" >first part of the series, I built a logistic regression model to predict whether a user will come back to the website within the next 24 hours. In the
title="Predict User's Return Visit within a day part-2" href="http://www.tatvic.com/blog/predict-users-return-visit-within-a-day-part-2/" >second part, I discussed model improvements and looked at the model’s accuracy.
style="text-align: justify;">In this post, I will discuss logistic regression with the
title="Google Prediction API" href="https://developers.google.com/prediction/" >Google Prediction API and compare it to our model.
style="text-align: justify;">When I used the Google Prediction API on our data set, I got the following result.
class="aligncenter size-full wp-image-3215" src="http://www.tatvic.com/blog/wp-content/uploads/2012/09/prediction_API_output.png" alt="" width="617" height="230" />
style="text-align: justify;">Let’s understand the result first. The id of the model is “revisit_log_it”, the model type is “CLASSIFICATION” (i.e. it predicts a categorical outcome, as our logistic regression does), the number of instances is 2555 (i.e. the data set contains 2555 rows), and the most important result is the classification accuracy, which is 0.98 (98%).
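To make the accuracy figure concrete, here is a minimal sketch of how classification accuracy can be computed in R. The `actual` and `predicted` vectors are made-up labels for illustration only, not values from our data set.

```r
# Hypothetical labels: 1 = user returned within 24 hours, 0 = user did not.
actual    <- c(1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
predicted <- c(1, 0, 1, 0, 0, 1, 0, 1, 1, 0)

# Classification accuracy = correctly classified rows / all rows.
accuracy <- mean(actual == predicted)

# A confusion matrix shows which class the errors fall into.
confusion <- table(actual = actual, predicted = predicted)

print(accuracy)
print(confusion)
```

An accuracy of 0.98 on 2555 rows would therefore mean that roughly 2 rows in every 100 were misclassified.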
style="text-align: justify;">In our model, accuracy was 98.43%, which is similar to the Google Prediction API result. Having compared both results, let’s try to predict whether a user will come back to the website in the next 24 hours. Suppose we have tracked the following information about a user’s last visit and want to predict whether the user will return in the next 24 hours.
style="text-align: justify;">Here, we first need to understand the parameter values. In this record, visitCount is 2, meaning this is the user’s second visit; daySincelastVisit is 0, meaning the second visit happened on the same day (not some days later); medium is organic, meaning the user arrived through a search engine; landingPagePath is ” ‘/features-adwords-excel-add-in/ ”, meaning the user entered the website on this page; exitPagePath is ” ‘/excel-add-in-calculator/ ”, meaning the user left the website from this page; and pageDepth is 2, meaning the user viewed 2 pages during the visit. Let’s predict whether this user will come back to the website in the next 24 hours. The R code for this observation is shown below.
>in_d <- data.frame(DaySinceLastVisit = 0, visitCount = 2, f.medium = "organic", f.landingPagePath = "'/google-analytics-excel-pricing/", f.exitPagepath = "'/excel-add-in-calculator/", pageDepth = 2)
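The line above only builds the input record; the fitted model comes from the earlier parts of the series. As a rough sketch of how such a record might be scored, assuming a logistic regression fitted with `glm`, the example below trains on simulated data and calls `predict`. The `train` data frame, the `revisit` column, the simulated coefficients and the 0.5 cut-off are all illustrative assumptions, not our data or model, and the page-path factors are left out to keep the example short.

```r
# Hypothetical training data; the real data set from parts 1 and 2 is not used here.
set.seed(42)
n <- 200
train <- data.frame(
  DaySinceLastVisit = sample(0:10, n, replace = TRUE),
  visitCount        = sample(1:8, n, replace = TRUE),
  f.medium          = factor(rep(c("organic", "referral", "direct"), length.out = n)),
  pageDepth         = sample(1:6, n, replace = TRUE)
)

# Simulated outcome (assumption): frequent, recent visitors return more often.
p <- plogis(-1 + 0.5 * train$visitCount - 0.4 * train$DaySinceLastVisit)
train$revisit <- rbinom(n, size = 1, prob = p)

# Fit a logistic regression model.
model <- glm(revisit ~ DaySinceLastVisit + visitCount + f.medium + pageDepth,
             data = train, family = binomial)

# New observation; factor levels must match those seen during training.
in_d <- data.frame(
  DaySinceLastVisit = 0,
  visitCount        = 2,
  f.medium          = factor("organic", levels = levels(train$f.medium)),
  pageDepth         = 2
)

# Predicted probability of a return visit, then a 0/1 label at an assumed 0.5 cut-off.
prob  <- predict(model, newdata = in_d, type = "response")
label <- as.integer(prob > 0.5)

print(prob)
print(label)
```

Because the training data here is simulated, the printed probability and label say nothing about real visitors; only the `glm` and `predict` calls carry over to a real model.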
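style="text-align: justify;">For the tracked user, our model’s output is 1, which means the user is predicted to come back to the website in the next 24 hours. Now let’s make the same prediction using the Google Prediction API; the request and response are shown below.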
class="aligncenter size-full wp-image-3219" src="http://www.tatvic.com/blog/wp-content/uploads/2012/09/API_input.png" alt="" width="621" height="349" />
class="aligncenter size-full wp-image-3221" src="http://www.tatvic.com/blog/wp-content/uploads/2012/09/API_output.png" alt="" width="621" height="377" />
style="text-align: justify;">We can see from the response that outputLabel is “YES”, which means the user is predicted to come back in the next 24 hours. We have now made a prediction for the same user with both models (i.e. our model and the Prediction API model), and they agree.
style="text-align: justify;">Feel free to share your feedback on this series of posts, and let us know if you would like to carry out a similar predictive analysis.
style="color:#2361A1">Would you like to understand the value of predictive analysis when applied to web analytics data, and how it can improve your understanding of the relationships between different variables? We think you may like to watch our webinar: How to perform predictive analysis on your web analytics tool data.
href="http://www.tatvic.com/perform-predictive-analysis-on-your-web-analytics-tool/?utm_source=post&utm_medium=blog&%23038;utm_campaign=webinar3" >Watch the Replay now!
src="http://www.tatvic.com/blog/wp-content/uploads/userphoto/14.jpg" alt="Amar Gondaliya" width="60" class="photo" />
Amar is a data modeling engineer at Tatvic. He focuses on building predictive models from available data using R, Hadoop and the Google Prediction API.
Google Plus Profile:
href="https://plus.google.com/115682702585184320806/" >Amar Gondaliya