Blog Archives

Hybrid content-based and collaborative filtering recommendations with {ordinal} logistic regression (2): Recommendation as discrete choice

April 14, 2017

In this continuation of “Hybrid content-based and collaborative filtering recommendations with {ordinal} logistic regression (1): Feature engineering” I will describe the application of the {ordinal} clm() function to test a new, hybrid content-based, collaborative filtering approach to recommender engines by fitting a class of ordinal logistic (aka ordered logit) models to ratings data from...

Hybrid content-based and collaborative filtering recommendations with {ordinal} logistic regression (1): Feature engineering

April 14, 2017

I will use {ordinal} clm() (and other cool R packages such as {text2vec}) here to develop a hybrid content-based, collaborative filtering, and (obviously) model-based approach to solve the recommendation problem on the MovieLens 100K dataset in R. All R code used in this project can be obtained from the respective GitHub...

#AskNASA: What’s the Optimal Time for Aliens to Invade Earth?

February 22, 2017

This post was originally published on SmartCat, 22 Feb 2017. My inaugural blog post as a Data Science Consultant for SmartCat. The code that accompanies the analyses presented here is available at the respective GitHub repository. On how to use R to estimate the optimal time during the day for aliens to invade Earth and a few more...

R in Open Data: Complaints in The Field of Freedom of Information data set from data.gov.rs

February 12, 2017

The notebooks (R, Rmd, and HTML files are provided in my GitHub repository) focus on an exploratory analysis of the open data set on complaints in the field of freedom of information, provided at the Open Data Portal of the Republic of Serbia, which is currently under development. The data set was kindly provided to the...

Open Data R Meetup: exploring the Distribution of Traffic Accidents in Belgrade, 2015 in R

January 31, 2017

The R code that accompanies this post is found on GitHub: you will find R, Rmd, and HTML files there that were used during the first Open Data R Meetup held in Belgrade, 31 January 2017, organized by Data Science Serbia in Startit Center, Savska 5, Bel...

Distributional Semantics in R: Part 2 Entity Recognition w. {openNLP}

January 2, 2017

The R code for this tutorial on Methods of Distributional Semantics in R is found in the respective GitHub repository. You will find .R, .Rmd, and .html files corresponding to each part of this tutorial (e.g. DistSemanticsBelgradeR-Part2.R, DistSemant...

Distributional Semantics in R: Part 1 {tm} classes + read/write

December 24, 2016

The R code for this tutorial on Methods of Distributional Semantics in R is found in the respective GitHub repository. Following my Methods of Distributional Semantics in R BelgradeR Meetup with Data Science Serbia, organized in Startit Center, Belgrade, 11/30/2016, several people asked me for the R code used for the analysis of William Shakespeare’s...

Introduction to R for Data Science :: Session 8 [Intro to Text Mining in R, ML Estimation + Binomial Logistic Regression]

June 21, 2016

Welcome to Introduction to R for Data Science, Session 8: Intro to Text Mining in R, ML Estimation + Binomial Logistic Regression [Web-scraping with tm.plugin.webmining. The tm package corpora structures: assessing document metadata and content. Typical corpus transformations and Term-Document Matrix production. A simple binomial regression model with tf-idf scores as features and its shortcomings due to sparse data....

Introduction to R for Data Science :: Session 8 [Appendix]

June 20, 2016

Appendix to Session 8: Intro to Text Mining in R, ML Estimation + Binomial Logistic Regression. Welcome to Introduction to R for Data Science, Session 8: Intro to Text Mining in R, ML Estimation + Binomial Logistic Regression [Web-scraping with tm.plugin.webmining. The tm package corpora structures: assessing document metadata and content. Typical corpus transformations and Term-Document Matrix...

Introduction to R for Data Science :: Session 7 [Multiple Linear Regression Model in R + Categorical Predictors, Partial and Part Correlation]

June 9, 2016

Welcome to Introduction to R for Data Science, Session 7: Multiple Regression + Dummy Coding, Partial and Part Correlations [Multiple Linear Regression in R. Dummy coding: various ways to do it in R. Factors. Inspecting the multiple regression model: regression coefficients and their interpretation, confidence intervals, predictions. Introducing {lattice} plots + ggplot2. Assumptions: multicollinearity and testing it from the...
