Articles by Pablo C.

Anomaly Detection in R

December 17, 2015 | Pablo C.

Introduction: Inspired by this Netflix post, I decided to write a post on this topic using R. There are several nice packages for this purpose; the one we're going to review is AnomalyDetection. Download the full (and tiny) R code of this post here. Normal vs. Abnormal: The ... [Read more...]

Recommendation Systems in R

September 12, 2015 | Pablo C.

These systems are used in cross-selling industries, and they measure correlated items as well as their user ratings. This last point wasn't included in the apriori algorithm (or association rules) used in market basket analysis. The link: http://blog.yha... [Read more...]

{Long Vs. Wide} Data Frames

July 24, 2015 | Pablo C.

Introduction: This is an excellent resource for understanding two data frame formats: long and wide. Just take a look at figure 1 inside the article. 1) Long format: in certain scenarios ggplot2 needs this kind of format to work (generally grouped... [Read more...]

Data Science – Short lesson on cluster analysis

May 13, 2015 | Pablo C.

Introduction: In clustering, you let the data be grouped according to similarity. A cluster model is a group of segments (clusters) containing cases (such as clients, patients, cars, etc.). Once a cluster model is developed, one question arises: How can I describe my model? Here we present a way ... [Read more...]

EU Life Quality Geo Report

May 6, 2015 | Pablo C.

Living longer, living better? It's as important to measure the quality of life as its length. Analyzing data from Eurostat, which contains the following two variables: 1- Healthy life years: Is a health expectancy indicator which com... [Read more...]

Dynamic analysis on outliers

April 24, 2015 | Pablo C.

Treating outliers. Introduction: Outliers are the extreme values of a variable. Depending on the model or requirement, it may be necessary to treat them, either by transforming or by deleting them. Variable “Income” distribution: This is going to be our main variable in this example; it represents the customer's income in $. We can ... [Read more...]

Geo Analysis

March 19, 2015 | Pablo C.

EU - Life Quality Geo Report. Living longer, living better? It's as important to measure the quality of life as its length. Analyzing data from Eurostat, which contains the following two variables: 1- Healthy life years: Is a healt... [Read more...]
