Blog Archives

Predict Customer Churn – Logistic Regression, Decision Tree and Random Forest

November 20, 2017

Customer churn, also known as customer attrition, occurs when customers or subscribers stop doing business with a company or service. It is also referred to as loss of clients or customers. One industry in which churn rates are particularly useful is the telecommunications industry, because most customers have multiple options from which to choose within a ...

How Happy is Your Country? — Happy Planet Index Visualized

November 9, 2017

The Happy Planet Index (HPI) is an index of human well-being and environmental impact that was introduced by NEF, a UK-based economic think tank promoting social, economic and environmental justice. It ranks 140 countries according to “what matters most — sustainable wellbeing for all”. This is how HPI is calculated: It tells us “how well nations are ...

Exploring, Clustering, and Mapping Toronto’s Crimes

November 2, 2017

Motivation: I have had a lot of fun exploring US cities’ crime data via their open data portals, because Toronto’s crime data was simply not available. It was not until the summer of this year that the Toronto police launched a public safety data portal to increase transparency between the public and officers. I recently have had the ...

A Gentle Introduction on Market Basket Analysis — Association Rules

October 2, 2017

Market Basket Analysis is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy. Association rules are widely used to ...

Text Analysis with Term Frequency for Mark Twain’s Novels

September 12, 2017

Introduction: Samuel Langhorne Clemens, otherwise known as Mark Twain, is one of the most important American writers. “The Adventures of Tom Sawyer” is probably one of my favorite books in all of English literature. Happy to see that Twain’...

Topic Modeling of New York Times Articles

September 3, 2017

In machine learning and natural language processing, a “topic” consists of a cluster of words that frequently occur together. A topic model is a type of statistical model for discovering the abstract “topics” that occur in a collection of ...

4 years of The Hacker News, in 5 Charts

August 22, 2017

Introduction: Hacker News is one of my favorite sites to catch up on technology and startup news, but navigating the minimalistic website can sometimes be tedious. Therefore, my plan in this post is to show you how this social news site ...

Modeling and prediction for movies

June 27, 2017

Setup: This project details our analysis of a movie dataset that contains information from Rotten Tomatoes and IMDb for a random sample of movies. The purpose of this project is to develop a multiple linear regression model to understand what att...

United Nations General Assembly Voting Data Analysis

June 19, 2017

I recently came across an R package called “unvotes” that contains the voting history of countries in the United Nations General Assembly from 1946 to 2015. The package was developed by David Robinson. Explore the data: library(ggplot2) librar...

Statistical inference with the General Social Survey Data

June 6, 2017

Setup. Load packages: library(ggplot2), library(dplyr), library(statsr). Load data: load("gss.rdata"). Part 1: Data Background. The General Social Survey (GSS) is a sociological survey used to collect information and keep a historical record of the...
