Articles by Posts on Anything Data

Predicting Churn Using Tree Models

December 31, 2018 | Posts on Anything Data

Today I want to predict churn using data from a hypothetical telecom company. Although it isn’t real-life data, it is based on real-life data. The data are spread across 19 columns: 14 continuous, 4 categorical, and the outcome variable for prediction, “churn”. The dataset is small, with 3333 rows for ... [Read more...]

Linear Classification Models – Hepatic Dataset

December 2, 2018 | Posts on Anything Data

This post follows Exercise 1 in Chapter 12 of Applied Predictive Modeling. Here I use the machine learning package caret in R to build classification models, in particular the linear classification models discussed in Chapter 12. The dataset in question concerns hepatic injury (liver damage). It includes a dataframe of biological ... [Read more...]

First App – Halo 5 Stats

September 20, 2018 | Posts on Anything Data

“Halo 5 Stats” is my first-ever Shiny app, themed around the popular sci-fi shooter Halo 5. It combines beautiful data with Halo’s beautiful graphics, something that Halo fans and data enthusiasts will undoubtedly love. You can use my app... [Read more...]

Plotting Word Bigrams with 3 Chinese Classics

May 31, 2018 | Posts on Anything Data

In the last post, we saw frequencies of the most common words in the Analects, Zhuangzi, and Mozi texts. The faceted plot did an excellent job of capturing a generic “theme” of each text. However, I wondered how the results might change when plotting bigrams (combinations of two adjacent words) ... [Read more...]

A Tidytext Analysis of 3 Chinese Classics

May 28, 2018 | Posts on Anything Data

For a long time I’ve admired the tidytext package and its wonderful companion book Text Mining with R. After reading it I thought, “Why not undertake a project of Chinese text analysis?” I am deeply interested in Chinese philosophy but I decided to keep the analysis narrow by selecting ... [Read more...]

Ctextclassics, my First Package

May 16, 2018 | Posts on Anything Data

My latest update is a milestone! I have authored my first-ever R package, an API caller for ctext.org. Ctext hosts numerous pre-modern Chinese texts, and my package makes them available to you. The scope is broad, but think philosophical works in Confucianism, Daoism, Legalism, military doctrines, ... [Read more...]

On Relocating to Github/Netlify

May 5, 2018 | Posts on Anything Data

(Deep, labored breathing) Hello everyone, this is the opening post on my new blog, which I’m relocating from WordPress to GitHub Pages and Netlify. It’s so nice I’ve given it a name - because nice things have names! But, why was I panting? The relocation effort wasn’... [Read more...]

Plotting Fortune 500 HQ’s in R

January 28, 2018 | Posts on Anything Data

Today I’d like to work a little on geospatial mapping in R, so I’ve chosen a small dataset (only 256 KB) that can be plotted on a map. It contains the location information of Fortune 500 company headquarters in the US. You can download it from here. R has several choices ... [Read more...]

Top MBA Programs by US News

June 30, 2017 | Posts on Anything Data

Somebody once asked me for recommendations on MBA programs based on rank and tuition. I didn’t have any information on hand, but I knew how to get it: web scraping. Web scraping is an immensely useful tool for gathering data from webpages when the data isn’t served through an API or stored in ... [Read more...]
