Blog Archives

Interactive plotting with rbokeh

February 17, 2016

Hello everyone! In this post, I will show you how you can use rbokeh to build interactive graphs and maps in R. What is Bokeh? Bokeh is a popular Python library used for building interactive plots and maps, and now it is also available in R, thanks to Ryan Hafen. It is a very powerful

Read more »

Predicting wine quality using Random Forests

February 4, 2016

Hello everyone! In this article I will show you how to run the random forest algorithm in R. We will use the wine quality data set (white) from the UCI Machine Learning Repository. What is the Random Forest Algorithm? In a previous post, I outlined how to build decision trees in R. While decision trees

Read more »

Hierarchical Clustering in R

January 22, 2016

Hello everyone! In this post, I will show you how to do hierarchical clustering in R. We will use the iris dataset again, like we did for K means clustering. What is hierarchical clustering? If you recall from the post about k means clustering, it requires us to specify the number of clusters, and finding

Read more »

Data manipulation with tidyr

January 6, 2016

Hello everyone! In this article, I will show you how you can use tidyr for data manipulation. tidyr is a package by Hadley Wickham that makes it easy to tidy your data. It is often used in conjunction with dplyr. Data is said to be tidy when each column represents a variable, and each row

Read more »

K Means Clustering in R

December 28, 2015

Hello everyone, hope you had a wonderful Christmas! In this post I will show you how to do k means clustering in R. We will use the iris dataset from the datasets library. What is K Means Clustering? K Means Clustering is an unsupervised learning algorithm that tries to cluster data based on their similarity.

Read more »

Using Decision Trees to Predict Infant Birth Weights

December 16, 2015

In this article, I will show you how to use decision trees to predict whether the birth weights of infants will be low or not. We will use the birthwt data from the MASS library. What is a decision tree? A decision tree is an algorithm that builds a flowchart-like graph to illustrate the

Read more »

Visualizing MLS Player Salaries with ggplot2

November 23, 2015

Recently, I came across this great visualization of MLS player salaries. I tried to do something similar with ggplot2, and while I was unable to replicate the interactivity or the treemap nature of the graph, the graph still looks pretty cool. Data The data is contained in this PDF file. I obtained a CSV file

Read more »

Building Interactive Maps with Leaflet

November 7, 2015

Leaflet is a JavaScript library for building interactive maps. RStudio released a package that allows us to build these maps in R! You can do some really cool things in Leaflet, and I will demonstrate a few of those below. Leaflet is compatible with Shiny apps and R Markdown documents. As mentioned on the RStudio

Read more »

Using kNN Classifier to Predict Whether the Price of Stock Will Increase

October 23, 2015

In this article, I will show you how to use the k-Nearest Neighbors algorithm (kNN for short) to predict whether the price of Apple stock will increase or decrease. I obtained the data from Yahoo Finance. You can download the dataset here. What is the k-Nearest Neighbors algorithm? The kNN algorithm is a non-parametric algorithm that

Read more »

Data manipulation with reshape2

October 9, 2015

In this article, I will show you how you can use the reshape2 package to convert data from wide to long format and vice versa. It was written and is maintained by Hadley Wickham. Long format vs Wide format In wide format data, each column represents a different variable. For example, the mtcars dataset from

Read more »
