Articles by Teja Kodali

Interactive plotting with rbokeh

February 17, 2016 | Teja Kodali

Hello everyone! In this post, I will show you how you can use rbokeh to build interactive graphs and maps in R. What is bokeh? Bokeh is a popular Python library used for building interactive plots and maps, and now it is also available in R, thanks to Ryan Hafen. ... [Read more...]

Predicting wine quality using Random Forests

February 4, 2016 | Teja Kodali

Hello everyone! In this article I will show you how to run the random forest algorithm in R. We will use the wine quality data set (white) from the UCI Machine Learning Repository. What is the Random Forest Algorithm? In a previous post, I outlined how to build decision trees ... [Read more...]

Hierarchical Clustering in R

January 22, 2016 | Teja Kodali

Hello everyone! In this post, I will show you how to do hierarchical clustering in R. We will use the iris dataset again, like we did for K means clustering. What is hierarchical clustering? If you recall from the post about k means clustering, it requires us to specify the ... [Read more...]
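As a rough illustration of the kind of workflow this post covers, here is a minimal sketch using only base R. The choice of Euclidean distance, complete linkage, and a cut at three groups are assumptions made for this example and are not taken from the original post.

```r
# Distance matrix over the four numeric columns of the built-in iris data
# (Euclidean distance is dist()'s default; an assumption for this sketch)
d <- dist(iris[, 1:4])

# Agglomerative clustering; "complete" linkage is an illustrative choice
hc <- hclust(d, method = "complete")

# Cut the tree into three groups (three is assumed here, matching the
# number of species in iris)
groups <- cutree(hc, k = 3)

# Cross-tabulate the groups against the known species labels
table(groups, iris$Species)
```

Calling plot(hc) afterwards draws the dendrogram, which is the usual way to inspect where the tree might reasonably be cut.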

Data manipulation with tidyr

January 6, 2016 | Teja Kodali

Hello everyone! In this article, I will show you how you can use tidyr for data manipulation. tidyr is a package by Hadley Wickham that makes it easy to tidy your data. It is often used in conjunction with dplyr. Data is said to be tidy when each column represents ... [Read more...]

K Means Clustering in R

December 28, 2015 | Teja Kodali

Hello everyone, hope you had a wonderful Christmas! In this post I will show you how to do k means clustering in R. We will use the iris dataset from the datasets library. What is K Means Clustering? K Means Clustering is an unsupervised learning algorithm that tries to cluster ... [Read more...]
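For readers who want a quick preview, the following is a minimal sketch of k-means on the iris data with base R's kmeans(). The seed, the choice of three centers, and nstart = 20 are assumptions for this example, not settings confirmed by the original post.

```r
# Fix the random seed so the random starting centers are reproducible
set.seed(42)

# Use only the four numeric measurement columns; Species is held back
features <- iris[, 1:4]

# Three centers and 20 random restarts are illustrative choices
fit <- kmeans(features, centers = 3, nstart = 20)

# Compare the cluster assignments with the known species labels
table(cluster = fit$cluster, species = iris$Species)
```

Because the algorithm is unsupervised, the cluster numbers are arbitrary labels; the table only shows how well the groups line up with the species.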

Visualizing MLS Player Salaries with ggplot2

November 23, 2015 | Teja Kodali

Recently, I came across this great visualization of MLS player salaries. I tried to do something similar with ggplot2, and while I was unable to replicate the interactivity or the tree-map nature of the graph, the graph still looks pretty cool. Data The data is contained in this PDF file. ... [Read more...]

Building Interactive Maps with Leaflet

November 7, 2015 | Teja Kodali

Leaflet is a JavaScript library for building interactive maps. RStudio released a package that allows us to build these maps in R! You can do some really cool things in Leaflet, and I will demonstrate a few of those below. Leaflet is compatible with Shiny apps and R Markdown documents. ... [Read more...]

Data manipulation with reshape2

October 9, 2015 | Teja Kodali

In this article, I will show you how you can use the reshape2 package to convert data from wide to long format and vice versa. It was written and is maintained by Hadley Wickham. Long format vs Wide format In wide format data, each column represents a different variable. For ... [Read more...]

Using the ggplot2 library in R

September 20, 2015 | Teja Kodali

In this article, I will show you how to use the ggplot2 plotting library in R. It was written by Hadley Wickham. If you don’t already have it, install it and load it: install.packages('ggplot2') library(ggplot2) qplot qplot is the quickest way to get ... [Read more...]

Using the apply family of functions in R

September 12, 2015 | Teja Kodali

In this article, I will demonstrate how to use the apply family of functions in R. They are extremely helpful, as you will see. apply apply can be used to apply a function to a matrix. For example, let’s create a sample dataset: data [Read more...]
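Since the excerpt cuts off before its own sample dataset, here is a small stand-in sketch of apply() on a matrix. The matrix m and its values are hypothetical and are not the data used in the original post.

```r
# Hypothetical sample matrix: 3 rows, 4 columns, filled column by column
m <- matrix(1:12, nrow = 3, ncol = 4)

# MARGIN = 1 applies the function to each row
row_sums <- apply(m, 1, sum)

# MARGIN = 2 applies the function to each column
col_means <- apply(m, 2, mean)

print(row_sums)
print(col_means)
```

The second argument is the only thing that changes between the two calls: 1 walks over rows and 2 walks over columns.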

Building interactive web apps with Shiny

September 11, 2015 | Teja Kodali

In this post, I will show you how to build this app. I will be using the dataset for yellow taxis in the month of January 2015 provided by the NYC Taxi & Limousine Commission. You will need RStudio for this. Since the dataset is very big, I created a smaller dataset ... [Read more...]

Building Wordclouds in R

August 28, 2015 | Teja Kodali

In this article, I will show you how to use text data to build word clouds in R. We will use a dataset containing around 200k Jeopardy questions. The dataset can be downloaded here (thanks to reddit user trexmatt for providing the dataset). We will require three packages for this: ... [Read more...]

Data Manipulation with dplyr

August 20, 2015 | Teja Kodali

dplyr is a package for data manipulation, written and maintained by Hadley Wickham. It provides some great, easy-to-use functions that are very handy when performing exploratory data analysis and manipulation. Here, I will provide a basic overview of some of the most useful functions contained in the package. For this ... [Read more...]
