# Articles by Teja Kodali

### Interactive plotting with rbokeh

February 17, 2016

Hello everyone! In this post, I will show you how you can use rbokeh to build interactive graphs and maps in R. What is bokeh? Bokeh is a popular python library used for building interactive plots and maps, and now it is also available in R, thanks to Ryan Hafen. ... [Read more...]

### Predicting wine quality using Random Forests

February 4, 2016

Hello everyone! In this article I will show you how to run the random forest algorithm in R. We will use the wine quality data set (white) from the UCI Machine Learning Repository. What is the Random Forest Algorithm? In a previous post, I outlined how to build decision trees ... [Read more...]

### Hierarchical Clustering in R

January 22, 2016

Hello everyone! In this post, I will show you how to do hierarchical clustering in R. We will use the iris dataset again, like we did for K means clustering. What is hierarchical clustering? If you recall from the post about k means clustering, it requires us to specify the ... [Read more...]

### Data manipulation with tidyr

January 6, 2016

Hello everyone! In this article, I will show you how you can use tidyr for data manipulation. tidyr is a package by Hadley Wickham that makes it easy to tidy your data. It is often used in conjunction with dplyr. Data is said to be tidy when each column represents ...

### K Means Clustering in R

December 28, 2015

Hello everyone, hope you had a wonderful Christmas! In this post I will show you how to do k means clustering in R. We will use the iris dataset from the datasets library. What is K Means Clustering? K Means Clustering is an unsupervised learning algorithm that tries to cluster ... [Read more...]
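
As a rough preview of what that looks like in practice, here is a minimal sketch using `kmeans()` from base R's stats package on the four numeric columns of iris. The choice of three clusters, the seed, and `nstart = 20` are my assumptions for illustration and are not taken from the full post.

```r
# Minimal k-means sketch on iris (assumed settings, not the post's exact code).
set.seed(42)                       # k-means starts from random centers

features <- iris[, 1:4]            # drop the Species label: clustering is unsupervised

# centers = 3 is an assumption (iris has three species);
# nstart = 20 reruns the algorithm from 20 random starts and keeps the best fit.
fit <- kmeans(features, centers = 3, nstart = 20)

# Cross-tabulate cluster assignments against the known species.
print(table(fit$cluster, iris$Species))
```

The cluster numbers themselves are arbitrary labels, so the table is read by looking at which species dominates each row.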

### Using Decision Trees to Predict Infant Birth Weights

December 16, 2015

In this article, I will show you how to use decision trees to predict whether the birth weights of infants will be low or not. We will use the birthwt data from the MASS library. What is a decision tree? A decision tree is an algorithm that builds a flowchart ... [Read more...]

### Visualizing MLS Player Salaries with ggplot2

November 23, 2015

Recently, I came across this great visualization of MLS Player salaries. I tried to do something similar with ggplot2, and while I was unable to replicate the interactivity or the tree-map nature of the graph, the graph still looks pretty cool. Data The data is contained in this pdf file. ... [Read more...]

### Building Interactive Maps with Leaflet

November 7, 2015

Leaflet is a JavaScript library for building interactive maps. RStudio released a package that allows us to build these maps in R! You can do some really cool things in Leaflet, and I will demonstrate a few of those below. Leaflet is compatible with Shiny apps and R Markdown documents. ... [Read more...]

### Using kNN Classifier to Predict Whether the Price of Stock Will Increase

October 23, 2015

In this article, I will show you how to use the k-Nearest Neighbors algorithm (kNN for short) to predict whether the price of Apple stock will increase or decrease. I obtained the data from Yahoo Finance. You can download the dataset here. What is the k-Nearest Neighbors algorithm? The kNN algorithm ... [Read more...]

### Data manipulation with reshape2

October 9, 2015

In this article, I will show you how you can use the reshape2 package to convert data from wide to long format and vice versa. It was written and is maintained by Hadley Wickham. Long format vs Wide format In wide format data, each column represents a different variable. For ...

### Using Linear Regression to Predict Energy Output of a Power Plant

September 29, 2015

In this article, I will show you how to fit a linear regression to predict the energy output at a Combined Cycle Power Plant (CCPP). The dataset is obtained from the UCI Machine Learning Repository. The dataset contains five columns, namely, Ambient Temperature (AT), Ambient Pressure (AP), Relative Humidity (RH), ...

### Using the ggplot2 library in R

September 20, 2015

In this article, I will show you how to use the ggplot2 plotting library in R. It was written by Hadley Wickham. If you don’t already have it, install it and load it up: install.packages('ggplot2') library(ggplot2) qplot qplot is the quickest way to get ... [Read more...]

### Using the apply family of functions in R

September 12, 2015

In this article, I will demonstrate how to use the apply family of functions in R. They are extremely helpful, as you will see. apply apply can be used to apply a function to a matrix. For example, let’s create a sample dataset: data
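
The excerpt cuts off before the sample dataset, so here is a small sketch of the idea with a made-up matrix. The matrix `m` and the variable names are hypothetical and are not the data from the full post; `apply()` itself is base R, and its second argument picks rows (1) or columns (2).

```r
# Hypothetical 3 x 4 matrix, filled column by column with 1..12.
m <- matrix(1:12, nrow = 3, ncol = 4)

# MARGIN = 1 applies the function to each row.
row_sums <- apply(m, 1, sum)

# MARGIN = 2 applies the function to each column.
col_means <- apply(m, 2, mean)

print(row_sums)   # 22 26 30
print(col_means)  # 2 5 8 11
```

For plain sums and means, `rowSums()` and `colMeans()` do the same job faster; `apply()` earns its keep when the function is something custom.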

### Building interactive web apps with Shiny

September 11, 2015

In this post, I will show you how to build this app. I will be using the dataset for yellow taxis in the month of January 2015 provided by the NYC Taxi & Limousine Commission. You will need RStudio for this. Since the dataset is very big, I created a smaller dataset ... [Read more...]

### Building Wordclouds in R

August 28, 2015

In this article, I will show you how to use text data to build word clouds in R. We will use a dataset containing around 200k Jeopardy questions. The dataset can be downloaded here (thanks to reddit user trexmatt for providing the dataset). We will require three packages for this: ... [Read more...]

### Data Manipulation with dplyr

August 20, 2015

dplyr is a package for data manipulation, written and maintained by Hadley Wickham. It provides some great, easy-to-use functions that are very handy when performing exploratory data analysis and manipulation. Here, I will provide a basic overview of some of the most useful functions contained in the package. For this ...