# Articles by Anirudh

### Linear / Logistic Regression in R: Dealing With Unknown Factor Levels in Test Data

October 7, 2017

Let’s say you have data containing a categorical variable with 50 levels. When you divide the data into train and test sets, chances are that not all 50 levels appear in your training set. This often happens when you divide the data set into train and test sets according ...
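The problem described in this excerpt can be reproduced in a few lines. What follows is a minimal sketch, not the post's own method: the toy data, the column names `grp` and `y`, and the choice to recode unseen levels as `NA` are all assumptions made for illustration.

```r
# Hypothetical toy data: a categorical predictor with 5 levels,
# one of which ("e") is rare.
set.seed(1)
dat <- data.frame(
  grp = factor(rep(letters[1:5], times = c(10, 10, 10, 10, 2))),
  y   = rnorm(42)
)

# A split in which level "e" never reaches the training set.
train <- droplevels(dat[dat$grp != "e", ])
test  <- dat[c(1, 11, 41, 42), ]  # rows 41 and 42 carry the unseen level "e"

fit <- lm(y ~ grp, data = train)

# predict() stops with an error when newdata contains a level
# the model was not fitted on.
res    <- try(predict(fit, newdata = test), silent = TRUE)
failed <- inherits(res, "try-error")

# One possible workaround (an assumption, not the post's approach):
# rebuild the factor on the training levels, so unseen values become NA.
unseen <- setdiff(as.character(unique(test$grp)), levels(train$grp))
test_fixed <- test
test_fixed$grp <- factor(as.character(test_fixed$grp),
                         levels = levels(train$grp))

# Rows with an NA predictor now get an NA prediction instead of an error.
pred <- predict(fit, newdata = test_fixed)
```

Recoding to `NA` only makes the failure explicit. Whether those rows should be dropped, mapped to a catch-all level, or handled some other way depends on the analysis.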

### Quick Way of Installing all your old R libraries on a New Device

July 26, 2017

I recently bought a new laptop and began installing essential software all over again, including R of course! And I wanted all the libraries that I had installed on my previous laptop. Instead of installing libraries one by one all over again, I did the following: Step 1: Save a list ...
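The excerpt is cut off before the steps are spelled out, so the following is only one common way to carry a package list between machines, not necessarily the post's procedure. The file location and the variable names are hypothetical, and the install call is left commented out so the sketch has no side effects.

```r
# On the old machine: save the names of all installed packages.
pkgs <- rownames(installed.packages())
list_file <- file.path(tempdir(), "installed_packages.rds")  # hypothetical path
saveRDS(pkgs, list_file)

# On the new machine: read the list back and find what is still missing.
old_pkgs   <- readRDS(list_file)
to_install <- setdiff(old_pkgs, rownames(installed.packages()))

# Uncomment to actually install the missing packages from CRAN.
# install.packages(to_install)
```

In practice the saved file would live somewhere that survives the move, such as a USB drive or cloud folder, rather than in `tempdir()`.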

### Endogenously Detecting Structural Breaks in a Time Series: Implementation in R

November 8, 2016

The most conventional approach to detecting structural breaks in longitudinal data seems to be the Chow test. From Wikipedia: The Chow test, proposed by econometrician Gregory Chow in 1960, is a test of whether the coefficients in two linear regressions on different data sets are equal. In econometrics, it is most ...
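For a break point that is assumed known in advance, the Chow statistic can be computed by hand from three regressions. This is a minimal sketch on simulated data; the series, the break location `break_at`, and the helper `rss()` are hypothetical and are not taken from the post.

```r
# Hypothetical series whose slope changes after observation 50.
set.seed(42)
n <- 100
x <- seq_len(n)
y <- ifelse(x <= 50, 1 + 0.5 * x, 26 - 0.2 * (x - 50)) + rnorm(n)
break_at <- 50  # assumed known break point

# Residual sum of squares of a fitted model.
rss <- function(fit) sum(residuals(fit)^2)

pooled <- lm(y ~ x)                         # one regression on all the data
first  <- lm(y ~ x, subset = x <= break_at) # regression before the break
second <- lm(y ~ x, subset = x > break_at)  # regression after the break

k     <- length(coef(pooled))               # parameters per regression
rss_u <- rss(first) + rss(second)           # unrestricted RSS

# Chow F statistic: does splitting the sample reduce the RSS
# by more than chance would explain?
f_stat  <- ((rss(pooled) - rss_u) / k) / (rss_u / (n - 2 * k))
p_value <- pf(f_stat, df1 = k, df2 = n - 2 * k, lower.tail = FALSE)
```

A small `p_value` rejects the hypothesis that both segments share the same coefficients. The statistic relies on the break date being supplied by the analyst, which is the limitation that endogenous break detection is meant to address.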

### MITx 15.071x (Analytics Edge) – 2016

May 2, 2016

There's still time to enroll and grab a certificate (or simply audit). The course is offered once a year. At a data hackathon I attended recently, I met a bunch of people who did well and who had learned the ropes in data science thanks to Analytics Edge.

### Detecting Structural Breaks in China’s FX Regime

April 26, 2016

Edit: This post is in its infancy. Work on deriving insight from the data is still ongoing. More content and economic insight will be added to this post as and when progress is made in that direction. This is an attempt to detect structural ...

### Data Manipulation in R with dplyr – Part 3

December 22, 2015

This happens to be my 50th blog post, and my blog is 8 months old. This post is the third and last in a series of posts (Part 1 – Part 2) on data manipulation with dplyr. Note that the objects in the code may have been defined in earlier posts ...

### My First Data Science Hackathon

December 20, 2015

> I participated in https://t.co/alLuY7JjjT Finished 24th/54. It was my first ever #datascience #hackathon. Determined to get better at this.
>
> Padawan Learner (@anirudhjay), December 20, 2015

So after 8 months of playing around with R and Python and blog post after blog post, I found myself finally hacking away at ...

### Data Manipulation in R with dplyr – Part 2

December 18, 2015

Note that this post is a continuation of Part 1 of this series of posts on data manipulation with dplyr in R. The code in this post carries forward the variables / objects defined in Part 1. In the previous post, I talked about how dplyr provides a grammar of sorts to ...

### Data Manipulation in R with dplyr – Part 1

December 17, 2015

dplyr is one of the packages in R that makes R so loved by data scientists. It has three main goals: Identify the most important data manipulation tools needed for data analysis and make them easy to use in R. Provide blazing fast performance for in-memory data by writing key ...

### Statistical Learning – 2016

December 12, 2015

On January 12, 2016, Stanford University professors Trevor Hastie and Rob Tibshirani will offer the 3rd iteration of Statistical Learning, a MOOC which began in January 2014 and has become quite a popular course among data scientists. It is a great place to learn statistical learning (machine learning) methods using the R ...

### Troubleshooting ‘Rattle’ (R library) Installation on Ubuntu

November 23, 2015

This post pertains to Ubuntu / Debian users only. rattle is a free graphical interface for data mining with R. I wanted to visualize decision trees and had to install this library. `install.packages('rattle')` got me the following error message: configure: error: GTK version 2.8.0 required ERROR: configuration failed for package ‘...