Blog Archives

Linear / Logistic Regression in R: Dealing With Unknown Factor Levels in Test Data

October 7, 2017

Let’s say you have data containing a categorical variable with 50 levels. When you divide the data into train and test sets, chances are that not all 50 levels will appear in your training set. This often happens when you divide the data set into train and test sets according to the distribution of the …

Quick Way of Installing all your old R libraries on a New Device

July 26, 2017

I recently bought a new laptop and began installing essential software all over again, including R of course! I wanted all the libraries that I had installed on my previous laptop. Instead of installing the libraries one by one all over again, I did the following: Step 1: Save a list of packages installed in …

Endogenously Detecting Structural Breaks in a Time Series: Implementation in R

November 8, 2016

The most conventional approach to determining structural breaks in longitudinal data seems to be the Chow test. From Wikipedia: The Chow test, proposed by econometrician Gregory Chow in 1960, is a test of whether the coefficients in two linear regressions on different data sets are equal. In econometrics, it is most commonly used in time …

MITx 15.071x (Analytics Edge) – 2016

May 2, 2016

There's still time to enroll and grab a certificate (or simply audit). The course is offered once a year. I met a bunch of people who did well at a data hackathon I had gone to recently, who had learned the ropes in data science thanks to Analytics Edge.

Detecting Structural Breaks in China’s FX Regime

April 26, 2016

Edit: This post is in its infancy. Work on deriving insight from the data is still ongoing. More content and economic insight will be added to this post as progress is made. This is an attempt to detect structural breaks in China’s FX regime …

Data Manipulation in R with dplyr – Part 3

December 22, 2015

This happens to be my 50th blog post, and my blog is 8 months old. This post is the third and last in a series of posts (Part 1 – Part 2) on data manipulation with dplyr. Note that the objects in the code may have been defined in earlier posts …

My First Data Science Hackathon

December 20, 2015

"I participated in https://t.co/alLuY7JjjT Finished 24th/54. It was my first ever #datascience #hackathon. Determined to get better at this." (Padawan Learner, @anirudhjay, December 20, 2015) So after 8 months of playing around with R and Python and blog post after blog post, I found myself finally hacking away at a problem set from the 17th …

Data Manipulation in R with dplyr – Part 2

December 18, 2015

Note that this post is a continuation of Part 1 of this series of posts on data manipulation with dplyr in R. The code in this post carries forward the variables and objects defined in Part 1. In the previous post, I talked about how dplyr provides a grammar of sorts to manipulate data, …

Data Manipulation in R with dplyr – Part 1

December 17, 2015

dplyr is one of the packages that makes R so loved by data scientists. It has three main goals: (1) identify the most important data manipulation tools needed for data analysis and make them easy to use in R; (2) provide blazing fast performance for in-memory data by writing key pieces of code in C++; …

Statistical Learning – 2016

December 12, 2015

On January 12, 2016, Stanford University professors Trevor Hastie and Rob Tibshirani will offer the 3rd iteration of Statistical Learning, a MOOC that began in January 2014 and has become quite popular among data scientists. It is a great place to learn statistical learning (machine learning) methods using the R programming language. …
