Articles by Selva Prabhakaran

Outlier detection and treatment with R

December 9, 2016 | Selva Prabhakaran

Outliers in data can distort predictions and reduce accuracy if you don’t detect and handle them appropriately, especially in regression models. Why is outlier treatment important? Because outliers can drastically bias the fit estimates and predictions. Let me illustrate this using the cars dataset. To better understand ... [Read more...]

Chi-Squared Test

August 14, 2016 | Selva Prabhakaran

Before we build statistical or machine learning models, it is good practice to understand which predictors are significant and have an impact on the response variable. In this post we deal with the particular case where both your response and predictor are categorical variables. By the end of this you’... [Read more...]

Missing Value Treatment

April 25, 2016 | Selva Prabhakaran

Missing values in data are a common phenomenon in real-world problems. Knowing how to handle missing values effectively is a required step to reduce bias and to produce powerful models. Let’s explore various options for dealing with missing values and how to implement them. Data prep and ...
[Read more...]

Learn R By Intensive Practice – Part 2

April 13, 2016 | Selva Prabhakaran

This is a continuation of Part 1 of the Learn R By Intensive Practice series. In this part, we step up the game and learn a number of key concepts such as lists, sampling, and data frames. At the end of each video, you will be solving a practice challenge based ...
[Read more...]

Learn R by Intensive Practice

March 10, 2016 | Selva Prabhakaran

Learn R by Intensive Practice is an introductory R course built especially for beginners who are completely new to R or even to basic programming. This is the first part of a multi-part series of video lessons that aims to give a hands-on learning experience throughout the course. In this and the coming parts, ...
[Read more...]

Strategies to Speedup R Code

January 30, 2016 | Selva Prabhakaran

The for-loop in R can be very slow in its raw, unoptimised form, especially when dealing with larger data sets. There are a number of ways to make your logic run faster, and you may be surprised how fast you can actually go. This post shows a number ... [Read more...]

Learn R From Scratch – Part 3

January 21, 2016 | Selva Prabhakaran

In the previous tutorial of the “Learn R From Scratch” series, we learned very important concepts such as lists, dataframes, and how to import and export data from R. This time, we will discuss more practical aspects such as exploring built-in datasets, handling dates, writing functions and debugging. In this ...
[Read more...]

Learn R From Scratch – Part 2

January 20, 2016 | Selva Prabhakaran

This is a continuation of Part 1 of the “Learn R From Scratch” series. In the previous post, the videos covered the very basics of R. We first installed R, got familiar with the environment, worked through some basic math and different types of variables, got introduced to vectors and learnt ...
[Read more...]

Learn R From Scratch – Part 1

January 19, 2016 | Selva Prabhakaran

R is an open source programming language with a lot of facilities for problem solving through statistical computing. At the time of writing, there are more than 6K packages available in the CRAN repository. R is a language and an environment for everything related to data analysis. That includes statistical ...
[Read more...]

How to detect heteroscedasticity and rectify it?

January 13, 2016 | Selva Prabhakaran

One of the important assumptions of linear regression is that there should be no heteroscedasticity of residuals. In simpler terms, this means that the variance of the residuals should not vary systematically (for example, increase) with the fitted values of the response variable. In this post, I am going to explain why it is important to check ... [Read more...]
