Blog Archives

Outlier detection and treatment with R

December 9, 2016

Outliers can distort predictions and reduce accuracy if you don’t detect and handle them appropriately, especially in regression models. Why is outlier treatment important? Because outliers can drastically bias the fit estimates and predictions. Let me illustrate this using the cars dataset. To better understand the implications of outliers, I am...

Chi-Squared Test

August 14, 2016

Before building statistical or machine learning models, it is good practice to understand which predictors are significant and have an impact on the response variable. In this post we deal with the particular case where both the response and the predictor are categorical variables. By the end of this post, you will have gained an understanding of what...

Missing Value Treatment

April 25, 2016

Missing values are a common phenomenon in real-world data. Knowing how to handle them effectively is a required step to reduce bias and produce powerful models. Let’s explore the options for dealing with missing values and how to implement them. Data prep and pattern: Let’s use the BostonHousing...

Learn R By Intensive Practice – Part 2

April 13, 2016

This is a continuation of Part 1 of the Learn R By Intensive Practice series. In this part, we step up the game and learn a number of key concepts such as lists, sampling and data frames. At the end of each video, you will solve a practice challenge based on what you learnt. 11. Get specific items...

Learn R by Intensive Practice

March 10, 2016

Learn R by Intensive Practice is an introductory R course built especially for beginners who are completely new to R or even to basic programming. This is the first part of a multi-part series of video lessons that aims to give a hands-on learning experience throughout the course. In this and the coming parts, I have covered the essential... 01. Install R and...

Strategies to Speedup R Code

January 30, 2016

The for-loop in R can be very slow in its raw, unoptimised form, especially when dealing with larger data sets. There are a number of ways to make your logic run faster, and you may be surprised by how fast you can actually go. This post shows a number of approaches, including simple tweaks...

Learn R From Scratch – Part 3

January 21, 2016

In the previous tutorial of the “Learn R From Scratch” series, we learnt very important concepts such as lists, data frames, and how to import and export data in R. This time, we will discuss more practical aspects such as exploring built-in datasets, handling dates, writing functions and debugging. In this series, I have also laid...

Learn R From Scratch – Part 2

January 20, 2016

This is a continuation of Part 1 of the “Learn R From Scratch” series. In the previous post, the videos covered the very basics of R. We first installed R, got familiar with the environment, worked through some basic math and the different types of variables, got introduced to vectors and learnt how to access and...

Learn R From Scratch – Part 1

January 19, 2016

R is an open source programming language with many facilities for problem solving through statistical computing. At the time of writing, there are more than 6K packages available in the CRAN repository. R is a language and an environment for everything related to data analysis. That includes statistical computing, data mining, data analysis,...

How to detect heteroscedasticity and rectify it?

January 13, 2016

One of the important assumptions of linear regression is that there should be no heteroscedasticity of residuals. In simpler terms, this means that the variance of the residuals should stay roughly constant rather than change with the fitted values of the response variable. In this post, I am going to explain why it is important to check for heteroscedasticity, how to detect...
