Linear Regression with R : step by step implementation part-1


Welcome to the first part of my blog post series. In this post, I will discuss how to implement linear regression step by step in R, starting from the underlying concept of regression. I will explain linear regression briefly and convert the mathematical formulas into code (I hope you like this!). I was inspired to write this post by Andrew Ng and the Machine Learning course on Coursera.

So, let us start by understanding linear regression. You can also watch the video lectures on linear regression from the Machine Learning class. We will keep the explanation brief.

Regression

Regression is widely used for prediction. The focus of regression is on the relationship between a dependent variable and one or more independent variables. Regression analysis helps us understand how the value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed. In regression, the dependent variable is estimated as a function of the independent variables; this function is called the regression function. The regression model uses the following parameters:

  1. Independent variables X.
  2. Dependent variable Y.
  3. Unknown parameters Ɵ.

In the regression model, Y is a function of (X, Ɵ). There are many techniques for regression analysis.

Linear regression

In linear regression, the dependent variable Y is a linear combination of the independent variables. Here the regression function is known as the hypothesis, which is defined as below:

hƟ(x) = f(x,Ɵ)

Suppose the dependent variable is Y and the independent variables are x1, x2 and x3. The hypothesis is defined as below:

hƟ(x) = Ɵ0 + Ɵ1x1 + Ɵ2x2 + Ɵ3x3
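In R, this hypothesis can be written as a matrix product. The following is a minimal sketch (the variable names and sample data are my own illustration, not from the post): the independent variables are stacked into a matrix with a leading column of ones for the intercept term Ɵ0.

```r
# Minimal sketch: the hypothesis as a matrix product X %*% theta
hypothesis <- function(X, theta) {
  X %*% theta                       # h_theta(x) for every row of X
}

# Illustrative data: 5 observations of x1, x2, x3
set.seed(1)
X <- cbind(1, x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))
theta <- c(0, 0, 0, 0)              # all coefficients start at zero
hypothesis(X, theta)                # predictions are all zero at this point
```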

The goal is to find the values of the Ɵ's (known as coefficients) so that the model fits the data. In linear regression, the regression function is a straight line, and we predict the value of the dependent variable from the independent variables. Starting with all Ɵ values at zero, the difference between the actual and predicted values is large. The cost function measures how well the linear regression model fits the data. It is defined as below:

J(Ɵ) = (1/2m) Σi=1..m (hƟ(x(i)) − y(i))², where m is the number of training examples and (x(i), y(i)) is the i-th example.
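This cost function translates directly into R. Here is a minimal sketch, assuming X already contains the intercept column (as in the earlier snippet) and y is the vector of actual values:

```r
# Sketch of the cost function J(theta)
cost <- function(X, y, theta) {
  m <- length(y)                    # number of training examples
  error <- X %*% theta - y          # predicted minus actual values
  sum(error^2) / (2 * m)            # squared error summed and divided by 2m
}
```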

We have to change the values of the Ɵ's to minimize the cost. Gradient descent is used for this: the values of the Ɵ's are updated repeatedly until, hopefully, the cost is minimized. The update rule is as below:

Ɵj := Ɵj − α (∂/∂Ɵj) J(Ɵ)   (updated simultaneously for all j)

In gradient descent, the partial derivative of the cost function with respect to each Ɵj is taken, and α is the learning rate. We get the following equations:

Ɵj := Ɵj − α (1/m) Σi=1..m (hƟ(x(i)) − y(i)) xj(i)

We iterate the above update until we reach values of the Ɵ's that minimize the cost and give better predictions.
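Putting the update rule and the iteration together, a minimal sketch of batch gradient descent in R might look like the following. It reuses the illustrative X, y and theta names from the earlier snippets; alpha and num_iters are assumed values you would tune yourself:

```r
# Sketch of batch gradient descent: repeat the update for a fixed number of steps
gradient_descent <- function(X, y, theta, alpha, num_iters) {
  m <- length(y)
  for (i in seq_len(num_iters)) {
    error <- X %*% theta - y                    # h_theta(x) - y for every example
    gradient <- as.vector(t(X) %*% error) / m   # partial derivatives w.r.t. each theta_j
    theta <- theta - alpha * gradient           # simultaneous update of all coefficients
  }
  theta
}

# Example usage (illustrative values):
# theta_hat <- gradient_descent(X, y, theta = rep(0, ncol(X)), alpha = 0.01, num_iters = 1000)
```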

So this is all about linear regression. We covered the linear regression hypothesis, the coefficients, the cost function, and the cost minimization process used to derive the best coefficients for fitting the linear model. In the next part, we will start implementing linear regression on a sample data set.
