Tidbit: Correlation and Simple Linear Regression

[This article was first published on Kevin Davenport » R, and kindly contributed to R-bloggers.]

In business, “correlation” is used loosely to mean a mutual relationship or connection between two or more things; statistically speaking, correlation is the interdependence of variable quantities. I often overhear end users ask for the correlation of variables so they can make predictions, when what they are actually describing is simple linear regression. I don’t mean to outline all the math behind either method; rather, I’d like to explain the fundamental difference in reasoning for the business user.

Whether you are examining the data in Excel via CORREL(), R via cor(), or MATLAB via corrcoef(x,y), correlation is best used when X and Y are two variables that you simply measure, with neither one set by you. Simple linear regression is the better choice if you control X and measure Y. Time allowed to bake or grams of baking soda used are variables you might control (X), whereas the height or density of the resulting cake would be the output variable (Y).
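To make the distinction concrete, here is a minimal R sketch using the cake example. The numbers and the variable names (`bake_minutes`, `cake_height`) are made up purely for illustration; only `cor()`, `lm()`, `coef()`, `predict()` and `summary()` from base R are used.

```r
# Hypothetical data (invented for illustration):
# X is controlled (minutes in the oven), Y is measured (cake height in cm).
bake_minutes <- c(20, 25, 30, 35, 40, 45)
cake_height  <- c(3.1, 3.8, 4.4, 4.9, 5.7, 6.0)

# Correlation: a single, symmetric number describing the strength of the
# linear association. Swapping the arguments gives the same value.
r <- cor(bake_minutes, cake_height)
r_swapped <- cor(cake_height, bake_minutes)

# Simple linear regression: models Y as a function of X, so the roles of
# the two variables matter and the fit can be used for prediction.
fit <- lm(cake_height ~ bake_minutes)
coef(fit)  # intercept and slope

# Predicted height for a bake time that was not in the data
predicted <- predict(fit, newdata = data.frame(bake_minutes = 33))

# The two ideas are linked: with one predictor, R-squared equals r squared.
r_squared_matches <- isTRUE(all.equal(summary(fit)$r.squared, r^2))
```

Note that `cor()` hands back one number that treats both variables alike, while `lm()` returns a model with an intercept and slope that can answer "what height should I expect at 33 minutes?", which is usually what a prediction request is really asking for.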

Similarities:

Differences:
