Little useless-useful R functions – How to make R-squared useless

[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Uselessness is such a long useless word!

In statistics, R-squared is a measure of the proportion of variance in the dependent variable that can be explained by the independent variable. It therefore ranges from 0 to 1 and is usually interpreted as the percentage of variation in the response that the regression model explains.

So, an R-squared of 0.59 is usually read as a statement of how well the model fits the data (hence goodness of fit): the model explains about 59% of the variation in our dependent variable.

Given this logic, we prefer our regression models to have a high R-squared. Yes? Right! So here is the useless test: what happens when we add random noise to a linear function?

set.seed(2908)
# some toy/random data
x <- 1:30
y <- 2 + 0.5*x + rnorm(30,0,4)
mod <- lm(y~x)
summary(mod)$r.squared

R-squared is also the model sum of squares (the squared deviations of the fitted values from their mean, mms) over the total sum of squares (tts).
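As a quick check of that ratio, here is a minimal sketch that rebuilds the toy data from above and computes R-squared by hand. The names mms and tts simply follow the labels used in the text, and r2_manual is a hypothetical helper variable; the sketch assumes a model with an intercept, as in the example.

```r
set.seed(2908)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, 0, 4)
mod <- lm(y ~ x)

f <- fitted(mod)
mms <- sum((f - mean(f))^2)   # model sum of squares: fitted values around their mean
tts <- sum((y - mean(y))^2)   # total sum of squares: observations around their mean
r2_manual <- mms / tts

# Should agree with the value reported by summary()
all.equal(r2_manual, summary(mod)$r.squared)
```

For a least-squares fit with an intercept, the two numbers should match up to floating-point tolerance.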

We want to check the useless assumption. R-squared doesn’t necessarily measure goodness of fit. It can be arbitrarily low when the model is completely correct. By making sigma2 (the noise variance) large enough, we drive R-squared crazy and towards 0, even when every assumption of the simple linear regression model is correct in every particular.
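To see why this happens, a rough large-sample approximation helps. The symbols here are assumptions for illustration and do not appear in the original script: \beta_1 is the true slope, s_x^2 the sample variance of x, and \sigma^2 the noise variance (sigma2 above).

```latex
R^2 \approx \frac{\beta_1^2 \, s_x^2}{\beta_1^2 \, s_x^2 + \sigma^2}
\quad\longrightarrow\quad 0
\qquad \text{as } \sigma^2 \to \infty
```

The numerator is the variation the line explains and stays fixed, while the denominator grows with the noise, so the ratio shrinks no matter how correct the model is.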

A simple function to demonstrate this:

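# Simulate a correctly specified linear model with noise sd = sig
# and return the R-squared of the fitted regression
useless_r2_with_sigma <- function(sig){
  x <- seq(1,10,length.out = 100)
  y <- 2 + 1.2*x + rnorm(100,0,sd = sig)   # true model: intercept 2, slope 1.2
  summary(lm(y ~ x))$r.squared
}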

Now plot the results to check whether the claim holds water:

assumption_sigma <- seq(0.5,20,length.out = 100)
results <- sapply(assumption_sigma, useless_r2_with_sigma)
plot(results ~ assumption_sigma, type="b")

The plot shows how increasing sigma “pulls” R-squared down.
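A single draw per sigma gives a fairly jumpy curve. One possible way to smooth it, sketched here under assumptions, is to average R-squared over several simulated data sets per sigma. The function r2_with_sigma is a copy of the simulation above, redefined only so the block runs on its own; n_rep, sigmas and mean_r2 are hypothetical names, and the coarser grid of 20 sigmas is chosen just to keep the run short.

```r
set.seed(2908)

# Same simulation as in the post: true model y = 2 + 1.2*x plus noise with sd = sig
r2_with_sigma <- function(sig) {
  x <- seq(1, 10, length.out = 100)
  y <- 2 + 1.2 * x + rnorm(100, 0, sd = sig)
  summary(lm(y ~ x))$r.squared
}

sigmas <- seq(0.5, 20, length.out = 20)
n_rep <- 50  # hypothetical number of replicates per sigma

# Mean R-squared over n_rep simulated data sets for each sigma
mean_r2 <- sapply(sigmas, function(s) mean(replicate(n_rep, r2_with_sigma(s))))

plot(sigmas, mean_r2, type = "b", xlab = "sigma", ylab = "mean R-squared")
```

The averaged curve should start close to 1 for small sigma and flatten out near 0 for large sigma, even though the fitted model is the true one throughout.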

As always, the script is available on GitHub in the Useless_R_function repository (filename: R-squared.R). Check the repository for future updates.

Happy R-coding and stay healthy!
