Finding the midpoint when creating intervals

[This article was first published on Matt's Stats n stuff » R, and kindly contributed to R-bloggers.]

Nothing groundbreaking here. I was doing some work dividing data into deciles and then creating some plots, and I wanted the midpoint of each interval. I couldn't find a function to calculate this from the output of cut, and I use cut quite a bit. So here we are.

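# Midpoint of each interval label produced by cut(), e.g. "(15.4,19.2]" -> 17.3
midpoints <- function(x, dp=2){
  # strip the brackets, then keep the number before (lower) or after (upper) the comma
  lower <- as.numeric(gsub(",.*","",gsub("\\(|\\[|\\)|\\]","", x)))
  upper <- as.numeric(gsub(".*,","",gsub("\\(|\\[|\\)|\\]","", x)))
  return(round(lower+(upper-lower)/2, dp))
}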
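
To make the nested gsub calls easier to follow, here is a minimal sketch that walks through the same parsing on a single label. The label string and the intermediate variable names (label, bare, mid) are made up for illustration and are not part of the function above.

```r
# Illustrative walk-through on one hand-written label (assumed input)
label <- "(15.4,19.2]"

# Step 1: strip the opening and closing brackets
bare <- gsub("\\(|\\[|\\)|\\]", "", label)

# Step 2: keep the text before the comma (lower bound)
# and the text after the comma (upper bound)
lower <- as.numeric(gsub(",.*", "", bare))
upper <- as.numeric(gsub(".*,", "", bare))

# Step 3: the midpoint is halfway between the two bounds
mid <- lower + (upper - lower) / 2

print(bare)
print(c(lower = lower, upper = upper, mid = mid))
```

The function does exactly this for every element of x at once, since gsub and as.numeric are vectorised.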

 

And in an example:

mtcars$mpg
cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T)
midpoints(cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T))

Which looks like this:

> midpoints <- function(x, dp=2){
+   lower <- as.numeric(gsub(",.*","",gsub("\\(|\\[|\\)|\\]","", x)))
+   upper <- as.numeric(gsub(".*,","",gsub("\\(|\\[|\\)|\\]","", x)))
+   return(round(lower+(upper-lower)/2, dp))
+ }
>
> mtcars$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7
[18] 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
> cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T)
[1] (19.2,22.8] (19.2,22.8] (19.2,22.8] (19.2,22.8] (15.4,19.2] (15.4,19.2] [10.4,15.4]
[8] (22.8,33.9] (19.2,22.8] (15.4,19.2] (15.4,19.2] (15.4,19.2] (15.4,19.2] [10.4,15.4]
[15] [10.4,15.4] [10.4,15.4] [10.4,15.4] (22.8,33.9] (22.8,33.9] (22.8,33.9] (19.2,22.8]
[22] (15.4,19.2] [10.4,15.4] [10.4,15.4] (15.4,19.2] (22.8,33.9] (22.8,33.9] (22.8,33.9]
[29] (15.4,19.2] (19.2,22.8] [10.4,15.4] (19.2,22.8]
Levels: [10.4,15.4] (15.4,19.2] (19.2,22.8] (22.8,33.9]
> midpoints(cut(mtcars$mpg, quantile(mtcars$mpg), include.lowest=T))
[1] 21.00 21.00 21.00 21.00 17.30 17.30 12.90 28.35 21.00 17.30 17.30 17.30 17.30 12.90
[15] 12.90 12.90 12.90 28.35 28.35 28.35 21.00 17.30 12.90 12.90 17.30 28.35 28.35 28.35
[29] 17.30 21.00 12.90 21.00
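
One caveat worth knowing: cut formats the break values in its labels to three significant digits by default (its dig.lab argument), so midpoints parsed from the labels inherit that rounding. If the numeric breaks are still at hand, a possible alternative is to compute the midpoints from them directly. The sketch below is one way to do that; midpoints_from_breaks is a hypothetical helper name, not something from the original post or from base R, and it assumes the breaks are sorted in increasing order.

```r
# Hypothetical alternative (not from the original post): compute midpoints
# from the numeric breaks instead of parsing the rounded labels.
# Assumes `breaks` is sorted in increasing order.
midpoints_from_breaks <- function(x, breaks, dp = 2) {
  breaks <- unname(breaks)  # drop names such as "25%" added by quantile()
  # labels = FALSE makes cut() return the integer index of each interval
  idx <- cut(x, breaks, include.lowest = TRUE, labels = FALSE)
  # midpoint of each consecutive pair of breaks
  mids <- (head(breaks, -1) + tail(breaks, -1)) / 2
  # values outside the breaks get an NA index and therefore an NA midpoint
  round(mids[idx], dp)
}

breaks <- quantile(mtcars$mpg)
exact <- midpoints_from_breaks(mtcars$mpg, breaks)

print(breaks)
print(sort(unique(exact)))
```

For mtcars$mpg the first-quartile break is 15.425 rather than the 15.4 shown in the label, so the first two midpoints here should come out slightly higher than the label-based 12.90 and 17.30. Raising dig.lab in the call to cut is another option if you prefer to keep parsing the labels.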


