[This article was first published on **Statistical Research » R**, and kindly contributed to R-bloggers.]


True story (no really, this did actually happen). While in grad school, one of the other teaching assistants was approached by one of the students and asked, “Will **mu** go out with **me**dian?” The teaching assistant thought the play on words was pretty funny, laughed, and then cluelessly walked away. All of us other grad students were surprised because we knew that really was **mean**.

There are a lot of ways to calculate a measure of center. Here are several examples that include arithmetic mean, geometric mean, harmonic mean, and for good measure the median.

**Arithmetic Mean**

By far the most common is the mean (aka the average). This is simply summing a list of numbers and dividing by the count of those numbers. It is useful when there are many numbers that *add* up to a total. What does this tell us? If you were looking at a teeter totter with a bunch of kids on it, the mean is where the bar balances. It doesn’t really matter how many kids you have on either side; it’s simply where the weight of the kids is even on each side.
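As a small illustration of the balance-point idea, the sketch below uses made-up weights (the `kids` vector is hypothetical and not from the original post) and checks that the deviations above and below the mean cancel out.

```r
# Hypothetical weights (in kg) of kids on a teeter totter; made-up numbers
kids <- c(20, 25, 30, 45)

# Arithmetic mean: sum of the values divided by how many there are
arith_mean <- sum(kids) / length(kids)

# Balance point: deviations above and below the mean sum to zero
balance <- sum(kids - arith_mean)

print(arith_mean)   # same value as mean(kids)
print(balance)
```

Moving any one kid further out changes the mean, which is why this measure is sensitive to extreme values.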

**Geometric Mean**

Less commonly used is the geometric mean. It is used when there are many quantities that *multiply* together to produce a product. This is a more appropriate mean when dealing with proportional growth. Take, for example, an investment in something like a 401k. If you get 8% growth the first year, 12% the second, and 11% the third, you would want to take the geometric mean. The rates can be rewritten as growth factors: 1.08 for the first year, 1.12 for the second, and 1.11 for the third. The geometric mean is then calculated as $\sqrt[3]{1.08 \times 1.12 \times 1.11} \approx 1.1032$, an average growth of about 10.32% per year.

This table shows how the results from the geometric mean match the results when applying the rate year by year.

| Rate | Growth factor | Yearly | Geo-Mean |
|---|---|---|---|
| (starting balance) | | 1000 | 1000 |
| 0.08 | 1.08 | 1080 | 1103.201691 |
| 0.12 | 1.12 | 1209.6 | 1217.053972 |
| 0.11 | 1.11 | 1342.66 | 1342.66 |

Geometric mean rate: 0.103201691
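The numbers in the table can be reproduced with a few lines of R. This is a minimal sketch using the growth rates from the example above; the variable names (`growth`, `geo_mean`, and so on) are chosen here for illustration only.

```r
# Growth factors from the example: 8%, 12%, and 11%
growth <- c(1.08, 1.12, 1.11)

# Geometric mean: n-th root of the product of the factors
geo_mean <- prod(growth)^(1 / length(growth))

# Balance after each year, compounding the actual yearly rates from 1000
yearly_balance <- 1000 * cumprod(growth)

# Balance after each year if the geometric mean rate is applied every year
geo_balance <- 1000 * geo_mean^(1:3)

# Same exercise with the arithmetic mean of the factors, for comparison
arith_balance <- 1000 * mean(growth)^(1:3)

print(geo_mean - 1)              # average growth rate, about 0.1032
print(round(yearly_balance, 2))
print(round(geo_balance, 2))
print(round(arith_balance, 2))
```

The two paths differ in the intermediate years but agree at the end of year three, whereas the arithmetic mean of the factors ends slightly above the true final balance.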

**Harmonic Mean**

The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals, so like the arithmetic mean it is additive in nature. However, the larger quantities get dampened down. Consequently, it can be used in some situations where there are large outliers. This mean is also useful in a variety of areas, including machine learning, where it is used to average the *precision and recall* of a classifier (the F1 score).
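To make the precision and recall case concrete, here is a minimal sketch. The scores are made-up numbers, and `harmonic_mean` is a small helper defined only for this illustration (unweighted, and separate from the weighted function later in the post).

```r
# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals
harmonic_mean <- function(x) length(x) / sum(1 / x)

# Hypothetical classifier scores; made-up numbers for illustration
precision <- 0.9
recall    <- 0.5

# The F1 score is the harmonic mean of precision and recall
f1 <- harmonic_mean(c(precision, recall))

# Equivalent closed form for two values
f1_closed <- 2 * precision * recall / (precision + recall)

# Arithmetic mean of the same two scores, for comparison
arith <- mean(c(precision, recall))

print(f1)
print(f1_closed)
print(arith)
```

The harmonic mean lands closer to the smaller of the two scores than the arithmetic mean does, which is the dampening of larger quantities described above.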

**Median**

The median is another measure of center. Unlike the arithmetic mean, it is less sensitive to outliers. For example, when determining a measure of center for national income, the mean income would differ from the median income and would lean toward the very wealthy. In such skewed cases the median is often a better measure of center, as it identifies the middle point with half the observations on either side.
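A quick sketch of the income example, using made-up figures (the `incomes` vector is hypothetical) with a single very high earner:

```r
# Hypothetical annual incomes; made-up numbers, with one very high earner
incomes <- c(32000, 41000, 45000, 52000, 60000, 2500000)

mean_income   <- mean(incomes)    # pulled toward the very high earner
median_income <- median(incomes)  # middle of the sorted values

print(mean_income)
print(median_income)
```

Here five of the six incomes fall below the mean, while the median sits between the third and fourth values.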

The following code snippets show the three Pythagorean means (arithmetic, geometric, harmonic) as well as the median.

```r
### Generate some fake data
x = cbind(sort(rnorm(25,10,1)), rpois(25,10))

### Write a function for a weighted median
X = x[,1]; w = x[,2]
weighted.median = function(X, w=1){
  ### If a single value of 1 was entered then set up array
  if(length(w)==1){
    w = rep(1,length(X))
  }
  X = cbind(X,w)
  X = X[complete.cases(X), , drop=FALSE]
  y = X[order(X[,1]), , drop=FALSE] # Sort the matrix
  y = cbind(y,cumsum(y[,2])) # Attach the cumulative sum
  cum.prop = y[,3]/sum(y[,2]) # Cumulative share of the total weight

  ### Locate the position of the median.
  ### If there is an exact middle point then it uses the middle point.
  ### Note: unlike median(), this does not average the two middle values.
  which.min.lim = min( which(cum.prop >= 0.5) )
  below = which(cum.prop <= 0.5)
  ### Guard against an empty set, which would make max() return -Inf
  which.max.lim = if(length(below) > 0) max(below) else which.min.lim
  weighted.median = y[max(which.min.lim, which.max.lim),1]
  return(weighted.median)
}

harmonic.mean = function(x, w=1){
  if(length(w)==1){
    w = rep(1,length(x))
  }
  dem = w/x # Set up denominator values
  harmonic.mean = sum(w)/sum(dem) # Calculate harmonic mean
  return(harmonic.mean)
}

geometric.mean = function(x, w=1){
  if(length(w)==1){
    w = rep(1,length(x))
  }
  a = x^w
  b = 1/sum(w)
  geometric.mean = prod(a) ^ b
  ### Same calculation just a different way
  # exp( sum(w * log(x) ) / sum(w) )
  return(geometric.mean)
}

mean(x[,1])
weighted.mean(x[,1],x[,2])
median(x[,1])
weighted.median(x[,1],x[,2])
harmonic.mean(x[,1], x[,2])
harmonic.mean(x[,1])
geometric.mean(x[,1],x[,2])
geometric.mean(x[,1])

### Plot the data column only (not the weights) and mark each weighted measure
hist(x[,1], nclass=100)
abline(v=weighted.mean(x[,1],x[,2]), col='red', lwd=2)
abline(v=weighted.median(x[,1],x[,2]), col='blue', lwd=2)
abline(v=harmonic.mean(x[,1], x[,2]), col='green', lwd=2)
abline(v=geometric.mean(x[,1],x[,2]), col='purple', lwd=2)
```
