# `ave` and the `[` function in R

[This article was first published on **mages' blog**, and kindly contributed to R-bloggers.]

The `ave` function in R is one of those little helper functions I feel I should be using more. Investigating its source code showed me another twist about R and the `[` function. But first let's look at `ave`.

The top of `ave`'s help page reads:

*Group Averages Over Level Combinations of Factors*

Subsets of x[] are averaged, where each subset consist of those observations with the same factor levels.

As an example I look at revenue data by product and shop.

```r
revenue <- c(30, 20, 23, 17)
product <- factor(c("bread", "cake", "bread", "cake"))
shop <- gl(2, 2, labels = c("shop_1", "shop_2"))
```

To answer the question "Which shop sells proportionally more bread?" I need to divide the revenue vector by the sum of revenue per shop, which can be calculated easily with `ave`:

```r
(shop_revenue <- ave(revenue, shop, FUN = sum))
# [1] 50 50 40 40
(revenue_split_in_shop <- revenue / shop_revenue)
# [1] 0.600 0.400 0.575 0.425
# Shop 1 sells proportionally more bread (60%) than shop 2 (57.5%)
```
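To read the answer to the bread question off directly, one option is to pick out the bread rows and label them by shop. This is only a small sketch; it repeats the data from above so that it runs on its own, and `share` and `bread_share` are illustrative names of my choosing.

```r
revenue <- c(30, 20, 23, 17)
product <- factor(c("bread", "cake", "bread", "cake"))
shop <- gl(2, 2, labels = c("shop_1", "shop_2"))

# Share of each line's revenue within its shop
share <- revenue / ave(revenue, shop, FUN = sum)

# Keep only the bread rows and name them by shop
is_bread <- product == "bread"
bread_share <- setNames(share[is_bread], as.character(shop[is_bread]))
bread_share
# shop_1 shop_2
#  0.600  0.575

# Shop with the largest bread share
names(which.max(bread_share))
# [1] "shop_1"
```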

`ave` has to split the revenue vector by shop and apply the `sum` function to each piece. Well, that's exactly what it does. Here is the source code of `ave`:

```r
# Copyright (C) 1995-2012 The R Core Team
ave <- function (x, ..., FUN = mean)
{
    if(missing(...))
        x[] <- FUN(x)
    else {
        g <- interaction(...)
        split(x,g) <- lapply(split(x, g), FUN)
    }
    x
}
```

However, and this is what intrigued me, if I don't provide a grouping variable (`missing(...)`), it will apply the function `FUN` to `x` itself and write its output to `x[]`. That's actually what the help file of `ave` mentions in its description. So what does it do? Here is an example again:

```r
ave(revenue, FUN = sum)
# [1] 90 90 90 90
```

I get the sum of revenue repeated as many times as the vector has elements, not just once, as with `sum(revenue)`. The trick is that the output of `FUN(x)` is written into `x[]`, which of course is itself the output of a function call, `"["(x)`. I think it is the following sentence in the help file of `"["` (see `?"["`) that explains it:

*Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.*
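The difference between assigning to `x` and assigning to `x[]` can be seen in isolation. The sketch below is my own illustration, not part of the `ave` source, and the named vector is made up for the purpose. With an empty index, the assignment is carried out by the replacement function `[<-`, which replaces every element of the existing vector, recycling the value as needed, and leaves its length and attributes in place.

```r
x <- c(a = 30, b = 20, c = 23, d = 17)

# Plain assignment: y simply becomes the right-hand side
y <- x
y <- sum(x)
y
# [1] 90

# Assignment with an empty index: every element of z is replaced,
# the single value is recycled, and the names are kept
z <- x
z[] <- sum(x)
z
#  a  b  c  d
# 90 90 90 90
```

This is the behaviour `ave` relies on when no grouping variable is supplied: the result always has the same length as the input.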

So there we are. I feel less inclined to use `ave` more, as it is just shorthand for the usual `split`, `lapply` routine, but I learned something new about the subtleties of R.
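For comparison, here is a rough sketch of that routine written out by hand on the same revenue data. It uses base R's `split`, `lapply` and `unsplit`; the names `pieces` and `by_hand` are only illustrative.

```r
revenue <- c(30, 20, 23, 17)
shop <- gl(2, 2, labels = c("shop_1", "shop_2"))

# Split revenue by shop, sum each piece, then put the results
# back into the original order
pieces <- lapply(split(revenue, shop), sum)
by_hand <- unsplit(pieces, shop)
by_hand
# [1] 50 50 40 40

# Compare with the one-liner
all.equal(by_hand, ave(revenue, shop, FUN = sum))
# [1] TRUE
```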
