**R-exercises**, and kindly contributed to R-bloggers)

The `aggregate()` function splits data frames and time series data into subsets, then computes summary statistics for each subset. The structure of the `aggregate()` function is `aggregate(x, by, FUN)`.
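As a minimal sketch of the `aggregate(x, by, FUN)` call shape, the example below uses a small made-up data frame rather than the exercise data. The names `scores`, `group`, and `value` are hypothetical. It also shows that extra arguments such as `na.rm = TRUE` are passed through to `FUN`.

```r
# Hypothetical toy data (not from the exercises): two groups, one missing value.
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, NA)
)

# x:   the columns to summarise
# by:  a named list of grouping vectors (the names become column headers)
# FUN: the summary function; extra arguments such as na.rm are passed on to it
result <- aggregate(
  x = scores["value"],
  by = list(group = scores$group),
  FUN = mean,
  na.rm = TRUE
)
print(result)
```

Naming the element of the `by` list is what sets the header of the grouping column in the result.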

Answers to the exercises are available here.

**Exercise 1**

Aggregate the `airquality` data by `airquality$Month`, returning means on each of the numeric variables. Also, remove `NA` values.

**Exercise 2**

Aggregate the `airquality` data by the variable `Day`, remove `NA` values, and return means on each of the numeric variables.

**Exercise 3**

Aggregate `airquality$Solar.R` by `Month`, returning means of `Solar.R`. The header of column 1 should be `Month`. Remove "not available" (`NA`) values.

**Exercise 4**

Apply the standard deviation function to the data aggregation from Exercise 3.

**Exercise 5**

The structure of the `aggregate()` formula interface is `aggregate(formula, data, FUN)`.

The structure of the formula is `y ~ x`. The `y` variables are numeric data. The `x` variables, usually factors, are grouping variables that subset the `y` variables.

`aggregate.formula` allows for one-to-one, one-to-many, many-to-one, and many-to-many aggregation.
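The four shapes can be sketched on a different built-in dataset, `mtcars`, so the exercises stay unsolved. The formula method is reached through the generic `aggregate()` when the first argument is a formula. The variable names on the left of each assignment are illustrative only. Note that the formula interface drops rows with missing values by default (`na.action = na.omit`), which does not matter here because `mtcars` has no `NA` values.

```r
# mtcars ships with R and contains no missing values.

# One-to-one: one response, one grouping variable.
one_to_one <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# One-to-many: one response, several grouping variables.
one_to_many <- aggregate(mpg ~ cyl + gear, data = mtcars, FUN = mean)

# Many-to-one: several responses bound with cbind(), one grouping variable.
many_to_one <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)

# Many-to-many: several responses, several grouping variables.
many_to_many <- aggregate(cbind(mpg, hp) ~ cyl + gear, data = mtcars, FUN = mean)

print(many_to_one)
```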

Therefore, use `aggregate.formula` for a one-to-one aggregation of `airquality` by the mean of `Ozone` to the grouping variable `Day`.

**Exercise 6**

Use `aggregate.formula` for a many-to-one aggregation of `airquality` by the mean of `Solar.R` and `Ozone`, with the grouping variable `Month`.

**Exercise 7**

Dot notation can replace the `y` or `x` variables in `aggregate.formula`. Therefore, use `.` dot notation to find the means of the numeric variables in `airquality`, with the grouping variable of `Month`.
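To illustrate where the dot can sit in the formula, here is a short sketch on the built-in `ToothGrowth` dataset (columns `len`, `supp`, `dose`) instead of `airquality`. The result names `by_supp` and `by_rest` are illustrative only.

```r
# ToothGrowth ships with R: len (numeric), supp (factor), dose (numeric).

# Dot on the left: summarise every column that is not a grouping variable.
by_supp <- aggregate(. ~ supp, data = ToothGrowth, FUN = mean)

# Dot on the right: group by every column that is not a response.
by_rest <- aggregate(len ~ ., data = ToothGrowth, FUN = mean)

print(by_supp)
print(head(by_rest))
```

In both cases the dot expands to "all columns of `data` not already named on the other side of the formula".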

**Exercise 8**

Use dot notation to find the means of the `airquality` variables, with the grouping variables of `Day` and `Month`. Display only the first 6 resulting observations.

**Exercise 9**

Use dot notation to find the means of `Temp`, with the remaining `airquality` variables as grouping variables.

**Exercise 10**

`aggregate.ts` is the time series method for `aggregate()`.

Using `R`'s built-in time series dataset, `AirPassengers`, compute the average annual standard deviation.
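As a sketch of how the time series method is typically called, the example below uses a different built-in series, `nottem` (monthly temperatures, 1920 to 1939), and a different summary function than the exercise asks for. The `nfrequency` argument sets the number of observations per unit of time in the result, so `nfrequency = 1` collapses each year to one value and `nfrequency = 4` collapses each quarter.

```r
# nottem ships with R: a monthly ts with frequency 12, covering 20 years.

# Collapse the 12 monthly values of each year into one annual mean.
annual_mean <- aggregate(nottem, nfrequency = 1, FUN = mean)

# Collapse each block of 3 months into one quarterly mean.
quarterly_mean <- aggregate(nottem, nfrequency = 4, FUN = mean)

print(annual_mean)
```

The result is itself a `ts` object, so functions such as `frequency()` and `start()` still apply to it.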

Image by Averater (Own work) [CC BY-SA 3.0], via Wikimedia Commons.
