In section 1.11.4 (p. 50), we discuss referring to lists of variables in a data set. In SAS, this can be done for variables stored in adjacent columns with the `var_x -- var_y` syntax and for variables with sequentially enumerated suffixes with the `var_n1 - var_n2` syntax. We state in that section that R has no straightforward equivalent for referencing a list of variables by name, though referencing by location is trivial. Wayne Richter (of the NY State Department of Environmental Conservation) pointed out a reference from Muenchen's excellent text that makes this task relatively straightforward in R for variables with sequential numerical suffixes.
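For adjacent columns, referencing by location can be sketched as follows. The data frame `toy` and its values are invented for illustration, and looking up positions with `match()` is one possible way to turn a pair of names into a range of locations, not necessarily the approach taken in the book.

```r
# Toy data frame with invented values (not the HELP data)
toy <- data.frame(id = 1:3,
                  a = c(1, 2, 3),
                  b = c(4, 5, 6),
                  c = c(7, 8, 9),
                  z = c(0, 0, 0))

# By location: columns 2 through 4
by_loc <- toy[, 2:4]

# By name: find the positions of the first and last variables,
# then select every column between them
first <- match("a", names(toy))
last  <- match("c", names(toy))
by_name <- toy[, first:last]

print(names(by_name))
```

Both selections pick out the columns `a`, `b`, and `c`; the second does not depend on knowing their positions in advance.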

**R**

Here we demonstrate this by displaying the means of the `cesd1`, `cesd2`, `cesd3`, and `cesd4` variables measuring depressive symptoms at each of the followup time points for the HELP study.

ds = read.csv("http://www.math.smith.edu/r/data/help.csv")

sapply(ds[, paste('cesd', seq(1, 4), sep = '')], mean, na.rm=TRUE)

which generates the output:

cesd1 cesd2 cesd3 cesd4

22.71545 23.58373 22.06855 20.14286

This approach selects a set of variables by generating a character vector of variable names using the `paste()` function (section 1.4.5) and the `seq()` function (section 1.11.3). Then the `mean()` function is applied to the selected columns.
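The same idea can be tried without downloading the HELP data. The sketch below uses a small made-up data frame (`toy`, with invented values) and `paste0()`, which is shorthand for `paste(..., sep = '')`. `colMeans()` is shown as an alternative to applying `mean()` column by column.

```r
# Toy data frame with invented values (not the HELP data)
toy <- data.frame(id = 1:4,
                  cesd1 = c(10, 20, NA, 30),
                  cesd2 = c(5, 15, 25, 35),
                  cesd3 = c(NA, NA, 12, 18),
                  other = c("a", "b", "c", "d"))

# Build the names cesd1, cesd2, cesd3 from a common stem and a numeric suffix
vars <- paste0("cesd", 1:3)

# Apply mean() to each selected column, dropping missing values
m1 <- sapply(toy[, vars], mean, na.rm = TRUE)

# colMeans() computes the same column means in one call
m2 <- colMeans(toy[, vars], na.rm = TRUE)

print(m1)
```

Because the names are built as character strings, the stem and the range of suffixes can be changed without touching the rest of the code.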

**SAS**

This task is straightforward in SAS, using the `-` syntax (section 1.11.4) in the `var` statement in `proc means`.

proc means data=ds maxdec=2 n mean;

var cesd1 - cesd4;

run;

The MEANS Procedure

| Variable | Label | N | Mean |
|---|---|---|---|
| CESD1 | 1 cesd | 246 | 22.72 |
| CESD2 | 2 cesd | 209 | 23.58 |
| CESD3 | 3 cesd | 248 | 22.07 |
| CESD4 | 4 cesd | 266 | 20.14 |
