Towards the basic R mindset.
The post “A first step towards R from spreadsheets” provides an introduction to switching from spreadsheets to R. It also includes a list of additional posts (like this one) on the transition.
Add two columns
Figure 1 shows some numbers in two columns and the start of adding those two columns to each other in a third column.
The next step is to fill the addition formula down the column.
It is not so different to do the same thing in R. First create two objects that are equivalent to the two columns in the spreadsheet:
A <- c(32.5, -3.8, 15.9, 22.5)
B <- c(48.1, 19.4, 46.8, 14.7)
In those commands you used the c function, which combines objects. You have created two vectors. The rules for a vector are:
- it can be any length (up to a very large value)
- all the elements are of the same type — all numbers, all character strings or all logicals
- the order matters (just like it matters which row a number is in within a spreadsheet)
To summarize: they’re in little boxes and they all look just the same.
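The three rules can be checked at the prompt. This is only a small sketch: the names nums and mixed are made up for illustration and do not appear elsewhere in this post.

```r
# A numeric vector, like column A in the spreadsheet
nums <- c(32.5, -3.8, 15.9, 22.5)

length(nums)   # number of elements: 4
class(nums)    # "numeric": every element has the same type

# Mixing types does not give a mixed vector: c() converts
# everything to one common type (here, character strings)
mixed <- c(1, "a", TRUE)
class(mixed)   # "character"

# Order matters: the same values in reverse are a different vector
identical(nums, rev(nums))   # FALSE
```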
You have two R vectors holding your numbers. Now just add them together (and assign that value into a third name):
C <- A + B
This addition is precisely what is done in the spreadsheet: the first value in C is the first value in A plus the first value in B, the second value in C is the second value in A plus the second value in B, and so on.
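To make the row-by-row picture concrete, here is a sketch that spells out the same addition one element at a time and compares it with the vectorized form. The loop and the name C_by_row are for illustration only; in practice you would just write A + B.

```r
A <- c(32.5, -3.8, 15.9, 22.5)
B <- c(48.1, 19.4, 46.8, 14.7)

# Vectorized addition, as in the post
C <- A + B

# The same thing spelled out one "row" at a time
C_by_row <- numeric(length(A))
for (i in seq_along(A)) {
  C_by_row[i] <- A[i] + B[i]
}

all.equal(C, C_by_row)   # TRUE
```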
See the values in an object by typing its name:
> C
[1] 80.6 15.6 62.7 37.2
The “> ” is the R prompt; you type only what comes after it: ‘C’ (and then the return or enter key).
Also note that R is case-sensitive, so C and c are different things:
> c
function (..., recursive = FALSE)  .Primitive("c")
(Don’t try to make sense of what this means other than that c is a function.)
Multiply by a constant
One way of multiplying a column by a constant is to multiply the values in the column by the value in a single cell. This is illustrated in Figure 2.
Another way of doing the same thing is to fill the value in D1 down column D and then multiply the two columns.
Do this operation in R with:
> C * 33
[1] 2659.8  514.8 2069.1 1227.6
In this command you didn’t create a new object to hold the answer.
You can think of R as doing either of the spreadsheet methods, but the fill-down image might be slightly preferable.
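Both spreadsheet methods can be written out in R. The sketch below assumes the values of C computed earlier; the names by_single, filled and by_filled are made up for this example.

```r
C <- c(80.6, 15.6, 62.7, 37.2)

# Method 1: multiply by a single value (like referring to one cell)
by_single <- C * 33

# Method 2: fill the constant down to the length of C, then multiply
filled <- rep(33, length(C))
by_filled <- C * filled

all.equal(by_single, by_filled)   # TRUE
```

The second form is what the fill-down image describes; R does the filling for you when you write the first form.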
Recycling in R
The R recycling rule generalizes the idea of a single value expanding to the length of the vector. It is possible to do operations with vectors of different lengths where both have more than one element:
> 1:6 + c(100, 200)
[1] 101 202 103 204 105 206
Figure 3 illustrates how R got to its answer.
Column F shows how column G was created: use the ROW function and fill it down the column. That sequence of numbers was created in R with the : operator.
Note how the shorter vector is replicated to the length of the longer one. Each value is used in order, and when it reaches the end it goes back to the beginning again.
You are free to think this is weird. However, it is often useful.
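Recycling can be imitated by hand with rep, which may make the rule easier to see. This is a sketch; the names short, recycled and uneven are invented for the example.

```r
short <- c(100, 200)

# Replicate the shorter vector out to length 6 by hand
recycled <- rep(short, length.out = 6)
recycled               # 100 200 100 200 100 200

# Adding the hand-recycled vector matches what R does automatically
all.equal(1:6 + short, 1:6 + recycled)   # TRUE

# When the longer length is not a multiple of the shorter one,
# R still recycles but issues a warning (suppressed here)
uneven <- suppressWarnings(1:5 + short)
uneven                 # 101 202 103 204 105
```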
Table 1 translates between spreadsheet and R functions. The spreadsheets consulted were Excel, Works and OpenOffice. Note there is some variation between spreadsheets.
Table 1: Equivalent functions between spreadsheets and R.
|Spreadsheet|R|Comment|
|---|---|---|
|AND||more literally would be the|
|AVG||this danger of|
|AVERAGEIF||subscript before using|
|COLUMN|col|or probably more likely|
|CONFIDENCE||CONFIDENCE(alpha, std, n) is|
|COUNTIF||get length of a subscripted object|
|DGET||use subscripting in R|
|ERF||see the example in|
|ERFC||see the example in|
|EXACT||EXACT is specific to text,|
|FREQUENCY||you probably want to use|
|GAUSS||GAUSS(x) is|
|GESTEP||GESTEP(x, y) is|
|HLOOKUP||use subscripting in R|
|IF||see Circle 3.2 of The R Inferno on|
|INDEX||use subscripting in R|
|INDIRECT||or possibly the eval-parse-text idiom, or (better) make changes that simplify the situation|
|INT||danger: not the same as|
|INTERCEPT||(usually) the first element of|
|LARGE||you can use subscripting after|
|LN|log|danger: the default base in R for|
|LOG|log|danger: the default base in spreadsheets for|
|N||the correspondence is for logicals,|
|OR||the or operators in R are|
|PERCENTRANK||similar to|
|PERMUTATIONA||PERMUTATIONA(n, k) is|
|PROB||you can use the|
|PROPER||see example in|
|RAND|runif|see an introduction to random generation in R|
|RANK|rank|RANK has the|
|RIGHT||you’ll also need|
|ROW|row|or probably more likely|
|SEARCH||also see|
|SMALL||you can use subscripting after|
|SUBSTITUTE||or possibly|
|SUMIF||subscript before using|
|TDIST||TDIST(abs(x), df, tails) is pt(-abs(x), df) * tails|
|TYPE||similar concepts in R are|
|VLOOKUP||use subscripting in R|
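Several rows of Table 1 say to subscript before applying a function, so a short sketch of that idea may help. The data in x are made up, and the pairing of sum, mean and length with SUMIF, AVERAGEIF and COUNTIF is an assumption based on the comments in the table rather than something the table states outright. The last part writes the TDIST row as a function; tdist is a hypothetical helper name, not a built-in R function.

```r
x <- c(32.5, -3.8, 15.9, 22.5)   # made-up data

# Subscript with a logical condition, then summarize what is left
positive <- x[x > 0]     # keep only the values greater than zero
sum(positive)            # in the spirit of SUMIF
mean(positive)           # in the spirit of AVERAGEIF
length(positive)         # in the spirit of COUNTIF: 3

# log() in R uses base e unless told otherwise
log(100)                 # natural logarithm, about 4.60517
log10(100)               # 2
log(100, base = 10)      # also 2

# The TDIST row of the table written as a small function
# (tdist is a hypothetical helper, not part of R)
tdist <- function(x, df, tails) {
  pt(-abs(x), df) * tails
}
tdist(2, df = 10, tails = 2)   # two-tailed probability
```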
The trigonometric functions, like acosh, are the same, except the R functions are all in lowercase.
Spreadsheets show you the arguments of a function. The args function in R provides similar information. For example:
> args(sample)
function (x, size, replace = FALSE, prob = NULL)
NULL
This shows that replace and prob both have default values, and so are not required. Actually size is not required either; x is the only mandatory argument.
You will learn to not even see the NULL on the final line of the result of args.
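As a sketch of what optional arguments mean in practice, the calls below use sample first with only x and then with some of the optional arguments named. The names perm and draws are made up, and set.seed is used only so the random results are repeatable.

```r
# Names of the arguments of sample, in order
names(formals(sample))   # "x" "size" "replace" "prob"

# Only x is supplied: size is worked out from x, and
# replace and prob use their defaults
set.seed(42)
perm <- sample(5)        # a random permutation of 1:5
sort(perm)               # 1 2 3 4 5

# Supplying optional arguments by name
draws <- sample(c("H", "T"), size = 10, replace = TRUE)
length(draws)            # 10
```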
You can get help for a function with the question mark operator:
> ?sample
This will show you the help file for the object (sample in this case). It is best not to let yourself be overwhelmed by a help file.
Most of the R functions are vectorized.
This is like creating a new spreadsheet column where an argument of the function is a value from the same row but a different column. Think of putting =EXP(A1) in cell B1 and then filling it down.
Giving a vector to exp returns the exponential of each of the values in the input vector:
> exp(0:5)
[1]   1.000000   2.718282   7.389056  20.085537
[5]  54.598150 148.413159
The result is a vector of length 6 — the same length as the input. The number in square brackets at the start of each line of output is the index number of the first item on the line.
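The fill-down comparison can be checked by applying exp to each element separately and comparing with the single vectorized call. This is a sketch: the names x, vectorized and one_at_a_time are invented, and sapply is used here only as a stand-in for filling a formula down a column.

```r
x <- 0:5

# One call handles the whole vector
vectorized <- exp(x)

# The element-by-element equivalent, shown only for comparison
one_at_a_time <- sapply(x, exp)

length(vectorized)                       # 6, same as length(x)
all.equal(vectorized, one_at_a_time)     # TRUE

# Other vectorized functions behave the same way
sqrt(c(1, 4, 9))                         # 1 2 3
```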
Some R resources
“Impatient R” provides a grounding in how to use R.
“Some hints for the R beginner” suggests additional ways to learn R.
And they’re all made out of ticky tacky
And they all look just the same
from “Little Boxes” by Malvina Reynolds (1900 – 1978)