# R Function of the Day: sample

[From **sigmafield - R**, and kindly contributed to R-bloggers.]


*This post originally appeared on my WordPress blog on May 23, 2010. I present it here in its original form.*

The **R Function of the Day** series describes in *plain language* how certain R functions work, using simple examples
that you can apply to gain insight into your own data.

Today, I will discuss the **sample** function.

### Random Permutations

In its simplest form, the **sample** function can be used to return a
random permutation of a vector. To illustrate this, let’s create a
vector of the integers from 1 to 10 and assign it to a variable **x**.

    x <- 1:10

Now, use **sample** to create a random permutation of the vector x.

    sample(x)
    [1] 3 2 1 10 7 9 4 8 6 5

Note that if you give **sample** a single number of at least 1 (e.g., just
the number 10) instead of a longer vector, it does exactly the same thing
as above: it returns a random permutation of the integers from 1 to 10.

    sample(10)
    [1] 10 7 4 8 2 6 1 9 5 3

#### Warning!

This can be a source of confusion if you're not careful. Consider the
following example from the **sample** help file.

    sample(x[x > 8])
    [1] 10 9
    sample(x[x > 9])
    [1] 9 3 4 8 1 10 7 5 2 6

Notice that the first output has length 2, since only two numbers in our
vector are greater than eight. But because only one number (that is, 10)
is greater than nine, **sample** receives a vector of length 1, treats it
as a request to sample from the integers 1 to 10, and therefore
returns a vector of length 10.
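
One way to guard against this, sketched here along the lines of the example in the **sample** help file, is to sample positions with **sample.int** and then index into the vector. The name `resample` is only an illustrative helper, not a built-in function.

```r
# Illustrative helper (not part of base R): sample positions, then index.
# Because sample.int() always works on positions 1..length(x), a vector
# of length 1 is never reinterpreted as the range 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

x <- 1:10
resample(x[x > 8])   # a random ordering of 9 and 10
resample(x[x > 9])   # always 10
resample(x[x > 10])  # a vector of length 0
```

With this wrapper, the length of the result always matches the length of the input, whether that input has two elements, one, or none.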

### The **replace** argument

Often, it is useful not simply to permute a vector, but to take
independent draws from it. For instance, we can simulate a Bernoulli
trial such as the flip of a fair coin. First, using our previous vector,
note that the **size** argument tells **sample** how many elements to
draw.

    sample(x, size = 5)
    [1] 2 10 5 1 6

Now, let's perform our coin-flipping experiment just once.

    coin <- c("Heads", "Tails")
    sample(coin, size = 1)
    [1] "Tails"

And now, let's try it 100 times.

    sample(coin, size = 100)
    Error in sample(coin, size = 100) :
      cannot take a sample larger than the population when 'replace = FALSE'

Oops, we can't take a sample of size 100 from a vector of size 2,
unless we set the **replace** argument to TRUE.

    table(sample(coin, size = 100, replace = TRUE))

    Heads Tails
       53    47
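
Because **sample** is random, the counts will differ from run to run. If you want the same flips each time, one option is to call **set.seed** first. The sketch below assumes an arbitrary seed of 42, chosen purely for illustration.

```r
coin <- c("Heads", "Tails")

set.seed(42)  # arbitrary seed, used only to make the draws repeatable
flips <- sample(coin, size = 100, replace = TRUE)
table(flips)

# Resetting the same seed reproduces exactly the same sequence of flips.
set.seed(42)
flips_again <- sample(coin, size = 100, replace = TRUE)
identical(flips, flips_again)  # TRUE
```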

### Simple bootstrap example

The **sample** function can be used to perform a simple bootstrap.
Let's use it to estimate a 95% confidence interval for the mean of a
population. First, generate a random sample from a normal
distribution (here, 1000 draws with mean 10 and the default standard
deviation of 1).

    rn <- rnorm(1000, 10)

Then, call **sample** repeatedly via the **replicate** function to
get our bootstrap resamples. The defining feature of this technique is
that each resample is drawn with replacement (replace = TRUE) and, because
**size** is left at its default, has the same length as the original data.
We then take the mean of each resample, gather the means, and finally
compute the relevant quantiles.

    quantile(replicate(1000, mean(sample(rn, replace = TRUE))), probs = c(0.025, 0.975))
         2.5%     97.5%
     9.936387 10.062525

Compare this to the standard parametric technique.

    t.test(rn)$conf.int
    [1]  9.938805 10.061325
    attr(,"conf.level")
    [1] 0.95
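
The bootstrap steps can also be written out one at a time, which may make it easier to see what the **replicate** one-liner is doing. This is only a sketch: the seed, the variable names (`n_boot`, `boot_means`, `boot_ci`), and the explicit loop are illustrative choices, and the numbers it produces will not match the output shown above, which came from a different random sample.

```r
set.seed(1)                   # arbitrary seed, for reproducibility only
rn <- rnorm(1000, mean = 10)  # 1000 draws, mean 10, default sd of 1

n_boot <- 1000                # number of bootstrap resamples (illustrative)
boot_means <- numeric(n_boot)

for (i in seq_len(n_boot)) {
  # Resample the data with replacement, keeping the original sample size.
  resampled <- sample(rn, replace = TRUE)
  boot_means[i] <- mean(resampled)
}

# Percentile interval: the middle 95% of the bootstrap means.
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
boot_ci
```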
