
Mastering Random Sampling in R with the sample() Function

This article was first published on Steve's Data Tips and Tricks and kindly contributed to R-bloggers.

Introduction

The sample() function in R draws random samples from a vector, with or without replacement and with optional selection weights. It’s a staple for tasks such as data analysis, Monte Carlo simulations, and randomized experiments. In this blog post, we’ll explore the sample() function in detail and work through examples that show how to use it effectively.


Understanding the sample() Function

The sample() function in R has the following syntax:

sample(x, size, replace = FALSE, prob = NULL)
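
To make the arguments concrete: x is the vector to sample from, size is the number of elements to draw (it defaults to the length of x), replace controls whether an element can be drawn more than once, and prob optionally supplies selection weights. The sketch below illustrates x and size, including one well-known caveat. The fruits vector is made-up example data, not something from the original post.

```r
# x: the vector to sample from (hypothetical example data)
fruits <- c("apple", "banana", "cherry")

# size defaults to length(x), so this returns a random permutation of fruits
shuffled <- sample(fruits)

# size: how many elements to draw (here 2 of the 3, with no repeats)
two_fruits <- sample(fruits, size = 2)

# If x is a single number n >= 1, sample() draws from 1:n instead
one_to_six <- sample(6, size = 3)  # three distinct values from 1:6

# Caveat: a length-one numeric vector triggers the same behaviour,
# so this samples from 1:10 rather than always returning 10
surprise <- sample(c(10), size = 1)
```

The last case matters when the vector you pass in is built programmatically and may shrink to a single number.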

Examples


Simple Random Sampling

Let’s start with a basic example of simple random sampling without replacement:

# Creating a vector
numbers <- 1:10

# Drawing a sample of 5 elements without replacement
sample_without_replacement <- sample(numbers, 5)

This code will generate a random sample of 5 unique elements from the numbers vector. The output might look something like:

print(sample_without_replacement)
[1] 7 8 3 6 1
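
Because the draw is random, your output will usually differ from the one shown, and it will change from run to run. If you need a repeatable result, one common approach is to call set.seed() first. The sketch below assumes an arbitrary seed value of 123; any integer works.

```r
numbers <- 1:10

# set.seed() fixes the random number generator state; 123 is an arbitrary choice
set.seed(123)
first_draw <- sample(numbers, 5)

# Resetting the same seed reproduces the same sample
set.seed(123)
second_draw <- sample(numbers, 5)

identical(first_draw, second_draw)  # TRUE
```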

Sampling with Replacement

Sometimes, you may want to sample with replacement, which means that an element can be selected multiple times. To do this, you can set the replace argument to TRUE:

# Drawing a sample of 5 elements with replacement
sample_with_replacement <- sample(numbers, 5, replace = TRUE)

This code might produce an output like:

print(sample_with_replacement)
[1] 1 3 6 6 2

Notice that the number 6 appears twice in the sample, which is possible only because we’re sampling with replacement.
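
The replace argument also decides how large a sample can be. As a minimal sketch: without replacement, sample() cannot return more elements than the vector holds, while with replacement any size is allowed. The tryCatch() wrapper is only there so that the script keeps running after the error.

```r
numbers <- 1:10

# Without replacement, asking for more elements than exist is an error
too_many <- tryCatch(
  sample(numbers, 15),
  error = function(e) conditionMessage(e)
)
print(too_many)  # an error message rather than a sample

# With replacement, the same request is fine; repeats are guaranteed here
# because 15 draws come from only 10 distinct values
big_sample <- sample(numbers, 15, replace = TRUE)
any(duplicated(big_sample))  # TRUE
```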


Weighted Random Sampling

The prob argument in the sample() function allows you to perform weighted random sampling. This means that elements have different probabilities of being selected based on the provided weights. Here’s an example:

# Creating a vector of weights
weights <- c(0.1, 0.2, 0.3, 0.4)

# Drawing a weighted sample of 3 elements without replacement
weighted_sample <- sample(1:4, 3, replace = FALSE, prob = weights)

In this example, the numbers 1, 2, 3, and 4 have weights of 0.1, 0.2, 0.3, and 0.4, respectively. The output might look like:

print(weighted_sample)
[1] 4 3 2

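Here the elements with the highest weights (4 and 3) were both selected. That is the typical pattern, since higher-weighted elements are more likely to be drawn, but any single sample can turn out differently.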
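
A single small sample says little about the weights, so one way to check them is to draw many values with replacement and compare the observed shares with the weights. This is a sketch under assumptions: the 10,000 draws and the seed of 42 are arbitrary choices. Note also that, when sampling without replacement, the weights are applied sequentially among the elements that remain.

```r
weights <- c(0.1, 0.2, 0.3, 0.4)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Draw many values with replacement so every draw uses the same weights
draws <- sample(1:4, 10000, replace = TRUE, prob = weights)

# Observed share of each value; these should be close to the weights
observed <- table(factor(draws, levels = 1:4)) / length(draws)
print(round(observed, 2))

# Weights do not have to sum to 1: sample() rescales them internally,
# so c(1, 2, 3, 4) describes the same distribution as the weights above
same_distribution <- sample(1:4, 3, prob = c(1, 2, 3, 4))
```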


Your Turn!

Now that you’ve seen several examples of using the sample() function in R, it’s time to put your knowledge to the test! Here are some exercises for you to try:

  1. Generate a random sample of 10 elements from the letters of the English alphabet.
  2. Sample 5 elements with replacement from the vector c(10, 20, 30, 40, 50).
  3. Create a vector of weights and perform weighted random sampling to select 3 elements from the vector c("apple", "banana", "orange", "grape").
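
If you get stuck, here is one possible way to approach the three exercises. It is a sketch, not the only answer: the variable names and the weights in the third exercise are made up for illustration, and letters is R's built-in vector of lowercase letters.

```r
# Exercise 1: 10 letters from R's built-in lowercase alphabet vector
random_letters <- sample(letters, 10)

# Exercise 2: 5 draws with replacement
values <- c(10, 20, 30, 40, 50)
values_sample <- sample(values, 5, replace = TRUE)

# Exercise 3: weighted sampling; these weights are hypothetical
fruits <- c("apple", "banana", "orange", "grape")
fruit_weights <- c(0.4, 0.3, 0.2, 0.1)
fruit_sample <- sample(fruits, 3, prob = fruit_weights)
```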

Feel free to experiment with different combinations of arguments and datasets to solidify your understanding of the sample() function. Happy sampling!
