As discussed in one of our earlier posts, researchers frequently gather samples from a population and use the findings to draw conclusions about the entire population.

Cluster sampling is a common sampling method in which a population is divided into clusters, some of those clusters are selected at random, and every member of the selected clusters is included in the sample.

This tutorial will show you how to use R to perform cluster sampling.

## Approach: Cluster Sampling in R

Let’s say a consumer goods company wishes to survey its customers. It chooses four of its ten goods groups at random and asks every customer in those groups to rate their experience on a scale of one to ten.

The following code creates a dummy data frame in R to work with.

Make the example reproducible:

`set.seed(1)`

Create the data frame, with 50 customers in each of the ten goods groups:

`df <- data.frame(goods = rep(1:10, each = 50), experience = rnorm(500, mean = 5, sd = 2.2))`

Now we can view the first six rows of the data frame.

```
head(df)
  goods experience
1     1  5.2516591
2     1  4.3155235
3     1  5.6258145
4     1  2.0780667
5     1  6.1671238
6     1  0.3534712
```
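Before sampling, it can help to confirm that the clusters look the way we expect. The snippet below is a minimal sketch that rebuilds the same kind of dummy data and counts the customers in each goods group; the names `group_sizes` and `n_groups` are introduced here for illustration only.

```r
# Rebuild the dummy data so this block runs on its own
set.seed(1)
df <- data.frame(goods = rep(1:10, each = 50),
                 experience = rnorm(500, mean = 5, sd = 2.2))

# Number of customers in each goods group (illustrative name)
group_sizes <- table(df$goods)
print(group_sizes)

# Number of distinct clusters available for sampling (illustrative name)
n_groups <- length(unique(df$goods))
print(n_groups)
```

Every group should contain 50 customers, so each cluster we pick contributes the same number of rows to the sample.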

The code below shows how to draw a sample of customers by picking four goods groups at random and including every member of those groups in the sample.

Choose four of the ten goods groups at random:

`clusters <- sample(unique(df$goods), size = 4, replace = FALSE)`

Include every customer who belongs to one of the four chosen goods groups:

`sample <- df[df$goods %in% clusters, ]`

View how many customers came from each goods group:

```
table(sample$goods)
 2  3  7 10
50 50 50 50
```
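The two sampling steps can also be wrapped in a small reusable function. This is only a sketch: `cluster_sample()` and its arguments are hypothetical names chosen for this example, not part of base R or any package.

```r
# Hypothetical helper (not from any package): one-stage cluster sample
cluster_sample <- function(data, cluster_col, n_clusters) {
  ids <- unique(data[[cluster_col]])
  stopifnot(n_clusters <= length(ids))
  # Sample positions rather than values, so a single numeric id
  # is never interpreted by sample() as the range 1:id
  chosen <- ids[sample.int(length(ids), size = n_clusters)]
  # Keep every row that belongs to one of the chosen clusters
  data[data[[cluster_col]] %in% chosen, , drop = FALSE]
}

# Usage with the same kind of dummy data as above
set.seed(1)
df <- data.frame(goods = rep(1:10, each = 50),
                 experience = rnorm(500, mean = 5, sd = 2.2))
cs <- cluster_sample(df, "goods", 4)
table(cs$goods)
```

Storing the result under a name such as `cs` rather than `sample` also avoids shadowing the built-in `sample()` function.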

## Conclusion

We can observe from the output that:

The sample includes 50 customers from each of goods groups 2, 3, 7, and 10.

As a result, our sample is made up of 200 customers drawn from four distinct goods groups.
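Since the point of sampling is to draw conclusions about the whole population, here is a hedged sketch of how the cluster sample might be used to estimate the average experience score. It assumes equal-sized clusters selected by simple random sampling without replacement, in which case the mean of the cluster means is a standard estimator; the variable names (`cluster_df`, `estimate`, `std_error`, and so on) are illustrative.

```r
# Rebuild the dummy data so this block runs on its own
set.seed(1)
df <- data.frame(goods = rep(1:10, each = 50),
                 experience = rnorm(500, mean = 5, sd = 2.2))

n_total   <- length(unique(df$goods))  # clusters in the population
n_sampled <- 4                         # clusters drawn

# Draw the cluster sample
clusters   <- sample(unique(df$goods), size = n_sampled, replace = FALSE)
cluster_df <- df[df$goods %in% clusters, ]

# Mean experience score within each sampled cluster
cluster_means <- tapply(cluster_df$experience, cluster_df$goods, mean)

# Estimate of the population mean (assumes equal cluster sizes)
estimate <- mean(cluster_means)

# Approximate standard error with a finite population correction
fpc       <- 1 - n_sampled / n_total
std_error <- sqrt(fpc * var(cluster_means) / n_sampled)

estimate
std_error
```

The standard error here depends on how much the cluster means differ from one another, not on the 200 individual responses, because the clusters are the units that were actually sampled.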

