# Sampling Exercise Part 1

Published on **R-exercises** and kindly contributed to R-bloggers.


In this exercise set, we will quickly work through some basic sampling methods. Follow along with this series to use these methods later in our decision tree modelling exercise. We will sample using the packages caTools and caret. This is a beginner-level exercise. Please refer to the help pages for the `set.seed()`, `sample.split()`, `createDataPartition()`, and `createFolds()` functions. You may also find it helpful to review the `subset()` function.
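
Since every exercise below relies on random draws, it may help to see why the seed matters. The following is a minimal base R sketch; the seed value 42 and the variable names `first_draw` and `second_draw` are arbitrary choices for illustration and are not part of the exercises.

```r
# Seeding the random number generator makes random draws repeatable.
set.seed(42)                      # 42 is an arbitrary seed for illustration
first_draw <- sample(1:10, 3)     # draw 3 values without replacement

set.seed(42)                      # reset the generator to the same state
second_draw <- sample(1:10, 3)    # the same call now repeats the same draw

identical(first_draw, second_draw)  # TRUE
```

Without the second `set.seed()` call, the two draws would generally differ, which is why each sampling exercise asks you to set a seed first.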

Answers to the exercises are available here.

If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

**Exercise 1**

Load the iris data and also load the package “caTools”. If the package is not installed, use the `install.packages` command to install it.

**Exercise 2**

Set the seed to 100.

**Exercise 3**

Use the function `sample.split` with `SplitRatio = 0.7` to split the dataset into two folds, using the `Species` column as the class label. Store the result in the variable `split`.

**Exercise 4**

Use the `subset` function to select the rows of the data frame where `split` is `TRUE`. Store the result in a variable called `Train`.

**Exercise 5**

Store the other 30 percent of the sample in the variable `Test`. Use the same subset method.

**Exercise 6**

Print out the number of rows in the `Test` and `Train` variables. You should see 70 percent of the data (105 rows) in `Train` and 30 percent (45 rows) in `Test`.
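
The split-then-subset pattern from Exercises 3 to 6 can be sketched on a small made-up data frame, so the iris solution is still left to you. This assumes caTools is installed; the data frame `toy`, the names `in_first`, `first_part` and `second_part`, the seed, and the 0.6 ratio are all hypothetical choices for illustration.

```r
library(caTools)

# Toy data (hypothetical, not iris): 20 rows, two classes of 10 rows each
toy <- data.frame(x = 1:20, class = factor(rep(c("a", "b"), each = 10)))

set.seed(1)  # arbitrary seed for illustration

# sample.split() returns a logical vector with one entry per row;
# TRUE marks the rows assigned to the first part of the split
in_first <- sample.split(toy$class, SplitRatio = 0.6)

# subset() keeps the rows for which the condition holds
first_part  <- subset(toy, in_first == TRUE)
second_part <- subset(toy, in_first == FALSE)

nrow(first_part)          # rows in the first part
nrow(second_part)         # rows in the second part
table(first_part$class)   # class counts within the first part
```

Because the class labels are passed to `sample.split()`, the split is made within each class, so the class proportions in both parts should stay close to those of the full data.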

**Exercise 7**

Install and load the package “caret”.

**Exercise 8**

Set the seed to 500 and use `createDataPartition` to do the same two-way split as in Exercise 3, but with an 80:20 ratio and `list = FALSE`. Note that R is case-sensitive, so the argument is `list`, not `List`.
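
To show what `list = FALSE` changes, here is a minimal sketch on a made-up data frame rather than iris. It assumes caret is installed; `toy`, `idx`, `part_one`, `part_two`, the seed, and `p = 0.5` are hypothetical choices for illustration.

```r
library(caret)

# Toy data (hypothetical, not iris): 20 rows, two classes of 10 rows each
toy <- data.frame(x = 1:20, class = factor(rep(c("a", "b"), each = 10)))

set.seed(2)  # arbitrary seed for illustration

# With list = FALSE the result is a matrix of row indices instead of a list,
# which can be used directly for row indexing
idx <- createDataPartition(toy$class, p = 0.5, list = FALSE)

part_one <- toy[idx, ]    # rows selected by the partition
part_two <- toy[-idx, ]   # all remaining rows

dim(idx)                  # one column per resample
table(part_one$class)     # class counts within the selected rows
```

Negative indexing with `-idx` is a convenient way to get the complement of the selected rows without computing it separately.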

**Exercise 9**

Use the `createDataPartition` function to create 5 different samples of the training data.
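
Repeated sampling with `createDataPartition` is controlled by its `times` argument. The sketch below uses a made-up class vector and assumes caret is installed; `toy_class`, `samples`, the seed, `times = 3`, and `p = 0.5` are hypothetical choices for illustration.

```r
library(caret)

# Toy class labels (hypothetical, not iris): two classes of 10 each
toy_class <- factor(rep(c("a", "b"), each = 10))

set.seed(3)  # arbitrary seed for illustration

# times controls how many independent partitions are drawn;
# with list = FALSE each column of the matrix holds one sample's row indices
samples <- createDataPartition(toy_class, times = 3, p = 0.5, list = FALSE)

ncol(samples)       # number of samples drawn
colnames(samples)   # caret's labels for the resamples
```

Each column is drawn independently, so the same row can appear in more than one sample.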

**Exercise 10**

We now know how to make 2 splits and how to draw 5 different samples. But what about 5 equal splits? Use the `createFolds()` command to make 5 equal partitions of the iris dataset. Make sure that each partition has, as far as possible, an equal representation of the species classes.
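
As a rough illustration of how `createFolds()` differs from the earlier functions, the sketch below builds non-overlapping folds from a made-up class vector. It assumes caret is installed; `toy_class`, `folds`, the seed, and `k = 5` on 20 observations are hypothetical choices for illustration.

```r
library(caret)

# Toy class labels (hypothetical, not iris): two classes of 10 each
toy_class <- factor(rep(c("a", "b"), each = 10))

set.seed(4)  # arbitrary seed for illustration

# createFolds() returns a list of k index vectors; with returnTrain = FALSE
# each element holds the rows that belong to that fold
folds <- createFolds(toy_class, k = 5, list = TRUE, returnTrain = FALSE)

length(folds)           # number of folds
sapply(folds, length)   # size of each fold

# Class counts inside each fold, to check how balanced the folds are
lapply(folds, function(i) table(toy_class[i]))
```

Unlike the repeated samples from `createDataPartition`, the folds do not overlap: every row index appears in exactly one fold.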
