# Using R’s set.seed() to set seeds for use in C/C++ (including Rcpp)

**R on roryverse**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In native R, the user sets the seed for random number generation (RNG) with `set.seed()`

. Random number generators exist in C and C++ too; these need their own seeds, which are not obviously settable by `set.seed()`

. Good news! It can be done.

`pacman::p_load(inline, purrr)`

`rbernoulli`

Base R (or technically the `stats`

package) provides no `rbernoulli()`

. It’s a pretty gaping hole in the pantheon of `rbeta()`

, `rbinom()`

, `rcauchy`

, `rchisq()`

, `rexp()`

, `rf()`

, `rgamma()`

, etc. Thankfully, Hadley Wickham noticed this and gave us `purrr::rbernoulli()`

.

```
set.seed(1)
rbernoulli(5, 0.7)
#> [1] FALSE TRUE TRUE TRUE FALSE
set.seed(1)
rbernoulli(5, 0.7)
#> [1] FALSE TRUE TRUE TRUE FALSE
```

So it seems like Hadley managed to get `set.seed()`

to work with `rbernoulli()`

. How did he do this? Let’s take a closer look at `purrr::rbernoulli()`

.

```
purrr::rbernoulli
#> function (n, p = 0.5)
#> {
#> stats::runif(n) > (1 - p)
#> }
#>
```
#>

Ah, it seems Hadley just wrapped `runif()`

; hence, because `set.seed()`

works with `runif()`

, it works with his implementation of `purrr::rbernoulli()`

.

# C++ RNG

The C++ standard library provides the

header file, which includes Bernoulli RNG. Let’s give that a whirl.

```
cpp_rbernoulli <- rcpp(c(n = "integer", p = "numeric", seed = "integer"), '
int n_ = as
```(n), seed_ = as(seed);
double p_ = as(p);
std::default_random_engine generator(seed_);
std::bernoulli_distribution distribution(p_);
IntegerVector out(n_);
for (std::size_t i = 0; i != n_; ++i) {
out[i] = distribution(generator);
}
return out;
', includes = "#include ")

```
cpp_rbernoulli(6, 0.7, seed = 1)
#> [1] 1 0 1 1 0 1
cpp_rbernoulli(6, 0.7, seed = 1)
#> [1] 1 0 1 1 0 1
cpp_rbernoulli(6, 0.7, seed = 2)
#> [1] 1 0 1 0 1 1
```

OK, so now we have `cpp_rbernoulli()`

which is working, but the user has to pass the seed as an argument of the function, there’s no option to use `set.seed()`

.

`get_seed()`

If only there was a `get_seed()`

function that we could use. Well, here it is!

```
get_seed <- function() {
sample.int(.Machine$integer.max, 1)
}
```

This gets a positive number in the unsigned 32-bit integer range (which is always a safe bet for a seed) and it is completely determined by `set.seed()`

. Therefore, it’s fine to use as a seed itself. Let’s take a look.

```
set.seed(1)
replicate(6, get_seed())
#> [1] 570175513 799129990 1230193230 1950361378 433108649 1929277158
set.seed(1)
replicate(6, get_seed())
#> [1] 570175513 799129990 1230193230 1950361378 433108649 1929277158
set.seed(2)
replicate(6, get_seed())
#> [1] 397031630 1508336757 1231208929 360888751 2026879546 2026097046
set.seed(2)
replicate(6, get_seed())
#> [1] 397031630 1508336757 1231208929 360888751 2026879546 2026097046
```

So as we can see, setting a seed via `set.seed()`

determines the seeds that subsequently come out of `get_seed()`

, so all is well with the world. `get_seed()`

can now be used to create a version of `cpp_rbernoulli()`

which uses `set.seed()`

. For the sake of inflating my own ego, I’ll name this version after myself.

```
rorybernoulli <- function(n, p) {
cpp_rbernoulli(n, p, get_seed())
}
```

Let’s check that it’s in working order.

```
set.seed(1)
rorybernoulli(6, 0.7)
#> [1] 1 1 0 0 1 0
set.seed(1)
rorybernoulli(6, 0.7)
#> [1] 1 1 0 0 1 0
set.seed(2)
rorybernoulli(6, 0.7)
#> [1] 0 1 1 1 1 1
set.seed(2)
rorybernoulli(6, 0.7)
#> [1] 0 1 1 1 1 1
```

Everything is awesome.

# Benchmarking

Lastly, let’s compare the two Bernoulli RNGs that we have now by asking them both to give us a million Bernoulli random numbers with `p = 0.5`

.

```
bench::mark(purrr::rbernoulli(1e6, p = 0.5),
rorybernoulli(1e6, p = 0.5),
check = FALSE)
#> # A tibble: 2 x 10
#> expression min mean median max `itr/sec` mem_alloc n_gc n_itr
#>
```
#> 1 purrr::rb… 26.9ms 29.45ms 28.57ms 32.7ms 34.0 11.45MB 4 13
#> 2 roryberno… 8.55ms 9.61ms 9.38ms 12.7ms 104. 3.82MB 4 46
#> # ... with 1 more variable: total_time

Wow, `rorybernoulli()`

is three times faster! I wasn’t expecting that. Perhaps it’s because there’s a quicker way of generating a Bernoulli random number than by going through a uniform random number (as `purrr::rbernoulli()`

does). It’s also three times as efficient with memory, probably related to the time speedup. The point of me writing this post was to share this `get_seed()`

thing with people so that the can use `set.seed()`

with `Rcpp`

and the like; `purrr::rbernoulli()`

was just a cool example of a non-base RNG that popped into my head. Maybe I should submit a pull request to `purrr`

!

**leave a comment**for the author, please follow the link and comment on their blog:

**R on roryverse**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.