Roulette-wheel selection for dqrng

Posted on December 28, 2022 by R on Ralf Stubner in R bloggers | 0 Comments

[This article was first published on R on Ralf Stubner, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There is a long standing issue with my {dqrng} package: weighted sampling. Since implementing fast un-weighted sampling methods quite some time ago, I have now started looking into possibilities for weighted sampling.

The issue contains a reference to a blog post that is by now only available via the wayback machine. This blog post shows a stochastic acceptance method suggested by Lipowski and Lipowska (2012) (also at https://arxiv.org/abs/1109.3627), which appears very promising. Let’s try some simple tests before incorporating it into the package properly. The stochastic acceptance algorithm is very simple:

Select randomly one of the individuals (say, \(i\)). The selection is done with uniform probability \(1/N\), which does not depend on the individual’s fitness \(w_i\).

With probability \(w_i/w_\max\), where \(w_\max = \max\{w_i\}_{i=1}^N\) is the maximal fitness in the population, the selection is accepted. Otherwise, the procedure is repeated from step 1 (i.e., in the case of rejection, another selection attempt is made).

This can be implemented as:

// [[Rcpp::depends(dqrng,BH,sitmo)]]
#include <Rcpp.h>
#include <dqrng_distribution.h>

auto rng = dqrng::generator<>(42);

// [[Rcpp::export]]
Rcpp::IntegerVector sample_prob(int n, Rcpp::NumericVector prob) {
  Rcpp::IntegerVector result(Rcpp::no_init(n));
  double max_prob = Rcpp::max(prob);
  uint32_t m(prob.length());
  std::generate(result.begin(), result.end(),
                [m, prob, max_prob] () {
                  while (true) {
                    int index = (*rng)(m);
                    if (dqrng::uniform01((*rng)()) < prob[index] / max_prob)
                      return index + 1;
                  }
                });
  return result;
}

First, let’s check that sampling still works as expected:

M <- 1e4
N <- 10
prob <- runif(N)
hist(sample.int(N, M, replace = TRUE, prob = prob), breaks = N)

hist(sample_prob(M, prob = prob), breaks = N)

Eyeballing these histograms shows that they are very similar, i.e. the stochastic acceptance algorithm selects the ten possibilities with the same probabilities as R’s build in method.

Second, let’s look at performance:

bm <- bench::mark(
  sample.int = sample.int(N, M, replace = TRUE, prob = prob),
  sample_prob = sample_prob(M, prob = prob),
  check = FALSE
)
knitr::kable(bm[, 1:6])

expression	min	median	itr/sec	mem_alloc	gc/sec
sample.int	247µs	282µs	3313.546	41.6KB	2.019224
sample_prob	271µs	289µs	3305.289	3.81MB	2.016650

plot(bm)
## Loading required namespace: tidyr

There is only little difference between R’s build in method and this algorithm for only ten different possibilities. However, any performance advantage of stochastic acceptance should come from \(O(1)\) scaling, i.e. larger number of possibilities are more interesting:

N <- 1e5
prob <- runif(N)
bm <- bench::mark(
  sample.int = sample.int(N, M, replace = TRUE, prob = prob),
  sample_prob = sample_prob(M, prob = prob),
  check = FALSE
)
knitr::kable(bm[, 1:6])

expression	min	median	itr/sec	mem_alloc	gc/sec
sample.int	2.96ms	3.58ms	268.6911	1.19MB	6.397408
sample_prob	663.21µs	863.2µs	1068.9643	41.6KB	0.000000

plot(bm)

This is very promising! It looks worthwhile including this algorithm into {dqrng}, especially since it can also be used for weighted sampling without replacement.

Lipowski, Adam, and Dorota Lipowska. 2012. “Roulette-Wheel Selection via Stochastic Acceptance.” Physica A: Statistical Mechanics and Its Applications 391 (6): 2193–96. https://doi.org/10.1016/j.physa.2011.12.004.

To leave a comment for the author, please follow the link and comment on their blog: R on Ralf Stubner.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Roulette-wheel selection for dqrng

Related

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)