Pseudo-Random vs. Random Numbers in R

November 25, 2011

(This article was first published on johnramey » r, and kindly contributed to R-bloggers)

Happy Thanksgiving, everyone. Earlier today, I found an interesting post from Bo Allen on pseudo-random vs random numbers, where the author uses a simple bitmap (heat map) to show that the rand function in PHP has a systematic pattern and compares these to truly random numbers obtained from The post’s results suggest that pseudo-randomness in PHP is faulty and, in general, should not be underestimated in practice. Of course, the findings should not be too surprising, as there is a large body of literature on the subtleties, philosophies, and implications of the pseudo aspect of the most common approaches to random number generation. However, it is silly that PHP‘s random number generator (RNG) displays such an obvious pattern nowadays because there are several decent, well-studied pseudo-RNG algorithms available as well as numerous tests for randomness.  For a good introduction to RNG, I recommend John D. Cook’s discussion on testing a random number generator.

Now, I would never use PHP for any (serious) statistical analysis, partly due to my fondness for R, nor do I doubt the practicality of the RNG in R. But I was curious to see what would happen. So, created equivalent plots in R to see if a rand equivalent would exhibit a systematic pattern like in PHP, even if less severe. Also, for comparison, I chose to use the random package, from Dirk Eddelbuettel, to draw truly random numbers from Until today, I had only heard of the random package but had never used it.

I have provided the function rand_bit_matrix, which requires the number of rows and columns to display in the plotted bitmap. To create the bitmaps, I used the pixmap package rather than the much-loved ggplot2 package, simply because of how easy it was for me to create the plots. (If you are concerned that I have lost the faith, please note that I am aware of the awesomeness of ggplot2 and its ability to create heat maps.)

It is important to note that there were two challenges that I encountered when using drawing truly random numbers.

  1. Only 10,000 numbers can be drawn at once from (This is denoted as in the function below.)
  2. There is a daily limit to the number of times the service will provide numbers.

To overcome the first challenge, I split the total number of bits into separate calls, if necessary. This approach, however, increases our number of requests, and after too many requests, you will see the error: suggests to wait until tomorrow. Currently, I do not know the exact number of allowed requests or if the amount of requested random numbers is a factor, but looking back, I would guess about 20ish large requests is too much.

Below, I have plotted 500 x 500 bitmaps based on the random bits from both of R and As far as I can tell, no apparent patterns are visible in either plot, but from the graphics alone, our conclusions are limited to ruling out obvious systematic patterns, which were exhibited from the PHP code. I am unsure if the PHP folks formally tested their RNG algorithms for randomness, but even if they did, the code in both R and PHP is straightforward and provides a quick eyeball test. Armed with similar plots alone, the PHP devs could have sought for better RNG algorithms — perhaps, borrowed those from R.

Click the images for a larger view.



Finally, I have provided my R code, which you may tinker with, copy, manipulate, utilize, etc. without legal action and without fear of death by snoo snoo. Mmmm. Snoo snoo.



rand_bit_matrix <- function(num_rows = 500, num_cols = 500, = 10000, seed = NULL) {
  # I have copied the following function directly from help("integer").
  is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) {
    abs(x - round(x)) < tol
  # The number of bits to draw at "random".
  n <- num_rows * num_cols
  if(n <= 0 || !is.wholenumber(n)) {
    stop("The number of bits 'n' should be a natural number.")

  if(!is.null(seed)) {

  # Create a matrix of pseudo-random bits.
  bits_R <- replicate(n = num_cols, sample(c(0, 1), size = num_rows, replace = TRUE))
  # Because will only return a maximum of 10,000 numbers at a time,
  # we break this up into several calls. <- =, times = n %/%
  if(n %% > 0) { <- c(, n %%
  } <- lapply(, function(n) {
    try_default(randomNumbers(n = n, min = 0, max = 1, col = 1), NA)
  }) <- matrix(unlist(, nrow = num_rows, ncol = num_cols)
  list(R = bits_R, =

bit_mats <- rand_bit_matrix(num_rows = 500, num_cols = 500, seed = 42)

  plot(pixmapGrey(data = R, nrow = nrow(R), ncol = ncol(R)), main = "R")
  plot(pixmapGrey(data =, nrow = nrow(, ncol = ncol(, main = "")
view raw
This Gist brought to you by GitHub.

To leave a comment for the author, please follow the link and comment on their blog: johnramey » r. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , ,

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)