Save your simulation study seeds

September 11, 2014

(This article was first published on Robert Grant's stats blog » R, and kindly contributed to R-bloggers)

Here in the Northern hemisphere, gardeners are gathering seeds from their prize-winning vegetables and storing them away for next year’s crop. Today at the 20th London Stata Users’ Group meeting, I learnt a similar trick. It’s strange I never thought of it before; regular readers will know how keen I am on simulation studies and techniques. Sometimes you want to go back and investigate one particular simulated dataset, either because the results were particularly interesting or because your program didn’t behave the way you hoped and you think that one weird dataset holds a clue for debugging it.

Bill Gould’s sage advice was to gather the seed for the random number generator (RNG) at the beginning of each iteration and store it as a string. The seed initialises the RNG (which is never really random*, just unpredictable) and if you set it, you can reproduce the results. A basic example in Stata:

clear all

local iter=10
set obs 100
gen str64 myseed=""
gen slopes=.
gen x=rnormal(0,1)
gen mu=1+(2*x)
gen y=rnormal(mu,1)

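forvalues i=1/`iter' {
 // capture the RNG state before this iteration draws any numbers
 local myseed=c(seed)
 qui replace myseed="`myseed'" in `i'
 // simulate a fresh dataset and fit the regression
 qui replace x=rnormal(0,1)
 qui replace mu=1+(2*x)
 qui replace y=rnormal(mu,1)
 qui regress y x
 // e(b) holds the coefficients; the first column is the slope on x
 matrix A=e(b)
 local beta=A[1,1]
 qui replace slopes=`beta' in `i'
}
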
// recreate simulation 2 from its stored seed
local revisit=myseed[2]
set seed `revisit'
qui replace x=rnormal(0,1)
qui replace mu=1+(2*x)
qui replace y=rnormal(mu,1)
// the slope reported here should match slopes[2]
regress y x
dis "`revisit'"

In this example, we run a linear regression 10 times and save the slope parameters and the seeds. We can then pick out simulation number 2 and recreate it without retracing any other steps. Nice! Here it is in R (but R RNGs are much more complex and I have to say that the help documentation is far from helpful):

for(i in 1:iter) {
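
The R listing above is cut short after its first line. As a rough sketch of how the same trick might look in R (my own reconstruction, not the original code; the names seeds, slopes and n and the starting seed 2014 are assumptions), you can copy .Random.seed at the top of each iteration and put it back later with assign():

```r
# Sketch only: mirrors the Stata example; names and starting seed are assumptions.
set.seed(2014)                  # fixed start; also guarantees .Random.seed exists
iter <- 10
n <- 100
seeds <- vector("list", iter)   # .Random.seed is an integer vector, so store a list
slopes <- numeric(iter)

for (i in 1:iter) {
  seeds[[i]] <- .Random.seed    # RNG state before this iteration draws anything
  x <- rnorm(n, 0, 1)
  mu <- 1 + 2 * x
  y <- rnorm(n, mu, 1)
  slopes[i] <- coef(lm(y ~ x))[["x"]]
}

# Revisit simulation 2: restore its saved state, then repeat the same draws
assign(".Random.seed", seeds[[2]], envir = globalenv())
x <- rnorm(n, 0, 1)
mu <- 1 + 2 * x
y <- rnorm(n, mu, 1)
refit <- lm(y ~ x)

# Check that the recreated slope matches the stored one
all.equal(coef(refit)[["x"]], slopes[2])
```

Unlike Stata's c(seed), which is a single string, the R state is a whole integer vector, which is why the sketch stores it in a list rather than a character column.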

* – I’m glossing over the philosophical meaning of ‘random’ here
