# Example 7.2: Simulate data from a logistic regression

June 13, 2009
By

(This article was first published on SAS and R, and kindly contributed to R-bloggers)

It might be useful to be able to simulate data from a logistic regression (section 4.1.1). Our process is to generate the linear predictor, then apply the inverse link, and finally draw from a distribution with this parameter. This approach is useful in that it can easily be applied to other generalized linear models. In this example we assume an intercept of 0 and a slope of 0.5, and generate 1,000 observations. See section 4.6.1 for an example of fitting logistic regression.

SAS
In SAS, we do this within a data step. We define parameters for the model and use looping (section 1.11.1) to replicate the model scenario for random draws of standard Normal covariate values (section 1.10.5), calculating the linear predictor for each, and testing the resulting expit against a random draw from a standard uniform distribution (section 1.10.3).

data test;
intercept = 0;
beta = .5;
do i = 1 to 1000;
xtest = normal(12345);
linpred = intercept + (xtest * beta);
prob = exp(linpred)/ (1 + exp(linpred));
ytest = uniform(0) lt prob;
output;
end;
run;

R
In R we begin by assigning parameter values for the model. We then generate 1,000 random Normal variates (section 1.10.5), calculating the linear predictor and expit for each, and then testing vectorwise (section 1.11.2) against 1,000 random Uniforms (1.10.3).

intercept <- 0
beta <- 0.5
xtest <- rnorm(1000,1,1)
linpred <- intercept + (xtest * beta)
prob <- exp(linpred)/(1 + exp(linpred))
runis <- runif(1000,0,1)
ytest <- ifelse(runis < prob,1,0)

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...