# Missing data, logistic regression, and a predicted values plot (or two)

July 15, 2009
By

(This article was first published on Learning R, and kindly contributed to R-bloggers)

miss <- read.table ("/data/missing.txt", header = T, sep = "\t") attach miss result1 <- glm(a~b, family=binomial(logit)) summary(result1) Call: glm(formula = a ~ b, family = binomial(logit)) Deviance Residuals:
Min 1Q Median 3Q Max
-1.8864 -1.2036 0.7397 0.9425 1.4385

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.96130 1.40609 -4.240 2.24e-05 ***
b 0.10950 0.02404 4.555 5.24e-06 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 279.97 on 203 degrees of freedom
Residual deviance: 236.37 on 202 degrees of freedom
(3 observations deleted due to missingness)
AIC: 240.37

Number of Fisher Scoring iterations: 5
detach (miss)
attach (miss2)

result2 <- glm(a~b, family=binomial(logit)) summary(result2) Call: glm(formula = a ~ b, family = binomial(logit)) Deviance Residuals: Min 1Q Median 3Q Max -1.884 -1.198 0.742 0.936 1.446 Coefficients: Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.0059 1.4162 -4.241 2.23e-05 ***
b 0.1101 0.0242 4.549 5.39e-06 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 278.81 on 202 degrees of freedom
Residual deviance: 235.14 on 201 degrees of freedom
AIC: 239.14

Number of Fisher Scoring iterations: 5

plot(b, fitted(result1))

plot(b, fitted(result1), type=”n”)

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...