(This article was first published on **Econometric Sense**, and kindly contributed to R-bloggers)

Previously I wrote about the practical differences between marginal effects and odds ratios with regard to logistic regression.

Recently, I ran across a tweet from Michael Grogan linking to one of his posts using logistic regression to model dividend probabilities. This really got me interested:

*“Moreover, to obtain a measure in probability terms – one could divide the logit coefficient by 4 to obtain an estimate of the probability of a dividend increase. In this case, the logit coefficient of 0.8919 divided by 4 gives us a result of 22.29%, which means that for every 1 year increase in consecutive dividend increases, the probability of an increase in the dividend this year rises by 22.29%.”*

I had never heard of this ‘divide by 4’ shortcut for getting to marginal effects. While you can get marginal effects in Stata, R, or SAS with a little work, this trick would be very handy if, for instance, you are reading someone else’s paper/results and just want a ballpark on marginal effects (instead of interpreting odds ratios).

I did some additional investigation on this and ran across Stephen Turner’s Getting Genetics Done blog post related to this, where he goes a little deeper into the mathematics behind this:

*“The slope of this curve (1st derivative of the logistic curve) is maximized at a + βx = 0, where it takes on the value:*

*βe⁰/(1 + e⁰)² = β(1)/(1 + 1)² = β/4*

*So you can take the logistic regression coefficients (not including the intercept) and divide them by 4 to get an upper bound of the predictive difference in probability of the outcome y = 1 per unit increase in x.”*

Stephen points to Andrew Gelman, who may be the originator of this rule, citing the text *Data Analysis Using Regression and Multilevel/Hierarchical Models* by Andrew Gelman and Jennifer Hill. There is some pushback in the comments on Stephen’s post, but I still think this is a nice shortcut for an on-the-fly interpretation of reported results.
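A quick way to convince yourself of the math above (a minimal sketch in R): the logistic density `dlogis` peaks at zero, where it equals exactly 1/4, so multiplying a coefficient by that peak is the same as dividing it by 4. Using the 0.8919 coefficient from the dividend example quoted earlier:

```r
# the logistic density is maximized at 0, where it equals exactly 1/4
dlogis(0)      # 0.25

# so the largest possible slope of the fitted probability curve is b/4
b <- 0.8919    # logit coefficient from the dividend example quoted above
b * dlogis(0)  # 0.222975
b / 4          # same value: the 'divide by 4' shortcut
```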

If you go back to the output from my previous post on marginal effects, the estimated logistic regression coefficients were:

            Estimate Std. Error z value Pr(>|z|)
(Intercept)  5.92972    2.34258   2.531   0.0114 *
age         -0.14099    0.05656  -2.493   0.0127 *

And if you apply the divide by 4 rule you get:

**-0.14099 / 4 = -0.0352**

While this is not a proof, or even a simulation, that value is close to the minimum of the estimated marginal effects (which, since the effects here are negative, serves as the upper bound in magnitude) for this data (see the full R program below):

summary(meffx)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-0.03525 -0.03262 -0.02697 -0.02583 -0.02030 -0.01071

R Code:

#------------------------------------------------------------------
# PROGRAM NAME: MEFF and Odds Ratios
# DATE: 3/3/16
# CREATED BY: MATT BOGARD
# PROJECT FILE:
#------------------------------------------------------------------
# PURPOSE: GENERATE MARGINAL EFFECTS FOR LOGISTIC REGRESSION AND COMPARE TO:
# ODDS RATIOS / RESULTS FROM R
#
# REFERENCES: https://statcompute.wordpress.com/2012/09/30/marginal-effects-on-binary-outcome/
# https://diffuseprior.wordpress.com/2012/04/23/probitlogit-marginal-effects-in-r-2/
#
# UCD CENTRE FOR ECONOMIC RESEARCH
# WORKING PAPER SERIES
# 2011
# Simple Logit and Probit Marginal Effects in R
# Alan Fernihough, University College Dublin
# WP11/22
# October 2011
# http://www.ucd.ie/t4cms/WP11_22.pdf
#------------------------------------------------------------------

#-----------------------------------------------------
# generate data for continuous explanatory variable
#-----------------------------------------------------

Input = ("participate age
1 25
1 26
1 27
1 28
1 29
1 30
0 31
1 32
1 33
0 34
1 35
1 36
1 37
1 38
0 39
0 40
0 41
1 42
1 43
0 44
0 45
1 46
1 47
0 48
0 49
0 50
0 51
0 52
1 53
0 54
")

dat1 <- read.table(textConnection(Input), header = TRUE)
summary(dat1) # summary stats

#### run logistic regression model
mylogit <- glm(participate ~ age, data = dat1, family = "binomial")
summary(mylogit)
exp(cbind(OR = coef(mylogit), confint(mylogit))) # get odds ratios

#-------------------------------------------------
# marginal effects calculations
#-------------------------------------------------

#--------------------------------------------------------------------
# mfx function for marginal effects from a glm model
#
# from: https://diffuseprior.wordpress.com/2012/04/23/probitlogit-marginal-effects-in-r-2/
# based on:
# UCD CENTRE FOR ECONOMIC RESEARCH
# WORKING PAPER SERIES
# 2011
# Simple Logit and Probit Marginal Effects in R
# Alan Fernihough, University College Dublin
# WP11/22
# October 2011
# http://www.ucd.ie/t4cms/WP11_22.pdf
#---------------------------------------------------------------------

mfx <- function(x, sims = 1000){
  set.seed(1984)
  # average density of the link-scale predictions (probit vs logit)
  pdf <- ifelse(as.character(x$call)[3] == 'binomial(link = "probit")',
                mean(dnorm(predict(x, type = "link"))),
                mean(dlogis(predict(x, type = "link"))))
  pdfsd <- ifelse(as.character(x$call)[3] == 'binomial(link = "probit")',
                  sd(dnorm(predict(x, type = "link"))),
                  sd(dlogis(predict(x, type = "link"))))
  marginal.effects <- pdf*coef(x)
  # simulate coefficients to get standard errors for the marginal effects
  sim <- matrix(rep(NA, sims*length(coef(x))), nrow = sims)
  for(i in 1:length(coef(x))){
    sim[,i] <- rnorm(sims, coef(x)[i], diag(vcov(x)^0.5)[i])
  }
  pdfsim <- rnorm(sims, pdf, pdfsd)
  sim.se <- pdfsim*sim
  res <- cbind(marginal.effects, sd(sim.se))
  colnames(res)[2] <- "standard.error"
  # drop the intercept row if present
  ifelse(names(x$coefficients[1]) == "(Intercept)",
         return(res[2:nrow(res),]), return(res))
}

# marginal effects from logit
mfx(mylogit)

### code it yourself for marginal effects at the mean
summary(dat1)
b0 <- 5.92972   # estimated intercept from logit
b1 <- -0.14099  # estimated b from logit
xvar <- 39.5    # reference value (i.e. mean) for explanatory variable
d <- .0001      # incremental change in x
xbi <- (xvar + d)*b1 + b0
xbj <- (xvar - d)*b1 + b0
meff <- ((exp(xbi)/(1+exp(xbi)))-(exp(xbj)/(1+exp(xbj))))/(d*2)
print(meff)

### a different, perhaps easier formulation for meff at the mean
XB <- xvar*b1 + b0 # this could be expanded for multiple b's or x's
meffx <- (exp(XB)/((1+exp(XB))^2))*b1
print(meffx)

### averaging the meff for the whole data set
dat1$XB <- dat1$age*b1 + b0
meffx <- (exp(dat1$XB)/((1+exp(dat1$XB))^2))*b1
summary(meffx) # get mean

#### marginal effects from linear model
lpm <- lm(dat1$participate ~ dat1$age)
summary(lpm)

#---------------------------------------------------
#
# multivariable case
#
#---------------------------------------------------

dat2 <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
head(dat2)
summary(dat2) # summary stats

#### run logistic regression model
mylogit <- glm(admit ~ gre + gpa, data = dat2, family = "binomial")
summary(mylogit)
exp(cbind(OR = coef(mylogit), confint(mylogit))) # get odds ratios

# marginal effects from logit
mfx(mylogit)

### code it yourself for marginal effects at the mean
summary(dat2)

b0 <- -4.949378 # estimated intercept from logit
b1 <- 0.002691  # estimated b for gre
b2 <- 0.754687  # estimated b for gpa
x1 <- 587       # reference value (i.e. mean) for gre
x2 <- 3.39      # reference value (i.e. mean) for gpa
d <- .0001      # incremental change in x

# meff at means for gre
xbi <- (x1 + d)*b1 + b2*x2 + b0
xbj <- (x1 - d)*b1 + b2*x2 + b0
meff <- ((exp(xbi)/(1+exp(xbi)))-(exp(xbj)/(1+exp(xbj))))/(d*2)
print(meff)

# meff at means for gpa
xbi <- (x2 + d)*b2 + b1*x1 + b0
xbj <- (x2 - d)*b2 + b1*x1 + b0
meff <- ((exp(xbi)/(1+exp(xbi)))-(exp(xbj)/(1+exp(xbj))))/(d*2)
print(meff)

### a different, perhaps easier formulation for meff at the mean
XB <- x1*b1 + x2*b2 + b0 # this could be expanded for multiple b's or x's

# meff at means for gre
meffx <- (exp(XB)/((1+exp(XB))^2))*b1
print(meffx)

# meff at means for gpa
meffx <- (exp(XB)/((1+exp(XB))^2))*b2
print(meffx)

### averaging the meff for the whole data set
dat2$XB <- dat2$gre*b1 + dat2$gpa*b2 + b0

# sample avg meff for gre
meffx <- (exp(dat2$XB)/((1+exp(dat2$XB))^2))*b1
summary(meffx) # get mean

# sample avg meff for gpa
meffx <- (exp(dat2$XB)/((1+exp(dat2$XB))^2))*b2
summary(meffx) # get mean

#### marginal effects from linear model
lpm <- lm(admit ~ gre + gpa, data = dat2)
summary(lpm)
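As one more sanity check on the linear probability model comparison, here is a standalone sketch (re-entering the same toy participation data so it runs on its own) showing that the LPM slope on age lands close to the average marginal effect from the logit:

```r
# re-create the participation data used above
age <- 25:54
participate <- c(1,1,1,1,1,1,0,1,1,0,1,1,1,1,0,
                 0,0,1,1,0,0,1,1,0,0,0,0,0,1,0)

logit <- glm(participate ~ age, family = "binomial")

# average marginal effect: mean logistic density times the coefficient
ame <- mean(dlogis(predict(logit, type = "link"))) * coef(logit)["age"]

# slope from the linear probability model
lpm_slope <- coef(lm(participate ~ age))["age"]

round(c(AME = ame, LPM = lpm_slope), 4) # both land in the same negative ballpark
```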
