When Size Matters: Weighted Effect Coding

February 24, 2017

(This article was first published on Rense Nieuwenhuis » R-Project, and kindly contributed to R-bloggers)

Categorical variables are often included in regression models as sets of dummy variables. In R, this is done with factor variables, which use treatment coding by default. Typically, each category is compared with a preselected reference category, and that difference is tested for significance. We present a useful alternative.
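As a minimal sketch of this default behaviour, the snippet below fits a model with a treatment-coded factor in base R. The data frame `d`, its columns `group` and `y`, and all values are hypothetical and serve only as an illustration.

```r
# Hypothetical toy data: three groups of unequal size (2, 3 and 5 observations)
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), times = c(2, 3, 5))),
  y     = c(1, 3, 4, 5, 6, 7, 8, 9, 10, 11)
)

# Treatment (dummy) coding is R's default for unordered factors;
# the first level, "a", serves as the reference category
contrasts(d$group)

fit_trt <- lm(y ~ group, data = d)
coef(fit_trt)
# (Intercept) = 2 (mean of "a"), groupb = 3, groupc = 7
```

Each coefficient other than the intercept is the difference between a category mean and the mean of the reference category `a`.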

If all categories have (roughly) the same number of observations, you can also test all categories against the grand mean using effect (ANOVA) coding. In observational studies, however, the number of observations per category typically varies, so the unweighted mean of the category means no longer coincides with the sample mean. Our new paper shows how categories of a factor variable can be tested against the sample mean in that situation. Although the paper has been online for some time now (and this post is an update to an earlier post from some time ago), we are happy to announce that it has now officially been published in the International Journal of Public Health.
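To make the difference concrete, here is a small base-R sketch with hypothetical, deliberately unbalanced data. It builds weighted effect codes by hand: a category's own column is coded 1, and the omitted category is coded minus the ratio of that category's size to the omitted category's size. All object names are illustrative, and the choice of "c" as the omitted category is arbitrary.

```r
# Hypothetical toy data: three groups of unequal size
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), times = c(2, 3, 5))),
  y     = c(1, 3, 4, 5, 6, 7, 8, 9, 10, 11)
)
n <- setNames(tabulate(d$group), levels(d$group))   # a = 2, b = 3, c = 5

# Unweighted effect coding: the omitted category "c" is coded -1
d$g_eff <- d$group
contrasts(d$g_eff) <- contr.sum(3)

# Weighted effect coding: "c" is coded -n_k / n_c in the column of category k
wec_mat <- rbind(diag(2), -n[c("a", "b")] / n[["c"]])
dimnames(wec_mat) <- list(levels(d$group), c("a", "b"))
d$g_wec <- d$group
contrasts(d$g_wec) <- wec_mat

fit_eff <- lm(y ~ g_eff, data = d)
fit_wec <- lm(y ~ g_wec, data = d)

coef(fit_eff)[1]   # 5.33: unweighted mean of the three group means
coef(fit_wec)[1]   # 6.4: the sample mean of y
mean(d$y)          # 6.4
```

With weighted effect coding the intercept equals the sample mean, and each remaining coefficient is the deviation of a category mean from that sample mean (here -4.4 for `a` and -1.4 for `b`).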

To apply weighted effect coding, the procedure introduced in these papers, we have made implementations available for R, SPSS, and Stata. For R, we created the ‘wec’ package, which can be installed by typing install.packages("wec") at the R prompt.
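Once the package is installed, usage might look like the sketch below. It assumes that the package is available from CRAN and that its `contr.wec()` function takes a factor and the label of the omitted category; the data frame `d` and its columns are hypothetical.

```r
# install.packages("wec")   # run once; assumes the package is available from CRAN
library(wec)

# Hypothetical toy data: three groups of unequal size
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), times = c(2, 3, 5))),
  y     = c(1, 3, 4, 5, 6, 7, 8, 9, 10, 11)
)

# Copy the factor and attach weighted effect contrasts, omitting category "c"
d$group_wec <- d$group
contrasts(d$group_wec) <- contr.wec(d$group, omitted = "c")

# The intercept is now the sample mean of y; the other coefficients are
# deviations of the category means from that sample mean
fit <- lm(y ~ group_wec, data = d)
coef(fit)
```

To obtain the estimate for the omitted category, the same model can be refitted with a different category passed as `omitted`.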



Grotenhuis, M., Pelzer, B., Eisinga, R., Nieuwenhuis, R., Schmidt-Catran, A., & Konig, R. (2017). When size matters: advantages of weighted effect coding in observational studies. International Journal of Public Health, 62, 163–167. http://doi.org/10.1007/s00038-016-0901-1

Sweeney, R., & Ulveling, E. F. (1972). A transformation for simplifying the interpretation of coefficients of binary variables in regression analysis. The American Statistician, 26, 30–32.
