Log odds ratios and an indicator matrix from categorical data

October 4, 2012
(This article was first published on is.R(), and kindly contributed to R-bloggers)

The title is long, but this Gist contains two handy things. The first, and more obscure, converts a data.frame of categorical variables into a matrix of dummy (binary, indicator) variables, with one column for each category of each original variable.

How best to do this is not obvious (to me, at least), so the solution is borrowed from “Gavin Simpson” and “fabians” at Stack Overflow.
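A minimal sketch of one commonly used base R approach follows; it may differ from the Gist's own code. The data.frame `df` and its columns `colour` and `size` are hypothetical toy data. The idea is to pass `model.matrix()` a full (non-reduced) contrast matrix for every factor, so that no reference level is dropped.

```r
# Toy data: a hypothetical data.frame of factors (not from the original Gist)
df <- data.frame(
  colour = factor(c("red", "blue", "red", "green")),
  size   = factor(c("S", "L", "L", "S"))
)

# contrasts(x, contrasts = FALSE) returns an identity matrix over the levels,
# so every level of every factor gets its own 0/1 column.
full_contrasts <- lapply(df, contrasts, contrasts = FALSE)

# "+ 0" removes the intercept; contrasts.arg overrides the default
# treatment contrasts that would otherwise drop one level per factor.
indicators <- model.matrix(~ . + 0, data = df, contrasts.arg = full_contrasts)

print(indicators)
```

Each row should then contain exactly one 1 per original variable, which makes for a quick sanity check with `rowSums(indicators)`.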

The second part of this Gist shows how to construct a table of log odds ratios between each pair of these indicator variables, which may be a first step toward estimating something similar to, but not the same as, multiple correspondence analysis.
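As a rough sketch of what such a table could look like, and not necessarily how the Gist computes it: for two indicators x and y, the log odds ratio is log((n11 * n00) / (n10 * n01)), where n11 counts rows with x = 1 and y = 1, and so on. The matrix `X`, the helper `log_odds_ratio()`, and the 0.5 continuity correction (the Haldane-Anscombe choice) are all assumptions made for illustration. Some correction is needed because indicators derived from the same original variable never co-occur, which would otherwise give log(0).

```r
# Hypothetical 0/1 indicator matrix (toy data, not from the original Gist)
X <- cbind(
  a = c(1, 1, 0, 0, 1, 0),
  b = c(1, 0, 0, 0, 1, 1),
  c = c(0, 0, 1, 1, 0, 1)
)

# Log odds ratio for two binary vectors. 'add' is a continuity correction
# applied to every cell of the 2x2 table (0.5 is an assumed default) so that
# empty cells do not produce log(0) or division by zero.
log_odds_ratio <- function(x, y, add = 0.5) {
  n11 <- sum(x == 1 & y == 1) + add
  n10 <- sum(x == 1 & y == 0) + add
  n01 <- sum(x == 0 & y == 1) + add
  n00 <- sum(x == 0 & y == 0) + add
  log((n11 * n00) / (n10 * n01))
}

# Fill a square table with the log odds ratio for every pair of columns
p <- ncol(X)
lor <- matrix(NA_real_, p, p, dimnames = list(colnames(X), colnames(X)))
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    lor[i, j] <- log_odds_ratio(X[, i], X[, j])
  }
}

print(round(lor, 2))
```

The resulting table is symmetric; positive entries mark categories that tend to occur together, and negative entries mark categories that tend to exclude each other.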
