Using xBalance with MatchIt

[This article was first published on Mark M. Fredrickson, and kindly contributed to R-bloggers].

In a previous post, I demonstrated how to create a propensity score match, test balance, and analyze the outcome variable using the optmatch and RItools packages. The same strategy can be used with other matching algorithms, for example the various methods included in the MatchIt package.

I’ll use the same basic question and data from my previous article. The MatchIt package wraps optmatch to provide its “full” and “optimal” matching methods, so I will use the “full” option to maintain consistency with my previous article. The first step is loading the packages and the data:

> library(MatchIt)
> library(optmatch)
> library(RItools)
> data(nuclearplants)

The interface for MatchIt is similar to that of optmatch for propensity score matches. The main difference is that the matchit() function compresses the process into a single step of specifying the propensity formula and producing the match, while fullmatch() allows a user to specify any number of distance matrices. As with the previous article, I match on a subset of the covariates.

> example.formula <- formula(pr ~ t1 + t2 + cap)
> match.opt <- fullmatch(
+     mdist(glm(example.formula, data = nuclearplants, family = binomial())))
> all.mit <- matchit(example.formula, data = nuclearplants, method = "full")
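To make the "distance" idea concrete, here is a minimal base R sketch of a propensity score distance matrix between treated and control units. The `toy` data frame and all object names are made up for illustration. The sketch uses the plain absolute difference in the logit's linear predictor; optmatch may apply additional scaling that is omitted here, so the numbers will not necessarily match what mdist() produces.

```r
# Hypothetical toy data: z is the treatment indicator, x a single covariate.
toy <- data.frame(
  z = c(1, 0, 1, 0, 0, 1, 0, 0),
  x = c(3, 1, 2, 2, 4, 5, 3, 1)
)

# Fit the propensity model (a logit of treatment on the covariate).
ps_model <- glm(z ~ x, data = toy, family = binomial())

# Linear predictor (log-odds of treatment) for every unit.
lp <- predict(ps_model, type = "link")

# Treated-by-control matrix of absolute differences in the linear predictor.
# Assumption: no rescaling is applied, unlike what a matching package may do.
treated <- toy$z == 1
dist_mat <- abs(outer(lp[treated], lp[!treated], "-"))

dist_mat
```

Rows correspond to treated units and columns to controls, which is the shape a matching routine needs in order to pair units across the two groups.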

The all.mit object contains (among other items) a vector indicating each observation’s matched set. For compatibility, save it as a factor:

> match.mit <- as.factor(all.mit$subclass)

Unsurprisingly, because MatchIt uses optmatch, the two matches are identical; only the labels of the matched sets differ.

> lapply(split(nuclearplants, match.opt), rownames)
$m.1
[1] "N" "Z" "a"

$m.10
[1] "I" "G"

$m.2
[1] "A" "B" "D" "V" "F" "b"

$m.5
[1] "U" "c"

$m.6
 [1] "H" "K" "L" "M" "C" "P" "R" "Y" "e" "f"

$m.8
[1] "J" "O" "Q" "S" "T" "E" "W" "X" "d"

> lapply(split(nuclearplants, match.mit), rownames)
$`1`
[1] "N" "Z" "a"

$`2`
[1] "I" "G"

$`3`
[1] "A" "B" "D" "V" "F" "b"

$`4`
[1] "U" "c"

$`5`
 [1] "H" "K" "L" "M" "C" "P" "R" "Y" "e" "f"

$`6`
[1] "J" "O" "Q" "S" "T" "E" "W" "X" "d"
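Comparing the two listings by eye works for a small dataset. For larger ones, a programmatic check may be handier. The following base R sketch defines a hypothetical helper, `same_partition()`, that reports whether two grouping factors define the same sets regardless of how the sets are labeled. The example factors are made up and only mimic the two labeling styles shown above.

```r
# Hypothetical helper: TRUE when two groupings split the units into the
# same sets, ignoring the set labels.
same_partition <- function(f1, f2) {
  # Re-create plain factors so unused levels are dropped.
  f1 <- factor(as.character(f1))
  f2 <- factor(as.character(f2))
  tab <- table(f1, f2)
  # Identical partitions: every row and every column of the cross-tabulation
  # has exactly one non-empty cell.
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

# Made-up groupings that differ only in their labels.
opt_like <- factor(c("m.1", "m.1", "m.10", "m.10", "m.2"))
mit_like <- factor(c("1", "1", "2", "2", "3"))
agree <- same_partition(opt_like, mit_like)

# A grouping that moves the second unit into a different set.
mit_moved <- factor(c("1", "2", "2", "2", "3"))
disagree <- same_partition(opt_like, mit_moved)
```

Note that table() drops missing values by default, so units left unmatched by either method would be ignored by this check.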

Now that I have a factor listing the groups, I can run xBalance to assess the balance properties of the match:

> xBalance(pr ~ . - (cost + pr), data = nuclearplants, strata = match.mit, report = "chisquare.test")
---Overall Test---
      chisquare df p.value
strat       5.1  9    0.82
---
Signif. codes:  0 ‘***’ 0.001 ‘** ’ 0.01 ‘* ’ 0.05 ‘. ’ 0.1 ‘ ’ 1

With a reported p-value of 0.82, there is little evidence against the null of balance, so we would fail to reject it.
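To give a sense of what "balance within matched sets" means for a single covariate, here is a small base R sketch that computes a treated-minus-control difference in means inside each set and then combines the sets with weights. It is an illustration only, not the xBalance implementation: the `toy` data are made up, and the harmonic-style weights are an assumption chosen for this sketch.

```r
# Hypothetical toy data: two matched sets, treatment indicator z, covariate x.
toy <- data.frame(
  set = factor(c("a", "a", "a", "b", "b", "b", "b")),
  z   = c(1, 0, 0, 1, 1, 0, 0),
  x   = c(10, 8, 6, 5, 7, 4, 4)
)

# Within each set: difference in covariate means and a weight.
per_set <- lapply(split(toy, toy$set), function(d) {
  n_t <- sum(d$z == 1)
  n_c <- sum(d$z == 0)
  c(diff   = mean(d$x[d$z == 1]) - mean(d$x[d$z == 0]),
    # Assumed weighting: larger when a set has more treated and more controls.
    weight = 1 / (1 / n_t + 1 / n_c))
})
per_set <- do.call(rbind, per_set)

# Weighted average of the within-set differences.
adj_diff <- sum(per_set[, "weight"] * per_set[, "diff"]) /
  sum(per_set[, "weight"])

per_set
adj_diff
```

A combined difference near zero, relative to the covariate's spread, is what good balance on that covariate looks like; the overall chi-square test summarizes such differences across all covariates at once.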

This walkthrough used the “full” method for matchit(), but the same techniques will work with other matchit() methods, such as coarsened exact matching or nearest neighbor. If you are reasonably confident that you wish to use optimal matching, you should consider using the optmatch package directly instead of going through MatchIt. In future posts I will demonstrate important techniques to speed up the matching process (which can be a great benefit with large datasets) and show how to create matches that incorporate more subject matter information than can be included in a simple logit model.
