# Using xBalance with MatchIt

August 1, 2010

In a previous post, I demonstrated how to create a propensity score match, test balance, and analyze the outcome variable using the `optmatch` and `RItools` packages. The same strategy can be used with other matching algorithms, for example the various methods included in the `MatchIt` package.

I’ll use the same basic question and data from my previous article. The `MatchIt` package wraps `optmatch` to provide its “full” and “optimal” matching methods, so I will use the “full” option to maintain consistency with my previous article. The first step is loading the packages and the data:

```
> library(MatchIt)
> library(optmatch)
> library(RItools)
> data(nuclearplants)
```

The interface for `MatchIt` is similar to `optmatch` for propensity score matches, except that the `matchit()` function compresses the process into a single step of specifying the propensity formula and producing the match, while `fullmatch()` allows a user to specify any number of distance matrices. As with the previous article, I match on a subset of the covariates.

```
> example.formula <- formula(pr ~ t1 + t2 + cap)
> match.opt <- fullmatch(mdist(glm(example.formula, data = nuclearplants, family = binomial())))
> all.mit <- matchit(example.formula, data = nuclearplants, method = "full")
```
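Both calls start from the same propensity model. As a minimal sketch of that first step on its own, the block below fits the logistic regression described by the formula and extracts the estimated propensity scores. The `toy` data frame is a hypothetical, simulated stand-in for `nuclearplants` (only the column names are borrowed), so the block runs without any matching package installed.

```r
# Simulated stand-in for nuclearplants: the values are made up,
# only the column names (pr, t1, t2, cap) follow the article.
set.seed(1)
toy <- data.frame(
  pr  = rep(c(0, 1), times = c(20, 10)),   # treatment indicator
  t1  = rnorm(30, mean = 14),
  t2  = rnorm(30, mean = 60),
  cap = rnorm(30, mean = 800, sd = 100)
)

# The propensity model: logistic regression of treatment on covariates.
ps.model <- glm(pr ~ t1 + t2 + cap, data = toy, family = binomial())

# Estimated propensity scores, one per row, each strictly between 0 and 1.
ps <- fitted(ps.model)
summary(ps)
```

In the `optmatch` route this fitted model is passed on explicitly to build a distance, while `matchit()` takes the formula and handles the fitting itself.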

The `all.mit` object contains (among other items) a vector indicating each observation’s matched set. For compatibility, save it as a factor:

```
> match.mit <- as.factor(all.mit$subclass)
```

Unsurprisingly, since `MatchIt` uses `optmatch`, the two matches are identical: the set labels differ, but the groupings are the same.

```
> lapply(split(nuclearplants, match.opt), rownames)
$m.1
[1] "N" "Z" "a"

$m.10
[1] "I" "G"

$m.2
[1] "A" "B" "D" "V" "F" "b"

$m.5
[1] "U" "c"

$m.6
 [1] "H" "K" "L" "M" "C" "P" "R" "Y" "e" "f"

$m.8
[1] "J" "O" "Q" "S" "T" "E" "W" "X" "d"

> lapply(split(nuclearplants, match.mit), rownames)
$`1`
[1] "N" "Z" "a"

$`2`
[1] "I" "G"

$`3`
[1] "A" "B" "D" "V" "F" "b"

$`4`
[1] "U" "c"

$`5`
 [1] "H" "K" "L" "M" "C" "P" "R" "Y" "e" "f"

$`6`
[1] "J" "O" "Q" "S" "T" "E" "W" "X" "d"
```
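Comparing the two listings by eye works for 32 plants, but a programmatic check scales better. The following is a sketch only: `same.partition()` is a hypothetical helper, not part of `optmatch` or `MatchIt`, and the labels are toy values rather than the real match objects. It cross-tabulates the two labelings and checks that each set in one corresponds to exactly one set in the other.

```r
# Hypothetical helper: TRUE when two labelings define the same groups,
# regardless of how the groups are named.
same.partition <- function(a, b) {
  # Cross-tabulate on character labels so unused factor levels are ignored.
  # Note: table() drops units with an NA label in either vector.
  tab <- table(as.character(a), as.character(b))
  # Same groupings: every label in `a` pairs with exactly one label in `b`,
  # and every label in `b` pairs with exactly one label in `a`.
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

# Toy labels (not the nuclearplants data): same groups, different names.
opt.style <- factor(c("m.1", "m.1", "m.10", "m.10", "m.2"))
mit.style <- factor(c("1", "1", "2", "2", "3"))
same.partition(opt.style, mit.style)

# Moving one unit into a different set breaks the equivalence.
moved <- factor(c("1", "2", "2", "2", "3"))
same.partition(opt.style, moved)
```

Assuming both match objects behave like ordinary factors, the corresponding call here would be `same.partition(match.opt, match.mit)`.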

Now that I have a factor listing the groups, I can run `xBalance` to assess the balance properties of the match:

```
> xBalance(pr ~ . - (cost + pr), data = nuclearplants, strata = match.mit, report = "chisquare.test")
---Overall Test---
      chisquare df p.value
strat       5.1  9    0.82
---
Signif. codes:  0 ‘***’ 0.001 ‘** ’ 0.01 ‘* ’ 0.05 ‘. ’ 0.1 ‘ ’ 1
```

With a reported p-value of 0.82, there is little evidence against the null of balance, so we would fail to reject it.
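To see where that number comes from, the p-value can be approximated from the other two columns of the output as the upper-tail probability of a chi-square distribution. This is a sketch that assumes the overall test uses the standard chi-square reference distribution; because the printed statistic is rounded to 5.1, the result should be close to, though not necessarily equal to, the reported 0.82.

```r
# Values copied from the xBalance output above; the statistic is rounded,
# so this only approximates the p-value that xBalance printed.
chisq.stat <- 5.1
df <- 9

# Upper-tail probability: the chance of a statistic at least this large
# if the covariates are balanced within matched sets.
p.value <- pchisq(chisq.stat, df = df, lower.tail = FALSE)
round(p.value, 2)
```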

This walkthrough used the “full” method for `matchit()`, but the same techniques will work with other `matchit()` methods, such as coarsened exact matching or nearest neighbor. If you are reasonably confident that you wish to use optimal matching, you should consider using the `optmatch` package directly, instead of using it through `MatchIt`. In future posts I will demonstrate important techniques to speed up the matching process (which can be a great benefit with large datasets) and how you can create matches that incorporate more subject matter information than can be included in a simple logit model.
