R Tip: Use match_order() to Align Data

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers.]

R tip. Use wrapr::match_order() to align data.

Suppose we have data in two data frames, and both data frames have a common row-identifying column called “idx”.

library("wrapr")

d1 <- build_frame(
   "idx", "x" |
   3    , "a" |
   1    , "b" |
   2    , "c" )

d2 <- build_frame(
   "idx", "y" |
   2    , "D" |
   1    , "E" |
   3    , "F" )

print(d1)
#>   idx x
#> 1   3 a
#> 2   1 b
#> 3   2 c

print(d2)
#>   idx y
#> 1   2 D
#> 2   1 E
#> 3   3 F

(Please see R Tip: Think in Terms of Values for build_frame() and other value capturing tools.)

Often we wish to work with such data aligned so that each row of d2 has the same idx value as the row in the same position of d1. This is an important data wrangling task, so there are many ways to achieve it in R, such as base::merge(), dplyr::left_join(), or sorting both tables into the same order and then using base::cbind().
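As a small illustration of the join route, here is a sketch using base::merge(). The frames are rebuilt with base data.frame() only so the snippet runs without wrapr; the variable name m is just for this example. Note that merge() sorts the result on the key column by default (sort = TRUE), so the rows come back in idx order rather than in d1's order.

```r
# Rebuild the example frames in base R (same values as above).
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(idx = c(2, 1, 3), y = c("D", "E", "F"),
                 stringsAsFactors = FALSE)

# Join on the shared key; by default the result is sorted by idx.
m <- merge(d1, d2, by = "idx")

print(m)
#>   idx x y
#> 1   1 b E
#> 2   2 c D
#> 3   3 a F
```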

However, if you wish to preserve the order of the first table (which may not be sorted), you need one more trick.

You can add a row-id column, sort by the joining id, combine, and then re-sort by the row-id column.
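The row-id approach might look something like the following base-R sketch. The column name row_id and the intermediate names d1s, d2s, and combined are hypothetical choices for this example, and the code assumes each idx value appears exactly once in each table.

```r
# Rebuild the example frames in base R (same values as above).
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(idx = c(2, 1, 3), y = c("D", "E", "F"),
                 stringsAsFactors = FALSE)

# 1. Record d1's original row order (row_id is a hypothetical helper column).
d1$row_id <- seq_len(nrow(d1))

# 2. Sort both tables by the joining id.
d1s <- d1[order(d1$idx), , drop = FALSE]
d2s <- d2[order(d2$idx), , drop = FALSE]

# 3. Combine side by side, after checking the keys really line up.
stopifnot(identical(d1s$idx, d2s$idx))
combined <- cbind(d1s, d2s["y"])

# 4. Re-sort into d1's original order and drop the helper column.
combined <- combined[order(combined$row_id), , drop = FALSE]
combined$row_id <- NULL
rownames(combined) <- NULL

print(combined)
#>   idx x y
#> 1   3 a F
#> 2   1 b E
#> 3   2 c D
```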

Or you can match the orders in one step using wrapr::match_order().

p <- match_order(d2$idx, d1$idx)

print(d2[p, , drop=FALSE])
#>   idx y
#> 3   3 F
#> 2   1 E
#> 1   2 D

match_order() merely wraps the sort and re-sort tricks we mentioned above; the theory, however, is based on the absolute magic of associative array indexing.
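To make the idea concrete, here is a base-R sketch of three ways to compute such a permutation by hand. This is an illustration of the concept, not wrapr's actual implementation; the names pos, p_lookup, p_match, and p_sort are made up for the example, and the code assumes the ids are unique and that both vectors contain the same set of ids.

```r
ids_to_permute <- c(2, 1, 3)   # plays the role of d2$idx
ids_to_match   <- c(3, 1, 2)   # plays the role of d1$idx

# Associative indexing: a named vector maps each id to its position,
# and we then look positions up by name in the order we want.
pos <- seq_along(ids_to_permute)
names(pos) <- as.character(ids_to_permute)
p_lookup <- unname(pos[as.character(ids_to_match)])

# base::match() performs the same lookup directly.
p_match <- match(ids_to_match, ids_to_permute)

# Sort-based version: sort ids_to_permute, then undo the sort
# that ids_to_match would need (order(order(.)) gives ranks).
p_sort <- order(ids_to_permute)[order(order(ids_to_match))]

print(p_lookup)
#> [1] 3 2 1

# All three agree, and applying the permutation aligns the ids.
stopifnot(identical(p_lookup, p_match), identical(p_lookup, p_sort))
stopifnot(all(ids_to_permute[p_lookup] == ids_to_match))
```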

Please see R Tip: Use drop = FALSE with data.frames for why one should get in the habit of writing drop = FALSE.
