outersect(): The opposite of R’s intersect() function

November 29, 2011

[This article was first published on Consistently Infrequent » R, and kindly contributed to R-bloggers.]

The Objective

To find the non-duplicated elements between two or more vectors, i.e. the elements that appear in only one of them (the ‘yellow’ sections of the diagram above).

The Problem

I needed the opposite of R’s intersect() function, an “outersect()”. The closest I found was setdiff(), but its result depends on the order of the input vectors, e.g.

x <- letters[1:3]
#[1] "a" "b" "c"
y <- letters[2:4]
#[1] "b" "c" "d"

# The desired result is
# [1] "a" "d"

setdiff(x, y)
#[1] "a"

setdiff(y, x)
#[1] "d"

setdiff() returns the (unique) elements of the first input vector that do not appear in the second, so it is asymmetric. Not quite what I’m after. I’m looking for the ‘yellow’ set of elements as in the picture at the top of the page.

The Solution

Concatenating the results of setdiff() with the input vectors in both orders works a treat:

outersect <- function(x, y) {
  sort(c(setdiff(x, y),
         setdiff(y, x)))
}

x <- letters[1:3]
#[1] "a" "b" "c"
y <- letters[2:4]
#[1] "b" "c" "d"

outersect(x, y)
#[1] "a" "d"

outersect(y, x)
#[1] "a" "d"
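
As a cross-check, the same result can be phrased with base R's union() and intersect(): everything that is in either vector, minus whatever is in both. This is only a sketch, and the name outersect2 is made up here for illustration rather than taken from the post:

```r
# Hypothetical alternative: (x union y) minus (x intersect y)
outersect2 <- function(x, y) {
  sort(setdiff(union(x, y), intersect(x, y)))
}

x <- letters[1:3]
y <- letters[2:4]

outersect2(x, y)
#[1] "a" "d"

# Argument order doesn't matter
identical(outersect2(x, y), outersect2(y, x))
#[1] TRUE
```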

Alternative solution

An alternative, equivalent as long as neither vector contains repeated values, would be to use

outersect <- function(x, y) {
  sort(c(x[!x %in% y],
         y[!y %in% x]))
}

but I think using setdiff(), as in the first solution, makes it easier to read.
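
One caveat worth checking: setdiff() de-duplicates its result while %in% indexing does not, so the two versions can disagree when an input contains repeated values. A small sketch, with the two functions renamed (outersect_setdiff and outersect_in are names invented here) so they can sit side by side:

```r
# The two-vector versions from above, under illustrative names
outersect_setdiff <- function(x, y) {
  sort(c(setdiff(x, y), setdiff(y, x)))
}
outersect_in <- function(x, y) {
  sort(c(x[!x %in% y], y[!y %in% x]))
}

x <- c("a", "a", "b")
y <- c("b", "c")

outersect_setdiff(x, y)   # setdiff() drops the repeated "a"
#[1] "a" "c"

outersect_in(x, y)        # %in% indexing keeps it
#[1] "a" "a" "c"
```

Which behaviour is preferable depends on whether the inputs are meant to be treated as sets or as plain vectors.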

Further Development

It would be nice to extend this to a variable number of input vectors, returning the elements that appear in only one of them. This final task turns out to be rather simple:

outersect <- function(x, y, ...) {
  big.vec <- c(x, y, ...)
  duplicates <- big.vec[duplicated(big.vec)]
  setdiff(big.vec, unique(duplicates))
}

# desired result is c(1, 2, 3, 6, 9, 10)
outersect(1:5, 4:8, 7:10)
#[1]  1  2  3  6  9 10

Awesome.
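
One thing to watch with the variadic version: all inputs are pooled before duplicated() is called, so a value repeated inside a single vector is dropped even if no other vector contains it. If that isn't wanted, one possible tweak is to de-duplicate each input first. This is a sketch only, and the name outersect_unique is invented here:

```r
# Hypothetical variant: de-duplicate each input before pooling, so only
# values shared between different vectors are removed
outersect_unique <- function(x, y, ...) {
  big.vec <- unlist(lapply(list(x, y, ...), unique))
  duplicates <- big.vec[duplicated(big.vec)]
  setdiff(big.vec, unique(duplicates))
}

outersect_unique(1:5, 4:8, 7:10)
#[1]  1  2  3  6  9 10

outersect_unique(c(1, 1, 2), c(2, 3))   # the repeated 1 survives
#[1] 1 3
```

For the second call, the pooled version above would return only 3, since it treats the repeated 1 as a duplicate.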
