R Function Binding Vectors and Matrices of Variable Length, bug fixed

August 19, 2011
By

(This article was first published on Freigeist Blog - Jakobsweg, Pilgern und London » R, and kindly contributed to R-bloggers)

Now this is something very geeky, but useful. I had to bind two matrices or vectors together to become a bigger matrix. However, they need not have the same number of rows or even the same row names.

The standard cbind() functions require the vectors or matrices to be compatible. The matching is “stupid”, in the sense that it ignores any order or assumes that the elements which are to be joined into a matrix have the same row names. This, of course, need not be the case. A classical merge command would fail here, as we dont really know what to merge by and what to merge on.

Ok… I am not being clear here. Suppose you want to merge two vectors

A 2
B 4
C 3

and
G 2
B 1
C 3
E 1

now the resulting matrix should be
A  2  NA
B  4  1
C  3  3
E NA  1
G NA  2

Now the following Rfunction allows you to do this. It is important however, that you assign rownames to the objects to be merged (the A,B,C,E,G in the example), as it does matching on these.
cbindM <-
function(A, v, repl=NA) {

  dif <- setdiff(union(rownames(A),rownames(v)),intersect(rownames(A),rownames(v)))
#if names is the same, then a simple cbind will do
    if(length(dif)==0) {

      A<- cbind(A,v[match(rownames(A),rownames(v))])

        rownames(A) <- rownames(v)

    }    else if(length(dif)>0) {#sets are not equal, so either matrix is longer / shorter

#this tells us which elements in dif are part of A (and of v) respectively      
      for(i in dif)     {

        if(is.element(i,rownames(A))) {
#element is in A but not in v, so add it to v and then a        

          temp<-matrix(data =repl, nrow = 1, ncol = ncol(v), byrow = FALSE, dimnames =list(i))
            v <- rbind(v,temp)

        } else {
# element is in v but not in A, so add it to A

          temp<-matrix(data = repl, nrow = 1, ncol = ncol(A), byrow = FALSE, dimnames =list(i))
            A<-rbind(A,temp)
        }
      }


      A<-cbind(A,v[match(rownames(A),rownames(v))])

    }

  A
}

Note: 09.11.2011: I fixed a bug and added a bit more functionality. You can now tell it, with what you want the missing data to be replaced. Its standard to replace it with NA but you could change it to anything you want.

To leave a comment for the author, please follow the link and comment on his blog: Freigeist Blog - Jakobsweg, Pilgern und London » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.