Subset and Fill via an Index

[This article was first published on R – Jason unedited, and kindly contributed to R-bloggers.]

I had a very large problem (70,000+ columns) that I needed to reduce. A function took two matrices and transformed them into a single vector with one element per input column. I needed to reduce the inputs and then map the output back to the original positions of the corresponding columns. This entry may seem obvious to veteran R users; I am mainly writing it as a reference for myself. Here is a small example of what I needed.

time <- c(1, 1, 2, 2, 3, 3)
money <- c(2, 2, 4, 4, 6, 6)
ownership <- c(1, 0, 1, 0, 1, 0)
mat <- rbind(time, money, ownership)
print(mat)

##           [,1] [,2] [,3] [,4] [,5] [,6]
## time         1    1    2    2    3    3
## money        2    2    4    4    6    6
## ownership    1    0    1    0    1    0

dat <- c(1, 0, 2, 0, 3, 0)  
obj <- matrix(dat,nrow=1)
print(obj)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    0    2    0    3    0

f <- function(mat,obj){
  #generic function with output of the same number of columns as obj
}
soln <- f(mat, obj)

where soln is a 1×6 matrix.
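The real f is not shown in this post, so to make the example runnable, here is a minimal sketch with a stand-in. The name f_demo and its arithmetic (column sums scaled by obj) are assumptions for illustration only; the one property that matters is that the output has one column per input column.

```r
# Hypothetical stand-in for f: NOT the function from the original problem.
# It returns a 1 x ncol(obj) matrix, one value per input column.
f_demo <- function(mat, obj) {
  stopifnot(ncol(mat) == ncol(obj))
  matrix(colSums(mat) * obj[1, ], nrow = 1)
}

time <- c(1, 1, 2, 2, 3, 3)
money <- c(2, 2, 4, 4, 6, 6)
ownership <- c(1, 0, 1, 0, 1, 0)
mat <- rbind(time, money, ownership)
obj <- matrix(c(1, 0, 2, 0, 3, 0), nrow = 1)

soln <- f_demo(mat, obj)
dim(soln)  # one row, six columns
```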

The size of my problem made the function f extremely slow and unreliable. I needed a way to reduce the inputs and then map the output appropriately to a results matrix.

The matrices mat and obj reduce to:

time <- c(1, 2, 3)
money <- c(2, 4, 6)
ownership <- c(1, 1, 1)
mat <- rbind(time, money, ownership) 
print(mat)

##           [,1] [,2] [,3]
## time         1    2    3
## money        2    4    6
## ownership    1    1    1

dat <- c(1, 2, 3)  
obj <- matrix(dat,nrow=1)
print(obj)

##      [,1] [,2] [,3]
## [1,]    1    2    3

soln <- f(mat, obj)

with soln being a 1×3 matrix. For example:

##          [,1]     [,2]     [,3]
## [1,] 4.151969 5.759826 5.537563

A column is excluded from mat when its value in the ownership row is 0, and the same column is excluded from obj. The added difficulty is that I need to assign the output in soln to the corresponding original positions in a larger SOLN matrix, in this case columns 1, 3, and 5. Ownership is randomly assigned, so there is no pattern to rely on other than the zeros described above.
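The positions of the columns to keep can be read straight off the ownership vector. A small sketch; the names keep_lgl and keep_idx are illustrative and not part of the original problem:

```r
ownership <- c(1, 0, 1, 0, 1, 0)

keep_lgl <- as.logical(ownership)  # TRUE where ownership is non-zero
keep_idx <- which(keep_lgl)        # integer positions of the kept columns

print(keep_idx)
## [1] 1 3 5
```

Either form works as a column index; the logical vector has the advantage that it lines up one-to-one with the original columns.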

time <- c(1, 1, 2, 2, 3, 3)
money <- c(2, 2, 4, 4, 6, 6)
ownership <- c(1, 0, 1, 0, 1, 0)
mat <- rbind(time, money, ownership)
print(mat)

##           [,1] [,2] [,3] [,4] [,5] [,6]
## time         1    1    2    2    3    3
## money        2    2    4    4    6    6
## ownership    1    0    1    0    1    0

dat <- c(1, 0, 2, 0, 3, 0)  
obj <- matrix(dat,nrow=1)
print(obj)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    0    2    0    3    0

obj2 <- obj[, as.logical(ownership), drop = FALSE]
print(obj2)

##      [,1] [,2] [,3]
## [1,]    1    2    3

mat2 <- mat[, as.logical(ownership), drop = FALSE]
print(mat2)

##           [,1] [,2] [,3]
## time         1    2    3
## money        2    4    6
## ownership    1    1    1

Calling the function on the reduced inputs:

soln <- f(mat2, obj2)

with soln being a 1×3 matrix. For example:

##          [,1]     [,2]     [,3]
## [1,] 4.151969 5.759826 5.537563
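One thing to watch in the subsetting step: the drop = FALSE used for obj2 matters whenever only a single column survives the filter, because R otherwise collapses the result to a plain vector. A small sketch with made-up values in which just one column has non-zero ownership:

```r
# Made-up data: only the first column has non-zero ownership
mat <- rbind(time = c(1, 1, 2), money = c(2, 2, 4), ownership = c(1, 0, 0))
keep <- as.logical(mat["ownership", ])

is.matrix(mat[, keep])                # FALSE: the single column collapsed to a vector
is.matrix(mat[, keep, drop = FALSE])  # TRUE: still a 3 x 1 matrix
```

A function that calls ncol() or indexes by column on its inputs would fail on the collapsed version, so keeping drop = FALSE on both subsets is the safer habit.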

I need to create a result space:

SOLN <- matrix(data=NA,nrow=1,ncol=6)
print(SOLN)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]   NA   NA   NA   NA   NA   NA

Then map the results from soln to columns 1, 3, and 5.

SOLN[, as.logical(ownership)] <- soln
print(SOLN)

##          [,1] [,2]     [,3] [,4]     [,5] [,6]
## [1,] 4.151969   NA 5.759826   NA 5.537563   NA

A much more elegant solution than the for loops I was trying to write!
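The whole pattern (subset, evaluate, fill) can be wrapped in one helper. This is a sketch under assumptions: subset_and_fill and f_demo are hypothetical names, and f_demo is a stand-in for the real function, which is not shown in this post.

```r
# Hypothetical helper: run f on the kept columns only, then write the
# result back into a full-width matrix, leaving NA everywhere else.
subset_and_fill <- function(f, mat, obj, keep) {
  keep <- as.logical(keep)
  out <- matrix(NA_real_, nrow = 1, ncol = ncol(obj))
  if (any(keep)) {
    out[, keep] <- f(mat[, keep, drop = FALSE], obj[, keep, drop = FALSE])
  }
  out
}

# Stand-in for f, for illustration only
f_demo <- function(mat, obj) matrix(colSums(mat) * obj[1, ], nrow = 1)

ownership <- c(1, 0, 1, 0, 1, 0)
mat <- rbind(time = c(1, 1, 2, 2, 3, 3),
             money = c(2, 2, 4, 4, 6, 6),
             ownership = ownership)
obj <- matrix(c(1, 0, 2, 0, 3, 0), nrow = 1)

SOLN <- subset_and_fill(f_demo, mat, obj, ownership)
print(SOLN)  # columns 1, 3, 5 hold 4, 14, 30; columns 2, 4, 6 stay NA
```

The any(keep) guard covers the case where every column is filtered out, so the helper returns an all-NA row instead of calling f on an empty matrix.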
