Moving columns using basic English prepositions!
Moveme!
I recently worked with a dataset that had over 100 columns and had to keep reordering the columns so that I could conduct my analysis more easily. For example, whenever you try to conduct a multiple-factor analysis (FactoMineR::MFA), the function requires your variables to be grouped in a specific order. This meant that after feature engineering, I was left with the problem of having to reorder my columns so that the analysis could be run. By now you can guess the problem statement… how the heck was I supposed to move 100 columns to specific places in the data set and do so in a clean, easy-to-read format?
If you are a regular user of tidyverse packages, you should be VERY familiar with the following code:
library(dplyr)

iris %>%
  select(Species, everything()) %>%
  head()
##   Species Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1  setosa          5.1         3.5          1.4         0.2
## 2  setosa          4.9         3.0          1.4         0.2
## 3  setosa          4.7         3.2          1.3         0.2
## 4  setosa          4.6         3.1          1.5         0.2
## 5  setosa          5.0         3.6          1.4         0.2
## 6  setosa          5.4         3.9          1.7         0.4
So, why is this code so familiar to you? Well, it's because you have been using it to change the order of the columns in your data.frame. But what if you didn't want to move columns only to the front or back, but rather after certain columns, between two different columns, etc.? Imagine you could tell your data.frame to please (because #rstats people are polite) move column A just after B, move C to the front, and column G after F.
Well, thanks to the wonderful world of Stack Overflow, such a function exists if you know where to look! The original code is credited to user A5C1D2H2I1M1N2O1R2T1, who answered a question on moving columns within a data frame without retyping. So if we take that code and sprinkle a tiny bit of magic, we too could integrate this into our tidy workflow:
moveme <- function(df, movecommand) {
  invec <- names(df)
  # Split the command string on ";" into commands, then on commas/whitespace into tokens
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], ",|\\s+"),
                        function(x) x[x != ""])
  # For each command, separate the columns to move from the target position
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  # Apply the commands in order, each one to the result of the previous one
  for (i in seq_along(movelist)) {
    temp <- setdiff(myVec, movelist[[i]][[1]])
    A <- movelist[[i]][[2]][1]
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]
      if (A == "before") {
        after <- match(ba, temp) - 1
      } else if (A == "after") {
        after <- match(ba, temp)
      }
    } else if (A == "first") {
      after <- 0
    } else if (A == "last") {
      after <- length(myVec)
    }
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  # drop = FALSE keeps a data frame even when df has a single column
  df[, match(myVec, names(df)), drop = FALSE]
}
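To make the parsing step less opaque, here is a minimal sketch that runs the same strsplit() logic on its own, outside the function. The command string and the names tokens, keywords and parsed are made up for illustration; they are not part of the original answer.

```r
# Illustrative only: reproduce the parsing step of moveme() in isolation.
movecommand <- "g first; a, b after f"

# Split on ";" into individual commands, then on commas/whitespace into tokens,
# dropping the empty strings the split leaves behind.
tokens <- lapply(
  strsplit(strsplit(movecommand, ";")[[1]], ",|\\s+"),
  function(x) x[x != ""]
)

# Everything from the keyword onwards says where to move;
# the tokens in front of it are the columns to move.
keywords <- c("before", "after", "first", "last")
parsed <- lapply(tokens, function(x) {
  where <- x[which(x %in% keywords):length(x)]
  list(to_move = setdiff(x, where), where = where)
})

str(parsed)
```

One consequence of this design is that a column that is itself named before, after, first or last would be mistaken for a keyword, so it is worth renaming such columns first.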
To use your new function, simply pipe the data.frame into the moveme function as follows:
a <- b <- c <- d <- e <- f <- g <- 1:100
df <- data.frame(a, b, c, d, e, f, g)
# as_tibble() replaces tbl_df(), which is deprecated in current dplyr
df <- df %>% as_tibble()

# Usage
df %>% moveme(., "g first")
## # A tibble: 100 x 7
##        g     a     b     c     d     e     f
##    <int> <int> <int> <int> <int> <int> <int>
##  1     1     1     1     1     1     1     1
##  2     2     2     2     2     2     2     2
##  3     3     3     3     3     3     3     3
##  4     4     4     4     4     4     4     4
##  5     5     5     5     5     5     5     5
##  6     6     6     6     6     6     6     6
##  7     7     7     7     7     7     7     7
##  8     8     8     8     8     8     8     8
##  9     9     9     9     9     9     9     9
## 10    10    10    10    10    10    10    10
## # ... with 90 more rows
OK, so that isn't that impressive, so let's try stringing multiple move commands into one character string, separating the commands with a semicolon (;):
df %>% moveme(., "g first; a last; e before c")
## # A tibble: 100 x 7
##        g     b     e     c     d     f     a
##    <int> <int> <int> <int> <int> <int> <int>
##  1     1     1     1     1     1     1     1
##  2     2     2     2     2     2     2     2
##  3     3     3     3     3     3     3     3
##  4     4     4     4     4     4     4     4
##  5     5     5     5     5     5     5     5
##  6     6     6     6     6     6     6     6
##  7     7     7     7     7     7     7     7
##  8     8     8     8     8     8     8     8
##  9     9     9     9     9     9     9     9
## 10    10    10    10    10    10    10    10
## # ... with 90 more rows
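To see why the columns end up in that order, the three commands can be traced by hand on a plain vector of names, using the same setdiff() and append() calls that moveme relies on. This is only a sketch: cols and rest are illustrative names, and the "last" step uses length(rest) where the function uses the longer original length, which append() treats the same way.

```r
# Trace "g first; a last; e before c" on the column names alone.
cols <- c("a", "b", "c", "d", "e", "f", "g")

# "g first": drop g, then re-insert it at position 0 (the front).
cols <- append(setdiff(cols, "g"), "g", after = 0)

# "a last": drop a, then re-insert it after the final remaining name.
rest <- setdiff(cols, "a")
cols <- append(rest, "a", after = length(rest))

# "e before c": drop e, then re-insert it just ahead of c.
rest <- setdiff(cols, "e")
cols <- append(rest, "e", after = match("c", rest) - 1)

cols
## [1] "g" "b" "e" "c" "d" "f" "a"
```

Each command works on the result of the previous one, so the order of the commands in the string matters.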
As you can see, this API lets you chain any number of column moves in a single call. The supported keywords are:
- before
- after
- first
- last
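For comparison, dplyr 1.0.0 and later ship a relocate() verb that covers similar ground with .before and .after arguments. The sketch below assumes such a dplyr version is installed; df_demo and moved are illustrative names, and the chain is meant to mirror the "g first; a last; e before c" command rather than replace moveme.

```r
library(dplyr)

# Small stand-in for the seven-column data frame used above (hypothetical name).
df_demo <- data.frame(a = 1:3, b = 1:3, c = 1:3, d = 1:3,
                      e = 1:3, f = 1:3, g = 1:3)

moved <- df_demo %>%
  relocate(g) %>%                        # with no position given, moves to the front
  relocate(a, .after = last_col()) %>%   # "a last"
  relocate(e, .before = c)               # "e before c"

names(moved)
## [1] "g" "b" "e" "c" "d" "f" "a"
```

The trade-off is mostly stylistic: relocate() takes one target position per call, whereas moveme packs several moves into a single string.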
So, once again, thanks to the amazing #rstats community out there!