[This article was first published on S+/R – Yet Another Blog in Statistical Computing, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In R, there are a couple ways to convert the column-oriented data frame to a row-oriented dictionary list or alike, e.g. a list of lists.

In the code snippet below, I would show each approach and how to extract keys and values from the dictionary. As shown in the benchmark, it appears that the generic R data structure is still the most efficient.

```### LIST() FUNCTION IN BASE PACKAGE ###
x1 <- as.list(iris[1, ])
names(x1)
# [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
x1[["Sepal.Length"]]
# [1] 5.1

### ENVIRONMENT-BASED SOLUTION ###
envn_dict <- function(x) {
e <- new.env(hash = TRUE)
for (name in names(x)) assign(name, x[, name], e)
return(e)
}

x2 <- envn_dict(iris[1, ])
ls(x2)
# [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"  "Species"
x2[["Sepal.Length"]]
# [1] 5.1

### COLLECTIONS PACKAGE ###
coll_dict <-  function(x) {
d <- collections::Dict\$new()
for (name in names(x)) d\$set(name, x[, name])
return(d)
}

x3 <- coll_dict(iris[1, ])
x3\$keys()
# [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"  "Species"
x3\$get("Sepal.Length")
# [1] 5.1

### HASH PACKAGE ###
hash_dict <- function(x) {
d <- hash::hash()
for (name in names(x)) d[[name]] <- x[, name]
return(d)
}

x4 <- hash_dict(iris[1, ])
hash::keys(x4)
# [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"  "Species"
hash::values(x4, "Sepal.Length")
# Sepal.Length
#          5.1

### DATASTRUCTURES PACKAGE ###
data_dict <- function(x) {
d <- datastructures::hashmap()
for (name in names(x)) d[name] <- x[, name]
return(d)
}

x5 <- data_dict(iris[1, ])
datastructures::keys(x5)
# [1] "Species"      "Sepal.Width"  "Petal.Length" "Sepal.Length" "Petal.Width"
datastructures::get(x5, "Sepal.Length")
# [1] 5.1

### FROM PYTHON ###
py2r_dict <- function(x) {
return(reticulate::py_dict(names(x), x, TRUE))
}

x6 <- py2r_dict(iris[1, ])
x6\$keys()
# [1] "Petal.Length" "Sepal.Length" "Petal.Width"  "Sepal.Width"  "Species"
x6["Sepal.Length"]
# [1] 5.1

### CONVERT DATAFRAME TO DICTIONARY LIST ###
to_list <- function(df, fn) {
l <- list()
for (i in seq(nrow(df))) l[[i]] <- fn(df[i, ])
return(l)
}

rbenchmark::benchmark(replications = 100, order = "elapsed", relative = "elapsed",
columns = c("test", "replications", "elapsed", "relative", "user.self", "sys.self"),
"BASE::LIST"              = to_list(iris, as.list),
"BASE::ENVIRONMENT"       = to_list(iris, envn_dict),
"COLLECTIONS::DICT"       = to_list(iris, coll_dict),
"HASH::HASH"              = to_list(iris, hash_dict),
"DATASTRUCTURES::HASHMAP" = to_list(iris, data_dict),
"RETICULATE::PY_DICT"     = to_list(iris, py2r_dict)
)
#                     test replications elapsed relative user.self sys.self
#1              BASE::LIST          100   0.857    1.000     0.857    0.000
#2       BASE::ENVIRONMENT          100   1.607    1.875     1.607    0.000
#4              HASH::HASH          100   2.600    3.034     2.600    0.000
#3       COLLECTIONS::DICT          100   2.956    3.449     2.956    0.000
#5 DATASTRUCTURES::HASHMAP          100  16.070   18.751    16.071    0.000
#6     RETICULATE::PY_DICT          100  18.030   21.039    18.023    0.008
```

To leave a comment for the author, please follow the link and comment on their blog: S+/R – Yet Another Blog in Statistical Computing.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts.(You will not see this message again.)

Click here to close (This popup will not appear again)