Creating a data frame using the structure() function in R

May 27, 2019
[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers.]

The structure() function is a simple yet powerful function that returns a given object with the given attributes set. It is part of base R, so there is no need to load any additional package. Since the function was already part of the S language, it has been in the base library since the earliest versions of R, which makes code that uses it portable across R versions.

Example:

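# Describe the finished data frame in one call.
dd <- structure(list(
         year = c(2001, 2002, 2004, 2006)
        ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
        ,.Names = c("year", "length of days")  # replaces the names given in list()
        ,row.names = c(NA, -4L)                # compact form for row names 1..4
        ,class = "data.frame")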
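To see what the call above does, it can help to build the same kind of object by hand. The sketch below sets the attributes one at a time and compares the result with a single structure() call. The names `by_hand` and `in_one_call` are illustrative, and the data is shortened to two rows.

```r
# Build a plain list, then attach the data.frame attributes one by one.
by_hand <- list(year = c(2001, 2002), length_days = c(366.3240, 365.4124))
attr(by_hand, "row.names") <- c(NA, -2L)   # compact form for row names 1..2
attr(by_hand, "class") <- "data.frame"

# The same object, described in a single structure() call.
in_one_call <- structure(
  list(year = c(2001, 2002), length_days = c(366.3240, 365.4124)),
  row.names = c(NA, -2L),
  class = "data.frame"
)

# Compare the two objects and confirm the result is a data frame.
identical(by_hand, in_one_call)
is.data.frame(in_one_call)
```

In other words, structure() is a compact way of writing the attribute assignments that would otherwise follow the creation of the list.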

Any object passed to structure(), whether homogeneous (vector, matrix) or heterogeneous (list, data.frame), can carry additional metadata stored as attributes. For example, a simple vector with an attached comment:

just_vector <- structure(1:10, comment = "This is my simple vector with info")

Calling the attributes() function on it:

attributes(just_vector)

returns the stored information:

$comment
[1] "This is my simple vector with info"
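The attribute travels with the object but does not change its values. A small sketch, reusing the vector defined above:

```r
just_vector <- structure(1:10, comment = "This is my simple vector with info")

# The data itself is untouched: arithmetic works as on any integer vector.
sum(just_vector)

# A single attribute can be read by name.
attr(just_vector, "comment")

# Printing shows only the values; "comment" is an attribute that
# print() does not display.
print(just_vector)
```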

In one go

Suppose you want to create a complete (S3) object in one step. The usual way to build a data.frame (a heterogeneous structure) with a factor column takes several steps:

year <- c(1999, 2002, 2005, 2008)
pollution <- c(346.82, 134.308821199349, 130.430379885892, 88.275457392443)
dd2 <- data.frame(year, pollution)
dd2$year <- as.factor(dd2$year)

With structure(), the same kind of object can be described in a single call:

dd <- structure(list( 
   year = as.factor(c(2001, 2002, 2004, 2006))
  ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234)) 
  ,.Names = c("year", "length of days") 
  ,row.names = c(NA, -4L) 
  ,class = "data.frame")
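As a quick check, the two routes should produce equivalent objects. The sketch below uses the pollution data from the multi-step version, rounded for brevity, and compares the results. The names `via_steps` and `via_structure` are illustrative.

```r
# Multi-step route: build the data frame, then convert the column.
year <- c(1999, 2002, 2005, 2008)
pollution <- c(346.82, 134.31, 130.43, 88.28)
via_steps <- data.frame(year, pollution)
via_steps$year <- as.factor(via_steps$year)

# One-step route: describe the finished object directly.
via_structure <- structure(
  list(year = as.factor(c(1999, 2002, 2005, 2008)),
       pollution = c(346.82, 134.31, 130.43, 88.28)),
  row.names = c(NA, -4L),
  class = "data.frame"
)

# Compare the two data frames and inspect the column types.
all.equal(via_steps, via_structure)
str(via_structure)
```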

 

The structure() function is useful in cases such as:

  • when creating a small data set within a Jupyter notebook (using Markdown)
  • when creating data sets within R code demos/examples (without using external CSV / TXT / JSON files)
  • when describing a given object with mixed data types (e.g., a data frame) and preparing it for data import
  • when creating many R environments, each with its own independent data set
  • for persisting data
  • and many more…
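For the demo and notebook cases in particular, the structure() call rarely has to be typed by hand: base R's dput() writes one for an existing object. A minimal sketch with a made-up two-row data frame (`demo_df` and its values are placeholders):

```r
# A small, made-up data frame standing in for real demo data.
demo_df <- data.frame(year = c(2001L, 2002L), days = c(366, 365))

# dput() prints R code, here a structure() call, that recreates the object.
dput(demo_df)

# Round trip: deparse to text, parse it back, and compare with the original.
code_text <- paste(deparse(demo_df), collapse = "")
rebuilt <- eval(parse(text = code_text))
all.equal(demo_df, rebuilt)
```

The printed call can be pasted straight into a script or notebook cell, so the example no longer depends on an external file.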

Constructing a data frame with additional attributes and comments:

dd3 <- structure(list(
   v1 = as.factor(c(2001, 2002, 2004, 2006))
  ,v2 = I(c(2001, 2002, 2004, 2006))          # I() marks the column as "AsIs"
  ,v3 = ordered(c(2001, 2002, 2004, 2006))
  ,v4 = as.double(c(366.3240, 365.4124, 366.5323423, 364.9573234)))
  ,.Names = c("year", "AsIs Year","yearO", "length of days")  # replaces v1..v4
  ,.typeOf = c("factor", "numeric", "ordered","numeric")      # stored as a plain custom attribute
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,comment = "Ordered YearO for categorical analysis and other variables")

Lists can also be nested within lists, and the original data set can even be preserved as a sub-list that is hidden from the data frame.
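One way to read this, sketched under assumptions: the raw input is attached as a custom attribute, so it stays with the data frame without appearing as a column. The attribute name `source_data` and all values below are hypothetical.

```r
# Raw input kept as a nested list (hypothetical values).
raw <- list(
  meta = list(origin = "manual entry"),
  values = list(year = c(2001, 2002), days = c(366.3240, 365.4124))
)

dd_nested <- structure(
  list(year = as.factor(raw$values$year), days = raw$values$days),
  row.names = c(NA, -2L),
  class = "data.frame",
  source_data = raw          # custom attribute, not a column
)

names(dd_nested)                             # lists only the two columns
attr(dd_nested, "source_data")$meta$origin   # the nested list is still reachable
```

Custom attributes are not guaranteed to survive every operation on a data frame, so it is worth checking for them again after subsetting or merging.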

The comment can be checked with either of:

attributes(dd3)$comment

attr(dd3, which="comment")

 

Both yield the same result:

> attributes(dd3)$comment
[1] "Ordered YearO for categorical analysis and other variables"
> attr(dd3, which="comment")
[1] "Ordered YearO for categorical analysis and other variables"
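Base R also has a dedicated accessor, comment(), which reads and sets this attribute. A minimal sketch on a small stand-in data frame (`dd_small` and its contents are illustrative):

```r
dd_small <- structure(
  list(year = c(2001, 2002)),
  row.names = c(NA, -2L),
  class = "data.frame",
  comment = "Two example years"
)

comment(dd_small)                      # same value as attr(dd_small, "comment")
comment(dd_small) <- "Updated note"    # replacement form sets the attribute
attr(dd_small, which = "comment")
```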

 

This simple yet useful code example is, as always, available on GitHub.

Happy Rrrring! 🙂
