Creating a data frame using the structure() function in R

May 27, 2019
(This article was first published on R – TomazTsql, and kindly contributed to R-bloggers)

The structure() function is a simple yet powerful function that returns a given object with the given attributes set. It is part of base R, so there is no need to load any additional package. Because the function was already part of the S language, it has been in the base library since early versions of R, so code that uses it runs on older and newer versions alike.

Example:

dd <- structure(list( 
         year = c(2001, 2002, 2004, 2006) 
        ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234)) 
        ,.Names = c("year", "length of days") 
        ,row.names = c(NA, -4L) 
        ,class = "data.frame")
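To see what the call above actually produces, the result can be inspected with a few base R functions. This is only a small sketch that repeats the same data; note that the names passed in .Names replace the names given inside list().

```r
# Rebuild the example data frame from a plain list plus attributes
dd <- structure(list(
         year = c(2001, 2002, 2004, 2006)
        ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
        ,.Names = c("year", "length of days")
        ,row.names = c(NA, -4L)
        ,class = "data.frame")

class(dd)    # the class attribute set in the call
names(dd)    # the names given in .Names, not the names used inside list()
nrow(dd)     # number of rows implied by row.names = c(NA, -4L)

# A column name containing spaces has to be quoted with backticks
dd$`length of days`
```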

Every object created with structure(), whether homogeneous (matrix, vector) or heterogeneous (data.frame, list), carries additional metadata stored as attributes. For example, a simple vector can be created with a comment attached:

just_vector <- structure(1:10, comment = "This is my simple vector with info")

Calling the attributes() function on it:

attributes(just_vector)

We get the information back:

$comment
[1] "This is my simple vector with info"
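Put differently, structure() is a shorthand for setting attributes one at a time. The sketch below builds the same vector both ways and compares them; the names v1 and v2 are only illustrative.

```r
# One step: value and attribute in a single call
v1 <- structure(1:10, comment = "This is my simple vector with info")

# Two steps: create the value, then attach the attribute
v2 <- 1:10
attr(v2, "comment") <- "This is my simple vector with info"

# Both objects carry the same value and the same attribute
identical(v1, v2)

# The comment attribute is kept with the object but not shown by print()
print(v1)
```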

In one go

Suppose you want to create an S3 object, such as a data frame, in one step. The usual approach builds the (heterogeneous) data.frame in several steps:

year = c(1999, 2002, 2005, 2008)
pollution = c(346.82,134.308821199349, 130.430379885892, 88.275457392443)
dd2 <- data.frame(year,pollution)
dd2$year <- as.factor(dd2$year)

Using structure(), the same kind of object can be built more compactly, in a single call:

dd <- structure(list( 
   year = as.factor(c(2001, 2002, 2004, 2006))
  ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234)) 
  ,.Names = c("year", "length of days") 
  ,row.names = c(NA, -4L) 
  ,class = "data.frame")
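To check that the two routes really lead to the same object, here is a small comparison using the pollution data from the multi-step example. The names dd_steps and dd_once are only illustrative.

```r
year <- c(1999, 2002, 2005, 2008)
pollution <- c(346.82, 134.308821199349, 130.430379885892, 88.275457392443)

# Several steps: build the data frame, then convert the column
dd_steps <- data.frame(year, pollution)
dd_steps$year <- as.factor(dd_steps$year)

# One step: list of columns plus the data frame attributes
dd_once <- structure(list(
   year = as.factor(year)
  ,pollution = pollution)
  ,row.names = c(NA, -4L)
  ,class = "data.frame")

# Compare values and attributes of the two results
identical(dd_steps, dd_once)
```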

 

The structure() function is useful in cases such as:

  • when creating a small data set within your Jupyter notebook (using Markdown)
  • when creating data sets within your R code demo/example (without relying on external CSV / TXT / JSON files)
  • when describing a given object with mixed data types (e.g., a data frame) and preparing it for data import
  • when creating many R environments, each with its own independent data set
  • for persisting data
  • and many more…
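For the persistence and code-demo cases, base R already writes objects out as structure() calls: dput() produces the text and dget() reads it back. The following is a minimal sketch using a temporary file; the file location is just an assumption for the example.

```r
dd <- structure(list(
   year = c(2001, 2002, 2004, 2006)
  ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
  ,row.names = c(NA, -4L)
  ,class = "data.frame")

# Write the object as R code (a structure() call) to a temporary file
tmp <- tempfile(fileext = ".R")
dput(dd, file = tmp)

# Look at the generated text, then read the object back
cat(readLines(tmp), sep = "\n")
dd_back <- dget(tmp)

# Compare the restored object with the original
all.equal(dd, dd_back)

unlink(tmp)  # clean up the temporary file
```

The text printed by dput() can also be pasted straight into a script or notebook cell, which is handy for small, self-contained examples.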

Constructing a data frame with additional attributes and comments:

dd3 <- structure(list(
   v1 = as.factor(c(2001, 2002, 2004, 2006))   # factor
  ,v2 = I(c(2001, 2002, 2004, 2006))           # kept "as is" (class AsIs)
  ,v3 = ordered(c(2001, 2002, 2004, 2006))     # ordered factor
  ,v4 = as.double(c(366.3240, 365.4124, 366.5323423, 364.9573234)))
  ,.Names = c("year", "AsIs Year","yearO", "length of days")  # replaces v1..v4
  ,.typeOf = c("factor", "numeric", "ordered","numeric")      # custom attribute, descriptive only
  ,row.names = c(NA, -4L)                                     # compact form for rows 1 to 4
  ,class = "data.frame"
  ,comment = "Ordered YearO for categorical analysis and other variables")
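It is worth checking how the columns ended up after such a call. The sketch below rebuilds dd3 and looks at the column classes and at the custom .typeOf attribute, which is simply stored alongside the data and does not change the column types.

```r
dd3 <- structure(list(
   v1 = as.factor(c(2001, 2002, 2004, 2006))
  ,v2 = I(c(2001, 2002, 2004, 2006))
  ,v3 = ordered(c(2001, 2002, 2004, 2006))
  ,v4 = as.double(c(366.3240, 365.4124, 366.5323423, 364.9573234)))
  ,.Names = c("year", "AsIs Year","yearO", "length of days")
  ,.typeOf = c("factor", "numeric", "ordered","numeric")
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,comment = "Ordered YearO for categorical analysis and other variables")

# First class of every column, as R sees it
col_classes <- sapply(dd3, function(col) class(col)[1])
col_classes

# The custom attribute is plain metadata attached to the data frame
attr(dd3, ".typeOf")

# Names of all attributes stored on the object
names(attributes(dd3))
```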

Lists can also be nested within lists; another option is to preserve the original data set as a sub-list that is hidden from the data frame.
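One possible way to do this, shown here only as a sketch, is to keep the untouched input as an extra attribute. The attribute name original and the object name dd_nested are assumptions made for this example, not something base R prescribes.

```r
# The raw input, kept as a plain list
original <- list(
   year = c(2001, 2002, 2004, 2006)
  ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))

# Data frame with a converted column; the raw list travels along as an attribute
dd_nested <- structure(list(
   year = as.factor(original$year)
  ,length_days = original$length_days)
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,original = original)   # hypothetical attribute name

names(dd_nested)                  # only the two columns are part of the data frame
attr(dd_nested, "original")$year  # the untouched numeric years
```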

The comment can then be checked in either of two ways:

attributes(dd3)$comment

attr(dd3, which="comment")

 

Both yield the same result:

> attributes(dd3)$comment
[1] "Ordered YearO for categorical analysis and other variables"
> attr(dd3, which="comment")
[1] "Ordered YearO for categorical analysis and other variables"
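Base R also has a dedicated comment() function for reading and replacing this particular attribute. A short sketch on a small illustrative vector x:

```r
x <- structure(1:3, comment = "three integers")

# Read the comment with the dedicated accessor
comment(x)

# Replace it, then read it back through attr()
comment(x) <- "updated note"
attr(x, which = "comment")
```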

 

As always, this simple yet useful code example is available on GitHub.

Happy Rrrring! 🙂
