# Creating a data frame using the structure() function in R

This article was first published on **R – TomazTsql** and kindly contributed to R-bloggers.


The structure() function is simple yet powerful: it returns a given object with the given attributes set. It is part of base R, so there is no need to load any additional package. Because the function dates back to the S language, it has been in base R since the earliest versions, so code that uses it runs on old and new versions of R alike.
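To make the idea concrete, here is a minimal sketch (the object names `x` and `m` are illustrative, not from the original post) suggesting that structure() does in one call what a separate attribute assignment does step by step:

```r
# Step by step: start from a plain vector and set the attribute afterwards
x <- 1:6
dim(x) <- c(2L, 3L)

# In one call: structure() returns the object with the attribute already set
m <- structure(1:6, dim = c(2L, 3L))

identical(x, m)   # compares values and attributes of the two objects
is.matrix(m)      # a vector with a dim attribute of length 2 is a matrix
```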

Example:

```r
dd <- structure(list(
     year = c(2001, 2002, 2004, 2006)
    ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
  ,.Names = c("year", "length of days")
  ,row.names = c(NA, -4L)
  ,class = "data.frame")
```
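A quick way to confirm what this call produces is to inspect the result. The sketch below rebuilds the same object so that it runs on its own. Note that structure() translates the `.Names` argument into the `names` attribute, so the names given there replace the ones used inside list(), and the second column ends up with a name containing spaces.

```r
dd <- structure(list(
     year = c(2001, 2002, 2004, 2006)
    ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
  ,.Names = c("year", "length of days")
  ,row.names = c(NA, -4L)
  ,class = "data.frame")

is.data.frame(dd)        # the class attribute makes it a data frame
names(dd)                # names come from .Names, not from list()
dim(dd)                  # 4 rows, 2 columns
dd[["length of days"]]   # names with spaces need [[ ]] or backticks
```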

All objects created using structure(), whether homogeneous (matrix, vector) or heterogeneous (data.frame, list), carry additional metadata stored as attributes. For example, we can create a simple vector with an additional comment:

```r
just_vector <- structure(1:10, comment = "This is my simple vector with info")
```

And by calling the attributes() function:

```r
attributes(just_vector)
```

We get the information back:

```
$`comment`
[1] "This is my simple vector with info"
```
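Besides attributes(), base R offers attr() for a single named attribute and comment() specifically for the comment attribute. A minimal sketch, reusing the vector from above, which also suggests that the metadata does not get in the way of ordinary computation:

```r
just_vector <- structure(1:10, comment = "This is my simple vector with info")

attr(just_vector, "comment")   # one attribute, looked up by name
comment(just_vector)           # dedicated accessor for the comment attribute
sum(just_vector)               # the values are still the integers 1 to 10
```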

### In one go

So, let us suppose you want to create a structure (S3) in one step. The following creates a data.frame (heterogeneous) in several steps:

```r
year <- c(1999, 2002, 2005, 2008)
pollution <- c(346.82, 134.308821199349, 130.430379885892, 88.275457392443)
dd2 <- data.frame(year, pollution)
dd2$year <- as.factor(dd2$year)
```

Using structure(), we can do this more simply, in a single step:

```r
dd <- structure(list(
     year = as.factor(c(2001, 2002, 2004, 2006))
    ,length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234))
  ,.Names = c("year", "length of days")
  ,row.names = c(NA, -4L)
  ,class = "data.frame")
```
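The two snippets above use different data, so they cannot be compared directly. As a rough check that the two approaches lead to the same kind of object, the sketch below builds the pollution data both ways and compares the results. The names `step_by_step` and `one_go` are illustrative only.

```r
year <- c(1999, 2002, 2005, 2008)
pollution <- c(346.82, 134.308821199349, 130.430379885892, 88.275457392443)

# Several steps: build the data frame, then convert one column
step_by_step <- data.frame(year, pollution)
step_by_step$year <- as.factor(step_by_step$year)

# One step: describe the finished object directly
one_go <- structure(list(
     year = as.factor(year)
    ,pollution = pollution)
  ,row.names = c(NA, -4L)
  ,class = "data.frame")

all.equal(step_by_step, one_go)   # compares values and attributes
```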

Useful cases for the structure() function are:

- when creating a smaller data-set within your Jupyter notebook (using Markdown)
- when creating data-sets within your R code demo/example (and not using external CSV / TXT / JSON files)
- when describing a given object with mixed data types (e.g. a data frame) and preparing it for data import
- when creating many R environments, each with an independent data-set
- for persisting data
- and many more…
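For the persistence and self-contained-example cases in the list above, one option in base R is dput(), which writes an object out as R code (for a data frame, a structure() call), together with dget(), which reads it back. A minimal sketch, assuming a small illustrative data frame called `small` and a temporary file; neither appears in the original post:

```r
small <- data.frame(
  id    = 1:3,
  label = c("a", "b", "c"),
  value = c(1.5, 2.25, 3)
)

tmp <- tempfile(fileext = ".R")
dput(small, file = tmp)        # write the object as R source code
code_text <- readLines(tmp)    # the text is a structure(list(...)) call
restored <- dget(tmp)          # evaluate the file to rebuild the object

all.equal(small, restored)     # compares values and attributes
unlink(tmp)                    # clean up the temporary file
```

The same text could equally be pasted straight into a script or notebook cell, which is what makes this handy for demos that should not depend on external files.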

Constructing a data frame with additional attributes and comments:

```r
dd3 <- structure(list(
     v1 = as.factor(c(2001, 2002, 2004, 2006))
    ,v2 = I(c(2001, 2002, 2004, 2006))
    ,v3 = ordered(c(2001, 2002, 2004, 2006))
    ,v4 = as.double(c(366.3240, 365.4124, 366.5323423, 364.9573234)))
  ,.Names = c("year", "AsIs Year", "yearO", "length of days")
  ,.typeOf = c("factor", "numeric", "ordered", "numeric")
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,comment = "Ordered YearO for categorical analysis and other variables")
```
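To see what ends up where, the sketch below rebuilds the same object and inspects it. Two points are worth noting: the names from `.Names` replace `v1` to `v4`, and `.typeOf` is not one of the argument names that structure() treats specially, so it is simply stored as a user-defined attribute and does not change the column types.

```r
dd3 <- structure(list(
     v1 = as.factor(c(2001, 2002, 2004, 2006))
    ,v2 = I(c(2001, 2002, 2004, 2006))
    ,v3 = ordered(c(2001, 2002, 2004, 2006))
    ,v4 = as.double(c(366.3240, 365.4124, 366.5323423, 364.9573234)))
  ,.Names = c("year", "AsIs Year", "yearO", "length of days")
  ,.typeOf = c("factor", "numeric", "ordered", "numeric")
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,comment = "Ordered YearO for categorical analysis and other variables")

names(dd3)             # the names given in .Names
lapply(dd3, class)     # actual class of each column
attr(dd3, ".typeOf")   # the descriptive attribute, stored as given
```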

Lists can also be nested within lists, and the original data-set can even be preserved as a sub-list, hidden from the data frame.
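One way this could look is sketched below. The attribute name `original` and the objects `raw` and `dd4` are hypothetical choices for illustration, not something the original post defines. The raw values are kept as a list in an attribute, while the data frame itself holds a rounded version.

```r
# Raw data kept as a plain list
raw <- list(
  year        = c(2001, 2002, 2004, 2006),
  length_days = c(366.3240, 365.4124, 366.5323423, 364.9573234)
)

dd4 <- structure(list(
     year = as.factor(raw$year)
    ,length_days = round(raw$length_days, 1))
  ,row.names = c(NA, -4L)
  ,class = "data.frame"
  ,original = raw)   # hypothetical attribute holding the untouched data

names(dd4)                          # only the two columns are visible
attr(dd4, "original")$length_days   # the unrounded values are still there
```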

The comment can be checked in either of two ways:

```r
attributes(dd3)$comment
attr(dd3, which = "comment")
```

Both yield the same result:

```
> attributes(dd3)$comment
[1] "Ordered YearO for categorical analysis and other variables"
> attr(dd3, which="comment")
[1] "Ordered YearO for categorical analysis and other variables"
```

This simple yet very useful code example is, as always, available on GitHub.

Happy Rrrring!
