R Data Frame

[This article was first published on R feed, and kindly contributed to R-bloggers.]

A data frame is a two-dimensional data structure which can store data in tabular format.

Data frames have rows and columns. Each column is a vector, and different columns can hold different data types. Within a data frame, every column has the same number of elements.

Before we learn about data frames, make sure you know about R vectors.


Create a Data Frame in R

In R, we use the data.frame() function to create a Data Frame.

The syntax of the data.frame() function is

dataframe1 <- data.frame(
   first_col  = c(val1, val2, ...),
   second_col = c(val1, val2, ...),
   ...
)

Here,

  • first_col - a vector with values val1, val2, ... of the same data type
  • second_col - another vector with values val1, val2, ... of the same data type, and so on

Let's see an example,

# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

print(dataframe1)

Output

      Name Age  Vote
1     Juan  22  TRUE
2  Alcaraz  15 FALSE
3 Simantha  19  TRUE

In the above example, we have used the data.frame() function to create a data frame named dataframe1. Notice the arguments passed inside data.frame(),

data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

Here, Name, Age, and Vote are the column names for vectors of character, numeric, and logical type respectively (R's names for string, number, and Boolean values).

Finally, print() displays the data in tabular format.
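
To check the type of each column, one option is to inspect the data frame with sapply() and str(). The sketch below reuses dataframe1 from the example above; column_types is a name introduced here purely for illustration.

```r
# Recreate the data frame from the example above
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

# Collect the class of every column into a named character vector
# (column_types is an illustrative name, not part of the original example)
column_types <- sapply(dataframe1, class)
print(column_types)

# str() prints a compact summary of the structure: one line per column
str(dataframe1)
```

Note that in R versions before 4.0.0, data.frame() converted character vectors to factors by default, so Name may be reported as a factor there.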


Access Data Frame Columns

There are different ways to extract columns from a data frame. We can use [ ], [[ ]], or $ to access a specific column of a data frame in R. For example,

# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

# pass index number inside [ ] 
print(dataframe1[1])

# pass column name inside [[  ]] 
print(dataframe1[["Name"]])

# use $ operator and column name 
print(dataframe1$Name)

Output

      Name
1     Juan
2  Alcaraz
3 Simantha
[1] "Juan"     "Alcaraz"  "Simantha"
[1] "Juan"     "Alcaraz"  "Simantha"

In the above example, we have created a data frame named dataframe1 with three columns: Name, Age, and Vote.

Here, we have used different operators to access the Name column of dataframe1.

Accessing a column with [[ ]] or $ gives the same result: both return the column itself as a vector. In contrast, [ ] returns a data frame that contains only the selected column.
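
The difference between the three operators can be checked directly. The following is a minimal sketch that reuses dataframe1; the variable names by_single, by_double, and by_dollar are introduced here only for illustration.

```r
# Recreate the data frame from the example above
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

by_single <- dataframe1[1]          # single brackets keep the data frame structure
by_double <- dataframe1[["Name"]]   # double brackets extract the column itself
by_dollar <- dataframe1$Name        # $ also extracts the column itself

print(class(by_single))                 # "data.frame"
print(is.data.frame(by_double))         # FALSE
print(identical(by_double, by_dollar))  # TRUE
```

Because [ ] keeps the data frame structure, it is the natural choice when the result is passed on to another function that expects a data frame.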


Combine Data Frames

In R, we use the rbind() and cbind() functions to combine data frames.

  • rbind() - combines two data frames vertically
  • cbind() - combines two data frames horizontally

Combine Vertically Using rbind()

If we want to combine two data frames vertically, the column names of the two data frames must be the same. For example,

# create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# create another data frame
dataframe2 <- data.frame(
  Name = c("Yiruma", "Bach"),
  Age = c(46, 89)
)

# combine two data frames vertically 
updated <- rbind(dataframe1, dataframe2)
print(updated)

Output

     Name Age
1    Juan  22
2 Alcaraz  15
3  Yiruma  46
4    Bach  89

Here, we have used the rbind() function to combine the two data frames: dataframe1 and dataframe2 vertically.
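
The requirement about column names can be tried out as follows. This is a sketch under stated assumptions: the data frames reordered and renamed, and the variables combined and rbind_message, are hypothetical names that do not appear in the original example. For data frames, rbind() matches columns by name rather than by position, so a different column order is accepted while a different column name is not.

```r
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# Same column names in a different order: columns are matched by name
reordered <- data.frame(
  Age = c(46, 89),
  Name = c("Yiruma", "Bach")
)
combined <- rbind(dataframe1, reordered)
print(combined)

# A different column name: rbind() stops with an error,
# which tryCatch() captures here as a message string
renamed <- data.frame(
  FullName = c("Yiruma", "Bach"),
  Age = c(46, 89)
)
rbind_message <- tryCatch(
  {
    rbind(dataframe1, renamed)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(rbind_message)
```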

Combine Horizontally Using cbind()

The cbind() function combines two or more data frames horizontally. For example,

# create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# create another data frame
dataframe2 <- data.frame(
  Hobby = c("Tennis", "Piano")
)

# combine two data frames horizontally 
updated <- cbind(dataframe1, dataframe2)
print(updated)

Output

     Name Age  Hobby
1    Juan  22 Tennis
2 Alcaraz  15  Piano

Here, we have used cbind() to combine two data frames horizontally.

Note: The data frames being combined with cbind() should have the same number of rows. If the row counts are incompatible, we get an error such as: arguments imply differing number of rows.
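
The error described in the note can be reproduced safely with tryCatch(). In this sketch, three_rows and cbind_message are hypothetical names introduced for illustration; the data frame three_rows deliberately has one row more than dataframe1.

```r
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz"),
  Age = c(22, 15)
)

# A data frame with 3 rows, which does not fit the 2 rows of dataframe1
three_rows <- data.frame(
  Hobby = c("Tennis", "Piano", "Chess")
)

# Capture the error message instead of stopping the script
cbind_message <- tryCatch(
  {
    cbind(dataframe1, three_rows)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(cbind_message)

# Comparing row counts first is one way to avoid the error
print(nrow(dataframe1) == nrow(three_rows))  # FALSE
```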


Length of a Data Frame in R

In R, we use the length() function to find the number of columns in a data frame. For example,

# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

cat("Total Columns:", length(dataframe1))

Output

Total Columns: 3

Here, we have used length() to find the total number of columns in dataframe1. Since there are 3 columns, the length() function returns 3.
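
Since length() counts columns only, base R's nrow(), ncol(), and dim() are useful companions when the number of rows is needed as well. The sketch below uses a hypothetical data frame named players with 4 rows and 2 columns, so that the two counts are easy to tell apart; n_cols and n_rows are also illustrative names.

```r
# A hypothetical data frame with 4 rows and 2 columns
players <- data.frame(
  Name = c("Juan", "Alcaraz", "Yiruma", "Bach"),
  Age = c(22, 15, 46, 89)
)

n_cols <- ncol(players)  # number of columns, same value as length(players)
n_rows <- nrow(players)  # number of rows

cat("Columns:", n_cols, "\n")  # Columns: 2
cat("Rows:", n_rows, "\n")     # Rows: 4

# dim() returns both counts at once, rows first and then columns
print(dim(players))            # [1] 4 2

# length() agrees with ncol() for a data frame
print(length(players) == n_cols)  # TRUE
```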
