Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When you go to import data using R Studio, you get a menu like this.

Once that has completed, you’ll see the new import data window (shown below).

Okay, so first let’s make a simple comma delimited data file so we can test out the new import dataset process. I have made a simple file called “x-y-data.txt” as shown below. If you make this same file (no spaces, just a comma to separate the x column from the y column) then we can do this exercise together.

Now, let’s use the RStudio import to bring in the file “x-y-data.txt”. Here’s a screen grab of the import screen with my x-y dataset.

We can see that RStudio has used the first row as names, has recognized that it is a comma delimited file, and has read both x and y values as integers. Everything looks good, so I click “import”.

It was after this import process, that I had tried running some of my standard functions, such as making an empirical CDF (cumulative density function) and then I ran into problems. So let’s check the type of data we have imported.

# get the data structure
typeof(x_y_data)
#[1] "list"
class(x_y_data)
#[1] "tbl_df"     "tbl"        "data.frame"


While the old RStudio would have imported this as a matrix by default, this latest version of RStudio imports data as a data frame by default. Apparently RStudio has created their own version of a data frame called a “tbl_df” or tibble data frame. When you use the ‘readr’ package, your data is imported automatically as a “tbl_df”.

Now this isn’t necessarily a bad thing, in fact it seems like there is some nice functionality gained by using the “tbl_df” format. This change just broke some of my previously written code and it’s good to know what RStudio is doing by default.

If we wanted to get back to the matrix format, we can do this will a simple as.matrix function. From there we can verify it was converted using the typeof and class functions.

# convert to a matrix
data<-as.matrix(x_y_data)
#     x  y
#[1,] 1  2
#[2,] 2  4
#[3,] 3  6
#[4,] 4  8
#[5,] 5 10

typeof(data)
#[1] "integer"
class(data)
#[1] "matrix"


You can read more about the new Tibble structure at these websites:

https://blog.rstudio.org/2016/03/24/tibble-1-0-0/

http://www.sthda.com/english/wiki/tibble-data-format-in-r-best-and-modern-way-to-work-with-your-data

Enjoy!