# Importing Data into R, part II

This article was first published on **The Practical R** and kindly contributed to R-bloggers.

I recently downloaded the latest version of RStudio and noticed that its import dataset functionality had changed significantly. I had previously written about this **HERE** and wanted to provide an update for the current version of RStudio.

When you go to import data using RStudio, you get a menu like this.

If you’re using the latest version of RStudio, when you click “From CSV” you’ll get a popup about downloading a new package, ‘readr’.

Once that has completed, you’ll see the new import data window (shown below).

Okay, so first let’s make a simple comma-delimited data file so we can test out the new import dataset process. I have made a simple file called “x-y-data.txt” as shown below. If you make this same file (no spaces, just a comma to separate the x column from the y column), then we can do this exercise together.
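If you would rather create the file from R than in a text editor, here is a minimal sketch. The values are taken from the x/y matrix printed later in this post; writing to the current working directory is my assumption.

```r
# Write a small comma-delimited file with a header row.
# Values match the x/y matrix printed later in the post.
lines <- c("x,y", "1,2", "2,4", "3,6", "4,8", "5,10")
writeLines(lines, "x-y-data.txt")

# Quick check: print the file back to the console
cat(readLines("x-y-data.txt"), sep = "\n")
```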

Now, let’s use the RStudio import to bring in the file “x-y-data.txt”. Here’s a screen grab of the import screen with my x-y dataset.

We can see that RStudio has used the first row as column names, has recognized that it is a comma-delimited file, and has read both x and y values as integers. Everything looks good, so I click “Import”.
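The import dialog is a front end for the ‘readr’ package, so the same import can be scripted. Here is a rough equivalent, assuming the file sits in the working directory. The explicit `col_types` argument is my addition so that both columns come in as integers, as in the import screen; newer readr versions may otherwise guess a double type.

```r
library(readr)

# Recreate the example file so this snippet runs on its own
writeLines(c("x,y", "1,2", "2,4", "3,6", "4,8", "5,10"), "x-y-data.txt")

# First row becomes the column names; both columns are read as integers
x_y_data <- read_csv(
  "x-y-data.txt",
  col_types = cols(x = col_integer(), y = col_integer())
)

# Inspect the column types and the first few values
str(x_y_data)
```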

After this import, I tried running some of my standard functions, such as computing an empirical CDF (cumulative distribution function), and ran into problems. So let’s check the type of data we have imported.

```r
# get the data structure
typeof(x_y_data)
#[1] "list"
class(x_y_data)
#[1] "tbl_df"     "tbl"        "data.frame"
```

The old RStudio import relied on base R’s **read.csv**, which returns a plain data frame. This latest version of RStudio uses the ‘readr’ package instead, which returns RStudio’s own flavor of data frame called a “tbl_df”, or tibble. When you use the ‘readr’ package, your data is imported automatically as a “tbl_df”.

Now this isn’t necessarily a bad thing; in fact, it seems like there is some nice functionality gained by using the “tbl_df” format. This change just broke some of my previously written code, and it’s good to know what RStudio is doing by default.
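The post doesn’t show the code that broke, so the following is only a guess at one likely culprit: selecting a single column with `[ , 1]` gives a plain vector from an ordinary data frame but stays a one-column tibble for a “tbl_df”, and functions that expect a vector, such as `ecdf`, can then fail. The sketch below builds the tibble by hand (rather than importing it) and shows two ways to get a vector back.

```r
library(tibble)

# Hand-built stand-in for the imported data (same values as the example file)
x_y_data <- tibble(x = 1:5, y = c(2L, 4L, 6L, 8L, 10L))

# Single-column selection with [ , 1] keeps the tibble class
class(x_y_data[, 1])

# [[ ]] (or $) extracts the column as a plain integer vector
class(x_y_data[["x"]])

# The empirical CDF works on the extracted vector
Fx <- ecdf(x_y_data[["x"]])
Fx(3)

# Alternatively, convert to a base data frame, where [ , 1] drops to a vector
plain <- as.data.frame(x_y_data)
class(plain[, 1])
```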

If we want a matrix instead, we can convert with a simple **as.matrix** call. From there we can verify the conversion using the **typeof** and **class** functions.

```r
# convert to a matrix
data <- as.matrix(x_y_data)
data
#     x  y
#[1,] 1  2
#[2,] 2  4
#[3,] 3  6
#[4,] 4  8
#[5,] 5 10
typeof(data)
#[1] "integer"
class(data)
#[1] "matrix"   (R 4.0 and later print "matrix" "array")
```

You can read more about the new Tibble structure at these websites:

https://blog.rstudio.org/2016/03/24/tibble-1-0-0/

http://www.sthda.com/english/wiki/tibble-data-format-in-r-best-and-modern-way-to-work-with-your-data

Enjoy!
