Make Your Data Folder Clean with Functions unzip & unz

February 26, 2013

(This article was first published on Category: R | Huidong Tian's Blog, and kindly contributed to R-bloggers)

I am a somewhat minimalist R user. I feel uncomfortable when things are not in good order, such as the names of variables and documents or the structure of my code and projects. I prefer to store my data in .txt or .csv files so I can load them into R with read.table or read.csv. Most of the time this worked well, until I ended up with a huge number of .txt files. One of my research projects needs to assign oxygen density values to our field observations. There are more than 600 oxygen files, about 1 GB in total, covering different periods. This is annoying for two reasons: first, they take up a lot of space, more than 90 percent of the whole project; second, copying or synchronizing them to a cloud service such as Google Drive is time consuming.

At last I found a way to deal with this problem: R's native functions unzip and unz. All you need to do is compress all the .txt files into a single .zip file. Here is an example: suppose you have compressed all your .txt files into a .zip file named “TSOC_1961_2010.zip”.

Read Data From Zip File
## Path to the archive; adjust it if the file sits in a subfolder such as "Material/";
zip_file <- "TSOC_1961_2010.zip"

## List all file names inside the .zip file;
file_ls <- as.character(unzip(zip_file, list = TRUE)$Name)

## Read each .txt file into R, keeping one data frame per file;
dat <- list()
for (i in file_ls) dat[[i]] <- read.table(unz(zip_file, i))
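The same idea can be wrapped in a small function. The sketch below is only an illustration: the helper name read_zip_tables, the pattern argument, and the demo columns depth and oxygen are all made up here and are not from the original script. It also assumes an external zip program is available, because utils::zip relies on one to build the throwaway demo archive.

```r
## Hypothetical helper (not from the original post): read every .txt file
## in a zip archive into a named list of data frames.
read_zip_tables <- function(zip_file, pattern = "\\.txt$", ...) {
  ## unzip(list = TRUE) only lists the contents; nothing is extracted
  file_ls <- unzip(zip_file, list = TRUE)$Name
  file_ls <- file_ls[grepl(pattern, file_ls)]
  ## read.table() opens and closes each unz() connection itself
  tables <- lapply(file_ls, function(f) read.table(unz(zip_file, f), ...))
  names(tables) <- file_ls
  tables
}

## Demo on a throwaway archive with invented data
demo_dir <- tempfile("zipdemo")
dir.create(demo_dir)
write.table(data.frame(depth = c(0, 10), oxygen = c(6.1, 5.8)),
            file.path(demo_dir, "a.txt"), row.names = FALSE)
write.table(data.frame(depth = c(0, 10), oxygen = c(6.3, 5.9)),
            file.path(demo_dir, "b.txt"), row.names = FALSE)
demo_zip <- file.path(demo_dir, "demo.zip")
## "-j" stores the files without their folder path, "-q" keeps zip quiet
zip(demo_zip, files = file.path(demo_dir, c("a.txt", "b.txt")), flags = "-jq")

tables <- read_zip_tables(demo_zip, header = TRUE)
names(tables)
```

Extra arguments such as header = TRUE are passed straight to read.table, so the helper can follow whatever layout the .txt files actually have.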

Now the 600 files have become a single file, the size has dropped to about 100 MB, and the script needs no extra lines of code. More importantly, it cleared my mind and made project management more convenient.
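The compression step can also be done from within R. The following is a rough sketch on invented demo data, not the author's actual workflow: the folder, the file names, and the columns are all made up, and utils::zip needs an external zip program on the PATH, which may not be the case on every system.

```r
## Sketch: compress every .txt file in a folder and compare sizes.
## All paths and data below are made up for the demo.
data_dir <- tempfile("oxygen")
dir.create(data_dir)
for (k in 1:5) {
  write.table(data.frame(station = 1:200, oxygen = round(runif(200, 4, 8), 2)),
              file.path(data_dir, sprintf("period_%02d.txt", k)),
              row.names = FALSE)
}

txt_files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
zip_file  <- file.path(data_dir, "oxygen_demo.zip")

## utils::zip() calls the external zip program; "-j" drops folder paths
zip(zip_file, files = txt_files, flags = "-jq")

size_txt <- sum(file.size(txt_files))
size_zip <- file.size(zip_file)
cat(sprintf("txt: %.0f bytes, zip: %.0f bytes\n", size_txt, size_zip))

## Remove the originals only after checking the archive lists every file
stopifnot(setequal(unzip(zip_file, list = TRUE)$Name, basename(txt_files)))
unlink(txt_files)
```

How much space is saved depends on the data, so it is worth checking the two sizes on your own files before deleting anything.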

R always surprises me!

