Use file.choose to customize output filenames in R functions

April 5, 2012

(This article was first published on tuxettechix » R, and kindly contributed to R-bloggers)

In this post, I want to address the following issue: several data files with a common structure have to be processed by an R function. The function should export files (images, data files or any other file type). I explain how to build filenames so that the function automatically exports its files to the same directory as the input file chosen by the user, and how to customize the names of the exported files.

I thank Soraya, with whom I looked at this problem (during her work placement) and who helped me find the answer (especially by pointing out the function file.choose).

Suppose that the following file (it is the famous iris data set):

ex-data.txt

is in a directory named /home/tuxette/data1/ (for instance) and that you want to create a function extractNum that takes no argument, lets the user choose a data set (this one, for instance) and exports two files (Rdata and csv formats) containing only the numerical variables of the original data set. The exported files must be saved in the same directory as the original file (whatever this directory is) and must be named after the original file by adding the suffixes -num.Rdata and -num.csv (respectively).

The following function can be used to let the user choose a data set (this one or any other):
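The "numerical variables only" step can be tried in isolation. The sketch below is only an illustration: it uses R's built-in iris data frame as a stand-in for ex-data.txt (an assumption, since the file itself is not reproduced here) and the names index.num and num.iris are chosen for the example. It shows that testing a single row with is.numeric does not give one answer per column, whereas testing each column with sapply does.

```r
# Built-in copy of the iris data, standing in for ex-data.txt (assumption)
data(iris)

# A one-row data frame is still a data frame, not a numeric vector,
# so this returns a single FALSE rather than one value per column
row.test <- is.numeric(iris[1, ])

# Testing each column separately gives one logical value per variable
index.num <- sapply(iris, is.numeric)

# Keep only the numerical variables; drop = FALSE guarantees a data frame
num.iris <- iris[, index.num, drop = FALSE]
print(names(num.iris))
```

Only Species, which is a factor, is left out of num.iris.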

selectFile = function(){
	file = file.choose()
	file
}

Then, start the function by making the user select the original data set. The function then loads the data set, and gregexpr, substr and paste are used to create the new filenames as described above:

extractNum = function(){
	# Make the user choose a file
	filename = selectFile()
	# Load the file
	d = read.table(filename,header=TRUE)
	# Select numerical variables (tests the type of each column)
	index.num = sapply(d,is.numeric)
	# Create new data set with only the numerical variables
	# (drop=FALSE keeps a data frame even if a single column is selected)
	new.d = d[,index.num,drop=FALSE]
	# Extract from "filename" the pattern to export the new data set
	# (that is, everything before the final dot)
	pat = gregexpr("[.]",filename)
	# (in our example, pat[[1]] is 28 because the only dot in filename is at position 28)
	pat = substr(filename,1,max(pat[[1]])-1)
	# (in our example, pat is then /home/tuxette/data1/ex-data)
 
	# Save the data in Rdata and csv formats at /home/tuxette/data1/ex-data-num.Rdata
	# and /home/tuxette/data1/ex-data-num.csv
	save(new.d,file=paste(pat,"-num.Rdata",sep=""))
	write.csv(new.d,file=paste(pat,"-num.csv",sep=""),row.names=FALSE)
}
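The filename step can be tried on its own, without the file dialog. In the sketch below the example path is hard-coded (an assumption for illustration; in the function it comes from file.choose) and the variable names are chosen for the example. It shows what gregexpr returns and how substr and paste build the two output names.

```r
# Example path from the post, hard-coded here for illustration only
filename <- "/home/tuxette/data1/ex-data.txt"

# Positions of every literal dot in the path
dots <- gregexpr("[.]", filename)[[1]]
last.dot <- max(dots)

# Everything before the final dot
stem <- substr(filename, 1, last.dot - 1)

# The two output names, in the same directory as the input file
rdata.name <- paste(stem, "-num.Rdata", sep = "")
csv.name <- paste(stem, "-num.csv", sep = "")
print(c(rdata.name, csv.name))

# For comparison, a bare "." matches any character,
# so it matches at every position of the string
any.char <- gregexpr(".", filename)[[1]]
print(length(any.char) == nchar(filename))
```

Note that gregexpr returns -1 when the path contains no dot at all, in which case substr produces an empty string; the function above does not guard against that case.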

In this function, note that the pattern argument of gregexpr is a regular expression, in which a bare dot matches any character. To match a literal dot, the pattern has to be written “[.]” and not just “.”. Then just use:

extractNum()

When prompted, select the data set /home/tuxette/data1/ex-data.txt (or type its path) and you should obtain two files with the numerical variables from the iris data set in the original directory of ex-data.txt. Does it work?
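As a possible alternative to locating the dot by hand, base R also ships helpers for splitting a path. The sketch below is not what extractNum does; it is a variant under the assumption that dirname, basename, file.path and tools::file_path_sans_ext are acceptable substitutes, and makeOutputName is a hypothetical helper name introduced for this example.

```r
# Hypothetical helper: build an output path next to the input file
makeOutputName <- function(filename, suffix) {
  # File name without its directory and without its extension
  stem <- tools::file_path_sans_ext(basename(filename))
  # Same directory as the input, original stem plus the chosen suffix
  file.path(dirname(filename), paste(stem, suffix, sep = ""))
}

csv.out <- makeOutputName("/home/tuxette/data1/ex-data.txt", "-num.csv")
rdata.out <- makeOutputName("/home/tuxette/data1/ex-data.txt", "-num.Rdata")
print(c(csv.out, rdata.out))
```

Because the extension is stripped from the file name only, a dot appearing in a directory name does not affect the result.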
