Some of you may already know the R reshape package. I started playing with it after the post Handling Large CSV Files in R, and it is good enough to deserve a post of its own as a formal introduction.
What is the reshape package? From its description: "reshape: Flexibly reshape data. Reshape lets you flexibly restructure and aggregate data using just two functions: melt and cast." In other words, it lets us massage and reorganize our data into whatever layout we need in only two steps: first melt the data into a form suitable for easy casting, then cast the molten data frame into the reshaped or aggregated form you want. Sounds like a tongue twister? A small example will make it clearer.
Suppose you have a matrix of bond data
If you are interested in the total amount of bonds for each rating, each industry, or each time to maturity, how do you proceed? You may be thinking of lapply, sapply, or even a for loop. That works, but at the cost of efficiency (both coding time and running time) and possible errors (personally, I often have to revise my sapply code twice before it works, sad...).
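For comparison, here is a rough sketch of the manual base R route for a single grouping. The data frame `bonds`, its `AMOUNT` column, and all values are made up for illustration; only the four id column names come from this post.

```r
# Hypothetical bond data; AMOUNT and every value here are invented for illustration.
bonds <- data.frame(
  RATING           = c("AAA", "AA", "AAA", "BBB"),
  TIME_TO_MATURITY = c(5, 10, 5, 3),
  INDUSTRY_CODE    = c("BANK", "BANK", "UTIL", "UTIL"),
  BOND_TYPE        = c("FIXED", "FIXED", "FLOAT", "FIXED"),
  AMOUNT           = c(100, 200, 150, 50)
)

# Split the amounts by industry, then sum each group with sapply.
# This yields a named numeric vector, one element per industry code.
industry_totals <- sapply(split(bonds$AMOUNT, bonds$INDUSTRY_CODE), sum)

# A different grouping (say, by rating) means writing the same pattern again.
rating_totals <- sapply(split(bonds$AMOUNT, bonds$RATING), sum)
```

This is fine for one amount column and one grouping, but it has to be repeated by hand for each additional column or grouping you care about.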
It becomes much easier with the R reshape package.
First, melt the data: newdata <- melt(data, id=c("RATING", "TIME_TO_MATURITY", "INDUSTRY_CODE", "BOND_TYPE"))
Second, cast the data based on your needs. For instance, to get the total amount for each industry, cast(newdata, INDUSTRY_CODE ~ variable, sum) returns a data.frame with one row per industry code and one column of sums per measured variable.
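To make the two steps concrete, here is a minimal end-to-end sketch. It assumes the reshape package (not reshape2) is installed; the `AMOUNT` column and all the values are hypothetical stand-ins for the real bond data, which is not reproduced here.

```r
library(reshape)  # the original reshape package, which provides melt() and cast()

# Hypothetical bond data; AMOUNT and every value here are invented for illustration.
data <- data.frame(
  RATING           = c("AAA", "AA", "AAA", "BBB"),
  TIME_TO_MATURITY = c(5, 10, 5, 3),
  INDUSTRY_CODE    = c("BANK", "BANK", "UTIL", "UTIL"),
  BOND_TYPE        = c("FIXED", "FIXED", "FLOAT", "FIXED"),
  AMOUNT           = c(100, 200, 150, 50)
)

# Step 1: melt. The id columns are kept as they are; every other column
# (here only AMOUNT) is stacked into a pair of columns named variable and value.
newdata <- melt(data, id = c("RATING", "TIME_TO_MATURITY", "INDUSTRY_CODE", "BOND_TYPE"))

# Step 2: cast. One row per industry, with the values of each variable summed.
by_industry <- cast(newdata, INDUSTRY_CODE ~ variable, sum)

# The same molten data can be cast again for the other groupings.
by_rating   <- cast(newdata, RATING ~ variable, sum)
by_maturity <- cast(newdata, TIME_TO_MATURITY ~ variable, sum)

print(by_industry)
```

Note that the melt happens only once; each new summary is just another cast call with a different formula.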
That's it, easy to use, efficient, right? Download the R Reshape package at http://cran.r-project.org/web/packages/reshape/index.html