Data.table rocks! Data manipulation the fast way in R

November 27, 2012
(This article was first published on mages' blog, and kindly contributed to R-bloggers)

I really should make a habit of using data.table. The speed and simplicity of this R package are astonishing.

Here is a simple example: I have a data frame showing incremental claims development by line of business and origin year. Now I would like to add a column with the cumulative claims position for each line of business and each origin year along the development years.

It's one line with data.table! Here it is:
myData[order(dev), cvalue:=cumsum(value), by=list(origin, lob)]
It is even easy to read! Notice also that I don't have to copy the data. The operator ':=' works by reference and is one of the reasons why data.table is so fast.
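
To try this out, here is a minimal sketch on made-up data. The column names lob, origin, dev and value follow the one-liner above, but the numbers are purely hypothetical and are not the claims data from the post. The rows are deliberately left unsorted to show that order(dev) takes care of the sequence within each group.

```r
library(data.table)

# Hypothetical toy data: incremental claims by line of business (lob),
# origin year and development year (dev). Rows are deliberately unsorted.
myData <- data.table(
  lob    = rep(c("A", "B"), each = 6),
  origin = rep(rep(c(2010, 2011), each = 3), times = 2),
  dev    = rep(c(3, 1, 2), times = 4),
  value  = c(30, 10, 20, 3, 1, 2, 300, 100, 200, 33, 11, 22)
)

# Add the cumulative position by reference: within each (origin, lob)
# group the values are accumulated in order of development year.
myData[order(dev), cvalue := cumsum(value), by = list(origin, lob)]

print(myData[order(lob, origin, dev)])
```

Because ':=' updates myData in place, no assignment with '<-' is needed; the new column cvalue is simply there afterwards.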


And it is getting even better. Suppose you want to get the latest claims development position for each line of business and origin year. Again, it is only one line.
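
The one-liner for the latest position is not reproduced in this excerpt, so what follows is only a guess at one way to do it, not necessarily the author's code. It assumes that "latest" means the row with the highest development year in each group, and it again uses hypothetical numbers.

```r
library(data.table)

# Hypothetical toy triangle: origin year 2010 has three development
# years, origin year 2011 only two.
myData <- data.table(
  lob    = rep(c("A", "B"), each = 5),
  origin = rep(c(2010, 2010, 2010, 2011, 2011), times = 2),
  dev    = rep(c(1, 2, 3, 1, 2), times = 2),
  value  = c(10, 20, 30, 1, 2, 100, 200, 300, 11, 22)
)
myData[order(dev), cvalue := cumsum(value), by = list(origin, lob)]

# Assumption: the latest position is the row with the largest dev
# in each (origin, lob) group.
latest <- myData[, .SD[which.max(dev)], by = list(origin, lob)]

print(latest)
```

Here .SD is the subset of the data for the current group, so .SD[which.max(dev)] keeps a single row per origin year and line of business.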
