(This article was first published on **Christopher Gandrud (간드루드 크리스토파)**, and kindly contributed to R-bloggers)

Over the past few months I’ve added a few improvements to **repmis**, an R package of miscellaneous functions for reproducible research. I want to briefly highlight two of them:

- **Caching** downloaded data sets.
- `source_XlsxData` for downloading data in Excel-formatted files.

Both of these capabilities are in **repmis** version 0.2.9 and greater.

## Caching

When working with data sourced directly from the internet, it can be time consuming (and make the data hoster angry) to repeatedly download the data. So, **repmis**’s `source` functions (`source_data`, `source_DropboxData`, and `source_XlsxData`) can now cache a downloaded data set by setting the argument `cache = TRUE`. For example:

`DisData <- source_data("http://bit.ly/156oQ7a", cache = TRUE)`

When the function is run again, the data set at http://bit.ly/156oQ7a will be loaded locally, rather than downloaded.

To delete the cached data set, simply run the function again with the argument `clearCache = TRUE`.
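The general idea behind this kind of caching can be sketched in base R. This is only an illustration of the pattern, not how **repmis** implements it: `cached_fetch`, `fake_fetch`, the `clear_cache` argument, and the example URL are all hypothetical names made up for the sketch, and the "download" is faked so the code runs offline.

```r
# Hypothetical helper (not part of repmis): cache the result of an
# expensive download on disk, keyed by the URL.
# `fetch` is any function that takes a URL and returns a data frame.
cached_fetch <- function(url, fetch, cache_dir = tempdir(), clear_cache = FALSE) {
  # Build a file-system-safe key from the URL (illustrative, not collision-proof)
  key <- gsub("[^A-Za-z0-9]", "_", url)
  cache_file <- file.path(cache_dir, paste0("cache_", key, ".rds"))

  # Optionally throw away any previously cached copy
  if (clear_cache && file.exists(cache_file)) {
    file.remove(cache_file)
  }

  # Cache hit: load the local copy instead of downloading
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }

  # Cache miss: download, store a local copy, then return the data
  data <- fetch(url)
  saveRDS(data, cache_file)
  data
}

# Stand-in for a real download so the sketch runs without a network
n_downloads <- 0
fake_fetch <- function(url) {
  n_downloads <<- n_downloads + 1
  data.frame(x = 1:3)
}

demo_dir <- file.path(tempdir(), "cache_demo")
dir.create(demo_dir, showWarnings = FALSE)

# First call starts from a clean cache and "downloads" once
first <- cached_fetch("http://example.com/data.csv", fake_fetch,
                      cache_dir = demo_dir, clear_cache = TRUE)
# Second call finds the cached file and does not call fake_fetch again
second <- cached_fetch("http://example.com/data.csv", fake_fetch,
                       cache_dir = demo_dir)
```

In this sketch the second call never touches `fake_fetch`, which is the behaviour you want when the real download is slow or the host is rate limited.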

## `source_XlsxData`

I recently added the `source_XlsxData` function to download Excel data sets directly into R. This function works very similarly to the other `source` functions. There are two differences:

- You need to specify the `sheet` argument. This is either the name of one specific sheet in the downloaded Excel workbook or its number (e.g. the first sheet in the workbook would be `sheet = 1`).
- You can pass other arguments to the `read.xlsx` function from the **xlsx** package.

Here’s a simple example:

`RRurl <- 'http://www.carmenreinhart.com/user_uploads/data/22_data.xls'`

`RRData <- source_XlsxData(url = RRurl, sheet = 2, startRow = 5)`

`startRow = 5` is passed on to `read.xlsx` and tells it to start reading at row 5, which basically drops the first 4 rows of the sheet.
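To see the effect of skipping leading rows without downloading a workbook, here is a rough base R analogy. It uses `read.csv` with `skip = 4` on made-up text rather than `read.xlsx` with `startRow = 5`, so it only illustrates the idea; the data and column names are invented for the sketch.

```r
# Made-up file contents: 4 lines of preamble, then a header row and data.
# This mimics a spreadsheet whose real table starts on row 5.
raw <- c(
  "Title of the spreadsheet",
  "Source: some agency",
  "Units: percent of GDP",
  "Notes: illustrative only",
  "country,debt",
  "A,10",
  "B,20"
)

# skip = 4 discards the first 4 lines, so line 5 becomes the header,
# much like startRow = 5 makes reading begin at row 5 of a sheet
dat <- read.csv(text = raw, skip = 4)
```

Here `dat` ends up with the columns `country` and `debt` and two rows of data; the four preamble lines never reach the data frame.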
