Reproducible research with markdown, knitr and pandoc

May 17, 2012
By

(This article was first published on Ecological Modelling... » R, and kindly contributed to R-bloggers)

Over the last few weeks I was trying to optimise my workflow using markdow in combination with knitr and pandoc. Knitr is a grea new package by Yihui, expanding R’s capabilities for reproducible research.

I will illustrate my work flow with the following example, where I have a small R-script (script.r) that I want to embed into a report. However, I do not want to write LaTeX, nor I want/can specify my final output format in the beginning. That is where were pandoc comes in. Pandoc is the swiss-army knife if come to convert between markup languages.

The file script.r:


## @knitr gen-dat
a <- matrix(rnorm(100), nrow=10)

## @knitr plot
 image(a)

The report is written with  pandoc flavored markdown. The file (report_knit_.md) contains


% A sample report
% The author
% `r date()`

<!-- Setting up R -->
`ro warning=FALSE, dev="png", fig.cap="", cache=FALSE or`

<!-- read external r code -->
```{r reading, echo=FALSE}
read_chunk("script.r")
```

# The first part of my R script
Here I can generate my data
```{r}
<<gen-dat>>
```

# Results
An now the reults are plotted
```{r plot-fig, result="asis"}
<<plot>>
```

# More
Of course I can use inline elemtnts: 3 + 3 = `r 3+3`.

For each chunk there are plenty of options to modify it (see options).

To render my report, I need to first knit it in R and then use pandoc to convert it to the final format. This can be done with


Rscript -e "library(knitr); knit('report_knit_.md')"

This results in a pandoc flavored markdown document. Now I can use pandoc to convert this document into all by pandoc supported output format (list of formats):

  • A pdf file: pandoc -s report.md -t latex -o report.pdf
  • A html file: pandoc -s report.md -o report.html (with the -c flag html files can be added easily)
  • Openoffice: pandoc report.md -o report.odt
  • Word docx: pandoc report.md -o report.docx

Files are available on github.


To leave a comment for the author, please follow the link and comment on his blog: Ecological Modelling... » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.