[This article was first published on R is my friend » R, and kindly contributed to R-bloggers].
A few weeks ago I gave a presentation on using Sweave and Knitr under the guise of promoting reproducible research. I humbly offer this presentation to the blog with full knowledge that there are already loads of tutorials available online. This presentation is
Cheers,
\documentclass[xcolor=svgnames]{beamer}
%\documentclass[xcolor=svgnames,handout]{beamer}
\usetheme{Boadilla}
\usecolortheme[named=Sienna]{structure}
\usepackage{graphicx}
\usepackage[final]{animate}
%\usepackage[colorlinks=true,urlcolor=blue,citecolor=blue,linkcolor=blue]{hyperref}
\usepackage{breqn}
\usepackage{xcolor}
\usepackage{booktabs}
\usepackage{verbatim}
\usepackage{tikz}
\usetikzlibrary{shadows,arrows,positioning}
\usepackage[noae]{Sweave}
\definecolor{links}{HTML}{2A1B81}
\hypersetup{colorlinks,linkcolor=links,urlcolor=links}
\usepackage{pgfpages}
%\pgfpagesuselayout{4 on 1}[letterpaper, border shrink = 5mm, landscape]
\tikzstyle{block} = [rectangle, draw, text width=7em, text centered, rounded corners, minimum height=3em, minimum width=7em, top color = white, bottom color=brown!30, drop shadow]
\newcommand{\ShowSexpr}[1]{\texttt{{\char`\\}Sexpr\{#1\}}}
\begin{document}
\SweaveOpts{concordance=TRUE}
\title[Nuts and bolts of Sweave/Knitr]{The nuts and bolts of Sweave/Knitr for reproducible research with \LaTeX}
\author[M. Beck]{Marcus W. Beck}
\institute[USEPA NHEERL]{ORISE Post-doc Fellow\\
USEPA NHEERL Gulf Ecology Division, Gulf Breeze, FL\\
Email: \href{mailto:beck.marcus@epa.gov}{beck.marcus@epa.gov}, Phone: 850 934 2480}
\date{January 15, 2014}
%%%%%%
\begin{frame}
\vspace{-0.3in}
\titlepage
\end{frame}
%%%%%%
\begin{frame}{Reproducible research}
\onslide<+->
In its most general sense... the ability to reproduce results from an experiment or analysis conducted by another.\\~\\
\onslide<+->
From Wikipedia... `The ultimate product is the \alert{paper along with the full computational environment} used to produce the results in the paper such as the code, data, etc. that can be \alert{used to reproduce the results and create new work} based on the research.'\\~\\
\onslide<+->
Concept is strongly based on the idea of \alert{literate programming} such that the logic of the analysis is clearly represented in the final product by combining computer code/programs with ordinary human language [Knuth, 1992].
\end{frame}
%%%%%%
\begin{frame}{Non-reproducible research}
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
\onslide<2->{
\node[block] (a) {1. Gather data};}
\onslide<3->{
\node[block] (b) [right of=a, node distance=4.2cm] {2. Analyze data};
\draw[->] (a) -- (b);}
\onslide<4->{
\node[block] (c) [right of=b, node distance=4.2cm] {3. Report results};
\draw[->] (b) -- (c);}
% \onslide<5->{
% \node [right of=a, node distance=2.1cm] {\textcolor[rgb]{1,0,0}{X}};
% \node [right of=b, node distance=2.1cm] {\textcolor[rgb]{1,0,0}{X}};}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<2->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Begins with general question or research objectives
\item Data collected in raw format (hard copy) converted to digital (Excel spreadsheet)
\end{itemize}
\end{column}}
\onslide<3->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Import data into stats program or analyze directly in Excel
\item Create figures/tables directly in stats program
\item Save relevant output
\end{itemize}
\end{column}}
\onslide<4->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create research report using Word or other software
\item Manually insert results into report
\item Change final report by hand if methods/analysis altered
\end{itemize}
\end{column}}
\end{columns}
\end{frame}
%%%%%%
\begin{frame}{Reproducible research}
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
\onslide<1->{
\node[block] (a) {1. Gather data};}
\onslide<1->{
\node[block] (b) [right of=a, node distance=4.2cm] {2. Analyze data};
\draw[<->] (a) -- (b);}
\onslide<1->{
\node[block] (c) [right of=b, node distance=4.2cm] {3. Report results};
\draw[<->] (b) -- (c);}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Begins with general question or research objectives
\item Data collected in raw format (hard copy) converted to digital (\alert{text file})
\end{itemize}
\end{column}}
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create \alert{integrated script} for importing data (data path is known)
\item Create figures/tables directly in stats program
\item \alert{No need to export} (reproduced on the fly)
\end{itemize}
\end{column}}
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create research report using RR software
\item \alert{Automatically include results} into report
\item \alert{Change final report automatically} if methods/analysis altered
\end{itemize}
\end{column}}
\end{columns}
\end{frame}
%%%%%%
\begin{frame}{Reproducible research in R}
Easily adopted using RStudio [\href{http://www.rstudio.com/}{http://www.rstudio.com/}]\\~\\
Also possible w/ Tinn-R or via command prompt but not as intuitive\\~\\
Requires a \LaTeX\ distribution - use MiKTeX for Windows [\href{http://miktex.org/}{http://miktex.org/}]\\~\\
\onslide<2->{
Essentially a \LaTeX\ document that incorporates R code... \\~\\
Uses Sweave (or Knitr) to convert .Rnw file to .tex file, then \LaTeX\ to create pdf\\~\\
Sweave comes with the \texttt{utils} package; \LaTeX\ may have to be told where \texttt{Sweave.sty} is \\~\\
}
\end{frame}
%%%%%%
\begin{frame}{Reproducible research in R}
Use same procedure for compiling a \LaTeX\ document with one additional step
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
\onslide<2->{
\node[block] (a) {1. myfile.Rnw};}
\onslide<3->{
\node[block] (b) [right of=a, node distance=4.2cm] {2. myfile.tex};
\draw[->] (a) -- (b);\node [right of=a, above=0.5cm, node distance=2.1cm] {Sweave};}
\onslide<4->{
\node[block] (c) [right of=b, node distance=4.2cm] {3. myfile.pdf};
\draw[->] (b) -- (c);
\node [right of=b, above=0.5cm, node distance=2.1cm] {pdfLaTeX};}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<2->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item A .tex file but with .Rnw extension
\item Includes R code as `chunks' or inline expressions
\end{itemize}
\end{column}}
\onslide<3->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item .Rnw file is converted to a .tex file using Sweave
\item .tex file contains output from R, no raw R code
\end{itemize}
\end{column}}
\onslide<4->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item .tex file converted to pdf (or other output) for final format
\item Include bibliography with BibTeX
\end{itemize}
\end{column}}
\end{columns}
\end{frame}
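%%%%%%
\begin{frame}[fragile]{Reproducible research in R}
The two steps on the previous slide can also be run by hand from the R console. A minimal sketch, assuming \texttt{myfile.Rnw} is in the working directory and a \LaTeX\ distribution is installed:
\begin{block}{}
\begin{verbatim}
# step 1: run the R code and write myfile.tex
Sweave('myfile.Rnw')
# step 2: compile myfile.tex to myfile.pdf
tools::texi2pdf('myfile.tex')
\end{verbatim}
\end{block}
The compile button in RStudio does roughly the same thing in one click.
\end{frame}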
%%%%%%
\begin{frame}[containsverbatim]{Reproducible research in R} \label{sweaveref}
\begin{block}{.Rnw file}
\begin{verbatim}
\documentclass{article}
\usepackage{Sweave}
\begin{document}
Here's some R code:
\Sexpr{'<<eval=true,echo=true>>='}
options(width=60)
set.seed(2)
rnorm(10)
\Sexpr{'@'}
\end{document}
\end{verbatim}
\end{block}
\end{frame}
%%%%%%
\begin{frame}[containsverbatim,shrink]{Reproducible research in R}
\begin{block}{.tex file}
\begin{verbatim}
\documentclass{article}
\usepackage{Sweave}
\begin{document}
Here's some R code:
\begin{Schunk}
\begin{Sinput}
> options(width=60)
> set.seed(2)
> rnorm(10)
\end{Sinput}
\begin{Soutput}
[1] -0.89691455 0.18484918 1.58784533 -1.13037567
[5] -0.08025176 0.13242028 0.70795473 -0.23969802
[9] 1.98447394 -0.13878701
\end{Soutput}
\end{Schunk}
\end{document}
\end{verbatim}
\end{block}
\end{frame}
%%%%%%
\begin{frame}{Reproducible research in R}
The final product:\\~\\
\centerline{\includegraphics{ex1_input.pdf}}
\end{frame}
%%%%%%
\begin{frame}[fragile]{Sweave - code chunks}
\onslide<+->
R code is entered in the \LaTeX\ document using `code chunks'
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<>>='}
\Sexpr{'@'}
\end{verbatim}
\end{block}
Any text within the code chunk is interpreted as R code\\~\\
Arguments for the code chunk are entered within \verb|\Sexpr{'<<here>>'}|\\~\\
\onslide<+->
\begin{itemize}
\item{\texttt{eval}: evaluate code, default \texttt{T}}
\item{\texttt{echo}: return source code, default \texttt{T}}
\item{\texttt{results}: format of output (chr string), default is `verbatim' (also `tex' for tables or `hide' to suppress)}
\item{\texttt{fig}: for creating figures, default \texttt{F}}
\end{itemize}
\end{frame}
%%%%%%
\begin{frame}[fragile]{Sweave - code chunks}
Changing the default arguments for the code chunk:
\begin{columns}[t]
\begin{column}{0.45\textwidth}
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<>>=
2+2
@
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<eval=F>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
Code is shown but not run, so no result is returned...
\end{column}
\begin{column}{0.45\textwidth}
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=hide>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<results=hide>>=
2+2
@
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<echo=F>>=
2+2
@
\end{column}
\end{columns}
\end{frame}
%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
\onslide<1->
Sweave makes it easy to include figures in your document
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=T,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\onslide<2->
<<myfig,fig=T,echo=F,include=T,height=3>>=
set.seed(2)
hist(rnorm(100))
@
\end{frame}
%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
Sweave makes it easy to include figures in your document
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=T,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
Relevant code options for figures:
\begin{itemize}
\item{The chunk name is used to name the figure, myfile-myfig.pdf}
\item{\texttt{fig}: Lets R know the output is a figure}
\item{\texttt{echo}: Use \texttt{F} to suppress figure code}
\item{\texttt{include}: Should the figure be automatically included in output}
\item{\texttt{height}: (and \texttt{width}) Set dimensions of figure in inches}
\end{itemize}
\end{frame}
%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
An alternative approach for creating a figure
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=F,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\includegraphics{rnw_name-myfig.pdf}
\end{verbatim}
\end{block}
\includegraphics{Sweave_intro-myfig.pdf}
\end{frame}
%%%%%%
\begin{frame}[t,fragile]{Sweave - tables}
\onslide<1->
Really easy to create tables
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=tex,echo=F>>='}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\onslide<2->
<<results=tex,echo=F>>=
data(iris)
library(stargazer)
stargazer(iris,title='Summary statistics for Iris data')
@
\end{frame}
%%%%%%
\begin{frame}[t,fragile]{Sweave - tables}
Really easy to create tables
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=tex,echo=F>>='}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
\texttt{results} option should be set to `tex' (and \texttt{echo=F})\\~\\
Several packages are available to convert R output to \LaTeX\ table format
\begin{itemize}
\item{xtable: most general package}
\item{Hmisc: similar to xtable but can handle specific R model objects}
\item{stargazer: fairly effortless conversion of R model objects to tables}
\end{itemize}
\end{frame}
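%%%%%%
\begin{frame}[fragile]{Sweave - tables}
The same pattern works with the other table packages. A sketch using xtable, assuming the package is installed (the caption text is only an illustration):
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=tex,echo=F>>='}
library(xtable)
data(iris)
# first six rows of the data as a LaTeX table
print(xtable(head(iris), caption='First rows of Iris data'))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\texttt{print} is called explicitly so that the \LaTeX\ code for the table is written to the output.
\end{frame}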
%%%%%%
\begin{frame}[fragile]{Sweave - expressions}
\onslide<1->
All objects within a code chunk are saved in the workspace each time a document is compiled (unless \texttt{eval=F})\\~\\
This allows the information saved in the workspace to be reproduced in the final document as inline text, via \alert{expressions}\\~\\
\onslide<2->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
data(iris)
dat<-iris
\Sexpr{'@'}
\end{verbatim}
Mean sepal length was \ShowSexpr{mean(dat\$Sepal.Length)}.
\end{block}
\onslide<3->
<<echo=F>>=
data(iris)
dat<-iris
@
\vspace{\baselineskip}
Mean sepal length was \Sexpr{mean(dat$Sepal.Length)}.
\end{frame}
%%%%%%
\begin{frame}[fragile]{Sweave - expressions}
Change the global R options to change the default output\\~\\
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
data(iris)
dat<-iris
options(digits=2)
\Sexpr{'@'}
\end{verbatim}
Mean sepal length was \ShowSexpr{format(mean(dat\$Sepal.Length))}.
\end{block}
<<echo=F>>=
data(iris)
dat<-iris
options(digits=2)
@
\vspace{\baselineskip}
Mean sepal length was \Sexpr{format(mean(dat$Sepal.Length))}.\\~\\
\end{frame}
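%%%%%%
\begin{frame}{Sweave - expressions}
Changing \texttt{options} affects all later output in the document. To format a single value instead, one option is to round inside the expression itself. A sketch, reusing \texttt{dat} from the previous slides (the number of decimal places is an arbitrary choice):\\~\\
\begin{block}{}
Mean sepal length was \ShowSexpr{round(mean(dat\$Sepal.Length), 1)}.
\end{block}
\end{frame}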
%%%%%%
\begin{frame}{Sweave vs Knitr}
\onslide<1->
Sweave does not automatically cache R data on compilation\\~\\
\alert{Knitr} is a useful alternative - similar to Sweave but with minor differences in args for code chunks, more flexible output\\~\\
\onslide<2->
\begin{columns}
\begin{column}{0.3\textwidth}
Must change default options in RStudio\\~\\
Knitr included with RStudio, otherwise download as package
\end{column}
\begin{column}{0.6\textwidth}
\centerline{\includegraphics[width=0.8\textwidth]{options_ex.png}}
\end{column}
\end{columns}
\end{frame}
%%%%%%
\begin{frame}[fragile]{Knitr}
\onslide<1->
Knitr can be used to cache code chunks\\~\\
Data are saved when chunk is first evaluated, skipped on future compilations unless changed\\~\\
This allows quicker compilation of documents that import lots of data\\
~\\
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<mychunk, cache=TRUE>>='}
load(file='mydata.RData')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\end{frame}
%%%%%%
\begin{frame}[containsverbatim,shrink]{Knitr} \label{knitref}
\begin{block}{.Rnw file}
\begin{verbatim}
\documentclass{article}
\Sexpr{'<<setup, include=FALSE, cache=FALSE>>='}
library(knitr)
#set global chunk options
opts_chunk$set(fig.path='H:/docs/figs/', fig.align='center',
dev='pdf', dev.args=list(family='serif'), fig.pos='!ht')
options(width=60)
\Sexpr{'@'}
\begin{document}
Here's some R code:
\Sexpr{'<<eval=T, echo=T>>='}
set.seed(2)
rnorm(10)
\Sexpr{'@'}
\end{document}
\end{verbatim}
\end{block}
\end{frame}
%%%%%%
\begin{frame}{Knitr}
The final product:\\~\\
\centerline{\includegraphics[width=\textwidth]{knit_ex.pdf}}
\end{frame}
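%%%%%%
\begin{frame}[fragile]{Knitr}
Outside RStudio, the same file can be compiled from the R console. A minimal sketch, assuming the knitr package is installed and \texttt{myfile.Rnw} is in the working directory:
\begin{block}{}
\begin{verbatim}
library(knitr)
# knit to myfile.tex, then compile to myfile.pdf
knit2pdf('myfile.Rnw')
\end{verbatim}
\end{block}
\end{frame}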
%%%%%%
\begin{frame}[containsverbatim,shrink]{Knitr}
Figures, tables, and expressions are largely the same as in Sweave\\~\\
\begin{block}{Figures}
\begin{verbatim}
\Sexpr{'<<myfig,echo=F>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
\begin{block}{Tables}
\begin{verbatim}
\Sexpr{"<<mytable,results='asis',echo=F,message=F>>="}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\end{frame}
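%%%%%%
\begin{frame}[fragile]{Knitr}
Some chunk arguments are named differently in Knitr. A sketch of the earlier Sweave figure chunk rewritten with Knitr options (the width value is an arbitrary choice):
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig, echo=FALSE, fig.height=3, fig.width=5>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
No \texttt{fig=T} is needed, and figure size is set in inches with \texttt{fig.height} and \texttt{fig.width}.
\end{frame}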
%%%%%%
\begin{frame}{A minimal working example}
\onslide<1->
Step by step guide to creating your first RR document\\~\\
\begin{enumerate}
\onslide<2->
\item Download and install \href{http://www.rstudio.com/}{RStudio}
\onslide<3->
\item Download and install \href{http://miktex.org/}{MiKTeX} if using Windows
\onslide<4->
\item Create a unique folder for the document - This will be the working directory
\onslide<5->
\item Open a new Sweave file in RStudio
\onslide<6->
\item Copy and paste the file found on slide \ref{sweaveref} for Sweave or slide \ref{knitref} for Knitr into the new file (and select correct compile option)
\onslide<7->
\item Compile the pdf (runs Sweave/Knitr, then pdfLaTeX)\\~\\
\end{enumerate}
\onslide<7->
\centerline{\includegraphics[width=0.6\textwidth]{compile_ex.png}}
\end{frame}
%%%%%%
\begin{frame}{If things go wrong...}
\LaTeX\ errors can be difficult to narrow down - check the log file\\~\\
Sweave/Knitr errors will be displayed on the console\\~\\
Other resources
\begin{itemize}
\item{`Reproducible Research with R and RStudio' by C. Gandrud, CRC Press}
\item{\LaTeX\ forum (like StackOverflow) \href{http://www.latex-community.org/forum/}{http://www.latex-community.org/forum/}}
\item Comprehensive Knitr guide \href{http://yihui.name/knitr/options}{http://yihui.name/knitr/options}
\item Sweave user manual \href{http://stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf}{http://stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf}
\item Intro to Sweave \href{http://www.math.ualberta.ca/~mlewis/links/the_joy_of_sweave_v1.pdf}{http://www.math.ualberta.ca/~mlewis/links/the_joy_of_sweave_v1.pdf}
\end{itemize}
\vspace{\baselineskip}
\end{frame}
\end{document}
