To Sweave, or not to Sweave, that is the question

December 16, 2011

(This article was first published on Mario's Entangled Bank » R, and kindly contributed to R-bloggers)

I am about to start writing up the manuscript of my recent biomath seminar (Act 3: Pineda-Krch. 2011. Cycles at the edge of existence: Emergence of quasi-cycles in strongly destabilized ecosystems.). While the slides for the talk were put together using Sweave to illustrate how the literate programming paradigm can improve reproducibility, the question now is whether I should use Sweave for the manuscript as well. If one is to ensure that the results are reproducible, it is a no-brainer. In the computational sciences there is currently no better way to ensure reproducibility than an “executable” manuscript. The problem is, however, that while any self-respecting scientific journal would agree that reproducibility of the research presented in manuscripts is important, few journals go beyond vague wording on this topic in their guidelines for authors. Specifically, very few journals explicitly accept manuscripts prepared using any of the literate programming systems (e.g. noweb, CWEB, etc.), and Sweave is no exception.

Typically the initial manuscript submission only requires a PDF that then goes out for peer review (if your lucky stars are aligned properly). Once your manuscript is accepted, however, you inevitably need to submit the LaTeX source (and if you don’t, the journal may take the less travelled and perilous road down the valley of manual typesetting). Of course, with Sweave it would be straightforward to just submit the Sweave-generated LaTeX file. The potential issue here is that this is not vanilla LaTeX, since the generated file depends on the Sweave.sty style file (but, of course, it is not rocket science either for a progressive and open-minded journal). This could be a problem, particularly if the journal has very specific format and/or formatting requirements and is particularly obsessive-compulsive about them (plenty of journals are). So to Sweave your manuscript (and risk the wrath of the journal), or not to Sweave (and compromise reproducibility), that is the question.


Of course, a simple solution would be to submit the manuscript as a Sweave file and then simply consider any subsequent LaTeX problems not to be your problems to deal with. Or, as a colleague once put it: once your manuscript is accepted, getting it into print is their problem. As I am starting to write up the manuscript I remain undecided, but I suspect that, being a reproducible research/literate programming/Sweave advocate, the decision may have already been made for me; now it just needs time to sink in through my thick skull.

Computational science has a long way to go before it reaches the level of reproducibility that is taken for granted in empirical research and in mathematics. Or, as Roger Peng expresses it much more eloquently in his recent Science perspective:

“The field of science will not change overnight, but simply bringing the notion of reproducibility to the forefront and making it routine will make a difference. Ultimately, developing a culture of reproducibility in which it currently does not exist will require time and sustained effort from the scientific community.”

Perhaps my manuscript could be one small contribution towards this goal.

This is from the “Mario’s Entangled Bank” blog of Mario Pineda-Krch, a theoretical biologist at the University of Alberta.

