To Sweave, or not to Sweave, that is the question

December 16, 2011

I am about to start writing up the manuscript of my recent biomath seminar (Act 3: Pineda-Krch. 2011. Cycles at the edge of existence: Emergence of quasi-cycles in strongly destabilized ecosystems). While the slides for the talk were put together using Sweave to illustrate how the literate programming paradigm can improve reproducibility, the question now is whether I should use Sweave for the manuscript as well. If one is to ensure that the results are reproducible, it is a no-brainer. In the computational sciences there is currently no better way to ensure reproducibility than an “executable” manuscript. The problem, however, is that while any self-respecting scientific journal would agree that reproducibility of the research presented in manuscripts is important, few journals go beyond vague wording on this topic in their guidelines for authors. Specifically, very few journals explicitly accept manuscripts prepared using any of the literate programming systems (e.g. noweb, CWEB), and Sweave is no exception.

Typically the initial manuscript submission only requires a PDF that then goes out for peer review (if your lucky stars are aligned properly). Once your manuscript is accepted, however, you inevitably need to submit the LaTeX source (and if you don’t, the journal may take the less travelled and perilous road down the valley of manual typesetting). Of course, with Sweave it would be straightforward to just submit the Sweave-generated LaTeX file. The potential issue here is that this is not vanilla LaTeX (though, of course, it is not rocket science either for a progressive and open-minded journal), and this could be a problem, particularly if the journal has very specific format and/or formatting requirements and is particularly obsessive-compulsive about them (plenty of journals are). So to Sweave your manuscript (and risk the wrath of the journal), or not to Sweave (and compromise reproducibility), that is the question.


Of course, a simple solution would be to submit the manuscript as a Sweave file and then simply consider any ensuing LaTeX problems not to be your problems to deal with. Or, as a colleague once put it, once your manuscript is accepted, getting it into print is their problem. As I start to write up the manuscript I remain undecided, but I suspect that, being a reproducible research/literate programming/Sweave advocate, the decision may have already been made for me; now it just needs time to sink in through my thick skull.

Computational science has a long way to go before it reaches the level of reproducibility that is taken for granted in empirical research and in mathematics. Or, as Roger Peng expresses it much more eloquently in his recent Science perspective:

“The field of science will not change overnight, but simply bringing the notion of reproducibility to the forefront and making it routine will make a difference. Ultimately, developing a culture of reproducibility in which it currently does not exist will require time and sustained effort from the scientific community.”

Perhaps my manuscript could be one small contribution towards this goal.

