In science consensus is irrelevant. What is relevant is reproducible results. The greatest scientists in history are great precisely because they broke with the consensus. – Michael Crichton

Yihui Xie intended to make it easier to do his homework, but instead found himself tackling one of the greatest problems in modern science: the reproducibility of results.
Reproducibility is one of the main principles of the scientific method, and refers to the ability of a test or experiment to be accurately reproduced, or replicated, by someone else working independently. Through his work on the knitR package, Yihui Xie has assembled a toolchain that allows the user to produce beautiful, ready-to-distribute documents containing a whole, self-supporting, and reproducible analysis. By leveraging powerful constructs such as adaptive, nearly transparent caching, he has removed the barriers that have prevented many practitioners from fully addressing reproducibility in their work. Reproducible documents are now flexible, powerful, and fast.

No single R package has impacted my personal workflow in the past decade as deeply as knitR has, turning the task of updating analyses with more current data from a chore into a pleasure. Previously, presenting a data story in a flowing document required completing an array of invisible processes, plus the usual copy/paste engineering, only to create an incomplete picture of the analysis and the code used to produce it. Now, a single living knitR project can be created in-line and shared in its entirety, as a PDF, a git repository, or even a living Shiny document, giving my audience a nearly complete understanding of the process by which I arrived at my solution.

In this interview, Yihui discusses how he came to the R programming language and how he set about building knitR. He also mentions the great momentum and energy of the R community in China, and what he’s currently focused on at RStudio.