Commonmark: Super Fast Markdown Rendering in R

February 2, 2016
(This article was first published on OpenCPU, and kindly contributed to R-bloggers)

A few months ago I first announced the commonmark R package. Since then there have been a few more releases… time for an update!

What is CommonMark?

Markdown is used in many places these days. However, the original spec leaves some ambiguity, which makes it difficult to optimize and leads to inconsistencies between implementations.
CommonMark is an initiative led by John MacFarlane at UC Berkeley (also the author of pandoc) to standardize the markdown syntax.
Besides a specification, the CommonMark team provides reference implementations in C (cmark) and JavaScript (commonmark.js).

The commonmark R package wraps cmark, which converts markdown text into various formats, including HTML, LaTeX and groff man. This makes commonmark very suitable for e.g. writing manual pages, which are often stored in exactly these formats. In addition, the package exposes the markdown parse tree in XML format to support customized output handling.

# Load library
library(commonmark)

# Render some markdown
md <- readLines(curl::curl("https://raw.githubusercontent.com/yihui/knitr/master/NEWS.md"))
html <- markdown_html(md)
man <- markdown_man(md)
tex <- markdown_latex(md)

# Syntax tree
xml <- markdown_xml(md)

# Back to (standardized) markdown
cm <- markdown_commonmark(md)
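
To give an idea of what customized output handling on the parse tree could look like, here is a minimal sketch that pulls a document outline out of the XML. It assumes the xml2 package is installed and that headings appear as `heading` elements with a `level` attribute; the input text and the `outline` name are made up for this example.

```r
library(commonmark)
library(xml2)

# Made-up markdown input with two headings
txt <- c("# Title", "", "Some *text*.", "", "## Details", "", "More text.")

# Parse the XML syntax tree produced by cmark
doc <- read_xml(markdown_xml(txt))

# Match on the local name so the query works with or without an XML namespace
headings <- xml_find_all(doc, "//*[local-name() = 'heading']")

# Collect heading levels and titles into a small outline table
outline <- data.frame(
  level = as.integer(xml_attr(headings, "level")),
  title = xml_text(headings, trim = TRUE),
  stringsAsFactors = FALSE
)
print(outline)
```

The same approach should work for other node types in the tree, such as links or code blocks, by changing the element name in the query.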

Currently, CommonMark only specifies the original markdown elements: italic, bold, headings, links, images, quotes, paragraphs, lists, horizontal rules, and code blocks. Extensions that pandoc introduced later on, such as tables, are not supported.
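
As a quick illustration of a few of these core elements, here is a minimal sketch that renders a one-line string. The input text is made up for this example.

```r
library(commonmark)

# Made-up input covering emphasis, strong emphasis and a link
txt <- "Hello *world*, this is **R** with a [link](https://www.r-project.org)."

# Convert to HTML and print the result
html <- markdown_html(txt)
cat(html)
```

The emphasis, strong emphasis and link should come back as `<em>`, `<strong>` and `<a>` tags wrapped in a single paragraph.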

CommonMark is fast

The cmark library is written in elegant C code and is highly optimized. It renders a Markdown version of War and Peace in the blink of an eye (127 milliseconds on a ten-year-old laptop, vs. 100-400 milliseconds for an eye blink). A simple benchmark in R confirms that our example above is converted to any of the formats in only a few milliseconds.

library(microbenchmark)
microbenchmark(
  markdown_html = markdown_html(md),
  markdown_man = markdown_man(md),
  markdown_latex = markdown_latex(md)
)
# Unit: milliseconds
#            expr      min       lq     mean   median       uq      max neval
#   markdown_html 3.228492 3.243339 3.318437 3.263184 3.359420 3.902745   100
#    markdown_man 5.768978 5.803062 5.885971 5.862607 5.942159 6.177985   100
#  markdown_latex 5.906757 5.946995 6.049409 6.001677 6.107563 7.619014   100

The main benefit, besides Tolstoy saving some time on typesetting, is that cmark allows for shipping documents such as help pages in native markdown format and rendering them on-the-fly in html/latex/man without noticeable performance overhead. This is very nice for editing and maintaining any sort of portable, dynamic documentation.
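
A rough sketch of what such on-the-fly rendering could look like is given below. The `render_doc()` helper and the example page are hypothetical and not part of the commonmark package; the sketch only assumes that a help page is stored as a plain markdown file.

```r
library(commonmark)

# Hypothetical helper: read a markdown file and render it in the requested format
render_doc <- function(path, format = c("html", "latex", "man")) {
  format <- match.arg(format)
  md <- readLines(path, warn = FALSE)
  switch(format,
    html  = markdown_html(md),
    latex = markdown_latex(md),
    man   = markdown_man(md)
  )
}

# Usage: write a tiny made-up help page to a temporary file, render it on demand
page <- tempfile(fileext = ".md")
writeLines(c("# mean", "", "Compute the *arithmetic* mean of `x`."), page)
cat(render_doc(page, "html"))
```

Because rendering only takes a few milliseconds, a helper along these lines could be called each time a page is requested rather than pre-building every output format.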

Markdown in R documentation

Several people have independently had the idea to add support for markdown to R documentation, which would be super awesome. Gábor has started a package called maxygen which might get merged into roxygen2 at some point. This allows for inserting emphasis, boldface, code blocks, lists, links, and images in your roxygen fields using simple markdown notation rather than the ugly Rd format.

There has also been some discussion on the r-devel mailing list about extending support for markdown in R and CRAN, but that mostly seems to concern NEWS and README files.
