Enterprise Dashboards with R Markdown
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This is a second post in a series on enterprise dashboards. See our previous post, Enterprise-ready dashboards with Shiny Databases.
We have been living with spreadsheets for so long that most office workers think it is obvious that spreadsheets generated with programs like Microsoft Excel make it easy to understand data and communicate insights. Everyone in a business, from the newest intern to the CEO, has had some experience with spreadsheets. But using Excel as the de facto analytic standard is problematic. Relying exclusively on Excel produces environments where it is almost impossible to organize and maintain efficient operational workflows. In addition to fostering low productivity, organizations risk profits and reputations in an age where insightful analyses and process control translate to a competitive advantage. Most organizations want better control over accessing, distributing, and processing data. You can use the R programming language, along with with R Markdown reports and RStudio Connect, to build enterprise dashboards that are robust, secure, and manageable.
This Excel dashboard attempts to function as a real application by allowing its users to filter and visualize key metrics about customers. It took dozens of hours to build. The intent was to hand off maintenance to someone else, but the dashboard was so complex that the author was forced to maintain it. Every week, the author copied data from an ETL tool and pasted it into the workbook, spot checked a few cells, and then emailed the entire workbook to a distribution list. Everyone on the distribution list got a new copy in their inbox every week. There were no security controls around data management or data access. Anyone with the report could modify its contents. The update process often broke the brittle cell dependencies; or worse, discrepancies between weeks passed unnoticed. It was almost impossible to guarantee the integrity of each weekly report.
Why coding is important
Excel workbooks are hard to maintain, collaborate on, and debug because they are not reproducible. The content of every cell and the design of every chart is set without ever recording the author’s actions. There is no simple way to recreate an Excel workbook because there is no recipe (i.e., set of instructions) that describes how it was made. Because Excel workbooks lack a recipe, they tend to be hard to maintain and prone to errors. It takes care, vigilance, and subject-matter knowledge to maintain a complex Excel workbook. Even then, human errors abound and changes require a lot of effort.
A better approach is to write code. There are many reasons to start programming. When you create a recipe with code, anyone can reproduce your work (including your future self). The act of coding implicitly invites others to collaborate with you. You can systematically validate and debug your code. All of these things lead to better code over time. Coding in R has particular advantages given its vast ecosystem of packages, its vibrant community, and its powerful tool chain.
Using R Markdown
There are many tools for replacing complex Excel dashboards with R code. One of these tools is R Markdown, an open-source R package that turns your analyses into high quality documents, reports, presentations and dashboards. R Markdown documents are fully reproducible and support dozens of output formats including HTML, PDF, and Microsoft Word documents.
Here is the same Excel dashboard translated to an R Markdown report. Because this report is written in code, it is vastly simpler and easier to maintain. Like the Excel dashboard above, this R Markdown report is designed to take user inputs so that it could render custom report versions.
Many people are already aware that R Markdown reports combine narrative, code, and output in a single document. What is less commonly known is that you can generalize any R Markdown report by declaring parameters in the document header. R Markdown documents with parameters are known as parameterized reports. In the Excel dashboard users can select segment
, group
, and period
. In a parameterized R Markdown document, you would specify these inputs with the following YAML header:
--- title: Customer Tracker Report output: html_notebook params: seg: label: "Segment:" value: Total input: select choices: [Total, Heavy, Mainstream, Focus1, Focus2, Specialty, Diverse1, Diverse2, Other, New] grp: label: "Group:" value: Total input: select choices: [Total, Core, Extra] per: label: "Period:" value: Week input: radio choices: [Week, YTD] ---
You can then call the parameters you declare in the YAML header from your R code chunks.
```{r} params$segment params$grp params$per ```
You can render the document with different inputs by selecting knit with parameters in RStudio. This option will open a user interface that allows you to select the parameters you want.
If you want to automate the process of creating custom report versions, you can render these documents programmatically with the rmarkdown::render()
function.
rmarkdown::render( input = "tracker-report.Rmd", params = list(seg = "Focus1", grp = "Core", per = "Weekly") )
Publishing to RStudio Connect
Managing access and permissions for an ocean of Excel files is painful. Data in Excel spreads through an organization without controls like a virus spreads through a body without disease prevention. There are better ways to secure the operation, access, and distribution of information.
RStudio Connect is a server product from RStudio that is designed for secure sharing of R content. It is on-premises software you run behind your firewall. You keep control of your data and of who has access. With RStudio Connect, you can see all your content, decide who should be able to view and collaborate on it, tune performance, schedule updates, and view logs. You can schedule your R Markdown reports to run automatically or even distribute the latest version by email.
When you publish a parameterized R Markdown report to RStudio Connect, an interface appears for selecting inputs. Viewers can create new report versions, then email themselves a copy. Collaborators can save and schedule new report versions, then email others a copy. You can even attach output files to these versions. Using parameterized R Markdown documents in RStudio Connect is a powerful way to communicate information.
You can publish content from the RStudio IDE by clicking the Publish button that looks like a blue Eye of Horus. Pressing this button will begin the publishing process. First, it creates a set of instructions for recreating your content. Second, it deploys your content bundle to the server. Third, it recreates your content on RStudio Connect. Push-button publishing has a long history of being used with RStudio. In 2012, RStudio enabled push-button publishing of R Markdown documents to RPubs. In 2014, RStudio enabled push-button publishing of Shiny apps to shinyapps.io. In 2016, RStudio enabled push-button publishing to RStudio Connect.
Adding Shiny
R Markdown documents are rendered with batch processing. That makes them ideal for automation, long running workflows, and custom report versions. However, if you want your documents to be immediately reactive to user input, then you can add a Shiny runtime. These interactive documents behave like a Shiny application in that they must be hosted. You can host interactive documents and Shiny applications with RStudio Connect. Deciding when to choose between R Markdown, interactive documents, and Shiny applications is a subject for a later post.
Summary
Reproducible code in R leads to better analysis and collaboration. You can use parameterized R Markdown reports to create complex, interactive dashboards. Hosting these dashboards securely in RStudio Connect gives you control over accessing, distributing, and processing data. You can use the R programming language, along with with R Markdown reports and RStudio Connect, to build enterprise dashboards that are robust, secure, and manageable.
Click here for source code.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.