How to Create Automated Analysis Using R?

July 3, 2018

(This article was first published on R – nandeshwar.info, and kindly contributed to R-bloggers)

Have you found yourself repeating the same analysis for different groups in your data? And have you wondered about a better way of doing so in R? I have this problem often. RMarkdown, knitr, and RStudio make generating such automated reports very easy.

You need two things:

  • A parent file for data prep, loading libraries and knitting the final output
  • A child file with the elements you want to see on the final report

Let’s see these files in detail. You can find these files here.

Parent or Wrapper File

This file contains the code to load and prep the data, load the relevant libraries, loop through each group of data, and knit the final output.

Here’s what this file looks like for this example:

---
title: "Briefings for an important event"
output:
  slidy_presentation: 
      keep_md: yes
      css: style.css
---
 
 
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE, 
                      cache = FALSE, fig.asp = 1)
 
#load the libraries
library(tidyverse)
library(ggmap)
library(knitr)
library(igraph)
register_google(key = "<your_key>", account_type = "premium", day_limit = 100000) # replace <your_key> with your own API key
 
# http://www.generatorland.com/glgenerator.aspx?id=124
# https://www.fakepersongenerator.com/user-biography-generator
# https://fakena.me/fake-name/
# https://fakena.me/random-real-address/
# https://bit.ly/2INPTbT
# https://www.nature.com/articles/sdata201575
# https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28201
 
 
# load the data
bio_data <- readxl::read_excel('data/sample-data.xlsx')
 
network_data <- read_csv("data/sample_network.csv")
vertices_data <- read_csv("data/sample_vertices.csv")
```
 
```{r runall, include=FALSE}
# run through each row of the dataset
# https://stackoverflow.com/a/17105758
# https://stackoverflow.com/a/19156308
out <- NULL
for (i in seq_len(nrow(bio_data))) {
  out <- c(out, knitr::knit_child('indiv-briefing.Rmd'))
}
```
 
`r paste(out, collapse = '\n')`
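
To see the parent/child mechanics in isolation, here is a minimal sketch that you can run without the sample data or a Google key. The file names `parent.Rmd` and `child.Rmd`, the temporary folder, and the two-row `bio_data` are all made up for illustration; only the loop with `knitr::knit_child()` and the inline `paste()` mirror the parent file above. It calls `knitr::knit()` directly, so it stops at Markdown instead of producing slides.

```r
library(knitr)

# Hypothetical stand-in for bio_data (the article reads it from an Excel file).
bio_data <- data.frame(
  name = c("Ada", "Grace"),
  giving = c(100, 250),
  stringsAsFactors = FALSE
)

# Work in a temporary folder so nothing is written next to your own files.
work_dir <- tempfile("briefing-")
dir.create(work_dir)
fence <- strrep("`", 3)  # chunk delimiter, built here to keep this listing readable

# Child file: one section per row. It relies on `i` and `bio_data`
# already existing in the environment the parent is knitted in.
writeLines(c(
  "# `r bio_data$name[i]`",
  "",
  "Giving: `r bio_data$giving[i]`",
  ""
), file.path(work_dir, "child.Rmd"))

# Parent file: loop over the rows, knit the child once per row,
# then print the collected Markdown with an inline expression.
writeLines(c(
  paste0(fence, "{r runall, include=FALSE}"),
  "out <- NULL",
  "for (i in seq_len(nrow(bio_data))) {",
  "  out <- c(out, knitr::knit_child('child.Rmd', quiet = TRUE))",
  "}",
  fence,
  "",
  "`r paste(out, collapse = '\\n')`"
), file.path(work_dir, "parent.Rmd"))

# knit() produces Markdown only; RStudio's Knit button would also run pandoc.
md_file <- knit(
  input = file.path(work_dir, "parent.Rmd"),
  output = file.path(work_dir, "parent.md"),
  quiet = TRUE
)
report <- readLines(md_file)
```

The child file never receives `i` as an argument. It works because the loop variable lives in the same environment the child is knitted in, which is also why the real child file can refer to `bio_data`, `network_data`, and `vertices_data` directly.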

Child File

In this file, we specify each of the elements we want to see in the report. In this example, we show photos, brief summaries, names, addresses, a Google satellite view of each address, and a network graph. We named this file indiv-briefing.Rmd.

```{r, echo=FALSE}
data <- filter(bio_data, row_number() == i)
```
 
# `r as.character(select(data, name))` 
`r as.character(select(data, brief))`
 
<div class="dos-column-left">
```{r, out.width="30%"}
include_graphics("https://cataas.com/cat?type=sq") 
```
</div>
 
<div class="dos-column-right">
```{r, echo=FALSE, message=FALSE, fig.align='left', fig.width=3.5}
par(mar = rep(0.1, 4))
network_data %>% 
  filter(from == i | to == i) %>% 
  graph_from_data_frame(d = ., vertices = filter(vertices_data, id %in% unlist(c(.$from, .$to), use.names = FALSE))) %>%
  plot.igraph(vertex.color = "orange", vertex.label.cex = 1.2, vertex.label.color = "blue", edge.curved = TRUE)
```
</div>
 
<div class="dos-column-left">
- **Wealth Rating**: `r as.character(select(data, rating))`
- **Giving**: `r scales::dollar(as.numeric(select(data, giving)))`
- **Address**: `r as.character(select(data, address))`
</div>
 
<div class="dos-column-right">
```{r, echo=FALSE, message=FALSE, fig.align='left', fig.width=3.5}
ggmap(get_googlemap(as.character(select(data, address)), zoom = 20, maptype = "hybrid", size = c(300, 300), scale = 1), extent = "device")
```
</div>
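
The network chunk packs two filters into one pipe: it keeps the edges that touch person `i`, then keeps only the vertices that appear in those edges. Here is the same logic unrolled into separate steps, as a rough sketch on made-up data. The edge and vertex tables below, including the `label` column, are assumptions for illustration; the real sample files may have different columns.

```r
library(igraph)

# Hypothetical edge and vertex tables shaped like network_data and vertices_data.
network_data <- data.frame(from = c(1, 1, 2, 3), to = c(2, 3, 4, 4))
vertices_data <- data.frame(
  id = 1:4,
  label = c("Ada", "Grace", "Linus", "Margaret"),
  stringsAsFactors = FALSE
)

i <- 1  # in the report, the parent loop supplies this value

# Step 1: keep only the edges that start or end at person i.
ego_edges <- network_data[network_data$from == i | network_data$to == i, ]

# Step 2: keep only the vertices that appear in those edges.
ego_vertices <- vertices_data[vertices_data$id %in% c(ego_edges$from, ego_edges$to), ]

# Step 3: build the graph; the first column of `vertices` is used as the vertex name.
g <- graph_from_data_frame(d = ego_edges, vertices = ego_vertices)
```

Passing the filtered vertex table matters because `graph_from_data_frame()` expects every vertex named in the edge list to be present in `vertices`, and any extra rows would show up as unconnected vertices in the plot.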

Now, if you knit the parent file, you should get a report with one briefing for each row of the data.

I hope this helps. What are some of the challenges you have faced while creating automated, repeatable analysis?
