Linpe: make sending and receiving data analysis faster and easier

October 4, 2016
(This article was first published on R blog | Quantide - R training & consulting, and kindly contributed to R-bloggers)

When performing an analysis in R, it is common to use data stored in basic formats such as .csv or .Rdata and then present the analysis using an .Rmd file.

This process solves a lot of processing-related problems and saves time by allowing the analyst to perform their statistical analysis and at the same time have their work in a nice format such as .pdf or .html ready to be shown to the recipients. However, this also means that whenever you need to send the analysis to someone, you need to knit the .Rmd and send at least two files:

  1. the data
  2. and the knitted .Rmd

This is usually not a major problem. However, what if you could embed the analysis in the data and send just a single file? In short:

Is it possible to send a single file instead of two?

The answer is yes: there is an R package designed specifically for this, which reduces the hassle of sending files back and forth. The package I am talking about is linpe, created by Andrea Spanò, managing partner at Quantide.

What does the linpe package do?

[Figure: how linpe works]

In short, the linpe package provides a neat set of functions to embed your .Rmd analysis in the data as an attribute, save the resulting object to a single file, send it to whoever needs it and finally render the .Rmd as a .pdf or .html file on the recipient's computer. Furthermore, if the recipient would like to add something to the analysis or make some changes, linpe allows them to extract the embedded .Rmd file and make all the necessary edits before sending the analysis back.
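
To make the idea of "embedding the analysis in the data as an attribute" concrete, here is a minimal sketch in base R only. It is not linpe's implementation, just an illustration of the general mechanism: the attribute name `rmd_docs`, the file names and the use of `saveRDS()` (rather than `save()`, which the tutorial below uses) are my own assumptions.

```r
# Conceptual sketch in base R only; this is NOT linpe's implementation.
# The attribute name "rmd_docs" and all file names are hypothetical.

# Write a tiny stand-in .Rmd file to a temporary location
rmd_path <- file.path(tempdir(), "demo-analysis.Rmd")
writeLines(c("---", "title: \"Demo\"", "---", "", "Some text."), rmd_path)

# Attach the document text to the data frame as a named-list attribute
dat <- mtcars
attr(dat, "rmd_docs") <- list("demo-analysis" = readLines(rmd_path))

# Save data and document together in a single file, then restore it
bundle_path <- file.path(tempdir(), "bundle.rds")
saveRDS(dat, bundle_path)
restored <- readRDS(bundle_path)

# The recipient can list the embedded documents and write one back to disk
doc_names <- names(attr(restored, "rmd_docs"))
out_path <- file.path(tempdir(), "extracted.Rmd")
writeLines(attr(restored, "rmd_docs")[["demo-analysis"]], out_path)
```

Storing the documents in a named list is one simple way to keep several analyses attached to the same data frame while still being able to look each one up by name.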

How does it work?

Let me walk you through a short tutorial on how to use the linpe package.

First of all, suppose you have just completed your analysis and saved it as an .Rmd file, for instance test-linpe.Rmd, with the following content:

# ---
# title: "Test linpe"
# output: html_document
# ---
# 
# ```{r setup, include=FALSE}
# knitr::opts_chunk$set(echo = TRUE)
# ```
# 
# 
# ```{r, message = FALSE}
# require(dplyr)
# require(ggplot2)
# ```
# 
# Do something
# 
# ```{r}
# mtcars %>% 
#  as_tibble() %>%
#  group_by(cyl) %>%
#  summarise(n = n(), mean_mpg = mean(mpg), sd_mpg = sd(mpg))
# ```  
#
#
# Plot something
#
# ```{r}
# ggplot(mtcars, aes(disp, mpg)) + geom_point()
# ```

At this point you would usually send both the .Rmd and the data file by email (or share them via Dropbox, Google Drive or similar). However, if you decide to use linpe, you can do the following instead:

require(linpe)
# Link .Rmd to the data using linpe
mtcars_linpe <- link(mtcars, file = "test-linpe.Rmd")

Now the .Rmd file test-linpe.Rmd has just been linked to the data and you can save the resulting object using the basic save function and then send it to your colleagues.

From now on, I’ll be referring to the .Rmd file attached to the data frame as a linpe.

Quoting the README file in the GitHub repo of the package:

Any .Rmd linked to a data frame is known as a linpe.

# Save the linked dataset as an Rdata file
save(mtcars_linpe, file = "mtcars-linpe.Rdata")

Once your colleague receives the mtcars-linpe.Rdata file, they can load it into R with the usual load function:

# Reload
load("mtcars-linpe.Rdata")
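
One thing to keep in mind is that load() restores the object under the name it was saved with, so it can silently overwrite an object of the same name already in the recipient's workspace. A cautious alternative, sketched below with base R only, is to load into a separate environment first. The example uses a plain copy of mtcars as a stand-in for the linked object, and the names `received` and `colleague_data` are purely illustrative.

```r
# Stand-in for the file a colleague would send (plain data frame, no linpe needed)
mtcars_linpe <- mtcars
rdata_path <- file.path(tempdir(), "mtcars-linpe.Rdata")
save(mtcars_linpe, file = rdata_path)
rm(mtcars_linpe)

# Load into a separate environment so nothing in the workspace is overwritten
received <- new.env()
loaded_names <- load(rdata_path, envir = received)

# load() returns (invisibly) the names of the restored objects
print(loaded_names)

# Pull the object out under a name of your choosing
colleague_data <- get(loaded_names[1], envir = received)
```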

Since you can link more than one .Rmd file using linpe, there is a specific function that lets you check which linpes are available to be rendered once the .Rdata object has been loaded. The linpe function returns the names of the available linpes.

The ability to link more than one analysis to the data is perhaps one of the most interesting features. If you are running three kinds of analysis on the same dataset and would like to keep them separate from each other, then you would have at least 4 files to send (the data plus three .Rmd files). In that case linpe can make your life (and the recipient's) much easier.

Let’s check which linpes are available within the mtcars_linpe dataset:

# Check name of linpes
linpe(mtcars_linpe)

##########################
## Output
##########################

# [1] "test-linpe"

Once you have found the one you need, you can easily render it with the perform function. Note that the second argument of this function is the name of the linpe to be rendered.

# Render the linpe
perform(mtcars_linpe, linpe = "test-linpe")

This line of code will render the .Rmd file attached to the data in the exact same way it happens when clicking knit in RStudio. Note that, of course, you’ll need to have installed all the packages used in the analysis in order for the rendering to go through without errors.
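
Since a missing package only shows up as an error halfway through rendering, the recipient may want to check up front. Here is a small sketch using base R's requireNamespace(); the package list is simply the two packages loaded in the example test-linpe.Rmd above, and for another analysis you would need to adjust it by hand.

```r
# Packages used by the example test-linpe.Rmd shown earlier
needed <- c("dplyr", "ggplot2")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Install before rendering: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```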

If you need to add something to the analysis or make some edits, you can open the source .Rmd file with the display function:

# Display .Rmd linked to the linpe
display(mtcars_linpe, linpe = "test-linpe")

Finally, if you would like to remove the ‘linpe’ attribute from the data frame, you can use the unlink function as follows:

mtcars_linpe <- unlink(mtcars_linpe, linpe = "test-linpe")
linpe(mtcars_linpe)

##########################
## Output
##########################

# No limpe in mtcars_linpe 
# character(0)
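
A side note on naming: base R also has a function called unlink(), which deletes files and directories. When an attached package provides a function with the same name, an unqualified unlink() call finds the package version first. I am not making any claim here about how linpe's function treats file paths; the sketch below only shows the cautious habit of qualifying the call with base:: when you really mean to delete a file. The scratch file is created just for the demonstration.

```r
# Create a scratch file to delete
scratch <- tempfile(fileext = ".txt")
writeLines("temporary", scratch)
existed_before <- file.exists(scratch)

# Qualify the call so it always reaches base R's file-deleting unlink(),
# regardless of which packages are attached
status <- base::unlink(scratch)
exists_after <- file.exists(scratch)
```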

Where can I find this package?

The linpe package can be downloaded directly from its GitHub repository and is released under the GPL license.

Sending and receiving a lot of files can sometimes be frustrating. The aim of linpe is to make this process more fluid and lean, while keeping all the advantages that make R Markdown so attractive for performing analysis.

Thank you for reading this article. Please feel free to leave a comment if you have any questions or suggestions, and share the post with others if you find it useful.

 

The post Linpe: make sending and receiving data analysis faster and easier appeared first on Quantide – R training & consulting.
