Tutorial: GitHub for Data Scientists without the Terminal

May 21, 2016

(This article was first published on R – Modern Data, and kindly contributed to R-bloggers)

Git and GitHub are indispensable tools for anyone analysing data, developing software or disseminating results. Originally designed for software engineers, GitHub is now widely used in many disciplines, especially by researchers in academia. Using a source code management platform such as GitHub to host your code and keep detailed project documentation is a huge step towards ensuring that research is reproducible. It also makes it easier for others to build upon the work you have already done, which leads to more efficient use of research time and may well increase your citation count.

Learning Git and GitHub can be a daunting task, especially if you’re not used to working with the command line (a.k.a. the terminal). With this in mind, we created a new introductory tutorial, aimed at data scientists using R, titled:

GitHub for Data Scientists without the Terminal

We provide step-by-step instructions and detailed screenshots to guide you along the way. You will learn about:

  1. Installing Git
  2. Signing up for a GitHub account and completing a Hello World tutorial
  3. Installing GitHub Desktop
  4. Putting R code under version control, using PCA as an example
  5. Creating a branch, opening a pull request and merging
  6. Using the Git functionality in RStudio
  7. Creating and publishing an R Markdown document
  8. Creating an online CV

It is not uncommon now for employers to prioritize your GitHub portfolio over your CV. This tutorial demonstrates how simple it is to get up and running with GitHub. In addition to having an easy-to-use interface, GitHub allows you to easily create websites and host dynamic documents. I encourage you to adopt this workflow, whether you work in industry or academia, to showcase your work, increase efficiency and ensure reproducibility.

