The Team Data Science Process

October 17, 2016
(This article was first published on Revolutions, and kindly contributed to R-bloggers)

As more and more organizations are setting up teams of data scientists to make sense of the massive amounts of data they collect, the need grows for a standardized process for managing the work of those teams. To help with this, the data science team at Microsoft has drawn on their experience with large-scale data science projects to develop the Team Data Science Process. The process is built around this data science lifecycle:

[Figure: the Team Data Science Process lifecycle]

The Team Data Science Process proposes a standardized directory structure for managing the data, code and documents of a data science project, and provides for tracking those artifacts with a version control system such as Git. It also proposes a shared, distributed analytics infrastructure to supply the computational and storage resources that data scientists' tools rely on. Finally, it provides two open-source utilities to support data scientists.
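To make the directory-structure idea concrete, here is a minimal sketch in Python. It is an illustration only and is separate from the utilities the process provides. The folder names (`data/raw`, `data/processed`, `code`, `docs`) and the `scaffold_project` helper are hypothetical and are not taken from the Team Data Science Process repository. The sketch creates a project skeleton and drops a placeholder file into each folder, because Git does not track empty directories.

```python
from pathlib import Path

# Hypothetical folder names chosen for illustration only; consult the
# Team Data Science Process repository for the layout it actually proposes.
DEFAULT_LAYOUT = ("data/raw", "data/processed", "code", "docs")


def scaffold_project(root, layout=DEFAULT_LAYOUT):
    """Create the project folders under root and return the created paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)
        # Git does not track empty directories, so add a placeholder file.
        (folder / ".gitkeep").touch()
        created.append(folder)
    return created


if __name__ == "__main__":
    import tempfile

    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in scaffold_project(Path(tmp) / "my-project"):
            print(folder.relative_to(tmp))
```

Running `git init` in the project root and committing the result would then put the skeleton under version control, so every team member starts from the same layout.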

You can find more background on the Team Data Science Process in this blog post, and you can also watch this presentation by the developers of the process at the Data Science Summit.

You can download the various artifacts of the Team Data Science Process (and even suggest improvements via a pull request) at the GitHub repository linked below.

GitHub (Azure): Team Data Science Process from Microsoft
