Johns Hopkins University-Coursera Data Science Specialization

(This article was first published on Reimagined Invention, and kindly contributed to R-bloggers)

I completed this specialization nearly a year ago, but I never wrote about it in detail. You can read more about this specialization here.

What did I learn? Everything from asking questions to making inferences, publishing results, and more. The specialization focuses on reproducible research and communicating results. Most courses have both quizzes and projects. I had the chance to see projects solved with approaches totally different from mine, and I learned a lot from that.

I liked it because I had no knowledge of R and needed to use R to complete my thesis. The courses are well structured and focus on practical applications rather than on statistical theory. At first it was hard, as I had to read a lot and write a lot of code that is not needed in programs such as SPSS or Stata.

Good points

  • Self-contained courses
  • Good course materials (texts and videos)
  • You can study at your own pace and learn from others’ projects

Bad points

  • Assignments are partially based on peer reviewing
  • Some reviewers give low grades without providing details
  • Good feedback should be promoted and enhanced

Here you can find R material, including quizzes, assignments, exercises, and my own tricks and functions that I created for the courses in the specialization. It is available for educational purposes.

Course 1 • The Data Scientist’s Toolbox


This course teaches you how to set up a Github account and sync files. There are no quizzes or assignments other than those related to configuring and using Github.

Course 2 • R Programming


  • Week 1: Overview of R, R data types and objects, reading and writing data.
  • Week 2: Control structures, functions, scoping rules, dates and times.
  • Week 3: Loop functions, debugging tools.
  • Week 4: Simulation, code profiling.
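As a small illustration of the "loop functions" from Week 3, here is a minimal sketch using base R's apply family. The `scores` list and its values are made up for this example; `mtcars` is a dataset that ships with R. This is not taken from the course materials, just one way to show the idea.

```r
# Hypothetical toy data: a named list of numeric vectors
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# lapply() applies a function to each element and always returns a list
means_list <- lapply(scores, mean)

# sapply() does the same but simplifies the result to a named vector when it can
means_vec <- sapply(scores, mean)

# tapply() applies a function within groups defined by a factor-like vector;
# here: mean miles per gallon for each number of cylinders in mtcars
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)

print(means_vec)
print(mpg_by_cyl)
```

Compared with an explicit `for` loop, these functions avoid creating and filling a result container by hand, which is part of why they feel unfamiliar to someone coming from SPSS or Stata.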

Github Repository

Course 3 • Getting and Cleaning Data


  • Obtain data from a variety of sources.
  • Apply the basic tools for data cleaning and manipulation.

Github Repository

Course 4 • Exploratory Data Analysis


  • Visual representations of data using the base, lattice, and ggplot2 plotting systems in R.
  • Exploratory summaries of data.
  • Create visualizations of multidimensional data using exploratory multivariate statistical techniques.

Github Repository

Course 5 • Reproducible Research


  • Use of R markdown.
  • Integrate R code into a literate statistical program.
  • Organize a data analysis so that it is reproducible and accessible to others.

Github Repository

Course 6 • Statistical Inference


  • Fundamentals of statistical inference.
  • Assumptions and modes of performing statistical inference.

Github Repository

Course 7 • Regression Models


  • How to fit regression models.
  • How to interpret coefficients.
  • How to investigate residuals and variability.
  • Special cases of regression models including use of dummy variables and multivariable adjustment.
  • Extensions to generalized linear models, especially considering Poisson and logistic regression.
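The topics above can be sketched in a few lines of base R. This is only an illustrative example on the built-in `mtcars` dataset; the choice of variables is my own assumption and is not the course's actual assignment.

```r
# Linear regression with a dummy variable: factor(am) encodes
# transmission type (0 = automatic, 1 = manual) as an indicator
fit_lm <- lm(mpg ~ wt + factor(am), data = mtcars)

# Coefficients: intercept, slope for weight, shift for manual transmission
print(coef(fit_lm))

# A quick look at the residuals of the linear model
print(summary(resid(fit_lm)))

# Logistic regression: binary outcome (am) modeled with a binomial family
fit_logit <- glm(am ~ wt, data = mtcars, family = binomial)

# Poisson regression: count-like outcome (number of carburetors)
fit_pois <- glm(carb ~ wt, data = mtcars, family = poisson)

print(coef(fit_logit))
print(coef(fit_pois))
```

The same formula interface is used by `lm()` and `glm()`; only the `family` argument changes when moving from ordinary least squares to logistic or Poisson regression.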

Github Repository

Course 8 • Practical Machine Learning


  • Components of a machine learning algorithm.
  • Apply multiple basic machine learning tools.
  • Apply machine learning tools to build and evaluate predictors on real data.

Github Repository

Course 9 • Developing Data Products


  • How to communicate using statistics and statistical products.
  • Emphasis on communicating uncertainty in statistical results.
  • How to create simple Shiny web applications and R packages.

Course project

Github Repository

Course 10 • Data Science Capstone


This is the final project required to obtain the certification, and the code won’t be uploaded to avoid plagiarism. The Web Application (Shiny) is working for demo purposes.

Web Application

Github Repository
