Data Science with R (and RStudio)

[This article was first published on R / Notes, and kindly contributed to R-bloggers].

This blog has been silent for a while, and the Covid-19 pandemic has forced me to ditch my R to-do list for 2021. I did, however, manage to assemble a few R-related things in the past couple of years. This note documents the main one, a Data Science with R (and RStudio) course aimed at social scientists.

Historical side note

Around two years ago, I was offered the chance to teach R again at Sciences Po, in Paris, in a spirit close to the Stata-based course that I have been teaching there for over ten years.

I first taught R to social scientists in 2013, but had not repeated the experience since then, except through various short and often focused workshops. I almost got to teach such a course in 2017, just as RStudio Desktop was turning 1.0, but that course failed to materialize.

Many things have changed since 2013, and there is now much higher demand to teach R (and RStudio) to social science audiences. R and RStudio have improved a lot, and the tidyverse, which recently turned 2.0 while still changing a lot, has become a core component of most courses, including mine.

Teaching material

The material for my own attempt to teach R, RStudio and the tidyverse in 2023 has been online for a few months, in the form of a GitHub repository with a few wiki pages, including a long list of readings, videos and Web links, and another list of other R courses.

I have also uploaded a tentative syllabus for the course.

The course has only run once so far, and there are many issues with it that I will try to fix in the coming months. The repository is also missing some essential course items (the slides, and the solutions to the exercises), which I am nonetheless happy to share privately by email.

A cool aspect of the course is that another instructor, Kim Antunez, will be teaching her own fork of it in the next few weeks. Kim has invested a lot into turning the course into a full-fledged Quarto website, which I will share in a follow-up post once she is done building it.

My own way of teaching the course is more old-school, as I rely on weekly emails and a shared Google Drive folder. I will, however, put some effort into improving the slides and giving the course a Web page, in order to make it more fully and easily accessible online.

Going forward

I feel that I already have enough material to assemble a more advanced R course for social scientists, but first need to streamline this introductory course a bit more, in order to make the reading list, especially, a bit more focused and manageable.

I also feel that there will soon be more changes to the tidyverse that I will have to take into account. I still, for instance, use the %>% pipe to chain operations, whereas the current trend is to use the native |> pipe, introduced in R version 4.1.0, whenever possible.
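To make the difference between the two pipes concrete, here is a minimal sketch that runs the same chain of operations both ways. It is not taken from the course material: the built-in mtcars data set and the object names are my own choices for illustration, and the code assumes R 4.1.0 or later with the magrittr package installed.

```r
# Assumes R >= 4.1.0 (native pipe) and an installed magrittr package.
library(magrittr)

# Illustrative example only: average fuel consumption of 4-cylinder cars
# in the built-in mtcars data set.

# With the magrittr pipe, as re-exported by most tidyverse packages
mean_mpg_magrittr <- mtcars %>%
  subset(cyl == 4) %>%
  with(mean(mpg))

# With the native pipe, which needs no package at all
mean_mpg_native <- mtcars |>
  subset(cyl == 4) |>
  with(mean(mpg))

# Both chains pass the left-hand side as the first argument of the next
# call, so they should give the same result.
identical(mean_mpg_magrittr, mean_mpg_native)
```

For simple chains like this one, the two pipes are interchangeable. They differ in the details, for instance in how a placeholder is written when the piped value is not the first argument (`.` with magrittr, `_` with the native pipe from R 4.2.0 onwards), which is part of what makes the switch less trivial in teaching material.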

This note is tangentially related to my previous notes on teaching with RStudio, on R as a data science language and on other technologies for data science.
