Learning guide: Introduction to R, one-day workshop

[This article was first published on George J. Mount, and kindly contributed to R-bloggers.]

When I was an undergrad, a professor suggested I learn this statistical programming language called R.

I took one look at the interface, panicked, and left.

A lot has changed in the R world since then, not the least of which was the release of the RStudio integrated development environment. While the universe of R packages continues to grow, and the work can now be done from the comfort of RStudio, the fact remains: learning R means learning to code R.

Many of my students have never coded before, although this is a half-truth: they’ve probably used Excel, which requires a decent number of functions and cell references. What Excel doesn’t require, though, is naming and manipulating variables.

R is an ideal choice for first-time data coders: the familiar tabular data frame is a core structure. Operations are designed with data analysis in mind: after all, R is a statistical programming language. (In my opinion, this makes it preferable to Python, which was designed as a general-purpose scripting language, at least as far as learning to code as a data analyst goes.)

I assume no prior coding experience for this workshop. My goals are to equip students to work comfortably in the RStudio environment, ingest and explore data, and make simple graphical representations of data. In particular, students will perform the most common tabular data cleaning and exploration tasks using the dplyr package.

Above all these objectives, however, is my goal of helping students not panic over learning R, as I did when I started.

R Introduction workshop

Lesson 1: Welcome to the R Project

Objective: Student can install and load an R package

Description:

  • What is R and when would I use it?
  • R plus RStudio
  • Installing and loading packages

Exercise: Install a CRAN task view

Assets needed: None

Time: 35 minutes
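
As a taste of the "installing and loading packages" step, here is a minimal sketch. The helper name ensure_package is my own invention for illustration, not something that ships with R; it wraps the usual install.packages() and library() calls so a package is only downloaded when it is missing.

```r
# Hypothetical helper (not part of base R): install a package only if it is
# missing, then attach it to the current session.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from CRAN; needs an internet connection
  }
  library(pkg, character.only = TRUE)
}

# "tools" ships with R itself, so this call attaches it without downloading.
ensure_package("tools")

# In the workshop, the same pattern would apply to dplyr:
# ensure_package("dplyr")
```

In everyday use most people simply type install.packages("dplyr") once, then library(dplyr) at the top of each script.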

Lesson 2: Introduction to RStudio

Objective: Student can navigate the RStudio integrated development environment

Description:

  • Basic arithmetic and comparison operations
  • Saving, closing and loading scripts
  • Opening help documentation
  • Plotting graphs
  • Assigning objects

Exercises: Practice assigning and removing objects

Assets needed: None

Time: 40 minutes
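
The topics in this lesson can be sketched in a few lines typed into the RStudio console or a script. The object names below (total, is_big, workshop_score) are made up for illustration.

```r
# Arithmetic and comparison operations
total  <- 3 + 4 * 2     # multiplication is evaluated before addition
is_big <- total > 10    # a comparison returns TRUE or FALSE

# Assigning an object, checking that it exists, then removing it
workshop_score <- 95
had_score <- exists("workshop_score")
rm(workshop_score)
has_score <- exists("workshop_score")

# Opening help documentation (run interactively):
# ?mean
```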

Lesson 3: Working with vectors

Objective: Student can create, inspect and modify vectors

Description:

  • Creating vectors
  • Vector operations
  • Indexing elements of a vector

Exercises: Drills

Assets needed: None

Time: 35 minutes
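
A minimal sketch of the three vector topics above, using a made-up vector of heights in centimeters:

```r
# Creating a vector with c()
heights <- c(150, 162, 171, 180)

# Vector operations apply to every element at once
heights_m <- heights / 100

# Indexing by position and by condition
second <- heights[2]
tall   <- heights[heights > 160]

# Modifying one element in place
heights[4] <- 182

# Inspecting the result
length(heights)
mean(heights)
```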

Lesson 4: Working with data frames

Objective: Student can create, inspect and modify data frames

Description:

  • Creating a data frame
  • Data frame operations
  • Indexing data frames
  • Column calculations
  • Filtering and subsetting a data frame
  • Conducting exploratory data analysis on a data frame

Exercises: Drills

Assets needed: Iris dataset

Time: 70 minutes
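
Here is a rough sketch of these data frame operations in base R. It uses the built-in iris dataset named above; the scores data frame and the Sepal.Ratio column are invented for illustration.

```r
# Creating a small data frame by hand
scores <- data.frame(
  student = c("Ana", "Ben", "Cy"),
  score   = c(88, 92, 79)
)

# Working on a copy of the built-in iris data
flowers <- iris
dim(flowers)    # number of rows and columns
head(flowers)   # first six rows

# Indexing: rows 1 to 3, two columns selected by name
flowers[1:3, c("Sepal.Length", "Species")]

# Column calculation: add a derived column
flowers$Sepal.Ratio <- flowers$Sepal.Length / flowers$Sepal.Width

# Filtering: keep only the setosa rows
setosa <- flowers[flowers$Species == "setosa", ]

# Quick exploratory summary of one column
summary(flowers$Sepal.Length)
```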

Lesson 5: Reading, writing and exploring data frames

Objective: Student can read, write and analyze external tabular files

Description:

  • Reading and writing csv and txt files
  • Reading and writing Excel files
  • Exploring a dataset
  • Descriptive statistics

Exercises: Drills

Assets needed: Iris dataset

Time: 40 minutes
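
A minimal sketch of a csv round trip in base R, writing iris to a temporary file and reading it back. The temporary path is only a stand-in for wherever your own files live. Excel files need an add-on package and are left out of this sketch.

```r
# Write the built-in iris data to a temporary csv file
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back into a new data frame
iris_copy <- read.csv(csv_path)

# Exploring the dataset
str(iris_copy)    # column names and types
head(iris_copy)   # first six rows

# Descriptive statistics for one column
summary(iris_copy$Petal.Length)
```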

Lesson 6: Data manipulation with dplyr

Objective: Student can perform common data manipulation tasks with dplyr

Description:

  • Manipulating rows
  • Manipulating columns
  • Summarizing data

Exercises: Drills

Assets needed: Airport flight records

Time: 50 minutes
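
To give a flavor of the dplyr verbs in this lesson, here is a small sketch. The flights data frame is made up for illustration and is not the workshop's actual flight records; the sketch assumes dplyr is already installed.

```r
library(dplyr)

# Made-up flight records, for illustration only
flights <- data.frame(
  carrier   = c("AA", "AA", "DL", "DL", "UA"),
  origin    = c("JFK", "LGA", "JFK", "JFK", "LGA"),
  dep_delay = c(5, 20, -3, 12, 45),
  distance  = c(1000, 730, 760, 2400, 1400)
)

# Manipulating rows: keep delayed flights, longest delay first
delayed <- filter(flights, dep_delay > 0)
delayed <- arrange(delayed, desc(dep_delay))

# Manipulating columns: pick columns, then add a derived one
trimmed <- select(flights, carrier, distance)
trimmed <- mutate(trimmed, distance_km = distance * 1.609)

# Summarizing data: average delay per carrier
by_carrier       <- group_by(flights, carrier)
delay_by_carrier <- summarise(by_carrier, mean_delay = mean(dep_delay))
```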

Lesson 7: Data manipulation with dplyr, continued

Objective: Student can perform more advanced data manipulation with dplyr

Description:

  • Building a data pipeline
  • Joining two datasets
  • Reshaping a dataset

Exercises: Drills

Assets needed: Airport flight records

Time: 50 minutes
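
A pipeline chains several dplyr steps with the %>% operator, and a join brings in columns from a second table. The sketch below uses two made-up data frames (flights and airlines) as stand-ins for the workshop data and assumes dplyr is installed. Reshaping is usually handled by the companion tidyr package and is not shown here.

```r
library(dplyr)

# Made-up tables, for illustration only
flights <- data.frame(
  carrier   = c("AA", "AA", "DL", "DL", "UA"),
  dep_delay = c(5, 20, -3, 12, 45)
)
airlines <- data.frame(
  carrier = c("AA", "DL", "UA"),
  name    = c("American", "Delta", "United")
)

# Each step passes its result to the next one
carrier_delays <- flights %>%
  filter(dep_delay > 0) %>%                  # keep delayed flights
  group_by(carrier) %>%                      # one group per carrier
  summarise(mean_delay = mean(dep_delay)) %>%
  left_join(airlines, by = "carrier") %>%    # add the airline name
  arrange(desc(mean_delay))                  # worst average delay first
```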

Lesson 8: R for data visualization

Objective: Student can create graphs in R using visualization best practices

Description:

  • Graphics in base R
  • Visualizing a variable’s distribution
  • Visualizing values across categories
  • Visualizing trends over time
  • Graphics in ggplot2

Exercises: Drills

Assets needed: Airport flight records

Time: 70 minutes
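
The three chart types above can be sketched with base R graphics alone. All numbers below are invented for illustration; ggplot2 builds comparable charts with its own syntax and is not shown here.

```r
# Made-up data, for illustration only
dep_delay  <- c(5, 20, -3, 12, 45, 8, 0, 15, 30, -5)
carrier    <- c("AA", "DL", "UA")
mean_delay <- c(12.5, 4.5, 45)
month      <- 1:6
n_flights  <- c(120, 110, 135, 140, 150, 160)

# A variable's distribution: histogram
delay_hist <- hist(dep_delay,
                   main = "Departure delays",
                   xlab = "Delay (minutes)")

# Values across categories: bar chart
bar_mids <- barplot(mean_delay,
                    names.arg = carrier,
                    main = "Mean delay by carrier",
                    ylab = "Delay (minutes)")

# Trends over time: line chart
plot(month, n_flights, type = "l",
     main = "Flights per month",
     xlab = "Month", ylab = "Number of flights")
```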

Lesson 9: Capstone

Objective: Student can complete an end-to-end data exploration project in R

Assets needed: Baseball records

Time: 40 minutes

This download is part of my resource library. For exclusive free access, subscribe below.
