Site icon R-bloggers

A two-hour introduction to data analysis in R

[This article was first published on Citizen-Statistician » R Project, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A few weeks ago I gave a two-hour Introduction to R workshop for the Master of Engineering Management students at Duke. The session was organized by the student-led Career Development and Alumni Relations committee within this program. The slides for the workshop can be found here and the source code is available on GitHub.

Why might this be of interest to you?

 
##    BikeAge_Gr crash_count
##    (chr)      (int)
## 1  0-5        60
## 2  10-Jun     421
## 3  15-Nov     747
## 4  16-19      605
## 5  20-24      680
## 6  25-29      430
## 7  30-39      658
## 8  40-49      920
## 9  50-59      739
## 10 60-69      274
## 11 70         12
## 12 70+        58

Obviously the age groups 10-Jun and 15-Nov don’t make sense. This is a great opportunity to highlight the importance of exploring the data before modeling or doing something more advanced with it. It is also an opportunity to demonstrate how merely opening a file in Excel can result in unexpected issues. These age groups should instead be 6-10 (not June 10th) and 11-15 (not November 15th). Making these corrections also provides an opportunity to talk about text processing in R.

I should admit that I don’t have evidence of Excel causing this issue. However this is my best guess since “helping” the user by formatting date fields is standard Excel behaviour. There may be other software out there that also do this that I’m unaware of…

If you’re looking for a non-diamonds or non-nycflights13 introduction to R / ggplot2 / dplyr feel free to use materials from this workshop.

To leave a comment for the author, please follow the link and comment on their blog: Citizen-Statistician » R Project.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.