Blog Archives

The R-Podcast Episode 13: Interview with Yihui Xie

May 23, 2013

It’s an episode of firsts on the R-Podcast! In this episode, recorded on location, I had the honor and privilege of interviewing Yihui Xie, author of many innovative packages such as knitr and animation. Some of the topics we discussed include: Yihui’s motivation for creating knitr and some key new features; how markdown plays a

Read more »

Test from knitr to wordpress

April 4, 2013

Title This is an R Markdown document. Markdown is a simple formatting syntax for authoring web pages (click the MD toolbar button for help on Markdown). When you click the Knit HTML button a web page will be generated that includes both content as well as the output of any embedded R code chunks within

Read more »

The R-Podcast Episode 12: Using Version Control with R

April 1, 2013

This is not an April Fool’s joke … The R-Podcast is back once again! In this episode, I discuss the concept of version control and how you can get started with using the Git VCS right now with your R projects. I also discuss a big batch of listener feedback, and highlight a couple of

Read more »

The R-Podcast Episode 11: Reproducible Analysis Part 1 (Introduction)

November 13, 2012

Season 2 of the R-Podcast is up and running! This episode begins a multi-part series on reproducible analysis using R. In this episode I discuss the usage of Sweave and LaTeX for producing reproducible reports, an introduction to the capabilities of the knitr package (more episodes will be coming dedicated to this package), and my

Read more »

The R-Podcast Episode 10: Adventures in Data Munging Part 2

September 16, 2012

I’m happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for

Read more »

The R-Podcast Episode 9: Adventures in Data Munging Part 1

August 5, 2012

It’s great to be back with a new episode after an eventful break! This episode begins a series on my adventures in data munging, a.k.a. data processing. I discuss three issues that demonstrate the flexibility and versatility R brings for recoding messy values, importing inconsistent data files, and pinpointing problematic observations and variables. We also

Read more »

The R-Podcast Screencast 2: Visualization with ggplot2

June 23, 2012

Here is the second screencast episode of the R-Podcast to accompany episode 8 of the R-Podcast: Visualization with ggplot2. In this screencast I demonstrate a real-time session of using ggplot2 to create boxplots for a visualization of hockey attendance in the NHL. The R code created in this screencast is available in our GitHub repository,

Read more »

The R-Podcast Episode 8: Visualization with ggplot2

June 20, 2012

I’m happy to present this jam-packed episode of the R-Podcast dedicated to using the ggplot2 package for visualization. This episode will have a companion screencast released in the next few days. I use data from the Hockey Summary Project to demonstrate how to create a series of boxplots of NHL regular season attendance for each

Read more »

The R-Podcast Episode 7: Best Practices for Workflow Management

May 28, 2012

Hello everybody, I am finally back with a new episode! In this episode: Hardware issues, major update to RStudio, new forums, and discussion on managing your workflow for projects. I discuss useful functions for executing R scripts and saving/loading R objects for future sessions, and summarize different solutions for organizing R code based on task

Read more »

The R-Podcast Episode 6: Importing Data from External Sources

April 29, 2012

In this episode: Listener feedback and importing data from external sources into R. We dive into the basics of importing delimited text files using read.table and its variants. We also discuss recommendations for importing MS Excel spreadsheet files, relational databases such as MySQL, data from HTML tables, and files produced by other statistical computing packages.

Read more »