# Parallel Processing Baseball Data with R and mlbgameday

This article was first published on **Data Science Riot!** and kindly contributed to R-bloggers.


## Just In Time For Baseball

The `mlbgameday` package has just reached the milestone of version 0.1.0.

It is designed to facilitate extract, transform, and load (ETL) for MLBAM “Gameday” data, and it is optimized for parallel processing of data that may be larger than memory. There are other packages in the R universe that were built to perform statistics and visualizations on these data, but `mlbgameday` is concerned primarily with data collection. More uses of these data can be found in the pitchRx, openWAR, and baseballr packages.

## Install from CRAN

`install.packages("mlbgameday")`

## Parallel Processing

The package’s internal functions are optimized to work with the `doParallel` package. By default, R uses a single CPU core. The `doParallel` package lets us use several cores, which execute tasks simultaneously. For a standard regular season covering all teams, the package has to process more than 2,400 individual files, which, depending on your system, can take quite some time. Parallel processing can make this several times faster, depending on how many processor cores we choose to use.
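As a rough sketch of what registering a parallel backend looks like, the block below starts a two-worker cluster with `doParallel` and runs a toy `foreach` loop on it. The worker count and the toy loop are illustrative assumptions, not package requirements. The commented-out `get_payload()` call, with arbitrary example dates, marks where the package's download step would go once the backend is registered; it is left commented because it needs network access.

```r
library(doParallel)  # also attaches foreach and parallel

# Two workers keeps the sketch light; in practice something like
# parallel::detectCores() - 1 is a common choice.
cl <- makeCluster(2)
registerDoParallel(cl)

# How many workers foreach will dispatch to.
workers <- getDoParWorkers()

# With the backend registered, the package call is made as usual.
# (Assumed usage; requires mlbgameday and a network connection.)
# innings_df <- mlbgameday::get_payload(start = "2017-04-03", end = "2017-04-10")

# Toy stand-in task: each iteration runs on one of the workers.
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2

# Release the workers when finished.
stopCluster(cl)
```

Stopping the cluster at the end matters: worker processes otherwise linger and keep holding memory after the download has finished.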

## Non Parallel

Although the package is optimized for parallel processing, it will also work without registering a parallel backend. When querying only a single day’s data, a parallel backend may not provide much additional performance. However, a parallel backend is recommended for larger data sets, where it can make the process several times faster.

We can download and subset a small amount of data. In the example below, we’ll look for Jake Arrieta’s no-hitter in 2016.
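A minimal sketch of the subsetting step follows. The commented-out `get_payload()` call shows how the single day of the no-hitter (April 21, 2016) might be requested. Because that call needs network access, two small placeholder tables stand in for the payload's at-bat and pitch tables. Every row in them is invented for illustration, and the column names (`num`, `url`, `pitcher_name`, `pitch_type`) are assumptions about the payload's layout.

```r
# Assumed usage; requires mlbgameday and a network connection.
# gamedat <- mlbgameday::get_payload(start = "2016-04-21", end = "2016-04-21")

# Placeholder stand-ins for gamedat$atbat and gamedat$pitch (invented rows).
atbat <- data.frame(
  num = c(1, 2, 3),
  url = "placeholder-game",
  pitcher_name = c("Jake Arrieta", "Jake Arrieta", "Other Pitcher"),
  stringsAsFactors = FALSE
)
pitch <- data.frame(
  num = c(1, 1, 2, 3, 3),
  url = "placeholder-game",
  pitch_type = c("FF", "SL", "SI", "FF", "CU"),
  stringsAsFactors = FALSE
)

# Attach at-bat details to every pitch, then keep one pitcher's pitches.
pitches <- merge(pitch, atbat, by = c("num", "url"))
arrieta <- subset(pitches, pitcher_name == "Jake Arrieta")
```

Joining on both the at-bat number and the game identifier keeps pitches from different games apart when more than one game is in the payload.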
