# Articles by atmathew

### Writing Functions in R: Example One

July 13, 2019 |

A. Background In previous posts, I covered a number of useful functions and packages for writing reusable code. I wanted to expand on that information by providing a working example of how to put together a function. In particular, I will walk through the process of generating a function that ... [Read more...]

### Powerlytics: Impact of Age, Gender, and Body Weight on Total Weight Lifted in Powerlifting Meets

June 30, 2019 |

A. Background The Open Powerlifting initiative attempts to create an accurate and open archive of all powerlifting meet data throughout the world. As someone who recently started competing again after a six-year break from powerlifting, I often mess around with the Open Powerlifting data as it’s of personal ...

### Examining the Tweeting Patterns of Prominent Crossfit Gyms

December 19, 2018 |

A. Introduction The growth of Crossfit has been one of the biggest developments in the fitness industry over the past decade. Promoted as both a physical exercise philosophy and a competitive fitness sport, Crossfit is a high-intensity fitness program incorporating elements from several sports and exercise protocols such ...

### Semiparametric Regression in R

March 4, 2018 |

A. INTRODUCTION When building statistical models, the goal is to define a compact and parsimonious mathematical representation of some data generating process. Many of these techniques require that one make assumptions about the data or how the analysis is specified. For example, Autoregressive Integrated Moving Average (ARIMA) models require ...

### Packages for Getting Started with Time Series Analysis in R

February 18, 2018 |

A. Motivation During the recent RStudio Conference, an attendee asked the panel about the lack of support provided by the tidyverse in relation to time series data. As someone who has spent the majority of their career on time series problems, this was somewhat surprising because R already has a ... [Read more...]

### Data.Table by Example – Part 3

September 30, 2017 |

For this final post, I will cover some advanced topics and discuss how to use data tables within user-generated functions. Once again, let’s use the Chicago crime data. Let’s start by subsetting the data. The following code takes the first 50000 rows within the dat dataset, selects four ...

### Data.Table by Example – Part 2

September 26, 2017 |

In part one, I provided an initial walkthrough of some nice features that are available within the data.table package. In particular, we saw how to filter data and get a count of rows by the date. Let us now add a few columns to our dataset on reported ... [Read more...]

### Data.Table by Example – Part 1

September 26, 2017 |

For many years, I actively avoided the data.table package and preferred to utilize the tools available in either base R or dplyr for data aggregation and exploration. However, over the past year, I have come to realize that this was a mistake. Data tables are incredible and provide R ...

### R Programming Notes – Part 2

July 17, 2017 |

In an older post, I discussed a number of functions that are useful for programming in R. I wanted to expand on that topic by covering other functions, packages, and tools that are useful. Over the past year, I have been working as an R programmer and these are some ... [Read more...]

October 16, 2016 |

For those of us who received statistical training outside of statistics departments, that training often emphasized procedures over principles. This meant that we learned about various statistical techniques and how to perform analysis in a particular statistical software, but glossed over the mechanisms and mathematical statistics underlying these practices. While that ... [Read more...]

### Introduction to the RMS Package

July 4, 2016 |

The rms package offers a variety of tools to build and evaluate regression models in R. Originally named ‘Design’, the package accompanies the book “Regression Modeling Strategies” by Frank Harrell, which is essential reading for anyone who works in the ‘data science’ space. Over the past year or so, I ... [Read more...]

### Batch Forecasting in R

February 29, 2016 |

Given a data frame with multiple columns which contain time series data, let’s say that we are interested in executing an automatic forecasting algorithm on a number of columns. Furthermore, we want to train the model on a particular number of observations and assess how well they forecast future ... [Read more...]

### R Programming Notes

February 17, 2016 |

I’ve been on a note-taking binge recently. This post covers a variety of topics related to programming in R. The contents were gathered from many sources and structured in such a way that it provided the author with a useful reference guide for understanding a number of useful ... [Read more...]

### Weekly R-Tips: Visualizing Predictions

February 4, 2016 |

Let’s say that we estimated a linear regression model on time series data with lagged predictors. The goal is to estimate sales as a function of inventory, search volume, and media spend from two months ago. After using the lm function to perform linear regression, we predict sales using values ...

### Weekly R-Tips: Importing Packages and User Inputs

December 11, 2015 |

Number 1: Importing Multiple Packages Anyone who has used R for some time has written code that required the use of multiple packages. In most cases, this will be done by using the library or require function to bring in the appropriate extensions. That’s nice and gets the desired result, ... [Read more...]

### Automate the Boring Stuff: GGPlot2

November 26, 2015 |

The majority of my interaction with the ggplot2 package involves the interactive execution of code to visualize data within the context of exploratory data analysis. This is often a manual process and quite laborious. I recently sought to improve these tasks by creating a series of user-defined functions that ... [Read more...]

### Applied Statistical Theory: Quantile Regression

November 13, 2015 |

This is part two of the ‘applied statistical theory’ series that will cover the bare essentials of various statistical techniques. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because ... [Read more...]

### Applied Statistical Theory: Belief Networks

October 21, 2015 |

Applied statistical theory is a new series that will cover the basic methodology and framework behind various statistical procedures. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the ... [Read more...]

### Basic Forecasting

October 17, 2015 |

Forecasting refers to the process of using statistical procedures to predict future values of a time series based on historical trends. For businesses, being able to gauge expected outcomes for a given time period is essential for managing marketing, planning, and finances. For example, an advertising agency may want to utilize ... [Read more...]

### A Few Days of Python: Using R in Python

September 28, 2015 |

Using R Functions in Python [Read more...]