Articles by Andrew Landgraf

Time Stacking and Time Slicing in R

December 24, 2014 | Andrew Landgraf

Time lapses are a fun way to quickly show a long period of time. They typically involve setting up your camera on a tripod and taking photos at a regular interval, like every 5 seconds. After all the photos have been taken, they are combined into a mov... [Read more...]

Yet Another Baseball Defense Statistic

April 22, 2014 | Andrew Landgraf

Fangraphs recently published an interesting dataset that measures defensive efficiency of fielders. For each player, the Inside Edge dataset breaks their opportunities to make plays into five categories, ranging from almost impossible to routine. It al... [Read more...]

Top Songs by Artist on CD102.5 in 2013

December 27, 2013 | Andrew Landgraf

In a previous post, I showed you how to scrape playlist data from Columbus, OH alternative rock station CD102.5. Since it's the end of the year and best-of lists are all the fad, I thought I would share the most popular songs and artists of the year, a... [Read more...]

Downloading and Analyzing CD1025’s Playlist

August 20, 2013 | Andrew Landgraf

CD1025 is an “alternative” radio station here in Columbus. They are one of the few remaining radio stations that are independently owned and they take great pride in it. For data nerds like me, they also put a real time list of recently played songs on their website. The page ... [Read more...]

Copying Data from Excel to R and Back

February 24, 2013 | Andrew Landgraf

A lot of times we are given a data set in Excel format and we want to run a quick analysis using R's functionality to look at advanced statistics or make better visualizations. There are packages for importing/exporting data from/to Excel, but I have f... [Read more...]

Restricted Boltzmann Machines in R

January 14, 2013 | Andrew Landgraf

Restricted Boltzmann Machines (RBMs) are an unsupervised learning method (like principal components). An RBM is a probabilistic and undirected graphical model. They are becoming more popular in machine learning due to recent success in training them with contrastive divergence. They have been proven useful in collaborative filtering, being one of ... [Read more...]

Factor Analysis of Baseball’s Hall of Fame Voters

January 9, 2013 | Andrew Landgraf

Recently, Nate Silver wrote a post which analyzed how voters who voted for and against Barry Bonds for Baseball's Hall of Fame differed. Not surprisingly, those who voted for Bonds were more likely to vote for other suspected steroid users (like Roger Clemens). ... [Read more...]

Random Forest Variable Importance

July 19, 2012 | Andrew Landgraf

Random forests ™ are great. They are one of the best "black-box" supervised learning methods. If you have lots of data and lots of predictor variables, you can do worse than random forests. They can deal with messy, real data. If there are lots of extraneous predictors, they have no problem. ... [Read more...]

Rounding in R

June 15, 2012 | Andrew Landgraf

Forgive me if you are already aware of this, but I found it quite alarming. I know that most code is interpreted by the computer in binary and we input in decimal, so problems can arise in conversion and with floating point. But the example I have below is so ... [Read more...]

Sending a Text in R

May 25, 2012 | Andrew Landgraf

Don't you hate it when you are running a long piece of code and you keep checking the results every 15 minutes, hoping it will finish? There is a better way. I got the idea from here. He uses a Python script and the text interface is not free. I thought... [Read more...]

Cleveland Indians’ Attendance

May 20, 2012 | Andrew Landgraf

Recently, Chris Perez, the closer for the Indians, displayed some frustration with the fans for not supporting the team. Currently, they have the lowest attendance in the majors -- by a decent margin. The Indians are averaging about 15,000 fans per hom... [Read more...]

What’s Up with Albert Pujols?

May 5, 2012 | Andrew Landgraf

After signing a huge deal with the Angels, Pujols has been having a really bad year. He hasn't hit a home run this year, breaking a career-long streak. So I thought it would be a good idea to use some statistics to tell how good or bad we think ... [Read more...]

Visualizing the Correlations of a Matrix

February 17, 2012 | Andrew Landgraf

Correlation matrices are a common way to look at the dependence of a set of variables. When the variables have spatial relationships, the correlation matrix loses some information. Let's say you have repeated observations, each one being a matrix. For ex... [Read more...]
