Blog Archives

Using Countdown Clock Data to Understand the New York City Subway

June 6, 2018

If you’ve been on a New York City subway platform since January 2018, you have probably noticed a countdown clock displaying an estimate of when the next train will arrive. Although there’s no official record of when trains actually stopped at each station, the countdown clock data can be used to approximate one. Over the past 5 months, I’ve...

Assessing Shooting Performance in NBA and NCAA Basketball

April 2, 2018

I wrote an open-source app called NBA Shots DB that uses the NBA Stats API to populate a database with all 4.5 million shots attempted in NBA games since 1996. The app also processes a dataset provided by Sportradar of over 1 million NCAA men’s shot attempts since 2013 into a format that can be merged with the NBA...

When are Citi Bikes Faster than Taxis in New York City?

September 26, 2017

Every day in New York City, millions of commuters take part in a giant race to determine transportation supremacy. Cars, bikes, subways, buses, ferries, and more all compete against one another, but we never get much explicit feedback as to who “wins.” I’ve previously written about NYC’s public taxi data and Citi Bike share data, and it occurred to...

The Simpsons by the Data

September 28, 2016

The Simpsons needs no introduction. At 27 seasons and counting, it’s the longest-running scripted series in the history of American primetime television. The show’s longevity, and the fact that it’s animated, provides a vast and relatively unchanging universe of characters to study. It’s easier for an animated show to scale to hundreds of recurring characters; without live-action actors to grow...

BallR: Interactive NBA Shot Charts with R and Shiny

March 8, 2016

The NBA’s Stats API provides data for every single shot attempted during an NBA game since 1996, including location coordinates on the court. I built a tool called BallR, using R’s Shiny framework, to explore NBA shot data at the player-level. BallR lets you select a player and season, then creates a customizable chart that shows shot patterns across the...

A Tale of Twenty-Two Million Citi Bikes: Analyzing the NYC Bike Share System

January 13, 2016

In the conclusion of my post analyzing NYC taxi and Uber trips, I noted that Citi Bike, New York City’s bike share system, also releases public data, totaling 22.2 million rides from July 2013 through November 2015. With the recent news that the Citi Bike system topped 10 million rides in 2015, making it one of the world’s largest...

Analyzing 1.1 Billion NYC Taxi and Uber Trips, with a Vengeance

November 17, 2015

The New York City Taxi & Limousine Commission has released a staggeringly detailed historical dataset covering over 1.1 billion individual taxi trips in the city from January 2009 through June 2015. Taken as a whole, the detailed trip-level data is more than just a vast list of taxi pickup and drop off coordinates: it’s a story of New York....

A Statistical Analysis of the LearnedLeague Trivia Competition

July 21, 2015

LearnedLeague bills itself as “the greatest web-based trivia league in all of civilized earth.” Having been fortunate enough to partake in the past 3 seasons, I’m inclined to agree. LearnedLeague players, known as “LLamas”, answer trivia questions drawn from 18 assorted categories, and one of the many neat things about LearnedLeague is that it provides detailed statistics into your performance...

Mortgages Are About Math: Open-Source Loan-Level Analysis of Fannie and Freddie

June 9, 2015

Mortgages were acknowledged to be the most mathematically complex securities in the marketplace. The complexity arose entirely out of the option the homeowner has to prepay his loan; it was poetic that the single financial complexity contributed to the marketplace by the common man was the Gordian knot giving the best brains on Wall Street a run...

The reddit Front Page is Not a Meritocracy

November 6, 2014

I was pleasantly surprised when somebody shared my traveling salesman animation to reddit and the post made it all the way to reddit's default front page (i.e. the top 25). The gif racked up over 1.3 million pageviews on Imgur, a testament to reddit's traffic-generating prowess. Before the post made it to the front page, though, it was brought...
