11 Million Yellow Slips – City of Toronto Parking Tickets, 2008-2011

June 2, 2012
(This article was first published on everyday analytics, and kindly contributed to R-bloggers)

Introduction

I don't know about you, but I really hate getting parking tickets. Sometimes I feel like it's all just a giant cash grab. Really? I can't park there between the hours of 11 and 3, but every other time is okay? Well, why the hell not?

But ah, such is life. Rules must be in place to keep civil order, keep the engines of city life running and prevent total chaos in the downtown core. However, knowing this does not make it any easier to come out to the street and find that bright yellow slip of paper under your windshield wiper.

Like everything else in the universe, parking tickets are a source of data. The great people at Open Data Toronto (@Open_TO) have provided all the data from every parking ticket issued in Toronto from 2008 to the end of last year.

So, let us dive in and have a look. We might just discover why we keep getting all these tickets, or at least ease the collective pain a little by realizing how many others are sharing in it.

Background

The data set is an anonymized record of every parking ticket issued in the city of Toronto between 01/01/2008 and 12/31/2011. The fields provided are: the anonymized ticket #, date of infraction, infraction code, description, fine amount, time of infraction, and location (address).

The data set and more information can be found in Open Data Toronto's data catalogue here.

Originally I had this brilliant idea to geocode every data point, and then create an awesome heat map of the geographical distribution of parking tickets issued. However, given that there are ~11 million records and the Google Maps API has a limit of 2,500 geocoding requests per day, even if I were completely diligent and performed the task daily it would still take approximately 4,400 days, or about 12 years, to complete. And no, I am not paying to use the API for Business (which at a limit of 100,000 requests per day would still take ~3.5 months).
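The back-of-the-envelope arithmetic above can be checked in a few lines of R. This is only a rough sketch: the variable names are made up for illustration, the record count is the approximate ~11 million quoted above, and a month is assumed to be about 30 days.

```r
# Rough estimate of how long geocoding every record would take.
# Figures are the approximate ones quoted in the text.
n_records    <- 11e6     # ~11 million tickets
free_per_day <- 2500     # daily geocoding limit quoted for the free API
paid_per_day <- 100000   # daily limit quoted for the API for Business

days_free   <- n_records / free_per_day
years_free  <- days_free / 365
days_paid   <- n_records / paid_per_day
months_paid <- days_paid / 30   # assumption: ~30 days per month

cat(days_free, "days at the free limit, or about", round(years_free, 1), "years\n")
cat(days_paid, "days at the paid limit, or about", round(months_paid, 1), "months\n")
```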

If anyone knows a way around this, please drop me an email and fill me in.

Otherwise, you can check out prior art. Patrick Cain at Global News created an awesome interactive map of aggregated parking ticket data from 2010 for locations in the city where over 500 tickets were issued. This turns out to be mainly hospitals, and unsurprisingly, tickets are clustered in the downtown core. Mr. Cain did a similar analysis while at the Toronto Star back in 2009, using data from the previous year.

I just don't like throwing out data points.

Analysis

Parking Infractions by Type
We begin with the parking tickets for the period broken down by infraction type. A simple bar chart outlines the most common parking ticket types:



We will consider those codes which stick out most on the bar chart (the top 10):

> sort(codeTable, decreasing=TRUE)[1:10]
    005     029     210     003     207     009     002     008     006     015
2336433 1822690 1366945 1354671  933478  718692  496283  443706  369079  173078
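For anyone following along, a table like codeTable could be built roughly as follows. This is a sketch on a tiny invented data frame, not the actual code behind the analysis: the names tickets and infraction_code are assumptions, and the real data has ~11 million rows.

```r
# Toy stand-in for the ticket data. The data frame and column names
# are assumptions for illustration, and the rows are invented.
tickets <- data.frame(
  infraction_code = c("005", "029", "005", "210", "005", "029"),
  stringsAsFactors = FALSE
)

# Count tickets per infraction code.
codeTable <- table(tickets$infraction_code)

# Rank from most to least common and keep the top entries.
topCodes <- sort(codeTable, decreasing = TRUE)[1:2]
print(topCodes)

# Bar chart of all counts, most common first.
barplot(sort(codeTable, decreasing = TRUE), las = 2,
        xlab = "Infraction code", ylab = "Tickets issued")
```

Keeping the codes as character strings preserves the leading zeros (005 rather than 5), which matters when matching them back to the descriptions.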

Putting that into more human-readable format, the most commonly issued types of parking infractions were:

1. 005 - Park on Highway at Prohibited Time of Day
2. 029 - Park Prohibited Place/Time - No Permit
3. 210 - Park Fail to Display Receipt
4. 003 - Park on Private Property w/o Consent
5. 207 - Park w/o ticket from machine
6. 009 - Stop on Highway at Prohibited Time/Day
7. 002 - Park Longer than 3 Hours
8. 008 - Vehicle Standing Prohibited Time/Day
9. 006 - Park on Highway - Excess of Permitted Time
10. 015 - Park within 3M of Fire Hydrant

In case you were wondering, the most expensive tickets (in the range of hundreds of dollars, the max being $450 [!!]) are all related to handicapped parking spaces.

Time Distribution of Parking Infractions
Let us now consider the parking ticket information with regard to time. First and foremost, we consider the ticket data as a simple time series and plot the data for exploratory purposes:

Cool.
Most strikingly, there are clearly defined dips in the total number of tickets over the holiday season each year. There also appears to be some kind of periodic variation in the number of tickets issued over time (the downward spikes). A good first guess would be that this is related to the day of the week, since the cycle of the work week drives the volume of cars parked, the number of vehicles in the city, et cetera.
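A daily time series like the one described can be sketched by counting tickets per calendar day. Again, this is a minimal illustration on invented rows, and the column name date_of_infraction is an assumption rather than the field name in the actual data.

```r
# Toy stand-in: one row per ticket, with a hypothetical date column.
tickets <- data.frame(
  date_of_infraction = as.Date(c("2011-12-19", "2011-12-19", "2011-12-19",
                                 "2011-12-20", "2011-12-20",
                                 "2011-12-25"))
)

# Count tickets per calendar day.
dailyTable <- table(tickets$date_of_infraction)
daily <- data.frame(
  date  = as.Date(names(dailyTable)),
  count = as.vector(dailyTable)
)
print(daily)

# Plot the daily counts as a simple time series.
plot(daily$date, daily$count, type = "l",
     xlab = "Date", ylab = "Tickets issued per day")
```

Note that days with no tickets at all simply do not appear in the table, so a gap in the dates is drawn as a straight line between neighbouring points rather than a drop to zero.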

Quickly whipping up a box plot for the data, we can see that a significantly smaller proportion of the tickets is issued on Sunday. Also, for some reason there are many outliers on the low end. I suspect these fall in the aforementioned dips around the holiday season, though I did not investigate this.
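A box plot of daily counts by day of the week can be sketched like this. The counts below are invented purely to make the example run (with Sundays deliberately set lower), so the output says nothing about the real data; the names daily, count and weekday are assumptions.

```r
# Toy daily counts for four weeks; the numbers are invented for illustration.
dates  <- seq(as.Date("2011-01-02"), by = "day", length.out = 28)
wday   <- as.POSIXlt(dates)$wday   # 0 = Sunday, ..., 6 = Saturday
counts <- ifelse(wday == 0, 30, 80) + (seq_along(dates) %% 5)

daily <- data.frame(
  count   = counts,
  weekday = factor(wday, levels = 0:6,
                   labels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"))
)

# One box per day of the week.
boxplot(count ~ weekday, data = daily,
        xlab = "Day of week", ylab = "Tickets issued per day")

# Median daily count per weekday, to compare against the plot.
medians <- tapply(daily$count, daily$weekday, median)
print(medians)
```

Using the numeric wday field rather than weekdays() keeps the grouping independent of the locale the script runs in.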


Conclusions

Performing a quick analysis of many different aspects of the data was not as easy as I had hoped, given the size of the set. Still, it is interesting to see the most common types of violations and the distribution of the majority of the parking tickets with respect to time.

Interesting general points of note:
  • The most common parking infractions are wrong place / wrong time, followed by various types of failing to display a permit / buy a ticket
  • Significantly reduced number of parking violations during the Christmas holiday season
  • More tickets issued during the work week

For Part II, I plan to create some heat maps / 2D histograms of the ticket data with respect to time, and I may yet create a geospatial representation of the data, albeit in aggregated form.
