House Data: 41k finance summaries from 2200 candidates

July 13, 2010

(This article was first published on Offensive Politics, and kindly contributed to R-bloggers)

I’d like to announce House Data, a new project from Offensive Politics that launches today. House Data is a large-scale extract of the FEC Form 3 (F3) Summary of Receipts and Disbursements (pdf warning) for every US House campaign from mid-2001 onward.

The traditional source for campaign finance summaries is the Candidate Summary File, which provides a single set of summary statistics per campaign for an entire electoral cycle. But a campaign files a new F3 at least quarterly, as well as before and after every election in which it participates. Each F3 provides insight into where a campaign stands, and with access to this intra-cycle data we can better compare campaigns and perform more sophisticated analysis. The House Data file is built from these F3 reports: all 41,050 reports from 2,241 candidates for the US House since 2002. Campaigns often update previously filed reports with amendments, so the file contains only the latest version of each summary provided by a campaign.
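To make the amendment handling concrete, here is a minimal sketch of the "keep only the latest version" idea. It is not the code used to build the file; the field names `committee_id`, `coverage_end`, and `filed_on` are hypothetical and chosen only for illustration.

```python
def latest_summaries(filings):
    """Keep only the most recently filed version of each report.

    Assumes (hypothetically) that each filing is a dict with:
      committee_id - identifies the campaign
      coverage_end - end of the reporting period, as an ISO date string
      filed_on     - date the filing was submitted, as an ISO date string
    ISO date strings (YYYY-MM-DD) sort correctly as plain text.
    """
    latest = {}
    for filing in filings:
        # An original report and its amendments share the same key.
        key = (filing["committee_id"], filing["coverage_end"])
        current = latest.get(key)
        if current is None or filing["filed_on"] > current["filed_on"]:
            latest[key] = filing
    return list(latest.values())


# Made-up sample: one report amended once, plus an unrelated report.
sample = [
    {"committee_id": "C001", "coverage_end": "2008-03-31", "filed_on": "2008-04-15"},
    {"committee_id": "C001", "coverage_end": "2008-03-31", "filed_on": "2008-05-20"},
    {"committee_id": "C002", "coverage_end": "2008-03-31", "filed_on": "2008-04-14"},
]
deduplicated = latest_summaries(sample)
```

With the sample above, the amended C001 filing replaces the original, so two reports remain.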

The file is compiled automatically using FECHell into a zipped CSV format. New releases will be made within 3 days of a new batch of electronic filings, according to the FEC 2010 Filing Deadline Schedule (pdf warning).
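Since the release is a zipped CSV, it can be read without unpacking it to disk first. The sketch below uses only the Python standard library; it assumes the archive contains at least one `.csv` member encoded as UTF-8, and the file name in the usage note is a placeholder rather than the real download name.

```python
import csv
import io
import zipfile


def read_zipped_csv(source):
    """Yield each row of the first CSV found in a zip archive as a dict.

    `source` may be a path or a binary file-like object.
    Assumption: the CSV is UTF-8 encoded and has a header row.
    """
    with zipfile.ZipFile(source) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # Wrap the binary stream so csv can read it as text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)
```

For example, `rows = list(read_zipped_csv("house_data.zip"))` would load every report summary into memory, where `house_data.zip` stands in for wherever you saved the download.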

Here is a simple example of a quarterly summary of total receipts and total disbursements made by all House campaigns in 2008:

[Figure: 2008 Cycle Total Receipts, US House]
[Figure: 2008 Cycle Total Disbursements, US House]
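A quarterly summary like the one charted above could be computed roughly as follows. This is a sketch under stated assumptions, not the code behind the charts: the column names `coverage_end`, `total_receipts`, and `total_disbursements` are hypothetical (check the data dictionary for the real ones), and each report is assigned to the calendar quarter in which its coverage period ends.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def quarterly_totals(rows, year):
    """Sum receipts and disbursements per calendar quarter of `year`.

    Assumes (hypothetically) each row is a dict of strings with keys
    coverage_end (ISO date), total_receipts, and total_disbursements.
    """
    totals = defaultdict(
        lambda: {"receipts": Decimal("0"), "disbursements": Decimal("0")}
    )
    for row in rows:
        end = date.fromisoformat(row["coverage_end"])
        if end.year != year:
            continue  # skip reports that end outside the requested year
        quarter = (end.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        # Decimal avoids floating-point rounding on dollar amounts.
        totals[quarter]["receipts"] += Decimal(row["total_receipts"])
        totals[quarter]["disbursements"] += Decimal(row["total_disbursements"])
    return dict(sorted(totals.items()))
```

Calling `quarterly_totals(rows, 2008)` returns a dict keyed by quarter number, which is easy to hand to whatever plotting tool you prefer.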

The House Data Project is live today, with more examples and a data dictionary. The latest version can always be downloaded from

If you have any questions, comments, or suggestions about the House Data file, please don’t hesitate to contact me.
