in 2012, more than one quarter of the united states population lived in the jurisdiction of a police department that submitted details about every crime to a central repository maintained by the fbi. a product of the uniform crime reports (ucr) program, the national incident-based reporting system (nibrs) compiles statistics from police agencies in thirty-five states plus dc. if you are just looking for general crime counts, those justice statisticians might have already tabulated your number here. but for the more discriminating criminal behavior aficionado, the university of michigan’s inter-university consortium for political and social research (icpsr) maintains every microdata extract of criminal events, offense types, victims, and arrestees as far back as the first bush administration in its national archive of criminal justice data (nacjd).
this is event-level american criminal activity microdata, compiled and published by the fbi and then curated by the university of michigan. it’s for you. download it. study it. hold it upside-down and sideways and run analyses on it until you pass out. if you spot anything newsworthy, tell the world. it is your data to do whatever you like with. that is remarkable, isn’t that remarkable? i’ve consistently been astounded by the dedication of united states federal agencies to publishing their microdata for scrappy outside researchers like you and me. but there’s one hitch: the public use files do not match what the fbi publishes. my contact at the fbi told me..
..and my contact at the national archive of criminal justice data said..
..so when you run a query, you will not reproduce fbi counts precisely. results are close, but not exact. you’ll see in the reproduction syntax that this is imperfect replication. oh, and once you’ve run the download automation syntax, the monetdb analysis speeds will outrun even the fastest of imaginary crime-fighting superheroes. this new github repository contains two scripts:
download all microdata.R
- create the batch (.bat) file needed to initiate the monet database in the future
- log into the university of michigan’s website with the free login info you’ll have to obtain beforehand
- download every data file from this study to the local disk
- loop through each dat file in the current working directory, import them into monet with read.sascii.monetdb
- create a well-documented block of code to re-initiate the monetdb server in the future
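the import-then-query pattern those steps follow is just the standard dbi workflow. here is a minimal sketch of it; rsqlite stands in for monetdb so the sketch runs anywhere, and the table and column names are invented for illustration:

```r
library( DBI )
library( RSQLite )	# stand-in driver; the real script talks to a monetdb server instead

# connect to a database file on the local disk
db <- dbConnect( SQLite() , "nibrs_sketch.db" )

# in the real script, read.sascii.monetdb imports each fixed-width .dat file
# directly into the server.  a tiny made-up data.frame stands in for one here
incident <-
	data.frame(
		state = c( "oh" , "oh" , "tx" ) ,
		offense_code = c( "13a" , "23c" , "13a" )
	)
dbWriteTable( db , "incident" , incident )

# confirm the table landed in the database
dbListTables( db )

dbDisconnect( db )
```

once the real tables are inside monetdb, every later session just re-connects to the server and queries them in place, no re-import needed.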
reproduce fbi tables.R
- initiate the same monetdb server instance, using the same well-documented block of code as above
- re-create three fbi-produced data tables from the actual microdata.. close, but not exact.
- be amazed. that was dozens of queries, each on millions of records. and it worked on your laptop. wow.
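each of those dozens of queries is an aggregation pushed down to the database, something along the lines of this sketch. rsqlite again stands in for monetdb, and the offense records are invented:

```r
library( DBI )
library( RSQLite )	# stand-in driver so the sketch runs anywhere; swap in monetdb for real work

db <- dbConnect( SQLite() , ":memory:" )

# invented microdata: one record per offense within an incident
offense <-
	data.frame(
		incident_id = c( 1 , 1 , 2 , 3 , 4 ) ,
		offense_code = c( "13a" , "23c" , "13a" , "13a" , "23c" )
	)
dbWriteTable( db , "offense" , offense )

# a table-building query: offense counts, the shape of one fbi summary table
dbGetQuery(
	db ,
	"select offense_code , count(*) as offenses
	from offense
	group by offense_code
	order by offense_code"
)

dbDisconnect( db )
```

the database engine does the counting, so the r session only ever sees the small result table, not the millions of underlying records.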
for more detail about the national incident-based reporting system (nibrs) microdata, visit:
- the fbi’s faq page and the nacjd’s resource guide.
- the related publications page to see what’s already been done.
the preliminary 2013 crime statistics show a major expansion in the united states population covered by departments participating in nibrs (in table one, compared to 2012 and 2011), so before you trend anything, make sure to examine which police agencies in your locality of interest contributed their data to the program. in other words, don’t confuse a new municipality reporting crime statistics to the fbi with a spike or dip in the crime rate. right? right.
this is not survey data, so use normal statistical tests (not survey-adjusted ones) like these commands in your monetdb sql code to compute measures of variation like a confidence interval. and remember, for more sql query construction help, try the w3schools tutorial and also just searching for specific commands in my archive.
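as one concrete example of a normal (non-survey-adjusted) interval, here is the textbook wald confidence interval for a proportion in base r. the counts are hypothetical stand-ins for numbers you would pull back from a monetdb query:

```r
# hypothetical counts: 1,200 incidents of some offense out of 48,000 total
x <- 1200
n <- 48000

p <- x / n							# sample proportion
se <- sqrt( p * ( 1 - p ) / n )		# standard error under the normal approximation

# 95% confidence interval
ci <- p + c( -1 , 1 ) * qnorm( 0.975 ) * se
round( ci , 4 )
# 0.0236 0.0264

# prop.test( x , n ) gives a similar interval with a continuity correction
```

no design weights, no replicate variance, no svydesign object.. because every record is a reported incident, not a sampled one.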
confidential to sas, spss, stata, and sudaan users: these languages will vanish, like d. b. cooper. time to transition to r. 😀