I've recently praised some mainstream media outlets like the New York Times and New Scientist for leading the charge on data journalism. But you don't need to be a large organization to find news in data. With open data sources and open-source data analysis tools, individuals can make newsworthy discoveries too.
Diego Valle-Jones has been investigating the impact of the Drug War in Mexico for a couple of years now, by using R to analyze the homicide statistics reported by the local municipalities. But Diego has noticed some anomalies in the data: many murders are not reported as homicides at all, but instead as accidental deaths. For example, data including the Acteal Massacre of 1997 (where 45 Tzotzil Indigenous people were killed by paramilitaries) shows a spike in accidents, not homicides:
(Diego provides the R code that generated this plot.) Is this a data error, or falsification? Further investigation is needed to decide, but without open data, and without citizen data journalists like Diego to analyze it, we might never know.
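To give a flavor of the kind of check involved, here is a minimal sketch in Python. It is not Diego's R code and not necessarily his method: the function name `flag_spikes`, the threshold, and the monthly counts are all invented for illustration. The idea is simply to flag months where a cause-of-death count sits far above the series' usual level, using the median and the median absolute deviation (MAD) so that the spike itself doesn't distort the baseline.

```python
from statistics import median


def flag_spikes(counts, threshold=5.0):
    """Return the indices of counts that sit far above the series median.

    Distance is measured in units of the median absolute deviation (MAD),
    a robust measure of spread that a single extreme value barely moves.
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    # Arbitrary fallback for a nearly flat series, where the MAD is zero.
    scale = mad if mad > 0 else 1.0
    return [i for i, c in enumerate(counts) if (c - med) / scale > threshold]


# Invented monthly counts for one hypothetical municipality. NOT real data.
accidents = [3, 2, 4, 3, 2, 3, 4, 2, 3, 3, 2, 48]
homicides = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0, 2, 1]

accident_spikes = flag_spikes(accidents)
homicide_spikes = flag_spikes(homicides)

print("months with an accident spike:", accident_spikes)
print("months with a homicide spike:", homicide_spikes)
```

In this made-up series the last month stands out in the accident counts while the homicide counts look unremarkable. That is the shape of anomaly described above: a flag like this only says where to look, not whether the cause is a coding error or something worse.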
Diego Valle-Jones: Some problems with the Mexican mortality database