As mentioned in a previous post, the US Department of Energy has been providing data about the Deepwater Horizon as it becomes available. Recently, the amount of oil and gas recovered was made available. The spreadsheet includes a graph of the oil and gas recovered thus far, and there are additional documents on the site that describe the recovery effort.
Although I am glad to see the data available, I have some reservations about the visualizations. One document reports Cumulative Barrels of Oil Recovered. It is titled accurately, but reporting a cumulative total for each day creates an impression of progress even if the amount recovered on a given day is not significant. The graph included in the spreadsheet is better, but a bit confusing, as the amounts of oil (in barrels) and gas (in million cubic feet) are plotted together, with their scales on the right and left hand sides of the graph respectively. It allows one to see that oil and gas trend together, but makes the actual amounts recovered hard to read.
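To make the point about cumulative totals concrete, here is a minimal sketch in R of recovering daily amounts from a cumulative series. The numbers are made up for illustration and are not taken from the DOE spreadsheet; the variable names are my own.

```r
# Hypothetical cumulative totals in barrels, one per day.
# Illustrative values only, NOT the DOE figures.
cumulative <- c(1000, 2500, 2600, 2650)

# The daily amount is the difference between consecutive cumulative totals.
# Prepending 0 makes the first day's amount equal the first cumulative value.
daily <- diff(c(0, cumulative))

print(daily)
```

The cumulative series rises every day, while the daily series shows that the last two days contributed very little.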
I downloaded the spreadsheet and opened it in OpenOffice Calc (an open source alternative to Excel). Since I had to save a portion of the xls as a csv anyway, I eliminated the cumulative data and restructured the rest into a csv (available on GitHub) that R can read directly.
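I did the restructuring by hand in the spreadsheet, but the same reshaping can be sketched in base R. The layout below (one row per date and type, with a single amount column) is what ggplot2 works with most easily. The column names and values here are assumptions for illustration, not the contents of the actual csv.

```r
# Hypothetical "wide" layout: one row per day, one column per measure.
# Column names and values are illustrative assumptions.
wide <- data.frame(
  date = as.Date(c("2010-06-01", "2010-06-02")),
  oil  = c(100, 120),   # barrels
  gas  = c(1.5, 1.8)    # million cubic feet
)

# Stack the two measures into "long" form:
# one row per (date, type) pair with a single amount column.
long <- rbind(
  data.frame(date = wide$date, type = "oil", amount = wide$oil),
  data.frame(date = wide$date, type = "gas", amount = wide$gas)
)

print(long)
```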
I used the ggplot2 library to create the graph you see above. The code is available on GitHub as well. The csv is read into a data frame named df. The date is displayed on the x axis and the amount on the y axis. Since oil and gas recovery are measured in different units, facet_grid is used to give each its own panel and scale. (If you simply plot both oil and gas on the same graph, the gas recovery appears as a nearly straight line near zero.)
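A plot along these lines can be sketched as follows. This is not the exact code from GitHub: the column names (date, type, amount) and the inline data are assumptions standing in for the csv, and the values are made up.

```r
library(ggplot2)

# Hypothetical long-format data standing in for the csv.
# Values are illustrative only, NOT the DOE figures.
df <- data.frame(
  date   = rep(as.Date("2010-06-01") + 0:3, times = 2),
  type   = rep(c("oil", "gas"), each = 4),
  amount = c(100, 120, 150, 140, 1.5, 1.8, 2.2, 2.1)
)

# Date on the x axis, amount on the y axis, one panel per type.
# scales = "free_y" gives each panel its own y scale, so the gas
# series is not flattened by the much larger oil values.
p <- ggplot(df, aes(x = date, y = amount)) +
  geom_line() +
  facet_grid(type ~ ., scales = "free_y")

print(p)
```

Leaving out scales = "free_y" forces both panels onto a shared y axis, which reproduces the flattened gas line described above.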
The graph above still shows that oil and gas recovery are comparable and follow similar trends, but it also allows closer scrutiny of the individual results, because each panel has its own scale on the y axis on the left hand side. Again, ggplot2 is a really amazing package, and I highly recommend the ggplot2 book (by the author of the library) to get a handle on the details of its implementation.