'Tis the season for finding out how well Maine fisherfolk did last year; specifically, Maine lobsterfolk.
Most of the news sites in Maine do a feature on the annual landings (here’s one from the Bangor Daily News). There was a marked decline, the largest ever, in both poundage and revenue in 2017, and many sources point to the need to improve fishery management to help ensure both the environmental and economic health of the state.
My preferred view for this annual catch comparison is a connected scatterplot, tracing a path along the years. That way you get the feel of a time series along with the actual poundage-to-value relationship, without having to resort to two charts or (heaven forbid) a dual-geom/dual-axis chart.
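If you haven't met a connected scatterplot before, here's a minimal sketch of the idea. The numbers are invented purely for illustration (they are not DMR landings), and the `toy` and `gg` names are just placeholders. The point to notice is that `geom_path()` connects points in row order, so sorting by year is what makes the line trace time.

```r
library(ggplot2)

# Invented values for illustration only; NOT real landings data
toy <- data.frame(
  year   = 2001:2005,
  pounds = c(10, 14, 13, 18, 16),  # hypothetical millions of pounds
  value  = c(30, 45, 52, 60, 50)   # hypothetical millions of dollars
)

# geom_path() joins points in row order, so sort by year first
toy <- toy[order(toy$year), ]

gg <- ggplot(toy, aes(pounds, value)) +
  geom_path() +                    # the "connected" part
  geom_label(aes(label = year))    # label each point with its year

gg
```

Swapping `geom_path()` for `geom_line()` would connect the points in order of `pounds` instead, which loses the year-to-year trail.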
The State of Maine Department of Marine Resources makes the data available, but it’s in a PDF.
Thankfully, the PDF is not obfuscated and is just a plain table, so it’s easy to parse and turn into a connected scatterplot.
The code to retrieve the PDF, parse it and produce said connected scatterplot is below.
library(stringi)
library(pdftools)
library(hrbrthemes)
library(tidyverse)

lobster_by_county_url <- "https://www.maine.gov/dmr/commercial-fishing/landings/documents/lobster.county.pdf"
lobster_by_county_fil <- basename(lobster_by_county_url)

# only download if we don't already have a local copy ("wb" keeps the PDF intact on Windows)
if (!file.exists(lobster_by_county_fil)) download.file(lobster_by_county_url, lobster_by_county_fil, mode = "wb")

# read in the PDF
lobster_by_county_pgs <- pdftools::pdf_text(lobster_by_county_fil)

map(lobster_by_county_pgs, stri_split_lines) %>% # split each page into lines
  flatten() %>%
  flatten_chr() %>%
  keep(stri_detect_fixed, "$") %>% # keep only lines with "$" in them
  stri_trim_both() %>% # clean up white space
  stri_split_regex("\\ +", simplify = TRUE) %>% # get the columns
  as.data.frame(stringsAsFactors = FALSE) %>% # character matrix -> data frame (columns V1..V4)
  mutate_at(c("V3", "V4"), lucr::from_currency) %>% # turn the formatted text into numbers
  set_names(c("year", "county", "pounds", "value")) %>% # better column names
  filter(county != "TOTAL") %>% # we'll calculate our own, thank you
  mutate(year = as.Date(sprintf("%s-01-01", year))) %>% # I like years to be years for plotting
  mutate(county = stri_trans_totitle(county)) -> lobster_by_county_df

arrange(lobster_by_county_df, year) %>%
  mutate(value = value / 1000000, pounds = pounds / 1000000) %>% # easier on the eyes
  group_by(year) %>%
  summarise(pounds = sum(pounds), value = sum(value)) %>%
  mutate(year_lab = lubridate::year(year)) %>%
  mutate(highlight = ifelse(year_lab == 2017, "2017", "Other")) %>% # so we can highlight 2017
  ggplot(aes(pounds, value)) +
  geom_path() +
  geom_label(
    aes(label = year_lab, color = highlight, size = highlight),
    family = font_ps, show.legend = FALSE
  ) +
  scale_x_comma(name = "Pounds (millions) →", limits = c(0, 150)) +
  scale_y_comma(name = "$ USD (millions) →", limits = c(0, 600)) +
  scale_color_manual(values = c("2017" = "#742111", "Other" = "#2b2b2b")) +
  scale_size_manual(values = c("2017" = 6, "Other" = 4)) +
  labs(
    title = "Historical Maine Fisheries Landings Data — Lobster (1964-2017)",
    subtitle = "All counties combined; Not adjusted for inflation",
    caption = "The 2002 & 2003 landings may possibly reflect the increased effort by DMR to collect voluntary landings from some lobster dealers;\nLobster reporting became mandatory in 2004 for all Maine dealers buying directly from harvesters.\nSource: <https://www.maine.gov/dmr/commercial-fishing/landings/historical-data.html>"
  ) +
  theme_ipsum_ps(grid = "XY")
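The line-splitting step in the pipeline above is easier to follow on a small sample. Below is a rough sketch run on made-up lines that imitate the layout the code assumes (year, county, pounds, dollar value, separated by runs of spaces). The `page_lines` contents, the county names in them, and the `to_number()` helper are hypothetical; `to_number()` is only a stand-in for the currency conversion, in case `lucr` isn't available in your setup.

```r
library(stringi)

# Made-up lines imitating the assumed table layout; NOT real DMR data
page_lines <- c(
  "  YEAR   COUNTY        POUNDS          VALUE",
  "  2016   EXAMPLE     1,234,567     $5,678,901",
  "  2016   SAMPLE         98,765       $432,100",
  ""
)

# Only data rows contain a "$", so that filter drops headers and blanks
data_lines <- page_lines[stri_detect_fixed(page_lines, "$")]

# Trim, then split on runs of spaces into a character matrix
cols <- stri_split_regex(stri_trim_both(data_lines), " +", simplify = TRUE)
colnames(cols) <- c("year", "county", "pounds", "value")

# Hypothetical helper: strip "$" and "," before converting to numeric
to_number <- function(x) as.numeric(stri_replace_all_regex(x, "[$,]", ""))

parsed <- data.frame(
  year   = as.integer(cols[, "year"]),
  county = stri_trans_totitle(cols[, "county"]),
  pounds = to_number(cols[, "pounds"]),
  value  = to_number(cols[, "value"]),
  stringsAsFactors = FALSE
)

parsed
```

Splitting on whitespace works here only because no field contains an internal space; a table with multi-word county names would need a different approach.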