Comparing 2017 Maine Lobster Landings To Historical Landings

[This article was first published on R –, and kindly contributed to R-bloggers].

'Tis the season for finding out how well Maine fisherfolk did last year; specifically, Maine lobsterfolk.

Most of the news sites in Maine do a feature on the annual landings (here’s one from Bangor Daily News). There was a marked decline — the largest ever — in both poundage and revenue in 2017 and many sources point to the need to improve fishery management to help ensure both the environmental and economic health of the state.

My preferred view for this annual catch comparison is a connected scatterplot, tracing a path along the years. That way you get the feel of a time-series with the actual poundage-to-value without having to resort to two charts or (heaven forbid) a dual-geom/dual-axis chart.

The State of Maine Department of Marine Resources makes the data available, but only as a PDF.

Thankfully, the PDF is not obfuscated and is just a plain table, so it's easy to parse and turn into a chart.

The code to retrieve the PDF, parse it and produce said connected scatterplot is below.


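library(stringi)    # stri_* string helpers
library(pdftools)   # pdf_text()
library(hrbrthemes) # scale_*_comma(), theme_ipsum_ps(), font_ps
library(tidyverse)  # purrr, dplyr, ggplot2

# location of the DMR lobster-by-county PDF
lobster_by_county_url <- ""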
lobster_by_county_fil <- basename(lobster_by_county_url)

if (!file.exists(lobster_by_county_fil)) download.file(lobster_by_county_url, lobster_by_county_fil)

# read in the PDF
lobster_by_county_pgs <- pdftools::pdf_text(lobster_by_county_fil)

map(lobster_by_county_pgs, stri_split_lines) %>% # split each page into lines
  flatten() %>%
  flatten_chr() %>%
  keep(stri_detect_fixed, "$") %>% # keep only lines with "$" in them
  stri_trim_both() %>% # clean up white space
  stri_split_regex(" +", simplify = TRUE) %>% # split on runs of spaces to get the columns
  as_data_frame() %>%
  mutate_at(c("V3", "V4"), lucr::from_currency) %>% # turn the formatted text into numbers
  set_names(c("year", "county", "pounds", "value")) %>% # better column names
  filter(county != "TOTAL") %>% # we'll calculate our own, thank you
  mutate(year = as.Date(sprintf("%s-01-01", year))) %>% # I like years to be years for plotting
  mutate(county = stri_trans_totitle(county)) -> lobster_by_county_df

arrange(lobster_by_county_df, year) %>%
  mutate(value = value / 1000000, pounds = pounds / 1000000) %>% # easier on the eyes
  group_by(year) %>%
  summarise(pounds = sum(pounds), value = sum(value)) %>%
  mutate(year_lab = lubridate::year(year)) %>%
  mutate(highlight = ifelse(year_lab == 2017, "2017", "Other")) %>% # so we can highlight 2017
  ggplot(aes(pounds, value)) +
  geom_path() +
  geom_text(
    aes(label = year_lab, color = highlight, size = highlight),
    family = font_ps, show.legend = FALSE
  ) +
  scale_x_comma(name = "Pounds (millions) →", limits = c(0, 150)) +
  scale_y_comma(name = "$ USD (millions) →", limits = c(0, 600)) +
  scale_color_manual(values = c("2017" = "#742111", "Other" = "#2b2b2b")) +
  scale_size_manual(values = c("2017" = 6, "Other" = 4)) +
  labs(
    title = "Historical Maine Fisheries Landings Data — Lobster (1964-2017)",
    subtitle = "All counties combined; Not adjusted for inflation",
    caption = "The 2002 & 2003 landings may possibly reflect the increased effort by DMR to collect voluntary landings from some lobster dealers;\nLobster reporting became mandatory in 2004 for all Maine dealers buying directly from harvesters.\nSource: <>"
  ) +
  theme_ipsum_ps(grid = "XY")
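
To see what the parsing steps do to an individual line, here is a minimal sketch that runs the same logic on a few made-up lines. It uses base R in place of stringi and lucr so it runs without extra packages. The sample lines, their numbers, and the `to_number()` helper are assumptions for illustration only; they are not taken from the DMR PDF, and the real layout may differ.

```r
# Hypothetical sample lines in the shape the pipeline expects:
# year, county, pounds, value (made-up numbers, not DMR data)
sample_lines <- c(
  "  2017   CUMBERLAND      1,234,567     $5,678,901  ",
  "  2017   TOTAL           1,234,567     $5,678,901  ",
  "  Year   County          Pounds        Value"
)

# Keep only lines containing a literal "$" (the header line drops out)
data_lines <- sample_lines[grepl("$", sample_lines, fixed = TRUE)]

# Trim, then split on runs of spaces to recover the four columns
fields <- strsplit(trimws(data_lines), " +")
parsed <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(parsed) <- c("year", "county", "pounds", "value")

# Hypothetical helper: strip "$" and "," so formatted text becomes numeric
to_number <- function(x) as.numeric(gsub("[$,]", "", x))
parsed$pounds <- to_number(parsed$pounds)
parsed$value <- to_number(parsed$value)

# Drop the TOTAL row, as the full pipeline does
parsed <- parsed[parsed$county != "TOTAL", ]
print(parsed)
```

Filtering on "$" is what makes the approach compact: only data rows carry a currency value, so headers and footnotes fall away without any page-specific handling.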
