Tidy Tuesday – Frog Distributions in Time and Space

John Russell

[This article was first published on John Russell, and kindly contributed to R-bloggers].

Graphical and Cartographic Distributions

I’m working my way into Tidy Tuesday, and I wanted to do something that combined both spatial and temporal data. The frogID dataset from week 35 of 2025 has both, so let’s take a look.

This is also my first post using Positron over RStudio, so we’ll see how it goes!

Loading necessary packages and the data

library(tidyverse)
library(rnaturalearth)
library(sf)
library(patchwork)

tuesdata <- tidytuesdayR::tt_load(2025, week = 35)
frogID <- tuesdata$frogID_data
frognames <- tuesdata$frog_names

view(frogID)
view(frognames)

I love how Positron automatically shows distributions of the data when you view a dataframe! Looking at the two dataframes, frogID has the locations and times of the observations, while frognames has some additional taxonomic information that would be nice to join. One weird thing is that the scientific names in frognames carry some additional information after the species name, so we’ll need to trim them down to the genus and species before joining.

Joining the dataframes and cleaning up the data

# Trim the names in frognames to the first two words (genus, species), then join
frogID <- frogID |>
  left_join(
    frognames |>
      select(scientificName, subfamily, tribe) |>
      mutate(scientificName = word(scientificName, 1, 2)) |>
      distinct(),
    by = "scientificName"
  )
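To make the trimming step concrete, here is a minimal sketch on made-up rows. The name strings, the suffix format, and the tribe labels are illustrative assumptions rather than values copied from the dataset; the point is just what `word(x, 1, 2)` keeps and how an unmatched name shows up after a `left_join()`.

```r
library(dplyr)
library(stringr)

# Hypothetical lookup table: names carry extra text after "Genus species"
toy_names <- tibble(
  scientificName = c("Litoria caerulea (White, 1790)",
                     "Crinia signifera Girard, 1853"),
  tribe = c("tribe_a", "tribe_b")  # placeholder labels, not real taxonomy
)

# Hypothetical observations: names are already just "Genus species"
toy_records <- tibble(
  scientificName = c("Litoria caerulea", "Crinia signifera", "Unknown frog")
)

# Keep only the first two space-separated words of each name
toy_clean <- toy_names |>
  mutate(scientificName = word(scientificName, 1, 2)) |>
  distinct()

toy_joined <- toy_records |>
  left_join(toy_clean, by = "scientificName")

# Observations without a match keep NA in the joined columns
unmatched <- toy_joined |>
  filter(is.na(tribe))
```

Counting the rows of `unmatched` is a quick way to see how many observations would later be dropped by a `filter(!is.na(tribe))`.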

Distributions over Space

The first distribution I want to look at is the spatial distribution of frog observations. The data is all from Australia, so let’s pull a map of Australia and plot the points on it.

# pull the map of australia
australia <- ne_states(country = "australia", returnclass = "sf")

From here, it is just a matter of plotting the points on the map. I’ll color the points by tribe, and make them a bit transparent so that we can see areas with more observations.

# map the frog observations over Australia, colored by tribe
australia |>
  ggplot() +
  geom_sf(fill = "lightgrey") +
  geom_point(
    data = frogID |> filter(!is.na(tribe)),
    aes(x = decimalLongitude, y = decimalLatitude, color = tribe),
    size = 0.5, alpha = 0.7
  ) +
  # optional density overlay, left disabled:
  # geom_density_2d(data = frogID, aes(x = decimalLongitude, y = decimalLatitude), alpha = 0.6, contour_var = "count") +
  theme_void() +
  theme(legend.position = "bottom",
        legend.text = element_text(size = 12)) +
  labs(title = "Frog Species in Australia",
       subtitle = "Locations of various frog species across Australia",
       caption = "Tidy Tuesday (2025, Week 35)",
       color = "") +
  coord_sf(xlim = c(110, 155), ylim = c(-45, -10))

Unsurprisingly, most of the observations are along the coast, where the climate is probably more hospitable to frogs but also to citizen scientists, so there may be some sampling bias in the data!
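Transparency only gives a visual impression of where observations pile up. One rough way to put numbers on it is to count observations per grid cell. This is a sketch on made-up coordinates, and the 1-degree cell size is an arbitrary assumption, not something taken from the dataset.

```r
library(dplyr)

# Hypothetical coordinates standing in for decimalLongitude / decimalLatitude
toy_points <- tibble(
  decimalLongitude = c(151.2, 151.4, 151.9, 133.5, 153.1),
  decimalLatitude  = c(-33.9, -33.7, -33.2, -23.7, -27.5)
)

# Snap each point to the south-west corner of its 1-degree cell, then count
grid_counts <- toy_points |>
  mutate(lon_cell = floor(decimalLongitude),
         lat_cell = floor(decimalLatitude)) |>
  count(lon_cell, lat_cell, name = "n_obs", sort = TRUE)
```

Cells with large `n_obs` are where the points overlap most on the map. This says nothing about why they cluster there, so it does not separate frog habitat from observer effort.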

Distribution of identifications by hour of day

I was also curious about the temporal distribution of frog identifications. The eventTime column has the time of day that the identification was made, so let’s look at that by hour of day.

day <- frogID |>
  filter(!is.na(hour(eventTime)), 
          !is.na(tribe)) |>
  ggplot(aes(x = hour(eventTime), fill = tribe)) +
  geom_histogram(binwidth = 1, position = "stack", color = "black") +
  theme_minimal() +
  labs(title = "Frog Identifications by Hour of Day",
       x = "Hour of Day",
       y = "Number of Identifications",
       fill = "",
       caption = "Tidy Tuesday (2025, Week 35)") +
  scale_x_continuous(breaks = 0:23) +
  theme(legend.position = "bottom")

day

Interestingly, there are two peaks in identifications, one around 9 or 10 AM and one around 8 PM. This doesn’t quite match up with dawn and dusk, which are probably the times when frogs are most active, but it may reflect when people are most likely to be out and about looking for frogs.
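The peaks can also be read off as a table rather than by eye. Here is a small sketch on made-up times. It assumes the times are stored as "HH:MM:SS" strings, which may not be how eventTime is typed in the real data, and it uses lubridate's `hms()` parser to get at the hour.

```r
library(dplyr)
library(lubridate)

# Hypothetical times of day, stored as character strings for this sketch
toy_times <- tibble(
  eventTime = c("09:15:00", "09:40:00", "10:05:00",
                "20:10:00", "20:30:00", "20:55:00", "03:00:00")
)

# Extract the hour, count identifications per hour, busiest hours first
hourly <- toy_times |>
  mutate(hr = hour(hms(eventTime))) |>
  count(hr, name = "n_ids") |>
  arrange(desc(n_ids))
```

The first few rows of `hourly` are the busiest hours, which is the same information the histogram shows as its tallest bars.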

Distribution of identifications by month

Finally, let’s look at the distribution of frog identifications by month. This will give us an idea of when people are most likely to identify frogs.

month <- frogID |>
  filter(!is.na(month(eventDate)), 
          !is.na(tribe)) |>
  ggplot(aes(x = month(eventDate), fill = tribe)) +
  geom_histogram(binwidth = 1, position = "stack", color = "black") +
  theme_minimal() +
  labs(title = "Frog Identifications by Month",
       x = "Month",
       y = "Number of Identifications",
       fill = "",
       caption = "Tidy Tuesday (2025, Week 35)") +
  scale_x_continuous(breaks = 1:12, labels = month.abb) +
  theme(legend.position = "bottom")

month

Unsurprisingly, the spring and summer months in Australia (October to February) have the most identifications, probably because that is when frogs are most active and when people are more likely to be outside looking for them.
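Since the seasons are flipped in the Southern Hemisphere, it can help to label the months explicitly before comparing them. This sketch uses made-up dates and assumes the usual meteorological season boundaries for Australia (summer is December to February, and so on).

```r
library(dplyr)
library(lubridate)

# Hypothetical observation dates
toy_dates <- tibble(
  eventDate = ymd(c("2023-10-14", "2023-11-02", "2023-12-25",
                    "2024-01-09", "2024-02-20", "2024-06-30"))
)

# Map each month to its Southern Hemisphere season, then count per season
seasonal <- toy_dates |>
  mutate(
    mon = month(eventDate),
    season = case_when(
      mon %in% c(12, 1, 2) ~ "summer",
      mon %in% 3:5         ~ "autumn",
      mon %in% 6:8         ~ "winter",
      TRUE                 ~ "spring"
    )
  ) |>
  count(season, name = "n_ids")
```

The same `season` column could be used as a facet or fill in the monthly histogram, if a seasonal view is more useful than a monthly one.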

Combining the temporal plots

collected <- (month + labs(caption = "")) + day +
  plot_layout(ncol = 2, guides = "collect") &
  theme(legend.position = "bottom")

collected

This pulls the two temporal plots together into one figure, which makes them a bit easier to compare.
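For anyone new to patchwork, the operators do most of the work here. The sketch below rebuilds the same pattern on a tiny made-up data frame; the column names and values are placeholders. `+` places plots side by side, `guides = "collect"` merges identical legends into one, and `&` applies the theme to every panel rather than only the last one.

```r
library(ggplot2)
library(patchwork)

# Hypothetical data: hour, month, and a grouping column
toy <- data.frame(
  hr    = c(9, 9, 20, 20, 20),
  mon   = c(1, 1, 2, 11, 11),
  tribe = c("a", "b", "a", "b", "a")
)

p_month <- ggplot(toy, aes(x = mon, fill = tribe)) +
  geom_histogram(binwidth = 1)

p_hour <- ggplot(toy, aes(x = hr, fill = tribe)) +
  geom_histogram(binwidth = 1)

# Two panels in one row, one shared legend, legend moved below both panels
combined <- p_month + p_hour +
  plot_layout(ncol = 2, guides = "collect") &
  theme(legend.position = "bottom")
```

Swapping the final `&` for `+` would move the legend position only on the last panel, which is why `&` is the safer choice when a theme tweak should apply to the whole figure.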

Citation

BibTeX citation:

@online{russell2025,
  author = {Russell, John},
  title = {Tidy {Tuesday} - {Frog} {Distributions} in {Time} and
    {Space}},
  date = {2025-09-02},
  url = {https://drjohnrussell.github.io/posts/2025-09-02-time-and-frogs/},
  langid = {en}
}

For attribution, please cite this work as:

Russell, John. 2025. “Tidy Tuesday – Frog Distributions in Time and Space.” September 2, 2025. https://drjohnrussell.github.io/posts/2025-09-02-time-and-frogs/.
