Visualizing The Catholic Lectionary – Part 1

[This article was first published on Stefan Avey - r, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

What’s a Lectionary?

A lectionary, according to Wikipedia, is a listing of scripture readings for Christian or Judaic worship on a given day. The Roman Catholic Lectionary will contain a list of readings for a specific day that are on a 3-year cycle. Here is an example:

Thirtieth Sunday in Ordinary Time
Lectionary: 149
Reading 1: Jer 31:7-9
Psalm: Ps 126:1-2, 2-3, 4-5, 6
Reading 2: Heb 5:1-6
Gospel: Mk 10:46-52

As a Catholic, I’ve often wondered how much of the Bible is included in the lectionary. Fortunately, Felix Just, S.J., Ph.D. compiled a website with statistics on many aspects of the lectionary which help to answer this question.

His analysis shows that a Catholic who attends Mass on Sundays and Feasts (but not weekdays) would hear ~ 4% of the Old Testament (excluding Psalms) and ~ 41% of the New Testament. Wow – this was a shock to me! I bet if you ask around you’ll hear that many Catholics think the lectionary includes the whole Bible. To be fair, these numbers do get a bit higher if you include the complete lectionary (with weekdays). The total coverage of the Old Testament is ~14% (again excluding Psalms) and for the New Testament it is ~72%.

Since I’m a Data Scientist, I naturally wanted to dig into these numbers and ask more questions. The summary tables nicely show which books are covered and which are not. But what verses are actually covered? Is there something unique about those verses vs. those in the Lectionary (e.g. different sorts of themes)? In the future I plan to write another post on understanding the content via Natural Language Processing. For this post, I’ll focus on creating a single visualization of the lectionary coverage for Sundays and Major Feasts to help me understand what parts are covered at a glance.

Ok Cool, But Isn’t the Lectionary a Book?


Yes, the lectionary is traditionally a book. But it’s the 21st century and luckily someone else has transcribed this into an electronic listing so I didn’t have to do it. The data necessary includes the Sunday Lectionary and a pair of reference guides that tell us how many chapters/verses are in each book of the Bible (Old Testament Reference, New Testament Reference). I could have played with web scraping but there were only a few tables so I just copied the tables in Felix’s website into Excel files (link to data files).

Code
##############
## Packages ##
##############
library(readxl)
library(scales)
library(knitr)
library(tidyverse)
library(aveytoolkit)
 
###############
## Variables ##
###############
data_dir <- "plotBible"
OTRefFile <- file.path(data_dir, "OldTestamentReference.xlsx")
NTRefFile <- file.path(data_dir, "NewTestamentReference.xlsx")
abbrevFile <- file.path(data_dir, "BibleAbbreviations.xlsx")
LectSundaysFile <- file.path(data_dir, "Lectionary-SundaysAndFeasts.xlsx")
 
###############
## Functions ##
###############
source(file.path(data_dir, "helperFunctions.R"))
Code
########################
## Read in references ##
########################
testaments <- c("OT", "NT")
abbrev <- map(testaments, ~read_excel(abbrevFile, sheet = .x))
names(abbrev) <- testaments
 
## Old Testament
OTsections <- c("Torah", "Historical", "Wisdom", "MajorProphets", "MinorProphets")
OTref <- map_df(OTsections, function(OTsection) {
    read_excel(OTRefFile, sheet = OTsection, na = ".") %>%
        mutate(Section = OTsection)
}) %>%
    rename(Book = `Book Name`) %>%
    gather(matches("[0-9]+"), key = "Chapter", value = "Verses") %>%
    filter(!is.na(Verses)) %>%
    select(Section, Book, Chapter, Verses) %>%
    mutate(Book = factor(Book,            # Order books in factor levels
                         levels = abbrev$OT$Name),
           Abbrv = abbrev$OT$Abbreviation[match(Book, abbrev$OT$Name)],
           Chapter = as.numeric(Chapter)) %>%
    arrange(Book) %>%
    mutate(Abbrv = factor(Abbrv, levels = unique(Abbrv)))
 
## Munge so that reference contains 1 row for every verse in the order they
## appear in the Old Testament with a `Pos` column to denote the position of the
## verse.
OTref2Pos <- OTref %>%
    arrange(Book, Chapter) %>%
    mutate(Verse = map(Verses, function(x) 1:x)) %>%
    unnest() %>%
    mutate(Chapter_Verse = paste(Chapter, Verse, sep = ':')) %>%
    mutate(Pos = 1:n()) %>%
    mutate(Testament = "Old")    
 
## New Testament
NTsections <- c("Gospels", "NT")
NTref <- map_df(NTsections, function(NTsection) {
    read_excel(NTRefFile, sheet = NTsection, na = ".") %>%
        mutate(Section = NTsection) %>%
        mutate_at(vars(matches("[0-9]+")),
                  funs(as.integer(str_replace(., fixed("*"), ""))))
}) %>%
    rename(Abbrv = `Book  Name`) %>%
    gather(matches("[0-9]+"), key = "Chapter", value = "Verses") %>%
    filter(!is.na(Verses)) %>%
    select(Section, Abbrv, Chapter, Verses) %>%
    mutate(Abbrv = factor(Abbrv,
                         levels = abbrev$NT$Abbreviation),
           Book = abbrev$NT$Name[match(Abbrv, abbrev$NT$Abbreviation)],
           Chapter = as.numeric(Chapter)) %>%
    arrange(Abbrv) %>%
    mutate(Book = factor(Book, levels = unique(Book)))
 
## Munge so that reference contains 1 row for every verse in the order they
## appear in the New Testament with a `Pos` column to denote the position of the
## verse.
NTref2Pos <- NTref %>%
    arrange(Book, Chapter) %>%
    mutate(Verse = map(Verses, function(x) 1:x)) %>%
    unnest() %>%
    mutate(Chapter_Verse = paste(Chapter, Verse, sep = ':')) %>%
    mutate(Pos = 1:n()) %>%
    mutate(Testament = "New")
 
ref2Pos <- bind_rows(OTref2Pos, NTref2Pos) %>%
    mutate(Pos = 1:n())

Reference Data

There is a lot of data transformation necessary but in the end, we have a data frame with 35,524 rows which contains 1 row for every verse in the Catholic Bible. This will be used as the reference to compare the lectionary to. Below I show part of the table containing the first and last 6 verses.

Code
ref2Pos %>%
    select(Testament, Section, Book, Chapter, Verse, Pos) %>%
    head() %>%
    knitr::kable()
Testament Section Book Chapter Verse Pos
Old Torah Genesis 1 1 1
Old Torah Genesis 1 2 2
Old Torah Genesis 1 3 3
Old Torah Genesis 1 4 4
Old Torah Genesis 1 5 5
Old Torah Genesis 1 6 6
Code
ref2Pos %>%
    select(Testament, Section, Book, Chapter, Verse, Pos) %>%
    tail() %>%
    knitr::kable()
Testament Section Book Chapter Verse Pos
New NT Revelation 22 16 35519
New NT Revelation 22 17 35520
New NT Revelation 22 18 35521
New NT Revelation 22 19 35522
New NT Revelation 22 20 35523
New NT Revelation 22 21 35524

Lectionary Data

Code
###################################################
## Read in the Lectionary for Sundays and Feasts ##
###################################################
readings <- c("OT", "Psalm", "NT", "Gospel")
LectSundays <- map(readings, ~read_excel(LectSundaysFile, sheet = .x))
names(LectSundays) <- readings
 
SunLect <- bind_rows(LectSundays) %>%
    tbl_df() %>%
    separate(LectNum_Year, c("LectNum", "Year"), sep = '-') %>%
    mutate(YearA = grepl("A", Year),
           YearB = grepl("B", Year),
           YearC = grepl("C", Year))

Below are some entries in the lectionary including the Reading as well as the Year in the 3-year cycle (ABC means read in years A, B, and C). For this analysis, I’ll ignore the Year and just consider the coverage over the whole 3-year cycle.

Code
SunLect %>%
    select(Reading, LectNum, Year, Day) %>%
    head() %>%
    kable()
Reading LectNum Year Day
Gen 1:1-2:2 or 1:1, 26-31a 41 ABC Easter Vigil (1st Reading)
Gen 2:7-9; 3:1-7 22 A 1st Sunday of Lent
Gen 2:18-24 140 B 27th Sunday in Ordinary Time
Gen 3:9-15 89 B 10th Sunday in Ordinary Time
Gen 9:8-15 23 B 1st Sunday of Lent
Gen 11:1-9 62 ABC Pentecost Sunday: Vigil Mass (Option 1)

Parsing Verses

The crucial part is matching the Reading to the reference. This seems straightforward but can be very tricky in some cases. I wrote a basic parser (see code) to handle the most common cases. It ignores some aspects such as the “or” option as in the first row of the table above. Usually an “or” option gives a longer form followed by a shorter form (subset of the first option) so I’ll only consider the longer forms.

Each set of continuous verses gets stored in a new data frame by the parser. For example, Gen 1:1-2:2 or 1:1, 26-31a will get parsed to:

Code
ParseFull(SunLect$Reading[1])
##   start end Abbrv
## 1   1:1 2:2   Gen

and Gen 2:7-9; 3:1-7 will get parsed to:

Code
ParseFull(SunLect$Reading[2])
##   start end Abbrv
## 1   2:7 2:9   Gen
## 2   3:1 3:7   Gen

Then, once each reading from the lectionary is parsed, it gets matched back to the reference.

Code
## Apply parser on the full Sunday Lectionary
res <- SunLect %>%
    mutate(Pos = map(Reading, function(x) {
        ## cat(x, sep = '\n') # debugging
        ParseFull(x) %>%
        left_join(ref2Pos, by = c(start = "Chapter_Verse", Abbrv = "Abbrv")) %>%
        left_join(ref2Pos, by = c(end = "Chapter_Verse", Abbrv = "Abbrv")) %>%
        rowwise() %>%
            do(data.frame(Pos = if (any(is.na(.))) {
                                    NA
                                } else {
                                    .$Pos.x:.$Pos.y
                                }
                          )) %>%
            pull(Pos)}))
 
## Combine the results with the reference
comb_dat <- res %>%
    unnest(Pos) %>%
    left_join(ref2Pos, by = "Pos")

So each reading is parsed into multiple segments and then mapped onto the position variable (Pos) from the Reference. The result for one reading looks like this:

Code
comb_dat %>%
    select(Reading, Section, Book, Chapter, Verse, Pos) %>%
    filter(Reading == "Isa 63:16b-17, 19b; 64:2-7") %>%
    kable()
Reading Section Book Chapter Verse Pos
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 63 16 22923
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 63 17 22924
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 63 19 22926
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 2 22928
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 3 22929
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 4 22930
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 5 22931
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 6 22932
Isa 63:16b-17, 19b; 64:2-7 MajorProphets Isaiah 64 7 22933

Visualize the Lectionary Coverage

The inspiration behind this visualization is picturing the whole Bible as a bookshelf. There is one row for each major section (e.g., Torah, Minor Prophets, Gospels) and each section is made up of multiple books. In the visualization, lines are used to indicate coverage of a particular verse in the lectionary.

Code
## Get all section lengths
sections <- ref2Pos %>%
    select(Section, Abbrv) %>%
    distinct() %>%
    group_by(Section) %>%
    mutate(Abbrv = list(Abbrv)) %>%
    distinct() %>%
    deframe()
sectPos <- map_int(names(sections), function(sect) {
    abbrvs <- sections[[sect]]
    lastAbbrv <- abbrvs[length(abbrvs)]
    maxPos <- ref2Pos %>%
        filter(Abbrv == lastAbbrv) %>%
        pull(Pos) %>%
        max()
    return(maxPos)
})
names(sectPos) <- names(sections)
sectPos2 <- c(0, sectPos[-length(sectPos)])
names(sectPos2) <- names(sectPos)
sectLength <- sectPos - sectPos2
 
## Tweak Section Labels for Plot
sectLabels <- names(sections) %>%
    paste0(., " (V=", comma(sectLength[.]), ")") %>%
    str_replace("Prophets", " Prophets") %>%
    str_replace("NT", "New Testament") %>%
    setNames(names(sections))
 
## Generate plotting data
dat <- comb_dat %>%
    filter(!is.na(Pos)) %>%
    select(Section, Abbrv, Chapter, Verse, Pos) %>%
    mutate(Pos = (Pos - sectPos2[Section]) / sectLength[Section]) %>%
    mutate(SectionLabel = factor(sectLabels[Section],
                                 levels = sectLabels)) %>%
    mutate(Label = paste0(Abbrv, " ", Chapter, ":", Verse)) %>%
    distinct()      # ignore how many times something appeared (read 2 or 3 times)
book_dat <- ref2Pos %>%
    mutate(Abbrv = factor(Abbrv, levels = unique(Abbrv))) %>%
    mutate(Pos = (Pos - sectPos2[Section]) / sectLength[Section]) %>%
    mutate(SectionLabel = factor(sectLabels[Section],
                                 levels = sectLabels)) %>%
    group_by(SectionLabel, Abbrv) %>%
    summarize(Pos = min(Pos)) %>%
    ungroup() %>%
    group_by(SectionLabel) %>%
    mutate(y = rep(c(0, 0.33, 0.66, 1), length.out = n())) %>%
    ungroup()
 
## Create the plot
gg <- ggplot(dat, x = 0, y = 1) +
    geom_vline(aes(xintercept = Pos, color = Abbrv), show.legend = FALSE) +
    geom_label(data = book_dat, aes(x = Pos, y = y, label = Abbrv),
               hjust = 0, size = 3.5) +    
    scale_x_continuous(labels = percent) +    
    scale_y_continuous(labels = NULL, breaks = NULL) +
    xlab("") +
    ylab("") +
    ggtitle(paste("Bible Coverage of Catholic Lectionary for",
                  "Sundays and Major Feasts", sep = '\n')) +
    labs(caption = "Three Year Cycle") +
    facet_grid(SectionLabel ~ ., scales = "fixed") +
    getBaseTheme() +
    theme(strip.text = element_text(size = 10, face = "bold"))
plot(gg)

plot of chunk unnamed-chunk-11

Code
GetCoverage <- function(abbrv) {
    tot <- ref2Pos %>%
        filter(Abbrv == abbrv) %>%
        select(Chapter_Verse) %>%
        distinct() %>%
        nrow()
    cov <- dat %>%
        filter(Abbrv == abbrv) %>%
        select(Pos) %>%
        distinct() %>%
        nrow()
    return(cov/tot)
}

The amount of white space clearly shows what parts of the Bible are not contained in the lectionary. For example, in the New Testament section (bottom row of the “bookshelf”), it is apparent that little of the book of Acts (coverage = 17%) or Revelation (coverage = 10%) are in the Sunday lectionary.

While most of the Old Testament is sparsely covered, the Psalms (abbreviated Ps) are one of the most covered books at 21% coverage. One of the drawbacks of this visualization is that each row contains a different number of verses so that the visual sense of “fullness” can be deceiving.

Acknowledgments

Huge kudos to Felix Just, S.J., Ph.D. for compiling these resources:

To leave a comment for the author, please follow the link and comment on their blog: Stefan Avey - r.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)