The ‘Tanakh’ R package

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers.]

If this post is useful to you, I kindly ask for a minimal donation on Buy Me a Coffee. It will be used to continue my open source efforts. The full explanation is here: A Personal Message from an Open Source Contributor.

You can send me questions for the blog using this form and subscribe to receive an email when there is a new post.

I got the following question for the blog: “How can I scrape texts from Chabad.org in R?”

This question turned out to be fun and challenging! I organized the scraped text of the Tanakh (known to Christians as the Old Testament) into the R package “tanakh” (https://github.com/pachadotdev/tanakh/). The scraping code, which uses purrr and RSelenium, is here.

I would love to receive more ideas about non-English datasets in languages such as Arabic, Akkadian, Enochian, Greek, and others.


The tanakh R package provides tidy, verse-level access to the full Hebrew Bible (Tanakh) with English and Hebrew text, organized by book, chapter, and verse. Data is sourced from Chabad.org and includes diacritics (niqqud) in Hebrew.

Features

  • Three datasets: torah (Pentateuch), neviim (Prophets), ketuvim (Writings)
  • Each dataset is a tibble with columns: chapter_number, chapter_name, line, english, hebrew, rashi_english, rashi_hebrew
  • Hebrew text is normalized and includes diacritics
  • Includes Rashi’s commentary in both English and Hebrew (thanks to Rab. Rapoport for the suggestion)
  • Easy filtering and analysis in R
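As a rough illustration of this layout, the sketch below builds a tiny stand-in data frame with the same seven columns and counts the rows that lack Rashi commentary. The object `toy` and all of its values are hypothetical placeholders (they are not taken from the package), and the sketch assumes that missing commentary is stored as NA, as the printed output in this post suggests.

```r
# Hypothetical stand-in with the same columns as torah/neviim/ketuvim.
# All values are placeholders, not text from the package.
toy <- data.frame(
  chapter_number = c(1L, 1L, 2L),
  chapter_name   = factor(c("Book A", "Book A", "Book A")),
  line           = c(1L, 2L, 1L),
  english        = c("first verse", "second verse", "third verse"),
  hebrew         = c("placeholder", "placeholder", "placeholder"),
  rashi_english  = c("a comment", NA, "another comment"),
  rashi_hebrew   = c("placeholder", NA, "placeholder"),
  stringsAsFactors = FALSE
)

# Verses with and without commentary (assumes missing commentary is NA)
has_rashi <- !is.na(toy$rashi_english)
n_with    <- sum(has_rashi)
n_without <- sum(!has_rashi)

# Keep only the verses that carry commentary
commented <- toy[has_rashi, c("chapter_number", "line", "english", "rashi_english")]
print(commented)
```

With the package installed, the same expressions should apply to torah, neviim, or ketuvim in place of `toy`.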

Installation

You can install the development version of tanakh from the R console:

# install.packages("remotes")  # only needed if remotes is not installed yet
# from GitHub:
remotes::install_github("pachadotdev/tanakh")

Datasets

Torah (Pentateuch): Bereshit (Genesis), Shemot (Exodus), Vayikra (Leviticus), Bamidbar (Numbers), Devarim (Deuteronomy)

Nevi’im (Prophets): Yehoshua (Joshua), Shoftim (Judges), Shmuel I & II, Melachim I & II, Yeshayahu, Yirmiyahu, Yechezkel, Hoshea, Yoel, Amos, Ovadiah, Yonah, Michah, Nachum, Chavakuk, Tzefaniah, Chaggai, Zechariah, Malachi

Ketuvim (Writings): Tehillim (Psalms), Mishlei (Proverbs), Iyov (Job), Shir Hashirim, Rut, Eichah, Kohelet, Esther, Daniel, Ezra, Nechemiah, Divrei Hayamim I & II
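If a single table covering all three sections is more convenient, one option is to stack the datasets and label each row with its section. The following is a minimal sketch using hypothetical stand-ins (`torah_toy`, `neviim_toy`, `ketuvim_toy`, built by the made-up helper `make_toy()`) so that it runs without the package. The `section` column is added here for illustration and is not something the package provides.

```r
# Hypothetical stand-ins for the three datasets (placeholder values only)
make_toy <- function(book, n) {
  data.frame(
    chapter_number = rep(1L, n),
    chapter_name   = factor(rep(book, n)),
    line           = seq_len(n),
    english        = paste("verse", seq_len(n)),
    stringsAsFactors = FALSE
  )
}
torah_toy   <- make_toy("Bereshit (Genesis)", 3)
neviim_toy  <- make_toy("Amos", 2)
ketuvim_toy <- make_toy("Esther", 4)

# Label each dataset with its section, then stack the rows
torah_toy$section   <- "Torah"
neviim_toy$section  <- "Nevi'im"
ketuvim_toy$section <- "Ketuvim"
tanakh_toy <- rbind(torah_toy, neviim_toy, ketuvim_toy)

# Verse counts per section
verses_per_section <- table(tanakh_toy$section)
print(verses_per_section)
```

With the package installed, the same steps should work on torah, neviim, and ketuvim, assuming they share the column set listed under Features. dplyr::bind_rows() with its .id argument is an alternative to rbind().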

Example Usage

library(tanakh)
library(dplyr)
# Find all verses mentioning Moses in the Torah
torah %>%
  filter(grepl("Moses", english))
# A tibble: 599 × 7
   chapter_number chapter_name    line english hebrew rashi_english rashi_hebrew
            <int> <fct>          <int> <chr>   <chr>  <chr>         <chr>       
 1              2 Shemot (Exodu…    10 "The c… וַיִּגְדַּ֣ל… For I drew h… "מְשִׁיתִֽהוּ. שְׁחַ…
 2              2 Shemot (Exodu…    11 "Now i… וַיְהִ֣י … Moses grew u… "וַיִּגְדַּל משֶׁה.…
 3              2 Shemot (Exodu…    14 "And h… וַיֹּ֠אמֶר… Who made you… "מִי שָֽׂמְךָ לְאִי…
 4              2 Shemot (Exodu…    15 "Phara… וַיִּשְׁמַ֤ע… Pharaoh hear… "וַיִּשְׁמַע פַּרְעֹה…
 5              2 Shemot (Exodu…    17 "But t… וַיָּבֹ֥אוּ… and drove th… "וַיְגָֽרְשׁוּם. מִ…
 6              2 Shemot (Exodu…    21 "Moses… וַיּ֥וֹאֶל… consented. H… "וַיּוֹאֶל. כְּתַרְ…
 7              3 Shemot (Exodu…     1 "Moses… וּמשֶׁ֗ה … after the fr… "אַחַר הַמִּדְבָּר.…
 8              3 Shemot (Exodu…     3 "So Mo… וַיֹּ֣אמֶר… Let me turn … "אָסֻֽרָה־נָּא. אָ…
 9              3 Shemot (Exodu…     4 "The L… וַיַּ֥רְא … <NA>           <NA>       
10              3 Shemot (Exodu…     6 "And H… וַיֹּ֗אמֶר… <NA>           <NA>       
# ℹ 589 more rows
# Find all verses mentioning Amos in Nevi'im
neviim %>%
  filter(grepl("Amos", english))
# A tibble: 7 × 7
  chapter_number chapter_name  line english    hebrew rashi_english rashi_hebrew
           <int> <fct>        <int> <chr>      <chr>  <chr>         <chr>       
1              1 Amos             1 "The word… דִּבְרֵ֣י … who was amon… "אשר היה בנ…
2              7 Amos             8 "And the … וַיֹּ֨אמֶר… Behold I pla… "הנני שם אנ…
3              7 Amos            10 "And Amaz… וַיִּשְׁלַ֗ח… the priest o… "כהן בית אל…
4              7 Amos            11 "For so s… כִּֽי־כֹה֙… Jeroboam sha… "בחרב ימות …
5              7 Amos            12 "And Amaz… וַיֹּ֚אמֶר… “Seer”. You,… "חוזה. אתה …
6              7 Amos            14 "And Amos… וַיַּ֚עַן … I am neither… "לא נביא אנ…
7              8 Amos             2 "And He s… וַיֹּ֗אמֶר… <NA>           <NA>       
# Find all verses from Esther in Ketuvim
ketuvim %>%
  filter(chapter_name == "Esther")
# A tibble: 167 × 7
   chapter_number chapter_name  line english   hebrew rashi_english rashi_hebrew
            <int> <fct>        <int> <chr>     <chr>  <chr>         <chr>       
 1              1 Esther           1 Now it c… וַיְהִ֖י … Now it came … "וַיְהִי בִּימֵי …
 2              1 Esther           2 In those… בַּיָּמִ֖ים… when King Ah… "כְּשֶׁבֶת הַמֶּלֶךְ …
 3              1 Esther           3 In the t… בִּשְׁנַ֤ת … the nobles. … "הַפַּרְתְּמִים. שִׁ…
 4              1 Esther           4 When he … בְּהַרְאֹת֗… many days. H… "יָמִים רַבִּים.…
 5              1 Esther           5 And when… וּבִמְל֣א… the garden. … "גִּנַּת. מְקוֹם …
 6              1 Esther           6 [There w… ח֣וּר |… white, fine … "חוּר כַּרְפַּס וּ…
 7              1 Esther           7 And they… וְהַשְׁקוֹ… And they gav… "וְהַשְׁקוֹת בִּכְלֵ…
 8              1 Esther           8 And the … וְהַשְּׁתִיָּ֥… according to… "כַדָּת. לְפִי שֶׁ…
 9              1 Esther           9 Also, Va… גַּם וַשְׁ… <NA>           <NA>       
10              1 Esther          10 On the s… בַּיּוֹם֙ … On the seven… "בַּיּוֹם הַשְּׁבִיעִ…
# ℹ 157 more rows
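
Because the hebrew column carries niqqud (and, judging by the printed output, cantillation marks), searching it for an unpointed word such as משה (Moses) will not match the pointed text directly. One possible workaround, sketched below, is to strip Unicode combining marks before matching. The helper `strip_marks()` is hypothetical and not part of the package. The sample strings are written with Unicode escapes so the snippet does not depend on the session's locale, and the sketch assumes that R's PCRE build supports Unicode properties (pcre_config() reports this).

```r
# Hypothetical helper: remove combining marks (Unicode category Mn),
# which covers Hebrew vowel points and cantillation marks.
strip_marks <- function(x) {
  gsub("\\p{Mn}", "", x, perl = TRUE)
}

# "Moshe" with niqqud: mem, shin, segol, shin dot, he
moshe_pointed <- "\u05de\u05e9\u05b6\u05c1\u05d4"
# The same word without any marks: mem, shin, he
moshe_plain   <- "\u05de\u05e9\u05d4"

# A direct fixed-string search for the unpointed word fails
direct_hit <- grepl(moshe_plain, moshe_pointed, fixed = TRUE)

# The search succeeds once the marks are stripped
stripped_hit <- grepl(moshe_plain, strip_marks(moshe_pointed), fixed = TRUE)

print(c(direct = direct_hit, stripped = stripped_hit))
```

Applied to the package data, the same idea would be to call `strip_marks()` on the hebrew column inside filter(); this has not been checked against the actual datasets.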

Source

Data sourced from Chabad.org – The Bible with Rashi

Font

To display the Hebrew text with niqqud correctly, make sure a font that supports Hebrew diacritics is installed on your system. Recommended font: Noto Sans Hebrew. See the repository for more information on font installation.
