The ‘Tanakh’ R package
If this post is useful to you, I kindly ask for a small donation on Buy Me a Coffee. It will be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor.
You can send me questions for the blog using this form and subscribe to receive an email when there is a new post.
I got the following question for the blog: “How can I scrape texts from Chabad.org in R?”
This question turned out to be fun and challenging! I organized the scraped text of the Tanakh (the Old Testament for Christians) into the R package “tanakh” (https://github.com/pachadotdev/tanakh/). The code using purrr and RSelenium is here.
I would love to receive more ideas about non-English datasets in languages such as Arabic, Akkadian, Enochian, Greek, and others.
The tanakh R package provides tidy, verse-level access to the full Hebrew Bible (Tanakh) with English and Hebrew text, organized by book, chapter, and verse. Data is sourced from Chabad.org and includes diacritics (niqqud) in Hebrew.
Features
- Three datasets: torah (Pentateuch), neviim (Prophets), ketuvim (Writings)
- Each dataset is a tibble with columns: chapter_number, chapter_name, line, english, hebrew, rashi_english, rashi_hebrew
- Hebrew text is normalized and includes diacritics
- Includes Rashi’s commentary in both English and Hebrew (thanks to Rab. Rapoport for the suggestion)
- Easy filtering and analysis in R
Installation
You can install the development version of tanakh from the R console:
# from GitHub:
remotes::install_github("pachadotdev/tanakh")
Datasets
Torah (Pentateuch): Bereshit (Genesis), Shemot (Exodus), Vayikra (Leviticus), Bamidbar (Numbers), Devarim (Deuteronomy)
Nevi’im (Prophets): Yehoshua (Joshua), Shoftim (Judges), Shmuel I & II, Melachim I & II, Yeshayahu, Yirmiyahu, Yechezkel, Hoshea, Yoel, Amos, Ovadiah, Yonah, Michah, Nachum, Chavakuk, Tzefaniah, Chaggai, Zechariah, Malachi
Ketuvim (Writings): Tehillim (Psalms), Mishlei (Proverbs), Iyov (Job), Shir Hashirim, Rut, Eichah, Kohelet, Esther, Daniel, Ezra, Nechemiah, Divrei Hayamim I & II
Example Usage
library(tanakh)
library(dplyr)

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
# Find all verses mentioning Moses in the Torah
torah %>%
  filter(grepl("Moses", english))
# A tibble: 599 × 7
   chapter_number chapter_name    line english hebrew rashi_english rashi_hebrew
            <int> <fct>          <int> <chr>   <chr>  <chr>         <chr>
 1              2 Shemot (Exodu…    10 "The c… וַיִּגְדַּ֣ל… For I drew h… "מְשִׁיתִֽהוּ. שְׁחַ…
 2              2 Shemot (Exodu…    11 "Now i… וַיְהִ֣י … Moses grew u… "וַיִּגְדַּל משֶׁה.…
 3              2 Shemot (Exodu…    14 "And h… וַיֹּ֠אמֶר… Who made you… "מִי שָֽׂמְךָ לְאִי…
 4              2 Shemot (Exodu…    15 "Phara… וַיִּשְׁמַ֤ע… Pharaoh hear… "וַיִּשְׁמַע פַּרְעֹה…
 5              2 Shemot (Exodu…    17 "But t… וַיָּבֹ֥אוּ… and drove th… "וַיְגָֽרְשׁוּם. מִ…
 6              2 Shemot (Exodu…    21 "Moses… וַיּ֥וֹאֶל… consented. H… "וַיּוֹאֶל. כְּתַרְ…
 7              3 Shemot (Exodu…     1 "Moses… וּמשֶׁ֗ה … after the fr… "אַחַר הַמִּדְבָּר.…
 8              3 Shemot (Exodu…     3 "So Mo… וַיֹּ֣אמֶר… Let me turn … "אָסֻֽרָה־נָּא. אָ…
 9              3 Shemot (Exodu…     4 "The L… וַיַּ֥רְא … <NA>          <NA>
10              3 Shemot (Exodu…     6 "And H… וַיֹּ֗אמֶר… <NA>          <NA>
# ℹ 589 more rows
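A natural follow-up to the Moses query is to count the matches per book. The sketch below is only an illustration: toy is a hypothetical stand-in for torah with invented placeholder rows (not verses from the package), and it assumes, based on the printed output, that chapter_name holds the book name. It uses base R so that it runs without the package installed.

```r
# Hypothetical stand-in for torah: invented rows, same column names.
toy <- data.frame(
  chapter_number = c(2L, 3L, 1L, 12L),
  chapter_name   = factor(c("Shemot (Exodus)", "Shemot (Exodus)",
                            "Vayikra (Leviticus)", "Bereshit (Genesis)")),
  line           = c(11L, 1L, 1L, 1L),
  english        = c("Moses ... (placeholder text)",
                     "Moses ... (placeholder text)",
                     "... to Moses (placeholder text)",
                     "... (placeholder text without the name)"),
  stringsAsFactors = FALSE
)

# Keep rows whose English text mentions Moses; fixed = TRUE matches literally.
moses <- toy[grepl("Moses", toy$english, fixed = TRUE), ]

# Count matching verses per book; droplevels() removes books with no match.
moses_per_book <- table(droplevels(moses$chapter_name))
print(moses_per_book)
```

With the package loaded, replacing toy with torah should give the per-book breakdown of the 599 matching verses, assuming the column layout shown above.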
# Find all verses mentioning Amos in Nevi'im
neviim %>%
  filter(grepl("Amos", english))
# A tibble: 7 × 7
  chapter_number chapter_name  line english   hebrew rashi_english rashi_hebrew
           <int> <fct>        <int> <chr>     <chr>  <chr>         <chr>
1              1 Amos             1 "The word… דִּבְרֵ֣י … who was amon… "אשר היה בנ…
2              7 Amos             8 "And the … וַיֹּ֨אמֶר… Behold I pla… "הנני שם אנ…
3              7 Amos            10 "And Amaz… וַיִּשְׁלַ֗ח… the priest o… "כהן בית אל…
4              7 Amos            11 "For so s… כִּֽי־כֹה֙… Jeroboam sha… "בחרב ימות …
5              7 Amos            12 "And Amaz… וַיֹּ֚אמֶר… “Seer”. You,… "חוזה. אתה …
6              7 Amos            14 "And Amos… וַיַּ֚עַן … I am neither… "לא נביא אנ…
7              8 Amos             2 "And He s… וַיֹּ֗אמֶר… <NA>          <NA>
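The printed results show <NA> in the Rashi columns for some verses, so any analysis of the commentary probably needs to drop or count those rows first. A minimal base R sketch, using a hypothetical data frame toy with invented placeholder text in place of neviim:

```r
# Hypothetical stand-in for neviim: invented rows, same column names.
toy <- data.frame(
  chapter_name  = c("Amos", "Amos", "Amos"),
  line          = c(1L, 2L, 3L),
  english       = c("verse text A", "verse text B", "verse text C"),
  rashi_english = c("comment on A", NA, "comment on C"),
  stringsAsFactors = FALSE
)

# Keep only the verses that carry a commentary entry.
# dplyr equivalent: toy %>% filter(!is.na(rashi_english))
with_rashi <- toy[!is.na(toy$rashi_english), ]

# Share of verses that have commentary.
share_with_rashi <- mean(!is.na(toy$rashi_english))
print(share_with_rashi)
```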
# Find all verses from Esther in Ketuvim
ketuvim %>%
  filter(chapter_name == "Esther")
# A tibble: 167 × 7
   chapter_number chapter_name  line english  hebrew rashi_english rashi_hebrew
            <int> <fct>        <int> <chr>    <chr>  <chr>         <chr>
 1              1 Esther           1 Now it c… וַיְהִ֖י … Now it came … "וַיְהִי בִּימֵי …
 2              1 Esther           2 In those… בַּיָּמִ֖ים… when King Ah… "כְּשֶׁבֶת הַמֶּלֶךְ …
 3              1 Esther           3 In the t… בִּשְׁנַ֤ת … the nobles. … "הַפַּרְתְּמִים. שִׁ…
 4              1 Esther           4 When he … בְּהַרְאֹת֗… many days. H… "יָמִים רַבִּים.…
 5              1 Esther           5 And when… וּבִמְל֣א… the garden. … "גִּנַּת. מְקוֹם …
 6              1 Esther           6 [There w… ח֣וּר |… white, fine … "חוּר כַּרְפַּס וּ…
 7              1 Esther           7 And they… וְהַשְׁקוֹ… And they gav… "וְהַשְׁקוֹת בִּכְלֵ…
 8              1 Esther           8 And the … וְהַשְּׁתִיָּ֥… according to… "כַדָּת. לְפִי שֶׁ…
 9              1 Esther           9 Also, Va… גַּם וַשְׁ… <NA>          <NA>
10              1 Esther          10 On the s… בַּיּוֹם֙ … On the seven… "בַּיּוֹם הַשְּׁבִיעִ…
# ℹ 157 more rows
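Because the hebrew column keeps its niqqud (and, judging by the printed output, cantillation marks as well), searching it for an unpointed Hebrew word will not match unless those marks are removed first. The helper below is a hypothetical sketch and not part of the package. It deletes the combining marks of the Unicode Hebrew block (U+0591 to U+05BD, U+05BF, U+05C1, U+05C2, U+05C4, U+05C5, U+05C7) and leaves letters and punctuation such as the maqaf untouched.

```r
# Hypothetical helper (not part of tanakh): remove Hebrew points and
# cantillation marks so pointed text can be matched against unpointed input.
strip_hebrew_marks <- function(x) {
  gsub("[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]", "", x, perl = TRUE)
}

# The word "Bereshit" written with niqqud, spelled with escapes so the
# script does not depend on the file encoding.
pointed <- "\u05D1\u05BC\u05B0\u05E8\u05B5\u05D0\u05E9\u05C1\u05B4\u05D9\u05EA"
plain   <- strip_hebrew_marks(pointed)

# The unpointed spelling now matches literally.
found <- grepl("\u05D1\u05E8\u05D0\u05E9\u05D9\u05EA", plain, fixed = TRUE)
print(found)
```

With the package loaded, the same helper could be applied to a whole column, for example strip_hebrew_marks(torah$hebrew), assuming the column is a character vector as the printed output indicates.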
Source
Data sourced from Chabad.org – The Bible with Rashi
Font
In order to display the Hebrew text with niqqud correctly, please ensure you have a font that supports Hebrew diacritics installed on your system. Recommended font: Noto Sans Hebrew. See the repository for more information on font installation.