The ‘Tanakh’ R package

[This article was first published on https://pacha.dev/blog, and kindly contributed to R-bloggers.]

    Mauricio “Pachá” Vargas Sepúlveda

    Blog with notes about R, Shiny, SQL, Python, Linux and C++. This blog is listed on R-Bloggers.

    An R package for accessing the Tanakh (Hebrew Bible) with English and Hebrew text.
    Author

    Mauricio “Pachá” Vargas S.

    Published

    September 12, 2025

    If this post is useful to you, I kindly ask for a minimal donation on Buy Me a Coffee. It will be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor.

    You can send me questions for the blog using this form and subscribe to receive an email when there is a new post.

    I got the following question for the blog: “How can I scrape texts from Chabad.org in R?”

    This question turned out to be fun and challenging! I organized the scraped text of the Tanakh (which Christians know as the Old Testament) into the R package “tanakh” (https://github.com/pachadotdev/tanakh/). The scraping code, which uses purrr and RSelenium, is here.

    I would love to receive more ideas about non-English datasets in languages such as Arabic, Akkadian, Enochian, Greek, and others.

    The ‘Tanakh’ R package

    The tanakh R package provides tidy, verse-level access to the full Hebrew Bible (Tanakh) with English and Hebrew text, organized by book, chapter, and verse. Data is sourced from Chabad.org and includes diacritics (niqqud) in Hebrew.

    Features

    • Three datasets: torah (Pentateuch), neviim (Prophets), ketuvim (Writings)
    • Each dataset is a tibble with columns: chapter_number, chapter_name (the book name, e.g. Esther), line (the verse number), english, hebrew, rashi_english, rashi_hebrew
    • Hebrew text is normalized and includes diacritics
    • Includes Rashi’s commentary in both English and Hebrew (thanks to Rab. Rapoport for the suggestion)
    • Easy filtering and analysis in R
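
As a rough illustration of this layout, the sketch below builds a small toy data frame with the same seven columns and applies two helpers to it. The toy values are placeholders rather than real verses, and `verse_ref()` and `rashi_share()` are hypothetical names, not functions exported by the package. The Rashi columns can be `NA` for verses without commentary, which is why the second helper counts non-missing values.

```r
# Toy stand-in with the same seven columns as the package datasets.
# All text values are placeholders, not real verses.
toy <- data.frame(
  chapter_number = c(1L, 1L, 1L, 2L),
  chapter_name   = factor(rep("Esther", 4)),
  line           = c(1L, 2L, 9L, 1L),
  english        = c("verse a", "verse b", "verse c", "verse d"),
  hebrew         = c("hebrew a", "hebrew b", "hebrew c", "hebrew d"),
  rashi_english  = c("comment a", "comment b", NA, "comment d"),
  rashi_hebrew   = c("perush a", "perush b", NA, "perush d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a "Book chapter:verse" reference for each row.
verse_ref <- function(data) {
  paste0(data$chapter_name, " ", data$chapter_number, ":", data$line)
}

# Hypothetical helper: share of verses that carry a Rashi comment.
rashi_share <- function(data) {
  mean(!is.na(data$rashi_english))
}

print(verse_ref(toy))
print(rashi_share(toy))
```

Because both helpers only rely on the column names, they should also work on `torah`, `neviim`, or `ketuvim` once the package is loaded.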

    Installation

    You can install the development version of tanakh from the R console:

    if (!require(remotes)) install.packages("remotes")
    Loading required package: remotes
    Attaching package: 'remotes'
    The following objects are masked from 'package:devtools':
    
        dev_package_deps, install_bioc, install_bitbucket, install_cran,
        install_deps, install_dev, install_git, install_github,
        install_gitlab, install_local, install_svn, install_url,
        install_version, update_packages
    if (!require(tanakh)) remotes::install_github("pachadotdev/tanakh")
    Loading required package: tanakh
    Warning in library(package, lib.loc = lib.loc, character.only = TRUE,
    logical.return = TRUE, : there is no package called 'tanakh'
    Downloading GitHub repo pachadotdev/tanakh@HEAD
    ── R CMD build ─────────────────────────────────────────────────────────────────
    * checking for file ‘/tmp/RtmpU1dIpR/remotes1c37a778bed42c/pachadotdev-tanakh-f399afd/DESCRIPTION’ ... OK
    * preparing ‘tanakh’:
    * checking DESCRIPTION meta-information ... OK
    * checking for LF line-endings in source and make files and shell scripts
    * checking for empty or unneeded directories
    * building ‘tanakh_0.0.0.9000.tar.gz’
    Installing package into '/home/pacha/R/x86_64-pc-linux-gnu-library/4.5'
    (as 'lib' is unspecified)
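
The masking messages in the output above appear because `require()` attaches the package it checks. A quieter variant, sketched here under the assumption that you only want to test for availability, uses `requireNamespace()`, which loads a namespace without attaching it. `ensure_package()` is a hypothetical helper written for this example, not part of remotes or tanakh.

```r
# Hypothetical helper: install `pkg` only if it is missing, without attaching it.
# `install` is a function that performs the installation when called.
ensure_package <- function(pkg, install = function() install.packages(pkg)) {
  if (!requireNamespace(pkg, quietly = TRUE)) install()
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Only run the network-dependent installation in an interactive session.
if (interactive()) {
  ensure_package("remotes")
  ensure_package(
    "tanakh",
    install = function() remotes::install_github("pachadotdev/tanakh")
  )
}
```

The helper returns `TRUE` invisibly when the package is available afterwards, so it can also be used as a guard before `library(tanakh)`.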

    Datasets

    Torah (Pentateuch): Bereshit (Genesis), Shemot (Exodus), Vayikra (Leviticus), Bamidbar (Numbers), Devarim (Deuteronomy)

    Nevi’im (Prophets): Yehoshua (Joshua), Shoftim (Judges), Shmuel I & II, Melachim I & II, Yeshayahu, Yirmiyahu, Yechezkel, Hoshea, Yoel, Amos, Ovadiah, Yonah, Michah, Nachum, Chavakuk, Tzefaniah, Chaggai, Zechariah, Malachi

    Ketuvim (Writings): Tehillim (Psalms), Mishlei (Proverbs), Iyov (Job), Shir Hashirim, Rut, Eichah, Kohelet, Esther, Daniel, Ezra, Nechemiah, Divrei Hayamim I & II

    Example Usage

    library(tanakh)
    library(dplyr)
    Attaching package: 'dplyr'
    The following objects are masked from 'package:stats':
    
        filter, lag
    The following objects are masked from 'package:base':
    
        intersect, setdiff, setequal, union
    # Find all verses mentioning Moses in the Torah
    torah %>%
      filter(grepl("Moses", english))
    # A tibble: 599 × 7
       chapter_number chapter_name    line english hebrew rashi_english rashi_hebrew
                <int> <fct>          <int> <chr>   <chr>  <chr>         <chr>       
     1              2 Shemot (Exodu…    10 "The c… וַיִּגְדַּ֣ל… For I drew h… "מְשִׁיתִֽהוּ. שְׁחַ…
     2              2 Shemot (Exodu…    11 "Now i… וַיְהִ֣י … Moses grew u… "וַיִּגְדַּל משֶׁה.…
     3              2 Shemot (Exodu…    14 "And h… וַיֹּ֠אמֶר… Who made you… "מִי שָֽׂמְךָ לְאִי…
     4              2 Shemot (Exodu…    15 "Phara… וַיִּשְׁמַ֤ע… Pharaoh hear… "וַיִּשְׁמַע פַּרְעֹה…
     5              2 Shemot (Exodu…    17 "But t… וַיָּבֹ֥אוּ… and drove th… "וַיְגָֽרְשׁוּם. מִ…
     6              2 Shemot (Exodu…    21 "Moses… וַיּ֥וֹאֶל… consented. H… "וַיּוֹאֶל. כְּתַרְ…
     7              3 Shemot (Exodu…     1 "Moses… וּמשֶׁ֗ה … after the fr… "אַחַר הַמִּדְבָּר.…
     8              3 Shemot (Exodu…     3 "So Mo… וַיֹּ֣אמֶר… Let me turn … "אָסֻֽרָה־נָּא. אָ…
     9              3 Shemot (Exodu…     4 "The L… וַיַּ֥רְא … <NA>           <NA>       
    10              3 Shemot (Exodu…     6 "And H… וַיֹּ֗אמֶר… <NA>           <NA>       
    # ℹ 589 more rows
    # Find all verses mentioning Amos in Nevi'im
    neviim %>%
      filter(grepl("Amos", english))
    # A tibble: 7 × 7
      chapter_number chapter_name  line english    hebrew rashi_english rashi_hebrew
               <int> <fct>        <int> <chr>      <chr>  <chr>         <chr>       
    1              1 Amos             1 "The word… דִּבְרֵ֣י … who was amon… "אשר היה בנ…
    2              7 Amos             8 "And the … וַיֹּ֨אמֶר… Behold I pla… "הנני שם אנ…
    3              7 Amos            10 "And Amaz… וַיִּשְׁלַ֗ח… the priest o… "כהן בית אל…
    4              7 Amos            11 "For so s… כִּֽי־כֹה֙… Jeroboam sha… "בחרב ימות …
    5              7 Amos            12 "And Amaz… וַיֹּ֚אמֶר… “Seer”. You,… "חוזה. אתה …
    6              7 Amos            14 "And Amos… וַיַּ֚עַן … I am neither… "לא נביא אנ…
    7              8 Amos             2 "And He s… וַיֹּ֗אמֶר… <NA>           <NA>       
    # Find all verses from Esther in Ketuvim
    ketuvim %>%
      filter(chapter_name == "Esther")
    # A tibble: 167 × 7
       chapter_number chapter_name  line english   hebrew rashi_english rashi_hebrew
                <int> <fct>        <int> <chr>     <chr>  <chr>         <chr>       
     1              1 Esther           1 Now it c… וַיְהִ֖י … Now it came … "וַיְהִי בִּימֵי …
     2              1 Esther           2 In those… בַּיָּמִ֖ים… when King Ah… "כְּשֶׁבֶת הַמֶּלֶךְ …
     3              1 Esther           3 In the t… בִּשְׁנַ֤ת … the nobles. … "הַפַּרְתְּמִים. שִׁ…
     4              1 Esther           4 When he … בְּהַרְאֹת֗… many days. H… "יָמִים רַבִּים.…
     5              1 Esther           5 And when… וּבִמְל֣א… the garden. … "גִּנַּת. מְקוֹם …
     6              1 Esther           6 [There w… ח֣וּר |… white, fine … "חוּר כַּרְפַּס וּ…
     7              1 Esther           7 And they… וְהַשְׁקוֹ… And they gav… "וְהַשְׁקוֹת בִּכְלֵ…
     8              1 Esther           8 And the … וְהַשְּׁתִיָּ֥… according to… "כַדָּת. לְפִי שֶׁ…
     9              1 Esther           9 Also, Va… גַּם וַשְׁ… <NA>           <NA>       
    10              1 Esther          10 On the s… בַּיּוֹם֙ … On the seven… "בַּיּוֹם הַשְּׁבִיעִ…
    # ℹ 157 more rows
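
Note that `grepl("Moses", english)` is a case-sensitive substring match, so a short name also hits longer words that contain it (a pattern such as "Dan" matches "Daniel" as well). A minimal sketch of whole-word matching with `\\b` word boundaries follows. `find_word()` is a hypothetical helper, the toy rows are placeholders rather than verses from the package, and `word` is treated as a regular expression, so special characters would need escaping.

```r
# Toy rows (placeholder sentences, not verses from the package).
toy <- data.frame(
  chapter_name = factor(c("Book A", "Book A", "Book B")),
  english      = c("Dan spoke first", "Daniel answered him", "They came to Dan"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose `english` text contains `word`
# as a whole word, using regex word boundaries.
find_word <- function(data, word, ignore_case = FALSE) {
  pattern <- paste0("\\b", word, "\\b")
  hits <- grepl(pattern, data$english, ignore.case = ignore_case, perl = TRUE)
  data[hits, , drop = FALSE]
}

# A plain substring match also counts the "Daniel" row.
print(sum(grepl("Dan", toy$english)))

# The whole-word match skips it; table() then counts matches per book.
dan <- find_word(toy, "Dan")
print(table(dan$chapter_name))
```

With the package loaded, the same idea should carry over to dplyr, for example `torah %>% filter(grepl("\\bMoses\\b", english, perl = TRUE)) %>% count(chapter_name)`.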

    Source

    Data sourced from Chabad.org – The Bible with Rashi

    Font

    To display the Hebrew text with niqqud correctly, please ensure that a font supporting Hebrew diacritics is installed on your system. Recommended font: Noto Sans Hebrew. See the repository for more information on installation.
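
If a suitable font is not available, or if you want to search the `hebrew` column for a word typed without points, one option is to strip the diacritics first. The sketch below is an assumption about how one might do this, not something the package provides: `strip_marks()` is a hypothetical helper that removes every Unicode nonspacing mark (category Mn), which covers niqqud but also cantillation marks. Punctuation such as the maqaf is left in place.

```r
# Hypothetical helper: remove Unicode nonspacing marks (category Mn),
# which covers Hebrew vowel points and cantillation marks.
strip_marks <- function(x) {
  gsub("\\p{Mn}", "", x, perl = TRUE)
}

# The name "Moshe" written with points, using \u escapes so the script
# does not depend on the encoding of the source file.
pointed <- "\u05de\u05b9\u05e9\u05c1\u05b6\u05d4"
plain   <- strip_marks(pointed)

# Number of code points before and after stripping.
print(nchar(pointed))
print(nchar(plain))
```

A stripped copy could be stored in a new column and searched with `grepl(..., fixed = TRUE)`, keeping the original `hebrew` column for display.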
