era 0.3.1: year-based time scales in R

[This article was first published on Joe Roe, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

era v0.3.1 is now available on CRAN:

install.packages("era")

I wrote about the basic concept of the package in my last post. It provides a representation of year-based time scales in R, built around two core classes: yr, a vector of years, and era, defining the parameters that relate those years to a moment in time. The idea is to make it easier to handle different era systems in a tidy analysis, and convert between them, as described in the introductory vignette:

library("tibble")
library("dplyr")

tribble(
  ~period,           ~start_ka,
  "Late Holocene",   4.2,
  "Mid Holocene",    8.326,
  "Early Holocene",  11.7,
  "Younger Dryas",   12.9,
  "Bølling-Allerød", 14.7,
  "Heinrich 1",      17.0
) %>% 
  mutate(start_ka = yr(start_ka, "ka")) %>% 
  mutate(start_bp = yr_transform(start_ka, era("BP")),
         start_bce = yr_transform(start_ka, era("BCE")))
#> # A tibble: 6 x 4
#>   period          start_ka start_bp start_bce
#>                             
#> 1 Late Holocene     4.2 ka  4200 BP  2250 BCE
#> 2 Mid Holocene    8.326 ka  8326 BP  6376 BCE
#> 3 Early Holocene   11.7 ka 11700 BP  9750 BCE
#> 4 Younger Dryas    12.9 ka 12900 BP 10950 BCE
#> 5 Bølling-Allerød  14.7 ka 14700 BP 12750 BCE
#> 6 Heinrich 1         17 ka 17000 BP 15050 BCE

The major changes since the beta release (v0.2.0) flesh out the definition of an “era” so that a broader range of time scales and calendric year numbering systems can be represented as yr vectors:

  • yr_transform() can now handle eras with different units.
  • ‘Year units’ are now explicitly defined as the length of a year in days, with a new class era_year.
  • Built-in definitions for lunar and solar hijra year numbering systems, and the Julian calendar, are now included (as examples of era systems that use different units).

These and other minor changes are described in full in the changelog. In the rest of this post, I want to explain the concepts behind this representation of years.

Years as a real number time scale

era is concerned with year-based time scales; a measurement of the length of time elapsed between one instant and another, counted in years. This is the sense used by archaeologists when they say an object is about 5,000 years old, or geologists when they say the Holocene started 11.7 thousand years ago. It’s not quite the same as the everyday notion of a year, which is date (or part of a date), describing a span of time on a calendar.

To tie a set of numbers to a concrete measurement of time, we need to know:

  • the epoch – the datum instant from which measurements are made, which could be a calendar epoch (e.g. the birth of Jesus in the Common Era) or some other conventionally-agreed point in time (e.g. the “Present” in radiometric dating);
  • the year unit used;
  • the direction in which years are counted, i.e. forwards or backwards from the epoch;
  • the multiplicative scale used, when measurements are made in e.g. thousands of years rather than single years.

Together with two attributes used human-friendly printing (name and label), these four parameters comprise the era attribute of a yr vector in era. They also enable us to derive a generalised algorithm for transforming years between different era systems (yr_transform()).

Rather than something decomposable to a set of calendar dates (12 months, 365 days, etc.), yr vectors are treated as a real number scale with a year as its base unit. For example, it makes sense to think about both fractional years and negative years:

library("era")
frac_year <- function(x) {
  as.numeric(format(x, "%Y")) + (as.numeric(format(x, "%j")) / 365.2425)
}

# It is now:
yr(frac_year(Sys.Date()), "CE")
#> # CE years :
#> [1] 2021.104
#> # Era: Common Era (CE): Gregorian years (365.2425 days), counted forwards from 0

# Julius Caesar was born:
yr_transform(yr(100, "BCE"), "CE")
#> # CE years :
#> [1] -100
#> # Era: Common Era (CE): Gregorian years (365.2425 days), counted forwards from 0

Unfortunately, a “year” is not a single or well-defined unit. era used the era_year class to represents different year units, defined as their average length in days. Although some units (e.g. the astronomical Julian year) can be precisely defined in this way, for most it is an approximation. For the most part we can get away with this because we are not concerned with exact dates, but it does have implications for the accuracy of some transformations, especially when they involve calendar eras.

Time scales and calendars

In addition to “Before Present”-style eras used by palaeoscientists, era supports, and includes built-in definitions for, year numbering systems from contemporary and historic calendars. But since the package treats years as a time scale, not as dates, it can only represents the year numbering components of these calendars. If we were to implement the modern Iranian calendar as a Solar Hijri (SH) era, for example, we would only consider its:

  • Epoch for year numbering – the day of Nowruz (the vernal equinox) in the year 622 CE. Since epochs are defined in era("CE"), we define this as approximately 622.2218, and subtract 1 because the numbering started at 1 SH.
  • Year unit – the length of time between two Nowruz, which is close but not exactly the same as a Gregorian year. We define this as approximately 365.2424 days, based on astronomical data from the last few millennia.
  • Direction and scale

This gives us:

sh <- era(label = "SH",
          epoch = 622.2218 - 1,
          name = "Solar Hijri",
          unit = era_year("Nowruz", 365.2424),
          scale = 1,
          direction = 1)

These parameters should lead to a good approximation of the transformation between the SH era we defined and others, especially over longer time scales. But it isn’t precise, which we might see if we try to distinguish points close in time. For example, transforming the current beginning of this year in the Common Era (yr(2021.0, "CE")) into our new era gives us:

yr_transform(yr(2021, "CE"), sh)
#> # SH years :
#> [1] 1399.779
#> # Era: Solar Hijri (SH): Nowruz years (365.2424 days), counted forwards from 621.2218

In reality the 1 January 2021 was 12 Dey 1399. 1399 is a leap year, with 366 days, and 12 Dey is the 228th day of the year, so this is close to the value we would expect: 1399.787.

Next release

For the next release of era, I plan to focus on the built-in era definitions, especially those derived from contemporary calendar systems. The development version already includes tweaks to the definitions of the Islamic calendars that have made the conversion from Gregorian years more accurate:

this_year()
#> # CE years :
#> [1] 2021
#> # Era: Common Era (CE): Gregorian years (365.2425 days), counted forwards from 0

this_year("AH")
#> # AH years :
#> [1] 1442
#> # Era: Anno Hegirae (AH): Islamic lunar years (354.36708 days), counted forwards from 621.539367680377

this_year("SH")
#> # SH years :
#> [1] 1399
#> # Era: Solar Hijri (SH): Nowruz years (365.2424 days), counted forwards from 621.221770467566

If you have any suggestions of eras to include, please do submit an issue on GitHub.

To leave a comment for the author, please follow the link and comment on their blog: Joe Roe.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)