tsbox 0.3.1: extended functionality

[This article was first published on cynkra, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The tsbox package provides a set of tools that are agnostic towards existing time series classes. The tools also allow you to handle time series as plain data frames, thus making it easy to deal with time series in a dplyr or data.table workflow.

Illustration
Photo by James Sutton

Version 0.3.1 is now on CRAN and provides several bugfixes and extensions (see here for the full change log). A detailed overview of the package functionality is given in the documentation page (or in an older blog-post).

New and extended functionality

  • ts_frequency(): changes the frequency of a time series. It is now possible to aggregate any time series to years, quarters, months, weeks, days, hours, minutes or seconds. For low- to high-frequency conversion, the tempdisagg package can now convert low frequency to high frequency and has support for ts-boxable objects. E.g.:

      library(tsbox)
            x <- ts_tbl(EuStockMarkets)
            x
            #> # A tibble: 7,440 × 3
            #>   id    time                 value
            #>   <chr> <dttm>               <dbl>
            #> 1 DAX   1991-07-01 03:18:27  1629.
            #> 2 DAX   1991-07-02 13:01:32  1614.
            #> 3 DAX   1991-07-03 22:44:38  1607.
            #> 4 DAX   1991-07-05 08:27:43  1621.
            #> 5 DAX   1991-07-06 18:10:48  1618.
            #> # … with 7,435 more rows
            ts_frequency(x, "week")
            #> # A tibble: 1,492 × 3
            #>   id    time        value
            #>   <chr> <date>      <dbl>
            #> 1 DAX   1991-06-30  1618.
            #> 2 DAX   1991-07-07  1633.
            #> 3 DAX   1991-07-14  1632.
            #> 4 DAX   1991-07-21  1620.
            #> 5 DAX   1991-07-28  1616.
            #> # … with 1,487 more rows
            
  • ts_index(): returns an indexed series, with a value of 1 at the base period. This base period can now be specified more flexibly. E.g., the average of a year can defined as 1 (which is a common use case).

  • ts_na_interpolation(): A new function that wraps imputeTS::na_interpolation() from the imputeTS package and allows the imputation of missing values for any time series object.

  • ts_first_of_period(): A new function that replaces the date or time value by the first of the period. This is useful because tsbox usually relies on timestamps being the first of a period. The following monthly series has an offset of 14 days. ts_first_of_period() changes the timestamp to the first date of each month:

      x <- ts_lag(ts_tbl(mdeaths), "14 days")
            x
            #> # A tibble: 72 × 2
            #>   time       value
            #>   <date>     <dbl>
            #> 1 1974-01-15  2134
            #> 2 1974-02-15  1863
            #> 3 1974-03-15  1877
            #> 4 1974-04-15  1877
            #> 5 1974-05-15  1492
            #> # … with 67 more rows
            ts_first_of_period(x)
            #> # A tibble: 72 × 2
            #>   time       value
            #>   <date>     <dbl>
            #> 1 1974-01-01  2134
            #> 2 1974-02-01  1863
            #> 3 1974-03-01  1877
            #> 4 1974-04-01  1877
            #> 5 1974-05-01  1492
            #> # … with 67 more rows
            
  • Convert everything to everything

    tsbox is built around a set of converters, which convert time series stored as ts, xts, data.frame, data.table, tibble, zoo, tsibble, tibbletime, tis, irts or timeSeries to each other:

    library(tsbox)
            x.ts <- ts_c(fdeaths, mdeaths)
            x.xts <- ts_xts(x.ts)
            x.df <- ts_df(x.xts)
            x.dt <- ts_dt(x.df)
            x.tbl <- ts_tbl(x.dt)
            x.zoo <- ts_zoo(x.tbl)
            x.tsibble <- ts_tsibble(x.zoo)
            x.tibbletime <- ts_tibbletime(x.tsibble)
            x.timeSeries <- ts_timeSeries(x.tibbletime)
            x.irts <- ts_irts(x.tibbletime)
            x.tis <- ts_tis(x.irts)
            all.equal(ts_ts(x.tis), x.ts)
            #> [1] TRUE
            

    Use same functions for time series classes

    Because this works reliably, it is easy to define a toolkit that works for all classes. So, whether we want to smooth, scale, differentiate, chain, forecast, regularize, impute or seasonally adjust a time series, we can use the same commands to whatever time series class at hand:

    ts_trend(x.ts)   # estimate a trend line
            ts_pc(x.xts)     # calculate percentage change rates (period on period)
            ts_pcy(x.df)     # calculate percentage change rates (year on year)
            ts_lag(x.dt)     # lagged series
            

    There are many more. Because they all start with ts_, you can use auto-complete to see what’s around. Most conveniently, there is a time series plot function that works for all classes and frequencies:

    ts_plot(
            `Airline Passengers` = AirPassengers,
            `Lynx trappings` = ts_tis(lynx),
            `Deaths from Lung Diseases` = ts_xts(fdeaths),
            title = "Airlines, trappings, and deaths",
            subtitle = "Monthly passengers, annual trappings, monthly deaths"
            )
            

    time series plot

    To leave a comment for the author, please follow the link and comment on their blog: cynkra.

    R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
    Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

    Never miss an update!
    Subscribe to R-bloggers to receive
    e-mails with the latest R posts.
    (You will not see this message again.)

    Click here to close (This popup will not appear again)