The Tidy Time Series Platform: tibbletime 0.1.0
We’re happy to announce the third release of the tibbletime package. This is a huge update, mainly due to a complete rewrite of the package. It contains a ton of new functionality and a number of breaking changes that existing users need to be aware of. All of the changes have been well documented in the NEWS file, but it’s worthwhile to touch on a few of them here and discuss the future of the package. We’re super excited, so let’s check out the vision for tibbletime and its new functionality!
About Tibbletime
For those new to the package, tibbletime is a new package that enables the creation of time-aware tibbles. Its sole purpose is to make working with time series in the tidyverse much easier! The documentation explains everything in detail, and here are a few important vignettes that can help get you up to speed on all of the functionality:
- Time-Based Filtering
- Changing Periodicity
- Rolling Calculations In tibbletime
- Using tibbletime With dplyr (brand new!)
Package roadmap
The grand view is to have tibbletime function as a base package that others can build off of, utilizing the infrastructure that “knows” about the index column and provides support for time series transformations on tibbles. This can include extensions to finance, but also has room to grow into other areas such as economic forecasting, longitudinal studies, and other general time series analyses. We’ve already begun work on one such package, but that will be a post for another time ;).
At this point, the first bit of core functionality for tibbletime is complete. A few other functions will likely be added, but we will definitely support backwards compatibility from here on out.
New time series capabilities
The tibbletime package was completely re-envisioned, making it much more flexible and general. Here are a few of the important new tools in tibbletime’s toolkit:
- A new index partitioning function (collapse_index()) that opens up powerful time-based analysis with any dplyr function, rather than a specific (and limited) set of time_summarise(), time_mutate(), etc., functions.
- Full support for Date and POSIXct classes as indices, and experimental support for yearmon, yearqtr, and hms, which should get more stable over time.
- A consistent API along with more informative argument names that attempt to give it the intuitive look and feel of a tidyverse package.
The one downside is that we had to make a few breaking changes, but with this post you’ll be able to easily get your code up to speed with the new functionality. What follows are a few of the most important changes for those who already use tibbletime and are interested in seeing what has changed.
Libraries
Load the following libraries to follow along.
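A minimal setup sketch, assuming the examples in this post use the FB data set (daily Facebook stock prices) that ships with tibbletime, with date as the index column:

```r
# Load the packages used throughout this post
library(tibbletime)
library(dplyr)

# FB: daily Facebook stock prices shipped with tibbletime
data(FB)

# Convert to a time-aware tibble, declaring `date` as the index column
FB <- as_tbl_time(FB, index = date)
FB
```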
time_collapse() -> collapse_index()
Previously, time_collapse() worked on an entire tbl_time object. It has been replaced with partition_index() and collapse_index(), which manipulate only the index (date) vector. This allows them to be used inside a call to mutate() and gives the user more control over the outcome (for example, whether they want to assign it to a new column or overwrite the original index column).
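As a rough sketch of both options, assuming the FB data set from tibbletime and a yearly period (year_end is a hypothetical column name chosen for this illustration):

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Overwrite the index: each date is replaced by the last date
# observed in its year (collapse_index() defaults to side = "end")
FB_collapsed <- FB %>%
  mutate(date = collapse_index(date, period = "yearly"))

# Or keep the original index and store the collapsed dates alongside it
# (`year_end` is a hypothetical column name used only for this example)
FB_both <- FB %>%
  mutate(year_end = collapse_index(date, period = "yearly"))

FB_collapsed
```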
The index has been collapsed. We can now do easy dplyr operations like summaries.
An added bonus is tighter integration with dplyr, which makes time_summarise() and the other time_*() functions obsolete. Instead, you group on the collapsed date column and can then use any dplyr function that your heart desires. For example, here is how to create 6-month summaries for every column of the Facebook data using summarise_if().
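One way this could look, assuming the FB data set and taking the mean of each numeric column as the summary (the choice of mean and the is.numeric predicate are assumptions made for illustration):

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Collapse the index into 6-month buckets, group on it,
# then summarise every numeric column with its mean
FB_6m <- FB %>%
  mutate(date = collapse_index(date, period = "6 month")) %>%
  group_by(date) %>%
  summarise_if(is.numeric, mean)

FB_6m
```

Because the grouping is ordinary dplyr grouping, mean could be swapped for any other summary function without touching the time-based part of the pipeline.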
This incremental approach utilizing dplyr groups should feel natural to any tidyverse user. Because of this improved workflow, time_summarise() and friends have been removed.
time_filter() -> filter_time()
A simple change, but with the removal of other time_*() functions it makes more sense to rename time_filter() as filter_time().
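A short sketch of the renamed function, assuming the FB data set and using a one-sided time formula to keep a single year:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Previously: time_filter(). Now: filter_time().
# A one-sided formula keeps every row that falls inside 2013.
FB_2013 <- filter_time(FB, ~ "2013")

FB_2013
```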
Formula style arguments
Those familiar with tibbletime may be used to the formula-style shorthand for specifying the period and time_formula arguments found throughout the package. The period argument now accepts only character strings, as there was little added benefit from using formulas. The time_formula argument found in filter_time() and create_series() still uses the from ~ to syntax, but each side must be a character string rather than a bare specification.
Period Specification
Previous way (error):
New way (quoted, no error):
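A sketch of the difference, assuming the FB data set and as_period() as the function taking the period. The old formula form is shown only as a comment so the block runs cleanly:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Previous way (formula shorthand, no longer accepted):
# as_period(FB, period = 1~y)

# New way: the period is a quoted character string
FB_yearly <- as_period(FB, period = "yearly")

FB_yearly
```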
Time Formula Specification
Previous way (error):
New way (quoted, no error):
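The same idea for the time formula, again assuming the FB data set. The dates are arbitrary, and the bare form appears only as a comment:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Previous way (bare specification, no longer accepted):
# filter_time(FB, 2013-03 ~ 2013-06)

# New way: each side of the formula is quoted
FB_spring <- filter_time(FB, "2013-03" ~ "2013-06")

FB_spring
```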
This may seem like a step backwards, but it is more robust to program with and allows the user to pass in actual variables to the time formula (something that was requested a few times but was difficult to do). In this example you can use characters or real Date objects, both of which are then unquoted appropriately using rlang.
Programming with a character date.
Programming with a Date object.
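A sketch of both cases, assuming the FB data set. The variable names date_char and date_var are hypothetical, and the "start" keyword used in the second call is described later in this post:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# Character variable: keep every row in February 2014
date_char <- "2014-02"
FB_feb <- filter_time(FB, ~ date_char)

# Date variable: keep everything from the start of the series
# up to and including the given date
date_var <- as.Date("2014-01-01")
FB_through <- filter_time(FB, "start" ~ date_var)

FB_feb
FB_through
```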
While we are on the topic of filter_time(), check out the new keywords "start" and "end" that you can use in your formula specification.
Using keyword "start":
Using keyword "end":
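A brief sketch of both keywords, assuming the FB data set and arbitrary cutoff months:

```r
library(tibbletime)
library(dplyr)

data(FB)
FB <- as_tbl_time(FB, index = date)

# "start": from the first date in the series through March 2013
FB_from_start <- filter_time(FB, "start" ~ "2013-03")

# "end": from October 2016 through the last date in the series
FB_to_end <- filter_time(FB, "2016-10" ~ "end")

FB_from_start
FB_to_end
```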
Other changes
There are plenty of other minor changes that make the package more consistent and easier for the user, so we encourage reading the NEWS file and checking out the updated vignettes for more information.
Special thanks
Dmytro Perepolkin (@dmi3k on Twitter) gave a lot of good feedback on the previous version of tibbletime, and nicely helped promote the package on Twitter and Stack Overflow, so we just wanted to give a special shout out to him! Thanks!
Wrap Up
We are super excited about the new release of the re-imagined tibbletime package. It has a ton of new functionality and it can now be extended as a platform to build packages on. The sky is the limit with tibbletime. Install the package, and let us know what you think!
About Business Science
Business Science specializes in “ROI-driven data science”. Our focus is machine learning and data science in business and financial applications. We build web applications and automated reports to put machine learning in the hands of decision makers. Visit Business Science or contact us to learn more!
Business Science University
Interested in learning data science for business? Enroll in Business Science University. We’ll teach you how to apply data science and machine learning in real-world business applications. We take you through the entire process of modeling problems, creating interactive data products, and distributing solutions within an organization. We are launching courses in early 2018!
Follow Business Science on Social Media
- @bizScienc is on Twitter!
- Check out our Facebook page!
- Check us out on LinkedIn!
- Sign up for our insights blog to stay updated!
- If you like our software, star our GitHub packages!