Twitter’s new R package for anomaly detection

January 7, 2015

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

For Twitter, finding anomalies (sudden spikes or dips) in a time series is important for keeping the microblogging service running smoothly. A sudden spike in shared photos may signify a "trending" event, whereas a sudden dip in posts might indicate a failure in one of the back-end services that needs to be addressed. To detect such anomalies, the engineering team at Twitter created the AnomalyDetection R package, which they recently released as open source. (Late last year, Twitter released a separate but related R package to detect "breakouts" in time series.)

Finding spikes and dips is relatively easy when they are extreme enough to extend beyond the natural seasonal variation in the time series. (Twitter calls these "global anomalies".) The real trick is identifying "local anomalies": small deviations from the seasonal pattern that stay within the usual range of values.

[Figure: local and global anomalies in a time series]
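To make the distinction concrete, here is a small Python sketch on synthetic data. Everything in it is an assumption made for illustration: the 24-sample "daily" cycle, the injected values, and the tolerance of 1.0 are made up, and none of it comes from Twitter's package. It only shows why a plain range check finds the global anomaly but misses the local one.

```python
import math
import statistics

PERIOD = 24  # assumed: hourly samples with a daily cycle

# One week of a clean, repeating daily pattern (synthetic data).
series = [10 + 5 * math.sin(2 * math.pi * (t % PERIOD) / PERIOD)
          for t in range(7 * PERIOD)]
usual_low, usual_high = min(series), max(series)  # roughly 5 and 15

series[40] = 30.0  # global: above anything the daily cycle produces
series[90] = 12.0  # local: an unremarkable value, but at the daily trough

# A plain range check only sees the point outside the usual range.
out_of_range = [t for t, x in enumerate(series)
                if x < usual_low or x > usual_high]

# Comparing each point with the typical value for its hour sees both.
typical = [statistics.median(series[p::PERIOD]) for p in range(PERIOD)]
off_pattern = [t for t, x in enumerate(series)
               if abs(x - typical[t % PERIOD]) > 1.0]

print(out_of_range)  # [40]
print(off_pattern)   # [40, 90]
```

The value 12 at index 90 would be perfectly ordinary at midday in this toy series. It only stands out because it arrives at the hour when the series normally bottoms out near 5.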

The AnomalyDetection package uses the Seasonal Hybrid ESD (S-H-ESD) algorithm, which combines seasonal decomposition with robust statistical methods to identify local and global anomalies. The package can also be used to detect anomalies in non-time-series (unordered) data, though in this case the concept of "local" anomalies doesn't apply. You can find out more about the package and how it's used at Twitter at the link below, or install it from GitHub for use with R.
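The general idea of pairing a seasonal decomposition with robust statistics can be sketched in a few lines of Python. This is not Twitter's implementation, and it simplifies heavily: per-phase medians stand in for a proper seasonal decomposition, and a fixed cutoff on a median/MAD score stands in for the critical values of the ESD test. The function names, `PERIOD`, `CUTOFF`, `MAX_ANOMS`, and the synthetic data are all assumptions made for the example.

```python
import math
import statistics

PERIOD = 24     # assumed season length: hourly samples, daily cycle
CUTOFF = 3.5    # assumed fixed cutoff on the robust score (not an ESD critical value)
MAX_ANOMS = 4   # assumed cap on how many points may be flagged


def seasonal_residuals(series, period):
    """Subtract each phase's median across cycles (a crude seasonal estimate)."""
    phase_median = [statistics.median(series[p::period]) for p in range(period)]
    return [x - phase_median[t % period] for t, x in enumerate(series)]


def robust_extremes(residuals, cutoff, max_anoms):
    """Repeatedly remove the point farthest from the median, ESD-style,
    scoring with median and MAD instead of mean and standard deviation."""
    remaining = dict(enumerate(residuals))  # index -> residual
    flagged = []
    for _ in range(max_anoms):
        center = statistics.median(remaining.values())
        mad = statistics.median(abs(r - center) for r in remaining.values())
        if mad == 0:
            break  # no spread left to measure against
        scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
        idx = max(remaining, key=lambda i: abs(remaining[i] - center))
        if abs(remaining[idx] - center) / scale <= cutoff:
            break  # the most extreme remaining point is unremarkable
        flagged.append(idx)
        del remaining[idx]
    return sorted(flagged)


# Synthetic week: a daily cycle plus a small per-day offset as stand-in noise.
day_offset = [-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3]
series = [
    10 + 5 * math.sin(2 * math.pi * (t % PERIOD) / PERIOD) + day_offset[t // PERIOD]
    for t in range(7 * PERIOD)
]
series[40] = 30.0  # beyond the usual range
series[90] = 12.0  # within the usual range, but wrong for that hour

anomalies = robust_extremes(seasonal_residuals(series, PERIOD), CUTOFF, MAX_ANOMS)
print(anomalies)  # [40, 90]
```

Median and MAD are used because the anomalies themselves barely move them, whereas a single large spike can inflate a mean and standard deviation enough to hide smaller anomalies behind it.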

Twitter Engineering Blog: Introducing practical and robust anomaly detection in a time series 

