**From the bottom of the heap » R**, and kindly contributed to R-bloggers)

Some months ago, I posted a function that extended the base R function `embed()`

to allow for time delay embedding. Today, David Gonzales alerted me to an inconsistency between `embed()`

and `Embed()`

. The example David used was

R> (x <- seq(1,20,3)) [1] 1 4 7 10 13 16 19 R> embed(x, 4) [,1] [,2] [,3] [,4] [1,] 10 7 4 1 [2,] 13 10 7 4 [3,] 16 13 10 7 [4,] 19 16 13 10 R> Embed(x, 4) [,1] [,2] [,3] [,4] [1,] 4 3 2 1 [2,] 7 6 5 4 [3,] 10 9 8 7 [4,] 13 12 11 10

where `Embed()`

clearly returns an incorrect result.

In this post, I present an explanation of the problem and addressing the shortcomings in the original code with an updated version of `Embed()`

.

The reason the original version of `Embed()`

doesn’t work with David’s example is that when I wrote it, I had in mind that it would work on the *indices* of the time series, not the values of the time series. I had overlooked that `embed()`

returned the embedded time series, not the indices — the problem of testing with vectors like `1:10`

!

Updating `Embed()`

to output the same result as `embed()`

is a trivial matter; we just get the function to work with `seq_along(x)`

and not `x`

itself and then use the old `Embed()`

behaviour to index `x`

to return the embedded time series. As an added extra, as we are generating the indices anyway, we can optionally have the function return those instead of the embedded series.

Here is the updated version of `Embed()`

Embed <- function(x, m, d = 1, indices = FALSE, as.embed = TRUE) { n <- length(x) - (m-1)*d X <- seq_along(x) if(n <= 0) stop("Insufficient observations for the requested embedding") out <- matrix(rep(X[seq_len(n)], m), ncol = m) out[,-1] <- out[,-1, drop = FALSE] + rep(seq_len(m - 1) * d, each = nrow(out)) if(as.embed) out <- out[, rev(seq_len(ncol(out)))] if(!indices) out <- matrix(x[out], ncol = m) out }

The main difference is that we create `X <- seq_along(x)`

and create `out`

using that rather than the time series (`x`

). I’ve also added a new argument, `indices`

, that defaults to `FALSE`

. If we want `Embed()`

to return the indices of the embedded time series, call the function with `indices = FALSE`

.

The new version of `Embed()`

gives the same results as before and is consistent with `embed()`

when we pass it a time series that is identical to its indices

R> embed(1:5, 2) [,1] [,2] [1,] 2 1 [2,] 3 2 [3,] 4 3 [4,] 5 4 R> Embed(1:5, 2) [,1] [,2] [1,] 2 1 [2,] 3 2 [3,] 4 3 [4,] 5 4

but it also works for time series like those in David’s example:

R> (x <- seq(1,20,3)) [1] 1 4 7 10 13 16 19 R> embed(x, 4) [,1] [,2] [,3] [,4] [1,] 10 7 4 1 [2,] 13 10 7 4 [3,] 16 13 10 7 [4,] 19 16 13 10 R> Embed(x, 4) [,1] [,2] [,3] [,4] [1,] 10 7 4 1 [2,] 13 10 7 4 [3,] 16 13 10 7 [4,] 19 16 13 10

and we have the added benefit of being able to return the indices of the embedded time series

R> Embed(x, 4, indices = TRUE) [,1] [,2] [,3] [,4] [1,] 4 3 2 1 [2,] 5 4 3 2 [3,] 6 5 4 3 [4,] 7 6 5 4

Now I just need to do something on the recurrence plot that I originally wrote `Embed()`

for!

**leave a comment**for the author, please follow the link and comment on their blog:

**From the bottom of the heap » R**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...