# Embedding a time series with time delay in R — Part II

June 14, 2011
By

[This article was first published on From the bottom of the heap » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Some months ago, I posted a function that extended the base R function `embed()` to allow for time delay embedding. Today, David Gonzales alerted me to an inconsistency between `embed()` and `Embed()`. The example David used was

```R> (x <- seq(1,20,3))
[1]  1  4  7 10 13 16 19
R> embed(x, 4)
[,1] [,2] [,3] [,4]
[1,]   10    7    4    1
[2,]   13   10    7    4
[3,]   16   13   10    7
[4,]   19   16   13   10
R> Embed(x, 4)
[,1] [,2] [,3] [,4]
[1,]    4    3    2    1
[2,]    7    6    5    4
[3,]   10    9    8    7
[4,]   13   12   11   10
```

where `Embed()` clearly returns an incorrect result.

In this post, I present an explanation of the problem and addressing the shortcomings in the original code with an updated version of `Embed()`.

The reason the original version of `Embed()` doesn’t work with David’s example is that when I wrote it, I had in mind that it would work on the indices of the time series, not the values of the time series. I had overlooked that `embed()` returned the embedded time series, not the indices — the problem of testing with vectors like `1:10`!

Updating `Embed()` to output the same result as `embed()` is a trivial matter; we just get the function to work with `seq_along(x)` and not `x` itself and then use the old `Embed()` behaviour to index `x` to return the embedded time series. As an added extra, as we are generating the indices anyway, we can optionally have the function return those instead of the embedded series.

Here is the updated version of `Embed()`

```Embed <- function(x, m, d = 1, indices = FALSE, as.embed = TRUE) {
n <- length(x) - (m-1)*d
X <- seq_along(x)
if(n <= 0)
stop("Insufficient observations for the requested embedding")
out <- matrix(rep(X[seq_len(n)], m), ncol = m)
out[,-1] <- out[,-1, drop = FALSE] +
rep(seq_len(m - 1) * d, each = nrow(out))
if(as.embed)
out <- out[, rev(seq_len(ncol(out)))]
if(!indices)
out <- matrix(x[out], ncol = m)
out
}
```

The main difference is that we create `X <- seq_along(x)` and create `out` using that rather than the time series (`x`). I’ve also added a new argument, `indices`, that defaults to `FALSE`. If we want `Embed()` to return the indices of the embedded time series, call the function with `indices = FALSE`.

The new version of `Embed()` gives the same results as before and is consistent with `embed()` when we pass it a time series that is identical to its indices

```R> embed(1:5, 2)
[,1] [,2]
[1,]    2    1
[2,]    3    2
[3,]    4    3
[4,]    5    4
R> Embed(1:5, 2)
[,1] [,2]
[1,]    2    1
[2,]    3    2
[3,]    4    3
[4,]    5    4
```

but it also works for time series like those in David’s example:

```R> (x <- seq(1,20,3))
[1]  1  4  7 10 13 16 19
R> embed(x, 4)
[,1] [,2] [,3] [,4]
[1,]   10    7    4    1
[2,]   13   10    7    4
[3,]   16   13   10    7
[4,]   19   16   13   10
R> Embed(x, 4)
[,1] [,2] [,3] [,4]
[1,]   10    7    4    1
[2,]   13   10    7    4
[3,]   16   13   10    7
[4,]   19   16   13   10
```

and we have the added benefit of being able to return the indices of the embedded time series

```R> Embed(x, 4, indices = TRUE)
[,1] [,2] [,3] [,4]
[1,]    4    3    2    1
[2,]    5    4    3    2
[3,]    6    5    4    3
[4,]    7    6    5    4
```

Now I just need to do something on the recurrence plot that I originally wrote `Embed()` for!

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.