Articles by Sigrid Keydana

Introductory time series forecasting with torch

March 9, 2021 | Sigrid Keydana

This is the first post in a series introducing time-series forecasting with torch. It does assume some prior experience with torch and/or deep learning. But as far as time series are concerned, it starts right from the beginning, using recurrent neural networks (GRU or LSTM) to predict how something ...
[Read more...]

torch, tidymodels, and high-energy physics

February 10, 2021 | Sigrid Keydana

So what’s with the clickbait (high-energy physics)? Well, it’s not just clickbait. To showcase TabNet, we will be using the Higgs dataset (Baldi, Sadowski, and Whiteson (2014)), available at UCI Machine Learning Repository. I don’t know about you, b...
[Read more...]

Forecasting El Niño-Southern Oscillation (ENSO)

February 1, 2021 | Sigrid Keydana

Today, we use the convLSTM introduced in a previous post to predict El Niño-Southern Oscillation (ENSO). El Niño, La Niña: ENSO refers to a changing pattern of sea surface temperatures and sea-level pressures occurring in the equatorial Pacific. Of its three overall states, probably the best-known is ...
[Read more...]

Convolutional LSTM for spatial forecasting

December 16, 2020 | Sigrid Keydana

This post is the first in a loose series exploring forecasting of spatially-determined data over time. By spatially-determined I mean that whatever the quantities we’re trying to predict – be they univariate or multivariate time series, of spatial d... [Read more...]

Brain image segmentation with torch

November 29, 2020 | Sigrid Keydana

When what is not enough: True, sometimes it’s vital to distinguish between different kinds of objects. Is that a car speeding towards me, in which case I’d better jump out of the way? Or is it a huge Doberman (in which case I’d probably do the same)?...
[Read more...]

torch for tabular data

November 1, 2020 | Sigrid Keydana

Machine learning on image-like data can be many things: fun (dogs vs. cats), societally useful (medical imaging), or societally harmful (surveillance). In comparison, tabular data – the bread and butter of data science – may seem more mundane. What’s more, if you’re particularly interested in deep learning (DL), and looking ...
[Read more...]

Classifying images with torch

October 17, 2020 | Sigrid Keydana

In recent posts, we’ve been exploring essential torch functionality: tensors, the sine qua non of every deep learning framework; autograd, torch’s implementation of reverse-mode automatic differentiation; modules, composable building blocks of neura...
[Read more...]

Optimizers in torch

October 7, 2020 | Sigrid Keydana

This is the fourth and last installment in a series introducing torch basics. Initially, we focused on tensors. To illustrate their power, we coded a complete (if toy-size) neural network from scratch. We didn’t make use of any of torch’s higher-lev... [Read more...]

Using torch modules

October 5, 2020 | Sigrid Keydana

Initially, we started learning about torch basics by coding a simple neural network from scratch, making use of just a single of torch’s features: tensors. Then, we immensely simplified the task, replacing manual backpropagation with autograd. Today... [Read more...]

Introducing torch autograd

October 3, 2020 | Sigrid Keydana

Last week, we saw how to code a simple network from scratch, using nothing but torch tensors. Predictions, loss, gradients, weight updates – all these things we’ve been computing ourselves. Today, we make a significant change: Namely, we spare ourse... [Read more...]

Getting familiar with torch tensors

September 30, 2020 | Sigrid Keydana

Two days ago, I introduced torch, an R package that provides the native functionality that is brought to Python users by PyTorch. In that post, I assumed basic familiarity with TensorFlow/Keras. Consequently, I portrayed torch in a way I figured wou... [Read more...]

FNN-VAE for noisy time series forecasting

July 29, 2020 | Sigrid Keydana

") training_loop_vae(ds_train) test_batch % iter_next() encoded % round(5)) } ``` Experimental setup and data The idea was to add white noise to a deterministic series. This time, the Roessler system was chosen, mainly for the prettiness of its attractor, apparent even in its two-dimensional projections: (#fig:unnamed-chunk-1)Roessler attractor, ...
[Read more...]

Time series prediction with FNN-LSTM

July 18, 2020 | Sigrid Keydana

") training_loop(ds_train) test_batch % iter_next() encoded % round(5)) }
On to what we'll use as a baseline for comparison.

#### Vanilla LSTM

Here is the vanilla LSTM, stacking two layers, each, again, of size 32. Dropout and recurrent dropout were chosen individually
per dataset, as was the learning rate.
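
The code block itself is not part of this excerpt. As a rough sketch, in Keras for R, a two-layer, size-32 LSTM could look as follows; the dropout rates, learning rate, output layer, and input shape shown here are illustrative placeholders, not the per-dataset values referred to above:

```
library(keras)

n_timesteps <- 60  # window length; see the timestep discussion below

# sketch of a stacked two-layer LSTM, each layer of size 32;
# dropout, recurrent dropout, and learning rate are placeholders to be tuned per dataset
vanilla_lstm <- keras_model_sequential() %>%
  layer_lstm(units = 32, dropout = 0.2, recurrent_dropout = 0.2,
             return_sequences = TRUE, input_shape = c(n_timesteps, 1)) %>%
  layer_lstm(units = 32, dropout = 0.2, recurrent_dropout = 0.2) %>%
  layer_dense(units = 1)

vanilla_lstm %>% compile(
  loss = "mse",
  optimizer = optimizer_adam(learning_rate = 1e-3)
)
```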

### Data preparation

For all experiments, data were prepared in the same way.

In every case, we used the first 10000 measurements available in the respective `.pkl` files provided by Gilpin in his GitHub
repository. To save on file size and not depend on an external data source, we extracted those first 10000 entries to `.csv`
files downloadable directly from this blog's repo.

Should you want to access the complete time series (of considerably greater lengths), just download them from Gilpin's repo
and load them using `reticulate`:
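
For instance, a sketch using reticulate's `py_load_object()` (the file name follows the README's description of the geyser dataset, quoted below):

```
library(reticulate)

# sketch: py_load_object() reads a Python pickle file into an R object
# (file name taken from the README's description of the geyser dataset)
geyser_full <- py_load_object("geyser_train_test.pkl")
str(geyser_full)
```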

Here is the data preparation code for the first dataset, `geyser` - all other datasets were treated the same way.
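
That code is not reproduced in this excerpt; in essence, the series is scaled and cut into overlapping windows of length `n_timesteps`. A sketch of such a preparation - the column name and the train/test split are assumptions made for illustration:

```
library(readr)

n_timesteps <- 60

# sketch: read the series, standardize it, and cut it into overlapping windows
# of length n_timesteps (column name `value` and the 95/5 split are assumptions)
geyser <- read_csv("geyser.csv")$value
geyser <- (geyser - mean(geyser)) / sd(geyser)

gen_timesteps <- function(x, n_timesteps) {
  t(sapply(seq_len(length(x) - n_timesteps + 1),
           function(i) x[i:(i + n_timesteps - 1)]))
}

windows <- gen_timesteps(geyser, n_timesteps)

n_train <- floor(0.95 * nrow(windows))
x_train <- windows[seq_len(n_train), , drop = FALSE]
x_test  <- windows[(n_train + 1):nrow(windows), , drop = FALSE]
```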

Now we're ready to look at how forecasting goes on our four datasets.

## Experiments

### Geyser dataset

People working with time series may have heard of Old Faithful, a geyser in Wyoming, US that has continually been erupting
every 44 minutes to two hours since the year 2004. For the subset of data Gilpin extracted[^3],

[^3]: see dataset descriptions in the repository's README

> `geyser_train_test.pkl` corresponds to detrended temperature readings from the main runoff pool of the Old Faithful geyser
> in Yellowstone National Park, downloaded from the GeyserTimes database. Temperature measurements
> start on April 13, 2015 and occur in one-minute increments.

Like we said above, `geyser.csv` is a subset of these measurements, comprising the first 10000 data points. To choose an
adequate timestep for the LSTMs, we inspect the series at various resolutions:

<div class="figure">
<img src="images/geyser_ts.png" alt="Geyer dataset. Top: First 1000 observations. Bottom: Zooming in on the first 200." width="600" />
<p class="caption">(\#fig:unnamed-chunk-5)Geyer dataset. Top: First 1000 observations. Bottom: Zooming in on the first 200.</p>

It seems like the behavior is periodic with a period of about 40-50; a timestep of 60 thus seemed like a good try.

Having trained both FNN-LSTM and the vanilla LSTM for 200 epochs, we first inspect the variances of the latent variables on
the test set. The value of `fnn_multiplier` corresponding to this run was `0.7`.
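
A sketch of that computation, assuming `encoded` holds the encoder's output on the test set as a matrix with one column per latent variable:

```
# sketch: per-unit variance of the latent code on the test set;
# `encoded` is assumed to be an n x 10 matrix produced by the trained encoder
round(apply(encoded, 2, var), 5)
```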

   V1     V2        V3          V4       V5       V6       V7       V8       V9      V10
0.258 0.0262 0.0000627 0.000000600 0.000533 0.000362 0.000238 0.000121 0.000518 0.000365
There is a drop in importance between the first two variables and the rest; however, unlike in the Lorenz system, V1 and V2 variances also differ by an order of magnitude. Now, it’s interesting to compare prediction errors ...
[Read more...]

Deep attractors: Where deep learning meets chaos

June 22, 2020 | Sigrid Keydana

") training_loop(ds_train) } ``` After two hundred epochs, overall loss is at 2.67, with the MSE component at 1.8 and FNN at 0.09. Obtaining the attractor from the test set We use the test set to inspect the latent code:
# A tibble: 6,242 x 10
      V1    V2         V3        V4        V5         V6        V7        V8       V9       V10
   <dbl> <dbl>      <dbl>     <dbl>     <dbl>      <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
 1 0.439 0.401 -0.000614  -0.0258   -0.00176  -0.0000276  0.000276  0.00677  -0.0239   0.00906 
 2 0.415 0.504  0.0000481 -0.0279   -0.00435  -0.0000970  0.000921  0.00509  -0.0214   0.00921 
 3 0.389 0.619  0.000848  -0.0240   -0.00661  -0.000171   0.00106   0.00454  -0.0150   0.00794 
 4 0.363 0.729  0.00137   -0.0143   -0.00652  -0.000244   0.000523  0.00450  -0.00594  0.00476 
 5 0.335 0.809  0.00128   -0.000450 -0.00338  -0.000307  -0.000561  0.00407   0.00394 -0.000127
 6 0.304 0.828  0.000631   0.0126    0.000889 -0.000351  -0.00167   0.00250   0.0115  -0.00487 
 7 0.274 0.769 -0.000202   0.0195    0.00403  -0.000367  -0.00220  -0.000308  0.0145  -0.00726 
 8 0.246 0.657 -0.000865   0.0196    0.00558  -0.000359  -0.00208  -0.00376   0.0134  -0.00709 
 9 0.224 0.535 -0.00121    0.0162    0.00608  -0.000335  -0.00169  -0.00697   0.0106  -0.00576 
10 0.211 0.434 -0.00129    0.0129    0.00606  -0.000306  -0.00134  -0.00927   0.00820 -0.00447 
# … with 6,232 more rows
As a result of the FNN regularizer, the latent code units should ...
[Read more...]
