This repository contains the code to reproduce the experiments presented in the paper "On the Regularization of Learnable Embeddings for Time Series Forecasting" (TMLR).
Within the context of deep learning models for forecasting multiple correlated time series, we study the impact of regularization techniques applied to the learning of time-series-specific (local) embeddings, used as part of a larger shared architecture.
Authors: Luca Butera, Giovanni De Felice, Andrea Cini, Cesare Alippi
In forecasting multiple time series, accounting for the individual features of each sequence can be challenging. To address this, modern deep learning methods for time series analysis combine a shared (global) model with local layers, specific to each time series, often implemented as learnable embeddings. Ideally, these local embeddings should encode meaningful representations of the unique dynamics of each sequence. However, when these are learned end-to-end as parameters of a forecasting model, they may end up acting as mere sequence identifiers. Shared processing blocks may then become reliant on such identifiers, limiting their transferability to new contexts. In this paper, we address this issue by investigating methods to regularize the learning of local learnable embeddings for time series processing. Specifically, we perform the first extensive empirical study on the subject and show how such regularizations consistently improve performance in widely adopted architectures. Furthermore, we show that methods attempting to prevent the co-adaptation of local and global parameters by means of embedding perturbation are particularly effective in this context.
The picture provides an overview of the considered framework. Our model's input is a window of past observations from a set of correlated time series; its outputs are the corresponding forecasts for each series. The shared model consists of an encoder, a stack of processing layers (e.g., RNNs, GNNs, attention), and a decoder. Learnable vectors of parameters (embeddings) are associated with each time series and combined in the neural architecture as shown. All parameters are learned end-to-end. We show that regularizing the learning of these embeddings, in particular by means of perturbation, leads to performance improvements and more transferable shared processing blocks.
The following steps require the conda package manager.
All commands need to be executed from the directory containing this file.
Install the environment by running:
conda env create -f environment.yml
or
conda env create -f environment_gpu.yml
depending on whether a GPU is available on your machine.
Then activate it by running:
conda activate embreg
To reproduce the results in Table 1, run:
python main.py -cn=grid_search
Note that the search config file must be modified to produce the right combinations of models and regularizations. In particular, note the following parameters inside "conf/grid_search.yaml":
hydra:
  mode: MULTIRUN
  sweeper:
    params:
      seed: 1,2,3,4,5
      regularization: none,l2,dropout
      data: metrla,pemsbay,airquality
      model: time_only,message_passing,transformer
      embedding: none,standard,periodic,variational,clusterized
      model.hidden_size: 32,64,128,256
      engine.optim_kwargs.lr: 0.00025,0.00075,0.0015,0.003
Valid combinations of embedding and regularization are (a sketch of how to sweep over them follows the list):
- none, none
- standard, none
- standard, l2
- standard, dropout
- periodic, none
- variational, none
- clusterized, none
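Since Hydra's grid sweep takes the Cartesian product of the listed values, one possible way to cover only the valid pairs is to run the search twice with restricted values. A minimal sketch of the relevant sweeper parameters, keeping everything else in "conf/grid_search.yaml" unchanged:

# First run: standard embeddings, swept over all regularizations.
hydra:
  sweeper:
    params:
      embedding: standard
      regularization: none,l2,dropout

# Second run: the remaining embeddings, paired only with no regularization.
hydra:
  sweeper:
    params:
      embedding: none,periodic,variational,clusterized
      regularization: none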
Once the optimal hyperparameters are found, models can be trained on multiple seeds by setting hidden_size and lr to single values, as in the sketch below.
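For instance, a final multi-seed run could use sweeper parameters like the following (the values shown are placeholders; use the model, dataset, and hyperparameters selected by the search):

hydra:
  sweeper:
    params:
      seed: 1,2,3,4,5
      regularization: dropout
      data: metrla
      model: transformer
      embedding: standard
      model.hidden_size: 64
      engine.optim_kwargs.lr: 0.00075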
Note that the code might hang for a while if the number of combinations in the config is too high; in that case, reduce it and run multiple configs.
Moreover, wandb is used by default for logging; to disable it, set "override trainer/logger: dummy" in the config files, as sketched below.
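In practice, this amounts to changing the logger entry in the defaults list of the config, e.g.:

defaults:
  # other entries unchanged
  - override trainer/logger: dummy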
To reproduce the results in Table 3, run:
python transfer.py -m -cn=transfer seed=1,2,3,4,5 dataset.target=pems3,pems4,pems7,pems8 regularization=none,l1,l2,dropout
and
python transfer.py -m -cn=transfer_periodic seed=1,2,3,4,5 dataset.target=pems3,pems4,pems7,pems8
Note that you need a wandb account for logging, or you can disable logging by setting "log_run: null" in the config files "conf/transfer.yaml" and "conf/transfer_periodic.yaml".
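That is, the two config files would contain (the exact position of the key within the actual files may differ):

log_run: null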
If you are running the code on a Mac with an Apple Silicon chip (M1, M2, ...), add "trainer.accelerator=cpu" to the previous commands, as some operations are not compatible with Apple Metal hardware acceleration.
To reproduce the other experiments, run "main.py" with the following YAML config:
defaults:
  - experiment
  - override trainer/logger: wandb
  - override engine: predictor
  - override data: metrla
  - override embedding: standard
  - override regularization: none
  - _self_

hydra:
  mode: MULTIRUN
  sweeper:
    params:
      seed: 1,2,3,4,5
      data: metrla,pemsbay
      embedding: standard,periodic,variational,clusterized
      regularization: none,l2,dropout
      model: time_only,message_passing,transformer
      model.hidden_size: 64
      engine.optim_kwargs.lr: 0.00075
      embedding.module.emb_size: 8,32,128
      early_stopping.patience: 100
      trainer.max_epochs: 300
Note that one or more of the parameters in the list can be set at once, depending on the number of runs to execute sequentially. Moreover, the "embedding" and "regularization" values must form valid pairs, as detailed previously. Finally, to regularize the entire model, add "hydra:sweeper:params:regularize_whole_model: true" to the config file, as sketched below. This is compatible only with regularization set to "dropout" or "l2".
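The whole-model regularization flag goes under the sweeper parameters, i.e.:

hydra:
  sweeper:
    params:
      regularize_whole_model: true  # only with regularization set to l2 or dropout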
Some datasets are available only upon direct request to the appropriate owners. In particular, refer to:
- This paper's authors for the CLM-D dataset.
- This paper's authors for the CER-E dataset, after requesting data access permission here.
- The authors of the original paper for the EngRAD dataset.
To remove the downloaded datasets, delete the folder printed by the following Python script:
from pathlib import Path
from os import environ as env
print(Path(env.get("XDG_CACHE_HOME", "~/.cache")).expanduser())
Double-check the folder does not contain any other important data before deleting it. We are not responsible for any data loss.
To remove the created python environment run:
conda env remove -n embreg
If you find this code useful, please consider citing our paper:
@article{butera2025regularization,
  title={On the Regularization of Learnable Embeddings for Time Series Forecasting},
  author={Luca Butera and Giovanni De Felice and Andrea Cini and Cesare Alippi},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=F5ALCh3GWG},
}