# Usage Examples

## Generating Synthetic Time Series (KernelSynth)

- Install this package with the `training` extra:
  ```
  pip install "chronos[training] @ git+https://github.com/amazon-science/chronos-forecasting.git"
  ```
- Run `kernel-synth.py`:
  ```sh
  # With defaults used in the paper (1M time series and 5 max_kernels)
  python kernel-synth.py

  # You may optionally specify num-series and max-kernels
  python kernel-synth.py \
      --num-series <num of series to generate> \
      --max-kernels <max number of kernels to use per series>
  ```
  The generated time series will be saved in a [GluonTS](https://github.com/awslabs/gluonts)-compatible arrow file `kernelsynth-data.arrow`.
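
  You can sanity-check the generated file before training. Below is a minimal sketch that reads the file back with `pyarrow`, assuming the output is in Arrow IPC file format (which GluonTS's `ArrowWriter` produces; if your file was written as a stream, use `pa.ipc.open_stream` instead):
  ```py
  import pyarrow as pa

  # Memory-map the file and read all record batches into a table
  with pa.memory_map("kernelsynth-data.arrow") as source:
      table = pa.ipc.open_file(source).read_all()

  print(table.schema)    # expect "start" and "target" columns
  print(table.num_rows)  # number of generated series
  ```
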
## Pretraining (and fine-tuning) Chronos models
- Install this package with the `training` extra:
  ```
  pip install "chronos[training] @ git+https://github.com/amazon-science/chronos-forecasting.git"
  ```
- Convert your time series dataset into a GluonTS-compatible file dataset. We recommend using the arrow format. You may use the `convert_to_arrow` function from the following snippet for that. Optionally, you may use [synthetic data from KernelSynth](#generating-synthetic-time-series-kernelsynth) to follow along.
  ```py
  from pathlib import Path
  from typing import List, Optional, Union

  import numpy as np
  from gluonts.dataset.arrow import ArrowWriter


  def convert_to_arrow(
      path: Union[str, Path],
      time_series: Union[List[np.ndarray], np.ndarray],
      start_times: Optional[Union[List[np.datetime64], np.ndarray]] = None,
      compression: str = "lz4",
  ):
      if start_times is None:
          # Set an arbitrary start time
          start_times = [np.datetime64("2000-01-01 00:00", "s")] * len(time_series)

      assert len(time_series) == len(start_times)

      dataset = [
          {"start": start, "target": ts} for ts, start in zip(time_series, start_times)
      ]
      ArrowWriter(compression=compression).write_to_file(
          dataset,
          path=path,
      )


  if __name__ == "__main__":
      # Generate 20 random time series of length 1024
      time_series = [np.random.randn(1024) for _ in range(20)]

      # Convert to GluonTS arrow format
      convert_to_arrow("./noise-data.arrow", time_series=time_series)
  ```
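  The function also accepts explicit start timestamps. A small usage sketch with hypothetical series and start times:
  ```py
  import numpy as np

  # Two series of different lengths, each with its own start time
  series = [np.random.randn(512), np.random.randn(256)]
  starts = [
      np.datetime64("2020-01-01 00:00", "s"),
      np.datetime64("2021-06-15 12:00", "s"),
  ]
  convert_to_arrow("./my-data.arrow", time_series=series, start_times=starts)
  ```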
- Modify the [training configs](training/configs) to use your data. Let's use the KernelSynth data as an example.
  ```yaml
  # List of training data files
  training_data_paths:
  - "/path/to/kernelsynth-data.arrow"
  # Mixing probability of each dataset file
  probability:
  - 1.0
  ```
  You may optionally change other parameters of the config file, as required. For instance, if you're interested in fine-tuning the model from a pretrained Chronos checkpoint, you should change the `model_id`, set `random_init: false`, and (optionally) change other parameters such as `max_steps` and `learning_rate`.
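  For example, a fine-tuning setup might include entries like the following (the key names are those mentioned above; the values are purely illustrative, not recommendations):
  ```yaml
  # Fine-tune from a pretrained checkpoint instead of training from scratch
  model_id: amazon/chronos-t5-small
  random_init: false
  # Illustrative values; tune for your dataset
  max_steps: 1000
  learning_rate: 0.00001
  ```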
- Start the training (or fine-tuning) job:
  ```sh
  # On a single GPU
  CUDA_VISIBLE_DEVICES=0 python training/train.py --config /path/to/modified/config.yaml

  # On multiple GPUs (example with 8 GPUs)
  torchrun --nproc-per-node=8 training/train.py --config /path/to/modified/config.yaml

  # Fine-tune `amazon/chronos-t5-small` for 1000 steps
  CUDA_VISIBLE_DEVICES=0 python training/train.py --config /path/to/modified/config.yaml \
      --model-id amazon/chronos-t5-small \
      --no-random-init \
      --max-steps 1000
  ```
  The output and checkpoints will be saved in `output/run_{id}/`.
> [!TIP]
> If the initial training step is too slow, you might want to change the `shuffle_buffer_length` and/or set `torch_compile` to `false`.
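
Once training finishes, you can load a checkpoint for inference with `ChronosPipeline` from this package. A minimal sketch; the checkpoint directory name below is hypothetical, so point it at the actual directory inside your `output/run_{id}/`:
```py
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "output/run_0/checkpoint-final",  # hypothetical path; use your run's checkpoint dir
    device_map="cuda",                # or "cpu"
    torch_dtype=torch.bfloat16,
)

# Forecast the next 12 steps given a context window (random data for illustration)
context = torch.randn(512)
forecast = pipeline.predict(context, prediction_length=12)
print(forecast.shape)  # [num_series, num_samples, prediction_length]
```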