
Commit b4e8085

lostella and abdulfatir authored
Add training script (#63)
*Description of changes:* Add training script and config files. These can be used for pretraining, or adapted for fine-tuning Chronos models.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

---------

Co-authored-by: Abdul Fatir <[email protected]>
1 parent 6ae390f commit b4e8085

File tree

9 files changed: +761 -9 lines changed


README.md

Lines changed: 4 additions & 0 deletions
@@ -10,6 +10,7 @@
 
 ## 🚀 News
 
+- **10 May 2024**: 🚀 We added the code for pretraining and fine-tuning Chronos models. You can find it in [this folder](./scripts/training).
 - **19 Apr 2024**: 🚀 Chronos is now supported on [AutoGluon-TimeSeries](https://auto.gluon.ai/stable/tutorials/timeseries/index.html), the powerful AutoML package for time series forecasting which enables model ensembles, cloud deployments, and much more. Get started with the [tutorial](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-chronos.html).
 - **08 Apr 2024**: 🧪 Experimental [MLX inference support](https://github.com/amazon-science/chronos-forecasting/tree/mlx) added. If you have an Apple Silicon Mac, you can now obtain significantly faster forecasts from Chronos compared to CPU inference. This provides an alternative way to exploit the GPU on your Apple Silicon Macs together with the "mps" support in PyTorch.
 - **25 Mar 2024**: [v1.1.0 released](https://github.com/amazon-science/chronos-forecasting/releases/tag/v1.1.0) with inference optimizations and `pipeline.embed` to extract encoder embeddings from Chronos.

@@ -139,6 +140,9 @@ context = torch.tensor(df["#Passengers"])
 embeddings, tokenizer_state = pipeline.embed(context)
 ```
 
+### Pretraining and fine-tuning
+
+Scripts for pretraining and fine-tuning Chronos models can be found in [this folder](./scripts/training).
 
 ## 🔥 Coverage
 
pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ dependencies = [
 [project.optional-dependencies]
 test = ["pytest~=8.0", "numpy~=1.21"]
 typecheck = ["mypy~=1.9"]
+training = ["gluonts[pro]", "numpy", "tensorboard", "typer", "typer-config"]
 
 [tool.mypy]
 ignore_missing_imports = true
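
The new `training` extra adds gluonts[pro], numpy, tensorboard, typer, and typer-config on top of the base dependencies. As a rough sanity check, and assuming the package was installed from source with that extra enabled (for example an editable install with `[training]`), the pulled-in packages should import cleanly; a minimal sketch:

```python
# Sketch: verify that the packages brought in by the new `training` extra import cleanly.
# Assumes the repo was installed from source with the extra enabled, e.g. pip install -e ".[training]".
import importlib

for name in ["gluonts", "numpy", "tensorboard", "typer", "typer_config"]:
    importlib.import_module(name)  # raises ImportError if the extra was not installed
    print(f"{name}: OK")
```
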
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+training_data_paths:
+- "/home/ubuntu/tsmixup-data.arrow"
+- "/home/ubuntu/kernelsynth-data.arrow"
+probability:
+- 0.9
+- 0.1
+context_length: 512
+prediction_length: 64
+min_past: 60
+max_steps: 200_000
+save_steps: 100_000
+log_steps: 500
+per_device_train_batch_size: 32
+learning_rate: 0.001
+optim: adamw_torch_fused
+num_samples: 20
+shuffle_buffer_length: 100_000
+gradient_accumulation_steps: 1
+model_id: google/t5-efficient-base
+model_type: seq2seq
+random_init: true
+tie_embeddings: true
+output_dir: ./output/
+tf32: true
+torch_compile: true
+tokenizer_class: "MeanScaleUniformBins"
+tokenizer_kwargs:
+  low_limit: -15.0
+  high_limit: 15.0
+n_tokens: 4096
+lr_scheduler_type: linear
+warmup_ratio: 0.0
+dataloader_num_workers: 1
+max_missing_prop: 0.9
+use_eos_token: true
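
The Trainer-related keys in this config map naturally onto Hugging Face `TrainingArguments`. The sketch below is only an assumption about how the (not shown) training script might consume them, and the filename `chronos-t5-base.yaml` is hypothetical; the field names themselves come straight from the YAML above:

```python
# Sketch: load the base config above and build Hugging Face TrainingArguments from it.
# How the actual training script consumes these keys is an assumption; the filename is hypothetical.
import yaml
from transformers import TrainingArguments

with open("chronos-t5-base.yaml") as f:  # hypothetical path to the config shown above
    config = yaml.safe_load(f)

training_args = TrainingArguments(
    output_dir=config["output_dir"],
    max_steps=config["max_steps"],
    save_steps=config["save_steps"],
    logging_steps=config["log_steps"],
    per_device_train_batch_size=config["per_device_train_batch_size"],
    learning_rate=config["learning_rate"],
    optim=config["optim"],
    gradient_accumulation_steps=config["gradient_accumulation_steps"],
    lr_scheduler_type=config["lr_scheduler_type"],
    warmup_ratio=config["warmup_ratio"],
    dataloader_num_workers=config["dataloader_num_workers"],
    tf32=config["tf32"],  # tf32 requires an Ampere-or-newer GPU at runtime
    torch_compile=config["torch_compile"],
)
print(training_args.max_steps, training_args.learning_rate)
```
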
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+training_data_paths:
+- "/home/ubuntu/tsmixup-data.arrow"
+- "/home/ubuntu/kernelsynth-data.arrow"
+probability:
+- 0.9
+- 0.1
+context_length: 512
+prediction_length: 64
+min_past: 60
+max_steps: 200_000
+save_steps: 100_000
+log_steps: 500
+per_device_train_batch_size: 8
+learning_rate: 0.001
+optim: adamw_torch_fused
+num_samples: 20
+shuffle_buffer_length: 100_000
+gradient_accumulation_steps: 4
+model_id: google/t5-efficient-large
+model_type: seq2seq
+random_init: true
+tie_embeddings: true
+output_dir: ./output/
+tf32: true
+torch_compile: true
+tokenizer_class: "MeanScaleUniformBins"
+tokenizer_kwargs:
+  low_limit: -15.0
+  high_limit: 15.0
+n_tokens: 4096
+lr_scheduler_type: linear
+warmup_ratio: 0.0
+dataloader_num_workers: 1
+max_missing_prop: 0.9
+use_eos_token: true
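
Compared with the base config, this large-model config lowers `per_device_train_batch_size` from 32 to 8 and raises `gradient_accumulation_steps` from 1 to 4, presumably to fit the larger model in memory. A quick check (single device assumed; with multiple GPUs both numbers scale equally) shows the effective batch per optimizer step is unchanged:

```python
# Sketch: effective batch size per optimizer step for the base vs. large configs above
# (single-device view; with N GPUs both values scale by N).
base_effective = 32 * 1  # per_device_train_batch_size * gradient_accumulation_steps (base config)
large_effective = 8 * 4  # per_device_train_batch_size * gradient_accumulation_steps (large config)
assert base_effective == large_effective == 32
print("effective batch per optimizer step:", large_effective)
```
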
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+training_data_paths:
+- "/home/ubuntu/tsmixup-data.arrow"
+- "/home/ubuntu/kernelsynth-data.arrow"
+probability:
+- 0.9
+- 0.1
+context_length: 512
+prediction_length: 64
+min_past: 60
+max_steps: 200_000
+save_steps: 100_000
+log_steps: 500
+per_device_train_batch_size: 32
+learning_rate: 0.001
+optim: adamw_torch_fused
+num_samples: 20
+shuffle_buffer_length: 100_000
+gradient_accumulation_steps: 1
+model_id: google/t5-efficient-mini
+model_type: seq2seq
+random_init: true
+tie_embeddings: true
+output_dir: ./output/
+tf32: true
+torch_compile: true
+tokenizer_class: "MeanScaleUniformBins"
+tokenizer_kwargs:
+  low_limit: -15.0
+  high_limit: 15.0
+n_tokens: 4096
+lr_scheduler_type: linear
+warmup_ratio: 0.0
+dataloader_num_workers: 1
+max_missing_prop: 0.9
+use_eos_token: true
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+training_data_paths:
+- "/home/ubuntu/tsmixup-data.arrow"
+- "/home/ubuntu/kernelsynth-data.arrow"
+probability:
+- 0.9
+- 0.1
+context_length: 512
+prediction_length: 64
+min_past: 60
+max_steps: 200_000
+save_steps: 100_000
+log_steps: 500
+per_device_train_batch_size: 32
+learning_rate: 0.001
+optim: adamw_torch_fused
+num_samples: 20
+shuffle_buffer_length: 100_000
+gradient_accumulation_steps: 1
+model_id: google/t5-efficient-small
+model_type: seq2seq
+random_init: true
+tie_embeddings: true
+output_dir: ./output/
+tf32: true
+torch_compile: true
+tokenizer_class: "MeanScaleUniformBins"
+tokenizer_kwargs:
+  low_limit: -15.0
+  high_limit: 15.0
+n_tokens: 4096
+lr_scheduler_type: linear
+warmup_ratio: 0.0
+dataloader_num_workers: 1
+max_missing_prop: 0.9
+use_eos_token: true
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+training_data_paths:
+- "/home/ubuntu/tsmixup-data.arrow"
+- "/home/ubuntu/kernelsynth-data.arrow"
+probability:
+- 0.9
+- 0.1
+context_length: 512
+prediction_length: 64
+min_past: 60
+max_steps: 200_000
+save_steps: 100_000
+log_steps: 500
+per_device_train_batch_size: 32
+learning_rate: 0.001
+optim: adamw_torch_fused
+num_samples: 20
+shuffle_buffer_length: 100_000
+gradient_accumulation_steps: 1
+model_id: google/t5-efficient-tiny
+model_type: seq2seq
+random_init: true
+tie_embeddings: true
+output_dir: ./output/
+tf32: true
+torch_compile: true
+tokenizer_class: "MeanScaleUniformBins"
+tokenizer_kwargs:
+  low_limit: -15.0
+  high_limit: 15.0
+n_tokens: 4096
+lr_scheduler_type: linear
+warmup_ratio: 0.0
+dataloader_num_workers: 1
+max_missing_prop: 0.9
+use_eos_token: true
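
The PR description notes that the configs can also be adapted for fine-tuning. One hedged way to do that, sketched below: start from the tiny config above, point `model_id` at an existing Chronos checkpoint instead of a raw T5 architecture, and turn off random initialization. The filenames, the checkpoint id, and the chosen overrides are illustrative assumptions, not values taken from this commit:

```python
# Sketch: derive a fine-tuning config from the pretraining config above.
# Assumptions: random_init: false makes the script load pretrained weights, and
# "amazon/chronos-t5-tiny" is used only as an illustrative checkpoint id; paths are hypothetical.
import yaml

with open("chronos-t5-tiny.yaml") as f:  # hypothetical path to the tiny config shown above
    config = yaml.safe_load(f)

config.update(
    {
        "model_id": "amazon/chronos-t5-tiny",  # start from an existing Chronos checkpoint
        "random_init": False,                  # keep the pretrained weights
        "learning_rate": 1e-4,                 # a smaller LR is a common fine-tuning choice
        "max_steps": 2_000,                    # far fewer steps than the 200_000 used for pretraining
        "training_data_paths": ["/home/ubuntu/my-finetune-data.arrow"],  # hypothetical dataset
        "probability": [1.0],
    }
)

with open("chronos-t5-tiny-finetune.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```
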
