Study the diffusability of latent spaces, starting from synthetic datasets with controllable geometry, then moving toward real audio and vision datasets.
This repo is Hydra-driven and uses Astral uv for environment and dependency management. Use `uv add` for dependencies. For config-driven construction, use `hydra.utils.instantiate`; do not wire instantiation manually with OmegaConf.
Sync the environment with:

```
uv sync
```

Main references:

- https://docs.astral.sh/uv/
- https://hydra.cc/docs/intro/
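As an illustration of the instantiate rule above, here is a minimal sketch; the `_target_` path is an arbitrary torch class chosen for the example, not a repo component.

```
# Minimal sketch: a config node with a _target_ key is built by
# hydra.utils.instantiate, so no manual OmegaConf wiring is needed.
from hydra.utils import instantiate
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {"model": {"_target_": "torch.nn.Linear", "in_features": 8, "out_features": 8}}
)
model = instantiate(cfg.model)  # equivalent to torch.nn.Linear(8, 8)
```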
Key files:

```
conf/config.yaml                           # main experiment entrypoint
conf/data/*.yaml                           # dataset / datamodule configs
conf/model/*.yaml                          # model configs
conf/trainer/*.yaml                        # trainer configs
SiT/train.py                               # Hydra entrypoint + Lightning Trainer wiring
SiT/lightning_module.py                    # training / validation / test module
SiT/eval_runner.py                         # evaluation sampling + metric orchestration
datamodules/synthetic_pointclouds.py       # synthetic vector dataset + datamodule
utils/plot_distribution.py                 # synthetic dataset visualization
utils/validation_distribution_plots.py     # validation-time GT vs generated comparison plots
utils/plot_anisotropy_intrinsic_sweep.py   # aggregate sweep plots from saved runs
utils/evaluate_checkpoint_metrics.py       # evaluate checkpoints every N epochs and write test_loss_by_class.json
program.md                                 # active execution tracker for the current task
docs/synthetic_pointcloud_dataset.md
docs/synthetic_pointcloud_math_foundations.md
```
conf/config.yaml is the main training config. Its defaults are:
```
data: synth_pc_datamodule
model: mini_mlp
trainer: sit_trainer
```
The current synthetic setup is a single-class affine-subspace dataset with:
- one sample = one vector in `R^D`
- `ambient_dim` defined once in `conf/config.yaml`
- `intrinsic_dim` defined once in `conf/config.yaml`
- `anisotropy_max_scale` defined once in `conf/config.yaml`
Those shared parameters are propagated via Hydra interpolations (see the sketch after this list):
- `ambient_dim` -> `data.in_channels`
- `ambient_dim` -> `model.in_channels`
- `ambient_dim` -> `class_sweeps[*].base.D`
- `intrinsic_dim` -> `class_sweeps[*].base.d`
- `anisotropy_max_scale` -> `class_sweeps[*].sweep.anisotropy.max_scale`
- `data_thickness` -> `class_sweeps[*].base.thickness`
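A minimal OmegaConf sketch of how these interpolations resolve (the values and the truncated config tree are illustrative, not the real config):

```
# Illustrative only: top-level values are referenced via ${...} interpolations,
# so each shared parameter is defined once and resolved lazily on access.
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "ambient_dim": 8,
        "intrinsic_dim": 6,
        "data": {"in_channels": "${ambient_dim}"},
        "model": {"in_channels": "${ambient_dim}"},
    }
)
assert cfg.data.in_channels == 8 and cfg.model.in_channels == 8
```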
conf/data/synth_pc.yaml uses class_sweeps with a single base class and a sweepable anisotropy value. In practice, Hydra multirun produces one training job per anisotropy setting.
datamodules/synthetic_pointclouds.py now works in the point-wise setting:
- each sample is a single vector `[D]`, not a cloud `[N, D]`
- class geometry is sampled once per class
- per-sample randomness only resamples latent coordinates, component choice, and additive noise
Supported geometric families (see the sketch after this list):
- `affine_subspace`
- `sine_warp_subspace`
- `mog`
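A schematic numpy sketch of the affine-subspace family under these conventions; this is not the repo's actual sampler, and the scale schedule and noise model here are assumptions:

```
# Schematic only: fix class geometry once (basis U, anisotropic scales s,
# offset b), then draw per-sample latent coords z and additive ambient noise.
import numpy as np

rng = np.random.default_rng(0)
D, d, max_scale, thickness = 8, 6, 4.0, 0.01

# class geometry, sampled once per class
U, _ = np.linalg.qr(rng.standard_normal((D, d)))   # orthonormal basis of the subspace
s = np.geomspace(1.0, max_scale, d)                # assumed anisotropic scale schedule
b = rng.standard_normal(D)                         # affine offset

# per-sample randomness: latent coordinates + additive noise
z = rng.standard_normal(d) * s
x = U @ z + b + thickness * rng.standard_normal(D) # one vector in R^D
```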
The default datamodule computes:
- SWD
- Exact-W2
- Energy-U
- Feature-MMD
These metrics are computed directly on vector samples `[N, D]`; the legacy cloud-level metric path has been removed.
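For orientation, a minimal torch sketch of the sliced Wasserstein distance on two equal-sized batches; this is illustrative only, and the repo's implementations of these metrics may differ in detail:

```
# Illustrative SWD: project both batches onto random unit directions,
# sort along each slice, and average the 1-D squared transport cost.
import torch

def sliced_w2(x: torch.Tensor, y: torch.Tensor, n_proj: int = 128) -> torch.Tensor:
    # x, y: [N, D] with the same N
    proj = torch.randn(x.shape[1], n_proj, device=x.device)
    proj = proj / proj.norm(dim=0, keepdim=True)   # unit projection directions
    px = (x @ proj).sort(dim=0).values
    py = (y @ proj).sort(dim=0).values
    return ((px - py) ** 2).mean().sqrt()
```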
Per-run artifacts are written under results/<experiment>/metrics/:
- `class_registry.json`
- `val_loss_by_class.jsonl`
- `test_loss_by_class.json`
Validation-time distribution plots are written under results/<experiment>/plots/val/:
- `distribution_comparison_epochXXX_stepXXXXXXX.png`
- `manifest.jsonl`
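Since `manifest.jsonl` is line-delimited JSON, downstream tooling can load it directly; the record schema itself is whatever the plotting utility writes and is not assumed here:

```
# One JSON record per line; substitute the actual experiment directory.
import json
from pathlib import Path

manifest = Path("results/<experiment>/plots/val/manifest.jsonl")
records = [json.loads(line) for line in manifest.read_text().splitlines() if line.strip()]
```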
Each validation pass logs the same ground truth vs generated comparison figure to W&B.
These figures use the same seaborn-style visual language already used by the repo plotting utilities, with a shared blue density palette and a shared density scale between the left and right panels for each class.
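A hedged sketch of that per-validation logging step; the metric key and figure layout below are assumptions, not the repo's exact names:

```
# Requires an active wandb run (wandb.init), as set up by
# conf/trainer/sit_trainer.yaml; the key name here is assumed.
import matplotlib.pyplot as plt
import wandb

fig, (ax_gt, ax_gen) = plt.subplots(1, 2, figsize=(9, 4), sharex=True, sharey=True)
# ... draw the GT density on ax_gt and the generated density on ax_gen ...
wandb.log({"val/distribution_comparison": wandb.Image(fig)})
plt.close(fig)
```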
Single run with the current defaults:
```
uv run python SiT/train.py
```

Single run with explicit synthetic controls:

```
uv run python SiT/train.py \
ambient_dim=8 \
intrinsic_dim=6 \
anisotropy_max_scale=4.0 \
trainer.strategy=auto
```

Ambient-8 anisotropy sweep over 5 levels:

```
CUDA_VISIBLE_DEVICES=0 uv run python SiT/train.py -m \
ambient_dim=8 \
intrinsic_dim=6 \
anisotropy_max_scale=1.0,2.0,4.0,8.0,16.0 \
trainer.results_dir=results/anisotropy_sweep_ambient8_d6 \
trainer.strategy=auto
```

Ambient-16 anisotropy sweep over the same 5 levels:

```
CUDA_VISIBLE_DEVICES=0 uv run python SiT/train.py -m \
ambient_dim=16 \
intrinsic_dim=6 \
anisotropy_max_scale=1.0,2.0,4.0,8.0,16.0 \
trainer.results_dir=results/anisotropy_sweep_ambient16_d6 \
trainer.strategy=auto
```

Notes:
- `trainer.strategy=auto` is the safest override for single-GPU runs.
- `model.num_classes` is resolved automatically at runtime from the instantiated datamodule.
- W&B is enabled by default through `conf/trainer/sit_trainer.yaml`.
- `trainer.run_name` controls the human-readable run label used for local experiment naming.
- `trainer.wandb_run_name` can override the W&B display name when needed.
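A hedged sketch of what that `num_classes` resolution amounts to; the attribute and key names are assumptions:

```
# Assumed names: the datamodule is instantiated first, then the model
# config inherits its class count before the model is built.
from hydra.utils import instantiate

def build_modules(cfg):
    datamodule = instantiate(cfg.data)
    cfg.model.num_classes = datamodule.num_classes  # hypothetical attribute
    return datamodule, instantiate(cfg.model)
```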
Evaluate checkpoints every 5 epochs (defaults: W2 at 2048 samples/class, SWD/MMD/L2 at 10000):
```
uv run python utils/evaluate_checkpoint_metrics.py

# Example override:
uv run python utils/evaluate_checkpoint_metrics.py \
roots='[results/gaussian_anisotropy_sweep]' \
epoch_stride=10
```
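A hedged sketch of the epoch-stride selection idea; the checkpoint naming convention below is an assumption, and the actual script may work differently:

```
# Assumed "epoch=NNN"-style checkpoint names; keep every epoch_stride-th epoch.
import re
from pathlib import Path

def select_checkpoints(root: Path, epoch_stride: int = 5) -> list[Path]:
    by_epoch = {}
    for ckpt in root.rglob("*.ckpt"):
        match = re.search(r"epoch[=_-]?(\d+)", ckpt.name)
        if match and int(match.group(1)) % epoch_stride == 0:
            by_epoch[int(match.group(1))] = ckpt
    return [by_epoch[epoch] for epoch in sorted(by_epoch)]
```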
Visualize the current synthetic dataset:

```
uv run python utils/plot_distribution.py
```

Regenerate the documentation figures:

```
uv run python utils/plot_distribution.py --config-name plot_dataset_docs
uv run python utils/plot_distribution.py --config-name plot_dataset_docs_anis
```

Aggregate anisotropy sweep results:

```
uv run python utils/plot_anisotropy_intrinsic_sweep.py \
results_root=results/anisotropy_sweep_ambient8_d6
uv run python utils/plot_anisotropy_intrinsic_sweep.py \
results_root=results/anisotropy_sweep_ambient16_d6
```

The sweep plotting utility reads local training artifacts from results/... and matches them with local W&B logs under wandb/. It now summarizes and plots both `val/feature_mmd_mean` and `val/swd_mean` when available.
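A hedged sketch of the artifact-matching half, assuming one run subdirectory per sweep point with the per-run metrics layout documented earlier; the W&B-log matching is omitted:

```
# Assumes <results_root>/<run>/metrics/test_loss_by_class.json per run.
import json
from pathlib import Path

def collect_run_metrics(results_root: str) -> dict[str, dict]:
    metrics_by_run = {}
    for f in Path(results_root).glob("*/metrics/test_loss_by_class.json"):
        metrics_by_run[f.parent.parent.name] = json.loads(f.read_text())
    return metrics_by_run
```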
Sampling is configured in conf/model/*.yaml:
```
sampling:
  mode: ODE              # or SDE
  ode:
    method: dopri5       # dopri5, euler, heun
    num_steps: 50
    atol: 1.0e-6
    rtol: 1.0e-3
  sde:
    method: Euler        # Euler, Heun
    num_steps: 250
    diffusion_form: SBDM
    diffusion_norm: 1.0
    last_step: Mean
    last_step_size: 0.04
```

When repo behavior changes, update this README so commands, config names, and outputs remain accurate.
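Finally, for orientation: a minimal sketch of what ODE sampling with method: euler amounts to, assuming a learned velocity field integrated from noise at t=0 to data at t=1; the repo's actual samplers (e.g. adaptive dopri5) are more involved:

```
# Hedged sketch, not the repo's sampler: plain Euler integration of
# dx/dt = velocity(x, t) over num_steps uniform steps.
import torch

@torch.no_grad()
def sample_ode_euler(velocity, x0: torch.Tensor, num_steps: int = 50) -> torch.Tensor:
    x, dt = x0, 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        x = x + dt * velocity(x, t)  # one Euler step along the learned field
    return x
```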