Cheng Zhang, Hanwen Liang, Donny Y. Chen, Qianyi Wu, Konstantinos N. Plataniotis, Camilo Cruz Gambardella, Jianfei Cai
Project Page | Paper | Video | Data
PanFlow is a framework for controllable 360° panoramic video generation that decouples motion input into two interpretable components: rotation flow and derotated flow.
By conditioning diffusion on spherical-warped motion noise, PanFlow enables precise motion control, produces loop-consistent panoramas, and supports applications such as motion transfer and panoramic video editing.
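To make the decoupling concrete, here is a minimal NumPy sketch of the idea. It is not PanFlow's actual API (the function names are illustrative, and the paper's formulation operates on the sphere rather than this planar ERP approximation): given a camera rotation `R`, the flow it induces on an equirectangular (ERP) frame can be computed analytically by rotating per-pixel view directions, and the derotated flow is what remains after subtracting that component from the total optical flow.

```python
import numpy as np

def erp_to_dirs(h, w):
    # Pixel centers of an equirectangular (ERP) image -> unit directions.
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi      # [-pi, pi)
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi      # [pi/2, -pi/2]
    lon, lat = np.meshgrid(lon, lat)                        # (h, w) each
    return np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)   # (h, w, 3)

def dirs_to_erp(d, h, w):
    # Unit directions -> ERP pixel coordinates (x, y).
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    x = (lon + np.pi) / (2 * np.pi) * w - 0.5
    y = (np.pi / 2 - lat) / np.pi * h - 0.5
    return x, y

def rotation_flow(R, h, w):
    # ERP flow field induced purely by a 3x3 rotation R of the view directions.
    d = erp_to_dirs(h, w)
    x0, y0 = dirs_to_erp(d, h, w)
    x1, y1 = dirs_to_erp(d @ R.T, h, w)
    u = x1 - x0
    u = (u + w / 2) % w - w / 2   # wrap horizontal flow across the ERP seam
    return np.stack([u, y1 - y0], axis=-1)

# Given a total flow field from any optical flow estimator (the repo uses
# PanoFlow), the derotated component is the residual:
#   flow_derot = flow_total - rotation_flow(R, h, w)
```

Note that the rotation component wraps cleanly across the left/right seam of the panorama, which is what makes it a natural factor to separate out when targeting loop-consistent generation.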
We use conda to manage the environment. You can create the environment by running the following commands:
```bash
conda create -n panflow python=3.11 -y
conda activate panflow
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
```

We use wandb to log and visualize the training process. You can create an account, then log in to wandb by running the following command:
```bash
wandb login
```

Download the pretrained checkpoints from this OneDrive link to the `checkpoints/` folder, or from their corresponding sources:
- Download the pretrained model to `checkpoints/`.
- Download the pretrained model `PanoFlow(RAFT)-wo-CFE.pth` of PanoFlow at Weiyun, then put it in the `checkpoints/` folder. This is used for optical flow estimation in noise warping.
- Download the pretrained model `i3d_pretrained_400.pt` in common_metrics_on_video_quality, then put it in the `checkpoints/` folder. This is used for FVD calculation during evaluation.
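After these downloads, the `checkpoints/` folder should contain at least the two files named above (the exact filename of the base model in the first item depends on the source you chose):

```
checkpoints/
├── PanoFlow(RAFT)-wo-CFE.pth   # optical flow estimation for noise warping
└── i3d_pretrained_400.pt       # I3D weights for FVD during evaluation
```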
Download our finetuned LoRA weights from here and put them in the `logs/` folder.
Download the toy dataset from OneDrive or Hugging Face and put it in the `data/PanFlow/` folder. The demo videos are from 360-1M, sourced from YouTube and licensed under CC BY 4.0.
Run the following command to generate motion transfer results:
```bash
WANDB_RUN_ID=u95jgv9e python -m demo.demo --demo-name motion_transfer --noise_alpha 0.5
```

Run the following command to generate editing results:
```bash
WANDB_RUN_ID=u95jgv9e python -m demo.demo --demo-name editing --noise_alpha 0.5
```

We pre-generate latent and noise caches for the filtered subset to speed up training. Please download them from Hugging Face to `data/PanFlow/` by running:
```bash
huggingface-cli download chengzhag/PanFlow --repo-type dataset --local-dir data/PanFlow
```

This also includes pose and meta information for the full PanFlow dataset. Please decompress the tar.gz files in `data/PanFlow/`:
```bash
cd data/PanFlow
tar -xzvf meta.tar.gz
tar -xzvf slam_pose.tar.gz
```

Alternatively, you can download the 360-1M videos we filtered and generate your own cache:
```bash
python -m tools.download_360_1m
```

This script is adapted from 360-1M. Because yt-dlp frequently changes its downloading mechanism to keep up with YouTube's anti-scraping measures, the script may require adjustments from time to time.
The cache will be generated automatically during training if it is not found in the `data/PanFlow/cache/` folder.
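For intuition on what the noise cache contains: noise warping (in the spirit of Go-with-the-Flow, which this repo builds on) drags per-frame Gaussian noise along the optical flow, so the diffusion noise stays temporally correlated with the motion. Below is a rough PyTorch sketch assuming a backward flow field in pixels; `warp_noise` is an illustrative name, not the repo's implementation, and real noise warping needs the distribution-preserving resampling that this sketch sidesteps by using nearest sampling.

```python
import torch
import torch.nn.functional as F

def warp_noise(noise: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp frame-t noise to frame t+1 along a backward flow field.

    noise: (1, C, H, W) Gaussian noise for frame t
    flow:  (1, 2, H, W) backward flow (t+1 -> t), in pixels
    """
    _, _, h, w = noise.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32),
        torch.arange(w, dtype=torch.float32),
        indexing="ij",
    )
    # For each pixel of frame t+1, look up where it came from in frame t.
    src_x = xs + flow[:, 0]
    src_y = ys + flow[:, 1]
    # Normalize source coordinates to [-1, 1] for grid_sample.
    grid = torch.stack(
        [src_x / (w - 1) * 2 - 1, src_y / (h - 1) * 2 - 1], dim=-1
    )  # (1, H, W, 2)
    # Nearest sampling keeps each value an unmodified Gaussian sample;
    # bilinear interpolation would shrink the noise variance.
    return F.grid_sample(noise, grid, mode="nearest", align_corners=True)
```

On the panorama, PanFlow performs this warping spherically so the noise stays consistent across the ERP seam; see the paper for the exact formulation.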
If you want to download the full videos or go through the data curation process yourself, please follow the steps in /curation. This ends up with 24k metadata entries and corresponding poses for 400k clips, which are already included in the Hugging Face dataset (the meta and slam_pose folders) and are needed for cache generation and training.
Run the following command to start training:
```bash
bash finetune/train_ddp_i2v.sh
```

We used 8 A100 GPUs for training. You'll get a `WANDB_RUN_ID` (e.g., u95jgv9e) after starting the training. The logs will be synced to your wandb account and the checkpoints will be saved in `logs/<WANDB_RUN_ID>/checkpoints/`.
Run the following command to evaluate the model:
```bash
WANDB_RUN_ID=<u95jgv9e_or_your_id_here> python -m finetune.evaluate --num-test-samples 100
```

This evaluation script computes all metrics except the Q-Align scores. The results will be logged to `logs/<WANDB_RUN_ID>/PanFlow/`.
If you find our work helpful, please consider citing:
```bibtex
@inproceedings{zhang2025panflow,
  title={PanFlow: Decoupled Motion Control for Panoramic Video Generation},
  author={Zhang, Cheng and Liang, Hanwen and Chen, Donny Y and Wu, Qianyi and Plataniotis, Konstantinos N and Gambardella, Camilo Cruz and Cai, Jianfei},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2026}
}
```

Our paper could not have been completed without the amazing open-source projects CogVideo, Go-with-the-Flow, stella_vslam, PySceneDetect...
Also check out our latest work UCPE on camera-controllable video generation, and our Pan-Series works PanFusion and PanSplat towards 3D scene generation with panoramic images!
D. Y. Chen's contributions were made while he was affiliated with Monash University.


