GitHub - MattWallingford/360-1M

Website | HuggingFace | Paper

360-1M is a large-scale 360° video dataset consisting of over 1 million videos for training video and 3D foundation models. This repository contains the following:

Links to the video URLs for download from YouTube. We also provide a smaller, filtered 24k subset for experimentation.
Metadata for each video including category, resolution, and views.
Code for downloading the videos locally and to Google Cloud Platform (recommended).
Code for filtering, processing, and obtaining camera pose for the videos.

[Figure: three examples, each showing a reference image, the generated scene, and the trajectory.]

Downloading Videos

Metadata and video URLs can be downloaded here: Metadata with Video URLs. The filtered subset, which is around 5 TB in size, can be found here: Filtered Subset.

To download the videos, we recommend the yt-dlp package. Our download scripts also need pandas and pyarrow to parse the metadata Parquet file:

# Install packages for downloading videos
pip install yt-dlp
pip install pandas
pip install pyarrow
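
Once pandas and pyarrow are installed, the metadata can be inspected and narrowed down before any download starts. The sketch below is only an illustration: the column names `views` and `category` are assumptions based on the metadata fields this README lists (category, resolution, views), not confirmed column names, and `select_videos` is a hypothetical helper that is not part of the repository. Check `metadata.columns` on the real file first.

```python
from typing import Optional

import pandas as pd


def select_videos(
    metadata: pd.DataFrame,
    min_views: int,
    category: Optional[str] = None,
) -> pd.DataFrame:
    """Keep rows with at least `min_views` views, optionally from one category.

    The column names "views" and "category" are assumed for illustration;
    adjust them to whatever the real parquet file contains.
    """
    mask = metadata["views"] >= min_views
    if category is not None:
        mask = mask & (metadata["category"] == category)
    # Return a fresh frame with a clean 0..n-1 index.
    return metadata.loc[mask].reset_index(drop=True)
```

With the real file, the frame would come from `pd.read_parquet("360-1M.parquet")`, and printing `metadata.columns` shows which fields are actually available to filter on.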

The videos can be downloaded using the provided script:

python DownloadVideos/download_local.py --in_path 360-1M.parquet --out_dir /path/to/videos

Or, to download only the high-quality filtered subset:

python DownloadVideos/download_local.py --in_path Filtered_24k.parquet --out_dir /path/to/videos
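
For readers who want to fetch a single video without the provided script, the following is a minimal sketch of how a yt-dlp command line could be assembled from Python. It is not the implementation of `download_local.py`; the helper name `build_ytdlp_command`, the chosen format string, and the output naming scheme are all assumptions made for illustration. The flags used (`--no-playlist`, `-f`, `-o`) are standard yt-dlp options.

```python
from pathlib import Path
from typing import List


def build_ytdlp_command(video_url: str, out_dir: str) -> List[str]:
    """Build a yt-dlp invocation that saves one video into `out_dir`.

    Hypothetical helper: the format selection and file naming are
    illustrative choices, not the repository's settings.
    """
    # yt-dlp expands %(id)s and %(ext)s to the video ID and file extension.
    output_template = str(Path(out_dir) / "%(id)s.%(ext)s")
    return [
        "yt-dlp",
        "--no-playlist",                   # download only the given video
        "-f", "bestvideo+bestaudio/best",  # best quality, with a fallback
        "-o", output_template,
        video_url,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with URLs.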

The total size of all videos at maximum resolution is about 200 TB. Because of bandwidth limitations, we recommend downloading to a cloud platform, and we provide a script for Google Cloud Platform (GCP):

python DownloadVideos/Download_GCP.py --path 360-1M.parquet
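
To put the dataset sizes in perspective, a back-of-the-envelope estimate of transfer time is useful. The sketch below assumes a perfectly sustained link and decimal units (1 TB = 10^12 bytes), so real downloads will take longer; `transfer_days` is a hypothetical helper, not part of the repository.

```python
def transfer_days(size_terabytes: float, bandwidth_gbit_per_s: float) -> float:
    """Days needed to move `size_terabytes` over a sustained link.

    Assumes decimal units and no protocol overhead or throttling.
    """
    size_bits = size_terabytes * 1e12 * 8            # TB -> bytes -> bits
    seconds = size_bits / (bandwidth_gbit_per_s * 1e9)
    return seconds / 86_400                          # seconds per day


# Full dataset (about 200 TB) on a sustained 1 Gbit/s link: roughly 18.5 days.
full_dataset_days = transfer_days(200, 1)
# Filtered subset (about 5 TB) on the same link: roughly 0.46 days (about 11 hours).
subset_days = transfer_days(5, 1)
```

Under these idealized assumptions the full dataset needs more than two weeks on a 1 Gbit/s connection, which is why downloading directly into cloud storage is the more practical route.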

Installation Guide for Video Processing And Training

Environment Setup

Create a new Conda environment:

conda create -n ODIN python=3.9
conda activate ODIN

Clone the repository, then enter it and install the requirements:

cd ODIN
pip install -r requirements.txt

Install additional dependencies:

git clone https://github.com/CompVis/taming-transformers.git
pip install -e taming-transformers/
git clone https://github.com/openai/CLIP.git
pip install -e CLIP/

Clone the MAST3R repository:

git clone --recursive https://github.com/naver/mast3r
cd mast3r

Install MAST3R dependencies:

pip install -r requirements.txt
pip install -r dust3r/requirements.txt

For detailed installation instructions, visit the MAST3R repository.

Extracting Frames

To extract frames from videos, use the video_to_frames.py script:

python video_to_frames.py --path /path/to/videos --out /path/to/frames
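
As a rough picture of what frame extraction involves, the sketch below builds an ffmpeg command that samples frames at a fixed rate. This is an assumption-laden illustration, not the logic of `video_to_frames.py`: the helper name, the default sampling rate of 1 frame per second, the JPEG quality, and the per-video folder layout are all hypothetical. The ffmpeg options used (`-i`, `-vf fps=...`, `-q:v`) are standard.

```python
from pathlib import Path
from typing import List


def build_ffmpeg_command(
    video_path: str, frames_dir: str, fps: float = 1.0
) -> List[str]:
    """Build an ffmpeg call writing numbered JPEG frames for one video.

    Hypothetical layout: <frames_dir>/<video stem>/000001.jpg, ...
    ffmpeg does not create missing folders, so the caller must create
    the output directory before running the command.
    """
    stem = Path(video_path).stem
    pattern = str(Path(frames_dir) / stem / "%06d.jpg")
    return [
        "ffmpeg",
        "-i", video_path,
        "-vf", f"fps={fps}",  # sample this many frames per second
        "-q:v", "2",          # high JPEG quality (lower value = better)
        pattern,
    ]
```

A caller would typically run `Path(pattern).parent.mkdir(parents=True, exist_ok=True)` and then pass the list to `subprocess.run(cmd, check=True)`.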

Extracting Pairwise Poses

Once frames are extracted, pairwise poses can be calculated using:

python extract_poses.py --path /path/to/frames
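
To clarify what a pairwise pose is, the sketch below computes the relative transform between two cameras from their absolute poses. It does not reproduce `extract_poses.py`, which estimates poses from images; it only illustrates the quantity involved, under the assumption that each pose is a 4x4 camera-to-world matrix. The function name `relative_pose` is hypothetical.

```python
import numpy as np


def relative_pose(
    cam_a_to_world: np.ndarray, cam_b_to_world: np.ndarray
) -> np.ndarray:
    """Return the 4x4 transform mapping camera-B coordinates to camera-A.

    Assumes both inputs are homogeneous camera-to-world matrices.
    A point p_b in B's frame goes to the world via cam_b_to_world,
    then into A's frame via the inverse of cam_a_to_world.
    """
    return np.linalg.inv(cam_a_to_world) @ cam_b_to_world


# Example: camera A at the origin, camera B shifted 1 unit along x.
pose_a = np.eye(4)
pose_b = np.eye(4)
pose_b[0, 3] = 1.0
a_from_b = relative_pose(pose_a, pose_b)
```

If the poses were stored as world-to-camera matrices instead, the inverse would move to the second factor, so it is worth confirming the convention before using such a formula.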

Training

Download the image-conditioned Stable Diffusion checkpoint released by Lambda Labs:

wget https://cv.cs.columbia.edu/zero123/assets/sd-image-conditioned-v2.ckpt

Run the training script:

python main.py \
    -t \
    --base configs/sd-ODIN-finetune-c_concat-256.yaml \
    --gpus 0,1,2,3,4,5,6,7 \
    --scale_lr False \
    --num_nodes 1 \
    --check_val_every_n_epoch 1 \
    --finetune_from sd-image-conditioned-v2.ckpt
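
The `--scale_lr` and `--gpus` flags interact in training scripts derived from the Latent Diffusion codebase, where enabling `scale_lr` multiplies the configured base learning rate by the number of GPUs, the batch size, and the gradient accumulation steps. The sketch below assumes this repository's `main.py` follows that convention, which is not confirmed here; the base learning rate and batch size in the example are hypothetical values, not the ones in the config file.

```python
def effective_learning_rate(
    base_lr: float,
    batch_size: int,
    gpus: str,
    accumulate_grad_batches: int = 1,
    scale_lr: bool = False,
) -> float:
    """Learning rate under the Latent Diffusion-style scaling convention.

    `gpus` is a comma-separated device list such as "0,1,2,3".
    This mirrors an assumed convention, not verified against main.py.
    """
    n_gpus = len([g for g in gpus.split(",") if g.strip()])
    if scale_lr:
        return accumulate_grad_batches * n_gpus * batch_size * base_lr
    return base_lr


# Hypothetical values: base_lr=1e-4 and batch_size=4 on eight GPUs.
lr_unscaled = effective_learning_rate(1e-4, 4, "0,1,2,3,4,5,6,7")
lr_scaled = effective_learning_rate(1e-4, 4, "0,1,2,3,4,5,6,7", scale_lr=True)
```

Under this assumption, passing `--scale_lr False` keeps the learning rate at the configured base value regardless of how many GPUs are listed, so changing the GPU count does not silently change the optimizer step size.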
