Project Page | arXiv
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
Xu Guo*, Fulong Ye*, Qichao Sun*†, Liyang Chen, Bingchuan Li†, Pengze Zhang, Jiawei Liu, Songtao Zhao§, Qian He, Xiangwang Hou§
* Equal contribution, † Project lead, § Corresponding author
Tsinghua University | Intelligent Creation Team, ByteDance
DreamID-Omni is a unified framework designed for high-fidelity human-centric generation. It seamlessly integrates three core capabilities into a single model:
- R2AV (Generation): Generate synchronized video and audio from a reference image and a reference voice timbre.
- RV2AV (Editing): Edit the identity and voice of a source video using a reference image and voice timbre.
- RA2V (Animation): Animate a reference identity from an audio input with precise lip sync.
demo.mp4
We are making final preparations for the open-source release. Pending internal approval, we aim to release v1 in March. Please stay tuned!
