Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} by miguelmartin75 · Pull Request #13066 · huggingface/diffusers

miguelmartin75 · 2026-02-02T19:52:14Z

What does this PR do?

This PR introduces the Cosmos Transfer2.5 inference pipeline, which extends the existing code in transformer_cosmos.py and introduces a new controlnet class for Cosmos. The conversion script is also updated to convert the new checkpoints.

I've intentionally split the controlnet from the base predict model to match the rest of the diffusers codebase. To do this, I had to duplicate some layers/weights from the base model (those relating to the patch and timestep embeddings), but I believe SD3 does the same.
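A minimal sketch of what that duplication could look like at conversion time, assuming state dicts are plain name-to-value mappings. The key prefixes (`patch_embed.`, `time_embed.`) and the helper name `add_shared_layers` are hypothetical and only illustrate the idea; they are not taken from the conversion script in this PR.

```python
from copy import deepcopy

# Hypothetical prefixes for the layers the controlnet needs its own copy of.
SHARED_PREFIXES = ("patch_embed.", "time_embed.")


def add_shared_layers(base_sd, control_sd, shared_prefixes=SHARED_PREFIXES):
    """Return a new controlnet state dict that also carries copies of the
    base-model entries whose names start with one of `shared_prefixes`."""
    merged = dict(control_sd)  # shallow copy so the input dict is not mutated
    for name, value in base_sd.items():
        if name.startswith(shared_prefixes):
            # Deep copy so the controlnet owns its weights independently.
            merged[name] = deepcopy(value)
    return merged


if __name__ == "__main__":
    base = {
        "patch_embed.proj.weight": [1.0],
        "time_embed.linear.weight": [2.0],
        "transformer_blocks.0.attn1.to_q.weight": [3.0],
    }
    control = {"control_blocks.0.weight": [4.0]}
    print(sorted(add_shared_layers(base, control)))
```

The cost of this layout is a small amount of duplicated storage; the benefit is that the controlnet can be loaded and saved on its own, like other controlnets in the library.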

Similar to predict2.5, I have added documentation and unit tests.

Additional PRs will be submitted for the following features (in order of priority):

1. Auto-regressive (AR) inference support. Currently, inference can only be applied to a fixed number of frames, whereas cosmos-transfer2.5 performs AR inference.
2. Additional transfer2.5 variants:
   - multi-control (multiple controlnets at once)
   - auto/multiview
3. Image reference
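To illustrate the first item, here is a rough sketch of how a long clip could be divided into fixed-size windows for auto-regressive inference. This is one common scheme, not necessarily how cosmos-transfer2.5 does it: the function name `plan_windows`, the default window of 93 frames, and the one-frame overlap used to condition each window on the previous one are all assumptions.

```python
def plan_windows(total_frames, window=93, overlap=1):
    """Return [start, end) frame ranges that cover `total_frames`.

    Consecutive windows share `overlap` frames, so the tail of one window
    can condition the next. The final window may be shorter than `window`;
    a real pipeline would need to pad or otherwise handle it.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if total_frames <= 0:
        return []

    stride = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, total_frames)
        windows.append((start, end))
        if end == total_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    print(plan_windows(200))  # [(0, 93), (92, 185), (184, 200)]
```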

In addition, unfortunately, the guardrails safety model is too aggressive: it currently flags the cosmos-transfer2.5 examples as "not safe" (e.g. the 93-frame edge example is flagged). The guardrail model needs to be updated, but that work is largely orthogonal to this PR.

Who can review?

Core library:

Pipelines and pipeline callbacks: @yiyixuxu and @asomoza
Docs: @stevhliu and @sayakpaul
General functionalities: @sayakpaul @yiyixuxu @DN6

yiyixuxu

Thanks for the PR! The overall structure looks good. I left some minor comments.

One question before I can review further: Are the base transformer weights the same across the different control variants?

This helps us understand whether splitting the controlnet from the transformer makes sense (i.e., can users mix and match?), and also helps me understand whether the controlnet is required for this pipeline, etc.

scripts/convert_cosmos_to_diffusers.py

src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_transfer.py

miguelmartin75 · 2026-02-06T07:38:04Z

Addressed your comment about transfer2_5_forward + updated the example code

Are the base transformer weights the same across the different control variants? ... can users mix and match?

Yes, mixing and matching controlnets is possible, but only if an image context reference is not included (see here; including an image reference is not currently supported in this PR). Additionally, using multiple controlnets at once ("multicontrol") will be possible (any base transformer can be used; cosmos-transfer2.5 always picks "edge"), but I will need to submit a separate PR for this. Note that multicontrol does not support an image reference.

To be more specific, the base transformer weights are almost the same. The difference lies in the weights of the cross-attention layers for an image reference (see here), i.e. attn2 in diffusers-land, for all blocks in the base transformer. Without an image reference, all base transformers are functionally the same; in this case the img_context tensor is torch.zeros. I also qualitatively verified all pairs of base transformer + controlnet as a sanity check, and they appear to output the same results.
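A claim like this can also be checked directly on the checkpoints. The sketch below compares two state dicts and reports which parameter names differ; it uses plain Python values so it runs anywhere, and the parameter names in the example are hypothetical. With real checkpoints one would pass `equal=torch.equal` to compare tensors.

```python
def diff_state_dicts(sd_a, sd_b, equal=lambda x, y: x == y):
    """Return the sorted names whose values differ between two state dicts,
    including names present in only one of them."""
    differing = []
    for name in sorted(set(sd_a) | set(sd_b)):
        if name not in sd_a or name not in sd_b:
            differing.append(name)
        elif not equal(sd_a[name], sd_b[name]):
            differing.append(name)
    return differing


def only_cross_attention_differs(sd_a, sd_b, marker=".attn2."):
    """True if every differing parameter name contains `marker`."""
    return all(marker in name for name in diff_state_dicts(sd_a, sd_b))


if __name__ == "__main__":
    # Toy stand-ins for two base transformers (hypothetical names and values).
    edge = {
        "transformer_blocks.0.attn1.to_q.weight": [1.0, 2.0],
        "transformer_blocks.0.attn2.to_k.weight": [0.1],
    }
    depth = {
        "transformer_blocks.0.attn1.to_q.weight": [1.0, 2.0],
        "transformer_blocks.0.attn2.to_k.weight": [0.7],
    }
    print(diff_state_dicts(edge, depth))
    print(only_cross_attention_differs(edge, depth))
```

A weight comparison complements the qualitative output check: it shows where two checkpoints differ, not whether those differences affect the output for a given input.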

I will need to document this when I have a PR up for the image reference feature, (3) in my description.

miguelmartin75 added 30 commits February 2, 2026 19:50

initial conversion script

dd241dc

cosmos control net block

7e475bd

CosmosAttention

1b934ff

base model conversion

b40da24

wip

cfedde1

pipeline updates

8222e9f

convert controlnet

9fefe1f

pipeline: working without controls

2b67a31

wip

5f2bab8

debugging

97f10d8

Almost working

cc6cf13

temp

4ba9945

control working

35e0653

cleanup + detail on neg_encoder_hidden_states

9da2e88

convert edge

b3852ac

pos emb for control latents

a16e81a

convert all chkpts

cd65899

resolve TODOs

dfe99b8

remove prints

aadf51a

Docs

26b7ee5

add siglip image reference encoder

d7f122d

Add unit tests

50f7e53

controlnet: add duplicate layers

c5c2456

Additional tests

9a55923

skip less

2e2fea1

skip less

bf1f99d

remove image_ref

910103f

minor

751fba4

docs

251b5c1

remove skipped test in transfer

44db782

miguelmartin75 added 2 commits February 2, 2026 19:50

Don't crash process

c1cfa9d

formatting

9b8338c

miguelmartin75 changed the title ~~Cosmos/transfer2.5~~ Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} Feb 2, 2026

miguelmartin75 added 3 commits February 2, 2026 19:59

revert some changes

b9dd0cb

remove skipped test

d09cf24

make style

2cd7f23

yiyixuxu reviewed Feb 2, 2026


Address comment + fix example

ef332d2

miguelmartin75 wants to merge 36 commits into huggingface:main from miguelmartin75:cosmos/transfer2.5
