
Commit af1a72f

resolve conflicts.
2 parents: 39f7d2d + 040c118


55 files changed: +3981 −558 lines


.github/workflows/codeql.yml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+---
+name: CodeQL Security Analysis For Github Actions
+
+on:
+  push:
+    branches: ["main"]
+  workflow_dispatch:
+  # pull_request:
+
+jobs:
+  codeql:
+    name: CodeQL Analysis
+    uses: huggingface/security-workflows/.github/workflows/codeql-reusable.yml@v1
+    permissions:
+      security-events: write
+      packages: read
+      actions: read
+      contents: read
+    with:
+      languages: '["actions","python"]'
+      queries: 'security-extended,security-and-quality'
+      runner: 'ubuntu-latest' #optional if need custom runner

.github/workflows/mirror_community_pipeline.yml

Lines changed: 13 additions & 11 deletions
@@ -24,7 +24,6 @@ jobs:
   mirror_community_pipeline:
     env:
       SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_COMMUNITY_MIRROR }}
-
     runs-on: ubuntu-22.04
     steps:
       # Checkout to correct ref
@@ -39,25 +38,28 @@ jobs:
       # If ref is 'refs/heads/main' => set 'main'
       # Else it must be a tag => set {tag}
       - name: Set checkout_ref and path_in_repo
+          EVENT_NAME: ${{ github.event_name }}
+          EVENT_INPUT_REF: ${{ github.event.inputs.ref }}
+          GITHUB_REF: ${{ github.ref }}
         run: |
-          if [ "${{ github.event_name }}" == "workflow_dispatch" ]; then
-            if [ -z "${{ github.event.inputs.ref }}" ]; then
+          if [ "$EVENT_NAME" == "workflow_dispatch" ]; then
+            if [ -z "$EVENT_INPUT_REF" ]; then
               echo "Error: Missing ref input"
               exit 1
-            elif [ "${{ github.event.inputs.ref }}" == "main" ]; then
+            elif [ "$EVENT_INPUT_REF" == "main" ]; then
               echo "CHECKOUT_REF=refs/heads/main" >> $GITHUB_ENV
               echo "PATH_IN_REPO=main" >> $GITHUB_ENV
             else
-              echo "CHECKOUT_REF=refs/tags/${{ github.event.inputs.ref }}" >> $GITHUB_ENV
-              echo "PATH_IN_REPO=${{ github.event.inputs.ref }}" >> $GITHUB_ENV
+              echo "CHECKOUT_REF=refs/tags/$EVENT_INPUT_REF" >> $GITHUB_ENV
+              echo "PATH_IN_REPO=$EVENT_INPUT_REF" >> $GITHUB_ENV
             fi
-          elif [ "${{ github.ref }}" == "refs/heads/main" ]; then
-            echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
+          elif [ "$GITHUB_REF" == "refs/heads/main" ]; then
+            echo "CHECKOUT_REF=$GITHUB_REF" >> $GITHUB_ENV
             echo "PATH_IN_REPO=main" >> $GITHUB_ENV
           else
             # e.g. refs/tags/v0.28.1 -> v0.28.1
-            echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
-            echo "PATH_IN_REPO=$(echo ${{ github.ref }} | sed 's/^refs\/tags\///')" >> $GITHUB_ENV
+            echo "CHECKOUT_REF=$GITHUB_REF" >> $GITHUB_ENV
+            echo "PATH_IN_REPO=$(echo $GITHUB_REF | sed 's/^refs\/tags\///')" >> $GITHUB_ENV
           fi
       - name: Print env vars
         run: |
@@ -99,4 +101,4 @@ jobs:
       - name: Report failure status
         if: ${{ failure() }}
         run: |
-          pip install requests && python utils/notify_community_pipelines_mirror.py --status=failure
+          pip install requests && python utils/notify_community_pipelines_mirror.py --status=failure
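
Beyond moving the `${{ ... }}` expressions into environment variables, the step's comments describe a small mapping from the triggering ref to `PATH_IN_REPO`. An illustrative Python rendering of that shell logic (not part of the commit; the function and its argument names are invented for clarity):

```python
# Illustrative only (not part of this commit): the ref -> PATH_IN_REPO mapping
# that the shell step above implements, written as a plain Python function.
def path_in_repo(event_name: str, input_ref: str, github_ref: str) -> str:
    if event_name == "workflow_dispatch":
        if not input_ref:
            raise ValueError("Error: Missing ref input")
        # "main" maps to the main path; anything else is treated as a tag name.
        return "main" if input_ref == "main" else input_ref
    if github_ref == "refs/heads/main":
        return "main"
    # e.g. "refs/tags/v0.28.1" -> "v0.28.1"
    return github_ref.removeprefix("refs/tags/")
```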

docs/source/en/_toctree.yml

Lines changed: 8 additions & 0 deletions
@@ -367,6 +367,8 @@
     title: LatteTransformer3DModel
   - local: api/models/longcat_image_transformer2d
     title: LongCatImageTransformer2DModel
+  - local: api/models/ltx2_video_transformer3d
+    title: LTX2VideoTransformer3DModel
   - local: api/models/ltx_video_transformer3d
     title: LTXVideoTransformer3DModel
   - local: api/models/lumina2_transformer2d
@@ -443,6 +445,10 @@
     title: AutoencoderKLHunyuanVideo
   - local: api/models/autoencoder_kl_hunyuan_video15
     title: AutoencoderKLHunyuanVideo15
+  - local: api/models/autoencoderkl_audio_ltx_2
+    title: AutoencoderKLLTX2Audio
+  - local: api/models/autoencoderkl_ltx_2
+    title: AutoencoderKLLTX2Video
   - local: api/models/autoencoderkl_ltx_video
     title: AutoencoderKLLTXVideo
   - local: api/models/autoencoderkl_magvit
@@ -678,6 +684,8 @@
     title: Kandinsky 5.0 Video
   - local: api/pipelines/latte
     title: Latte
+  - local: api/pipelines/ltx2
+    title: LTX-2
   - local: api/pipelines/ltx_video
     title: LTXVideo
   - local: api/pipelines/mochi
docs/source/en/api/models/autoencoderkl_audio_ltx_2.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# AutoencoderKLLTX2Audio
+
+The 3D variational autoencoder (VAE) model with KL loss used in [LTX-2](https://huggingface.co/Lightricks/LTX-2) was introduced by Lightricks. This is for encoding and decoding audio latent representations.
+
+The model can be loaded with the following code snippet.
+
+```python
+from diffusers import AutoencoderKLLTX2Audio
+
+vae = AutoencoderKLLTX2Audio.from_pretrained("Lightricks/LTX-2", subfolder="vae", torch_dtype=torch.float32).to("cuda")
+```
+
+## AutoencoderKLLTX2Audio
+
+[[autodoc]] AutoencoderKLLTX2Audio
+  - encode
+  - decode
+  - all
docs/source/en/api/models/autoencoderkl_ltx_2.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# AutoencoderKLLTX2Video
+
+The 3D variational autoencoder (VAE) model with KL loss used in [LTX-2](https://huggingface.co/Lightricks/LTX-2) was introduced by Lightricks.
+
+The model can be loaded with the following code snippet.
+
+```python
+from diffusers import AutoencoderKLLTX2Video
+
+vae = AutoencoderKLLTX2Video.from_pretrained("Lightricks/LTX-2", subfolder="vae", torch_dtype=torch.float32).to("cuda")
+```
+
+## AutoencoderKLLTX2Video
+
+[[autodoc]] AutoencoderKLLTX2Video
+  - decode
+  - encode
+  - all
docs/source/en/api/models/ltx2_video_transformer3d.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# LTX2VideoTransformer3DModel
+
+A Diffusion Transformer model for 3D data from [LTX](https://huggingface.co/Lightricks/LTX-2) was introduced by Lightricks.
+
+The model can be loaded with the following code snippet.
+
+```python
+from diffusers import LTX2VideoTransformer3DModel
+
+transformer = LTX2VideoTransformer3DModel.from_pretrained("Lightricks/LTX-2", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
+```
+
+## LTX2VideoTransformer3DModel
+
+[[autodoc]] LTX2VideoTransformer3DModel
docs/source/en/api/pipelines/ltx2.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License. -->
+
+# LTX-2
+
+LTX-2 is a DiT-based audio-video foundation model designed to generate synchronized video and audio within a single model. It brings together the core building blocks of modern video generation, with open weights and a focus on practical, local execution.
+
+You can find all the original LTX-Video checkpoints under the [Lightricks](https://huggingface.co/Lightricks) organization.
+
+The original codebase for LTX-2 can be found [here](https://github.com/Lightricks/LTX-2).
+
+## LTX2Pipeline
+
+[[autodoc]] LTX2Pipeline
+  - all
+  - __call__
+
+## LTX2ImageToVideoPipeline
+
+[[autodoc]] LTX2ImageToVideoPipeline
+  - all
+  - __call__
+
+## LTX2PipelineOutput
+
+[[autodoc]] pipelines.ltx2.pipeline_output.LTX2PipelineOutput
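
The new pipeline page above documents the classes but does not yet include a usage snippet. A minimal sketch of what loading and calling the text-to-video pipeline might look like, assuming `LTX2Pipeline` follows the conventions of the existing LTX-Video pipelines; the call arguments, the `.frames` output attribute, and the use of `Lightricks/LTX-2` as the repo id are assumptions, not taken from this commit:

```python
# Illustrative sketch only: argument names and the output attribute follow the
# existing LTX-Video pipelines and are assumptions, not part of this commit.
import torch
from diffusers import LTX2Pipeline
from diffusers.utils import export_to_video

pipe = LTX2Pipeline.from_pretrained("Lightricks/LTX-2", torch_dtype=torch.bfloat16).to("cuda")

# Generate a short clip; check the LTX2Pipeline docstring for the exact signature.
video = pipe(prompt="A red fox trotting through fresh snow", num_frames=81).frames[0]
export_to_video(video, "output.mp4", fps=24)
```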

docs/source/en/api/pipelines/ltx_video.md

Lines changed: 8 additions & 2 deletions
@@ -136,7 +136,7 @@ export_to_video(video, "output.mp4", fps=24)
 - The recommended dtype for the transformer, VAE, and text encoder is `torch.bfloat16`. The VAE and text encoder can also be `torch.float32` or `torch.float16`.
 - For guidance-distilled variants of LTX-Video, set `guidance_scale` to `1.0`. The `guidance_scale` for any other model should be set higher, like `5.0`, for good generation quality.
 - For timestep-aware VAE variants (LTX-Video 0.9.1 and above), set `decode_timestep` to `0.05` and `image_cond_noise_scale` to `0.025`.
-- For variants that support interpolation between multiple conditioning images and videos (LTX-Video 0.9.5 and above), use similar images and videos for the best results. Divergence from the conditioning inputs may lead to abrupt transitionts in the generated video.
+- For variants that support interpolation between multiple conditioning images and videos (LTX-Video 0.9.5 and above), use similar images and videos for the best results. Divergence from the conditioning inputs may lead to abrupt transitions in the generated video.
 
 - LTX-Video 0.9.7 includes a spatial latent upscaler and a 13B parameter transformer. During inference, a low resolution video is quickly generated first and then upscaled and refined.
 
@@ -329,7 +329,7 @@ export_to_video(video, "output.mp4", fps=24)
 
 <details>
 <summary>Show example code</summary>
-
+
 ```python
 import torch
 from diffusers import LTXConditionPipeline, LTXLatentUpsamplePipeline
@@ -474,6 +474,12 @@ export_to_video(video, "output.mp4", fps=24)
 
 </details>
 
+## LTXI2VLongMultiPromptPipeline
+
+[[autodoc]] LTXI2VLongMultiPromptPipeline
+  - all
+  - __call__
+
 ## LTXPipeline
 
 [[autodoc]] LTXPipeline

docs/source/en/api/pipelines/skyreels_v2.md

Lines changed: 2 additions & 1 deletion
@@ -37,7 +37,8 @@ The following SkyReels-V2 models are supported in Diffusers:
 - [SkyReels-V2 I2V 1.3B - 540P](https://huggingface.co/Skywork/SkyReels-V2-I2V-1.3B-540P-Diffusers)
 - [SkyReels-V2 I2V 14B - 540P](https://huggingface.co/Skywork/SkyReels-V2-I2V-14B-540P-Diffusers)
 - [SkyReels-V2 I2V 14B - 720P](https://huggingface.co/Skywork/SkyReels-V2-I2V-14B-720P-Diffusers)
-- [SkyReels-V2 FLF2V 1.3B - 540P](https://huggingface.co/Skywork/SkyReels-V2-FLF2V-1.3B-540P-Diffusers)
+
+This model was contributed by [M. Tolga Cangöz](https://github.com/tolgacangoz).
 
 > [!TIP]
 > Click on the SkyReels-V2 models in the right sidebar for more examples of video generation.

docs/source/en/training/distributed_inference.md

Lines changed: 2 additions & 2 deletions
@@ -263,8 +263,8 @@ def main():
     world_size = dist.get_world_size()
 
     pipeline = DiffusionPipeline.from_pretrained(
-        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16, device_map=device
-    )
+        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
+    ).to(device)
     pipeline.transformer.set_attention_backend("_native_cudnn")
 
     cp_config = ContextParallelConfig(ring_degree=world_size)
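
For context, this hunk sits inside the guide's `main()` function: the pipeline is now loaded as usual and then moved to the local rank's GPU with `.to(device)` instead of passing `device_map=device`. A minimal sketch of the surrounding pattern; the process-group setup, device selection, and the `ContextParallelConfig` import path are assumptions based on standard `torch.distributed` and diffusers usage, not taken from this diff:

```python
# Sketch of the corrected loading pattern; distributed setup lines are generic
# torch.distributed boilerplate and are not part of this commit.
import torch
import torch.distributed as dist
from diffusers import ContextParallelConfig, DiffusionPipeline

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
world_size = dist.get_world_size()
device = torch.device("cuda", rank % torch.cuda.device_count())

# Load once per process, then place the whole pipeline on this rank's GPU.
pipeline = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to(device)
pipeline.transformer.set_attention_backend("_native_cudnn")

cp_config = ContextParallelConfig(ring_degree=world_size)
# ...continue as in the surrounding guide: attach cp_config and run inference.
```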
