feat: Add Modular Pipeline for Stable Diffusion 3 (SD3) #13324
AlanPonnachan wants to merge 17 commits into huggingface:main
tests/modular_pipelines/stable_diffusion_3/test_modular_pipeline_stable_diffusion_3.py
@AlanPonnachan thanks for this PR! Could you also provide some test code and sample outputs?
sayakpaul left a comment
Thanks for getting started on this! I left some comments (majorly on the use of guidance).
src/diffusers/modular_pipelines/stable_diffusion_3/before_denoise.py
@claude can you review this?
I'll analyze this and get back to you.
@bot /style
Style bot fixed some files and pushed the changes.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@sayakpaul Sample outputs you can find here: #13324 (comment)
src/diffusers/modular_pipelines/stable_diffusion_3/modular_blocks_stable_diffusion_3.py
yiyixuxu left a comment
thanks for working on this!
I left one comment
- add autodocstring to assembled blocks
    logger = logging.get_logger(__name__)

    # auto_docstring
i added a doc page on this here #13382
basically you need to run
python utils/modular_auto_docstring.py --fix_and_overwrite
and look through the generated docstrings to check that all the parameters are properly defined
@yiyixuxu, I added descriptions to most of the InputParam and OutputParam entries and ran the above script.
I skimmed through the generated docstrings once and they looked right.
Let me know your thoughts!
@claude
I'll analyze this and get back to you.
does this work? @claude can you help explain what needs to be done for this to work, based on the current state of the PR?
I'll analyze this and get back to you.
@yiyixuxu, I tried this out and it works fine. I ran the following example on a Colab T4:

    import torch
    from IPython.display import display

    from diffusers import ComponentsManager, ModularPipeline
    from diffusers.utils import load_image

    # Manage components centrally and offload idle ones to CPU to fit on a T4
    components = ComponentsManager()
    components.enable_auto_cpu_offload(device="cuda")

    repo_id = "stabilityai/stable-diffusion-3-medium-diffusers"
    pipeline = ModularPipeline.from_pretrained(repo_id, components_manager=components)

    # Load everything except the third tokenizer / text encoder
    components_to_load = [
        "tokenizer", "tokenizer_2", "scheduler", "guider",
        "image_processor", "text_encoder", "text_encoder_2",
        "transformer", "vae",
    ]
    pipeline.load_components(components_to_load, torch_dtype=torch.float16)
    pipeline.update_components(tokenizer_3=None, text_encoder_3=None)

    # TEXT-TO-IMAGE
    prompt = "A highly detailed macro photography of a glowing bioluminescent blue butterfly resting on a vibrant red rose, dark magical forest background, cinematic lighting, 8k resolution, masterpiece"
    print("Running Text-to-Image...")
    t2i_output = pipeline(
        prompt=prompt,
        num_inference_steps=28,
        guidance_scale=7.0,
        generator=torch.manual_seed(42),
    )
    t2i_output.images[0].save("sd3_modular_t2i.png")
    display(t2i_output.images[0])

    # IMAGE-TO-IMAGE
    init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((1024, 1024))
    # Transforming the cat into a dreamy watercolor painting
    prompt_i2i = "A dreamy watercolor painting of a cat sitting in a magical glowing forest, soft pastel colors, ethereal atmosphere, highly detailed, trending on artstation"
    print("Running Image-to-Image...")
    i2i_output = pipeline(
        prompt=prompt_i2i,
        image=init_image,
        strength=0.85,  # Slightly higher strength for a stronger stylistic change
        num_inference_steps=28,
        guidance_scale=7.0,
        generator=torch.manual_seed(100),
    )
    i2i_output.images[0].save("sd3_modular_i2i_watercolor.png")
    display(i2i_output.images[0])

And the outputs (TEXT-TO-IMAGE and IMAGE-TO-IMAGE):



What does this PR do?
This PR introduces the modular architecture for Stable Diffusion 3 (SD3), implementing both Text-to-Image (T2I) and Image-to-Image (I2I) pipelines.
Key additions:
- SD3ModularPipeline and SD3AutoBlocks to the dynamic modular pipeline resolver.
- BlockState
- TestSD3ModularPipelineFast and TestSD3Img2ImgModularPipelineFast test suites.
Related issue: #13295
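The usage example in this thread calls the same pipeline object once with only a prompt and once with an `image`, and gets text-to-image or image-to-image accordingly. A toy sketch of that kind of input-driven dispatch follows. `select_workflow` is a hypothetical helper written for illustration only. It is not part of diffusers or of this PR, and the real block resolution may work differently.

```python
def select_workflow(**inputs):
    """Hypothetical helper: choose a workflow from the inputs that were passed.

    Image-to-image is chosen only when an `image` is actually provided;
    otherwise the call falls back to text-to-image.
    """
    if inputs.get("image") is not None:
        return "image-to-image"
    return "text-to-image"


if __name__ == "__main__":
    print(select_workflow(prompt="a butterfly on a rose"))
    print(select_workflow(prompt="a watercolor cat", image=object(), strength=0.85))
```

The point of the sketch is only that one entry point can serve both tasks, so callers do not need a separate pipeline class per task.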
Usage Example
Colab notebook: https://colab.research.google.com/drive/18_tZWIQdObq8EX0Vyd9ysGA-oACDwpf8?usp=sharing
Outputs
Text-to-Image:
Image-to-Image:
Who can review?
@sayakpaul @asomoza