Support using different attention kwargs for different types of processors in one model. #4152

Closed
eliphatfs opened this issue Jul 19, 2023 · 11 comments
Labels: stale (Issues that haven't received updates)

Comments

@eliphatfs
Contributor

eliphatfs commented Jul 19, 2023

Is your feature request related to a problem? Please describe.
Say you want to use CustomDiffusion for some layers and LoRA for others, and you want to pass {'scale': 0.5} to the LoRA layers. The call then fails with:

TypeError: __call__() got an unexpected keyword argument 'scale'

because CustomDiffusionAttnProcessor has no idea what that parameter means.

Describe the solution you'd like

  1. The easiest solution is to drop excess kwargs for attention processors that do not accept them (see the filtering sketch after this list). The downside is that silent bugs may creep in.
  2. Alternatively, implement a flag indicating whether a processor expects excess kwargs. The downside of this fix is that it looks too ad hoc.
  3. Add support for attention kwargs that also specify which layers or processor types they affect. The downside is a more complicated design and a lot of work, since every pipeline might need to be modified.
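
A minimal sketch of option 1, assuming a small helper that inspects each processor's call signature and drops kwargs it cannot accept; the helper name filter_kwargs_for is hypothetical, not part of diffusers:

import inspect

def filter_kwargs_for(processor, kwargs):
    """Drop any kwargs that the processor's __call__ does not declare."""
    params = inspect.signature(processor.__call__).parameters
    # Processors that already take **kwargs get everything passed through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}

# Inside an attention block, dispatch would then look roughly like:
#   filtered = filter_kwargs_for(self.processor, cross_attention_kwargs or {})
#   self.processor(self, hidden_states, **filtered)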

Additional context
A similar issue was raised in this comment: #1639 (comment), but it did not get enough attention.

@patrickvonplaten
Contributor

What exactly is CustomDiffusion? Could you add a reproducible code snippet? :-)

@eliphatfs
Contributor Author

from diffusers import StableDiffusionPipeline, UNet2DConditionModel
from diffusers.models.attention_processor import CustomDiffusionAttnProcessor, LoRAAttnProcessor
import torch


mix_lora_and_custom = True
model_base = "stabilityai/stable-diffusion-2"

pipe = StableDiffusionPipeline.from_pretrained(model_base, local_files_only=True)

unet: UNet2DConditionModel = pipe.unet
attn_procs = {}
for name in unet.attn_processors:
    # Self-attention layers (attn1) have no cross-attention dimension.
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    # Derive the hidden size of the block this processor belongs to.
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    # Mix the two processor types: LoRA on the first down block, Custom Diffusion everywhere else.
    if not mix_lora_and_custom or name.startswith("down_blocks.0.attentions"):
        attn_procs[name] = LoRAAttnProcessor(hidden_size, cross_attention_dim)
    else:
        attn_procs[name] = CustomDiffusionAttnProcessor(
            hidden_size=hidden_size,
            cross_attention_dim=cross_attention_dim,
        ).to(unet.device)
unet.set_attn_processor(attn_procs)
pipe.to('cuda:0', torch.bfloat16)
# Raises TypeError: CustomDiffusionAttnProcessor.__call__() does not accept 'scale'.
pipe("boom", cross_attention_kwargs=dict(scale=0.5))

@patrickvonplaten
Contributor

Ah I see, I think we could indeed allow passing the cross_attention_kwargs to custom diffusion. @sayakpaul what do you think?

@sayakpaul
Member

Sure! I like the idea.

Would you maybe like to open a PR? Happy to help.

@eliphatfs
Contributor Author

What solution would you prefer?

@sayakpaul
Member

Supporting a scale argument in the Custom Diffusion attention processor?
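
A minimal sketch of that suggestion, assuming we subclass the processor to accept and silently ignore the extra kwarg (illustrative only, not diffusers code; the __call__ signature mirrors the current CustomDiffusionAttnProcessor):

from diffusers.models.attention_processor import CustomDiffusionAttnProcessor

class ScaleTolerantCustomDiffusionAttnProcessor(CustomDiffusionAttnProcessor):
    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, scale=1.0):
        # `scale` has no meaning for Custom Diffusion; swallow it and
        # delegate to the original implementation unchanged.
        return super().__call__(attn, hidden_states,
                                encoder_hidden_states=encoder_hidden_states,
                                attention_mask=attention_mask)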

@eliphatfs
Contributor Author

Then say I want to mix LoRA with the vanilla AttnProcessor; the same failure occurs there.
Moreover, as new processors are added when new papers come out, we would need to update every attention processor whenever a new kwarg is introduced, and I don't want to do that.
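
For contrast, a sketch of option 3 from the issue body: scope the kwargs by processor type so that only matching processors ever see them. The nested-dict convention and the kwargs_for helper are hypothetical, not diffusers API:

def kwargs_for(processor, scoped_kwargs):
    """Pick out the kwargs addressed to this processor's class, if any."""
    return scoped_kwargs.get(type(processor).__name__, {})

# The pipeline call would then scope `scale` to LoRA processors only:
#   pipe("boom", cross_attention_kwargs={"LoRAAttnProcessor": {"scale": 0.5}})
# and each attention block would dispatch roughly like:
#   self.processor(self, hidden_states,
#                  **kwargs_for(self.processor, cross_attention_kwargs))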

@sayakpaul
Member

How would you have approached it otherwise? Happy to discuss.

@eliphatfs
Contributor Author

I described three solutions that I thought of in the issue body, but none of them seems perfect to me.

@sayakpaul
Member

Okay, then we will keep it open for now.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the stale label on Aug 18, 2023