Support using different attention kwargs for different types of processors in one model. #4152

Closed
eliphatfs opened this issue Jul 19, 2023 · 11 comments
Labels: stale (Issues that haven't received updates)

Comments

@eliphatfs
Contributor

eliphatfs commented Jul 19, 2023

Is your feature request related to a problem? Please describe.
Say you want to use CustomDiffusion for some layers and LoRA for others, and you want to pass {'scale': 0.5} to the LoRA layers. The call then fails with:

TypeError: __call__() got an unexpected keyword argument 'scale'

because CustomDiffusionAttnProcessor has no idea what that parameter means.

Describe the solution you'd like

  1. The easiest solution is to drop excess kwargs for attention processors that do not accept them (see the filtering sketch after this list). The downside is that silent bugs may creep in.
  2. Alternatively, implement a flag indicating whether a processor expects excess kwargs. The downside of this fix is that it looks too ad hoc.
  3. Add support for attention kwargs that also specify which layers or processor types they affect. The downside is a more complicated design and a lot of work, since every pipeline might need to be modified.
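
A minimal sketch of option 1, assuming a small helper that inspects each processor's call signature and drops kwargs it cannot accept; the helper name filter_kwargs_for is hypothetical, not part of diffusers:

import inspect

def filter_kwargs_for(processor, kwargs):
    """Drop any kwargs that the processor's __call__ does not declare."""
    params = inspect.signature(processor.__call__).parameters
    # Processors that already take **kwargs get everything passed through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}

# Inside an attention block, dispatch would then look roughly like:
#   filtered = filter_kwargs_for(self.processor, cross_attention_kwargs or {})
#   self.processor(self, hidden_states, **filtered)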

Additional context
A similar issue was raised in this comment: #1639 (comment), but it did not get enough attention.

@patrickvonplaten
Contributor

What exactly is CustomDiffusion? Could you add a reproducible code snippet? :-)

@eliphatfs
Contributor Author

from diffusers import StableDiffusionPipeline, UNet2DConditionModel
from diffusers.models.attention_processor import CustomDiffusionAttnProcessor, LoRAAttnProcessor
import torch


mix_lora_and_custom = True
model_base = "stabilityai/stable-diffusion-2"

pipe = StableDiffusionPipeline.from_pretrained(model_base, local_files_only=True)

unet: UNet2DConditionModel = pipe.unet
attn_procs = {}
for name in unet.attn_processors:
    # Self-attention layers (attn1) have no cross-attention dimension.
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    # Derive the hidden size of the block this processor belongs to.
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    # Mix the two processor types: LoRA on the first down block, Custom Diffusion everywhere else.
    if not mix_lora_and_custom or name.startswith("down_blocks.0.attentions"):
        attn_procs[name] = LoRAAttnProcessor(hidden_size, cross_attention_dim)
    else:
        attn_procs[name] = CustomDiffusionAttnProcessor(
            hidden_size=hidden_size,
            cross_attention_dim=cross_attention_dim,
        ).to(unet.device)
unet.set_attn_processor(attn_procs)
pipe.to('cuda:0', torch.bfloat16)
# Raises TypeError: CustomDiffusionAttnProcessor.__call__() does not accept 'scale'.
pipe("boom", cross_attention_kwargs=dict(scale=0.5))

@patrickvonplaten
Contributor

Ah I see, I think we could indeed allow passing the cross_attention_kwargs to custom diffusion. @sayakpaul what do you think?

@sayakpaul
Member

Sure! I like the idea.

Would you maybe like to open a PR? Happy to help.

@eliphatfs
Contributor Author

What solution would you prefer?

@sayakpaul
Member

Supporting a scale argument in the Custom Diffusion attention processor?
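
A minimal sketch of that suggestion, assuming we subclass the processor to accept and silently ignore the extra kwarg (illustrative only, not diffusers code; the __call__ signature mirrors the current CustomDiffusionAttnProcessor):

from diffusers.models.attention_processor import CustomDiffusionAttnProcessor

class ScaleTolerantCustomDiffusionAttnProcessor(CustomDiffusionAttnProcessor):
    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, scale=1.0):
        # `scale` has no meaning for Custom Diffusion; swallow it and
        # delegate to the original implementation unchanged.
        return super().__call__(attn, hidden_states,
                                encoder_hidden_states=encoder_hidden_states,
                                attention_mask=attention_mask)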

@eliphatfs
Contributor Author

Then say I want to mix LoRA with the vanilla AttnProcessor; the same failure occurs there.
Moreover, as new processors are added when new papers come out, we would need to update every attention processor whenever a new kwarg is introduced, and I don't want to do that.
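
For contrast, a sketch of option 3 from the issue body: scope the kwargs by processor type so that only matching processors ever see them. The nested-dict convention and the kwargs_for helper are hypothetical, not diffusers API:

def kwargs_for(processor, scoped_kwargs):
    """Pick out the kwargs addressed to this processor's class, if any."""
    return scoped_kwargs.get(type(processor).__name__, {})

# The pipeline call would then scope `scale` to LoRA processors only:
#   pipe("boom", cross_attention_kwargs={"LoRAAttnProcessor": {"scale": 0.5}})
# and each attention block would dispatch roughly like:
#   self.processor(self, hidden_states,
#                  **kwargs_for(self.processor, cross_attention_kwargs))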

@sayakpaul
Member

How would you have approached it otherwise? Happy to discuss.

@eliphatfs
Contributor Author

I described three solutions that I thought of in the issue body, but none of them seems perfect to me.

@sayakpaul
Member

Okay, then we will keep it open for now.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the stale label on Aug 18, 2023