[Pipelines] Support for T2I-Adapter #2390

Closed
2 tasks done
wfng92 opened this issue Feb 17, 2023 · 12 comments
Labels
stale Issues that haven't received updates

Comments

@wfng92
Contributor

wfng92 commented Feb 17, 2023

Model/Pipeline/Scheduler description

From the official repository, T2I-Adapter by @TencentARC is

... a simple and small (~70M parameters, ~300M storage space) network that can provide extra guidance to pre-trained text-to-image models while freezing the original large text-to-image models.
T2I-Adapter aligns internal knowledge in T2I models with external control signals. We can train various adapters according to different conditions, and achieve rich control and editing effects.

It would be great to have these plug-and-play adapters in the diffusers library.

Open source status

  • The model implementation is available
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Original code: https://github.com/TencentARC/T2I-Adapter
Pre-trained models: https://huggingface.co/TencentARC/T2I-Adapter

@HimariO
Contributor

HimariO commented Feb 19, 2023

I have made a quick attempt to implement the T2I-Adapter in diffusers, which can be found over here.
Based on the results obtained with the t2iadapter_seg_sd14v1.pth adapter, it appears to be working correctly:
Screenshot from 2023-02-19 15-09-51

The adapter module itself is quite simple, so I think the main considerations for integrating the adapter into diffusers will be:

  • Should the design of the adapter module allow it to inject the adapter hidden state into any layer inside the UNet? (In the official implementation, the adapter state is always added after the last ResnetBlock of each downsample block.)
  • Should the adapter be integrated into the UNet, since the number and size of the feature maps the adapter outputs both depend on the UNet model it is working with?
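
To make the second point concrete, here is a minimal sketch of how the expected adapter feature shapes could be derived from a UNet configuration. The helper names (`expected_adapter_shapes`, `check_adapter_features`) are hypothetical, and the sketch assumes one feature map per downsample block with the spatial size halving at each level. The channel counts and latent size in the usage line are illustrative values chosen to resemble a Stable Diffusion 1.x setup, not something stated in this thread.

```python
from typing import List, Sequence, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def expected_adapter_shapes(block_out_channels: Sequence[int], latent_size: int) -> List[Shape]:
    """Return one (channels, height, width) shape per downsample block.

    Assumption: each successive level halves the spatial resolution.
    """
    shapes: List[Shape] = []
    size = latent_size
    for channels in block_out_channels:
        shapes.append((channels, size, size))
        size //= 2
    return shapes


def check_adapter_features(
    feature_shapes: Sequence[Shape],
    block_out_channels: Sequence[int],
    latent_size: int,
) -> None:
    """Raise ValueError if the adapter output does not match the UNet layout."""
    expected = expected_adapter_shapes(block_out_channels, latent_size)
    if list(feature_shapes) != expected:
        raise ValueError(f"adapter features {list(feature_shapes)} do not match {expected}")


# Illustrative configuration (an assumption, not taken from the thread).
sd_like_shapes = expected_adapter_shapes((320, 640, 1280, 1280), 64)
check_adapter_features(sd_like_shapes, (320, 640, 1280, 1280), 64)
```

A check like this could live either in the adapter or in the UNet; which side owns it is exactly the design question raised above.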

@sayakpaul
Member

Thanks so much for your hard work, @HimariO! Your questions are quite valid. Do you think the design philosophy behind how we integrated LoRA into diffusers would be of help (PR)? I mention it because LoRA also falls under the adapter family of neural nets.

Let me see what other members think.

Cc: @patil-suraj @patrickvonplaten @williamberman

@HimariO
Contributor

HimariO commented Feb 21, 2023

@sayakpaul, thank you for directing me to the LoRA PR. It has been very helpful in giving me a general idea of the design philosophy of similar features. After reviewing the PR for LoRA and the draft PR for ControlNet, I believe we can create a more versatile API that can support T2I-Adapter, ControlNet, and other similar modules that take an independent input and inject their output into the diffusion model. A PoC can be found here.

@haofanwang
Contributor

haofanwang commented Feb 21, 2023

Agreed. As T2I-Adapter and ControlNet (which share similar designs) both require some changes to the UNet, adding more similar pipelines in the future may lead to conflicts. It is necessary to consider how to merge them efficiently into one framework.

@sayakpaul
Member

@HimariO I left a comment directly on your commit. Thanks so much!

We usually consider the code-level impact we might have before accommodating a large change in the API. So, I request @patil-suraj @williamberman @patrickvonplaten @yiyixuxu to chime in here too.

Note that this is a lighter-than-usual week for us, so there might be some delay in our response.

@takuma104
Contributor

My PR for ControlNet (#2407) has been open for a while now. I am also in favor of the Sideload-related changes. The T2I-Adapter and ControlNet share a similar concept in that they both interfere with UNet. I think that the Sideload concept could be a common foundation and have good potential for future extensions. (I have left a comment on the ControlNet thread.)

@williamberman
Contributor

My understanding from a preliminary read of the T2I-Adapter paper is that the outputs from the adapter model are simply added to the intermediate features of the UNet encoder. This shouldn't require any hacking of the existing block definitions and could be done just by passing the outputs of the adapter to the forward method of the UNet.
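
A rough sketch of that idea, using a toy encoder in plain Python rather than the real UNet: `ToyEncoder`, `halve`, and the `adapter_states` argument are all hypothetical names invented for illustration, and lists of floats stand in for feature maps. The only point is the control flow, where optional adapter outputs are added after each down block inside `forward`.

```python
from typing import Callable, List, Optional, Sequence

Feature = List[float]


def halve(x: Feature) -> Feature:
    """Toy downsample block: average adjacent pairs, halving the length."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def add_features(a: Feature, b: Feature) -> Feature:
    """Element-wise sum; the adapter state must match the block output size."""
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)}")
    return [x + y for x, y in zip(a, b)]


class ToyEncoder:
    def __init__(self, down_blocks: Sequence[Callable[[Feature], Feature]]):
        self.down_blocks = list(down_blocks)

    def forward(
        self, sample: Feature, adapter_states: Optional[Sequence[Feature]] = None
    ) -> List[Feature]:
        if adapter_states is not None and len(adapter_states) != len(self.down_blocks):
            raise ValueError("need exactly one adapter state per down block")
        hidden = sample
        outputs: List[Feature] = []
        for i, block in enumerate(self.down_blocks):
            hidden = block(hidden)
            if adapter_states is not None:
                # Add the adapter output to this block's features.
                hidden = add_features(hidden, adapter_states[i])
            outputs.append(hidden)
        return outputs


encoder = ToyEncoder([halve, halve])
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
plain = encoder.forward(x)
guided = encoder.forward(x, adapter_states=[[1.0] * 4, [10.0] * 2])
```

With no `adapter_states` the encoder behaves exactly as before, so the block definitions themselves stay untouched.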

@HimariO
Contributor

HimariO commented Feb 25, 2023

@williamberman your understanding is correct, and what you describe is exactly what I did in my first prototype. The main motivation for trying out new concepts like sideloading is to avoid modifying every sub-module that the adapter/ControlNet-like model interacts with, especially when those modules are buried deep in the module hierarchy or when different adapter variations target different modules.
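
As an illustration only, the sideloading motivation can be sketched with a toy hook mechanism, loosely modeled on the forward-hook pattern found in PyTorch. This is not the PoC's actual API: `Module`, `register_hook`, `Scale`, `Nested`, and `sideload` are hypothetical names, and lists of floats stand in for tensors. The sketch shows a residual being attached to a nested sub-module without changing any `forward` signature.

```python
from typing import Callable, List

Feature = List[float]
Hook = Callable[[Feature], Feature]


class Module:
    """Toy base class: hooks run on the output of every call."""

    def __init__(self) -> None:
        self._hooks: List[Hook] = []

    def register_hook(self, hook: Hook) -> None:
        self._hooks.append(hook)

    def forward(self, x: Feature) -> Feature:
        raise NotImplementedError

    def __call__(self, x: Feature) -> Feature:
        out = self.forward(x)
        for hook in self._hooks:
            out = hook(out)
        return out


class Scale(Module):
    def __init__(self, factor: float) -> None:
        super().__init__()
        self.factor = factor

    def forward(self, x: Feature) -> Feature:
        return [v * self.factor for v in x]


class Nested(Module):
    """The inner module is buried; its parent's forward takes no extra arguments."""

    def __init__(self) -> None:
        super().__init__()
        self.inner = Scale(2.0)
        self.outer = Scale(3.0)

    def forward(self, x: Feature) -> Feature:
        return self.outer(self.inner(x))


def sideload(module: Module, residual: Feature) -> None:
    """Attach a residual that is added to the module's output on each call."""
    module.register_hook(lambda out: [o + r for o, r in zip(out, residual)])


net = Nested()
before = net([1.0, 2.0])
sideload(net.inner, [10.0, 10.0])
after = net([1.0, 2.0])
```

The trade-off is that the injection point is no longer visible in the `forward` signature, which is part of what the API discussion in this thread is weighing.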

@sayakpaul
Member

Thanks for thinking this through, @HimariO! Let us know whenever you're ready with a PR and/or if you need any help.

@AK391

AK391 commented Feb 28, 2023

related: https://github.com/cloneofsimo/t2i-adapter-diffusers

@HimariO
Contributor

HimariO commented Mar 1, 2023

Hi @sayakpaul, just a quick note to let you know that I'm planning on creating the PR this week, and I'll let you know if there are any design-related issues that require further discussion. Thanks!

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
