Using LoRA will deactivate xFormers, even if it is enabled #3551
Comments
By the way, when I tried it with the latest main (a94977b), the log looked a bit strange: even in runs where xFormers is not enabled, the memory usage suggests that it effectively is.
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3312}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 3136}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 4683}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 3254}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 4444}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 3791}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 7186}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 4329}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 6707}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 5403}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 12192}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 3312}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 3136}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 4683}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3254}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 4444}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3791}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 7186}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 4329}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 6707}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 5403}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 12192} |
I have created a patch. It seems to be working as intended. The behavior when xFormers is not enabled remains unchanged.
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3311}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 3135}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 4682}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 3252}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 4444}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 3790}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 7185}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 4327}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 6706}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 5402}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 12191}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2817}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 2819}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 3135}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 3137}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3252}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 3254}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3790}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 3791}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 4327}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 4329}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 5402}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 5403} |
Thanks for the investigation. So that I better understand what the suspect behaviour looks like, I took three snippets from the logs you posted above for the settings "width": 512, "height": 512, "batch": 1.

In the vanilla case:
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 3837}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3837}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2806}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 3837}

We see that with LoRA on, even when the log shows xFormers on, the memory usage is still the same as with xFormers off and LoRA on. This is the faulty behaviour, correct?

From here:
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3312}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 3312}

What are we looking for here? You mentioned the log looked strange: with xFormers and LoRA both off we have 2818 MB, and even when xFormers is on it's the same. So it means xFormers wasn't really off in the first place, correct?

From here:
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 2818}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3311}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2817}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 2819}

Things actually don't seem off here. Is my understanding correct?

Regarding the solution you proposed: I am okay with it. I think we should consider adding a test suite for this case to make the behaviour more robust. Also seeking suggestions from @pcuenca @williamberman @patrickvonplaten.
Yes, correct. I'm starting to think that the reason why it doesn't seem to matter whether xFormers is on or off when LoRA is OFF, in versions other than 0.16.1, is because the default attention processor is already AttnProcessor2_0 (PyTorch 2.0's scaled-dot-product attention), which is memory efficient on its own.
Thanks! I just opened a draft PR for it: #3556.
Is the summary just that the non-xformers LoRA attention is being used when xformers is enabled?
Yes! In my environment above, where both LoRA and xFormers are OFF, the following passes, so it seems that the reason the memory usage is low is not that xFormers is enabled, but that PyTorch 2.0's AttnProcessor2_0 is being used:

from diffusers.models.attention_processor import Attention, AttnProcessor2_0

for _, module in pipe.unet.named_modules():
    if isinstance(module, Attention):
        # the default PyTorch 2.0 SDPA processor, not an xFormers one
        assert isinstance(module.processor, AttnProcessor2_0)
We should actually add an AttnProcessor2_0 LoRA class here.
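For orientation, a greatly simplified sketch of what such a LoRA processor built on torch 2.0's scaled_dot_product_attention could look like; this is not the eventual diffusers implementation, and it omits attention-mask preparation, group norm, and other details of the real Attention class.

```python
# Simplified, illustrative sketch of a LoRA processor using PyTorch 2.0 SDPA.
import torch
import torch.nn.functional as F


class LoRALinearLayer(torch.nn.Module):
    """Minimal low-rank adapter: y = up(down(x)), initialized as a no-op."""

    def __init__(self, in_features, out_features, rank=4):
        super().__init__()
        self.down = torch.nn.Linear(in_features, rank, bias=False)
        self.up = torch.nn.Linear(rank, out_features, bias=False)
        torch.nn.init.zeros_(self.up.weight)

    def forward(self, x):
        return self.up(self.down(x))


class LoRAAttnProcessor2_0(torch.nn.Module):
    """LoRA-augmented attention that delegates to torch 2.0 scaled_dot_product_attention."""

    def __init__(self, hidden_size, cross_attention_dim=None, rank=4):
        super().__init__()
        kv_dim = cross_attention_dim or hidden_size
        self.to_q_lora = LoRALinearLayer(hidden_size, hidden_size, rank)
        self.to_k_lora = LoRALinearLayer(kv_dim, hidden_size, rank)
        self.to_v_lora = LoRALinearLayer(kv_dim, hidden_size, rank)
        self.to_out_lora = LoRALinearLayer(hidden_size, hidden_size, rank)

    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, scale=1.0):
        context = encoder_hidden_states if encoder_hidden_states is not None else hidden_states

        # base projections plus scaled LoRA deltas
        query = attn.to_q(hidden_states) + scale * self.to_q_lora(hidden_states)
        key = attn.to_k(context) + scale * self.to_k_lora(context)
        value = attn.to_v(context) + scale * self.to_v_lora(context)

        batch, q_len, _ = query.shape
        head_dim = query.shape[-1] // attn.heads
        query, key, value = (
            t.view(batch, -1, attn.heads, head_dim).transpose(1, 2)
            for t in (query, key, value)
        )

        # memory-efficient attention via PyTorch 2.0 (mask handling simplified here)
        out = F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask)
        out = out.transpose(1, 2).reshape(batch, q_len, attn.heads * head_dim)

        # output projection (to_out[0]) plus its LoRA delta, then dropout (to_out[1])
        out = attn.to_out[0](out) + scale * self.to_out_lora(out)
        return attn.to_out[1](out)
```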
In line with #3464 @patrickvonplaten
Closing with #3556 |
Describe the bug
Discussed in #3437 (comment). It appears that memory usage increases significantly when using LoRA, even in environments using xFormers. I investigated the cause using the script linked under Reproduction. The results suggest that even in environments where xFormers is enabled, loading LoRA weights replaces the attention processors with the non-xFormers LoRA processor, effectively resulting in the same situation as if xFormers had been deactivated.
As a solution, it seems good to use LoRAXFormersAttnProcessor instead of LoRAAttnProcessor if xFormers is enabled in this part:

diffusers/src/diffusers/loaders.py, lines 275 to 286 at a94977b
What do you think?
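Not part of the original report: a minimal sketch of the selection logic being proposed, assuming the surrounding load_attn_procs code can see the current processor of each attention module. The helper name and its arguments are illustrative only, not the actual loaders.py code (the actual fix landed in #3556).

```python
# Illustrative sketch: pick the xFormers-aware LoRA processor when the attention
# module is already running an xFormers processor, otherwise keep the plain one.
from diffusers.models.attention_processor import (
    LoRAAttnProcessor,
    LoRAXFormersAttnProcessor,
    XFormersAttnProcessor,
)


def make_lora_processor(current_processor, hidden_size, cross_attention_dim, rank=4):
    # hypothetical helper; in loaders.py this decision would happen inline
    if isinstance(current_processor, XFormersAttnProcessor):
        return LoRAXFormersAttnProcessor(
            hidden_size=hidden_size, cross_attention_dim=cross_attention_dim, rank=rank
        )
    return LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim, rank=rank
    )
```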
Reproduction
https://gist.github.com/takuma104/e2139bda7f74cd977350e18500156683
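The gist above is the authoritative reproduction script. For orientation only, the general shape of such a peak-memory measurement might look like the sketch below; the model ID, LoRA path, and prompt are placeholders and are not taken from the gist.

```python
# Rough sketch of the measurement pattern (placeholders, not the gist itself).
import json
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # placeholder model
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()           # toggled per configuration
pipe.load_lora_weights("path/to/lora_weights.safetensors")  # placeholder LoRA, toggled per configuration

torch.cuda.reset_peak_memory_stats()
pipe("a photo of an astronaut", width=512, height=512, num_images_per_prompt=1)
mem_mb = torch.cuda.max_memory_allocated() // 2**20
print(json.dumps({"width": 512, "height": 512, "batch": 1, "mem_MB": int(mem_mb)}))
```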
Logs
System Info
diffusers version: 0.16.1