[Core] add: controlnet support for SDXL #4038
Conversation
The documentation is not available anymore as the PR was closed or merged.
Hi! Do you mean that during validation in training all images are black, but if you manually load the trained checkpoint using an external script, the images are fine?
Exactly.
I think that might relate to the SDXL VAE producing NaNs in some cases in fp16 mode. From https://github.com/kohya-ss/sd-scripts/tree/sdxl:
Also:
Thanks for being willing to help. I think the issue with the VAE is handled. See: https://github.com/huggingface/diffusers/blob/db78a4cb4e3f105cbc7534890f606e25e906e23a/src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl.py#L1118C1-L1133C38. Also, when I run the manual validation, it's in FP16 only.
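Roughly, that handling amounts to temporarily upcasting an fp16 VAE to fp32 just for decoding. The snippet below is a paraphrased sketch of that idea, not the exact code at the link:

```python
import torch
from diffusers import AutoencoderKL


def decode_latents_safely(vae: AutoencoderKL, latents: torch.Tensor) -> torch.Tensor:
    # The SDXL VAE can overflow to NaN in fp16, so temporarily upcast it
    # to fp32 for decoding and cast it back afterwards.
    needs_upcasting = vae.dtype == torch.float16
    if needs_upcasting:
        vae.to(dtype=torch.float32)
        latents = latents.to(torch.float32)
    image = vae.decode(latents / vae.config.scaling_factor, return_dict=False)[0]
    if needs_upcasting:
        vae.to(dtype=torch.float16)
    return image
```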
Ah, really, seems so, thanks. BTW, in the code you mentioned there might be a small bug with an unnecessary
Also, to run your code, I had to put an extra
@gkorepanov thanks so much for your catches. I incorporated the fixes. Let me run the dummy experiment one more time to check quickly.
Is it because autocast is used to generate the validation image? (See `diffusers/examples/controlnet/train_controlnet_sdxl.py`, lines 123 to 126 at 68f2c38.)
kohya-ss's problems also seem to have been caused by autocast.
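If autocast is the culprit, one workaround is to run validation without the autocast context and keep the VAE in fp32. This is only a sketch; `pipeline`, `validation_prompt`, and `validation_image` are assumed to come from the training script:

```python
import torch

# Assumed to exist in the training script: `pipeline` (built from the
# in-training ControlNet weights), `validation_prompt`, `validation_image`.
pipeline.vae.to(dtype=torch.float32)  # keep the VAE out of fp16 to avoid NaNs

generator = torch.Generator(device="cuda").manual_seed(0)
with torch.no_grad():  # no torch.autocast wrapper here
    image = pipeline(
        validation_prompt,
        image=validation_image,
        num_inference_steps=20,
        generator=generator,
    ).images[0]
```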
Cool! Let's make sure we have a working controlnet training run before merging this though :-)
There's a working script in the PR. The issues described in the original post are why I'm seeking reviews.
The difference was caused by a different resolution at inference. By default, the ControlNet pipeline takes the height/width from the control image.
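So if the control image is smaller than the training resolution, inference silently runs at that smaller size; passing `height`/`width` explicitly avoids the mismatch. A sketch, assuming `pipe`, `prompt`, and `control_image` are already defined and that training used 1024x1024:

```python
# Force inference at the training resolution instead of inheriting it
# from the control image.
image = pipe(
    prompt,
    image=control_image,
    height=1024,
    width=1024,
    num_inference_steps=30,
).images[0]
```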
@gkorepanov thanks again for your inputs! Very much appreciated. Let me run a couple of experiments now.
@gkorepanov may I know which GPU model you used for your tests? I am currently using a 40GB A100, and when I try
@patrickvonplaten thanks for all the reviews. A final review and I think we're good to go. Let me know.
Will existing ControlNet 1.1 checkpoints work here?
No.
Cool!
@gkorepanov let's start a PR for adding switching support, and MultiControlNet too, since switching likely impacts that more. Let me know :)
* add: controlnet sdxl.
* modifications to controlnet.
* run styling.
* add: __init__.pys
* incorporate huggingface#4019 changes.
* run make fix-copies.
* resize the conditioning images.
* remove autocast.
* run styling.
* disable autocast.
* debugging
* device placement.
* back to autocast.
* remove comment.
* save some memory by reusing the vae and unet in the pipeline.
* apply styling.
* Allow low precision sd xl
* finish
* finish
* changes to accommodate the improved VAE.
* modifications to how we handle vae encoding in the training.
* make style
* make existing controlnet fast tests pass.
* change vae checkpoint cli arg.
* fix: vae pretrained paths.
* fix: steps in get_scheduler().
* debugging.
* debugging./
* fix: weight conversion.
* add: docs.
* add: limited tests./
* add: datasets to the requirements.
* update docstrings and incorporate the usage of watermarking.
* incorporate fix from huggingface#4083
* fix watermarking dependency handling.
* run make-fix-copies.
* Empty-Commit
* Update requirements_sdxl.txt
* remove vae upcasting part.
* Apply suggestions from code review (Co-authored-by: Patrick von Platen <[email protected]>)
* run make style
* run make fix-copies.
* disable suppot for multicontrolnet.
* Apply suggestions from code review (Co-authored-by: Patrick von Platen <[email protected]>)
* run make fix-copies.
* dtyle/.
* fix-copies.

Co-authored-by: Patrick von Platen <[email protected]>
During training the loss is NaN and the predicted noise contains NaN, which causes training to fail. The training target is to predict the added noise, and the MSE always stays near 1 (~= 1). Also, in log_validation the validation images are always black.
Try passing the following as your VAE:
Additionally, you can ask questions on repositories like the following, which leveraged our training scripts to obtain nice results: https://huggingface.co/thibaud/controlnet-openpose-sdxl-1.0/discussions.
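A sketch of passing an alternative VAE to the pipeline; the fp16-safe checkpoint name below is an assumption (a commonly used choice), not necessarily the one suggested above, and the ControlNet path is a placeholder:

```python
import torch
from diffusers import AutoencoderKL, ControlNetModel, StableDiffusionXLControlNetPipeline

# Assumed fp16-safe SDXL VAE; the ControlNet path is hypothetical.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
controlnet = ControlNetModel.from_pretrained("path/to/your/trained/controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
```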
@zdxpan please make sure to open a new issue instead of commenting on the PR here |
This PR adds support for ControlNets with SDXL. The two primary components being added in this PR are:
* `train_controlnet_sdxl.py`
* `StableDiffusionXLControlNetPipeline` (with changes to `ControlNetModel` to accommodate the pipeline-level changes)
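For context, a minimal sketch of how the new pipeline might be used; the ControlNet checkpoint path and conditioning-image URL are hypothetical placeholders:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Hypothetical SDXL ControlNet checkpoint; the call mirrors the existing
# StableDiffusionControlNetPipeline API.
controlnet = ControlNetModel.from_pretrained("path/to/sdxl-controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

control_image = load_image("https://example.com/conditioning.png")  # hypothetical conditioning image
image = pipe(
    "a red circle on a black background",
    image=control_image,
    num_inference_steps=30,
).images[0]
```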
However, there seems to be something weird going on here. I first started training on a small subset of the dataset (the circles dataset) with the following command:
The trained checkpoints seem to only generate black images: https://huggingface.co/fusing/controlnet-sdxl-circles-fixed (only visible to the `diffusers` team members). To further debug this, I tried:
This doesn't generate the expected results (which is expected since the number of training steps is quite low), but it doesn't generate all-black images either.
@patrickvonplaten @williamberman could you take a deeper look here?
TODOs