
[SDXL DreamBooth LoRA] add support for text encoder fine-tuning #4097


Merged

merged 45 commits into main from feat/sdxl-dreambooth-returns on Jul 25, 2023

Conversation

Member

@sayakpaul sayakpaul commented Jul 14, 2023

This PR adds support for text encoder fine-tuning in the DreamBooth LoRA script for SDXL.

Summary of the changes:

  • Major refactor of the dataloader and the collator to accommodate training the text encoders.
  • Support for the numerically stable (fp16-safe) VAE, which needs some type-casting here and there (Allow low precision vae sd xl #4083); see the sketch right after this list.
  • Changes to the LoRA loaders.
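Concretely, the fp16-safe VAE point boils down to swapping a numerically stable VAE into the pipeline so everything can run in half precision. A rough, hypothetical sketch (repo ids are the ones used in the test commands below, not code from this PR):

import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# Swap in the fp16-safe VAE so the whole pipeline can run in half precision
# without the decoded images degenerating into NaNs / black outputs.
vae = AutoencoderKL.from_pretrained(
    "sayakpaul/sdxl-vae-fp16-fix-testing", torch_dtype=torch.float16
)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-base-0.9", vae=vae, torch_dtype=torch.float16
).to("cuda")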

To help us maintain sanity, I tested the current training script under three settings:

  1. No text encoder and no better VAE
export MODEL_NAME="diffusers/stable-diffusion-xl-base-0.9"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="lora-trained-xl-no-vae-text-encoder"

accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=75 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub

Artifacts:

  2. No text encoder but better VAE
export MODEL_NAME="diffusers/stable-diffusion-xl-base-0.9"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="lora-trained-xl-no-text-encoder"

accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --pretrained_vae_model_name_or_path="sayakpaul/sdxl-vae-fp16-fix-testing" \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=75 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub

Artifacts:

  3. Better VAE along with the text encoder
export MODEL_NAME="diffusers/stable-diffusion-xl-base-0.9"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="lora-trained-xl-text-encoder-vae"

accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --pretrained_vae_model_name_or_path="sayakpaul/sdxl-vae-fp16-fix-testing" \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=75 \
  --train_text_encoder \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub

Artifacts:
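Once one of these runs finishes, the trained LoRA (UNet plus both text encoders when --train_text_encoder is set) can be tried out roughly like this (a minimal sketch; the path is whatever OUTPUT_DIR or Hub repo the run produced):

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16
).to("cuda")

# The directory/repo produced above holds a single LoRA weights file whose keys
# are prefixed with "unet.", "text_encoder." and "text_encoder_2.".
pipe.load_lora_weights("lora-trained-xl-text-encoder-vae")

image = pipe("A photo of sks dog in a bucket", num_inference_steps=25).images[0]
image.save("sks_dog.png")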


HuggingFaceDocBuilderDev commented Jul 14, 2023

The documentation is not available anymore as the PR was closed or merged.

@sayakpaul sayakpaul marked this pull request as ready for review July 14, 2023 14:17
@sayakpaul sayakpaul requested review from williamberman and patrickvonplaten and removed request for williamberman July 14, 2023 14:43
@sayakpaul sayakpaul requested a review from williamberman July 21, 2023 09:56
@sayakpaul
Member Author

@patrickvonplaten @williamberman I think I have addressed all your comments:

  • Simplification of the dataloader
  • Less state dict munging

I would suggest taking another, deeper look.

@@ -809,3 +810,66 @@ def __call__(
            return (image,)

        return StableDiffusionXLPipelineOutput(images=image)

    # Override to properly handle the loading and unloading of the additional text encoder.
Contributor

ok for me!
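For readers who have not opened the diff: the override boils down to splitting the combined LoRA state dict by prefix and loading each piece into the right component. A condensed sketch of that idea (helper names come from LoraLoaderMixin; the merged code may differ in details such as scaling kwargs):

def load_lora_weights(self, pretrained_model_name_or_path_or_dict, **kwargs):
    # Resolve the checkpoint into a flat state dict (plus any network alphas).
    state_dict, network_alphas = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)

    # UNet LoRA layers.
    self.load_lora_into_unet(state_dict, network_alphas=network_alphas, unet=self.unet)

    # First text encoder ("text_encoder." prefix).
    te_state_dict = {k: v for k, v in state_dict.items() if "text_encoder." in k}
    if te_state_dict:
        self.load_lora_into_text_encoder(
            te_state_dict, network_alphas=network_alphas,
            text_encoder=self.text_encoder, prefix="text_encoder",
        )

    # Second text encoder ("text_encoder_2." prefix).
    te_2_state_dict = {k: v for k, v in state_dict.items() if "text_encoder_2." in k}
    if te_2_state_dict:
        self.load_lora_into_text_encoder(
            te_2_state_dict, network_alphas=network_alphas,
            text_encoder=self.text_encoder_2, prefix="text_encoder_2",
        )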

# needed for the SD XL UNet to operate.
def compute_embeddings(prompt, text_encoders, tokenizers):
def compute_time_ids():
    # Adapted from pipeline.StableDiffusionXLPipeline._get_add_time_ids
    original_size = (args.resolution, args.resolution)
Contributor

This should ideally be the original size of the passed image (before resizing), but ok to leave as is for now
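To make the point concrete, here is a rough, hypothetical sketch of what these time ids contain (function name and arguments are illustrative, mirroring the _get_add_time_ids logic rather than the script's exact code):

import torch

# SDXL's UNet is micro-conditioned on (original_size, crops_coords_top_left,
# target_size), packed into a single "add_time_ids" tensor.
def compute_time_ids(original_size, crops_coords_top_left, target_size, dtype=torch.float32):
    add_time_ids = list(original_size + crops_coords_top_left + target_size)
    return torch.tensor([add_time_ids], dtype=dtype)

# The script currently passes (args.resolution, args.resolution) as original_size;
# the comment above suggests it should ideally be the image size before resizing.
time_ids = compute_time_ids((1024, 1024), (0, 0), (1024, 1024))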

Contributor

@patrickvonplaten patrickvonplaten left a comment

Nice!

Co-authored-by: Patrick von Platen <[email protected]>
@patrickvonplaten
Contributor

@williamberman ok for you?

Comment on lines +840 to +863

def save_lora_weights(
    self,
    save_directory: Union[str, os.PathLike],
    unet_lora_layers: Dict[str, Union[torch.nn.Module, torch.Tensor]] = None,
    text_encoder_lora_layers: Dict[str, Union[torch.nn.Module, torch.Tensor]] = None,
    text_encoder_2_lora_layers: Dict[str, Union[torch.nn.Module, torch.Tensor]] = None,
    is_main_process: bool = True,
    weight_name: str = None,
    save_function: Callable = None,
    safe_serialization: bool = False,
):
    state_dict = {}

    def pack_weights(layers, prefix):
        layers_weights = layers.state_dict() if isinstance(layers, torch.nn.Module) else layers
        layers_state_dict = {f"{prefix}.{module_name}": param for module_name, param in layers_weights.items()}
        return layers_state_dict

    state_dict.update(pack_weights(unet_lora_layers, "unet"))

    if text_encoder_lora_layers and text_encoder_2_lora_layers:
        state_dict.update(pack_weights(text_encoder_lora_layers, "text_encoder"))
        state_dict.update(pack_weights(text_encoder_2_lora_layers, "text_encoder_2"))

Contributor

<3
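For context, the training script is expected to hand its LoRA layers to this method roughly as follows (a sketch; the call site and variable names are placeholders, not the exact merged code). Each set of layers ends up under its own prefix in a single weights file:

# Sketch: pack UNet and both text-encoder LoRA layers into one file whose keys
# are namespaced "unet.", "text_encoder." and "text_encoder_2.".
StableDiffusionXLPipeline.save_lora_weights(
    save_directory=args.output_dir,
    unet_lora_layers=unet_lora_layers_to_save,
    text_encoder_lora_layers=text_encoder_one_lora_layers_to_save,
    text_encoder_2_lora_layers=text_encoder_two_lora_layers_to_save,
)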

Comment on lines 430 to 436

 class DreamBoothDataset(Dataset):
     """
     A dataset to prepare the instance and class images with the prompts for fine-tuning the model.
-    It pre-processes the images and the tokenizes prompts.
+    It pre-processes the images.
     """

     def __init__(
Contributor

Magnificent! ("c'est magnifique")
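Since tokenization no longer lives in the dataset, the collator only has to stack images and pass prompt strings through; embeddings are computed outside (once up front, or per step when the text encoders are trained). A rough sketch of that pattern (dictionary keys are illustrative, not necessarily the script's exact names):

import torch

def collate_fn(examples):
    # Stack pre-processed images into a single batch tensor.
    pixel_values = torch.stack([example["instance_images"] for example in examples])
    pixel_values = pixel_values.to(memory_format=torch.contiguous_format).float()
    # Prompts stay as raw strings; tokenization/embedding happens elsewhere.
    prompts = [example["instance_prompt"] for example in examples]
    return {"pixel_values": pixel_values, "prompts": prompts}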

Contributor

@williamberman williamberman left a comment

perfect, lgtm!

@sayakpaul sayakpaul merged commit 365e846 into main Jul 25, 2023
@sayakpaul sayakpaul deleted the feat/sdxl-dreambooth-returns branch July 25, 2023 00:05
@sayakpaul
Member Author

Thanks all for your suggestions.

orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
[SDXL DreamBooth LoRA] add support for text encoder fine-tuning (huggingface#4097)

* Allow low precision sd xl

* finish

* finish

* feat: initial draft for supporting text encoder lora finetuning for SDXL DreamBooth

* fix: variable assignments.

* add: autocast block.

* add debugging

* vae dtype hell

* fix: vae dtype hell.

* fix: vae dtype hell 3.

* clean up

* lora text encoder loader.

* fix: unwrapping models.

* add: tests.

* docs.

* handle unexpected keys.

* fix vae dtype in the final inference.

* fix scope problem.

* fix: save_model_card args.

* initialize: prefix to None.

* fix: dtype issues.

* apply gixes.

* debgging.

* debugging

* debugging

* debugging

* debugging

* debugging

* add: fast tests.

* pre-tokenize.

* address: will's comments.

* fix: loader and tests.

* fix: dataloader.

* simplify dataloader.

* length.

* simplification.

* make style && make quality

* simplify state_dict munging

* fix: tests.

* fix: state_dict packing.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

---------

Co-authored-by: Patrick von Platen <[email protected]>
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023

orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024