From 58a930f5606a6e0193449d2d92fee9bfcd0445f6 Mon Sep 17 00:00:00 2001 From: Sayak Paul Date: Tue, 24 Jan 2023 13:39:30 +0530 Subject: [PATCH 1/5] add: a doc on LoRA support in diffusers. --- docs/source/en/_toctree.yml | 2 + docs/source/en/training/lora.mdx | 128 +++++++++++++++++++++++++++ docs/source/en/training/overview.mdx | 1 + examples/text_to_image/README.md | 6 +- 4 files changed, 134 insertions(+), 3 deletions(-) create mode 100644 docs/source/en/training/lora.mdx diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml index e580181251a9..c463fd843cec 100644 --- a/docs/source/en/_toctree.yml +++ b/docs/source/en/_toctree.yml @@ -71,6 +71,8 @@ title: Dreambooth - local: training/text2image title: Text-to-image fine-tuning + - local: training/lora + title: LoRA Support in Diffusers title: Training - sections: - local: conceptual/philosophy diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx new file mode 100644 index 000000000000..81147d977d2e --- /dev/null +++ b/docs/source/en/training/lora.mdx @@ -0,0 +1,128 @@ + + +# LoRA Support in Diffusers + +Diffusers support LoRA for Stable Diffusion for faster fine-tuning allowing greater memory efficiency and easier portability. + +Low-Rank Adaption of Large Language Models was first introduced by Microsoft in +[LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685) by *Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen*. + +In a nutshell, LoRA allows adapting pretrained models by adding pairs of rank-decomposition weight matrices (called **update marrices**) +to existing weights and **only** training those newly added weights. This has a couple of advantages: + +- Previous pretrained weights are kept frozen so that model is not prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114). +- Rank-decomposition matrices have significantly fewer parameters than original model, which means that trained LoRA weights are easily portable. +- LoRA attention layers allow to control to which extent the model is adapted toward new training images via a `scale` parameter. + +[cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository. + + + +LoRA also allows us to achieve greater memory efficiency since the pretrained weights are kept frozen, only the LoRA weights are trained, thereby +allowing us to run fine-tuning on consumer GPUs like Tesla T4. 
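To make the rank-decomposition idea concrete, here is a minimal sketch of what a LoRA-augmented linear layer could look like. This is an illustrative toy (the `LoRALinear` class, its initialization scheme, and the chosen rank are assumptions made for this example), not the implementation Diffusers actually uses:

```py
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Toy LoRA layer: y = W(x) + scale * B(A(x)), where W is frozen and only A, B train."""

    def __init__(self, base_layer: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base_layer = base_layer
        self.base_layer.requires_grad_(False)  # pretrained weights stay frozen

        # The pair of rank-decomposition update matrices: these are the only
        # trainable parameters, which is why the saved weights are so small.
        self.lora_down = nn.Linear(base_layer.in_features, rank, bias=False)
        self.lora_up = nn.Linear(rank, base_layer.out_features, bias=False)
        nn.init.normal_(self.lora_down.weight, std=1.0 / rank)
        nn.init.zeros_(self.lora_up.weight)  # the update starts out as a no-op

        self.scale = scale  # controls how strongly the adaptation is applied

    def forward(self, hidden_states):
        return self.base_layer(hidden_states) + self.scale * self.lora_up(
            self.lora_down(hidden_states)
        )


layer = LoRALinear(nn.Linear(768, 768), rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in layer.base_layer.parameters())
print(trainable, frozen)  # 6144 trainable vs. 590592 frozen parameters
```

Setting `scale` to `0.0` recovers the frozen pretrained layer exactly, which is what makes it cheap to dial the strength of the adaptation up or down at inference time.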
+ + + +## Getting started with LoRA for fine-tuning + +Stable Diffusion can be fine-tuned in different ways: + +* [Textual inversion](https://huggingface.co/docs/diffusers/main/en/training/text_inversion) +* [DreamBooth](https://huggingface.co/docs/diffusers/main/en/training/dreambooth) +* [Text2Image fine-tuning](https://huggingface.co/docs/diffusers/main/en/training/text2image) + +We provide two end-to-end examples that show how to run fine-tuning with LoRA: + +* [DreamBooth](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#training-with-low-rank-adaptation-of-large-language-models-lora) +* [Text2Image](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image#training-with-lora) + +If you want to perform DreamBooth training with LoRA, for instance, you would run: + +```bash +export MODEL_NAME="runwayml/stable-diffusion-v1-5" +export INSTANCE_DIR="path-to-instance-images" +export OUTPUT_DIR="path-to-save-model" + +accelerate launch train_dreambooth_lora.py \ + --pretrained_model_name_or_path=$MODEL_NAME \ + --instance_data_dir=$INSTANCE_DIR \ + --output_dir=$OUTPUT_DIR \ + --instance_prompt="a photo of sks dog" \ + --resolution=512 \ + --train_batch_size=1 \ + --gradient_accumulation_steps=1 \ + --checkpointing_steps=100 \ + --learning_rate=1e-4 \ + --report_to="wandb" \ + --lr_scheduler="constant" \ + --lr_warmup_steps=0 \ + --max_train_steps=500 \ + --validation_prompt="A photo of sks dog in a bucket" \ + --validation_epochs=50 \ + --seed="0" \ + --push_to_hub +``` + +Refer to the respective examples linked above to learn more. + + + +When using LoRA we can use a much higher learning rate (typically 1e-4 as opposed to 1e-5) compared to non-LoRA fine-tuning. + + + +But there is no free lunch. For the given dataset and expected generation quality, you'd still need to experiment with +different hyperparameters. Here are some important ones: + +* Training time + * Learning rate + * Number of training steps +* Inference time + * Number of steps + * Scheduler type + +Additionally, you can follow [this blog](https://huggingface.co/blog/dreambooth) that documents some of our experimental +findings for performing DreamBooth training Stable Diffusion. + +When fine-tuning, the LoRA update matrices are only added to the attention layers. To enable this, we added new weight +loading functionalities. Their details are available [here](https://huggingface.co/docs/diffusers/main/en/api/loaders). + +## Inference + +Assuming, you used the `examples/text_to_image/train_text_to_image_lora.py` to fine-tune Stable Diffusion on the [Pokemons +dataset](https://huggingface.co/lambdalabs/pokemon-blip-captions), you can perform inference like so: + +```py +from diffusers import StableDiffusionPipeline +import torch + +model_path = "sayakpaul/sd-model-finetuned-lora-t4" +pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16) +pipe.unet.load_attn_procs(model_path) +pipe.to("cuda") + +prompt = "A pokemon with green eyes and red legs." +image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0] +image.save("pokemon.png") +``` + +[`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) contains [LoRA fine-tuned update matrices](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) +which is only 3 MBs in size. 
During inference, the pre-trained Stable Diffusion checkpoints loaded alongside these update
matrices and then they are combined to run inference.

Inference for DreamBooth training remains the same. Check
[this section](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#inference-1) for more details.

## Known limitations

* Currently, we only support LoRA for the attention layers of [`UNet2DConditionModel`](https://huggingface.co/docs/diffusers/main/en/api/models#diffusers.UNet2DConditionModel).
diff --git a/docs/source/en/training/overview.mdx b/docs/source/en/training/overview.mdx
index fd6ec184d274..49aab9aa3647 100644
--- a/docs/source/en/training/overview.mdx
+++ b/docs/source/en/training/overview.mdx
@@ -37,6 +37,7 @@ Training examples show how to pretrain or fine-tune diffusion models for a varie
- [Text-to-Image Training](./text2image)
- [Text Inversion](./text_inversion)
- [Dreambooth](./dreambooth)
+- [LoRA Support](./lora)

If possible, please [install xFormers](../optimization/xformers) for memory efficient attention. This could help make your training faster and less memory intensive.
diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md
index c9b10ea18a8c..9d7cbdf30d34 100644
--- a/examples/text_to_image/README.md
+++ b/examples/text_to_image/README.md
@@ -162,9 +162,9 @@ accelerate --mixed_precision="fp16" launch train_text_to_image_lora.py \

The above command will also run inference as fine-tuning progresses and log the results to Weights and Biases.

-**___Note: When using LoRA we can use a much higher learning rate compared to non-LoRA fine-tuning. Here we use *1e-4* instead of the usual *1e-5*. Also, by using LoRA, it's possible to run `train_text_to_image_lora.py` in consumer GPUs like T4 or V100.**
+**___Note: When using LoRA we can use a much higher learning rate compared to non-LoRA fine-tuning. Here we use *1e-4* instead of the usual *1e-5*. Also, by using LoRA, it's possible to run `train_text_to_image_lora.py` on consumer GPUs like T4 or V100.___**

-The final LoRA embedding weights have been uploaded to [sayakpaul/sd-model-finetuned-lora-t4](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4). **___Note: [The final weights](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) are only 3 MB in size, which is orders of magnitudes smaller than the original model.**
+The final LoRA embedding weights have been uploaded to [sayakpaul/sd-model-finetuned-lora-t4](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4). **___Note: [The final weights](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) are only 3 MB in size, which is orders of magnitude smaller than the original model.___**

You can check some inference samples that were logged during the course of the fine-tuning process [here](https://wandb.ai/sayakpaul/text2image-fine-tune/runs/q4lc0xsw).

@@ -191,7 +191,7 @@ image.save("pokemon.png")

For faster training on TPUs and GPUs you can leverage the flax training example. Follow the instructions above to get the model and dataset before running the script.
-____Note: The flax example don't yet support features like gradient checkpoint, gradient accumulation etc, so to use flax for faster training we will need >30GB cards.___ +**___Note: The flax example don't yet support features like gradient checkpoint, gradient accumulation etc, so to use flax for faster training we will need >30GB cards.___** Before running the scripts, make sure to install the library's training dependencies: From 233d64683283a4849231cc0d21db86eace2c3f35 Mon Sep 17 00:00:00 2001 From: Sayak Paul Date: Wed, 25 Jan 2023 08:54:34 +0530 Subject: [PATCH 2/5] Apply suggestions from code review Co-authored-by: Pedro Cuenca --- docs/source/en/training/lora.mdx | 22 +++++++++++----------- examples/text_to_image/README.md | 2 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx index 81147d977d2e..bc11e7aa99ad 100644 --- a/docs/source/en/training/lora.mdx +++ b/docs/source/en/training/lora.mdx @@ -12,24 +12,24 @@ specific language governing permissions and limitations under the License. # LoRA Support in Diffusers -Diffusers support LoRA for Stable Diffusion for faster fine-tuning allowing greater memory efficiency and easier portability. +Diffusers supports LoRA for faster fine-tuning of Stable Diffusion, allowing greater memory efficiency and easier portability. Low-Rank Adaption of Large Language Models was first introduced by Microsoft in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685) by *Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen*. -In a nutshell, LoRA allows adapting pretrained models by adding pairs of rank-decomposition weight matrices (called **update marrices**) +In a nutshell, LoRA allows adapting pretrained models by adding pairs of rank-decomposition weight matrices (called **update matrices**) to existing weights and **only** training those newly added weights. This has a couple of advantages: -- Previous pretrained weights are kept frozen so that model is not prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114). -- Rank-decomposition matrices have significantly fewer parameters than original model, which means that trained LoRA weights are easily portable. +- Previous pretrained weights are kept frozen so that the model is not so prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114). +- Rank-decomposition matrices have significantly fewer parameters than the original model, which means that trained LoRA weights are easily portable. - LoRA attention layers allow to control to which extent the model is adapted toward new training images via a `scale` parameter. [cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository. -LoRA also allows us to achieve greater memory efficiency since the pretrained weights are kept frozen, only the LoRA weights are trained, thereby -allowing us to run fine-tuning on consumer GPUs like Tesla T4. +LoRA allows us to achieve greater memory efficiency since the pretrained weights are kept frozen and only the LoRA weights are trained, thereby +allowing us to run fine-tuning on consumer GPUs like Tesla T4, RTX 3080 or even RTX 2080 Ti! @@ -77,7 +77,7 @@ Refer to the respective examples linked above to learn more. 
-When using LoRA we can use a much higher learning rate (typically 1e-4 as opposed to 1e-5) compared to non-LoRA fine-tuning. +When using LoRA we can use a much higher learning rate (typically 1e-4 as opposed to ~1e-6) compared to non-LoRA Dreambooth fine-tuning. @@ -92,15 +92,15 @@ different hyperparameters. Here are some important ones: * Scheduler type Additionally, you can follow [this blog](https://huggingface.co/blog/dreambooth) that documents some of our experimental -findings for performing DreamBooth training Stable Diffusion. +findings for performing DreamBooth training of Stable Diffusion. When fine-tuning, the LoRA update matrices are only added to the attention layers. To enable this, we added new weight loading functionalities. Their details are available [here](https://huggingface.co/docs/diffusers/main/en/api/loaders). ## Inference -Assuming, you used the `examples/text_to_image/train_text_to_image_lora.py` to fine-tune Stable Diffusion on the [Pokemons -dataset](https://huggingface.co/lambdalabs/pokemon-blip-captions), you can perform inference like so: +Assuming you used the `examples/text_to_image/train_text_to_image_lora.py` to fine-tune Stable Diffusion on the [Pokemon +dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions), you can perform inference like so: ```py from diffusers import StableDiffusionPipeline @@ -117,7 +117,7 @@ image.save("pokemon.png") ``` [`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) contains [LoRA fine-tuned update matrices](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) -which is only 3 MBs in size. During inference, the pre-trained Stable Diffusion checkpoints loaded alongside these update +which is only 3 MBs in size. During inference, the pre-trained Stable Diffusion checkpoints are loaded alongside these update matrices and then they are combined to run inference. Inference for DreamBooth training remains the same. Check diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md index 9d7cbdf30d34..31b00e943241 100644 --- a/examples/text_to_image/README.md +++ b/examples/text_to_image/README.md @@ -191,7 +191,7 @@ image.save("pokemon.png") For faster training on TPUs and GPUs you can leverage the flax training example. Follow the instructions above to get the model and dataset before running the script. -**___Note: The flax example don't yet support features like gradient checkpoint, gradient accumulation etc, so to use flax for faster training we will need >30GB cards.___** +**___Note: The flax example doesn't yet support features like gradient checkpoint, gradient accumulation etc, so to use flax for faster training we will need >30GB cards or TPU v3.___** Before running the scripts, make sure to install the library's training dependencies: From 72814aa9959ef57d1fda6a1c0180d5607482860d Mon Sep 17 00:00:00 2001 From: Sayak Paul Date: Wed, 25 Jan 2023 09:40:34 +0530 Subject: [PATCH 3/5] apply PR suggestions. --- docs/source/en/training/lora.mdx | 39 +++++++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx index bc11e7aa99ad..ca2536989b09 100644 --- a/docs/source/en/training/lora.mdx +++ b/docs/source/en/training/lora.mdx @@ -22,14 +22,19 @@ to existing weights and **only** training those newly added weights. 
This has a - Previous pretrained weights are kept frozen so that the model is not so prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114). - Rank-decomposition matrices have significantly fewer parameters than the original model, which means that trained LoRA weights are easily portable. -- LoRA attention layers allow to control to which extent the model is adapted toward new training images via a `scale` parameter. +- LoRA matrices are generally added to the attention layers of the original model and they control to control to which extent the model is adapted toward new training images via a `scale` parameter. + +**__Note that the usage of LoRA is not limited to only attention layers. In the original LoRA work, the authors found out that just ammending +the attention layers of a language model is sufficient to obtain good downstream performance with great efficiency. This is why, it's common +to just add the LoRA weights to the attention layers of a model.__** [cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository. LoRA allows us to achieve greater memory efficiency since the pretrained weights are kept frozen and only the LoRA weights are trained, thereby -allowing us to run fine-tuning on consumer GPUs like Tesla T4, RTX 3080 or even RTX 2080 Ti! +allowing us to run fine-tuning on consumer GPUs like Tesla T4, RTX 3080 or even RTX 2080 Ti! One can get access to GPUs like T4 in the free +tiers of Kaggle Kernels and Google Colab Notebooks. @@ -73,6 +78,9 @@ accelerate launch train_dreambooth_lora.py \ --push_to_hub ``` +A similar process can be followed to fully fine-tune Stable Diffusion on a custom dataset using the +`examples/text_to_image/train_text_to_image_lora.py` script. + Refer to the respective examples linked above to learn more. @@ -111,15 +119,40 @@ pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", pipe.unet.load_attn_procs(model_path) pipe.to("cuda") -prompt = "A pokemon with green eyes and red legs." +prompt = "A pokemon with blue eyes." image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0] image.save("pokemon.png") ``` +Here are some example images you can expect: + +
+ +
+ [`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) contains [LoRA fine-tuned update matrices](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) which is only 3 MBs in size. During inference, the pre-trained Stable Diffusion checkpoints are loaded alongside these update matrices and then they are combined to run inference. + + +You can use the [`huggingface_hub`](https://github.com/huggingface/huggingface_hub) library to retrieve the base model +from [`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) like so: + +```py +from huggingface_hub.repocard import RepoCard + +card = RepoCard.load("sayakpaul/sd-model-finetuned-lora-t4") +base_model = card.data.to_dict()["base_model"] +# 'CompVis/stable-diffusion-v1-4' +``` + +And then you can use `pipe = StableDiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.float16)`. + +This is especially useful when you don't want to hardcode the base model identifier during initializing the `StableDiffusionPipeline`. + + + Inference for DreamBooth training remains the same. Check [this section](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#inference-1) for more details. From 7f23db648306493da83213e1cf3fedecd79a63de Mon Sep 17 00:00:00 2001 From: Sayak Paul Date: Wed, 25 Jan 2023 13:42:55 +0530 Subject: [PATCH 4/5] Apply suggestions from code review Co-authored-by: Pedro Cuenca --- docs/source/en/training/lora.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx index ca2536989b09..c0bad7c7035d 100644 --- a/docs/source/en/training/lora.mdx +++ b/docs/source/en/training/lora.mdx @@ -22,9 +22,9 @@ to existing weights and **only** training those newly added weights. This has a - Previous pretrained weights are kept frozen so that the model is not so prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114). - Rank-decomposition matrices have significantly fewer parameters than the original model, which means that trained LoRA weights are easily portable. -- LoRA matrices are generally added to the attention layers of the original model and they control to control to which extent the model is adapted toward new training images via a `scale` parameter. +- LoRA matrices are generally added to the attention layers of the original model and they control to which extent the model is adapted toward new training images via a `scale` parameter. -**__Note that the usage of LoRA is not limited to only attention layers. In the original LoRA work, the authors found out that just ammending +**__Note that the usage of LoRA is not just limited to attention layers. In the original LoRA work, the authors found out that just amending the attention layers of a language model is sufficient to obtain good downstream performance with great efficiency. This is why, it's common to just add the LoRA weights to the attention layers of a model.__** From 76562ea4763bb51e4b9c92b06a3806291d8433f6 Mon Sep 17 00:00:00 2001 From: Sayak Paul Date: Wed, 25 Jan 2023 13:45:19 +0530 Subject: [PATCH 5/5] remove visually incoherent elements. 
--- docs/source/en/training/lora.mdx | 6 ------ 1 file changed, 6 deletions(-) diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx index c0bad7c7035d..e863e9d56d86 100644 --- a/docs/source/en/training/lora.mdx +++ b/docs/source/en/training/lora.mdx @@ -126,16 +126,12 @@ image.save("pokemon.png") Here are some example images you can expect: -
-
[`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) contains [LoRA fine-tuned update matrices](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4/blob/main/pytorch_lora_weights.bin) which is only 3 MBs in size. During inference, the pre-trained Stable Diffusion checkpoints are loaded alongside these update matrices and then they are combined to run inference. - - You can use the [`huggingface_hub`](https://github.com/huggingface/huggingface_hub) library to retrieve the base model from [`sayakpaul/sd-model-finetuned-lora-t4`](https://huggingface.co/sayakpaul/sd-model-finetuned-lora-t4) like so: @@ -151,8 +147,6 @@ And then you can use `pipe = StableDiffusionPipeline.from_pretrained(base_model, This is especially useful when you don't want to hardcode the base model identifier during initializing the `StableDiffusionPipeline`. - - Inference for DreamBooth training remains the same. Check [this section](https://github.com/huggingface/diffusers/tree/main/examples/dreambooth#inference-1) for more details.
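Putting the snippets from this patch series together, end-to-end inference with the LoRA weights can be sketched as follows. This recap only combines calls already shown above (`RepoCard.load` and `load_attn_procs`), and it assumes the model card of the LoRA repository declares a `base_model` field:

```py
import torch
from diffusers import StableDiffusionPipeline
from huggingface_hub.repocard import RepoCard

lora_model_id = "sayakpaul/sd-model-finetuned-lora-t4"

# Resolve the base checkpoint from the LoRA repo's model card instead of
# hardcoding it (assumes the card declares a `base_model` field).
card = RepoCard.load(lora_model_id)
base_model_id = card.data.to_dict()["base_model"]  # 'CompVis/stable-diffusion-v1-4'

pipe = StableDiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16)
pipe.unet.load_attn_procs(lora_model_id)  # loads the ~3 MB LoRA update matrices
pipe.to("cuda")

prompt = "A pokemon with blue eyes."
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("pokemon.png")
```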