Commit 1f02087

[docs] More API stuff (#3835)
* clean up loaders
* clean up rest of main class apis
* apply feedback
1 parent 95ea538 commit 1f02087

15 files changed: +348 −451 lines

docs/source/en/_toctree.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -149,7 +149,7 @@
   - local: api/utilities
     title: Utilities
   - local: api/image_processor
-    title: Vae Image Processor
+    title: VAE Image Processor
   title: Main Classes
 - sections:
   - local: api/pipelines/overview
```

docs/source/en/api/configuration.mdx

Lines changed: 7 additions & 2 deletions
```diff
@@ -12,8 +12,13 @@ specific language governing permissions and limitations under the License.
 
 # Configuration
 
-Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which conveniently takes care of storing all the parameters that are
-passed to their respective `__init__` methods in a JSON-configuration file.
+Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which stores all the parameters that are passed to their respective `__init__` methods in a JSON-configuration file.
+
+<Tip>
+
+To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
+
+</Tip>
 
 ## ConfigMixin
 
```
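The new paragraph describes how `ConfigMixin` captures `__init__` parameters. As a minimal sketch of that behavior (an illustration, not part of the commit; `DDPMScheduler` and its arguments are arbitrary example choices):

```python
from diffusers import DDPMScheduler

# __init__ parameters are recorded by ConfigMixin under scheduler.config
scheduler = DDPMScheduler(num_train_timesteps=1000, beta_schedule="linear")
print(scheduler.config.beta_schedule)  # "linear"

scheduler.save_config("ddpm-scheduler")  # writes ddpm-scheduler/scheduler_config.json
rebuilt = DDPMScheduler.from_config(scheduler.config)  # new instance from the stored config
```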

docs/source/en/api/diffusion_pipeline.mdx

Lines changed: 3 additions & 3 deletions
```diff
@@ -12,12 +12,12 @@ specific language governing permissions and limitations under the License.
 
 # Pipelines
 
-The [`DiffusionPipeline`] is the easiest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) and use it for inference.
+The [`DiffusionPipeline`] is the quickest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) for inference.
 
 <Tip>
-
+
 You shouldn't use the [`DiffusionPipeline`] class for training or finetuning a diffusion model. Individual
-components (for example, [`UNetModel`] and [`UNetConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with instead.
+components (for example, [`UNet2DModel`] and [`UNet2DConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.
 
 </Tip>
 
```
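A hedged sketch of the inference workflow the new sentence refers to (the checkpoint name and GPU usage are illustrative assumptions, not part of the diff):

```python
import torch
from diffusers import DiffusionPipeline

# Load an example pretrained pipeline from the Hub, for inference only
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipeline = pipeline.to("cuda")  # assumes a CUDA-capable GPU is available

image = pipeline("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```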

docs/source/en/api/image_processor.mdx

Lines changed: 5 additions & 11 deletions
```diff
@@ -10,24 +10,18 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->
 
-# Image Processor for VAE
-
-Image processor provides a unified API for Stable Diffusion pipelines to prepare their image inputs for VAE encoding, as well as post-processing their outputs once decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and Numpy arrays.
-
-All pipelines with VAE image processor will accept image inputs in the format of PIL Image, PyTorch tensor, or Numpy array, and will able to return outputs in the format of PIL Image, Pytorch tensor, and Numpy array based on the `output_type` argument from the user. Additionally, the User can pass encoded image latents directly to the pipeline, or ask the pipeline to return latents as output with `output_type = 'pt'` argument. This allows you to take the generated latents from one pipeline and pass it to another pipeline as input, without ever having to leave the latent space. It also makes it much easier to use multiple pipelines together, by passing PyTorch tensors directly between different pipelines.
-
-
-# Image Processor for VAE adapted to LDM3D
-
-LDM3D Image processor does the same as the Image processor for VAE but accepts both RGB and depth inputs and will return RGB and depth outputs.
+# VAE Image Processor
 
+The [`VaeImageProcessor`] provides a unified API for [`StableDiffusionPipeline`]'s to prepare image inputs for VAE encoding and post-processing outputs once they're decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
 
+All pipelines with [`VaeImageProcessor`] accept PIL Image, PyTorch tensor, or NumPy arrays as image inputs and return outputs based on the `output_type` argument by the user. You can pass encoded image latents directly to the pipeline and return latents from the pipeline as a specific output with the `output_type` argument (for example `output_type="pt"`). This allows you to take the generated latents from one pipeline and pass them to another pipeline as input without leaving the latent space. It also makes it much easier to use multiple pipelines together by passing PyTorch tensors directly between different pipelines.
 
 ## VaeImageProcessor
 
 [[autodoc]] image_processor.VaeImageProcessor
 
-
 ## VaeImageProcessorLDM3D
 
+The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
+
 [[autodoc]] image_processor.VaeImageProcessorLDM3D
```
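To make the input/output handling concrete, here is a small sketch of the preprocess/postprocess round trip (an illustration under assumed defaults, not taken from the commit):

```python
from PIL import Image
from diffusers.image_processor import VaeImageProcessor

processor = VaeImageProcessor(vae_scale_factor=8)

image = Image.new("RGB", (512, 512), "white")  # stand-in for a real input image
tensor = processor.preprocess(image)  # tensor normalized to [-1, 1], shape (1, 3, 512, 512)
restored = processor.postprocess(tensor, output_type="pil")[0]  # back to a PIL image
```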

docs/source/en/api/loaders.mdx

Lines changed: 8 additions & 13 deletions
```diff
@@ -12,31 +12,26 @@ specific language governing permissions and limitations under the License.
 
 # Loaders
 
-There are many ways to train adapter neural networks for diffusion models, such as
-- [Textual Inversion](./training/text_inversion.mdx)
-- [LoRA](https://github.com/cloneofsimo/lora)
-- [Hypernetworks](https://arxiv.org/abs/1609.09106)
+Adapters (textual inversion, LoRA, hypernetworks) allow you to modify a diffusion model to generate images in a specific style without training or finetuning the entire model. The adapter weights are typically only a tiny fraction of the pretrained model's, which makes them very portable. 🤗 Diffusers provides an easy-to-use `LoaderMixin` API to load adapter weights.
 
-Such adapter neural networks often only consist of a fraction of the number of weights compared
-to the pretrained model and as such are very portable. The Diffusers library offers an easy-to-use
-API to load such adapter neural networks via the [`loaders.py` module](https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders.py).
+<Tip warning={true}>
 
-**Note**: This module is still highly experimental and prone to future changes.
+🧪 The `LoaderMixins` are highly experimental and prone to future changes. To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
 
-## LoaderMixins
+</Tip>
 
-### UNet2DConditionLoadersMixin
+## UNet2DConditionLoadersMixin
 
 [[autodoc]] loaders.UNet2DConditionLoadersMixin
 
-### TextualInversionLoaderMixin
+## TextualInversionLoaderMixin
 
 [[autodoc]] loaders.TextualInversionLoaderMixin
 
-### LoraLoaderMixin
+## LoraLoaderMixin
 
 [[autodoc]] loaders.LoraLoaderMixin
 
-### FromCkptMixin
+## FromCkptMixin
 
 [[autodoc]] loaders.FromCkptMixin
```
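As a rough usage sketch for these mixins (the repository IDs and file paths below are illustrative placeholders, not from the commit), the loaders are exposed as methods on a loaded pipeline:

```python
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# TextualInversionLoaderMixin: add a learned token embedding to the text encoder
pipeline.load_textual_inversion("sd-concepts-library/cat-toy")

# LoraLoaderMixin: load LoRA weights into the UNet and text encoder
pipeline.load_lora_weights("some-user/some-lora-repo")  # placeholder repo ID

# FromCkptMixin: build a pipeline directly from a single .ckpt file
pipeline = StableDiffusionPipeline.from_ckpt("path/to/model.ckpt")  # placeholder path
```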

docs/source/en/api/logging.mdx

Lines changed: 17 additions & 19 deletions
````diff
@@ -12,12 +12,9 @@ specific language governing permissions and limitations under the License.
 
 # Logging
 
-🧨 Diffusers has a centralized logging system, so that you can setup the verbosity of the library easily.
+🤗 Diffusers has a centralized logging system to easily manage the verbosity of the library. The default verbosity is set to `WARNING`.
 
-Currently the default verbosity of the library is `WARNING`.
-
-To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
-to the INFO level.
+To change the verbosity level, use one of the direct setters. For instance, to change the verbosity to the `INFO` level.
 
 ```python
 import diffusers
@@ -33,7 +30,7 @@ DIFFUSERS_VERBOSITY=error ./myprogram.py
 ```
 
 Additionally, some `warnings` can be disabled by setting the environment variable
-`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like *1*. This will disable any warning that is logged using
+`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like `1`. This disables any warning logged by
 [`logger.warning_advice`]. For example:
 
 ```bash
@@ -52,20 +49,21 @@ logger.warning("WARN")
 ```
 
 
-All the methods of this logging module are documented below, the main ones are
+All methods of the logging module are documented below. The main methods are
 [`logging.get_verbosity`] to get the current level of verbosity in the logger and
-[`logging.set_verbosity`] to set the verbosity to the level of your choice. In order (from the least
-verbose to the most verbose), those levels (with their corresponding int values in parenthesis) are:
-
-- `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` (int value, 50): only report the most
-critical errors.
-- `diffusers.logging.ERROR` (int value, 40): only report errors.
-- `diffusers.logging.WARNING` or `diffusers.logging.WARN` (int value, 30): only reports error and
-warnings. This is the default level used by the library.
-- `diffusers.logging.INFO` (int value, 20): reports error, warnings and basic information.
-- `diffusers.logging.DEBUG` (int value, 10): report all information.
-
-By default, `tqdm` progress bars will be displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] can be used to suppress or unsuppress this behavior.
+[`logging.set_verbosity`] to set the verbosity to the level of your choice.
+
+In order from the least verbose to the most verbose:
+
+| Method | Integer value | Description |
+|----------------------------------------------------------:|--------------:|----------------------------------------------------:|
+| `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` | 50 | only report the most critical errors |
+| `diffusers.logging.ERROR` | 40 | only report errors |
+| `diffusers.logging.WARNING` or `diffusers.logging.WARN` | 30 | only report errors and warnings (default) |
+| `diffusers.logging.INFO` | 20 | only report errors, warnings, and basic information |
+| `diffusers.logging.DEBUG` | 10 | report all information |
+
+By default, `tqdm` progress bars are displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] are used to enable or disable this behavior.
 
 ## Base setters
 
````
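The table above maps directly onto the module's helpers; a brief sketch combining the documented setters and getters:

```python
import diffusers

diffusers.logging.set_verbosity_info()    # one of the direct setters
print(diffusers.logging.get_verbosity())  # 20, i.e. diffusers.logging.INFO

diffusers.logging.set_verbosity(diffusers.logging.ERROR)  # set an explicit level
diffusers.logging.disable_progress_bar()  # hide tqdm download bars
diffusers.logging.enable_progress_bar()   # show them again
```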

docs/source/en/api/outputs.mdx

Lines changed: 2 additions & 4 deletions
```diff
@@ -10,11 +10,9 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->
 
-# BaseOutputs
+# Outputs
 
-All models have outputs that are subclasses of [`~utils.BaseOutput`]. Those are
-data structures containing all the information returned by the model, but they can also be used as tuples or
-dictionaries.
+All model outputs are subclasses of [`~utils.BaseOutput`], data structures containing all the information returned by the model. The outputs can also be used as tuples or dictionaries.
 
 For example:
 
```
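A hedged illustration of the three access patterns this paragraph mentions (the checkpoint is only an example):

```python
from diffusers import DDPMPipeline

pipeline = DDPMPipeline.from_pretrained("google/ddpm-cat-256")
output = pipeline(num_inference_steps=10)

image = output.images[0]     # attribute access
image = output["images"][0]  # dictionary-style access
image = output[0][0]         # tuple-style access to the same field
```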

src/diffusers/configuration_utils.py

Lines changed: 19 additions & 28 deletions
```diff
@@ -81,18 +81,17 @@ def __setitem__(self, name, value):
 
 class ConfigMixin:
     r"""
-    Base class for all configuration classes. Stores all configuration parameters under `self.config` Also handles all
-    methods for loading/downloading/saving classes inheriting from [`ConfigMixin`] with
-        - [`~ConfigMixin.from_config`]
-        - [`~ConfigMixin.save_config`]
+    Base class for all configuration classes. All configuration parameters are stored under `self.config`. Also
+    provides the [`~ConfigMixin.from_config`] and [`~ConfigMixin.save_config`] methods for loading, downloading, and
+    saving classes that inherit from [`ConfigMixin`].
 
     Class attributes:
         - **config_name** (`str`) -- A filename under which the config should stored when calling
           [`~ConfigMixin.save_config`] (should be overridden by parent class).
         - **ignore_for_config** (`List[str]`) -- A list of attributes that should not be saved in the config (should be
           overridden by subclass).
         - **has_compatibles** (`bool`) -- Whether the class has compatible classes (should be overridden by subclass).
-        - **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the init function
+        - **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the `init` function
           should only have a `kwargs` argument if at least one argument is deprecated (should be overridden by
           subclass).
     """
@@ -139,12 +138,12 @@ def __getattr__(self, name: str) -> Any:
 
     def save_config(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs):
         """
-        Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the
+        Save a configuration object to the directory specified in `save_directory` so that it can be reloaded using the
         [`~ConfigMixin.from_config`] class method.
 
         Args:
             save_directory (`str` or `os.PathLike`):
-                Directory where the configuration JSON file will be saved (will be created if it does not exist).
+                Directory where the configuration JSON file is saved (will be created if it does not exist).
         """
         if os.path.isfile(save_directory):
             raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file")
@@ -164,15 +163,14 @@ def from_config(cls, config: Union[FrozenDict, Dict[str, Any]] = None, return_un
 
         Parameters:
             config (`Dict[str, Any]`):
-                A config dictionary from which the Python class will be instantiated. Make sure to only load
-                configuration files of compatible classes.
+                A config dictionary from which the Python class is instantiated. Make sure to only load configuration
+                files of compatible classes.
             return_unused_kwargs (`bool`, *optional*, defaults to `False`):
                 Whether kwargs that are not consumed by the Python class should be returned or not.
-
             kwargs (remaining dictionary of keyword arguments, *optional*):
                 Can be used to update the configuration object (after it is loaded) and initiate the Python class.
-                `**kwargs` are directly passed to the underlying scheduler/model's `__init__` method and eventually
-                overwrite same named arguments in `config`.
+                `**kwargs` are passed directly to the underlying scheduler/model's `__init__` method and eventually
+                overwrite the same named arguments in `config`.
 
         Returns:
             [`ModelMixin`] or [`SchedulerMixin`]:
@@ -280,16 +278,16 @@ def load_config(
                 Whether or not to force the (re-)download of the model weights and configuration files, overriding the
                 cached versions if they exist.
             resume_download (`bool`, *optional*, defaults to `False`):
-                Whether or not to resume downloading the model weights and configuration files. If set to False, any
+                Whether or not to resume downloading the model weights and configuration files. If set to `False`, any
                 incompletely downloaded files are deleted.
             proxies (`Dict[str, str]`, *optional*):
                 A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128',
                 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
             output_loading_info(`bool`, *optional*, defaults to `False`):
                 Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
-            local_files_only(`bool`, *optional*, defaults to `False`):
-                Whether to only load local model weights and configuration files or not. If set to True, the model
-                wont be downloaded from the Hub.
+            local_files_only (`bool`, *optional*, defaults to `False`):
+                Whether to only load local model weights and configuration files or not. If set to `True`, the model
+                won't be downloaded from the Hub.
             use_auth_token (`str` or *bool*, *optional*):
                 The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from
                 `diffusers-cli login` (stored in `~/.huggingface`) is used.
@@ -307,14 +305,6 @@ def load_config(
             `dict`:
                 A dictionary of all the parameters stored in a JSON configuration file.
 
-        <Tip>
-
-        To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log-in with
-        `huggingface-cli login`. You can also activate the special
-        ["offline-mode"](https://huggingface.co/transformers/installation.html#offline-mode) to use this method in a
-        firewalled environment.
-
-        </Tip>
         """
         cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE)
         force_download = kwargs.pop("force_download", False)
@@ -536,10 +526,11 @@ def config(self) -> Dict[str, Any]:
 
     def to_json_string(self) -> str:
         """
-        Serializes this instance to a JSON string.
+        Serializes the configuration instance to a JSON string.
 
         Returns:
-            `str`: String containing all the attributes that make up this configuration instance in JSON format.
+            `str`:
+                String containing all the attributes that make up the configuration instance in JSON format.
         """
         config_dict = self._internal_dict if hasattr(self, "_internal_dict") else {}
         config_dict["_class_name"] = self.__class__.__name__
@@ -560,11 +551,11 @@ def to_json_saveable(value):
 
     def to_json_file(self, json_file_path: Union[str, os.PathLike]):
         """
-        Save this instance to a JSON file.
+        Save the configuration instance's parameters to a JSON file.
 
         Args:
             json_file_path (`str` or `os.PathLike`):
-                Path to the JSON file in which this configuration instance's parameters will be saved.
+                Path to the JSON file to save a configuration instance's parameters.
         """
         with open(json_file_path, "w", encoding="utf-8") as writer:
             writer.write(self.to_json_string())
```
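To tie the edited docstrings together, a small sketch of the `load_config` → `from_config` flow and the JSON serializers (an assumption-labeled example, not part of the commit; the Hub repo is a placeholder):

```python
from diffusers import DDIMScheduler

# load_config returns a plain dict of parameters from scheduler_config.json
config = DDIMScheduler.load_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")

# from_config instantiates the class; unused kwargs can be returned on request
scheduler, unused = DDIMScheduler.from_config(config, return_unused_kwargs=True)

print(scheduler.to_json_string()[:60])           # JSON view of the stored parameters
scheduler.to_json_file("scheduler_config.json")  # same content written to disk
```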