[docs] More API stuff #3835

Merged
merged 3 commits on Jun 21, 2023
2 changes: 1 addition & 1 deletion docs/source/en/_toctree.yml
@@ -149,7 +149,7 @@
- local: api/utilities
title: Utilities
- local: api/image_processor
title: Vae Image Processor
title: VAE Image Processor
title: Main Classes
- sections:
- local: api/pipelines/overview
9 changes: 7 additions & 2 deletions docs/source/en/api/configuration.mdx
@@ -12,8 +12,13 @@ specific language governing permissions and limitations under the License.

# Configuration

Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which conveniently takes care of storing all the parameters that are
passed to their respective `__init__` methods in a JSON-configuration file.
Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`], which stores all the parameters that are passed to their respective `__init__` methods in a JSON configuration file.

<Tip>

To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log in with `huggingface-cli login`.

</Tip>
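
For a concrete sense of what ends up in that JSON file, here is a minimal sketch (assuming the `runwayml/stable-diffusion-v1-5` checkpoint, used here only for illustration) that loads a scheduler and inspects the parameters recorded on its `config`:

```python
from diffusers import DDPMScheduler

# from_pretrained reads the scheduler's config JSON from the Hub and
# records every __init__ argument on `scheduler.config`
scheduler = DDPMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="scheduler"
)

print(scheduler.config.num_train_timesteps)
print(scheduler.config.beta_schedule)
```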

## ConfigMixin

6 changes: 3 additions & 3 deletions docs/source/en/api/diffusion_pipeline.mdx
@@ -12,12 +12,12 @@ specific language governing permissions and limitations under the License.

# Pipelines

The [`DiffusionPipeline`] is the easiest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) and use it for inference.
The [`DiffusionPipeline`] is the quickest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) for inference.

<Tip>

You shouldn't use the [`DiffusionPipeline`] class for training or finetuning a diffusion model. Individual
components (for example, [`UNetModel`] and [`UNetConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with instead.
components (for example, [`UNet2DModel`] and [`UNet2DConditionModel`]) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.

</Tip>
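
As a minimal sketch (assuming the `runwayml/stable-diffusion-v1-5` checkpoint and a CUDA device), loading a pipeline from the Hub and running inference looks like this:

```python
import torch
from diffusers import DiffusionPipeline

# Downloads the model index and every component (UNet, VAE, scheduler, ...) from the Hub
pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipeline.to("cuda")

image = pipeline("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```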

16 changes: 5 additions & 11 deletions docs/source/en/api/image_processor.mdx
@@ -10,24 +10,18 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->

# Image Processor for VAE

Image processor provides a unified API for Stable Diffusion pipelines to prepare their image inputs for VAE encoding, as well as post-processing their outputs once decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and Numpy arrays.

All pipelines with VAE image processor will accept image inputs in the format of PIL Image, PyTorch tensor, or Numpy array, and will able to return outputs in the format of PIL Image, Pytorch tensor, and Numpy array based on the `output_type` argument from the user. Additionally, the User can pass encoded image latents directly to the pipeline, or ask the pipeline to return latents as output with `output_type = 'pt'` argument. This allows you to take the generated latents from one pipeline and pass it to another pipeline as input, without ever having to leave the latent space. It also makes it much easier to use multiple pipelines together, by passing PyTorch tensors directly between different pipelines.


# Image Processor for VAE adapted to LDM3D

LDM3D Image processor does the same as the Image processor for VAE but accepts both RGB and depth inputs and will return RGB and depth outputs.
# VAE Image Processor

The [`VaeImageProcessor`] provides a unified API for [`StableDiffusionPipeline`]s to prepare image inputs for VAE encoding and to post-process outputs once they're decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.

All pipelines with [`VaeImageProcessor`] accept PIL Image, PyTorch tensor, or NumPy array image inputs and return outputs based on the `output_type` argument set by the user. You can pass encoded image latents directly to the pipeline and return latents from the pipeline as a specific output with the `output_type` argument (for example `output_type="pt"`). This allows you to take the generated latents from one pipeline and pass them to another pipeline as input without leaving the latent space. It also makes it much easier to use multiple pipelines together by passing PyTorch tensors directly between different pipelines.
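
A rough sketch of passing tensors straight between pipelines (assuming the `runwayml/stable-diffusion-v1-5` checkpoint and a CUDA device); `output_type="pt"` keeps the decoded image as a PyTorch tensor so it can be handed to the next pipeline without converting to PIL:

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

text2img = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Reuse the same components so nothing is loaded twice
img2img = StableDiffusionImg2ImgPipeline(**text2img.components)

# Return the decoded image as a PyTorch tensor instead of a PIL image
image_tensor = text2img("a mountain lake at sunrise", output_type="pt").images

# The VAE image processor accepts the tensor directly as the img2img input
refined = img2img("a mountain lake at sunrise, golden hour", image=image_tensor).images[0]
```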

## VaeImageProcessor

[[autodoc]] image_processor.VaeImageProcessor


## VaeImageProcessorLDM3D

The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.

[[autodoc]] image_processor.VaeImageProcessorLDM3D
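
As a short sketch of the RGB + depth flow (assuming the `Intel/ldm3d-4c` checkpoint and a CUDA device), the LDM3D pipeline returns both modalities, each handled by this processor:

```python
from diffusers import StableDiffusionLDM3DPipeline

pipe = StableDiffusionLDM3DPipeline.from_pretrained("Intel/ldm3d-4c").to("cuda")

output = pipe("a photo of a cozy reading nook")
rgb_image, depth_image = output.rgb[0], output.depth[0]
rgb_image.save("nook_rgb.png")
depth_image.save("nook_depth.png")
```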
21 changes: 8 additions & 13 deletions docs/source/en/api/loaders.mdx
@@ -12,31 +12,26 @@ specific language governing permissions and limitations under the License.

# Loaders

There are many ways to train adapter neural networks for diffusion models, such as
- [Textual Inversion](./training/text_inversion.mdx)
- [LoRA](https://github.com/cloneofsimo/lora)
- [Hypernetworks](https://arxiv.org/abs/1609.09106)
Adapters (textual inversion, LoRA, hypernetworks) allow you to modify a diffusion model to generate images in a specific style without training or finetuning the entire model. The adapter weights are typically only a tiny fraction of the pretrained model's, which makes them very portable. 🤗 Diffusers provides an easy-to-use `LoaderMixin` API to load adapter weights.

Such adapter neural networks often only consist of a fraction of the number of weights compared
to the pretrained model and as such are very portable. The Diffusers library offers an easy-to-use
API to load such adapter neural networks via the [`loaders.py` module](https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders.py).
<Tip warning={true}>

**Note**: This module is still highly experimental and prone to future changes.
🧪 The `LoaderMixins` are highly experimental and prone to future changes. To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log in with `huggingface-cli login`.

## LoaderMixins
</Tip>
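
As a hedged sketch of the loader API (the adapter repositories below are illustrative placeholders for any Hub repository that ships the corresponding weights):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# LoraLoaderMixin: load LoRA weights into the UNet (and text encoder, if present)
pipe.load_lora_weights("sayakpaul/sd-model-finetuned-lora-t4")

# TextualInversionLoaderMixin: register a learned token embedding
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

image = pipe("a <cat-toy> sitting on a bookshelf").images[0]
```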

### UNet2DConditionLoadersMixin
## UNet2DConditionLoadersMixin

[[autodoc]] loaders.UNet2DConditionLoadersMixin

### TextualInversionLoaderMixin
## TextualInversionLoaderMixin

[[autodoc]] loaders.TextualInversionLoaderMixin

### LoraLoaderMixin
## LoraLoaderMixin

[[autodoc]] loaders.LoraLoaderMixin

### FromCkptMixin
## FromCkptMixin

[[autodoc]] loaders.FromCkptMixin
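
And a brief sketch of [`FromCkptMixin`], which builds a pipeline from a single original-format checkpoint file (the path below is a placeholder for any local `.ckpt`/`.safetensors` file):

```python
from diffusers import StableDiffusionPipeline

# from_ckpt converts an original Stable Diffusion checkpoint into a diffusers pipeline
pipe = StableDiffusionPipeline.from_ckpt("./v1-5-pruned-emaonly.safetensors")
```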
36 changes: 17 additions & 19 deletions docs/source/en/api/logging.mdx
@@ -12,12 +12,9 @@ specific language governing permissions and limitations under the License.

# Logging

🧨 Diffusers has a centralized logging system, so that you can setup the verbosity of the library easily.
🤗 Diffusers has a centralized logging system to easily manage the verbosity of the library. The default verbosity is set to `WARNING`.

Currently the default verbosity of the library is `WARNING`.

To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
to the INFO level.
To change the verbosity level, use one of the direct setters. For instance, to change the verbosity to the `INFO` level:

```python
import diffusers
@@ -33,7 +30,7 @@ DIFFUSERS_VERBOSITY=error ./myprogram.py
```

Additionally, some `warnings` can be disabled by setting the environment variable
`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like *1*. This will disable any warning that is logged using
`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like `1`. This disables any warning logged by
[`logger.warning_advice`]. For example:

```bash
@@ -52,20 +49,21 @@ logger.warning("WARN")
```


All the methods of this logging module are documented below, the main ones are
All methods of the logging module are documented below. The main methods are
[`logging.get_verbosity`] to get the current level of verbosity in the logger and
[`logging.set_verbosity`] to set the verbosity to the level of your choice. In order (from the least
verbose to the most verbose), those levels (with their corresponding int values in parenthesis) are:

- `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` (int value, 50): only report the most
critical errors.
- `diffusers.logging.ERROR` (int value, 40): only report errors.
- `diffusers.logging.WARNING` or `diffusers.logging.WARN` (int value, 30): only reports error and
warnings. This is the default level used by the library.
- `diffusers.logging.INFO` (int value, 20): reports error, warnings and basic information.
- `diffusers.logging.DEBUG` (int value, 10): report all information.

By default, `tqdm` progress bars will be displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] can be used to suppress or unsuppress this behavior.
[`logging.set_verbosity`] to set the verbosity to the level of your choice.

In order from the least verbose to the most verbose:

| Method | Integer value | Description |
|----------------------------------------------------------:|--------------:|----------------------------------------------------:|
| `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` | 50 | only report the most critical errors |
| `diffusers.logging.ERROR` | 40 | only report errors |
| `diffusers.logging.WARNING` or `diffusers.logging.WARN` | 30 | only report errors and warnings (default) |
| `diffusers.logging.INFO` | 20 | only report errors, warnings, and basic information |
| `diffusers.logging.DEBUG` | 10 | report all information |

By default, `tqdm` progress bars are displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] can be used to disable or enable this behavior.
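
A short sketch tying these pieces together:

```python
import diffusers

# Silence everything below ERROR, then check the active level (40)
diffusers.logging.set_verbosity_error()
print(diffusers.logging.get_verbosity())

# Module-level loggers respect the library-wide verbosity
logger = diffusers.logging.get_logger("diffusers")
logger.error("shown at the ERROR level")
logger.info("hidden at the ERROR level")

# Hide tqdm download progress bars, then restore them
diffusers.logging.disable_progress_bar()
diffusers.logging.enable_progress_bar()
```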

## Base setters

6 changes: 2 additions & 4 deletions docs/source/en/api/outputs.mdx
@@ -10,11 +10,9 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->

# BaseOutputs
# Outputs

All models have outputs that are subclasses of [`~utils.BaseOutput`]. Those are
data structures containing all the information returned by the model, but they can also be used as tuples or
dictionaries.
All model outputs are subclasses of [`~utils.BaseOutput`], data structures containing all the information returned by the model. The outputs can also be used as tuples or dictionaries.

For example:
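
As a rough sketch (assuming the [`ImagePipelineOutput`] returned by [`DDIMPipeline`], used here only for illustration), an output can be read by attribute, by key, by index, or converted to a plain tuple:

```python
from diffusers import DDIMPipeline

pipeline = DDIMPipeline.from_pretrained("google/ddpm-cifar10-32")
outputs = pipeline()

# Attribute, dictionary-style, and tuple-style access all point at the same data
images = outputs.images
images = outputs["images"]
images = outputs[0]
images_as_tuple = outputs.to_tuple()
```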

47 changes: 19 additions & 28 deletions src/diffusers/configuration_utils.py
@@ -81,18 +81,17 @@ def __setitem__(self, name, value):

class ConfigMixin:
r"""
Base class for all configuration classes. Stores all configuration parameters under `self.config` Also handles all
methods for loading/downloading/saving classes inheriting from [`ConfigMixin`] with
- [`~ConfigMixin.from_config`]
- [`~ConfigMixin.save_config`]
Base class for all configuration classes. All configuration parameters are stored under `self.config`. Also
provides the [`~ConfigMixin.from_config`] and [`~ConfigMixin.save_config`] methods for loading, downloading, and
saving classes that inherit from [`ConfigMixin`].

Class attributes:
- **config_name** (`str`) -- A filename under which the config should be stored when calling
[`~ConfigMixin.save_config`] (should be overridden by parent class).
- **ignore_for_config** (`List[str]`) -- A list of attributes that should not be saved in the config (should be
overridden by subclass).
- **has_compatibles** (`bool`) -- Whether the class has compatible classes (should be overridden by subclass).
- **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the init function
- **_deprecated_kwargs** (`List[str]`) -- Keyword arguments that are deprecated. Note that the `init` function
should only have a `kwargs` argument if at least one argument is deprecated (should be overridden by
subclass).
"""
@@ -139,12 +138,12 @@ def __getattr__(self, name: str) -> Any:

def save_config(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs):
"""
Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the
Save a configuration object to the directory specified in `save_directory` so that it can be reloaded using the
[`~ConfigMixin.from_config`] class method.

Args:
save_directory (`str` or `os.PathLike`):
Directory where the configuration JSON file will be saved (will be created if it does not exist).
Directory where the configuration JSON file is saved (will be created if it does not exist).
"""
if os.path.isfile(save_directory):
raise AssertionError(f"Provided path ({save_directory}) should be a directory, not a file")
@@ -164,15 +163,14 @@ def from_config(cls, config: Union[FrozenDict, Dict[str, Any]] = None, return_un

Parameters:
config (`Dict[str, Any]`):
A config dictionary from which the Python class will be instantiated. Make sure to only load
configuration files of compatible classes.
A config dictionary from which the Python class is instantiated. Make sure to only load configuration
files of compatible classes.
return_unused_kwargs (`bool`, *optional*, defaults to `False`):
Whether kwargs that are not consumed by the Python class should be returned or not.

kwargs (remaining dictionary of keyword arguments, *optional*):
Can be used to update the configuration object (after it is loaded) and initiate the Python class.
`**kwargs` are directly passed to the underlying scheduler/model's `__init__` method and eventually
overwrite same named arguments in `config`.
`**kwargs` are passed directly to the underlying scheduler/model's `__init__` method and eventually
overwrite the same named arguments in `config`.

Returns:
[`ModelMixin`] or [`SchedulerMixin`]:
@@ -280,16 +278,16 @@ def load_config(
Whether or not to force the (re-)download of the model weights and configuration files, overriding the
cached versions if they exist.
resume_download (`bool`, *optional*, defaults to `False`):
Whether or not to resume downloading the model weights and configuration files. If set to False, any
Whether or not to resume downloading the model weights and configuration files. If set to `False`, any
incompletely downloaded files are deleted.
proxies (`Dict[str, str]`, *optional*):
A dictionary of proxy servers to use by protocol or endpoint, for example, `{'http': 'foo.bar:3128',
'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
output_loading_info(`bool`, *optional*, defaults to `False`):
Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only(`bool`, *optional*, defaults to `False`):
Whether to only load local model weights and configuration files or not. If set to True, the model
wont be downloaded from the Hub.
local_files_only (`bool`, *optional*, defaults to `False`):
Whether to only load local model weights and configuration files or not. If set to `True`, the model
won't be downloaded from the Hub.
use_auth_token (`str` or *bool*, *optional*):
The token to use as HTTP bearer authorization for remote files. If `True`, the token generated from
`diffusers-cli login` (stored in `~/.huggingface`) is used.
@@ -307,14 +305,6 @@ def load_config(
`dict`:
A dictionary of all the parameters stored in a JSON configuration file.

<Tip>

To use private or [gated models](https://huggingface.co/docs/hub/models-gated#gated-models), log-in with
`huggingface-cli login`. You can also activate the special
["offline-mode"](https://huggingface.co/transformers/installation.html#offline-mode) to use this method in a
firewalled environment.

</Tip>
"""
cache_dir = kwargs.pop("cache_dir", DIFFUSERS_CACHE)
force_download = kwargs.pop("force_download", False)
@@ -536,10 +526,11 @@ def config(self) -> Dict[str, Any]:

def to_json_string(self) -> str:
"""
Serializes this instance to a JSON string.
Serializes the configuration instance to a JSON string.

Returns:
`str`: String containing all the attributes that make up this configuration instance in JSON format.
`str`:
String containing all the attributes that make up the configuration instance in JSON format.
"""
config_dict = self._internal_dict if hasattr(self, "_internal_dict") else {}
config_dict["_class_name"] = self.__class__.__name__
@@ -560,11 +551,11 @@ def to_json_saveable(value):

def to_json_file(self, json_file_path: Union[str, os.PathLike]):
"""
Save this instance to a JSON file.
Save the configuration instance's parameters to a JSON file.

Args:
json_file_path (`str` or `os.PathLike`):
Path to the JSON file in which this configuration instance's parameters will be saved.
Path to the JSON file to save a configuration instance's parameters.
"""
with open(json_file_path, "w", encoding="utf-8") as writer:
writer.write(self.to_json_string())