add load textual inversion embeddings to stable diffusion #2009

Merged
patrickvonplaten merged 55 commits into huggingface:main on Mar 30, 2023

Conversation

piEsposito
Contributor

Should close #1985

@HuggingFaceDocBuilderDev
HuggingFaceDocBuilderDev commented Jan 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@piEsposito piEsposito marked this pull request as ready for review January 16, 2023 15:10
Contributor

@patil-suraj patil-suraj left a comment

Thanks a lot for working on this, @piEsposito. This will make loading embeddings very easy!

Instead of adding methods to every pipeline, we could create a TextualInversionLoaderMixin class with a load_textual_inversion_embeddings method in pipelines/loaders.py, and the pipelines that support textual inversion can subclass that mixin.

I also left some more specific comments below.

  • We should follow the embedding format of the textual inversion script so that we can load all the embeddings in https://huggingface.co/sd-concepts-library
  • We could also support loading single-vector embeddings from auto1111 first and then extend it to multiple embeddings.

Thanks!
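
The mixin idea suggested above could look roughly like the following. This is a minimal sketch, not the code that was merged: `ToyTextPipeline`, the `vocab` dict, and the list-of-floats embeddings are hypothetical stand-ins for a real pipeline's tokenizer and text encoder. Only the names `TextualInversionLoaderMixin` and `load_textual_inversion_embeddings` come from the review comment.

```python
from typing import Dict, List


class TextualInversionLoaderMixin:
    """Adds embedding loading to any class that exposes a `vocab` dict.

    Hypothetical sketch: a real pipeline would update its tokenizer and
    text encoder instead of a plain dictionary.
    """

    vocab: Dict[str, List[float]]

    def load_textual_inversion_embeddings(self, embeddings: Dict[str, List[float]]) -> None:
        for token, vector in embeddings.items():
            # Refuse to silently overwrite a token that already exists.
            if token in self.vocab:
                raise ValueError(f"Token {token!r} is already in the vocabulary.")
            self.vocab[token] = list(vector)


class ToyTextPipeline(TextualInversionLoaderMixin):
    """Stand-in for a pipeline that supports textual inversion."""

    def __init__(self) -> None:
        self.vocab = {"cat": [0.1, 0.2]}


pipe = ToyTextPipeline()
pipe.load_textual_inversion_embeddings({"<cat-toy>": [0.3, 0.4]})
```

Any pipeline that subclasses the mixin gets the loader for free, which is the point of the suggestion: the loading logic lives in one place instead of being copied into every pipeline.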

@piEsposito
Contributor Author

@patil-suraj I'll address your review in the next few days, thanks!

@piEsposito piEsposito requested review from patil-suraj and removed request for pcuenca, williamberman and patrickvonplaten January 17, 2023 13:10
@piEsposito
Contributor Author

@patil-suraj GitHub somehow removed the review requests for a bunch of HF people. Can you please add them again?

@patrickvonplaten
Contributor

@sayakpaul @pcuenca @williamberman could you take a final look here? I've reworked the PR to fit the diffusers design, and it should now work for all use cases.

image = pipe(
    "An logo of a turtle in Style-Winter with <low-poly-hd-logos-icons>", generator=generator, output_type="np"
).images[0]
# np.save("/home/patrick/diffusers-images/text_inv/winter_logo_style.npy", image)
Contributor

Suggested change
- # np.save("/home/patrick/diffusers-images/text_inv/winter_logo_style.npy", image)

Contributor

@williamberman williamberman left a comment

if we could squash and rebase on main, that would be nice.

Also assuming tests pass

Member

@sayakpaul sayakpaul left a comment

Let's ship this thing!

Excellent tests, btw. Let's make them pass.

Member

@pcuenca pcuenca left a comment

Love the API!

embedding = state_dict["string_to_param"]["*"]

if token is not None and loaded_token != token:
    logger.warn(f"The loaded token: {loaded_token} is overwritten by the passed token {token}.")
Member

Wouldn't we want to do the opposite override? (What comes in the state_dict is what gets added)

Contributor

Interesting, I'd say what gets passed has priority! If you do:

load_textual_inversion("./textual_inversion", token="<special-token>")

I think the token should be "<special-token>" no matter what's in the dict - it's similar to how we do from_pretrained(unet=unet) overrides
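
The precedence rule described here (an explicitly passed token wins over the one stored in the file) can be sketched as follows. This is an illustration under assumptions, not the merged loader: `resolve_token_and_embedding` is a hypothetical helper, the `"name"` key for the A1111 token is an assumption about that file format, and plain lists stand in for tensors. Only the `state_dict["string_to_param"]["*"]` lookup and the warning text come from the reviewed diff.

```python
import logging
from typing import Any, Dict, Optional, Tuple

logger = logging.getLogger(__name__)


def resolve_token_and_embedding(
    state_dict: Dict[str, Any], token: Optional[str] = None
) -> Tuple[str, Any]:
    """Pick the token to register, preferring the caller's `token` if given."""
    if "string_to_param" in state_dict:
        # A1111-style file (assumed layout): token under "name", vector under "*".
        loaded_token = state_dict["name"]
        embedding = state_dict["string_to_param"]["*"]
    elif len(state_dict) == 1:
        # sd-concepts-library style: a single {token: embedding} pair.
        loaded_token, embedding = next(iter(state_dict.items()))
    else:
        raise ValueError("Unrecognized embedding file layout.")

    if token is not None and loaded_token != token:
        logger.warning(
            "The loaded token: %s is overwritten by the passed token %s.",
            loaded_token,
            token,
        )
        return token, embedding
    return loaded_token, embedding


# Hypothetical file contents used only for this illustration.
a1111_file = {"name": "Style-Winter", "string_to_param": {"*": [[0.1, 0.2]]}}
concept_file = {"<cat-toy>": [0.3, 0.4]}
```

Calling `resolve_token_and_embedding(a1111_file, token="<special-token>")` returns `"<special-token>"` as the token and logs the warning, mirroring how an argument passed to `from_pretrained` overrides what is stored on disk.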

@piEsposito
Contributor Author

Come on folks, everything but an unrelated MPS test is passing. Let's get this thing merged!

    embeddings = [e for e in embedding]  # noqa: C416
else:
    tokens = [token]
    embeddings = [embedding] if len(embedding.shape) > 1 else [embedding[0]]
Contributor

I was trying the latest version and the loaded embeddings had no visible effect on the output. After changing this line I was able to get good results. I'm not sure how this would work when len(embedding.shape) is greater than 1, but at least when the shape has only one dimension this seems to fix it.

Suggested change
- embeddings = [embedding] if len(embedding.shape) > 1 else [embedding[0]]
+ embeddings = [embedding[0]] if len(embedding.shape) > 1 else [embedding]
Suggested change
- embeddings = [embedding] if len(embedding.shape) > 1 else [embedding[0]]
+ embeddings = [embedding] if len(embedding.shape) <= 1 else [embedding[0]]
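
To see why the two branches matter, the following sketch compares the original expression with the suggested one for a 1-D and a 2-D embedding. It assumes nothing about the real loader beyond the two expressions quoted above. Nested Python lists stand in for tensors, and `shape` is a hypothetical helper that mimics `tensor.shape`.

```python
from typing import List, Tuple, Union

Vector = List[float]
Embedding = Union[Vector, List[Vector]]


def shape(x) -> Tuple[int, ...]:
    """Return the shape of a regular, non-empty nested list, like `tensor.shape`."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


def original_split(embedding: Embedding) -> list:
    # Expression from the reviewed diff.
    return [embedding] if len(shape(embedding)) > 1 else [embedding[0]]


def suggested_split(embedding: Embedding) -> list:
    # Expression from the suggested change.
    return [embedding[0]] if len(shape(embedding)) > 1 else [embedding]


one_d = [0.1, 0.2, 0.3]    # shape (3,): a single embedding vector
two_d = [[0.1, 0.2, 0.3]]  # shape (1, 3): one vector with a leading axis

print(original_split(one_d))   # first scalar only
print(suggested_split(one_d))  # the full vector
print(suggested_split(two_d))  # the full vector, leading axis removed
```

In this sketch the original expression reduces a 1-D vector to its first scalar, while the suggested expression yields one full vector in both the 1-D and 2-D cases.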

Contributor

Hmm cannot reproduce this one - my tests are passing just fine on this branch

@JarvusChen JarvusChen Mar 31, 2023

I followed diffusers/examples/textual_inversion to train my own embedding and got a learned_embeds.bin file in the end, then used the example here.

import torch
from diffusers import DiffusionPipeline

pretrained_path = 'xxx'
embedding_path = 'learned_embeds.bin'
pipe = DiffusionPipeline.from_pretrained(pretrained_path, torch_dtype=torch.float16)
pipe.load_textual_inversion(embedding_path)

It does not show any concept from my embedding token. I had to make the same change as @GuiyeC in src/diffusers/loaders.py to make it work correctly.

Contributor

I created a PR with the fix for this where I try to explain the problem a bit more.

@patrickvonplaten
Contributor

Ok let's merge it 🚀

Puuh big PR - thanks so much for kickstarting this @piEsposito and for everybody involved here. Hope that the final solution / design works for everybody

@patrickvonplaten patrickvonplaten merged commit a937e1b into huggingface:main Mar 30, 2023
@EandrewJones
Contributor

EandrewJones commented Mar 30, 2023 via email

@patrickvonplaten
Contributor

Gosh forgot the most important part: Docs! 😅

In case someone has some spare time for a quick PR on docs on how to use it (both A1111 and diffusers) here would be some good spots to put it:

@blx0102
blx0102 commented Apr 6, 2023

@piEsposito @patrickvonplaten Cheers, this is done! But yes, docs are needed to show newbies like me how to use it. The discussion here is too long to read in full, lol.

@sayakpaul
Member

@blx0102 would you be up for contributing a PR? :)

@patrickvonplaten
Contributor

Opened a quick PR here: #3068

w4ffl35 pushed a commit to w4ffl35/diffusers that referenced this pull request Apr 14, 2023
…e#2009)

* add load textual inversion embeddings draft

* fix quality

* fix typo

* make fix copies

* move to textual inversion mixin

* make it accept from sd-concept library

* accept list of paths to embeddings

* fix styling of stable diffusion pipeline

* add dummy TextualInversionMixin

* add docstring to textualinversionmixin

* add load textual inversion embeddings draft

* fix quality

* fix typo

* make fix copies

* move to textual inversion mixin

* make it accept from sd-concept library

* accept list of paths to embeddings

* fix styling of stable diffusion pipeline

* add dummy TextualInversionMixin

* add docstring to textualinversionmixin

* add case for parsing embedding from auto1111 UI format

Co-authored-by: Evan Jones <[email protected]>
Co-authored-by: Ana Tamais <[email protected]>

* fix style after rebase

* move textual inversion mixin to loaders

* move mixin inheritance to DiffusionPipeline from StableDiffusionPipeline)

* update dummy class name

* addressed allo comments

* fix old dangling import

* fix style

* proposal

* remove bogus

* Apply suggestions from code review

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Will Berman <[email protected]>

* finish

* make style

* up

* fix code quality

* fix code quality - again

* fix code quality - 3

* fix alt diffusion code quality

* fix model editing pipeline

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* Finish

---------

Co-authored-by: Evan Jones <[email protected]>
Co-authored-by: Ana Tamais <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Will Berman <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
Development

Successfully merging this pull request may close these issues.

Please create simple way to load new embeddings on Stable Diffusion Pipeline