
Save and display per-token attention maps #1866


Merged: 8 commits, Dec 10, 2022

Conversation

damian0815
Contributor

@damian0815 damian0815 commented Dec 8, 2022

This pull request enables the display of per-token attention maps after generating an image.

Done:

  • Collect and return attention maps to generate.py
  • Pass attention maps and tokens to the webUI - currently pushed as the following fields on the object emitted to the webUI socket with generationResult:
    • attentionMaps (a base64 image of size width/8 x 77*height/8), and
    • tokens (see below).
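A hypothetical sketch of how a consumer might split the stacked attentionMaps image back into per-token maps, assuming the layout implied above: 77 strips of width/8 x height/8, stacked vertically into one width/8 x 77*height/8 image. The function name and use of PIL are illustrative assumptions, not part of this PR:

```python
import base64
import io

from PIL import Image

MAX_TOKENS = 77  # CLIP's context length, per the 77*height/8 dimension above


def split_attention_maps(b64_png: str) -> list[Image.Image]:
    """Decode the base64 image and return one crop per token position.

    Assumes per-token maps are stacked vertically, top to bottom.
    """
    stacked = Image.open(io.BytesIO(base64.b64decode(b64_png)))
    strip_h = stacked.height // MAX_TOKENS
    return [
        stacked.crop((0, i * strip_h, stacked.width, (i + 1) * strip_h))
        for i in range(MAX_TOKENS)
    ]
```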

Todo:

Typical content of the tokens array, e.g. for the prompt a fluffy miyazaki dog, is ['a</w>', 'fluffy</w>', 'mi', 'yaz', 'aki</w>', 'dog']. The </w> strings represent "end-of-word". With this implementation, to match tokens to fragments of the prompt text in the input box, the frontend code will have to crawl through the prompt and do a best-fit match of these tokens against the prompt string.
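The crawl described above could look something like the following sketch (written in Python for illustration, though the frontend would implement it in TypeScript). The function name and the greedy left-to-right strategy are assumptions; a real implementation would need to handle tokenizer normalization (lowercasing, unicode folding) more carefully:

```python
def match_tokens_to_prompt(tokens: list[str], prompt: str) -> list[tuple[int, int]]:
    """Greedily align each token to a (start, end) character span in `prompt`.

    '</w>' marks end-of-word, so a token's text is everything before it.
    """
    spans = []
    pos = 0
    for token in tokens:
        text = token.removesuffix("</w>")
        # skip whitespace between words
        while pos < len(prompt) and prompt[pos].isspace():
            pos += 1
        start = prompt.lower().find(text.lower(), pos)
        if start == -1:
            # tokenizer normalization defeated the naive match; emit empty span
            spans.append((pos, pos))
            continue
        end = start + len(text)
        spans.append((start, end))
        pos = end
    return spans
```

For the example prompt, this maps 'mi', 'yaz', 'aki</w>' onto three adjacent slices of the word "miyazaki".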

@blessedcoolant
Collaborator

Generation error: AttributeError: 'CrossAttention' object has no attribute 'cached_mem_free_total'

@damian0815 damian0815 force-pushed the feat_save_attention_maps_redo branch from 8b8a43e to a054fae Compare December 9, 2022 13:03
@damian0815 damian0815 marked this pull request as ready for review December 9, 2022 14:46
@damian0815
Contributor Author

This should be merged into main ASAP, even if that means the frontend isn't using it yet.

Attention map collection is always-on in this PR, but the memory/performance impact is negligible.

@psychedelicious psychedelicious self-requested a review December 10, 2022 10:52
@psychedelicious (Collaborator) left a comment


One simple change requested: adding types to a util function. I haven't been consistent with this at all, but I want to make an effort to add types whenever possible going forward.

This is not related to this particular PR, but we will need to save the attention map data somewhere besides just the gallery. The gallery image arrays are not persisted across reloads, and even if they were, resetting localStorage would of course clear them.

So we need to think about how to store them as metadata.

@damian0815 damian0815 merged commit 786b887 into invoke-ai:main Dec 10, 2022
lstein pushed a commit that referenced this pull request Dec 10, 2022
lstein pushed a commit that referenced this pull request Dec 11, 2022
* fix for crash using inpainting model

* prevent crash due to invalid attention_maps_saver