Fix Falcon weight mapping for H2O.ai checkpoints#953

Merged
Narsil merged 1 commit into huggingface:main from Vinno97:feature/falcon-fix-weight-alias
Aug 31, 2023

Conversation

Vinno97 commented Aug 30, 2023

What does this PR do?

During the safetensors conversion, duplicate weights are removed. However, which of the duplicates gets removed differs per checkpoint. In some, like h2oai/h2ogpt-oig-oasst1-falcon-40b, the weight transformer.word_embeddings.weight gets removed. In others, lm_head.weight gets removed. Long story short, we need to support both.

Originally, f018143 mapped lm_head to word_embeddings. Then ac736fd switched this around. This commit merges them and allows for both.
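
To illustrate the idea, here is a minimal sketch of a two-way alias lookup. It is not the code from this PR: `ALIASES` and `resolve_tensor_name` are hypothetical names chosen for the example, and the set of available tensor names stands in for whatever the converted checkpoint actually contains.

```python
from typing import Dict, Iterable, List

# Hypothetical alias table: each tied weight can fall back to the other one,
# so it does not matter which duplicate the conversion dropped.
ALIASES: Dict[str, List[str]] = {
    "lm_head.weight": ["transformer.word_embeddings.weight"],
    "transformer.word_embeddings.weight": ["lm_head.weight"],
}


def resolve_tensor_name(
    name: str,
    available: Iterable[str],
    aliases: Dict[str, List[str]] = ALIASES,
) -> str:
    """Return the name under which `name` can be found in the checkpoint."""
    available_names = set(available)
    # Prefer the requested name when the checkpoint still has it.
    if name in available_names:
        return name
    # Otherwise try each alias in order.
    for alias in aliases.get(name, []):
        if alias in available_names:
            return alias
    raise KeyError(f"Weight {name} not found, and none of its aliases exist")


# Checkpoint where the embedding copy was dropped during conversion.
only_lm_head = ["lm_head.weight", "transformer.h.0.mlp.dense_h_to_4h.weight"]
# Checkpoint where the lm_head copy was dropped instead.
only_embeddings = ["transformer.word_embeddings.weight"]

resolved_a = resolve_tensor_name("transformer.word_embeddings.weight", only_lm_head)
resolved_b = resolve_tensor_name("lm_head.weight", only_embeddings)
```

With a mapping in both directions, a lookup for either name succeeds regardless of which duplicate survived the conversion.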

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Narsil, you wrote both commits I referenced in this PR. I think you'll understand this change :)


Narsil commented Aug 31, 2023

@Vinno97 The supposed randomness should have been removed by earlier commits (using transformers hints).
Ofc the hints will not work for Falcon, so let's go with this.

Narsil merged commit 8a5f564 into huggingface:main on Aug 31, 2023