Decouple custom ops in llama_transformer.py Part 1/N (#3005) #3052
Summary:
This is a no-op refactor: it decouples the custom SDPA op from `llama_transformer.py` without changing model behavior (a sketch of the pattern follows below).
Pull Request resolved: #3005
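
For context, one common way to achieve this kind of decoupling is a module-swap source transformation: the model file keeps a plain SDPA module with no custom-op dependency, and the export path optionally replaces it with a custom-op-backed variant. The sketch below is illustrative only; the names `SDPASimple`, `SDPACustom`, and `replace_sdpa_with_custom_op` are assumptions for this sketch, not necessarily the identifiers used in this PR.

```python
import torch.nn as nn
import torch.nn.functional as F


class SDPASimple(nn.Module):
    """Plain SDPA kept in the model definition; no custom-op dependency."""

    def forward(self, q, k, v, mask=None):
        return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


class SDPACustom(nn.Module):
    """Stand-in for a module that would dispatch to the fused custom op
    (e.g. torch.ops.llama.sdpa_with_kv_cache in ExecuTorch). Falls back
    to plain SDPA here so this sketch runs without the custom op built."""

    def forward(self, q, k, v, mask=None):
        return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)


def replace_sdpa_with_custom_op(module: nn.Module) -> nn.Module:
    """Recursively swap every SDPASimple child for SDPACustom, so the
    custom op enters the graph only at export time, never in the model file."""
    for name, child in module.named_children():
        if isinstance(child, SDPASimple):
            setattr(module, name, SDPACustom())
        else:
            replace_sdpa_with_custom_op(child)
    return module
```

Under this assumed pattern, the swap would run only when the custom-op flag is passed at export time, which is why a change like this can be a behavioral no-op for the default path.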
Test Plan:
CI.

Run with

```
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -kv --use_sdpa_with_kv_cache -X
```

and with

```
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -kv -X
```

and make sure both work (the first command exercises the custom SDPA-with-KV-cache op; the second exercises the default SDPA path).
Reviewed By: cccclai
Differential Revision: D56048177
Pulled By: mergennachin
fbshipit-source-id: 3ac9ac5c34f6fe215de1cfe8b5ddc7aae3635359

(cherry picked from commit 488afc5)