System Info
transformers 4.36.1.
transformers/models/llama/modeling_llama.py", line 1093, in forward
next_cache = next_decoder_cache.to_legacy_cache() if use_legacy_cache else next_decoder_cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'to_legacy_cache'
This error pops up when running inference with a Llama 2 model on the new transformers 4.36.1. I didn't test 4.36.0. It was running correctly with 4.35.x.
This seems to be related to the changes from #26681 and commit 633215b.
Tagging @ArthurZucker and @younesbelkada, per the suggestions under "Who can help?".
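For context, a minimal sketch of what that refactor changed, as I understand it (my reading, not from the linked PR): in 4.36 the decoding loop is expected to carry a Cache object, and to_legacy_cache() converts it back to the pre-4.36 tuple format on the way out. The crash means the variable already holds such a tuple. The tensor shapes below are made up for illustration.

import torch
from transformers.cache_utils import DynamicCache

cache = DynamicCache()
# made-up shapes: (batch, num_heads, seq_len, head_dim)
key = torch.zeros(1, 8, 4, 64)
value = torch.zeros(1, 8, 4, 64)
cache.update(key, value, layer_idx=0)

legacy = cache.to_legacy_cache()  # tuple of per-layer (key, value) pairs
assert isinstance(legacy, tuple)
# a plain tuple has no .to_legacy_cache(), hence the AttributeError above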
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Sorry that I don't have an easy repro right now. Here is the relevant stack trace:
File "###transformers/generation/utils.py", line 1764, in generate
return self.sample(
^^^^^^^^^^^^
File "###transformers/generation/utils.py", line 2861, in sample
outputs = self(
^^^^^
File "###torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "###transformers/models/llama/modeling_llama.py", line 1181, in forward
outputs = self.model(
^^^^^^^^^^^
File "###torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "###transformers/models/llama/modeling_llama.py", line 1093, in forward
next_cache = next_decoder_cache.to_legacy_cache() if use_legacy_cache else next_decoder_cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'to_legacy_cache'
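Since I can't share my exact setup, here is a hedged sketch of the call pattern that hits this path, matching the generate -> sample -> forward frames above. The checkpoint name and generation settings are illustrative, not my actual ones:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed: any Llama 2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
# do_sample=True routes generate() through sample(), as in the trace;
# on 4.36.1 the forward pass then raises the AttributeError
output = model.generate(**inputs, do_sample=True, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))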
Expected behavior
Inference should complete without crashing, as it did with 4.35.x; instead it crashes with the stack trace provided above.
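A possible stopgap (an assumption on my part, not a confirmed fix): if calling code still passes the pre-4.36 tuple format as past_key_values, wrapping it in the new cache object before the forward call keeps a plain tuple from reaching to_legacy_cache(). Otherwise, pinning transformers==4.35.2 avoids the issue.

from transformers.cache_utils import DynamicCache

def ensure_cache(past_key_values):
    # hypothetical helper: wrap a legacy tuple of per-layer (key, value)
    # pairs in the new Cache object; pass None or a Cache through unchanged
    if isinstance(past_key_values, tuple):
        return DynamicCache.from_legacy_cache(past_key_values)
    return past_key_values

# usage sketch: model(input_ids, past_key_values=ensure_cache(past), use_cache=True)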