[llama 4] dynamic rope decorator #37365

gante · 2025-04-08T11:39:36Z

What does this PR do?

Same as #37249, but applied to Llama 4

Also updates the docstrings of the Hybrid Caches.

github-actions · 2025-04-08T11:39:49Z

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

gante · 2025-04-08T11:40:51Z

src/transformers/cache_utils.py

-    Hybrid Cache class to be used with `torch.compile` for Gemma2 models that alternate between a local sliding window attention
-    and global attention in every other layer. Under the hood, Hybrid Cache leverages ["SlidingWindowCache"] for sliding window attention
-    and ["StaticCache"] for global attention. For more information, see the documentation of each subcomponeent cache class.
+    Hybrid Cache class to be used with `torch.compile` for models that alternate between a local sliding window


(the two classes had the same docstring)
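To illustrate the layout the docstring describes, here is a minimal toy sketch of a cache that alternates between a sliding-window store and a fixed-size store per layer. It is not the `transformers` implementation: the class names `ToySlidingWindowCache`, `ToyStaticCache` and `ToyHybridCache`, the even/odd layer assignment, and the `demo` helper are all assumptions made for illustration only.

```python
from collections import deque


class ToySlidingWindowCache:
    """Hypothetical local-attention cache: keeps only the last `window` entries."""

    def __init__(self, window):
        self.entries = deque(maxlen=window)

    def update(self, kv):
        # Appending to a full deque silently evicts the oldest entry.
        self.entries.append(kv)
        return list(self.entries)


class ToyStaticCache:
    """Hypothetical global-attention cache: pre-allocated slots filled in order."""

    def __init__(self, max_len):
        self.slots = [None] * max_len
        self.length = 0

    def update(self, kv):
        # Raises IndexError once more than `max_len` entries are written.
        self.slots[self.length] = kv
        self.length += 1
        return self.slots[: self.length]


class ToyHybridCache:
    """Hypothetical hybrid cache: one sub-cache per layer, alternating kinds."""

    def __init__(self, num_layers, window, max_len):
        # Assumption: even layers are local (sliding window), odd layers are global.
        self.layers = [
            ToySlidingWindowCache(window) if i % 2 == 0 else ToyStaticCache(max_len)
            for i in range(num_layers)
        ]

    def update(self, layer_idx, kv):
        return self.layers[layer_idx].update(kv)


def demo():
    cache = ToyHybridCache(num_layers=2, window=2, max_len=4)
    for token in ["a", "b", "c"]:
        local_view = cache.update(0, token)   # sliding-window layer
        global_view = cache.update(1, token)  # global layer
    return local_view, global_view
```

After three updates, the sliding-window layer in this sketch holds only the last two entries, while the global layer holds all three.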

ArthurZucker

Thanks

HuggingFaceDocBuilderDev · 2025-04-08T12:05:22Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

l4 + dynamic rope decorator

d43d8a1

gante requested a review from ArthurZucker April 8, 2025 11:39

github-actions bot marked this pull request as draft April 8, 2025 11:39

gante commented Apr 8, 2025

ArthurZucker approved these changes Apr 8, 2025

gante marked this pull request as ready for review April 8, 2025 13:55

Merge branch 'main' into l4_nits

df7b933

gante merged commit 35f0f5b into huggingface:main Apr 8, 2025
20 checks passed

gante deleted the l4_nits branch April 8, 2025 14:56

cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025

[llama 4] dynamic rope decorator (huggingface#37365)

0b38139

l4 + dynamic rope decorator

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025

[llama 4] dynamic rope decorator (huggingface#37365)

1c34b19

l4 + dynamic rope decorator

3 participants
