@gante gante commented Apr 8, 2025

What does this PR do?

Same as #37249, but applied to Llama 4

Also updates the docstrings of the Hybrid Caches.

@gante gante requested a review from ArthurZucker April 8, 2025 11:39
@github-actions github-actions bot marked this pull request as draft April 8, 2025 11:39
github-actions bot commented Apr 8, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

Hybrid Cache class to be used with `torch.compile` for Gemma2 models that alternate between a local sliding window attention
and global attention in every other layer. Under the hood, Hybrid Cache leverages ["SlidingWindowCache"] for sliding window attention
and ["StaticCache"] for global attention. For more information, see the documentation of each subcomponent cache class.
Hybrid Cache class to be used with `torch.compile` for models that alternate between a local sliding window

(the two classes had the same docstring)
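
For readers unfamiliar with the layout the docstring describes, here is a minimal pure-Python sketch of the idea: local layers keep only a sliding window of recent entries, while global layers write into a fixed-size, pre-allocated buffer. This is not the `transformers` implementation. The class names (`ToySlidingLayer`, `ToyStaticLayer`, `ToyHybridCache`), the `update` signature, and the choice of which layer indices are local are all assumptions made for illustration.

```python
from collections import deque


class ToySlidingLayer:
    """Local attention: keeps only the most recent `window` entries."""

    def __init__(self, window):
        # A bounded deque silently drops the oldest entry once it is full.
        self.buf = deque(maxlen=window)

    def update(self, kv):
        self.buf.append(kv)
        return list(self.buf)


class ToyStaticLayer:
    """Global attention: fixed-size buffer allocated up front."""

    def __init__(self, max_len):
        self.buf = [None] * max_len
        self.pos = 0  # number of slots filled so far

    def update(self, kv):
        if self.pos >= len(self.buf):
            raise IndexError("static cache is full")
        self.buf[self.pos] = kv
        self.pos += 1
        return self.buf[: self.pos]


class ToyHybridCache:
    """Alternates sliding-window and static layers, one cache per layer."""

    def __init__(self, num_layers, window, max_len):
        # Assumption: even-indexed layers are local, odd-indexed are global.
        self.layers = [
            ToySlidingLayer(window) if i % 2 == 0 else ToyStaticLayer(max_len)
            for i in range(num_layers)
        ]

    def update(self, layer_idx, kv):
        # Dispatch to whichever cache type backs this layer.
        return self.layers[layer_idx].update(kv)
```

Both layer types have a fixed memory footprint, which is the property that makes this kind of layout a reasonable fit for `torch.compile`: no buffer grows as generation proceeds.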

@ArthurZucker ArthurZucker left a comment

Thanks

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante marked this pull request as ready for review April 8, 2025 13:55
@gante gante merged commit 35f0f5b into huggingface:main Apr 8, 2025
20 checks passed
@gante gante deleted the l4_nits branch April 8, 2025 14:56
cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025