@MekkCyber
Contributor

What does this PR do?

Change the model used to test the quanto KV cache implementation. The previous model, llama2-7b-hf, is very old and triggers an error related to rotary_emb: the rotary embeddings were moved to the base model and are no longer part of the attention module. See #32135.

Who can review?

@SunMarc

@MekkCyber MekkCyber requested a review from SunMarc March 12, 2025 13:55
@github-actions github-actions bot marked this pull request as draft March 12, 2025 13:55
@github-actions
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@MekkCyber MekkCyber marked this pull request as ready for review March 12, 2025 14:01
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@SunMarc SunMarc left a comment

Thanks !

@MekkCyber MekkCyber merged commit 47cc4da into main Mar 13, 2025
11 checks passed
@MekkCyber MekkCyber deleted the fixing_quanto_kv_cache branch March 13, 2025 11:23

4 participants