
Feature Request: Granite 4 Support #13275


Open
5 of 16 tasks
gabe-l-hart opened this issue May 2, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@gabe-l-hart
Contributor

gabe-l-hart commented May 2, 2025

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

This issue tracks the work needed to support IBM's Granite 4 model architecture (GraniteMoEHybrid in transformers). The model uses a number of components that are not yet supported in llama.cpp. These are being worked on independently, so I'm raising this issue to triangulate the different work streams needed to support the model.

Necessary Components

Motivation

I lead IBM's efforts to ensure that Granite models work everywhere, and llama.cpp is a critical part of "everywhere!"

Possible Implementation

No response


@ngxson
Collaborator

ngxson commented May 3, 2025

Support for NoPE positional encoding instead of RoPE

If this is the same idea as in Llama 4, then I think we already support it. In short, it's just an if condition:

llama.cpp/src/llama-model.cpp

Lines 4536 to 4547 in 3bf785f

if (use_rope) {
    Qcur = ggml_rope_ext(
            ctx0, Qcur, inp_pos, rope_factors,
            n_rot, rope_type, n_ctx_orig, freq_base, freq_scale,
            ext_factor, attn_factor, beta_fast, beta_slow
            );
    Kcur = ggml_rope_ext(
            ctx0, Kcur, inp_pos, rope_factors,
            n_rot, rope_type, n_ctx_orig, freq_base, freq_scale,
            ext_factor, attn_factor, beta_fast, beta_slow
            );
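
To make the "just an if condition" point concrete, here is a minimal, self-contained sketch of a per-layer RoPE/NoPE switch. It is not llama.cpp code. `layer_uses_rope`, `toy_rope`, `position_encode`, and the `no_rope_step` parameter are hypothetical names, and the "every N-th layer skips RoPE" rule is only an assumption for illustration. How Granite 4 actually decides which layers use NoPE would need to be confirmed against the model definition.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical rule (an assumption, not taken from llama.cpp): every
// `no_rope_step`-th layer, counting from 1, skips RoPE. A step <= 0
// means every layer uses RoPE.
static bool layer_uses_rope(int il, int no_rope_step) {
    return no_rope_step <= 0 || (il + 1) % no_rope_step != 0;
}

// Toy rotary embedding over consecutive pairs (x[i], x[i + 1]).
// It stands in for ggml_rope_ext only to illustrate the branch.
static void toy_rope(std::vector<float> & x, int pos, float freq_base = 10000.0f) {
    assert(x.size() % 2 == 0);
    const size_t n = x.size();
    for (size_t i = 0; i + 1 < n; i += 2) {
        // Rotation angle for this pair: pos * freq_base^(-i / n)
        const float theta = pos * std::pow(freq_base, -(float) i / (float) n);
        const float c  = std::cos(theta);
        const float s  = std::sin(theta);
        const float x0 = x[i];
        const float x1 = x[i + 1];
        x[i]     = x0 * c - x1 * s;
        x[i + 1] = x0 * s + x1 * c;
    }
}

// The switch itself: RoPE layers rotate Q and K, NoPE layers leave
// them untouched.
static void position_encode(std::vector<float> & q, std::vector<float> & k,
                            int pos, bool use_rope) {
    if (use_rope) {
        toy_rope(q, pos);
        toy_rope(k, pos);
    }
}
```

In a layer loop this would be called as `position_encode(q, k, pos, layer_uses_rope(il, no_rope_step))`, so a NoPE layer needs no separate code path beyond the skipped branch.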
