Skip to content

[OpenVINO] Support Qwen3-next#1523

Merged
rkazants merged 49 commits intohuggingface:mainfrom
rkazants:support_qwen3_next
Mar 9, 2026
Merged

[OpenVINO] Support Qwen3-next#1523
rkazants merged 49 commits intohuggingface:mainfrom
rkazants:support_qwen3_next

Conversation

@rkazants
Copy link
Collaborator

@rkazants rkazants commented Nov 16, 2025

What does this PR do?

Example of conversion cmd-line for Qwen/Qwen3-Next-80B-A3B-Instruct:

optimum-cli export openvino -m Qwen/Qwen3-Next-80B-A3B-Instruct Qwen3-Next-80B-A3B-Instruct

Example of inference for Qwen/Qwen3-Next-80B-A3B-Instruct using OpenVINO backend:

from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

model_path = "./Qwen3-Next-80B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = OVModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)

# change input text as desired
input_text = "The capital of France is"
# tokenize the text
input_tokens = tokenizer(input_text, return_tensors="pt")
# generate output tokens
output = model.generate(**input_tokens, max_length=10)
# decode output tokens into text
output = tokenizer.batch_decode(output)
print(output[0])

Before submitting

  • [N/A] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@rkazants rkazants marked this pull request as draft November 16, 2025 13:10
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rkazants added 3 commits March 5, 2026 15:28
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Copy link
Member

@IlyasMoutawwakil IlyasMoutawwakil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@MaximProshin
Copy link
Contributor

FYI After fixes in OV for group-wise quantization, I was able to measure the accuracy for default int4 and I got (CPU, WWB Similarity): 0.960114

@rkazants rkazants merged commit 0566b76 into huggingface:main Mar 9, 2026
20 of 24 checks passed
@savvadesogle
Copy link

Thank you 🙏

@savvadesogle
Copy link

Hello, Roman. @rkazants

Can we convert models from the Intel repository to OpenVINO? AutoRound models.
And how to do in properly

https://huggingface.co/Intel/Qwen3-Coder-Next-int4-AutoRound

@rkazants
Copy link
Collaborator Author

optimum-cli export openvino -m Qwen/Qwen3-Next-80B-A3B-Instruct Qwen3-Next-80B-A3B-Instruct

Hi @savvadesogle,

Can you try this command for conversion:

optimum-cli export openvino -m Intel/Qwen3-Coder-Next-int4-AutoRound Qwen3-Coder-Next-int4-AutoRound

@ljaljushkin, @MaximProshin, @mvafin, did you see any problems for converting this quantized model from NNCF perspective or PyTorch FE?

Best regards,
Roman

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants