✨ feat: Add support for Qwen3 model #286


Merged · 8 commits · Apr 29, 2025

Conversation

@johnmai-dev (Contributor) opened this pull request

@awni (Member) commented Apr 28, 2025

Looks great!!

Can you add a model configuration for Qwen3 1.7B:

Assume it will be here: mlx-community/Qwen3-1.7B-4bit

And can you update the default model in the LLM app to use that config (as opposed to the dated older qwen of a similar size)?
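
For illustration, such a registration might look roughly like this (a sketch assuming the repository's existing `ModelConfiguration` API; the prompt text is made up):

```swift
// A sketch, not the PR's actual code: registering the Qwen3 1.7B 4-bit
// model alongside the other presets. The defaultPrompt text is made up.
public static let qwen3_1_7b_4bit = ModelConfiguration(
    id: "mlx-community/Qwen3-1.7B-4bit",
    defaultPrompt: "Why is the sky blue?"
)
```

The LLM app's default model could then point at this preset instead of the older Qwen preset of a similar size.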

@johnmai-dev (Contributor, Author)

> Looks great!!
>
> Can you add a model configuration for Qwen3 1.7B:
>
> Assume it will be here: mlx-community/Qwen3-1.7B-4bit
>
> And can you update the default model in the LLM app to use that config (as opposed to the dated older qwen of a similar size)?

Received, sir.🫡🫡🫡

@johnmai-dev johnmai-dev marked this pull request as ready for review April 28, 2025 17:23
@awni (Member) commented Apr 28, 2025

Thanks for the changes @johnmai-dev. The model runs and produces OK output, but it looks like there may be something off with the chat template 🤔

[Screenshot 2025-04-28 at 4:31:30 PM]

@johnmai-dev (Contributor, Author)

> Thanks for the changes @johnmai-dev. The model runs and produces OK output, but it looks like there may be something off with the chat template 🤔 [screenshot]

Yes, it is indeed an issue with Jinja. I will only have time to fix it after work this evening. The error message is: `Runtime: Cannot call something that is not a function: got UndefinedValue`

@DePasqualeOrg (Contributor) commented Apr 29, 2025

I added some missing string methods here: johnmai-dev/Jinja#15

This seems to fix the error.

@awni (Member) commented Apr 29, 2025

Awesome, thanks @DePasqualeOrg. Let us know when it lands. We may need to make a PR on swift-transformers to bump the Jinja package version as well.

@DePasqualeOrg (Contributor) commented Apr 29, 2025

The update is available now. Because swift-transformers uses `upToNextMinor` for the dependencies, and this was a patch update, mlx-swift-examples should pick up the latest version when you update to the latest package versions in Xcode.
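
For reference, the version arithmetic works out like this (a sketch; the exact lower bound in swift-transformers' manifest is an assumption):

```swift
// Package.swift sketch; the lower bound shown here is an assumption.
// .upToNextMinor(from: "1.1.0") resolves any 1.1.x release, so the
// Jinja 1.1.2 patch is picked up without a manifest change.
dependencies: [
    .package(
        url: "https://github.com/johnmai-dev/Jinja",
        .upToNextMinor(from: "1.1.0")
    )
]
```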

@johnmai-dev (Contributor, Author)

> Awesome, thanks @DePasqualeOrg. Let us know when it lands. We may need to make a PR on swift-transformers to bump the Jinja package version as well.

Jinja version 1.1.2 has been released. Just update the package 😉

[screenshot]

@johnmai-dev (Contributor, Author)

Thinking mode:

[screenshot]

Non-thinking mode:

[Screenshot: LLMEval 2025-04-29 21:37:04]
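
For illustration, such a toggle might be forwarded to the template roughly like this (a sketch; the parameter names are assumptions, not necessarily this PR's exact API):

```swift
// Illustrative only; the UserInput parameter names are assumptions.
// Qwen3's chat template reads an `enable_thinking` flag, so a UI
// toggle can be forwarded as extra template context.
let input = UserInput(
    messages: [["role": "user", "content": "Why is the sky blue?"]],
    additionalContext: ["enable_thinking": false]  // non-thinking mode
)
```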

@awni (Member) commented Apr 29, 2025

Awesome! Tested locally and it works well now.

Regarding the thinking toggle: I think we should find a way to keep it, because it's fun to play with.

But there are a couple of issues to think through there:

1. It will always be there, even for non-thinking models, which doesn't really make sense.
2. In the past I've seen other models use different tokenizer flags for that. So how does one generalize that to other models?

@DePasqualeOrg (Contributor)

I don't know if it will be possible to reliably determine this based on the tokenizer config files. My app Local Chat adds a lot of metadata about the models, such as whether they use thinking tags, whether the response starts with the start thinking tag, etc. To work reliably, this should probably be part of the model preset configurations, and more options may need to be added over time as models become more diverse.

@johnmai-dev (Contributor, Author)

> Awesome! Tested locally and it works well now.
>
> Regarding the thinking toggle: I think we should find a way to keep it, because it's fun to play with.
>
> But there are a couple of issues to think through there:
>
> 1. It will always be there, even for non-thinking models, which doesn't really make sense.
> 2. In the past I've seen other models use different tokenizer flags for that. So how does one generalize that to other models?

Yes, I'm also debating whether to add a non-thinking mode.😂 But if we don't add it, we'll have to default to thinking mode.

*(Merge commit resolving a conflict in mlx-swift-examples.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved)*
@awni (Member) commented Apr 29, 2025

> To work reliably, this should probably be part of the model preset configurations

Indeed, I was thinking something similar. Models may need optional additional config info, including which toggles they have (tools, thinking), and we can dynamically set that based on the config.

I'm ok keeping it the way it is for now. But with the intention of keeping an eye on it and refactoring / redesigning if it becomes more important.

@johnmai-dev (Contributor, Author)

> > To work reliably, this should probably be part of the model preset configurations
>
> Indeed, I was thinking something similar. Models may need optional additional config info, including which toggles they have (tools, thinking), and we can dynamically set that based on the config.
>
> I'm ok keeping it the way it is for now. But with the intention of keeping an eye on it and refactoring / redesigning if it becomes more important.

I had planned to expose `modelType` and only display the thinking toggle when `modelType` is Qwen3.

@johnmai-dev (Contributor, Author)

> I'm ok keeping it the way it is for now. But with the intention of keeping an eye on it and refactoring / redesigning if it becomes more important.

If we keep it, this PR is ready for review & merge. 🍻
@awni @davidkoski

@awni (Member) commented Apr 29, 2025

> I had planned to expose `modelType` and only display the thinking toggle when `modelType` is Qwen3.

Yes, something like that could also work. Though I think it may be better to read it as an optional field in the config rather than hard-code the model types that should have different features.

But let's do this type of stuff in a follow-on PR so we can prioritize merging this without the extra complication.

@awni (Member) left a review comment

LGTM!

@davidkoski (Collaborator)

Yeah, I think we can keep the toggle there in the example (just like we have the tool piece showing how to integrate that). If there is some metadata in the tokenizer config we can hook that up -- maybe there is something similar for tools?

@davidkoski (Collaborator)

Looks great, thank you for the contribution! @johnmai-dev and @DePasqualeOrg

@davidkoski merged commit a6cb969 into ml-explore:main on Apr 29, 2025 (1 check passed)

@davidkoski mentioned this pull request on Apr 29, 2025
@mzbac (Contributor) commented Apr 30, 2025

Thank you all for the awesome work; I can't wait to integrate it into my MLX Swift projects.

@DePasqualeOrg (Contributor)

> I had planned to expose `modelType` and only display the thinking toggle when `modelType` is Qwen3.

It might be easier to maintain if you use a field related to the functionality you want to control. For this type of additional functionality that is specific to some models, my app has a field that is a list of features, one of which could be an enum case `toggleThinking`.
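
For illustration, such a preset might look roughly like this (a sketch, not code from either app; all names are made up):

```swift
// An illustrative sketch: a feature list on the preset lets the UI
// show a toggle only when the model declares support for it.
enum ModelFeature: String, Codable {
    case toggleThinking
    case tools
}

struct ModelPreset: Codable {
    let id: String
    let features: [ModelFeature]
}

let qwen3 = ModelPreset(
    id: "mlx-community/Qwen3-1.7B-4bit",
    features: [.toggleThinking, .tools]
)
let showThinkingToggle = qwen3.features.contains(.toggleThinking)
```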
