# Fix convert to original state dict for VLMs #38385
Conversation
thanks
Commits:
* fix convert to original state dict
* fix
* lint
* Update modeling_utils.py
* make it go brrrr
* date time
* update
* fix
* up
* uppp
* up
* no number i
* udpate
* fix
* [paligemma] fix processor with suffix (#38365) fix pg processor
* [video utils] group and reorder by number of frames (#38374) fix
* Fix convert to original state dict for VLMs (#38385)
  * fix convert to original state dict
  * fix
  * lint
  * Update modeling_utils.py
* update
* warn
* no verbose
* fginal
* ouft
* style

Co-authored-by: Raushan Turganbay <[email protected]>
Co-authored-by: hoshi-hiyouga <[email protected]>
@zucchini-nlp Thanks for merging. Could we have a unit test for this function?
I don't think we need a test, since it's fixed already and we probably won't touch that part anymore. We have a test for loading the checkpoint and for creating a new dummy model, then loading-saving-loading-saving it back. That is enough imo.
@zucchini-nlp Okay. By the way, do you think we should override the
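For reference, the loading-saving round trip described in the comment above might look roughly like the sketch below. This is not the actual test suite: the tiny checkpoint id is an assumption standing in for the dummy models the real tests build themselves, and the only point is that key names and tensor values should survive `save_pretrained` followed by `from_pretrained`.

```python
import tempfile

import torch
from transformers import Qwen2VLForConditionalGeneration


def check_round_trip(model_id: str) -> None:
    """Save and reload a model, checking that keys and weights are unchanged."""
    model = Qwen2VLForConditionalGeneration.from_pretrained(model_id)
    original = {k: v.clone() for k, v in model.state_dict().items()}

    with tempfile.TemporaryDirectory() as tmp:
        # save_pretrained maps keys back to the original checkpoint names;
        # from_pretrained maps them forward again, so nothing should change.
        model.save_pretrained(tmp)
        reloaded = Qwen2VLForConditionalGeneration.from_pretrained(tmp)

    assert set(original) == set(reloaded.state_dict())
    for name, tensor in reloaded.state_dict().items():
        assert torch.equal(original[name], tensor), name


# Hypothetical tiny checkpoint; substitute any small Qwen2-VL-style model.
check_round_trip("hf-internal-testing/tiny-random-Qwen2VLForConditionalGeneration")
```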
### What does this PR do?

Fixes #1710

1. vLLM 0.9.0 does not support `limit_mm_per_prompt=None`; this parameter must be a `dict` (a sketch of the `dict` form follows the checklist below).
2. Transformers 4.52.* changes the weight keys in the model state dict, causing mismatches with vLLM's weight loader.

See also: huggingface/transformers#38385, vllm-project/vllm#19054, vllm-project/vllm#19151

### Test

run `bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh`

### Checklist Before Submitting

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that covers the code path.
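As a concrete illustration of point 1 above, the vLLM engine would be constructed with `limit_mm_per_prompt` as a dict. The model id and per-modality limits below are assumptions for this sketch, not values taken from the verl script.

```python
from vllm import LLM

# Hedged sketch: vLLM 0.9.0 rejects limit_mm_per_prompt=None, so pass a dict
# mapping each modality to the maximum number of items allowed per prompt.
llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",            # assumption; use your own model
    limit_mm_per_prompt={"image": 4, "video": 1},   # illustrative limits
)
```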
### What does this PR do?
#37033 introduces the base models for all VLMs. Model weights are converted by mapping the original keys according to the conversion mapping defined in `src/transformers/models/qwen2_vl/modeling_qwen2_vl.py`, lines 1735 to 1738 at `701caef`.

However, the previous implementation of Transformers could not properly convert the weights back, due to an existing bug (`replacement = re.sub(r"\(.*?\)", "", pattern)` should be `replacement = re.sub(r"\(.*?\)", "", replacement)`) and the lack of support for nested parentheses in the reverse-conversion code in `src/transformers/modeling_utils.py`, lines 3644 to 3657 at `701caef`.
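To make the two issues concrete, here is a minimal, self-contained sketch of the reverse key conversion. The mapping entries and the `strip_groups`/`revert_key` helpers are illustrative assumptions that mirror the idea of the fix; they are not the exact code merged into `modeling_utils.py`.

```python
import re

# Illustrative mapping in the shape of a VLM checkpoint-conversion mapping
# (regex on the original key -> new prefix); these entries mimic the Qwen2-VL
# case but are not the exact mapping shipped in transformers.
conversion_mapping = {
    r"^visual": "model.visual",
    r"^model(?!\.(language_model|visual))": "model.language_model",
}


def strip_groups(expr: str) -> str:
    """Remove (possibly nested) parenthesized regex groups from `expr`."""
    # A single non-greedy pass such as re.sub(r"\(.*?\)", "", expr) stops at the
    # first ")" and leaves debris behind for nested groups like
    # "(?!\.(language_model|visual))"; stripping innermost groups repeatedly
    # handles the nesting.
    while "(" in expr:
        expr = re.sub(r"\([^()]*\)", "", expr)
    return expr


def revert_key(key: str) -> str:
    """Map a converted key back to its original checkpoint name (simplified)."""
    reverse_mapping = {new: old for old, new in conversion_mapping.items()}
    for pattern, replacement in reverse_mapping.items():
        # The pre-fix code effectively did
        #     replacement = re.sub(r"\(.*?\)", "", pattern)
        # i.e. it stripped groups from `pattern` and overwrote `replacement`,
        # so the substitution below replaced the new prefix with itself and the
        # key was never reverted. Stripping groups from `replacement` instead
        # restores the intended behaviour.
        replacement = strip_groups(replacement.lstrip("^"))
        key, n_replaced = re.subn(pattern, replacement, key)
        if n_replaced:
            break
    return key


# Reverts to the original "model.layers.0.self_attn.q_proj.weight".
print(revert_key("model.language_model.layers.0.self_attn.q_proj.weight"))
```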
We want to provide a more accurate weight-conversion implementation to prevent issues with third-party apps, e.g. hiyouga/LLaMA-Factory#8147.
### Before submitting

- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
### Who can review?
@ArthurZucker @zucchini-nlp