[Hotfix] solve fp8 w8a8 ci test fail #4531


Merged: 6 commits from fix_fp8_w8a8_ci_test into main on Mar 18, 2025
Conversation

@BBuf (Collaborator) commented Mar 18, 2025

Motivation

Modifications

Checklist

@zhyncs merged commit dd865be into main on Mar 18, 2025
3 of 20 checks passed
@zhyncs deleted the fix_fp8_w8a8_ci_test branch on March 18, 2025 at 06:17
@qeternity (Contributor) commented
This commit has broken loading of older Marlin packed models.

KeyError: 'model.layers.0.mlp.gate_up_proj.B'

Looking into it now.

@qeternity (Contributor) commented Mar 22, 2025

OK, so this is actually related to the deprecation of the SGLang types: these checkpoints now pass the check_marlin_supported check in GPTQMarlinConfig. The early Marlin reference code set the model config's quant method to gptq with flags like is_marlin_format. Now that we use vllm.scalar_type.ScalarType instead of sglang.srt.layers.quantization.utils.ScalarType, the type check passes and causes this error (previously the type check failed, which forced use of the Marlin config).
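
A minimal, self-contained sketch of that dispatch change (the function and key names below are illustrative assumptions, not sglang's actual API):

```python
# Sketch of why the dispatch changed: older Marlin-packed checkpoints declare
# quant_method "gptq" plus an is_marlin_format flag, and previously relied on
# the GPTQ-Marlin compatibility check *failing* so the plain Marlin config
# (which knows about tensors like "...gate_up_proj.B") was used instead.

def check_marlin_supported(quant_cfg: dict) -> bool:
    # Stand-in for GPTQMarlinConfig's compatibility check. With the old
    # sglang ScalarType shim this effectively returned False for these
    # checkpoints; with vllm.scalar_type.ScalarType it now returns True.
    return quant_cfg.get("bits") in (4, 8) and quant_cfg.get("sym", True)

def select_quant_config(quant_cfg: dict) -> str:
    if quant_cfg.get("quant_method") == "gptq" and check_marlin_supported(quant_cfg):
        # New behaviour: GPTQ-Marlin path, which expects GPTQ-style tensor
        # names and raises KeyError on the old Marlin "B" tensors.
        return "gptq_marlin"
    if quant_cfg.get("is_marlin_format") or quant_cfg.get("quant_method") == "marlin":
        # Old behaviour for these checkpoints: plain Marlin loader.
        return "marlin"
    return "gptq"

# Older Marlin-packed checkpoint config: previously resolved to "marlin",
# now resolves to "gptq_marlin" and fails to find the packed "B" weights.
print(select_quant_config({"quant_method": "gptq", "is_marlin_format": True,
                           "bits": 4, "sym": True}))
```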

Changing the quant method in the model config to marlin resolves this.
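
For reference, a minimal sketch of that config edit, assuming the checkpoint ships a Hugging Face style config.json with a quantization_config block (the file path and key names are assumptions; older checkpoints may keep this in a separate quantize_config.json instead):

```python
import json
from pathlib import Path

# Assumed location and key layout -- adjust for how the checkpoint
# actually stores its quantization config.
cfg_path = Path("path/to/model/config.json")
cfg = json.loads(cfg_path.read_text())

quant_cfg = cfg.get("quantization_config", {})
quant_cfg["quant_method"] = "marlin"   # was "gptq" (+ is_marlin_format)
cfg["quantization_config"] = quant_cfg

cfg_path.write_text(json.dumps(cfg, indent=2))
```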

@qeternity (Contributor) commented

Not sure if we want to fix this or just have people change older configs, but the PR is here: #4675. Feel free to close it if it's out of scope.
