mistral3 #1561

Open
ewof opened this issue Apr 27, 2025 · 8 comments · May be fixed by #1563

Comments

ewof commented Apr 27, 2025

Can someone reply here when/if mistral3 support is added? (Not sure how ISTA-DASLab/Mistral-Small-3.1-24B-Instruct-2503-GPTQ-4b-128g was made.)

ewof commented Apr 27, 2025

In the meantime, if anyone else is looking for a GPTQ Mistral Small, jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g worked for me.
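
For anyone trying that checkpoint, here is a minimal loading sketch (assuming the GPTQModel.load entry point shown in the tracebacks below; the tokenizer handling and prompt are just placeholders):

from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_id = "jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g"

# Load the already-quantized checkpoint; no quantize config is needed here
# because the GPTQ metadata ships with the repo.
model = GPTQModel.load(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))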

Qubitium (Collaborator) commented

@ewof What error do you get with Mistral3?

ewof commented Apr 29, 2025

INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Traceback (most recent call last):
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
    main()
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
    model = GPTQModel.load(pretrained_model_id, quantize_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 261, in load
    return cls.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 289, in from_pretrained
    model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 198, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: mistral3 isn't supported yet.
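
The failure is the model-type registry check visible in the traceback: config.model_type is looked up in GPTQModel's supported-model map. A simplified sketch based on the frames above (placeholder map entries, not the library's exact code):

from transformers import AutoConfig

# Placeholder entries for illustration; the real map lives in gptqmodel/models/auto.py.
MODEL_MAP = {"llama": "LlamaGPTQ", "mistral": "MistralGPTQ"}

def check_and_get_model_type(model_id_or_path, trust_remote_code=False):
    config = AutoConfig.from_pretrained(model_id_or_path, trust_remote_code=trust_remote_code)
    if config.model_type not in MODEL_MAP:  # "mistral3" was missing here before #1563
        raise TypeError(f"{config.model_type} isn't supported yet.")
    return config.model_type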

Qubitium linked a pull request Apr 29, 2025 that will close this issue
Qubitium (Collaborator) commented

@wemoveon2 Please check out PR/branch #1563 and recompile gptqmodel using

git clone https://github.com/ModelCloud/GPTQModel
cd GPTQModel
git checkout Qubitium-patch-1
pip install -e . --no-build-isolation -v

and check if mistral3 is fixed.
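
A quick way to confirm the branch install before rerunning the example (illustrative only: the version attribute, the QuantizeConfig import, and the base model id are my assumptions, not part of this thread):

import gptqmodel
print(gptqmodel.__version__)  # should now report the locally built checkout

from gptqmodel import GPTQModel, QuantizeConfig

# Before the branch this raised "mistral3 isn't supported yet."; on the patched
# branch the model type should at least be recognized by the loader.
model = GPTQModel.load(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    QuantizeConfig(bits=4, group_size=128),
)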

ewof commented Apr 29, 2025

INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
WARNING:datasets.load:Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
WARNING:datasets.packaged_modules.cache.cache:Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
INFO  Estimated Quantization BPW (bits per weight): 4.2875 bpw, based on [bits: 4, group_size: 128]
INFO  Loader: Auto dtype (native bfloat16): `torch.bfloat16`
Traceback (most recent call last):
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
    main()
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
    model = GPTQModel.load(pretrained_model_id, quantize_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 262, in load
    return cls.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 291, in from_pretrained
    return MODEL_MAP[model_type].from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/loader.py", line 190, in from_pretrained
    model = cls.loader.from_pretrained(model_local_path, config=config, **model_init_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 574, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV3Config, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, GitConfig, GlmConfig, Glm4Config, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config.

I'm on transformers 4.51.3.
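
For context on that ValueError: Mistral3Config simply isn't in AutoModelForCausalLM's mapping. As a point of comparison only (plain transformers usage, not the fix in #1563; the base model id is an assumption), the multimodal auto class does recognize it:

from transformers import AutoModelForImageTextToText

# AutoModelForCausalLM (the loader used in the traceback) only covers plain text
# decoders; recent transformers routes mistral3 through the image-text-to-text class.
model = AutoModelForImageTextToText.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    torch_dtype="bfloat16",
)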

Qubitium (Collaborator) commented

@ewof Is Mistral3 a visual (hybrid) model that takes both text and visual input?

ewof commented Apr 29, 2025

Yeah, and vision_config.model_type is pixtral in the model's config.json.
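
The relevant fields can be checked without downloading the weights (a sketch; the model id and the text_config value are assumptions, only model_type and vision_config.model_type are confirmed in this thread):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mistral-Small-3.1-24B-Instruct-2503")
print(cfg.model_type)                # "mistral3" -- the hybrid wrapper
print(cfg.vision_config.model_type)  # "pixtral"  -- the vision tower
print(cfg.text_config.model_type)    # assumed "mistral" -- the text decoder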

Qubitium (Collaborator) commented

@ewof Ugh... hybrid models need manual quantization support, since different hybrid models use standard and non-standard ways of defining how the secondary model (multiple models inside one model config) is declared.

We will try to tackle this with manual Mistral3 support first, then create generic code so that all future multi-modal models can work without too much integration effort. Right now, hybrid models are a pain since no one has agreed yet on how the modeling code (preprocessing) and the forwarding hand-offs should work internally. Wild wild west.
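
To illustrate the kind of per-model plumbing this implies (a sketch only; the language_model/vision_tower names are common transformers conventions, not GPTQModel internals or the plan for #1563):

def find_quantizable_decoder_layers(model):
    # A GPTQ-style quantizer needs the repeating text decoder blocks. In a plain
    # text model they usually live at model.model.layers, but a hybrid model wraps
    # the text stack (e.g. model.language_model) alongside a vision tower and
    # projector that are normally left unquantized, and each architecture nests
    # these submodules slightly differently -- hence the manual support.
    text_model = getattr(model, "language_model", model)
    inner = getattr(text_model, "model", text_model)
    return inner.layers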
