mistral3 #1561

Open
ewof opened this issue Apr 27, 2025 · 8 comments · May be fixed by #1563

Comments

ewof commented Apr 27, 2025

Can someone reply here when/if mistral3 support is added? (Not sure how ISTA-DASLab/Mistral-Small-3.1-24B-Instruct-2503-GPTQ-4b-128g was made.)

ewof commented Apr 27, 2025

In the meantime, if anyone else is looking for a GPTQ Mistral Small, jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g worked for me.
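
For anyone trying that checkpoint, here is a minimal loading sketch (assuming the GPTQModel.load entry point shown in the tracebacks below; the tokenizer handling and prompt are just placeholders):

from transformers import AutoTokenizer
from gptqmodel import GPTQModel

model_id = "jeffcookio/Mistral-Small-3.1-24B-Instruct-2503-HF-gptqmodel-4b-128g"

# Load the already-quantized checkpoint; no quantize config is needed here
# because the GPTQ metadata ships with the repo.
model = GPTQModel.load(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))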

Qubitium (Collaborator) commented

@ewof What error do you get with Mistral3?

ewof commented Apr 29, 2025

INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Traceback (most recent call last):
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
    main()
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
    model = GPTQModel.load(pretrained_model_id, quantize_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 261, in load
    return cls.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 289, in from_pretrained
    model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/gptqmodel/models/auto.py", line 198, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: mistral3 isn't supported yet.
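
The failure is the model-type registry check visible in the traceback: config.model_type is looked up in GPTQModel's supported-model map. A simplified sketch based on the frames above (placeholder map entries, not the library's exact code):

from transformers import AutoConfig

# Placeholder entries for illustration; the real map lives in gptqmodel/models/auto.py.
MODEL_MAP = {"llama": "LlamaGPTQ", "mistral": "MistralGPTQ"}

def check_and_get_model_type(model_id_or_path, trust_remote_code=False):
    config = AutoConfig.from_pretrained(model_id_or_path, trust_remote_code=trust_remote_code)
    if config.model_type not in MODEL_MAP:  # "mistral3" was missing here before #1563
        raise TypeError(f"{config.model_type} isn't supported yet.")
    return config.model_type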

Qubitium linked a pull request Apr 29, 2025 that will close this issue
Qubitium (Collaborator) commented

@wemoveon2 Please check out PR/branch #1563 and recompile gptqmodel using

git clone https://github.com/ModelCloud/GPTQModel
cd GPTQModel
git checkout Qubitium-patch-1
pip install -e . --no-build-isolation -v

and check if mistral3 is fixed.
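
A quick way to confirm the branch install before rerunning the example (illustrative only: the version attribute, the QuantizeConfig import, and the base model id are my assumptions, not part of this thread):

import gptqmodel
print(gptqmodel.__version__)  # should now report the locally built checkout

from gptqmodel import GPTQModel, QuantizeConfig

# Before the branch this raised "mistral3 isn't supported yet."; on the patched
# branch the model type should at least be recognized by the loader.
model = GPTQModel.load(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    QuantizeConfig(bits=4, group_size=128),
)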

ewof commented Apr 29, 2025

INFO  ENV: Auto setting PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' for memory saving.
INFO  ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
WARNING:datasets.load:Using the latest cached version of the dataset since wikitext couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
WARNING:datasets.packaged_modules.cache.cache:Found the latest cached dataset configuration 'wikitext-2-raw-v1' at /home/ubuntu/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/0.0.0/b08601e04326c79dfdd32d625aee71d232d685c3 (last modified on Sun Apr 27 03:53:56 2025).
INFO  Estimated Quantization BPW (bits per weight): 4.2875 bpw, based on [bits: 4, group_size: 128]
INFO  Loader: Auto dtype (native bfloat16): `torch.bfloat16`
Traceback (most recent call last):
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 93, in <module>
    main()
  File "/home/ubuntu/GPTQModel/examples/quantization/basic_usage_wikitext2.py", line 65, in main
    model = GPTQModel.load(pretrained_model_id, quantize_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 262, in load
    return cls.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/auto.py", line 291, in from_pretrained
    return MODEL_MAP[model_type].from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/gptqmodel/models/loader.py", line 190, in from_pretrained
    model = cls.loader.from_pretrained(model_local_path, config=config, **model_init_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/GPTQModel/venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 574, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV3Config, DiffLlamaConfig, ElectraConfig, Emu3Config, ErnieConfig, FalconConfig, FalconMambaConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, GitConfig, GlmConfig, Glm4Config, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeSharedConfig, HeliumConfig, JambaConfig, JetMoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MllamaConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, ZambaConfig, Zamba2Config.

I'm on transformers 4.51.3.
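
For context on that ValueError: Mistral3Config simply isn't in AutoModelForCausalLM's mapping. As a point of comparison only (plain transformers usage, not the fix in #1563; the base model id is an assumption), the multimodal auto class does recognize it:

from transformers import AutoModelForImageTextToText

# AutoModelForCausalLM (the loader used in the traceback) only covers plain text
# decoders; recent transformers routes mistral3 through the image-text-to-text class.
model = AutoModelForImageTextToText.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    torch_dtype="bfloat16",
)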

Qubitium (Collaborator) commented

@ewof Is Mistral3 a visual (hybrid) model that takes both text and visual input?

ewof commented Apr 29, 2025

Yeah, and vision_config.model_type is pixtral in the model's config.json.
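
The relevant fields can be checked without downloading the weights (a sketch; the model id and the text_config value are assumptions, only model_type and vision_config.model_type are confirmed in this thread):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mistral-Small-3.1-24B-Instruct-2503")
print(cfg.model_type)                # "mistral3" -- the hybrid wrapper
print(cfg.vision_config.model_type)  # "pixtral"  -- the vision tower
print(cfg.text_config.model_type)    # assumed "mistral" -- the text decoder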

Qubitium (Collaborator) commented

@ewof Ugh... hybrid models need manual quantization support, since different hybrid models use standard and non-standard ways of defining how the secondary model (multiple models inside one model config) is declared.

We will try to tackle this with manual Mistral3 support first, then create generic code so that all future multi-modal models can work without too much integration effort. Right now, hybrid models are a pain since no one has agreed yet on how the modeling code (preprocessing) and the forwarding hand-offs should work internally. Wild wild west.
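
To illustrate the kind of per-model plumbing this implies (a sketch only; the language_model/vision_tower names are common transformers conventions, not GPTQModel internals or the plan for #1563):

def find_quantizable_decoder_layers(model):
    # A GPTQ-style quantizer needs the repeating text decoder blocks. In a plain
    # text model they usually live at model.model.layers, but a hybrid model wraps
    # the text stack (e.g. model.language_model) alongside a vision tower and
    # projector that are normally left unquantized, and each architecture nests
    # these submodules slightly differently -- hence the manual support.
    text_model = getattr(model, "language_model", model)
    inner = getattr(text_model, "model", text_model)
    return inner.layers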
