Closed
Description
System Info
- transformers version: 4.53.1
- Platform: Linux-5.10.192-183.736.amzn2.x86_64-x86_64-with-glibc2.31
- Python version: 3.11.13
- Huggingface_hub version: 0.33.2
- Safetensors version: 0.5.3
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.6.0+cu124 (CUDA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: No
- Using GPU in script?: Yes
- GPU type: NVIDIA H100 80GB HBM3
Who can help?
Reproduction
With vLLM <= 0.8.5.post1, upgrading transformers to 4.53.0 or above causes AttributeError: 'Gemma3TextConfig' object has no attribute 'sliding_window_pattern', likely because of the changes to Gemma 3 in PR #37866.
pip install transformers==4.53.1 # latest version; any version >= 4.53.0 breaks
pip install vllm==0.8.4
from vllm import LLM
llm = LLM(model="google/gemma-3-12b-it")
Error stacktrace
ERROR 07-08 22:51:23 [core.py:396] Traceback (most recent call last):
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 387, in run_engine_core
ERROR 07-08 22:51:23 [core.py:396] engine_core = EngineCoreProc(*args, **kwargs)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 329, in __init__
ERROR 07-08 22:51:23 [core.py:396] super().__init__(vllm_config, executor_class, log_stats,
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/v1/engine/core.py", line 64, in __init__
ERROR 07-08 22:51:23 [core.py:396] self.model_executor = executor_class(vllm_config)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 07-08 22:51:23 [core.py:396] self._init_executor()
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
ERROR 07-08 22:51:23 [core.py:396] self.collective_rpc("load_model")
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 07-08 22:51:23 [core.py:396] answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/utils.py", line 2456, in run_method
ERROR 07-08 22:51:23 [core.py:396] return func(*args, **kwargs)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/v1/worker/gpu_worker.py", line 162, in load_model
ERROR 07-08 22:51:23 [core.py:396] self.model_runner.load_model()
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1332, in load_model
ERROR 07-08 22:51:23 [core.py:396] self.model = get_model(vllm_config=self.vllm_config)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 07-08 22:51:23 [core.py:396] return loader.load_model(vllm_config=vllm_config)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 452, in load_model
ERROR 07-08 22:51:23 [core.py:396] model = _initialize_model(vllm_config=vllm_config)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
ERROR 07-08 22:51:23 [core.py:396] return model_class(vllm_config=vllm_config, prefix=prefix)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3_mm.py", line 490, in __init__
ERROR 07-08 22:51:23 [core.py:396] self.language_model = init_vllm_registered_model(
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 286, in init_vllm_registered_model
ERROR 07-08 22:51:23 [core.py:396] return _initialize_model(vllm_config=vllm_config, prefix=prefix)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
ERROR 07-08 22:51:23 [core.py:396] return model_class(vllm_config=vllm_config, prefix=prefix)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3.py", line 493, in __init__
ERROR 07-08 22:51:23 [core.py:396] self.model = Gemma3Model(vllm_config=vllm_config,
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 151, in __init__
ERROR 07-08 22:51:23 [core.py:396] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3.py", line 360, in __init__
ERROR 07-08 22:51:23 [core.py:396] self.start_layer, self.end_layer, self.layers = make_layers(
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 609, in make_layers
ERROR 07-08 22:51:23 [core.py:396] [PPMissingLayer() for _ in range(start_layer)] + [
ERROR 07-08 22:51:23 [core.py:396] ^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/utils.py", line 610, in <listcomp>
ERROR 07-08 22:51:23 [core.py:396] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3.py", line 362, in <lambda>
ERROR 07-08 22:51:23 [core.py:396] lambda prefix: Gemma3DecoderLayer(
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3.py", line 288, in __init__
ERROR 07-08 22:51:23 [core.py:396] self.self_attn = Gemma3Attention(
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/vllm/model_executor/models/gemma3.py", line 151, in __init__
ERROR 07-08 22:51:23 [core.py:396] (layer_idx + 1) % config.sliding_window_pattern))
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] File "/root/miniconda3/envs/transformers-issue/lib/python3.11/site-packages/transformers/configuration_utils.py", line 209, in __getattribute__
ERROR 07-08 22:51:23 [core.py:396] return super().__getattribute__(key)
ERROR 07-08 22:51:23 [core.py:396] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-08 22:51:23 [core.py:396] AttributeError: 'Gemma3TextConfig' object has no attribute 'sliding_window_pattern'
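The failing frame reads `config.sliding_window_pattern` to decide which layers use sliding-window attention. As a rough illustration of how a caller could tolerate the missing attribute, here is a minimal sketch. It rests on two assumptions that are not confirmed in this report: that newer configs expose a `layer_types` list with entries such as "sliding_attention" and "full_attention", and that 6 is a sensible fallback for Gemma 3. The helper name `infer_sliding_window_pattern` is hypothetical, and this is not the fix adopted by either project.

```python
from types import SimpleNamespace


def infer_sliding_window_pattern(config, default=6):
    """Return the sliding window pattern, deriving it when the attribute is gone.

    Assumption: configs without `sliding_window_pattern` carry a `layer_types`
    list whose entries include "full_attention" for global-attention layers.
    The default of 6 is also an assumption, not taken from the report.
    """
    try:
        return config.sliding_window_pattern
    except AttributeError:
        pass
    layer_types = getattr(config, "layer_types", None)
    if layer_types and "full_attention" in layer_types:
        # The 1-based position of the first global layer equals the period N,
        # so that (layer_idx + 1) % N == 0 holds exactly on global layers.
        return layer_types.index("full_attention") + 1
    return default


# Stand-in objects, not real Gemma3TextConfig instances.
old_style = SimpleNamespace(sliding_window_pattern=6)
new_style = SimpleNamespace(layer_types=["sliding_attention"] * 5 + ["full_attention"])
print(infer_sliding_window_pattern(old_style), infer_sliding_window_pattern(new_style))
```

Deriving the value from the first "full_attention" entry only makes sense if the layer layout is strictly periodic, which is what the `(layer_idx + 1) % config.sliding_window_pattern` expression in the trace implies.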
Newer versions of vLLM also show quality issues after upgrading to transformers>=4.53.0; these are reported in vllm-project/vllm#20341.
Expected behavior
The behavior should match transformers 4.52.4 + vLLM 0.8.4:
from vllm import LLM
llm = LLM(model="google/gemma-3-12b-it")
print(llm.generate("what is transformers")[0].outputs[0])
CompletionOutput(index=0, text='?>\n\nTransformers are a powerful type of neural network architecture that has revolutionized the', token_ids=[255999, 13765, 108, 214568, 659, 496, 8632, 1722, 529, 22823, 3707, 13217, 600, 815, 176839, 506], cumulative_logprob=None, logprobs=None, finish_reason=length, stop_reason=None)
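To check whether an environment falls into the combination described above (transformers >= 4.53.0 together with vLLM <= 0.8.5.post1), something like the following could be run before loading the model. It is only a sketch: the helper names are hypothetical, and the version parsing is deliberately naive (it keeps the leading numeric components and ignores suffixes such as `.post1`).

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(v):
    """Parse '4.53.1' or '0.8.5.post1' into a tuple of its leading integers."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or '0rc1'
        parts.append(int(piece))
    return tuple(parts)


def is_affected(transformers_version, vllm_version):
    """True for the version ranges named in this report."""
    return (release_tuple(transformers_version) >= (4, 53)
            and release_tuple(vllm_version) <= (0, 8, 5))


def installed_combo_is_affected():
    """Check the installed packages; return None if either one is missing."""
    try:
        return is_affected(version("transformers"), version("vllm"))
    except PackageNotFoundError:
        return None


print(is_affected("4.53.1", "0.8.4"))  # the failing combination from the reproduction
print(is_affected("4.52.4", "0.8.4"))  # the combination reported as working
```

Because the suffix is dropped, any `0.8.5.postN` build is treated like `0.8.5.post1`; a stricter check would need a full version parser.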
zucchini-nlp