Could not find an attention_chunk_size argument in the config, or it is not set [Reasoning with transformers] #119

@Captain-Named

Description

System Info
transformers==4.53.3

🐛 Describe the bug
The attention_chunk_size parameter in config of Llama-Guard-4-12B is null. The transformers must require that parameter causing a reasoning failure. How can I avoid it and make it work via transformers library?

Error logs
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/ydongbl/d/general/inference.py", line 121, in <module>
[rank0]: main()
[rank0]: ~~~~^^
[rank0]: File "/home/ydongbl/d/general/inference.py", line 72, in main
[rank0]: outputs = model.generate(
[rank0]: input_ids=batch['input_ids'],
[rank0]: ...<10 lines>...
[rank0]: use_cache=True
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 2644, in generate
[rank0]: result = self._beam_search(
[rank0]: input_ids,
[rank0]: ...<4 lines>...
[rank0]: **model_kwargs,
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 4073, in _beam_search
[rank0]: model_inputs = self.prepare_inputs_for_generation(flat_running_sequences, **model_kwargs)
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 662, in prepare_inputs_for_generation
[rank0]: attention_mask = causal_mask_creation_function(
[rank0]: config=self.config,
[rank0]: ...<6 lines>...
[rank0]: token_type_ids=token_type_ids,
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/masking_utils.py", line 1058, in create_masks_for_generate
[rank0]: causal_masks[layer_pattern] = LAYER_PATTERN_TO_MASK_FUNCTION_MAPPING[layer_pattern](**mask_kwargs)
[rank0]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/masking_utils.py", line 946, in create_chunked_causal_mask
[rank0]: raise ValueError("Could not find an attention_chunk_size argument in the config, or it is not set")
[rank0]: ValueError: Could not find an attention_chunk_size argument in the config, or it is not set

Expected behavior
May I know how I can avoid this error and make generation work via the transformers library?
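
One workaround I am considering is sketched below. It is untested and based on assumptions: masking_utils seems to need attention_chunk_size only because the layer pattern still requests chunked attention, and 8192 is just the default that other Llama 4 text configs use, which may not be appropriate for Llama-Guard-4-12B.

```python
# Untested workaround sketch (assumption, not a confirmed fix): fill in the missing
# attention_chunk_size on the text config before generating, so that
# create_chunked_causal_mask no longer raises.
text_config = model.config.get_text_config()
if getattr(text_config, "attention_chunk_size", None) is None:
    text_config.attention_chunk_size = 8192  # assumed value, borrowed from other Llama 4 configs

outputs = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    max_new_tokens=64,
    num_beams=4,
    use_cache=True,
)
```

I am not sure whether overriding the value like this matches how the checkpoint was trained, so any guidance on the intended fix would be appreciated.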
