Could not find an attention_chunk_size argument in the config, or it is not set [Reasoning with transformers] #119

@Captain-Named

Description

System Info
transformers==4.53.3

🐛 Describe the bug
The attention_chunk_size parameter in config of Llama-Guard-4-12B is null. The transformers must require that parameter causing a reasoning failure. How can I avoid it and make it work via transformers library?

Error logs
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/ydongbl/d/general/inference.py", line 121, in <module>
[rank0]: main()
[rank0]: ~~~~^^
[rank0]: File "/home/ydongbl/d/general/inference.py", line 72, in main
[rank0]: outputs = model.generate(
[rank0]: input_ids=batch['input_ids'],
[rank0]: ...<10 lines>...
[rank0]: use_cache=True
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 2644, in generate
[rank0]: result = self._beam_search(
[rank0]: input_ids,
[rank0]: ...<4 lines>...
[rank0]: **model_kwargs,
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 4073, in _beam_search
[rank0]: model_inputs = self.prepare_inputs_for_generation(flat_running_sequences, **model_kwargs)
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/generation/utils.py", line 662, in prepare_inputs_for_generation
[rank0]: attention_mask = causal_mask_creation_function(
[rank0]: config=self.config,
[rank0]: ...<6 lines>...
[rank0]: token_type_ids=token_type_ids,
[rank0]: )
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/masking_utils.py", line 1058, in create_masks_for_generate
[rank0]: causal_masks[layer_pattern] = LAYER_PATTERN_TO_MASK_FUNCTION_MAPPING[layer_pattern](**mask_kwargs)
[rank0]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
[rank0]: File "/home/ydongbl/.conda/envs/alignment/lib/python3.13/site-packages/transformers/masking_utils.py", line 946, in create_chunked_causal_mask
[rank0]: raise ValueError("Could not find an attention_chunk_size argument in the config, or it is not set")
[rank0]: ValueError: Could not find an attention_chunk_size argument in the config, or it is not set

Expected behavior
May I know how I can avoid this error and make generation work via the transformers library?
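
One workaround I am considering is sketched below. It is untested and based on assumptions: masking_utils seems to need attention_chunk_size only because the layer pattern still requests chunked attention, and 8192 is just the default that other Llama 4 text configs use, which may not be appropriate for Llama-Guard-4-12B.

```python
# Untested workaround sketch (assumption, not a confirmed fix): fill in the missing
# attention_chunk_size on the text config before generating, so that
# create_chunked_causal_mask no longer raises.
text_config = model.config.get_text_config()
if getattr(text_config, "attention_chunk_size", None) is None:
    text_config.attention_chunk_size = 8192  # assumed value, borrowed from other Llama 4 configs

outputs = model.generate(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    max_new_tokens=64,
    num_beams=4,
    use_cache=True,
)
```

I am not sure whether overriding the value like this matches how the checkpoint was trained, so any guidance on the intended fix would be appreciated.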
