Conversation

@Cyrilvallez Cyrilvallez commented Feb 24, 2025

What does this PR do?

As per the title. There is a weird issue when running the following:

import torch
from transformers import AutoModelForCausalLM, LlamaConfig

config = LlamaConfig()
config.num_hidden_layers = 2
model = AutoModelForCausalLM.from_config(config).to(0)


model.compile(fullgraph=True)

with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 200), device=0)
    out = model(input_ids)

# Change shape of input
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids)

It fails with

TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not SymBool
...
RuntimeError: Failed running call_function <built-in function scaled_dot_product_attention>(*(FakeTensor(..., device='cuda:0', size=(1, 32, s0, 128)), FakeTensor(..., device='cuda:0', size=(1, 32, s0, 128)), FakeTensor(..., device='cuda:0', size=(1, 32, s0, 128))), **{'attn_mask': None, 'dropout_p': 0.0, 'scale': 0.08838834764831845, 'is_causal': s0 > 1}):
scaled_dot_product_attention(): argument 'is_causal' must be bool, not SymBool

i.e. it somehow traces the is_causal assignment as a SymBool instead of evaluating it to a concrete bool.
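For illustration, here is a minimal standalone sketch of the pattern involved (not the transformers code; the function name and tensor shapes are made up, and whether it reproduces the exact failure depends on the PyTorch version):

import torch
import torch.nn.functional as F

# Sketch only: under dynamic shapes, `q.shape[2] > 1` is a SymBool, and
# short-circuiting it with the mask check can leave `is_causal` symbolic
# instead of a concrete Python bool when it reaches SDPA.
@torch.compile(fullgraph=True, dynamic=True)
def sdpa(q, k, v, mask=None):
    is_causal = mask is None and q.shape[2] > 1
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask, is_causal=is_causal)

q = k = v = torch.randn(1, 32, 200, 128, device=0)
out = sdpa(q, k, v)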

If we pass a mask every time, it works; however, as soon as we remove it (which should be logically equivalent), it fails again:

import torch
from transformers import AutoModelForCausalLM, LlamaConfig

config = LlamaConfig()
config.num_hidden_layers = 2
model = AutoModelForCausalLM.from_config(config).to(0)


model.compile(fullgraph=True)

with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 200), device=0)
    out = model(input_ids, attention_mask=torch.ones_like(input_ids))

# Change shape of input but keep a mask
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids, attention_mask=torch.ones_like(input_ids))

# Change shape of input but remove mask -> it fails here
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids)

Simply switching the order of the check in sdpa_attention fixes the issue, and ensures that it works regardless of whether a mask is passed. Not entirely sure why, but it looks like dynamo somehow does not handle the shape check correctly if it is done afterwards.
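For context, the relevant change amounts to reordering the short-circuit condition in sdpa_attention roughly as follows (an approximation of the diff, not the exact code; the mask variable name may differ):

# Before (approximate): the symbolic shape comparison comes last, and when no
# mask is passed dynamo traces the result as a SymBool.
is_causal = attention_mask is None and query.shape[2] > 1

# After (approximate): the shape check comes first; as noted above, it is not
# entirely clear why, but dynamo then resolves `is_causal` to a plain bool.
is_causal = query.shape[2] > 1 and attention_mask is None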

With the current fix, both of the following scenarios work correctly:

Removing the mask after changing the shape:

import torch
from transformers import AutoModelForCausalLM, LlamaConfig

config = LlamaConfig()
config.num_hidden_layers = 2
model = AutoModelForCausalLM.from_config(config).to(0)


model.compile(fullgraph=True)

with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 200), device=0)
    out = model(input_ids, attention_mask=torch.ones_like(input_ids))

with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids, attention_mask=torch.ones_like(input_ids))

# Stop passing the mask
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids)

Adding a mask after changing the shape:

import torch
from transformers import AutoModelForCausalLM, LlamaConfig

config = LlamaConfig()
config.num_hidden_layers = 2
model = AutoModelForCausalLM.from_config(config).to(0)


model.compile(fullgraph=True)

with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 200), device=0)
    out = model(input_ids)

# Change shape of input
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids)

# Add the mask to the inputs
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 100), device=0)
    out = model(input_ids, attention_mask=torch.ones_like(input_ids))

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker ArthurZucker (Collaborator) left a comment

Thanks 🤗

@Cyrilvallez Cyrilvallez merged commit 401543a into main Feb 25, 2025
24 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-compile branch February 25, 2025 09:44