Problem
In sam3/model/decoder.py, the else branch (non-FA3 path) of the attention computation uses:
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = torchF.scaled_dot_product_attention(q, k, v, dropout_p=dropout)
sdpa_kernel is exclusive: it disables every backend that is not in the list passed to it. This call therefore disables EFFICIENT_ATTENTION and MATH entirely, leaving Flash Attention as the only option (a minimal repro sketch follows the warnings below).
On Windows, PyTorch doesn't ship with the Flash Attention backend compiled in. This means zero backends are available and you get:
RuntimeError: No available kernel. Aborting execution.
with warnings like:
UserWarning: No available kernel. Aborting execution.
FlashAttention is not supported (Flash Attention was not compiled for this system)
mem_efficient_sdp_enabled was set to False
math_sdp_enabled was set to False
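For context, here is a minimal standalone sketch of the failure mode. This is a hypothetical script, not SAM3 code; the tensor shapes and the torchF alias are assumptions for illustration only:

# Hypothetical repro, not SAM3 code: restrict SDPA to Flash Attention only.
import torch
import torch.nn.functional as torchF
from torch.nn.attention import sdpa_kernel, SDPBackend

# Arbitrary (batch, heads, seq, head_dim) tensors; fp16 on CUDA so Flash
# Attention would normally be eligible.
q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# sdpa_kernel is exclusive: every backend outside the argument is disabled.
# On a build without Flash Attention kernels nothing is left to dispatch to,
# and this raises "RuntimeError: No available kernel. Aborting execution."
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = torchF.scaled_dot_product_attention(q, k, v, dropout_p=0.0)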
Other files already handle this correctly
vl_combiner.py allows all three backends:
sdpa_context = sdpa_kernel(
    [
        SDPBackend.MATH,
        SDPBackend.EFFICIENT_ATTENTION,
        SDPBackend.FLASH_ATTENTION,
    ]
)
And sam/transformer.py enables all three globally:
torch.backends.cuda.enable_flash_sdp(True)
torch.backends.cuda.enable_math_sdp(True)
torch.backends.cuda.enable_mem_efficient_sdp(True)
So decoder.py is the only place that forces a single backend exclusively.
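For anyone verifying this on their own platform, a small probe like the one below (a hypothetical helper, not part of SAM3) reports which SDPA backends a given PyTorch build can actually serve:

# Hypothetical probe, not SAM3 code: try each SDPA backend in isolation
# and report whether it can run on this build / GPU.
import torch
import torch.nn.functional as torchF
from torch.nn.attention import sdpa_kernel, SDPBackend

q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

for backend in (SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH):
    try:
        with sdpa_kernel(backend):
            torchF.scaled_dot_product_attention(q, k, v)
        print(f"{backend}: available")
    except RuntimeError as exc:
        print(f"{backend}: unavailable ({exc})")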
Fix
Change the sdpa_kernel call in decoder.py to allow fallback backends, same as vl_combiner.py:
with sdpa_kernel([SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH]):
    out = torchF.scaled_dot_product_attention(q, k, v, dropout_p=dropout)
This way Flash Attention is still preferred when it is available, but PyTorch can fall back to memory-efficient attention or the math backend on platforms that don't have it.
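As a quick sanity check (a hypothetical standalone snippet, not SAM3 code), the same call goes through with the fallback list even when Flash Attention is missing, because the MATH backend is implemented everywhere:

# Hypothetical check, not SAM3 code: with fallbacks allowed, PyTorch
# dispatches to the best backend this build actually has.
import torch
import torch.nn.functional as torchF
from torch.nn.attention import sdpa_kernel, SDPBackend

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
q = k = v = torch.randn(1, 8, 128, 64, device=device, dtype=dtype)

with sdpa_kernel([SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION, SDPBackend.MATH]):
    out = torchF.scaled_dot_product_attention(q, k, v, dropout_p=0.0)

print(out.shape)  # torch.Size([1, 8, 128, 64])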
Environment
- Windows 11, CUDA 12.8
- PyTorch 2.7.0 (cu128)
- SAM3.1 multiplex video predictor with use_fa3=False