@qubvel qubvel commented Mar 4, 2025

What does this PR do?

Updates how the attention implementation is chosen. Instead of defining a separate class for each implementation, we use a functional approach and switch the attention implementation on the fly via the config._attn_implementation param.
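A minimal sketch of the functional approach described above, in pure Python so it runs without any dependencies. It is not the code from this PR: `ToyConfig`, `ToySelfAttention`, `ATTENTION_FUNCTIONS`, and both attention functions are hypothetical names made up for illustration, and the "sdpa" entry is only a stand-in that computes the same math with a max-subtracted softmax rather than calling a fused kernel.

```python
import math
from typing import Callable, Dict, List

Matrix = List[List[float]]  # rows of floats, shape (seq_len, dim)


def _scaled_scores(q: Matrix, k: Matrix, scaling: float) -> Matrix:
    """Return (q @ k^T) * scaling."""
    return [[scaling * sum(x * y for x, y in zip(rq, rk)) for rk in k] for rq in q]


def _weighted_sum(probs: Matrix, v: Matrix) -> Matrix:
    """Return probs @ v."""
    dim = len(v[0])
    return [[sum(p * row[d] for p, row in zip(pr, v)) for d in range(dim)] for pr in probs]


def eager_attention(q: Matrix, k: Matrix, v: Matrix, scaling: float) -> Matrix:
    """Naive softmax(q k^T * scaling) v."""
    probs = []
    for row in _scaled_scores(q, k, scaling):
        exps = [math.exp(s) for s in row]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return _weighted_sum(probs, v)


def stand_in_sdpa_attention(q: Matrix, k: Matrix, v: Matrix, scaling: float) -> Matrix:
    """Hypothetical stand-in for an optimized kernel: same result, stable softmax."""
    probs = []
    for row in _scaled_scores(q, k, scaling):
        top = max(row)
        exps = [math.exp(s - top) for s in row]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return _weighted_sum(probs, v)


# Registry keyed by implementation name (assumed layout, for illustration only).
ATTENTION_FUNCTIONS: Dict[str, Callable[[Matrix, Matrix, Matrix, float], Matrix]] = {
    "eager": eager_attention,
    "sdpa": stand_in_sdpa_attention,
}


class ToyConfig:
    def __init__(self, attn_implementation: str = "eager") -> None:
        self._attn_implementation = attn_implementation


class ToySelfAttention:
    """One attention class; the implementation is looked up on every call."""

    def __init__(self, config: ToyConfig, head_dim: int) -> None:
        self.config = config
        self.scaling = head_dim ** -0.5

    def forward(self, q: Matrix, k: Matrix, v: Matrix) -> Matrix:
        attention_fn = ATTENTION_FUNCTIONS[self.config._attn_implementation]
        return attention_fn(q, k, v, self.scaling)


if __name__ == "__main__":
    q = k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    config = ToyConfig()
    layer = ToySelfAttention(config, head_dim=2)
    eager_out = layer.forward(q, k, v)
    config._attn_implementation = "sdpa"  # switch on the fly, same layer object
    print(eager_out, layer.forward(q, k, v))
```

Because the lookup happens inside `forward`, changing the config value after the layer is built takes effect on the next call, which is the property that removes the need for one attention class per implementation.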

The following models will have SDPA and FA2 (FlashAttention-2) support:

  • vit
  • audio_spectrogram_transformer
  • deit
  • dinov2
  • dinov2_with_registers
  • dpt
  • ijepa
  • videomae
  • vit_mae
  • vit_msn
  • vitpose_backbone
  • vivit
  • yolos

It also affects the following models:

  • depth_anything (uses the dinov2 backbone)
  • zoedepth (uses the dinov2 backbone)

Fixes:

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qubvel commented Mar 5, 2025

run-slow: vit, audio_spectrogram_transformer, deit, dinov2, dinov2_with_registers, dpt, ijepa, videomae, vit_mae, vit_msn, vitpose_backbone, vivit, yolos

github-actions bot commented Mar 5, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/audio_spectrogram_transformer', 'models/deit', 'models/dinov2', 'models/dinov2_with_registers', 'models/dpt', 'models/ijepa', 'models/videomae', 'models/vit', 'models/vit_mae', 'models/vit_msn', 'models/vitpose_backbone', 'models/vivit', 'models/yolos']
quantizations: [] ...

@qubvel qubvel marked this pull request as ready for review March 12, 2025 15:33
qubvel commented Mar 12, 2025

run-slow: vit, audio_spectrogram_transformer, deit, dinov2, dinov2_with_registers, dpt, ijepa, videomae, vit_mae, vit_msn, vitpose_backbone, vivit, yolos

@github-actions

This comment contains run-slow, running the specified jobs:

models: ['models/audio_spectrogram_transformer', 'models/deit', 'models/dinov2', 'models/dinov2_with_registers', 'models/dpt', 'models/ijepa', 'models/videomae', 'models/vit', 'models/vit_mae', 'models/vit_msn', 'models/vitpose_backbone', 'models/vivit', 'models/yolos']
quantizations: [] ...

qubvel commented Mar 14, 2025

cc @Cyrilvallez for review if you have bandwidth 🤗

@qubvel qubvel requested a review from Cyrilvallez March 14, 2025 13:26
@ArthurZucker ArthurZucker left a comment

🧼 clean and perfect! Thanks a lot for working on this, quite tedious!

@qubvel qubvel merged commit 6629177 into huggingface:main Mar 20, 2025
22 of 23 checks passed
@sbucaille sbucaille mentioned this pull request Mar 22, 2025
5 tasks
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
…6545)

* Refactor vit attention

* Refactor ViT-based models

* 🚨🚨🚨 Fix prefix for DPT

* Update params order

* trigger tests

* Fix Dinov2 attention

* Fix DPT attention impl propagation for backbone config

* Common test fix: config is modif. inplace - avoid it

* view->reshape

* Fixup

* Fixup

* Enable IJepa FA2

* Add FA2 in corresponding model docs

3 participants