Fix tests for vision models #35654
Conversation
run-slow: beit, detr, dinov2, vit, textnet
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @qubvel, I'm not sure what this draft PR intends to do, but it might be relevant to PR #35138. That PR fixes an incompatibility of FlaxDinov2 with batch sizes greater than 1. The error could not be detected by the Flax tests (same as the PyTorch tests) when I first contributed this model: all the slow tests in transformers simply pass a single image with a batch size of 1, which is why such batch-size incompatibilities can go undetected. As a result, I changed the image batch size to 2 for the Flax Dinov2 slow tests (in that PR, not yet merged). Doing the same for all future model slow tests would probably greatly assist the development process. Also, may I request a review on PR #35138 so that FlaxDinov2 can be used properly?
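For illustration, a batched slow test along these lines could look as follows (a minimal sketch; the checkpoint name and the use of PyTorch rather than Flax are assumptions for the example, not taken from either PR):

# Sketch: run an integration test with batch size 2 instead of 1 so that
# batch-dimension bugs surface. The checkpoint is a placeholder choice.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Dinov2Model

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
model = Dinov2Model.from_pretrained("facebook/dinov2-base")

# Two copies of the same image -> batch size 2.
inputs = processor(images=[image, image], return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

assert outputs.last_hidden_state.shape[0] == 2  # batch dimension preserved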
run-slow: beit, detr, dinov2, vit, textnet

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...

run-slow: beit, detr, dinov2, vit, textnet

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...
Force-pushed from eb2a32c to 6a703ee
run-slow: beit, detr, dinov2, vit, textnet

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...

run-slow: beit, data2vec, dpt

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/dpt'] ...

run-slow: detr

This comment contains run-slow, running the specified jobs: ['models/detr'] ...

run-slow: beit, detr, dinov2, vit, textnet, data2vec, dpt

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit'] ...

run-slow: beit, detr, dinov2, vit, textnet, data2vec, dpt

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit'] ...

run-slow: beit, data2vec, dpt, zoedepth

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/dpt', 'models/zoedepth'] ...
# with interpolate_pos_encoding being False an exception should be raised with higher resolution
# images than what the model supports.
self.assertFalse(processor.do_center_crop)
with torch.no_grad():
    with self.assertRaises(ValueError, msg="doesn't match model"):
        model(pixel_values, interpolate_pos_encoding=False)
We always interpolate; error raising was removed for ZoeDepth in
https://github.com/huggingface/transformers/pull/30136/files#diff-3f84bebd6be8d9c0f5c5068199f5c49eac8489d5fa466fb6fa08b0365e78dba4
which is why we are removing it from the tests as well.
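In other words, after that change a higher-resolution input no longer raises; the position encodings are simply interpolated. A minimal sketch of the new behavior (the checkpoint name and input size are assumptions for illustration, not from the test suite):

import torch
from transformers import ZoeDepthForDepthEstimation

# Placeholder checkpoint choice for the example.
model = ZoeDepthForDepthEstimation.from_pretrained("Intel/zoedepth-nyu")

# Larger than the resolution the model was trained at: previously this
# raised a ValueError, now the position encodings are interpolated.
pixel_values = torch.randn(1, 3, 768, 1024)
with torch.no_grad():
    outputs = model(pixel_values)

print(outputs.predicted_depth.shape)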
if self.position_embeddings is not None:
    if interpolate_pos_encoding:
        cls_tokens = cls_tokens + self.interpolate_pos_encoding(embeddings, height, width)
This is actually a bug, but we probably never reach it because self.position_embeddings is None
Or is it more likely that no one ever uses interpolate_pos_encoding=True?
If it is True, would the call fail at this point, or would it just compute a different value of cls_tokens?
BTW, I see
if config.use_absolute_position_embeddings:
    self.position_embeddings = nn.Parameter(torch.zeros(1, num_patches + 1, config.hidden_size))
else:
    self.position_embeddings = None
So it's possible self.position_embeddings is not None if config.use_absolute_position_embeddings is True?
Yeah, I checked some popular models on the Hub; all have use_absolute_position_embeddings: false (didn't do extensive testing tbh). So it's likely a combination of factors: no one uses use_absolute_position_embeddings=True (it would have to be a new model) together with interpolate_pos_encoding=True.
It's a bug introduced when the interpolate_pos_encoding flag was added: it should be embeddings = embeddings + ..., not cls_tokens. But even in that case we would have double interpolation, once here and once in BeitPatchEmbeddings, so I just cleaned this up.
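To make the described fix concrete, a simplified sketch of the corrected branch (this mirrors the explanation above, not the exact diff in this PR):

# Simplified sketch, not the exact diff: the position embeddings are added
# to the full embeddings tensor rather than to cls_tokens, and interpolation
# happens in exactly one place so nothing is interpolated twice.
if self.position_embeddings is not None:
    if interpolate_pos_encoding:
        position_embeddings = self.interpolate_pos_encoding(embeddings, height, width)
    else:
        position_embeddings = self.position_embeddings
    embeddings = embeddings + position_embeddings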
OK, indeed. Thanks!
def forward(
    self,
    pixel_values: torch.Tensor,
    position_embedding: Optional[torch.Tensor] = None,
) -> torch.Tensor:
For consistency with other models, position_embedding was removed from BeitPatchEmbeddings and is now applied in the BeitEmbeddings module. This is a breaking change, but I suppose the BeitPatchEmbeddings module is only ever used as part of BeitEmbeddings.
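Schematically, the refactor looks like this (a simplified sketch of the described structure, with __init__ details omitted; not the exact PR diff):

import torch
from torch import nn

class BeitPatchEmbeddings(nn.Module):
    # position_embedding argument removed: this module now only patchifies.
    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        return self.projection(pixel_values).flatten(2).transpose(1, 2)

class BeitEmbeddings(nn.Module):
    # Position embeddings are owned and applied here instead.
    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        embeddings = self.patch_embeddings(pixel_values)
        cls_tokens = self.cls_token.expand(embeddings.shape[0], -1, -1)
        embeddings = torch.cat((cls_tokens, embeddings), dim=1)
        if self.position_embeddings is not None:
            embeddings = embeddings + self.position_embeddings
        return self.dropout(embeddings)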
ydshieh
left a comment
Before looking at the tests, I have some questions about the modeling code changes 🙏
ydshieh
left a comment
LGTM, thanks a lot.
A nit question regarding logger.warning_once() vs. warnings.warn.
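For reference, the two options being weighed (a minimal illustration of each API, not code from this PR):

import warnings
from transformers.utils import logging

logger = logging.get_logger(__name__)

# transformers' logger: emitted at most once per process, the usual
# choice for repeated warnings inside modeling code.
logger.warning_once("interpolate_pos_encoding is always applied for this model.")

# Stdlib alternative: respects Python warning filters and reports the call site.
warnings.warn("interpolate_pos_encoding is always applied for this model.")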
run-slow: beit, data2vec, dpt, zoedepth, detr, dinov2, vit, textnet

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit', 'models/zoedepth'] ...
cc @ArthurZucker for review
ArthurZucker
left a comment
🤗 thanks for taking care of our CI's health!
* Trigger tests
* [run-slow] beit, detr, dinov2, vit, textnet
* Fix BEiT interpolate_pos_encoding
* Fix DETR test
* Update DINOv2 test
* Fix textnet
* Fix vit
* Fix DPT
* fix data2vec test
* Fix textnet test
* Update interpolation check
* Fix ZoeDepth tests
* Update interpolate embeddings for BEiT
* Apply suggestions from code review
What does this PR do?
Fixing tests for vision models
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.