
Conversation

@qubvel (Contributor) commented Jan 13, 2025

What does this PR do?

Fixing tests for vision models

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@qubvel qubvel added the Tests, Vision, and run-slow labels Jan 13, 2025
@qubvel (Contributor Author) commented Jan 13, 2025

run-slow: beit, detr, dinov2, vit, textnet

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MHRDYN7 (Contributor) commented Jan 17, 2025

Hi @qubvel, I'm not sure what this draft PR intends to do, but it might be relevant to PR #35138. That PR fixes the incompatibility of FlaxDinov2 with batch sizes greater than 1. The error could not be detected by the Flax tests (same as the PyTorch tests) when I first contributed this model: all the slow tests in transformers simply pass a single image with a batch size of 1, which is why such batch-size incompatibility errors can go undetected. As a result, I changed the image batch size to 2 for the Flax Dinov2 slow tests (in that PR, not yet merged). Doing the same for all future model slow tests would probably greatly assist the development process. Also, may I request a review of PR #35138 so that FlaxDinov2 can be used properly?
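
A hedged sketch of that batch-size check (the checkpoint and dummy images are placeholders, not the exact test from PR #35138; shown with the PyTorch model, though the same idea applies to the Flax tests):

import numpy as np
import torch
from PIL import Image
from transformers import AutoImageProcessor, Dinov2Model

# Feed a batch of 2 images so bugs tied to a hard-coded batch size of 1
# actually surface in the slow tests.
processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
model = Dinov2Model.from_pretrained("facebook/dinov2-base")

images = [Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8)) for _ in range(2)]
inputs = processor(images=images, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# The output batch dimension should follow the input batch size.
assert outputs.last_hidden_state.shape[0] == 2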

@qubvel (Contributor Author) commented Jan 22, 2025

run-slow: beit, detr, dinov2, vit, textnet

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...

@qubvel (Contributor Author) commented Jan 22, 2025

run-slow: beit, detr, dinov2, vit, textnet

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: beit, detr, dinov2, vit, textnet

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/detr', 'models/dinov2', 'models/textnet', 'models/vit'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: beit, data2vec, dpt

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/dpt'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: detr

@github-actions

This comment contains run-slow, running the specified jobs: ['models/detr'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: beit, detr, dinov2, vit, textnet, data2vec, dpt

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: beit, detr, dinov2, vit, textnet, data2vec, dpt

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit'] ...

@qubvel (Contributor Author) commented Jan 28, 2025

run-slow: beit, data2vec, dpt, zoedepth

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/dpt', 'models/zoedepth'] ...

@qubvel qubvel requested a review from ydshieh January 28, 2025 18:27
@qubvel qubvel marked this pull request as ready for review January 28, 2025 18:28
Comment on lines -767 to -773
# with interpolate_pos_encoding being False an exception should be raised with higher resolution
# images than what the model supports.
self.assertFalse(processor.do_center_crop)
with torch.no_grad():
    with self.assertRaises(ValueError, msg="doesn't match model"):
        model(pixel_values, interpolate_pos_encoding=False)

@qubvel (Contributor Author) Jan 28, 2025

We always interpolate; error raising was removed for ZoeDepth in
https://github.com/huggingface/transformers/pull/30136/files#diff-3f84bebd6be8d9c0f5c5068199f5c49eac8489d5fa466fb6fa08b0365e78dba4
so we are removing it from the tests as well.
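
For context, a minimal sketch of the behavior after that change (the checkpoint and input size are illustrative, not taken from the updated test):

import torch
from transformers import ZoeDepthForDepthEstimation

# With position embeddings always interpolated, a higher-resolution input
# should produce a depth map rather than raise the old "doesn't match model"
# ValueError.
model = ZoeDepthForDepthEstimation.from_pretrained("Intel/zoedepth-nyu")
pixel_values = torch.randn(1, 3, 512, 672)  # larger than the default resolution

with torch.no_grad():
    outputs = model(pixel_values)

print(outputs.predicted_depth.shape)  # spatial size follows the input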

Comment on lines -214 to -216
if self.position_embeddings is not None:
    if interpolate_pos_encoding:
        cls_tokens = cls_tokens + self.interpolate_pos_encoding(embeddings, height, width)
@qubvel (Contributor Author)

This is actually a bug, but we probably never reach it because self.position_embeddings is None

@ydshieh (Collaborator)

Or is it likely that no one ever uses interpolate_pos_encoding=True?

If it is True, would the call fail at this point, or would it just compute a different value of cls_tokens?

BTW, I see

        if config.use_absolute_position_embeddings:
            self.position_embeddings = nn.Parameter(torch.zeros(1, num_patches + 1, config.hidden_size))
        else:
            self.position_embeddings = None

So it's possible self.position_embeddings is not None if config.use_absolute_position_embeddings is True?

@qubvel (Contributor Author) Jan 29, 2025

Yeah, I checked some popular models on the Hub; all have use_absolute_position_embeddings: false (I didn't do extensive testing, tbh). So it's likely a combination of factors: no one uses use_absolute_position_embeddings=True (it would have to be a new model) together with interpolate_pos_encoding=True.

It's a bug introduced when the interpolate_pos_encoding flag was added: it should be embeddings = embeddings + ..., not cls_tokens. But even with that fix we would get double interpolation, once here and once in BeitPatchEmbeddings, so I just cleaned this up.
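
To make the shape issue concrete, a toy sketch (arbitrary shapes, not the actual BEiT code):

import torch

# The interpolated encoding covers the CLS token plus all patches, so adding
# it to cls_tokens silently broadcasts instead of raising.
batch, num_patches, hidden = 2, 4, 8
cls_tokens = torch.zeros(batch, 1, hidden)
pos_encoding = torch.zeros(batch, num_patches + 1, hidden)

buggy = cls_tokens + pos_encoding
print(buggy.shape)  # torch.Size([2, 5, 8]): no longer a single CLS token,
                    # which breaks a later torch.cat((cls_tokens, embeddings), dim=1)

# Intended fix: add the encoding to the full sequence (CLS token + patches).
embeddings = torch.cat((cls_tokens, torch.zeros(batch, num_patches, hidden)), dim=1)
fixed = embeddings + pos_encoding
print(fixed.shape)  # torch.Size([2, 5, 8])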

@ydshieh (Collaborator)

OK, indeed. Thanks!

Comment on lines -251 to -255
def forward(
    self,
    pixel_values: torch.Tensor,
    position_embedding: Optional[torch.Tensor] = None,
) -> torch.Tensor:
@qubvel (Contributor Author)

For consistency with other models, position_embedding was removed from BeitPatchEmbeddings and is now applied in the BeitEmbeddings module. This is a breaking change, but I suppose the BeitPatchEmbeddings module is only ever used as part of BeitEmbeddings.
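
Schematically, the refactor looks like this (a hedged sketch with toy module names and shapes, not the literal transformers diff):

import torch
from torch import nn

class PatchEmbeddings(nn.Module):
    def __init__(self, patch_dim: int, hidden_size: int):
        super().__init__()
        self.projection = nn.Linear(patch_dim, hidden_size)

    # After the change: no position_embedding argument here.
    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        return self.projection(patches)

class Embeddings(nn.Module):
    def __init__(self, num_patches: int, patch_dim: int, hidden_size: int):
        super().__init__()
        self.patch_embeddings = PatchEmbeddings(patch_dim, hidden_size)
        self.position_embeddings = nn.Parameter(torch.zeros(1, num_patches, hidden_size))

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        embeddings = self.patch_embeddings(patches)
        # Position embeddings are applied in the parent module now,
        # matching other vision models.
        return embeddings + self.position_embeddings

x = torch.randn(2, 16, 48)  # (batch, num_patches, patch_dim)
print(Embeddings(16, 48, 32)(x).shape)  # torch.Size([2, 16, 32])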

@ydshieh (Collaborator) left a comment

Before looking at the tests, I have some questions about the modeling code changes 🙏

@ydshieh (Collaborator) left a comment

LGTM, thanks a lot.

A nit question regarding logger.warning_once() vs. warnings.warn.
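
For reference on the nit, the two options look like this (generic messages, not this PR's exact diff):

import warnings
from transformers.utils import logging

logger = logging.get_logger(__name__)

# logger.warning_once emits the message only once per process; the usual choice
# in transformers modeling code for warnings that would otherwise fire on every
# forward pass.
logger.warning_once("interpolate_pos_encoding is always applied for this model.")

# warnings.warn is the stdlib route, typically reserved for deprecations so
# users can filter by category.
warnings.warn("`position_embedding` is deprecated.", FutureWarning)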

@qubvel (Contributor Author) commented Jan 30, 2025

run-slow: beit, data2vec, dpt, zoedepth, detr, dinov2, vit, textnet

@github-actions

This comment contains run-slow, running the specified jobs: ['models/beit', 'models/data2vec', 'models/detr', 'models/dinov2', 'models/dpt', 'models/textnet', 'models/vit', 'models/zoedepth'] ...

@qubvel (Contributor Author) commented Jan 30, 2025

cc @ArthurZucker for review

@qubvel qubvel requested a review from ArthurZucker January 30, 2025 22:18
@ArthurZucker (Collaborator) left a comment

🤗 thanks for taking care of our CI's health!

@qubvel qubvel merged commit d419862 into huggingface:main Feb 13, 2025
16 checks passed
eleanorTurintech pushed a commit to eleanorTurintech/transformers that referenced this pull request Feb 13, 2025
* Trigger tests

* [run-slow] beit, detr, dinov2, vit, textnet

* Fix BEiT interpolate_pos_encoding

* Fix DETR test

* Update DINOv2 test

* Fix textnet

* Fix vit

* Fix DPT

* fix data2vec test

* Fix textnet test

* Update interpolation check

* Fix ZoeDepth tests

* Update interpolate embeddings for BEiT

* Apply suggestions from code review