Skip to content

Conversation

@sebbaur
Copy link
Contributor

@sebbaur sebbaur commented Mar 3, 2025

What does this PR do?

Make parameters of ViTPooler (activation function, output size) configurable, so that I can open-source https://arxiv.org/abs/2403.02522 more easily (the encoder slightly differs from the current implementation).

Fixes # (issue)
#36513

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Note that the code didn't have any test to the best of my knowledge, so I have not added any. Besides, the change is trivial so it may not be needed.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@github-actions github-actions bot marked this pull request as draft March 3, 2025 15:56
@qubvel qubvel added the Vision label Mar 3, 2025
@qubvel
Copy link
Contributor

qubvel commented Mar 3, 2025

Hey @sebbaur, feel free to ping me when it's ready for review! Btw, we have activation functions defined in ACT2FN mapping, see llama for example.

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 3, 2025

Thanks for the prompt reply and the pointers! I hadn't noticed ACT2FN -- I have made the change

Once the CI is happy, I will ping you

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 3, 2025

I had to modify files in dpt and deit too to make CI happy

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 3, 2025

tests are not done but I suppose they will pass

@sebbaur sebbaur marked this pull request as ready for review March 3, 2025 20:47
@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 3, 2025

@qubvel I think it is ready -- it took more work than I initially expected because I had to propagate changes to other parts of the codebase which use the same pooler. This felt a bit out of scope, but I suppose this is WAI?

Copy link
Contributor

@qubvel qubvel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me, just a question re docs:

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 4, 2025

Somebody suggested here #36513 that this change was not appropriate

I am a bit confused why -- it is a no-op and just makes some hardcoded hyperparameters configurable

@Rocketknight1
Copy link
Member

Hey! That was me but on reflection I think it's okay - we do have a general rule in our philosophy against adding additional features to existing pretrained model architectures like this, but when it's compact and backward compatibility is guaranteed I think it's probably okay, especially if the PR is already complete.

Copy link
Contributor

@qubvel qubvel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebbaur, thanks for making CI happy, just the last nit comment

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 7, 2025

@qubvel it says 2 workflows awaiting approval -- do you know who I should reach out to to review this?

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 10, 2025

@qubvel @Rocketknight1 friendly ping :) -- do you know who should review the workflows? 2 workflows awaiting approval

@Rocketknight1
Copy link
Member

@sebbaur the additional workflows are the documentation builder - it's usually not essential to run those. I enabled them anyway, but at this point we just need core maintainer approval and we're good to merge cc @ArthurZucker @Cyrilvallez

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 12, 2025

hello @ArthurZucker and @Cyrilvallez, would you be able to take a look?

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 13, 2025

friendly ping :)

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 17, 2025

Happy Monday @ArthurZucker and @Cyrilvallez! Would you please be able to take a quick look? That would unlock a few things on my side

Thanks!

@qubvel qubvel requested a review from Cyrilvallez March 17, 2025 16:50
@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 19, 2025

Hello @Cyrilvallez ! Could you please take a look? Even if this cannot be merged, it would be very helpful for me to know this sooner than later, as I am planning to open-source a Pytorch implementation of https://huggingface.co/google/hear, which this PR allows easily. I need time to find an alternative if this is not ok

Copy link
Member

@Cyrilvallez Cyrilvallez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Sorry for the delay! @Rocketknight1 is right that we usually don't like to change modeling code so much, but let's make an exception here!
However, let's still minimize impact and removed unecesary changes, as from my understanding you only need vit:

  • Ijepa should never have been impacted
  • let's juste remove "Copied from" for deit and dpt instead of modifying them, and revert the changes to modeling and configs
  • flax change of vit is wrong -> let's make sure it works by adding tanh in the mapping in modeling_flax_utils

Comment on lines 420 to +421
)
self.activation = ACT2FN[self.config.pooler_act]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"tanh" is a KeyError here, it does not exist in the mapping

@Cyrilvallez
Copy link
Member

For the "Copied from", you can replace them with something such as "Originally copied from", it will stop our linter from forcing the changes, and at the same time we still keep track of the fact it was previously similar, in case we need it later

@Cyrilvallez
Copy link
Member

Alright, after discussions with @qubvel, let's keep the changes that were propagated to the other models, it's fine! You just need to fix the issue on flax vit then 😉 Sorry for the roller coaster! 🙃

@sebbaur
Copy link
Contributor Author

sebbaur commented Mar 20, 2025

Thanks for your review!

I have made the change to the ACT2FN dict -- good catch!

Next time I will know that the models' code shouldn't be updated... Sorry about this

@Cyrilvallez
Copy link
Member

Merging! Thanks!!

@Cyrilvallez Cyrilvallez merged commit 62116c9 into huggingface:main Mar 21, 2025
21 checks passed
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
* Make ViT Pooler configurable, so that it is possible to pick the activation function and the number of channels in the output

* Add documentation and allow functions as activations (instead of just string)

* formatting change

* Use ACT2FN

* Formatting change

* Formatting changes

* force pooler_act to be string

* force pooler_act to be string

* Add configs to OBJECTS_TO_IGNORE to make check_docstrings happy

* Making the same change in ijepa to make check_modular_conversion happy

* Add IJepaConfig to make CI happy

* rename pooler_size to pooler_output_size as defined in the config

* typo

* revert change to ignore variable

* Ran utils/check_docstrings.py --fix_and_overwrite

* revert unrelated change

* remove redundant defaults

* rename self.act -> self.activation

* tanh activation function in mapping
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants