@Cyrilvallez Cyrilvallez (Member) commented Apr 7, 2025

What does this PR do?

As per the title. We prioritize this model family for now because their checkpoints on the Hub seem to contain corrupted weights, resulting in bad initializations (see #37070 as well). These models are also used in optimum's tests!
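
One way to spot the kind of bad initialization described above is to scan each parameter for non-finite or implausibly large values. The sketch below is only illustrative: `find_suspect_params`, the `max_abs` threshold, and the plain-dict input are assumptions made for this example, not part of this PR or of transformers.

```python
import math
from typing import Dict, Iterable, List


def find_suspect_params(
    params: Dict[str, Iterable[float]], max_abs: float = 1e4
) -> List[str]:
    """Return names of parameters holding NaN, inf, or very large values.

    `params` maps a parameter name to its flattened values. Both this helper
    and the `max_abs` threshold are hypothetical, chosen for illustration.
    """
    suspects = []
    for name, values in params.items():
        for v in values:
            # Garbage values often show up as NaN/inf or as magnitudes far
            # outside the usual range of trained weights.
            if not math.isfinite(v) or abs(v) > max_abs:
                suspects.append(name)
                break  # one bad value is enough to flag this parameter
    return suspects


# Toy example with made-up values (not taken from any real checkpoint).
toy = {
    "encoder.weight": [0.01, -0.02, 0.03],
    "lm_head.bias": [0.0, 1.6e31, 0.0],
    "pooler.weight": [float("nan"), 0.1],
}
print(find_suspect_params(toy))
```

With a PyTorch model, the same idea could be applied to each tensor in `model.state_dict()`, for instance with `torch.isfinite`.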

@github-actions github-actions bot marked this pull request as draft April 7, 2025 11:07
@github-actions github-actions bot (Contributor) commented Apr 7, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@Cyrilvallez Cyrilvallez added the "for patch" label (Tag issues / labels that should be included in the next patch) Apr 7, 2025
@Cyrilvallez Cyrilvallez marked this pull request as ready for review April 7, 2025 11:07
@github-actions github-actions bot requested review from ArthurZucker and eustlb April 7, 2025 11:07
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@IlyasMoutawwakil IlyasMoutawwakil (Member) commented Apr 7, 2025

Thanks!
Can you please also add: ["poolformer", "dpt", "roformer", "mpnet", "deberta", "deberta_v2", "big_bird"]

2025-04-07T11:54:59.3014635Z =========================== short test summary info ============================
2025-04-07T11:54:59.3015409Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_02_big_bird - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3015420Z 
2025-04-07T11:54:59.3015532Z Mismatched elements: 36 / 453222 (0.0%)
2025-04-07T11:54:59.3015796Z Greatest absolute difference: 1.6531025742728464e+31 at index (0, 0, 2) (up to 0.0001 allowed)
2025-04-07T11:54:59.3016081Z Greatest relative difference: 2693914.25 at index (0, 0, 0) (up to 0.0001 allowed)
2025-04-07T11:54:59.3016599Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_06_deberta - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3016607Z 
2025-04-07T11:54:59.3016715Z Mismatched elements: 15360 / 15360 (100.0%)
2025-04-07T11:54:59.3016973Z Greatest absolute difference: 2.5342160989229478e+26 at index (0, 0, 463) (up to 0.0001 allowed)
2025-04-07T11:54:59.3017208Z Greatest relative difference: 9.20669464448467e+19 at index (0, 4, 110) (up to 0.0001 allowed)
2025-04-07T11:54:59.3017783Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_07_deberta_v2 - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3017789Z 
2025-04-07T11:54:59.3017904Z Mismatched elements: 1152007 / 1152009 (100.0%)
2025-04-07T11:54:59.3018102Z Greatest absolute difference: nan at index (0, 0, 65) (up to 0.0001 allowed)
2025-04-07T11:54:59.3018295Z Greatest relative difference: nan at index (0, 0, 65) (up to 0.0001 allowed)
2025-04-07T11:54:59.3018813Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_12_mobilebert - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3018822Z 
2025-04-07T11:54:59.3018927Z Mismatched elements: 15168 / 26976 (56.2%)
2025-04-07T11:54:59.3019123Z Greatest absolute difference: nan at index (0, 0, 2) (up to 0.0001 allowed)
2025-04-07T11:54:59.3019309Z Greatest relative difference: nan at index (0, 0, 2) (up to 0.0001 allowed)
2025-04-07T11:54:59.3019814Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_13_mpnet - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3019819Z 
2025-04-07T11:54:59.3019921Z Mismatched elements: 26831 / 27000 (99.4%)
2025-04-07T11:54:59.3020175Z Greatest absolute difference: 1.6455917599407446e+31 at index (0, 0, 466) (up to 0.0001 allowed)
2025-04-07T11:54:59.3020395Z Greatest relative difference: 22199113728.0 at index (0, 5, 2) (up to 0.0001 allowed)
2025-04-07T11:54:59.3020908Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_16_roformer - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3020973Z 
2025-04-07T11:54:59.3021073Z Mismatched elements: 9 / 450000 (0.0%)
2025-04-07T11:54:59.3021296Z Greatest absolute difference: 523218419712.0 at index (0, 0, 0) (up to 0.0001 allowed)
2025-04-07T11:54:59.3021505Z Greatest relative difference: 323862.34375 at index (0, 0, 0) (up to 0.0001 allowed)
2025-04-07T11:54:59.3022028Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_compare_to_transformers_17_squeezebert - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3022033Z 
2025-04-07T11:54:59.3022136Z Mismatched elements: 26976 / 26976 (100.0%)
2025-04-07T11:54:59.3022330Z Greatest absolute difference: nan at index (0, 0, 0) (up to 0.0001 allowed)
2025-04-07T11:54:59.3022516Z Greatest relative difference: nan at index (0, 0, 0) (up to 0.0001 allowed)
2025-04-07T11:54:59.3023102Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForMaskedLMIntegrationTest::test_pipeline_ort_model_12_mobilebert - AssertionError: nan not greater than or equal to 0.0
2025-04-07T11:54:59.3023718Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForImageClassificationIntegrationTest::test_pipeline_ort_model_11_poolformer - AssertionError: nan not greater than or equal to 0.0
2025-04-07T11:54:59.3024263Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForSemanticSegmentationIntegrationTest::test_compare_to_transformers_1_dpt - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3024268Z 
2025-04-07T11:54:59.3024369Z Mismatched elements: 2048 / 2048 (100.0%)
2025-04-07T11:54:59.3024639Z Greatest absolute difference: 1.6033836527025424e+34 at index (0, 0, 10, 20) (up to 0.0001 allowed)
2025-04-07T11:54:59.3024924Z Greatest relative difference: 1.0000416040420532 at index (0, 1, 10, 20) (up to 0.0001 allowed)
2025-04-07T11:54:59.3025590Z FAILED tests/onnxruntime/test_modeling.py::ORTModelForImageClassificationIntegrationTest::test_compare_to_transformers_11_poolformer - AssertionError: Tensor-likes are not close!
2025-04-07T11:54:59.3025605Z 
2025-04-07T11:54:59.3025703Z Mismatched elements: 2 / 2 (100.0%)
2025-04-07T11:54:59.3025924Z Greatest absolute difference: 8058407156187136.0 at index (0, 1) (up to 0.0001 allowed)
2025-04-07T11:54:59.3026132Z Greatest relative difference: 1692201984.0 at index (0, 1) (up to 0.0001 allowed)
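
The "Tensor-likes are not close!" messages above come from a tolerance check of the form |actual - expected| <= atol + rtol * |expected|, which is the rule documented for `torch.testing.assert_close`. The pure-Python sketch below mimics that rule on flat lists to show why NaN or very large outputs are reported as mismatches. `summarize_mismatch` is a hypothetical helper written for illustration, not the test code used by optimum, and the sample values are made up.

```python
import math
from typing import Sequence, Tuple


def summarize_mismatch(
    actual: Sequence[float],
    expected: Sequence[float],
    atol: float = 1e-4,
    rtol: float = 1e-4,
) -> Tuple[int, float]:
    """Return (number of mismatched elements, greatest absolute difference)."""
    mismatched = 0
    greatest = 0.0
    for a, e in zip(actual, expected):
        if a == e:
            continue  # exact matches (including equal infinities) are close
        diff = abs(a - e)
        # Any comparison involving NaN is False, so NaN is always a mismatch.
        if math.isnan(diff) or diff > atol + rtol * abs(e):
            mismatched += 1
        if math.isnan(diff):
            greatest = math.nan  # once NaN appears, the summary reports NaN
        elif not math.isnan(greatest):
            greatest = max(greatest, diff)
    return mismatched, greatest


# One garbage value among otherwise matching outputs.
count, worst = summarize_mismatch([0.5, 1.6e31, 0.25], [0.5, 0.1, 0.25])
print(count, worst)

# A NaN output makes the greatest difference NaN as well.
nan_count, nan_worst = summarize_mismatch([math.nan, 1.0], [0.0, 1.0])
print(nan_count, nan_worst)
```

This matches the two patterns in the log: a handful of enormous values (as for big_bird and roformer) or NaN differences (as for deberta_v2, mobilebert and squeezebert).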

@Cyrilvallez Cyrilvallez merged commit 22065bd into main Apr 7, 2025
17 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-derived-bert-family branch April 7, 2025 16:25
vasqu pushed a commit to vasqu/transformers that referenced this pull request Apr 7, 2025
* fix derived berts

* more

* roformer
ArthurZucker pushed a commit that referenced this pull request Apr 7, 2025
cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025
@yaswanth19 yaswanth19 mentioned this pull request Apr 18, 2025
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025