
Conversation


@gante gante commented May 2, 2025

What does this PR do?

Our CI is failing due to OOM in some slow tests (see CacheHardIntegrationTest failures here).

This PR replaces the 7B model (which requires ~15GB of VRAM) with a 4B model (~9GB of VRAM) in the affected tests. It also makes a few other minor modifications to ensure a green CI (commented in the diff).
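As a rough sanity check on those VRAM figures (a back-of-the-envelope sketch, not a measurement; the helper below is invented for illustration): at 2 bytes per parameter in bf16/fp16, the weights alone dominate the quoted footprints, with the remainder going to activations and the KV cache.

```python
def bf16_weight_gib(num_params):
    # bf16/fp16 store 2 bytes per parameter; runtime usage is higher
    # once activations and the KV cache are added on top
    return num_params * 2 / 1024**3

print(f"7B weights: ~{bf16_weight_gib(7e9):.1f} GiB")  # ~13.0 GiB
print(f"4B weights: ~{bf16_weight_gib(4e9):.1f} GiB")  # ~7.5 GiB
```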

@gante gante requested a review from ydshieh May 2, 2025 09:31
@github-actions github-actions bot marked this pull request as draft May 2, 2025 09:31

github-actions bot commented May 2, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@gante gante marked this pull request as ready for review May 2, 2025 09:31
Comment on lines +304 to +312
def setUp(self):
    # Clears memory before each test. Some tests use large models, which might result in suboptimal torch
    # re-allocation if we run multiple tests in a row without clearing memory.
    cleanup(torch_device, gc_collect=True)

@classmethod
def tearDownClass(cls):
    # Clears memory after the last test. See `setUp` for more details.
    cleanup(torch_device, gc_collect=True)
@gante (Contributor Author)

Related to this fix.

On main, we're not reusing GPU memory properly across from_pretrained calls. Until that is sorted out, we have to start each test with a memory reset to prevent flaky tests. Without the reset, the first test in this class might hit memory-related issues left over from earlier runs (see the diff in test_cache_copy, the first test to run in this class).
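For context, here is a minimal sketch of what such a memory-reset helper might look like. The cleanup and torch_device names follow transformers' testing utilities, but this body is an illustrative assumption, not the library's actual implementation (which is backend-agnostic rather than CUDA-only):

```python
import gc


def cleanup(device, gc_collect=False):
    # Illustrative sketch: run a Python-level collection first, then
    # release cached CUDA blocks so the next from_pretrained call starts
    # from a clean allocator state.
    if gc_collect:
        gc.collect()
    try:
        import torch

        if "cuda" in str(device) and torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.reset_peak_memory_stats(device)
    except ImportError:
        pass  # no torch installed: nothing device-side to clear


cleanup("cuda:0", gc_collect=True)  # safe no-op on CPU-only machines
```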

Comment on lines +466 to +468
"enriching experience that broadens our horizons and allows us to explore the world beyond our comfort "
"zones. Whether it's a short weekend getaway",
@gante gante (Contributor Author) May 2, 2025


Depending on the device, running this test in isolation (RUN_SLOW=1 py.test tests/utils/test_cache_utils.py -k test_cache_copy) might produce a different output compared to a full test suite run. With the updated memory reset, this is the correct output in all combinations -- see the comment above.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh ydshieh (Collaborator) left a comment


Thank you!


ydshieh commented May 5, 2025

It would be nice to try run-slow: utils here, maybe?


gante commented May 6, 2025

run-slow: utils


github-actions bot commented May 6, 2025

This comment contains run-slow, running the specified jobs:

models: ['utils']
quantizations: [] ...

@gante gante force-pushed the smaller_model_cache_tests branch from 394e83f to 899cb26 Compare May 6, 2025 09:54

gante commented May 6, 2025

(there are some multi-gpu issues, fixing them)


gante commented May 6, 2025

@ydshieh the multi-gpu issues require extensive changes to the offloaded caches. I'm merging this as-is and will open a new PR to make multi-gpu work!

@gante gante merged commit 9981214 into huggingface:main May 6, 2025
11 checks passed
@gante gante deleted the smaller_model_cache_tests branch May 6, 2025 10:15
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025