fix: CPU RAM efficient loading for nd or HSDP parallelisms by kmehant · Pull Request #3740 · huggingface/accelerate

kmehant · 2025-08-20T13:57:13Z

What does this PR do?

This fixes cpu ram efficient loading bug for nd parallel or even HSDP (replicate + shard) which needs broadcasting the tensor to all world ranks from global rank 0. Since from_pretrained from transformers is designed to load on global rank 0 in ram efficient loading case.

Fixes

    dist.broadcast(full_tensor, src=0, group=device_mesh.get_group())
                                             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/lib/python3.12/site-packages/torch/distributed/device_mesh.py", line 752, in get_group
    raise RuntimeError: ('Found the DeviceMesh have 2 dimensions', 'Optional kwarg `mesh_dim` needs to be specified when device_mesh.ndim > 1.', 'If you want to get the list of all the ProcessGroups in the DeviceMesh,please use `get_all_groups()` instead.')

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@S1ro1 @SunMarc @zach-huggingface

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

kmehant · 2025-08-20T13:59:16Z

@S1ro1 @SunMarc can we attend to this pressing bug please? Thank you.

S1ro1

LGTM!

HuggingFaceDocBuilderDev · 2025-08-21T10:11:39Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

kmehant · 2025-08-21T11:38:44Z

@S1ro1 Thank you, can we merge this?

S1ro1 · 2025-08-21T11:40:00Z

Yep, sorry forgot to come back to this after checks passed haha!

kmehant · 2025-10-08T05:07:01Z

Hi @S1ro1 @SunMarc When are we making a release on accelerate and including these patches? Would be helpful having these released.

SunMarc · 2025-10-08T16:46:57Z

Hi @S1ro1 @SunMarc When are we making a release on accelerate and including these patches? Would be helpful having these released.

I will try to do a patch end of the week if I manage to fix some minor bugs !

kmehant · 2025-10-09T06:46:17Z

Thank you @SunMarc

fix: cpu ram efficient loading for nd or hsdp parallelisms

5d2be6c

Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com>

kmehant changed the title ~~fix: cpu ram efficient loading for nd or hsdp parallelisms~~ fix: CPU RAM efficient loading for nd or HSDP parallelisms Aug 20, 2025

S1ro1 approved these changes Aug 21, 2025

View reviewed changes

S1ro1 merged commit 979d81e into huggingface:main Aug 21, 2025
25 checks passed

This was referenced Nov 28, 2025

feat: CP support for mamba layer foundation-model-stack/fms-hf-tuning#642

Open

feat: CP support for mamba layer foundation-model-stack/fms-acceleration#164

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: CPU RAM efficient loading for nd or HSDP parallelisms#3740

fix: CPU RAM efficient loading for nd or HSDP parallelisms#3740
S1ro1 merged 1 commit intohuggingface:mainfrom
kmehant:fix-cpu-ram-hsdp

kmehant commented Aug 20, 2025 •

edited

Loading

Uh oh!

kmehant commented Aug 20, 2025

Uh oh!

S1ro1 left a comment

Uh oh!

HuggingFaceDocBuilderDev commented Aug 21, 2025

Uh oh!

kmehant commented Aug 21, 2025

Uh oh!

S1ro1 commented Aug 21, 2025

Uh oh!

Uh oh!

kmehant commented Oct 8, 2025 •

edited

Loading

Uh oh!

SunMarc commented Oct 8, 2025

Uh oh!

kmehant commented Oct 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

kmehant commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Before submitting

Who can review?

Uh oh!

kmehant commented Aug 20, 2025

Uh oh!

S1ro1 left a comment

Choose a reason for hiding this comment

Uh oh!

HuggingFaceDocBuilderDev commented Aug 21, 2025

Uh oh!

kmehant commented Aug 21, 2025

Uh oh!

S1ro1 commented Aug 21, 2025

Uh oh!

Uh oh!

kmehant commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

SunMarc commented Oct 8, 2025

Uh oh!

kmehant commented Oct 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kmehant commented Aug 20, 2025 •

edited

Loading

kmehant commented Oct 8, 2025 •

edited

Loading