Conversation

@simsim314

Multi-GPU support added, mainly by modifying transformer_hidream_image.py.

Some PyTorch usage in the file (e.g. the cat function) assumed the whole model sits on a single GPU; this patch solves the issue by moving the relevant intermediate results to the same GPU before applying torch.cat.
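
For illustration, a minimal sketch of the kind of device alignment this does before concatenation (the helper name and signature are hypothetical, not the actual code in transformer_hidream_image.py):

```python
import torch

def cat_same_device(a: torch.Tensor, b: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # When the model is sharded across GPUs, a and b may live on different
    # devices; torch.cat requires all inputs on the same device.
    if b.device != a.device:
        b = b.to(a.device)
    return torch.cat([a, b], dim=dim)
```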

The initial distribution of the model across the GPUs uses the accelerate library's "auto" device mapping.
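
A minimal sketch of that "auto" placement using accelerate's standard utilities (the toy model below is a stand-in for the actual HiDream transformer, which the patch loads instead):

```python
import torch.nn as nn
from accelerate import dispatch_model, infer_auto_device_map

# Toy stand-in for the real transformer; assumption for illustration only.
model = nn.Sequential(*[nn.Linear(1024, 1024) for _ in range(8)])

# Infer a per-module device map over the visible GPUs, then dispatch each
# module to its assigned device (falls back to CPU if no GPU is available).
device_map = infer_auto_device_map(model)
model = dispatch_model(model, device_map=device_map)
```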

```diff
      adaln_input: Optional[torch.FloatTensor] = None,
      rope: torch.FloatTensor = None,
- ) -> torch.FloatTensor:
+ ) -> Tuple[torch.FloatTensor, torch.FloatTensor]: # Use Tuple from typing
```

these comments look like they were scattered around by an LLM and aren't needed.

```diff
      adaln_input,
      rope,
- )
+ ): # Removed return type hint
```

why remove the return type hint?

Comment on lines 322 to 340
```diff
  if is_training:
-     x = einops.rearrange(x, 'B S (p1 p2 C) -> B C S (p1 p2)', p1=self.config.patch_size, p2=self.config.patch_size)
+     x = einops.rearrange(x, 'B S (p1 p2 C) -> B C S (p1 p2)', p1=patch_size, p2=patch_size)
+     H = self.config.max_resolution[0] // patch_size
+     W = self.config.max_resolution[1] // patch_size
+     if x.shape[2] == H * W:
+         x = einops.rearrange(x, 'B C (H W) (p1 p2) -> B C (H p1) (W p2)', H=H, W=W, p1=patch_size, p2=patch_size)
+     else:
+         logger.warning(f"Training unpatchify mismatch: S={x.shape[2]} vs H*W={H*W}")
  else:
      x_arr = []
      for i, img_size in enumerate(img_sizes):
          pH, pW = img_size
-         x_arr.append(
-             einops.rearrange(x[i, :pH*pW].reshape(1, pH, pW, -1), 'B H W (p1 p2 C) -> B C (H p1) (W p2)',
-                              p1=self.config.patch_size, p2=self.config.patch_size)
-         )
+         num_patches = pH * pW
+         current_x = x[i, :num_patches]
+         rearranged_x = einops.rearrange(current_x.unsqueeze(0), 'b (H W) (p1 p2 C) -> b C (H p1) (W p2)',
+                                         H=pH, W=pW, p1=patch_size, p2=patch_size)
+         x_arr.append(rearranged_x)
+     if not x_arr:
+         return torch.empty((0, self.out_channels, 0, 0), device=x.device, dtype=x.dtype)
```

what are these changes for?

Comment on lines 370 to 373
```diff
+     encoder_hidden_states: torch.Tensor = None, # Keep original annotation
      pooled_embeds: torch.Tensor = None,
-     img_sizes: Optional[List[Tuple[int, int]]] = None,
-     img_ids: Optional[torch.Tensor] = None,
+     img_sizes: Optional[List[Tuple[int, int]]] = None, # Keep original annotation
+     img_ids: Optional[torch.Tensor] = None, # Keep original annotation
```

LLM comment spam

```diff
      )

      # spatial forward
-     logger.warning("...")
```

where's the warning now?

@bghira

bghira commented Apr 13, 2025

it feels like a lot of unnecessary changes.

@simsim314
Author

> it feels like a lot of unnecessary changes.

OK, sorry, I thought those changes were superficial - they were done by an LLM prompted to change only the multi-GPU problems around torch.cat. I guess those LLMs are still pretty stupid :)

Anyway, I have now checked all the changes manually - 4 places ensure all tensors are on the same device before the cat operation (3 in the model, 1 in the pipeline); nothing else was touched.

Verified the full model works on 4x RTX 3090 with my RunPod Jupyter script. It clones my branch, so don't use it as-is, but otherwise it's a good example of how to run the model.

Do I need to open a new PR, or are the changes picked up automatically in this one?

@bghira

bghira commented Apr 13, 2025

it's not my project, i just follow its development and port fixes to my own project(s). this is probably fine to leave open, but i don't think the model authors are very responsive. what tends to happen with these repos for new models is that pull requests simply pile up and never get merged.

@simsim314
Author

Anyway, I just wanted to share with the community the option to run on several GPUs, and you were right, some changes didn't make sense...

My branch is now clean of strange LLM artifacts and supports multi-GPU. I guess I'm not the only one who will need it - this model is huge.
