The recent pull request (#35837) works with plain Accelerate but breaks with DeepSpeed (with and without a deepspeed config):
- `distributed_type: MULTI_GPU` (works)
- `distributed_type: DEEPSPEED` (no longer works)
More precisely, the issue lies in this section: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L200
```python
if position_embeddings is None:
    emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1)
    cos = emb.cos().float()
    sin = emb.sin().float()
else:
    cos, sin = position_embeddings
q, k = apply_rotary_pos_emb_flashatt(q.unsqueeze(0), k.unsqueeze(0), cos, sin)
```
In the `else` branch, `cos, sin = position_embeddings` are not cast to float and therefore arrive in varying dtypes depending on the DeepSpeed and mixed_precision configuration.
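A minimal sketch of a possible workaround, casting the cached tables back to float32 before the flash-attention rotary kernel runs (`ensure_float_rotary` and the tensor shapes below are hypothetical, for illustration only, not transformers code):

```python
import torch

def ensure_float_rotary(cos: torch.Tensor, sin: torch.Tensor):
    # Mirror the .float() calls from the position_embeddings-is-None branch,
    # so apply_rotary_pos_emb_flashatt always sees float32 tables.
    return cos.float(), sin.float()

# Under DeepSpeed ZeRO-3 with bf16, the cached tables arrive in bfloat16
# (shapes here are made up for illustration):
cos = torch.randn(256, 80, dtype=torch.bfloat16)
sin = torch.randn(256, 80, dtype=torch.bfloat16)

cos, sin = ensure_float_rotary(cos, sin)
assert cos.dtype == sin.dtype == torch.float32
```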
This accelerate config works:

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
enable_cpu_affinity: false
main_training_function: main
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
mixed_precision: bf16
```
This accelerate config no longer works:

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3
downcast_bf16: 'no'
enable_cpu_affinity: false
main_training_function: main
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
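Both runs were launched the same way; for reference, a sketch assuming the config above is saved as `accelerate_config.yaml` and the training entry point is `train.py` (both hypothetical names):

```bash
accelerate launch --config_file accelerate_config.yaml train.py
```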