
Commit 1f42205

remove mcore-dist-opt (for now) (#323)
* remove mcore-dist-opt (for now)
* use EP=1 since we use APEX dist opt

Signed-off-by: Alexandros Koumparoulis <[email protected]>
1 parent 0632bb8 commit 1f42205

launcher_scripts/conf/training/mixtral/mixtral_8x7b.yaml

Lines changed: 4 additions & 6 deletions
@@ -52,9 +52,9 @@ model:
   micro_batch_size: 1
   global_batch_size: 256
   rampup_batch_size: null
-  tensor_model_parallel_size: 2
-  pipeline_model_parallel_size: 1
-  expert_model_parallel_size: 8
+  tensor_model_parallel_size: 8
+  pipeline_model_parallel_size: 4
+  expert_model_parallel_size: 1
   virtual_pipeline_model_parallel_size: null
   encoder_seq_length: 4096
   max_position_embeddings: 32768
@@ -145,9 +145,7 @@ model:
     - 0
     gen_shape: false
   optim:
-    name: mcore_distributed_optim
-    overlap_grad_sync: true
-    overlap_param_sync: true
+    name: distributed_fused_adam
     lr: 0.0001
     weight_decay: 0.1
     betas:
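
For reference, a minimal sketch of how the two affected blocks of launcher_scripts/conf/training/mixtral/mixtral_8x7b.yaml read after this commit. The elided keys are assumed unchanged, and the inline comments are annotations added here, not part of the file:

model:
  micro_batch_size: 1
  global_batch_size: 256
  rampup_batch_size: null
  tensor_model_parallel_size: 8    # was 2
  pipeline_model_parallel_size: 4  # was 1
  expert_model_parallel_size: 1    # was 8; per the commit message, EP=1 because the APEX distributed optimizer is used
  # ... remaining model keys unchanged ...
  optim:
    name: distributed_fused_adam   # replaces mcore_distributed_optim; overlap_grad_sync/overlap_param_sync were removed with it
    lr: 0.0001
    weight_decay: 0.1
    # betas and the rest of the optim block are unchanged

One reading of the change (the commit itself states only the EP note): with expert_model_parallel_size: 1 the Mixtral experts are no longer sharded across ranks, and the higher tensor- and pipeline-parallel degrees absorb the resulting per-rank memory increase.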
