Fix FSDP mixed precision setting and loss w/ accelerate #465

Maxusmusti · 2025-04-15T21:04:55Z

Accelerate requires mixed_precision to be passed directly in its config; otherwise the fp32 upcast is skipped and training loses precision, resulting in notably worse performance. Passing it also makes our manual mixed precision policy redundant. Because we are now actually using mixed precision, hybrid shard becomes the new default sharding strategy to keep memory expectations in parity.

Signed-off-by: Mustafa Eyceoz <[email protected]>
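
For readers unfamiliar with why a skipped fp32 upcast hurts training, here is a minimal stdlib-only sketch of the general mechanism. It is an illustration, not the code changed in this PR: the helper names (to_bf16, to_fp32, simulate), the starting weight of 1.0, and the constant update of 1e-3 are all hypothetical, and bf16 is emulated by rounding fp32 bit patterns rather than by calling torch or accelerate.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Emulate bf16 by keeping the top 16 bits of the fp32 pattern.

    Uses round-to-nearest-even on the discarded bits. NaN and inf are
    not handled specially; this is only meant for small finite values.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def simulate(steps: int, update: float) -> tuple[float, float]:
    """Apply the same small update `steps` times in two ways.

    Returns (bf16_only_weight, fp32_master_weight).
    """
    low = to_bf16(1.0)     # weight stored and updated purely in bf16
    master = to_fp32(1.0)  # fp32 master copy, as kept under mixed precision
    for _ in range(steps):
        # Without an fp32 copy, the sum is rounded back to bf16 every step.
        low = to_bf16(low + to_bf16(update))
        # With an fp32 copy, the update accumulates at fp32 resolution.
        master = to_fp32(master + to_fp32(update))
    return low, master


if __name__ == "__main__":
    low, master = simulate(steps=1000, update=1e-3)
    print("bf16-only weight:  ", low)
    print("fp32 master weight:", master)
```

In this toy setup the bf16-only weight never leaves 1.0, because the spacing between adjacent bf16 values near 1.0 is 2^-7 (about 0.0078) and each 1e-3 update is rounded away, while the fp32 master copy ends up close to 2.0. Real training is more complex, but the same rounding effect is one reason parameters and optimizer updates are usually kept in fp32 under mixed precision.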

RobotSail

Tested locally and confirmed that this PR produces the same loss as DeepSpeed

JamesKunstle

👍

Fix FSDP mixed precision setting and loss w/ accelerate

07ce6b8

Signed-off-by: Mustafa Eyceoz <[email protected]>

Maxusmusti self-assigned this Apr 15, 2025

mergify bot added the ci-failure label Apr 15, 2025

RobotSail approved these changes Apr 16, 2025

mergify bot added the one-approval label Apr 16, 2025

RobotSail approved these changes Apr 16, 2025

Remove unused import

caaa6d3

Signed-off-by: Mustafa Eyceoz <[email protected]>

Maxusmusti marked this pull request as ready for review April 16, 2025 17:42

JamesKunstle approved these changes Apr 16, 2025

JamesKunstle merged commit 9948a1f into instructlab:main Apr 16, 2025
11 of 13 checks passed

mergify bot removed the one-approval label Apr 16, 2025

RobotSail mentioned this pull request Apr 16, 2025

moves deepspeed requirements into their own file; add deepspeed extras #455

Merged

Maxusmusti mentioned this pull request Apr 17, 2025

HF Format Checkpoints saving in FP32 (FSDP) #477

Closed
