Support grad_clip_norm_() for FSDP #20784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open: amorehead wants to merge 20 commits into Lightning-AI:master from amorehead:fsdp-grad-clip-by-norm
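From the user's side, this PR targets the standard Lightning gradient-clipping flags when the FSDP strategy is selected. A hedged usage sketch (the flags shown are real `Trainer` arguments, but whether this exact combination works depends on this PR being merged):

```python
import lightning as L

# With this PR, norm-based clipping would be usable together with FSDP;
# before it, combining strategy="fsdp" with "norm" clipping is unsupported.
trainer = L.Trainer(
    strategy="fsdp",
    gradient_clip_val=1.0,           # clip gradients to a max L2 norm of 1.0
    gradient_clip_algorithm="norm",  # "norm" (as opposed to "value") clipping
)
```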
Commits (20):
- 2fc2fb7 Update fsdp.py (amorehead)
- aa6e482 Merge branch 'Lightning-AI:master' into fsdp-grad-clip-by-norm (amorehead)
- c36f40c Support gradient norm clipping for FSDP (amorehead)
- 8fad423 Update CHANGELOG.md (amorehead)
- 04fbaf1 Fix args for certain precisions (amorehead)
- bce69ca Standardize precision args (amorehead)
- 0df38f5 Guard for typing (amorehead)
- a42b974 Fix argument typing (amorehead)
- ed2fe05 Wrap AMP test module in FSDP (amorehead)
- 2f62a0a Simplify guard (amorehead)
- 7f7987e Remove FSDP traces in AMP precision unit test (amorehead)
- 0b9b2a3 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- f98ce47 Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 5814091 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- 75d6d9f Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 395c7fd Merge branch 'master' into fsdp-grad-clip-by-norm (Borda)
- 6f04f9c Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- dee2225 Apply suggestions from code review (Borda)
- 169e20c Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
- 3d80102 Merge branch 'master' into fsdp-grad-clip-by-norm (amorehead)
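For context on why FSDP needs dedicated clipping support: FSDP shards parameters and gradients across ranks, so the global gradient norm cannot be computed locally, and `torch.nn.utils.clip_grad_norm_` gives the wrong answer; PyTorch's `FullyShardedDataParallel.clip_grad_norm_` exists precisely to all-reduce the norm before scaling. The arithmetic being performed is simple, though; a minimal single-process sketch on plain Python lists (function name and structure are illustrative, not Lightning's implementation):

```python
import math

def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors in place so their global L2 norm does not
    exceed max_norm; returns the pre-clipping total norm, mirroring the
    contract of torch.nn.utils.clip_grad_norm_."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    clip_coef = max_norm / (total_norm + eps)
    if clip_coef < 1.0:  # only scale down, never up
        for vec in grads:
            for i in range(len(vec)):
                vec[i] *= clip_coef
    return total_norm
```

Under FSDP, the `total_norm` step is where the strategies diverge: each rank holds only a shard, so the per-shard sums of squares must be all-reduced across ranks before taking the square root.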
This would be a breaking change; it has to go to the end of the argument list.
Are you referring to how other codebases like Fabric would call `clip_gradients`? As far as I can tell from this PR's unit tests, no references in the Lightning codebase are broken by this change. And if you are, for clarification, would `module` have to be made a `module: Optional[Module] = None` as the last argument in all of the modified functions below?
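The compatibility concern can be illustrated with a toy signature (hypothetical names, not Lightning's actual API): appending the new parameter with a default keeps existing positional call sites working, whereas inserting it earlier silently shifts their arguments.

```python
from typing import Any, Optional

# Hypothetical "before" signature: callers may already pass both
# arguments positionally, e.g. clip_gradients_v1(opt, 1.0).
def clip_gradients_v1(optimizer: Any, clip_val: float = 0.0) -> tuple:
    return (optimizer, clip_val)

# Non-breaking extension: the new `module` parameter goes last with a
# default, so clip_gradients_v2(opt, 1.0) binds exactly as before.
def clip_gradients_v2(
    optimizer: Any,
    clip_val: float = 0.0,
    module: Optional[Any] = None,
) -> tuple:
    return (optimizer, clip_val, module)

# Breaking alternative: with `module` inserted before `clip_val`, the
# old positional call clip_gradients_broken(opt, 1.0) binds 1.0 to
# `module`, and clip_val silently stays at its default.
def clip_gradients_broken(
    optimizer: Any,
    module: Optional[Any] = None,
    clip_val: float = 0.0,
) -> tuple:
    return (optimizer, clip_val, module)
```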