Apply packed sequence params change for fused rope compatibility #11506
ananthsub merged 5 commits into NVIDIA-NeMo:main
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
5aa617c to 3346acb
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
Signed-off-by: ananthsub <ananthsub@users.noreply.github.com>
    # Pipeline dtype is coupled with the bf16 mixed precision plugin
    pipeline_dtype=torch.bfloat16,
fyi @hemildesai @BoxiangW for the parallelism setting refactor. this requirement is coming after #10954 which validates that the pipeline dtype is now set here.
there are multiple paths to set this:
- either on the megatron strategy directly
- via the precision plugin
setting it in multiple places feels wrong, especially since users have to make 2 hops in the codebase to figure this out:
- https://github.com/NVIDIA/NeMo/blob/bde672e75f1ac45ead08e2b977920a28eb81448e/nemo/lightning/pytorch/strategies/megatron_strategy.py#L288-L290
- https://github.com/NVIDIA/NeMo/blob/bde672e75f1ac45ead08e2b977920a28eb81448e/nemo/lightning/pytorch/plugins/mixed_precision.py#L107-L113C46
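The two paths described above can be sketched with stand-in classes. Everything in this block is hypothetical: `Strategy`, `PrecisionPlugin`, and `resolve_pipeline_dtype` are illustrative names, and dtypes are plain strings rather than `torch.dtype` values. It is not NeMo's actual code, only a rough picture of why two sources for one setting are easy to get wrong.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Strategy:
    """Hypothetical stand-in for a strategy that accepts pipeline_dtype directly."""
    pipeline_model_parallel_size: int = 1
    pipeline_dtype: Optional[str] = None


@dataclass
class PrecisionPlugin:
    """Hypothetical stand-in for a precision plugin that implies a pipeline dtype."""
    precision: str = "bf16-mixed"

    def implied_pipeline_dtype(self) -> Optional[str]:
        # Assumed mapping, for illustration only.
        return {"bf16-mixed": "bfloat16", "16-mixed": "float16"}.get(self.precision)


def resolve_pipeline_dtype(strategy: Strategy,
                           plugin: Optional[PrecisionPlugin] = None) -> Optional[str]:
    """Pick the pipeline dtype from either source, rejecting conflicts and omissions."""
    from_plugin = plugin.implied_pipeline_dtype() if plugin is not None else None
    candidates = {strategy.pipeline_dtype, from_plugin} - {None}
    if len(candidates) > 1:
        raise ValueError(f"conflicting pipeline dtypes: {sorted(candidates)}")
    if not candidates:
        if strategy.pipeline_model_parallel_size > 1:
            raise ValueError("pipeline_dtype must be set when pipeline parallelism is used")
        return None
    return candidates.pop()
```

With a single resolver like this, a reader has one place to look; with the setting spread across two objects, they have to check both.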
I think a better way than hardcoding pipeline_dtype is to make it a function attribute and set its value if it's used https://github.com/NVIDIA/NeMo/pull/11504/files#diff-78f81f4094cfea056c177e87c0d527b9ce27cee11813138e5a2a69370b922c19R282
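One possible reading of this suggestion, as a minimal sketch: expose `pipeline_dtype` as an optional parameter and forward it only when the caller supplies it. The function name `build_strategy_kwargs` and the string dtype are assumptions made for illustration; the linked PR may do this differently.

```python
from typing import Any, Dict, Optional


def build_strategy_kwargs(pipeline_parallelism: int = 1,
                          pipeline_dtype: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical recipe helper: pipeline_dtype is a parameter, not a hardcoded value."""
    kwargs: Dict[str, Any] = {"pipeline_model_parallel_size": pipeline_parallelism}
    # Only forward the dtype when the caller actually set it.
    if pipeline_dtype is not None:
        kwargs["pipeline_dtype"] = pipeline_dtype
    return kwargs


# Usage: the default call leaves the dtype unset; a pipeline-parallel call passes it.
default_kwargs = build_strategy_kwargs()
pp_kwargs = build_strategy_kwargs(pipeline_parallelism=2, pipeline_dtype="bfloat16")
```

This keeps the default path free of a dtype the user never asked for, at the cost of requiring callers who enable pipeline parallelism to pass the value themselves.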
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
…DIA-NeMo#11506)
* Apply packed sequence params change for fused rope compatibility
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
* fix lint
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
---------
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
What does this PR do?
Adds compatibility with NVIDIA/Megatron-LM@210162a.
Collection: nlp
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information