NeMo-UX: fix nemo-ux export path#11081

Merged
akoumpa merged 19 commits into main from akoumparouli/nemo_ux_fix_export_path
Oct 30, 2024

Conversation

@akoumpa
Collaborator

@akoumpa akoumpa commented Oct 29, 2024

What does this PR do?

Fixes #10939

In particular, the changes include:

  1. When using nemo_load, set `setup_optimizers` to `False`, so no optimizer state is required for export.
  2. `load_context` first tries to load from the `/context` subdirectory and falls back to the parent directory if that fails (backwards compatibility with older checkpoints).
  3. When exporting to HF, padding is pruned from the input and output embedding layers to restore the original vocab size.
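The `/context` fallback in point 2 can be sketched as follows. This is a hedged illustration, not the actual NeMo code; the helper name `resolve_context_dir` is hypothetical:

```python
# Sketch of the backwards-compatible context lookup: new-style checkpoints
# keep their serialized IO context under `<ckpt>/context`, while older
# checkpoints store it directly in the checkpoint directory.
import tempfile
from pathlib import Path

def resolve_context_dir(checkpoint_path):
    """Return the directory to load the serialized context from."""
    checkpoint_path = Path(checkpoint_path)
    context_dir = checkpoint_path / "context"
    if context_dir.is_dir():
        return context_dir
    # Fall back to the checkpoint directory itself (pre-/context layouts).
    return checkpoint_path

# Usage with a throwaway directory layout:
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "ckpt"
    (ckpt / "context").mkdir(parents=True)
    assert resolve_context_dir(ckpt).name == "context"
```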

Collection: llm

Changelog

  • Only build the optimizer config if the model has an optimizer and `setup_optimizers` is `True`; pass `setup_optimizers=False` in `nemo_load`.
  • Fix backwards compatibility in `load_context` by falling back from the `/context` subdirectory to the parent directory.
  • Add `torch_dtype_from_mcore_config`; fix the HF model dtype and prune the embedding padding on export.
  • Propagate the changes to the mistral, mixtral, nemotron, qwen2, starcoder, starcoder2, and chatglm exporters.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
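The template snippet was never filled in. As a stand-in, here is a hedged sketch of the vocab-padding pruning this PR performs on export (the helper `prune_vocab_padding` is hypothetical, not the actual exporter code):

```python
# NeMo pads the embedding table up to a size convenient for tensor
# parallelism; on export to HF the padding rows are dropped so the
# matrix matches the tokenizer's true vocab size.

def prune_vocab_padding(weight_rows, vocab_size):
    """Keep only the first `vocab_size` rows of a (possibly padded) matrix."""
    assert vocab_size <= len(weight_rows)
    return weight_rows[:vocab_size]

# A 32000-token vocab whose embedding was padded to 32128 rows:
padded = [[0.0] * 4 for _ in range(32128)]
pruned = prune_vocab_padding(padded, 32000)
assert len(pruned) == 32000
```

The same pruning is applied to both the input embedding and the output (LM head) layer, per point 3 of the description.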

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and re-add the label.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch 3 times, most recently from 7f05199 to 3ea99d4 Compare October 29, 2024 11:14
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch 4 times, most recently from 5a50005 to 0ce22da Compare October 29, 2024 15:56
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch from 77a1e61 to 0ce22da Compare October 29, 2024 16:01
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch from 86f9067 to 3a1a0bb Compare October 29, 2024 16:15
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch from 212522f to bb08dbe Compare October 29, 2024 16:17
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch from e19759e to eb9e86a Compare October 29, 2024 16:19
@akoumpa akoumpa removed the Run CICD label Oct 29, 2024
@akoumpa akoumpa changed the title Akoumparouli/nemo ux fix export path Fix nemo-ux export path Oct 29, 2024
@akoumpa akoumpa added the r2.0.0 label Oct 29, 2024
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch 3 times, most recently from 678501a to e8d045c Compare October 30, 2024 02:14
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
@akoumpa akoumpa force-pushed the akoumparouli/nemo_ux_fix_export_path branch from 582bc4a to 227a693 Compare October 30, 2024 02:15
@akoumpa akoumpa added Run CICD and removed Run CICD labels Oct 30, 2024
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
@akoumpa akoumpa added Run CICD and removed Run CICD labels Oct 30, 2024
@akoumpa akoumpa changed the title Fix nemo-ux export path NeMo-UX: fix nemo-ux export path Oct 30, 2024
@akoumpa akoumpa marked this pull request as ready for review October 30, 2024 07:40
@akoumpa akoumpa enabled auto-merge (squash) October 30, 2024 09:36
@github-actions
Contributor

[🤖]: Hi @akoumpa 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully

So it might be time to merge this PR or get some approvals

I'm just a bot, so I'll leave it to you what to do next.

//cc @pablo-garay @ko3n1g

@akoumpa akoumpa merged commit b543225 into main Oct 30, 2024
@akoumpa akoumpa deleted the akoumparouli/nemo_ux_fix_export_path branch October 30, 2024 12:55
akoumpa added a commit that referenced this pull request Oct 30, 2024
* only make optim config if model has optim and setup_optimizers is True

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pass setup_optimizers=False in nemo_load

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix backwards compatibility in load_context

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add torch_dtype_from_mcore_config

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix hf model dtype & prune embedding size

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: mistral

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: mixtral

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: nemotron

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: qwen2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: startcoder

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate changes: startcoder2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* propagate chatglm

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove commented code

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm rename

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm rename

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
akoumpa added a commit that referenced this pull request Oct 30, 2024
* Add lazy init for export (#10613)

* Add lazy init for export

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo-UX: fix nemo-ux export path (#11081)

(commit message identical to the #11081 squash message quoted above)

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 5, 2024
lilyw97 pushed a commit to lilyw97/NeMo that referenced this pull request Nov 13, 2024
HuiyingLi pushed a commit to HuiyingLi/NeMo that referenced this pull request Nov 15, 2024
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NeMo2.0 nemorun llm export ValueError: PyTorch DDP is not enabled for mcore optimizer

2 participants