LoRA ckpt in HF format for NeMo AutoModel #11712
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
@@ -117,7 +117,15 @@ def ckpt_to_dir(filepath: Union[str, Path]) -> Path:
def create_checkpoint_io(wrapping_ckpt_io=None, **kwargs):
Can you add a test that creates a checkpoint with NeMo and restores it in Hugging Face? We now have tests for LLM and VLM.
Also, checkpoint saving is currently disabled in the tests. Can you turn it on (a minor change to the test command)?
Enabled checkpoint saving in those tests. I'll need to address the restore in a separate PR using AutoResume, right after this PR.
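A restore test along the lines requested above could begin with a cheap layout check before the directory is handed to Hugging Face. The sketch below is not a test from this PR. It assumes the saved adapter follows the PEFT file naming (`adapter_config.json` and `adapter_model.safetensors`), and the helper name `check_adapter_dir` is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumed PEFT-style file names; the layout NeMo actually writes may differ.
EXPECTED_FILES = ("adapter_config.json", "adapter_model.safetensors")


def check_adapter_dir(ckpt_dir):
    """Return a list of problems; an empty list means the layout looks loadable."""
    ckpt_dir = Path(ckpt_dir)
    problems = [
        f"missing {name}" for name in EXPECTED_FILES if not (ckpt_dir / name).is_file()
    ]
    config_file = ckpt_dir / "adapter_config.json"
    if config_file.is_file():
        config = json.loads(config_file.read_text())
        if config.get("peft_type") != "LORA":
            problems.append("peft_type is not LORA")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    # An empty directory should report both files as missing.
    empty_result = check_adapter_dir(tmp)

    # Placeholder files stand in for a real checkpoint; the weights file here
    # is empty and is NOT a valid safetensors file.
    (Path(tmp) / "adapter_config.json").write_text(json.dumps({"peft_type": "LORA"}))
    (Path(tmp) / "adapter_model.safetensors").write_bytes(b"")
    filled_result = check_adapter_dir(tmp)
```

A full round trip would go further and load the adapter on top of the base model, which needs the model weights and is outside what this sketch covers.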
Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
…/NeMo into onur/auto-model-peft-ckpt
Updating peft test name Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
changing the hf vlm test name Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
[🤖]: Hi @oyilmaz-nvidia 👋, We wanted to let you know that a CICD pipeline for this PR just finished successfully. So it might be time to merge this PR or get some approvals. I'm just a bot, so I'll leave it to you what to do next. //cc @pablo-garay @ko3n1g
RUNNER: self-hosted-azure
SCRIPT: |
  TRANSFORMERS_OFFLINE=1 python tests/collections/llm/hf/peft.py --model /home/TestData/nlp/hf_gemma/hf_gemma_2b --max-steps 10 --devices 2 --strategy ddp --disable-ckpt
  TRANSFORMERS_OFFLINE=1 python tests/collections/llm/hf/peft_hf.py --model /home/TestData/nlp/hf_gemma/hf_gemma_2b --max-steps 10 --devices 2 --strategy ddp --disable-ckpt
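The test commands above pass `--disable-ckpt`, so turning checkpoint saving on would amount to dropping that flag. The sketch below mirrors the flags seen in the command with `argparse`. It assumes `--disable-ckpt` is a plain boolean switch. The parser, the defaults, and the `build_parser` name are hypothetical and are not taken from the test script itself.

```python
import argparse


def build_parser():
    # Hypothetical mirror of the flags visible in the CI command above.
    parser = argparse.ArgumentParser(description="Sketch of the PEFT test CLI")
    parser.add_argument("--model", required=True)
    parser.add_argument("--max-steps", type=int, default=10)
    parser.add_argument("--devices", type=int, default=1)
    parser.add_argument("--strategy", default="ddp")
    # Assumption: a store_true switch, so omitting it leaves saving enabled.
    parser.add_argument("--disable-ckpt", action="store_true")
    return parser


ci_args = (
    "--model /home/TestData/nlp/hf_gemma/hf_gemma_2b "
    "--max-steps 10 --devices 2 --strategy ddp --disable-ckpt"
).split()

with_flag = build_parser().parse_args(ci_args)
without_flag = build_parser().parse_args(ci_args[:-1])  # same command, flag dropped

# Under the assumption above, saving is on exactly when the flag is absent.
save_checkpoints = not without_flag.disable_ckpt
```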
* Save lora ckpt in safetensor and a config
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* remove hf variable from peft
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* vllm with automodel peft working
* Apply isort and black reformatting
  Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
* revert changes
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* update examples
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
* removed unused import
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* enable ckpt saving
  Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
* remove unused import
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
* fix minor bug
  Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
---------
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Signed-off-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

The same commit message appears in two further commits, each adding one trailer:
Signed-off-by: Abhinav Garg <abhgarg@nvidia.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
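What does this PR do?
Adds support for saving LoRA checkpoints in Hugging Face (HF) format for NeMo AutoModel. Per the commit message, the adapter is saved as a safetensors file plus a config.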
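To make the description concrete, here is a minimal sketch of what an HF-style LoRA save involves: keeping only the adapter parameters and writing a config next to them. It is not the code from this PR. It assumes PEFT-style naming (`lora_A` / `lora_B` in parameter names, an `adapter_config.json` file), the helper names are hypothetical, and plain lists stand in for tensors so the example runs without torch.

```python
import json
import tempfile
from pathlib import Path


def split_lora_state(state_dict, marker="lora_"):
    """Keep only the entries whose parameter name contains `marker`."""
    return {name: value for name, value in state_dict.items() if marker in name}


def write_adapter_config(out_dir, base_model, rank, alpha, target_modules):
    """Write a PEFT-style adapter_config.json and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    config = {
        "peft_type": "LORA",
        "base_model_name_or_path": base_model,
        "r": rank,
        "lora_alpha": alpha,
        "target_modules": sorted(target_modules),
    }
    path = out_dir / "adapter_config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


# Toy state dict; the parameter names are illustrative, not NeMo's.
toy_state = {
    "model.layers.0.self_attn.q_proj.weight": [0.0],
    "model.layers.0.self_attn.q_proj.lora_A.weight": [0.1],
    "model.layers.0.self_attn.q_proj.lora_B.weight": [0.2],
}
adapter_state = split_lora_state(toy_state)

with tempfile.TemporaryDirectory() as tmp:
    # "hf_gemma_2b" and the rank/alpha values are placeholders for illustration.
    config_path = write_adapter_config(
        tmp, "hf_gemma_2b", rank=8, alpha=16, target_modules={"q_proj"}
    )
    saved_config = json.loads(config_path.read_text())
    # With real tensors, adapter_state would be written alongside the config,
    # e.g. via safetensors.torch.save_file(adapter_state, ".../adapter_model.safetensors").
```

Saving only the adapter entries keeps the checkpoint small, since the frozen base weights can be reloaded from the original model.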