Whisper model support in Lite #11464

Merged
oyilmaz-nvidia merged 25 commits into main from onur/whisper on Dec 24, 2024

@oyilmaz-nvidia
Collaborator

What does this PR do?

Adds AutoModelSeq2Seq HF support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
oyilmaz-nvidia and others added 3 commits December 3, 2024 18:59
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
@oyilmaz-nvidia oyilmaz-nvidia changed the title from "Onur/whisper" to "Whisper model support in Lite" on Dec 3, 2024
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
@pzelasko
Collaborator

I made some changes to enable the training to converge (commit c605d12):

  • Preserve the HF tokenizer's Whisper special tokens (prompt)
  • Change the precision to bf16 (fp16 results in NaN after the first weight update)
  • Fix decoder_input_ids and labels for the training step
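
The last item concerns how decoder inputs and targets are aligned in a seq2seq training step. The commit itself is not shown in this conversation, so the following is only a minimal sketch of the usual teacher-forcing convention, not the actual change: the decoder reads the target sequence without its last position and is scored against the same sequence without its first position, with padding excluded from the loss. The helper name `make_decoder_batch`, the token ids, and the pad id are hypothetical; -100 is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def make_decoder_batch(
    token_ids: List[List[int]], pad_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Build (decoder_input_ids, labels) from padded target sequences.

    Each row is assumed to hold: prompt tokens, text tokens, end token, padding.
    """
    decoder_input_ids, labels = [], []
    for row in token_ids:
        # Decoder input: every position except the last one.
        decoder_input_ids.append(row[:-1])
        # Labels: every position except the first one, so that input
        # position t is trained to predict token t + 1. Padding positions
        # are replaced by IGNORE_INDEX so they do not contribute to the loss.
        labels.append([IGNORE_INDEX if tok == pad_id else tok for tok in row[1:]])
    return decoder_input_ids, labels


# Hypothetical ids: 1 = prompt/start token, 2 = end token, 0 = padding.
batch = [[1, 10, 11, 2], [1, 12, 2, 0]]
inputs, targets = make_decoder_batch(batch, pad_id=0)
print(inputs)
print(targets)
```

Masking by value assumes that the pad id is not also a real token; if a tokenizer reuses its end token for padding, masking by sequence length would be the safer choice.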

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@github-actions github-actions bot removed the ASR label Dec 12, 2024

from typing_extensions import Annotated

import nemo.lightning as nl

Check notice (Code scanning / CodeQL): Module is imported with 'import' and 'import from'

Module 'nemo.lightning' is imported with both 'import' and 'import from'.
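
The notice refers to a file that binds the same module through both import forms. Only the `import nemo.lightning as nl` line is visible in this conversation, so the sketch below uses the standard-library `json` module as a stand-in to show the flagged pattern and one possible way to settle on a single form. It is not the fix applied in this PR, and the function name `to_text` is hypothetical.

```python
# Flagged pattern (kept as comments so this block itself uses one style):
#   import json
#   from json import dumps
#
# One possible fix: keep the plain import and use qualified names everywhere.
import json


def to_text(payload: dict) -> str:
    # Qualified access makes it explicit that dumps comes from json.
    return json.dumps(payload, sort_keys=True)


print(to_text({"model": "whisper", "precision": "bf16"}))
```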
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
oyilmaz-nvidia and others added 3 commits December 16, 2024 12:25
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
oyilmaz-nvidia and others added 2 commits December 16, 2024 16:16
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@oyilmaz-nvidia oyilmaz-nvidia marked this pull request as ready for review December 16, 2024 21:27
Collaborator

@titu1994 titu1994 left a comment

Looks good!

@github-actions
Contributor

[🤖]: Hi @oyilmaz-nvidia 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

I'm just a bot, so I'll leave it up to you what to do next.

//cc @pablo-garay @ko3n1g

@oyilmaz-nvidia oyilmaz-nvidia merged commit 6224655 into main Dec 24, 2024
@oyilmaz-nvidia oyilmaz-nvidia deleted the onur/whisper branch December 24, 2024 05:28
malay-nagda pushed a commit that referenced this pull request Dec 31, 2024
* Data loader

* another dataset

* preprocessed audio dataset

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* seq2seq support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* remove any update

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* fixing validation errors

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Modify training step and tokenizer to achieve correct Whisper training

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* Moved files into speechlm collection

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* revert changes

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* create recipes folder

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* generalize forward

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* example update

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* address codeql reviews

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* remove examples

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
abhinavg4 pushed a commit that referenced this pull request Jan 30, 2025
Signed-off-by: Abhinav Garg <abhgarg@nvidia.com>
youngeunkwon0405 pushed a commit to youngeunkwon0405/NeMo that referenced this pull request Feb 10, 2025
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>

4 participants