@mnoukhov (Contributor) commented Jun 1, 2023

For supervised fine-tuning:

  • Removed the tokenizer hacks, which are no longer necessary with the updated LlamaTokenizer
  • Ran black + isort

For reward modeling:

  • Added tokenizer_name so the tokenizer can be specified separately from the model
  • Removed the old Llama tokenizer hacks here as well
  • Added an eval_first_step option that runs an eval loop after the first training step, which makes for nicer graphs
  • Ran black + isort
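
A minimal sketch of how the two new reward modeling options could behave. It is not the code from this PR. The ScriptArguments container, the eval_steps field, the default values, and the helpers resolve_tokenizer_name and should_evaluate are hypothetical names used only for illustration; only tokenizer_name and eval_first_step come from the description above. The fallback to the model name when tokenizer_name is unset is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScriptArguments:
    # Hypothetical argument container; only tokenizer_name and
    # eval_first_step are named in the PR description.
    model_name: str = "gpt2"
    tokenizer_name: Optional[str] = None  # assumed to fall back to model_name
    eval_first_step: bool = False
    eval_steps: int = 500  # assumed regular evaluation interval


def resolve_tokenizer_name(args: ScriptArguments) -> str:
    """Return the tokenizer to load, defaulting to the model name."""
    if args.tokenizer_name is not None:
        return args.tokenizer_name
    return args.model_name


def should_evaluate(step: int, args: ScriptArguments) -> bool:
    """Decide whether to run evaluation after the given 1-based training step."""
    # Extra evaluation right after the first step, so the eval curve
    # starts near the beginning of training.
    if args.eval_first_step and step == 1:
        return True
    # Regular periodic evaluation.
    return step > 0 and step % args.eval_steps == 0
```

With eval_first_step enabled, the eval curve gets a data point at step 1 instead of starting at the first multiple of eval_steps.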

mnoukhov added 2 commits June 1, 2023 14:49
tokenizer can be separately specified from model
removed old llama tokenizer hacks
evaluate after first step option to make nicer graphs
black + isort
@mnoukhov mnoukhov mentioned this pull request Jun 1, 2023
@HuggingFaceDocBuilderDev commented Jun 2, 2023

The documentation is not available anymore as the PR was closed or merged.

@younesbelkada (Contributor) left a comment

Thanks for your hard work, @mnoukhov!
I have the same comments as here: #398 (review)
If you run the styling checks we should be all good!

@mnoukhov (Contributor, Author) commented Jun 4, 2023

Ran the checks and added the configs to my workspace config for the future :)

@younesbelkada (Contributor) left a comment

Thanks a lot!

@younesbelkada younesbelkada requested a review from lvwerra June 5, 2023 08:26
@lvwerra (Member) left a comment

Thanks a lot for updating!

@younesbelkada younesbelkada merged commit ef57cdd into huggingface:main Jun 6, 2023
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025
…ingface#399)

* better reward modelling

tokenizer can be separately specified from model
removed old llama tokenizer hacks
evaluate after first step option to make nicer graphs
black + isort

* removed tokenizer hacks from supervised ft

* black and flake8

4 participants