StackLlama: fixed RL training and added args #400
Conversation
- added steps argument and break to respect max training epochs (see the sketch below)
- added more PPOConfig args to script args
- removed llama tokenizer hacks
- removed extra args in dataset
- changed to LlamaTokenizer from AutoTokenizer
- black + isort
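For illustration, a minimal sketch of what these two changes amount to in a trl PPO script. This is not the exact PR diff: the argument names, defaults, and the derived `total_ppo_epochs` field are assumptions about the trl API of that era.

```python
# Sketch: expose PPOConfig fields as script args, and stop the PPO loop once
# the configured step budget is exhausted. Names and defaults are illustrative.
from dataclasses import dataclass, field
from typing import Optional

from transformers import HfArgumentParser
from trl import PPOConfig


@dataclass
class ScriptArguments:
    model_name: Optional[str] = field(default="", metadata={"help": "model to fine-tune"})
    steps: Optional[int] = field(default=20000, metadata={"help": "number of PPO steps"})
    learning_rate: Optional[float] = field(default=1.4e-5, metadata={"help": "learning rate"})
    batch_size: Optional[int] = field(default=32, metadata={"help": "PPO batch size"})
    mini_batch_size: Optional[int] = field(default=4, metadata={"help": "PPO mini-batch size"})
    ppo_epochs: Optional[int] = field(default=4, metadata={"help": "optimisation epochs per batch"})


parser = HfArgumentParser(ScriptArguments)
script_args = parser.parse_args_into_dataclasses()[0]

config = PPOConfig(
    model_name=script_args.model_name,
    steps=script_args.steps,
    learning_rate=script_args.learning_rate,
    batch_size=script_args.batch_size,
    mini_batch_size=script_args.mini_batch_size,
    ppo_epochs=script_args.ppo_epochs,
)

# ... build the PPOTrainer as usual, then:
for epoch, batch in enumerate(ppo_trainer.dataloader):
    if epoch >= config.total_ppo_epochs:  # respect the max number of training epochs
        break
    # generation / reward scoring / ppo_trainer.step(...) goes here
```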
The documentation is not available anymore as the PR was closed or merged.
younesbelkada left a comment
Thank you so much for your contribution! Could you just run the styling checks? After that we should be good for merging:

`make style && make quality`
Fixed style and quality, and switched back to `AutoTokenizer`. Thanks for the tip @ArthurZucker!
younesbelkada left a comment
Thanks so much!
* fixed rl training args: added steps argument and break to respect max training epochs; added more PPOConfig args to script args; removed llama tokenizer hacks; removed extra args in dataset; changed to LlamaTokenizer from AutoTokenizer; black + isort
* black and flake8
* style, quality, and switch back to AutoTokenizer
- added steps argument and break to respect max training epochs
- added more PPOConfig args to script args
- removed llama tokenizer hacks
- black + isort
- ~~switched to `LlamaTokenizer` from `AutoTokenizer`~~ added `return_token_type_ids=False` to pipeline kwargs, because `LlamaTokenizerFast` will output `token_type_ids` (see "LLaMATokenizerFast works abnormally" transformers#23818 and "🚨🚨 🚨🚨 [Tokenizer] attemp to fix add_token issues 🚨🚨 🚨🚨" transformers#23909); `token_type_ids` cause an error in our reward model `pipeline`, namely `TypeError: LlamaForSequenceClassification.forward() got an unexpected keyword argument 'token_type_ids'`. A sketch of this workaround follows the list.
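A minimal sketch of that workaround, assuming a text-classification pipeline over the trained reward model; the model path and input texts are placeholders. The extra kwarg is forwarded to the tokenizer, which keeps `token_type_ids` out of the features the pipeline passes to `LlamaForSequenceClassification.forward()`.

```python
from transformers import AutoTokenizer, pipeline

# Hypothetical path for illustration; substitute your trained reward model.
reward_model_name = "path/to/llama-se-rm"
tokenizer = AutoTokenizer.from_pretrained(reward_model_name)

sentiment_pipe = pipeline("sentiment-analysis", model=reward_model_name, tokenizer=tokenizer)

sent_kwargs = {
    "return_all_scores": True,
    "function_to_apply": "none",
    "batch_size": 16,
    # Without this, LlamaTokenizerFast emits token_type_ids, which
    # LlamaForSequenceClassification.forward() rejects (transformers#23818).
    "return_token_type_ids": False,
}

texts = ["Question: ...\n\nAnswer: ..."]  # placeholder prompt+response pairs
rewards = [out[0]["score"] for out in sentiment_pipe(texts, **sent_kwargs)]
```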