@gante gante commented May 9, 2025

What does this PR do?

The main goal of this PR is to enable user-friendly generate parameterization. This also facilitates performance-related customization, which will be the focus of a follow-up PR.

After the deprecation cycle, new users typing transformers chat -h will be redirected to a (new) docs intro section on generation arguments, instead of seeing a wall of CLI arguments. For transformers power users, chat is now usable and can be parameterized without prior knowledge of the CLI. These changes were inspired by the idea that CLIs should be a conversation and that they shouldn't drown users in information.

More specifically, with this PR:

  1. We can accept almost any generate flag as a positional argument, present and future, as opposed to being limited to a set of hardcoded flags;
  2. We can pass a generation_config.json, so that power users can specify complex generate arguments that may be difficult to express in a CLI;
  3. User chat commands are clearly distinguished from potential chat entries: they now start with !;
  4. !status, a new command, can be used to print state-related information, such as the current generate flags;
  5. !set can now be used to set arbitrary generate flags;
  6. !reset was removed: it provided minimal benefit (relaunching the CLI with the previous command achieves the same result), but it required us to maintain and pass the input state around;
  7. Help is now printed if there is a typo in a user command (e.g. !stats -> not a valid command -> prints error and help);
  8. (non-chat specific) There is a new intro section on generate args in the docs, allowing a soft landing into the parameterization of the text generation universe.
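
To make points 3, 4, 5 and 7 concrete, here is a minimal sketch of how `!`-prefixed commands could be separated from chat entries, with the help text printed on a typo. It is not the code from this PR: `ChatState`, `HELP_TEXT` and `handle_user_input` are hypothetical names, and values set through `!set` are kept as plain strings for simplicity.

```python
from dataclasses import dataclass, field

# Hypothetical help text; only the commands named in this PR description are listed.
HELP_TEXT = "Commands: !status, !set key=value [key=value ...]"


@dataclass
class ChatState:
    """Hypothetical container for the session's current generate flags."""
    generate_flags: dict = field(default_factory=dict)


def handle_user_input(text, state):
    """Return a reply for a '!' command, or None if `text` is a regular chat entry."""
    if not text.startswith("!"):
        return None  # plain chat entry: the caller would forward it to the model

    parts = text[1:].split()
    command, args = (parts[0], parts[1:]) if parts else ("", [])

    if command == "status":
        return f"generate flags: {state.generate_flags}"

    if command == "set":
        malformed = [arg for arg in args if "=" not in arg]
        if not args or malformed:
            return f"Error: !set expects key=value pairs.\n{HELP_TEXT}"
        for arg in args:
            key, _, value = arg.partition("=")
            state.generate_flags[key] = value  # stored as a string in this sketch
        return f"updated generate flags: {state.generate_flags}"

    # Unknown command (e.g. a typo such as !stats): print an error and the help text.
    return f"Error: '{text}' is not a valid command.\n{HELP_TEXT}"
```

With this shape, `handle_user_input("!stats", state)` falls through to the final branch and returns the error followed by the help text, while any input that does not start with `!` is left for the model.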

Example usage:

transformers chat Qwen/Qwen2.5-0.5B-Instruct do_sample=False max_new_tokens=10
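
For illustration, the `key=value` form in the command above could be turned into typed keyword arguments roughly as follows. This is a sketch under stated assumptions rather than the PR's implementation: `parse_generate_flags` and `load_generate_kwargs` are hypothetical helpers, values are typed with `ast.literal_eval`, and the choice to let command-line flags override values read from a JSON config file is an assumption.

```python
import ast
import json
from pathlib import Path


def parse_generate_flags(args):
    """Turn ['do_sample=False', 'max_new_tokens=10'] into a typed dict (hypothetical helper)."""
    flags = {}
    for arg in args:
        key, sep, raw = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        try:
            # Interprets Python literals: False -> bool, 10 -> int, 0.7 -> float.
            flags[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            flags[key] = raw  # anything else is kept as a plain string
    return flags


def load_generate_kwargs(args, config_path=None):
    """Merge an optional JSON config file with CLI flags.

    Assumption: flags given on the command line take precedence over the file.
    """
    kwargs = {}
    if config_path is not None:
        kwargs.update(json.loads(Path(config_path).read_text()))
    kwargs.update(parse_generate_flags(args))
    return kwargs
```

For the example above, `parse_generate_flags(["do_sample=False", "max_new_tokens=10"])` returns `{"do_sample": False, "max_new_tokens": 10}`. Because no flag names are hardcoded in the parser, new generate arguments would pass through without changes to it.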

@github-actions github-actions bot marked this pull request as draft May 9, 2025 15:00
github-actions bot commented May 9, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@gante gante changed the title from "[chat] generate parameterization powered by generation config and UX-friendly changes" to "[chat] generate parameterization powered by generation config and UX-related changes" May 9, 2025
@gante gante marked this pull request as ready for review May 9, 2025 15:06
@gante gante changed the title from "[chat] generate parameterization powered by generation config and UX-related changes" to "[chat] generate parameterization powered by GenerationConfig and UX-related changes" May 9, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante requested a review from Rocketknight1 May 9, 2025 16:29
@gante gante requested a review from LysandreJik May 12, 2025 08:29
@LysandreJik LysandreJik left a comment

Very cool! Played around with it locally, really like it 👌

> [!TIP]
> You can also chat with a model directly from the command line.
> ```shell
> transformers chat --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct
> ```

boom

@gante gante merged commit 8efe3a9 into huggingface:main May 12, 2025
10 checks passed
@gante gante deleted the chat_generation_config branch May 12, 2025 13:04
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
…UX-related changes (huggingface#38047)

* accept arbitrary kwargs

* move user commands to a separate fn

* work with generation config files

* rm cmmt

* docs

* base generate flag doc section

* nits

* nits

* nits

* no <br>

* better basic args description