Skip {% generation %} and {% endgeneration %} template handling #3204

Merged
Narsil merged 4 commits into main from phi4-template
May 1, 2025

Conversation

@alvarobartt (Member) commented May 1, 2025

What does this PR do?

This PR is a tentative fix for the {% generation %} and {% endgeneration %} custom syntax that some chat templates introduce to return a mask of the assistant-generated tokens. That mask is only useful during training and should be ignored during inference. The issue affects models such as microsoft/Phi-4-reasoning-plus.
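As a rough illustration of what skipping these tags means, here is a minimal Python sketch. The helper name `strip_generation_tags` and the sample template are hypothetical and are not taken from this PR's diff; the sketch only shows the idea of dropping the two literal tags before the template is rendered for inference.

```python
# Hypothetical helper for illustration only; not the code changed in this PR.
def strip_generation_tags(template: str) -> str:
    """Drop the literal training-only tags so the template renders at inference."""
    for tag in ("{% generation %}", "{% endgeneration %}"):
        template = template.replace(tag, "")
    return template


# Made-up chat template that wraps assistant content in the custom tags.
template = (
    "{% for m in messages %}"
    "{% if m['role'] == 'assistant' %}"
    "{% generation %}{{ m['content'] }}{% endgeneration %}"
    "{% else %}{{ m['content'] }}{% endif %}"
    "{% endfor %}"
)

cleaned = strip_generation_tags(template)
print(cleaned)
```

The assistant branch keeps its `{{ m['content'] }}` expression; only the wrapper tags disappear.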

Thanks to @Rocketknight1 for the explanation!

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@Narsil

Custom syntax within the chat template for the Phi4 Reasoning models
e.g. https://huggingface.co/microsoft/Phi-4-reasoning-plus, which is
AFAIK not handled natively yet, so skipping for now
@alvarobartt alvarobartt marked this pull request as ready for review May 1, 2025 09:51
Narsil previously approved these changes May 1, 2025

@Narsil (Contributor) left a comment

LGTM

@Narsil Narsil merged commit 40dfce6 into main May 1, 2025
15 checks passed
@Narsil Narsil deleted the phi4-template branch May 1, 2025 10:13
@Rocketknight1 (Member) commented May 1, 2025

This will work fine in a lot of cases, but will fail to catch syntax like {%- generation %}! Because the tags can also include whitespace-stripping syntax like {%-, it can be dangerous to just delete them instead of adding a null Jinja handler for generation.
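To make the concern concrete, here is a hedged Python sketch. A literal replacement misses the `{%- generation %}` form, and deleting the tag outright would also discard its whitespace-control marker. One string-level approximation of a null handler is to rewrite each tag into a no-op `{% if true %}` / `{% endif %}` pair while keeping any `-` or `+` marker. The function names are hypothetical, and this is not the handler suggested above nor the follow-up fix; registering a real Jinja extension for `generation` would be the more faithful approach.

```python
import re

# Matches {% generation %} and {% endgeneration %}, including whitespace-control
# variants such as {%- generation %} or {% endgeneration -%}.
_GENERATION_TAG = re.compile(r"\{%([-+]?)\s*(end)?generation\s*([-+]?)%\}")


def naive_strip(template: str) -> str:
    """Literal replacement: only catches the exact spelling without markers."""
    return template.replace("{% generation %}", "").replace("{% endgeneration %}", "")


def neutralize_generation_tags(template: str) -> str:
    """Rewrite the custom tags as a no-op block, preserving whitespace control."""

    def _swap(match: re.Match) -> str:
        left, is_end, right = match.group(1), match.group(2), match.group(3)
        keyword = "endif" if is_end else "if true"
        return "{%" + left + " " + keyword + " " + right + "%}"

    return _GENERATION_TAG.sub(_swap, template)


snippet = "{%- generation %}{{ content }}{% endgeneration -%}"
print(naive_strip(snippet))                 # the tags are still present
print(neutralize_generation_tags(snippet))  # tags become an always-true if block
```

Because the `-` markers survive the rewrite, the rendered whitespace should match what a template engine with native `generation` support would produce, assuming the engine treats `{% if true %}` as an always-taken branch.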

@alvarobartt (Member, Author)

> This will work fine in a lot of cases, but will fail to catch syntax like {%- generation %}! Because the tags can also include whitespace-stripping syntax like {%-, it can be dangerous to just delete them instead of adding a null Jinja handler for generation.

Thanks! I'll make sure to update it to handle that properly rather than a raw replacement 👍


3 participants