Contributor

@gante gante commented Mar 12, 2025

What does this PR do?

We dropped support for the experimental end-to-end compilation of `generate` a while back, but a few traces of the related code remained. This PR removes them.

@github-actions github-actions bot marked this pull request as draft March 12, 2025 18:36
@github-actions
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@gante gante marked this pull request as ready for review March 12, 2025 18:36
@gante gante requested review from zucchini-nlp and removed request for ArthurZucker March 12, 2025 18:56
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@zucchini-nlp zucchini-nlp left a comment

Thanks, much cleaner! Not sure if we have any overridden code left in other models, so I left a comment below.

    inputs_embeds = inputs_embeds[:, -cache_position.shape[0] :]
elif (
    inputs_embeds is not None  # Exception 1
    or (is_torchdynamo_compiling() or cache_position[-1] >= input_ids.shape[1])  # Exception 3
Member

I remember this code being overridden in some models. If that's still the case, we'll need to clean up there as well.

Contributor Author

Good point, I'll do a scan and replace the pattern wherever else it appears!
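
For reference, a scan like the one mentioned above could be sketched as follows. This is only an illustration, not the script actually used for this PR: the regular expression is an assumption derived from the snippet under review, and `find_overrides` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Assumed pattern: the compile-time check from the reviewed snippet,
# i.e. `is_torchdynamo_compiling() or cache_position[-1]`.
PATTERN = re.compile(r"is_torchdynamo_compiling\(\)\s+or\s+cache_position\[-1\]")


def find_overrides(root):
    """Return (path, line number, stripped line) for each match under `root`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_overrides` on the directory that holds the model implementations would list each file and line where a model still carries its own copy of the check, so those copies can be cleaned up one by one.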

@gante gante merged commit 0fe0bae into huggingface:main Mar 19, 2025
23 checks passed
@gante gante deleted the rm_end_to_end_compilation branch March 19, 2025 11:28
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
vfdev-5 commented Sep 14, 2025

Please also update the docs accordingly: https://huggingface.co/docs/transformers/llm_optims?static-kv=3.+compile+entire+generate+function#static-kv-cache-and-torchcompile
They suggest that we can compile model.generate, but doing so fails at execution with the latest stable release.
Thanks!

Contributor Author

gante commented Sep 15, 2025

@vfdev-5 thank you for noticing it and reporting it back to us 🙈 I'll update the docs.
