
Conversation

Contributor

@guangy10 guangy10 commented Apr 15, 2025

What does this PR do?

Enable dynamism at the seq_len dim in order to utilize parallel prefill in the ExecuTorch runtime. In this PR,

  • allow the caller side to override the example inputs, dynamic shapes, and strict flag, while keeping the defaults unchanged for BC (see the sketch after this list)
  • add a unit test covering export with dynamic shapes and strict=False, since non-strict is the mainstream mode in the latest versions of torch.export
  • make the unit tests non-slow, so they aren't skipped on PRs and can't let regressions slip through
  • add a test for HybridCache
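
For illustration, a minimal sketch of how a caller might use the new overrides. This is not the exact code from the PR: the import path of convert_and_export_with_cache and the tiny placeholder checkpoint are assumptions; the example_input_ids, dynamic_shapes, and strict arguments follow the snippets quoted later in this thread.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Assumed import path for the helper touched by this PR.
from transformers.integrations.executorch import convert_and_export_with_cache

tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-gpt2")

# Non-specialized example inputs: seq_len > 1 so the prefill shape is representative.
input_ids = tokenizer("Here's everything I know", return_tensors="pt").input_ids

# Mark dim 1 (seq_len) of input_ids as dynamic; cache_position stays static.
dynamic_shapes = {"input_ids": {1: torch.export.Dim.AUTO}, "cache_position": None}

exported_program = convert_and_export_with_cache(
    model,
    example_input_ids=input_ids,
    dynamic_shapes=dynamic_shapes,
    strict=False,  # non-strict export, as exercised by the new unit tests
)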

Tests
pytest tests/utils/test_cache_utils.py -vv -s -k cache_exportability

collected 23 items / 20 deselected / 3 selected

tests/utils/test_cache_utils.py::CacheExportIntegrationTest::test_dynamic_cache_exportability PASSED                                                             [ 33%]
tests/utils/test_cache_utils.py::CacheExportIntegrationTest::test_hybrid_cache_exportability PASSED                                                              [ 66%]
tests/utils/test_cache_utils.py::CacheExportIntegrationTest::test_static_cache_exportability PASSED                                                              [100%]

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker @ydshieh

CC: @tugsbayasgalan

@github-actions github-actions bot marked this pull request as draft April 15, 2025 00:49
@github-actions
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@guangy10 guangy10 marked this pull request as ready for review April 15, 2025 00:57
@github-actions github-actions bot requested review from MekkCyber and SunMarc April 15, 2025 00:57
dynamic_shapes = (
    dynamic_shapes
    if dynamic_shapes is not None
    else {"input_ids": {1: torch.export.Dim.AUTO}, "cache_position": None}
)
Collaborator

Could you explain a bit what a value like torch.export.Dim.AUTO means, and also the one in this guide:

# Create a dynamic batch size
batch = Dim("batch")
# Specify that the first dimension of each input is that batch size
dynamic_shapes = {"x1": {0: batch}, "x2": {0: batch}}

IIRC, the key 0 or 1 means which dimension, but I don't know what torch.export.Dim.AUTO or Dim("batch") do.

Contributor Author

key 0 or 1 means which dimension

Correct.

torch.export.Dim.AUTO or Dim("batch")

It's a new feature introduced in 2.6.0, and it's smarter than the traditional way of specifying a dynamic range, e.g. Dim("seq_len", min, max). @tugsbayasgalan can explain more. @ydshieh if you think Dim("seq_len", min, max) is easier to understand, we can switch to explicit dynamism.
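
For reference, a hedged sketch contrasting the two styles; the min/max bounds below are arbitrary placeholders, not values used in this PR.

import torch
from torch.export import Dim

# Explicit dynamism: seq_len may range over [1, 128]; export fails if the model violates the bounds.
seq_len = Dim("seq_len", min=1, max=128)
explicit_shapes = {"input_ids": {1: seq_len}, "cache_position": None}

# Automatic dynamism (torch >= 2.6): the exporter infers the range from the constraints it
# encounters while tracing, and may specialize the dim to a static size if the model forces it.
auto_shapes = {"input_ids": {1: torch.export.Dim.AUTO}, "cache_position": None}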

Collaborator

@ydshieh ydshieh Apr 15, 2025

If you can find a doc that explains what torch.export.Dim.AUTO does and why it's smarter, adding the link along with a short comment in the source code here would be nice.

When I look at the doc or source, I can't find any info.

But if you can't find anything either, it's fine. In that case, maybe we can report this situation to the pytorch team.

Contributor Author

I will leave it for @tugsbayasgalan to chime in

Contributor

Dim.AUTO works by refining the range as the exporter encounters shape-related constraints (possibly specializing to a static integer). In that sense, Dim.AUTO is less likely to run into shape-related errors and doesn't require user code changes, because it just respects the intention of the user code. If you really want to keep seq_len dynamic, you should use Dim.DYNAMIC, which will error out when we specialize. For official docs, cc @pianpwk
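
To make the difference concrete, a hedged toy example (the module and shapes below are invented for illustration and are unrelated to this PR):

import torch
from torch.export import Dim, export

class Toy(torch.nn.Module):
    def forward(self, x):
        # Adding a fixed-shape (2, 4) tensor constrains x to that exact shape,
        # so the exporter has to specialize dim 1 to 4.
        return x + torch.ones(2, 4)

x = torch.randn(2, 4)

# Dim.AUTO: the exporter quietly specializes dim 1 to 4 instead of erroring.
ep_auto = export(Toy(), (x,), dynamic_shapes={"x": {1: Dim.AUTO}})

# Dim.DYNAMIC: the same export raises, because the user asked for a dynamic dim
# but the model constrains it to a single value.
# ep_dyn = export(Toy(), (x,), dynamic_shapes={"x": {1: Dim.DYNAMIC}})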

Member

@SunMarc SunMarc left a comment

Thanks! Left a few comments.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh
Collaborator

ydshieh commented Apr 15, 2025

Other than the nit comments I left, just a question: is this dynamism necessary if the exported model is meant to be used for prompts (i.e. the prefill stage) as well as for the generation steps (where seq=1)?
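
For context, a rough sketch of the two call patterns a single exported program would need to serve. The token ids are placeholders, and calling the exported_program produced by convert_and_export_with_cache through .module() with these kwargs is an assumption based on the standard torch.export API, not code from this PR:

import torch

# Prefill: the whole prompt in one forward pass (seq_len = number of prompt tokens).
prompt_ids = torch.tensor([[101, 2023, 2003, 1037, 3231]])  # shape (1, 5), placeholder ids
prefill_out = exported_program.module()(
    input_ids=prompt_ids,
    cache_position=torch.arange(prompt_ids.shape[1]),
)

# Decode: one token at a time (seq_len = 1), reusing the same exported program.
next_id = torch.tensor([[7592]])  # shape (1, 1), placeholder id
decode_out = exported_program.module()(
    input_ids=next_id,
    cache_position=torch.tensor([prompt_ids.shape[1]]),
)

Without a dynamic seq_len dim, the program would be specialized to one of these shapes and could not serve the other.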

@guangy10 guangy10 force-pushed the export_w_dynamism branch 2 times, most recently from 10e7489 to 772606e Compare April 15, 2025 20:58
Comment on lines 286 to 293
input_ids = tokenizer("Here's everything I know", return_tensors="pt").input_ids
dynamic_shapes = {"input_ids": {1: torch.export.Dim.AUTO}, "cache_position": None}
exported_program = convert_and_export_with_cache(
    model, example_input_ids=input_ids, dynamic_shapes=dynamic_shapes
)
Contributor Author

I'm exporting this model with non-specialized input_ids and a dynamic shape on the "seq_len" dim. @tugsbayasgalan Is there a way I can assert that the exported program does have the dim set to dynamic, and inspect what the dynamic range is?
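
One possible way to check this: a sketch based on public ExportedProgram attributes (range_constraints and placeholder metadata), not necessarily the recommended approach; the placeholder node name "input_ids" is assumed to survive export unchanged.

import torch

# range_constraints maps each symbolic dim (e.g. s0) to its inferred value range.
print(exported_program.range_constraints)

# The placeholder nodes carry fake tensors with symbolic shapes, so we can assert
# that dim 1 of input_ids is a SymInt rather than a concrete integer.
for node in exported_program.graph.nodes:
    if node.op == "placeholder" and node.name == "input_ids":
        seq_dim = node.meta["val"].shape[1]
        assert isinstance(seq_dim, torch.SymInt), f"seq_len was specialized to {seq_dim}"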

@guangy10 guangy10 force-pushed the export_w_dynamism branch from 772606e to 1451ea8 Compare April 16, 2025 00:57
@guangy10 guangy10 changed the title Add option to specify dynamic shapes during export Allow override dynamic shapes and strict in export recipe Apr 16, 2025
@guangy10 guangy10 changed the title Allow override dynamic shapes and strict in export recipe Allow override inputs to export recipe Apr 16, 2025
Comment on lines 228 to 231
if dynamic_shapes is not None:
    logging.warning("Dynamic shapes spec will be ignored for < 2.6.0.")
if strict is not None:
    logging.warning("strict flag spec will be ignored for < 2.6.0.")
Collaborator

Hmm, first, I think we should have

if is_torch_greater_or_equal("2.6.0"):
    ...
elif is_torch_greater_or_equal("2.5.0"):
    ...
else:
    ...
right ...?

If so, the 2 new messages have to be in both the elif and else branches.

Also, let's rephrase it as

Dynamic shapes spec will be ignored by convert_and_export_with_cache for < 2.6.0.

Contributor Author

@guangy10 guangy10 Apr 16, 2025

@ydshieh That could be one option. Alternatively, we can simplify it by just not using dynamic shapes for torch < 2.6, including 2.5, which keeps BC.
Additionally, although torch 2.5 supports dynamic shapes, it has no Dim.AUTO. As Tugsuu mentioned above, without Dim.AUTO it is more likely to run into shape-related errors. Hence I'd rather keep things BC by not using dynamic shapes for torch < 2.6 (a sketch of that guard follows below). WDYT?
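
A minimal sketch of the guard being proposed, assuming the is_torch_greater_or_equal helper quoted earlier in this thread; the variable names (example_input_ids, example_cache_position) and the exact export call are illustrative, not the code in the PR:

import logging
import torch

if is_torch_greater_or_equal("2.6.0"):
    exported_program = torch.export.export(
        model,
        args=(example_input_ids, example_cache_position),
        dynamic_shapes=dynamic_shapes,
        strict=strict if strict is not None else True,
    )
else:
    # BC path for torch < 2.6: the new overrides are ignored and the existing export path is kept.
    if dynamic_shapes is not None:
        logging.warning("Dynamic shapes spec will be ignored by convert_and_export_with_cache for < 2.6.0.")
    if strict is not None:
        logging.warning("strict flag spec will be ignored by convert_and_export_with_cache for < 2.6.0.")
    ...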

Collaborator

OK :-)

@ydshieh
Collaborator

ydshieh commented Apr 16, 2025

Just an if/else vs. if/elif/else question and 2 nits, but overall LGTM.

Comment on lines 230 to 231
if strict is not None:
    logging.warning("strict flag spec will be ignored for < 2.6.0.")
Member

Let's default strict to True. Does it require 2.6 to set strict to False?

Contributor Author

I don't have a strong opinion on defaulting to True or False, but I think the export team wants to promote a default of False in newer versions of torch. cc: @tugsbayasgalan

@guangy10
Contributor Author

Just an if/else vs. if/elif/else question and 2 nits, but overall LGTM.

Shared my thoughts here: #37508 (comment), just in case you missed it. Let me know your thoughts.

@guangy10 guangy10 force-pushed the export_w_dynamism branch from 1451ea8 to 174c382 Compare April 16, 2025 18:11
@guangy10
Contributor Author

Any more comments on this PR? Can we merge it if not?

@guangy10
Contributor Author

I will rebase this PR since #37728 has been merged

@guangy10 guangy10 force-pushed the export_w_dynamism branch 2 times, most recently from 5390845 to c2bd31f Compare April 29, 2025 03:32
Collaborator

@ydshieh ydshieh left a comment

Thank you for the iterations. LGTM, the touched test is passing and fast to run 👍

I will wait until the end of the day before merge to see if @SunMarc has any comment.

Member

@SunMarc SunMarc left a comment

LGTM! Please fix the conflicts and we are good to merge.

@guangy10 guangy10 force-pushed the export_w_dynamism branch from bee41b7 to cbf2207 Compare April 29, 2025 23:41
@guangy10
Contributor Author

LGTM! Please fix the conflicts and we are good to merge.

A couple of updates:

  • Added a non-slow test to cover export for HybridCache
  • Rebased onto the latest trunk

@SunMarc @ydshieh Let me know if there are new comments

@ydshieh ydshieh merged commit a572744 into huggingface:main Apr 30, 2025
13 checks passed
@ydshieh
Collaborator

ydshieh commented Apr 30, 2025

Thanks

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
Add option to specify dynamic shapes during export

Co-authored-by: Guang Yang <[email protected]>