
[llm] Add generate_from_pos API to LLM runner #11570


Status: Open. Wants to merge 2 commits into base: main.
Conversation

larryliu0820 (Contributor) commented:

As titled, this API allows us to support multi-turn conversations by passing a start_pos argument to generate_from_pos.

This pull request introduces a new feature to support text generation from a specific starting position (generate_from_pos) and includes updates to ensure proper error handling and functionality when max_new_tokens is negative. The changes primarily focus on extending the TextLLMRunner class and its associated methods to accommodate this new feature while maintaining backward compatibility.

New Feature: Text Generation from a Specific Starting Position

  • Added generate_from_pos Method: Introduced a new method generate_from_pos in TextLLMRunner to allow text generation starting from a specified position in the KV cache. This includes updates to the method signature, logic, and error handling. (extension/llm/runner/text_llm_runner.cpp [1] [2] [3] [4]; extension/llm/runner/text_llm_runner.h [5])

  • Updated Documentation: Enhanced method documentation in TextLLMRunner to describe the new functionality, including parameters like start_pos and the expected behavior. (extension/llm/runner/text_llm_runner.h [1] [2])

Error Handling Improvements

  • Validation for max_new_tokens: Added checks to ensure max_new_tokens is positive. If it is not, an InvalidArgument error is returned. This prevents invalid configurations during text generation. (extension/llm/runner/text_llm_runner.cpp, R129-R156)

  • Unit Test for Negative max_new_tokens: Created a new test case (GenerateFromPosErrorsWithNegativeMaxNewTokens) to verify that the generate_from_pos method correctly handles scenarios where max_new_tokens is negative. (extension/llm/runner/test/test_text_llm_runner.cpp, R325-R379)


pytorch-bot bot commented Jun 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11570

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 4644bb9 with merge base 72a095f:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 11, 2025
@larryliu0820 larryliu0820 added the release notes: llm To capture llm specific changes in release notes label Jun 11, 2025
@facebook-github-bot (Contributor) commented:

@larryliu0820 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

