Support multiple prompts in the runner #9817

Merged

facebook-github-bot merged 1 commit into pytorch:main from cccclai:export-D72276104

Apr 3, 2025

Contributor

cccclai commented Apr 1, 2025

Summary:
In preparation for multi-turn conversation, the runner now accepts multiple prompts and executes them in sequence. Example command:

./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte --tokenizer_path tiktokenizer.bin --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big," --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1

It is hard to pick a delimiter character that could not also appear inside a prompt, so each prompt is marked explicitly with its own `--prompt` flag and the flagged values are collected together.
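The idea of collecting a repeated flag can be sketched as follows. This is only an illustration in Python, not the runner's code: the runner is a compiled binary whose parsing logic is not shown on this page, and the helper name `collect_prompts` is hypothetical. The sketch relies on the standard library's `argparse` with `action="append"`, and it ignores every flag other than `--prompt`.

```python
import argparse


def collect_prompts(argv):
    """Return every value given via a repeated --prompt flag, in order.

    Hypothetical helper for illustration; not the runner's implementation.
    """
    parser = argparse.ArgumentParser(allow_abbrev=False)
    # action="append" gathers each occurrence of --prompt into one list.
    parser.add_argument("--prompt", action="append", default=None)
    # parse_known_args leaves unrelated flags (--model_path, ...) untouched.
    args, _unknown = parser.parse_known_args(argv)
    return args.prompt or []


if __name__ == "__main__":
    example = [
        "--model_path", "hybrid_llama_qnn.pte",
        "--prompt", "Once upon a time",
        "--prompt", "girl named Lily.",
        "--prompt", "her toys and her favorite toy was a big,",
        "--num_iters", "1",
    ]
    for index, prompt in enumerate(collect_prompts(example)):
        print(index, repr(prompt))
```

The third prompt in the example command ends with a comma, which shows why a single delimiter character is awkward: splitting one combined string on commas would cut that prompt in the wrong place, while a repeated flag keeps each prompt intact.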

Differential Revision: D72276104

pytorch-bot bot commented Apr 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9817

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 55678ad with merge base 753a88e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Apr 1, 2025

This pull request was exported from Phabricator. Differential Revision: D72276104

facebook-github-bot added the fb-exported label

cccclai requested review from chunit-quic, haowhsu-quic, shewu-quic and winskuo-quic

April 1, 2025 23:05

haowhsu-quic approved these changes

kirklandsign approved these changes

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

7d2dc2d

Summary:

As a preparation for the multiturn conversation, we can provide multiple prompts and execute them in sequence. Example command:
```
./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte    --tokenizer_path tiktokenizer.bin  --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big,"  --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1
```
It will be hard to use any char as delimiter, so we use `--prompt` to explicitly mark a prompt and collect them together.

Reviewed By: kirklandsign

Differential Revision: D72276104

cccclai force-pushed the export-D72276104 branch from 1e9a596 to 7d2dc2d Compare

April 2, 2025 18:49

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

24b6ca4

cccclai force-pushed the export-D72276104 branch from 7d2dc2d to 24b6ca4 Compare

April 2, 2025 18:51

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

e3ec4b5

cccclai force-pushed the export-D72276104 branch from 24b6ca4 to e3ec4b5 Compare

April 2, 2025 18:52

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

d86ecc8

cccclai force-pushed the export-D72276104 branch from e3ec4b5 to d86ecc8 Compare

April 2, 2025 18:56

cccclai added the topic: not user facing label

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

b59f256

cccclai force-pushed the export-D72276104 branch from d86ecc8 to b59f256 Compare

April 2, 2025 19:05

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

8a19c85

cccclai force-pushed the export-D72276104 branch from b59f256 to 8a19c85 Compare

April 2, 2025 19:10

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

28292d0

cccclai force-pushed the export-D72276104 branch from 8a19c85 to 28292d0 Compare

April 2, 2025 19:35

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

92ed635

cccclai force-pushed the export-D72276104 branch from 28292d0 to 92ed635 Compare

April 2, 2025 20:51

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

40a42c6

cccclai force-pushed the export-D72276104 branch from 92ed635 to 40a42c6 Compare

April 2, 2025 20:56

cccclai added a commit to cccclai/executorch-1 that referenced this pull request


          Support multiple prompts in the runner (pytorch#9817)

43bd572

cccclai force-pushed the export-D72276104 branch from 40a42c6 to 43bd572 Compare

April 2, 2025 20:58

Contributor

larryliu0820 commented Apr 2, 2025

Just curious - will QNN be able to give response to multiple prompts?


          Support multiple prompts in the runner (pytorch#9817)

55678ad

cccclai force-pushed the export-D72276104 branch from 43bd572 to 55678ad Compare

April 2, 2025 23:46

Contributor Author

cccclai commented Apr 2, 2025

> Just curious - will QNN be able to give response to multiple prompts?

Yeah. Currently each prompt is handled as a fresh prompt with its own inference, but we will enable multi-turn conversation, and the previous conversation will then be part of the context.
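The difference between the two behaviours described in this reply can be sketched as below. This is a minimal Python illustration under stated assumptions, not the runner's code: `generate` stands for any hypothetical text-in, text-out model call, `echo_length` is a stand-in model invented for the example, and plain string concatenation is used only to show what "part of the context" means. How the runner will actually carry context between turns is not described on this page.

```python
from typing import Callable, List


def run_independent(prompts: List[str], generate: Callable[[str], str]) -> List[str]:
    """Behaviour described as current: every prompt starts from a fresh context."""
    return [generate(prompt) for prompt in prompts]


def run_multi_turn(prompts: List[str], generate: Callable[[str], str]) -> List[str]:
    """Behaviour described as planned: earlier turns stay in the context."""
    context = ""
    replies = []
    for prompt in prompts:
        context += prompt          # the new turn is appended to the history
        reply = generate(context)  # the model sees the whole history so far
        replies.append(reply)
        context += reply           # the reply also becomes part of the history
    return replies


def echo_length(text: str) -> str:
    """Stand-in model (assumption): reports how many characters it was given."""
    return f"<{len(text)}>"


if __name__ == "__main__":
    turns = ["Once upon a time", "girl named Lily."]
    print(run_independent(turns, echo_length))
    print(run_multi_turn(turns, echo_length))
```

With the stand-in model, the second reply differs between the two functions because only the multi-turn version passes the first prompt and its reply along with the second prompt.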

facebook-github-bot merged commit fb1faaf into pytorch:main

88 of 90 checks passed

kirklandsign pushed a commit that referenced this pull request


          Support multiple prompts in the runner

111b7aa

Differential Revision: D72276104

Pull Request resolved: #9817

This was referenced Apr 14, 2025

Weekly pr metrics report - 2025-04-01..2025-04-07 wdvr/pytorch#28

Open

Weekly pr metrics report - 2025-04-01..2025-04-07 wdvr/pytorch#30

Open

github-actions bot mentioned this pull request

Weekly pr metrics report - 2025-04-01..2025-04-07 wdvr/pytorch#35

Open

Reviewers

kirklandsign: approved these changes

haowhsu-quic: approved these changes

chunit-quic: awaiting requested review

shewu-quic: awaiting requested review

winskuo-quic: awaiting requested review

Labels

CLA Signed, fb-exported, topic: not user facing