Support multiple prompts in the runner #9817

Merged
merged 1 commit into from
Apr 3, 2025

Conversation

Contributor

@cccclai cccclai commented Apr 1, 2025

Summary:
In preparation for multi-turn conversation, the runner can now take multiple prompts and execute them in sequence. Example command:

./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte    --tokenizer_path tiktokenizer.bin  --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big,"  --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1

It is hard to pick any character as a delimiter between prompts, so each prompt is marked explicitly with its own --prompt flag and the prompts are collected together.
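
The idea of repeating a flag instead of splitting on a delimiter can be illustrated with a minimal Python sketch. This is not the runner's code (the runner is a native binary and its flag handling is not shown in this thread); `parse_prompts` and the `runner_sketch` program name are hypothetical, and only `--prompt` is modeled, with every other flag ignored.

```python
import argparse


def parse_prompts(argv):
    """Collect every --prompt occurrence, in command-line order (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="runner_sketch")
    # action="append" adds each occurrence to a list instead of
    # overwriting the previous value.
    parser.add_argument("--prompt", action="append", default=[])
    # Flags other than --prompt are left unparsed in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    return args.prompt


prompts = parse_prompts([
    "--prompt", "Once upon a time",
    "--prompt", "girl named Lily.",
    "--prompt", "her toys and her favorite toy was a big,",
    "--num_iters", "1",
])
for index, prompt in enumerate(prompts):
    print(f"prompt {index}: {prompt}")

# A delimiter-based scheme breaks as soon as a prompt contains the delimiter:
# the third prompt ends with a comma, so splitting on "," yields an extra piece.
pieces = ",".join(prompts).split(",")
print(len(prompts), "prompts, but", len(pieces), "pieces after splitting on ','")
```

Because each prompt arrives as its own argument, punctuation inside a prompt needs no escaping beyond normal shell quoting.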

Differential Revision: D72276104

pytorch-bot bot commented Apr 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9817

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 55678ad with merge base 753a88e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Apr 1, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D72276104

cccclai added a commit to cccclai/executorch-1 that referenced this pull request Apr 2, 2025
Summary:

As a preparation for the multiturn conversation, we can provide multiple prompts and execute them in sequence. Example command:
```
./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte    --tokenizer_path tiktokenizer.bin  --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big,"  --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1
```
It will be hard to use any char as delimiter, so we use `--prompt` to explicitly mark a prompt and collect them together.

Reviewed By: kirklandsign

Differential Revision: D72276104
@cccclai cccclai force-pushed the export-D72276104 branch from 1e9a596 to 7d2dc2d Compare April 2, 2025 18:49
@cccclai cccclai force-pushed the export-D72276104 branch from 7d2dc2d to 24b6ca4 Compare April 2, 2025 18:51
@cccclai cccclai force-pushed the export-D72276104 branch from 24b6ca4 to e3ec4b5 Compare April 2, 2025 18:52
cccclai added a commit to cccclai/executorch-1 that referenced this pull request Apr 2, 2025
Summary:
Pull Request resolved: pytorch#9817

As a preparation for the multiturn conversation, we can provide multiple prompts and execute them in sequence. Example command:
```
./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte    --tokenizer_path tiktokenizer.bin  --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big,"  --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1
```
It will be hard to use any char as delimiter, so we use `--prompt` to explicitly mark a prompt and collect them together.

Reviewed By: kirklandsign

Differential Revision: D72276104
@cccclai cccclai force-pushed the export-D72276104 branch from e3ec4b5 to d86ecc8 Compare April 2, 2025 18:56
@cccclai cccclai force-pushed the export-D72276104 branch from d86ecc8 to b59f256 Compare April 2, 2025 19:05
@cccclai cccclai force-pushed the export-D72276104 branch from b59f256 to 8a19c85 Compare April 2, 2025 19:10
@cccclai cccclai force-pushed the export-D72276104 branch from 8a19c85 to 28292d0 Compare April 2, 2025 19:35
@cccclai cccclai force-pushed the export-D72276104 branch from 28292d0 to 92ed635 Compare April 2, 2025 20:51
@cccclai cccclai force-pushed the export-D72276104 branch from 92ed635 to 40a42c6 Compare April 2, 2025 20:56
@cccclai cccclai force-pushed the export-D72276104 branch from 40a42c6 to 43bd572 Compare April 2, 2025 20:58
@larryliu0820
Contributor

Just curious - will QNN be able to give response to multiple prompts?

@cccclai cccclai force-pushed the export-D72276104 branch from 43bd572 to 55678ad Compare April 2, 2025 23:46
@cccclai
Contributor Author

cccclai commented Apr 2, 2025

> Just curious - will QNN be able to give response to multiple prompts?

Yeah. Currently each prompt is handled as a fresh prompt for its own inference, but we will enable multi-turn conversation, where the previous conversation becomes part of the context.
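
The difference between the two behaviors described in this reply can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the runner's implementation: `generate` stands in for a model call, `run_independent`, `run_multi_turn`, and `echo_length` are hypothetical names, and the thread does not say how the runner will actually carry context between turns (plain text concatenation is used here only to make the contrast visible).

```python
from typing import Callable, List


def run_independent(prompts: List[str], generate: Callable[[str], str]) -> List[str]:
    """Each prompt is run on its own, with no memory of earlier prompts."""
    return [generate(prompt) for prompt in prompts]


def run_multi_turn(prompts: List[str], generate: Callable[[str], str]) -> List[str]:
    """Each prompt is run with all earlier prompts and replies as context."""
    history = ""
    replies = []
    for prompt in prompts:
        context = history + prompt
        reply = generate(context)
        replies.append(reply)
        # The next turn sees everything so far, including this reply.
        history = context + reply
    return replies


def echo_length(context: str) -> str:
    # Stand-in for a model: reports how many characters of context it saw.
    return f"[{len(context)}]"


print(run_independent(["ab", "cd"], echo_length))
print(run_multi_turn(["ab", "cd"], echo_length))
```

With the stand-in model, the independent run sees two characters of context for every prompt, while the multi-turn run sees a longer context on the second turn because the first prompt and its reply are carried forward.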

@facebook-github-bot facebook-github-bot merged commit fb1faaf into pytorch:main Apr 3, 2025
88 of 90 checks passed
kirklandsign pushed a commit that referenced this pull request Apr 11, 2025
Differential Revision: D72276104

Pull Request resolved: #9817
Labels: CLA Signed (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed), fb-exported, topic: not user facing
5 participants