Summary:
Pull Request resolved: #9817
In preparation for multi-turn conversation support, the runner now accepts multiple prompts and executes them in sequence. Example command:
```
./qnn_llama3_2_runner --model_path hybrid_llama_qnn.pte --tokenizer_path tiktokenizer.bin --eval_mode 1 --prompt "Once upon a time" --prompt "girl named Lily." --prompt "her toys and her favorite toy was a big," --kv_updater "ShiftPointer" --logits_scale 0.1 --output_path output.txt --num_iters 1
```
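Splitting a single string on a delimiter character is fragile, because any character could legitimately appear inside a prompt. Instead, each prompt is marked explicitly with its own `--prompt` flag, and all of the values are collected together.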
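The collection step described above can be sketched as follows. This is only an illustration of the idea, not the code from this change: `collect_prompts` is a hypothetical helper name, and handling both the `--prompt value` and `--prompt=value` spellings is an assumption about how the flags might be written on the command line.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not taken from this commit): gathers the value of
// every `--prompt` flag found in argv, preserving command-line order.
std::vector<std::string> collect_prompts(int argc, const char* const* argv) {
  const std::string flag = "--prompt";
  std::vector<std::string> prompts;
  for (int i = 1; i < argc; ++i) {
    const std::string arg = argv[i];
    if (arg == flag) {
      // Form: --prompt "text". The value is the next argument, if present.
      if (i + 1 < argc) {
        prompts.emplace_back(argv[++i]);
      }
    } else if (arg.rfind(flag + "=", 0) == 0) {
      // Form: --prompt=text. The value follows the '=' in the same argument.
      prompts.push_back(arg.substr(flag.size() + 1));
    }
  }
  return prompts;
}
```

Called from `main` as `collect_prompts(argc, argv)`, a helper like this would return the prompts in the order they were given, so they can be run one after another without reserving any character as a separator.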
Reviewed By: kirklandsign
Differential Revision: D72276104
 DEFINE_string(prompt, "The answer to the ultimate question is", "Prompt.");
-DEFINE_string(
-system_prompt,
-"",
-"Tells the model what kind of assistant it should be. For example, You are a helpful AI assistant for travel tips and recommendations. Default is None");
-DEFINE_double(
-temperature,
-0.0f,
-"Temperature; Default is 0.0f. 0 = greedy argmax sampling (deterministic). Lower temperature = more deterministic");
-DEFINE_int32(
-seq_len,
-128,
-"Total number of tokens to generate (prompt + output).");
+DEFINE_string(
+system_prompt,
+"",
+"Tells the model what kind of assistant it should be. For example, You are a helpful AI assistant for travel tips and recommendations. Default is None");
+DEFINE_double(
+temperature,
+0.0f,
+"Temperature; Default is 0.0f. 0 = greedy argmax sampling (deterministic). Lower temperature = more deterministic");
+DEFINE_int32(
+seq_len,
+128,
+"Total number of tokens to generate (prompt + output).");