main : add stop keywords #1387

Closed · wants to merge 4 commits
Conversation

@ejones (Collaborator) commented May 10, 2023

Resurrects #769, which was ready to go but abandoned in favor of #863, which was reverted. #769 was itself a rewrite of #365 by @joshmackwilliams. Fixes #57. I've also simplified the code a bit.

From the original author:

Stop keywords can be specified using the "--stop" parameter. Upon seeing one of these keywords in the generated output, the model will terminate generation immediately. Like reverse prompts, multiple stop keywords can be specified by passing the --stop argument multiple times.

The implementation is heavily based on the reverse prompt implementation...
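The diff itself isn't shown here, but the antiprompt-style check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; `ends_with_stop` is a hypothetical helper name:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper mirroring the reverse-prompt check: returns true
// when the generated text so far ends with any of the stop keywords,
// at which point generation would terminate immediately.
static bool ends_with_stop(const std::string & output,
                           const std::vector<std::string> & stops) {
    for (const auto & stop : stops) {
        if (output.size() >= stop.size() &&
            output.compare(output.size() - stop.size(),
                           stop.size(), stop) == 0) {
            return true;
        }
    }
    return false;
}
```

In the generation loop, this check would run after each newly sampled token is appended to the output, with the stop list populated from the repeated `--stop` arguments.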

Testing

Tested with 30B in interactive and non-interactive modes. Note that in interactive mode, --stop still terminates the process. This appears to be the original intent.

Non-interactive, without --stop:

 % ./main -m $LLAMA_30B_Q4_0 -c 1024 -n 32 -p "$(cat prompts/chat-with-bob.txt) Name a color"$'\n'"Bob: "
main: build = 529 (365869d)
main: seed  = 1683689302
...
 Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: Name a color
Bob: 1,053,464 bytes of memory are being used by the current session of your Windows.
User: What year were you born?
llama_print_timings:        load time = 11568.84 ms
llama_print_timings:      sample time =    22.28 ms /    32 runs   (    0.70 ms per run)
llama_print_timings: prompt eval time = 11554.26 ms /   106 tokens (  109.00 ms per token)
llama_print_timings:        eval time =  6173.09 ms /    31 runs   (  199.13 ms per run)
llama_print_timings:       total time = 17767.44 ms

Non-interactive, with --stop:

% ./main -m $LLAMA_30B_Q4_0 -c 1024 -n 32 -p "$(cat prompts/chat-with-bob.txt) Name a color"$'\n'"Bob: " --stop User:
main: build = 529 (365869d)
main: seed  = 1683689361
...
 Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: Name a color
Bob:  Red, blue, green, black and white.
User:
llama_print_timings:        load time = 11683.41 ms
llama_print_timings:      sample time =     8.97 ms /    13 runs   (    0.69 ms per run)
llama_print_timings: prompt eval time = 11665.79 ms /   106 tokens (  110.05 ms per token)
llama_print_timings:        eval time =  2367.22 ms /    12 runs   (  197.27 ms per run)
llama_print_timings:       total time = 14061.44 ms

Interactive, with --stop:

% ./main -m $LLAMA_30B_Q4_0 -c 1024 -n 32 -f prompts/chat-with-bob.txt -r User: --stop blue                          
main: build = 529 (365869d)
main: seed  = 1683689426
...
== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

 Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: name a color
Bob: Name one of these colors: Blue, Green, Red, Black, White, Orange, Pink, Yellow, Brown,
 Purple.
User: name a color in lower case
Bob: Name one of these colors: blue
llama_print_timings:        load time = 11954.31 ms
llama_print_timings:      sample time =    30.58 ms /    43 runs   (    0.71 ms per run)
llama_print_timings: prompt eval time = 13830.20 ms /   110 tokens (  125.73 ms per token)
llama_print_timings:        eval time =  7563.43 ms /    42 runs   (  180.08 ms per run)
llama_print_timings:       total time = 360323.90 ms

Multiple --stop:

% ./main -m $LLAMA_30B_Q4_0 -c 1024 -n 32 -p "$(cat prompts/chat-with-bob.txt) Name a" --stop User: --stop Bob:
main: build = 529 (365869d)
main: seed  = 1683689992
...
 Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User: Name a country in Asia.
Bob:
llama_print_timings:        load time = 11892.59 ms
llama_print_timings:      sample time =     4.84 ms /     7 runs   (    0.69 ms per run)
llama_print_timings: prompt eval time = 11874.44 ms /   101 tokens (  117.57 ms per token)
llama_print_timings:        eval time =  1299.45 ms /     6 runs   (  216.57 ms per run)
llama_print_timings:       total time = 13197.79 ms

@DannyDaemonic (Contributor) commented May 10, 2023

What's the argument for making a new option vs just having a --stop-on-reverse-prompt type option?

Edit: Or, I remember reading a PR somewhere that would change -r such that it just doesn't automatically trigger interactive mode.

@ejones (Collaborator, Author) commented May 11, 2023

change -r such that it just doesn't automatically trigger interactive mode.

#1032 it looks like? Yeah, that was my first instinct. Tbh I went with #769 because it seemed to have consensus (@ggerganov approved) and just got tied up in #863. I'm not so opinionated as to reject the duplicative option.

That said, from what I could discern, the reasons for a distinct option seemed to include:

  • backwards compat for -r triggering interactive
  • -r has some interaction with --instruct
  • using distinct -r and --stop, with the latter terminating the process (from what I can tell). This use case is less clear to me

@ejones (Collaborator, Author) commented May 11, 2023

Side note: the workaround for non-interactive stopping that @SlyEcho notes in #1032 (piping in /dev/null) doesn't appear to work any longer. As of #1040 it looks like EOF no longer terminates the process.

@ejones ejones requested a review from ggerganov May 11, 2023 03:25
@DannyDaemonic (Contributor) commented May 11, 2023

Ah yes, thank you. I was referring to #1032. I prefer that solution over adding a second set of antiprompts just for exiting. Let's see if we can't push that one through.

@ejones ejones removed the request for review from ggerganov May 11, 2023 11:14
@ejones (Collaborator, Author) commented May 11, 2023

Closing in favor of #1032

@ejones ejones closed this May 11, 2023