Load parallel.cpp -f file.txt external prompt file #3416
Conversation
…edj/llama.cpp into load-parallel-prompt-file
Thank you for the review. I have made all the changes.

Unfortunately, I made a mess of updating this branch out of inexperience with GitHub processes, so please ignore what I have pushed and I will do it again today. Is your preference for a new PR after this kind of updating?
On Mon, Oct 2, 2023 at 12:17 PM, Georgi Gerganov requested changes on this pull request:
------------------------------
On cmake_all.sh
<#3416 (comment)>:
Not needed
------------------------------
On ParallelQuestions.txt
<#3416 (comment)>:
Move to prompts, change name to parallel-questions.txt
------------------------------
In common/common.h
<#3416 (comment)>:
> @@ -79,6 +79,7 @@ struct gpt_params {
std::string model_draft = ""; // draft model for speculative decoding
std::string model_alias = "unknown"; // model alias
std::string prompt = "";
+ std::string prompt_file = ""; // store for external prompt file name
⬇️ Suggested change
- std::string prompt_file = ""; // store for external prompt file name
+ std::string prompt_file = ""; // store the external prompt file name
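For context, here is a sketch of how a `-f file.txt` argument might populate the new field; the actual parsing code in common.cpp is not quoted in this thread, so `load_prompt_file` is a hypothetical helper name and the details are assumptions:

```cpp
// Hypothetical helper: read an external prompt file into gpt_params.
// Assumes gpt_params from common/common.h with the new prompt_file field.
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

static bool load_prompt_file(const std::string & fname, gpt_params & params) {
    std::ifstream file(fname);
    if (!file) {
        fprintf(stderr, "error: failed to open prompt file '%s'\n", fname.c_str());
        return false;
    }
    params.prompt_file = fname; // store the external prompt file name
    params.prompt.assign(std::istreambuf_iterator<char>(file),
                         std::istreambuf_iterator<char>());
    return true;
}
```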
------------------------------
In examples/parallel/parallel.cpp
<#3416 (comment)>:
> @@ -70,6 +72,22 @@ struct client {
std::vector<llama_token> tokens_prev;
};
+static void printDateTime() {
+ std::time_t currentTime = std::time(nullptr);
+ std::cout << "\n\033[35mRUN PARAMETERS as at \033[0m" << std::ctime(&currentTime);
We don't use std::cout in this project
------------------------------
In examples/parallel/parallel.cpp
<#3416 (comment)>:
> @@ -70,6 +72,22 @@ struct client {
std::vector<llama_token> tokens_prev;
};
+static void printDateTime() {
+ std::time_t currentTime = std::time(nullptr);
+ std::cout << "\n\033[35mRUN PARAMETERS as at \033[0m" << std::ctime(&currentTime);
+}
+
+// Define a split string function to ...
+static std::vector<std::string> splitString(const std::string& input, char delimiter) {
snake_case
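Applying both review points, the two helpers might look like the sketch below (snake_case names, printf-family output instead of std::cout); this is an assumption about the final form, not the merged code quoted from the PR:

```cpp
#include <cstdio>
#include <ctime>
#include <sstream>
#include <string>
#include <vector>

// print the current date/time, as in the original helper but without std::cout
static void print_date_time() {
    std::time_t current_time = std::time(nullptr);
    printf("\n\033[35mRUN PARAMETERS as at \033[0m%s", std::ctime(&current_time));
}

// split a string on a single-character delimiter
static std::vector<std::string> split_string(const std::string & input, char delimiter) {
    std::vector<std::string> tokens;
    std::istringstream stream(input);
    std::string token;
    while (std::getline(stream, token, delimiter)) {
        tokens.push_back(token);
    }
    return tokens;
}
```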
No, please keep the existing PR. You may …
Thank you. All changes to `origin/load-parallel-prompt-file` are now pushed to `origin/Update-load-parallel-prompt-file`, which I hope I have done correctly this time.
It's interesting to use the 100 questions in …

Since memory is critical, it's worth noting the resources used and how parsimonious the system allocation is with `device.recommendedMaxWorkingSetSize`:
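For reference, a minimal standalone sketch of querying that limit; this uses Apple's metal-cpp C++ bindings as an assumption for illustration, whereas llama.cpp itself reads the value in its Objective-C Metal backend:

```cpp
// Hypothetical example using Apple's metal-cpp bindings (not from this PR).
// Build with: -std=c++17 -framework Metal -framework Foundation
#define NS_PRIVATE_IMPLEMENTATION
#define MTL_PRIVATE_IMPLEMENTATION
#include <Metal/Metal.hpp>
#include <cstdio>

int main() {
    MTL::Device * device = MTL::CreateSystemDefaultDevice();
    if (device) {
        printf("recommendedMaxWorkingSetSize = %8.2f MB\n",
               device->recommendedMaxWorkingSetSize() / 1024.0 / 1024.0);
        device->release();
    }
    return 0;
}
```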
Could you please push your changes to the load-parallel-prompt-file branch so they appear here?
…Update-load-parallel-prompt-file with requested changes
…/pudepiedj/llama.cpp into Update-load-parallel-prompt-file
OK, this is what I did and I hope it's right. I've looked at the files in load-parallel-prompt-file and they appear to have been changed correctly. Please let me know if I have done something wrong (again)!
…edj/llama.cpp into load-parallel-prompt-file
…example * 'master' of github.com:ggerganov/llama.cpp:
- kv cache slot search improvements (ggml-org#3493)
- prompts : fix editorconfig checks after ggml-org#3416
- parallel : add option to load external prompt file (ggml-org#3416)
- server : reuse llama_sample_token common util (ggml-org#3494)
- llama : correct hparams comparison (ggml-org#3446)
- ci : fix xcodebuild destinations (ggml-org#3491)
- convert : update Falcon script for new HF config (ggml-org#3448)
- build : use std::make_tuple() for compatibility with older GCC versions (ggml-org#3488)
- common : process escape sequences in reverse prompts (ggml-org#3461)
- CLBlast: Fix handling of on-device tensor data
- server : fix incorrect num_tokens_predicted (ggml-org#3480)
- swift : disable ACCELERATE_NEW_LAPACK (ggml-org#3481)
- ci : add swift build via xcodebuild (ggml-org#3482)
Thank you. I appreciate your patience!
On Fri, Oct 6, 2023 at 2:16 PM, Georgi Gerganov merged #3416 into master.
- Enable external file and add datestamp
- Add name of external file at end
- Upload ToK2024
- Delete ToK2024.txt
- Experiments with jeopardy
- Move ParallelQuestions to /prompts and rename
- Interim commit
- Interim commit
- Final revision
- Remove trailing whitespace
- remove cmake_all.sh
- Remove cmake_all.sh
- Changed .gitignore
- Improved reporting and new question files
- Corrected typo
- More LLM questions
- Update LLM-questions.txt
- Yet more LLM-questions
- Remove jeopardy results file
- Reinstate original jeopardy.sh
- Update examples/parallel/parallel.cpp

Co-authored-by: Georgi Gerganov <[email protected]>
This branch includes amendments to three files in `./llama.cpp/examples`, necessary to implement the external prompt file option `-f file.txt` that appears in `./bin/parallel --help`. The three affected files are:

(file list omitted)

Command-line code (second run example with `-ns 128`):
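A representative invocation is sketched below; the model path and the other flag values are assumptions for illustration, with only `-f` and `-ns 128` taken from the description above:

```
./bin/parallel -m models/llama-2-7b.Q4_0.gguf -f prompts/parallel-questions.txt -np 8 -ns 128 -n 128 -cb
```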
Example output from two different runs on an M2 Max 32 GB under macOS Sonoma 14.0 (omitting the initialisation):
Many lines omitted. This is from the end of a run with `-ns 128`.