Added README.md for main with examples and explanations #1139

DannyDaemonic · 2023-04-23T10:06:03Z

I've been slowly working on this. It contains a couple of examples and a longer explanation of the options a user might use.

DannyDaemonic · 2023-04-23T10:19:16Z

Just updated it to fix a link to the primary README.

If you want to see it formatted, you can follow this link.

Edit: Updated again to add a link back to the front llama.cpp page, in case someone finds their way to the main example from outside of GitHub.
Edit 2: Expanded the Generation Flags section.

sw · 2023-04-23T10:59:37Z

@mgroeber9110 is doing the same in #1131...

DannyDaemonic · 2023-04-23T11:00:46Z

Thanks. That's the problem with these large write-ups. We probably both started days ago...

Edit: I'm biased for sure, but mine feels more complete. If you want to commit @mgroeber9110's README.md first, I can try to edit my information into his, but they are structured differently. I feel like I would probably just end up replacing everything. :( Still, it would get his --interactive_start fixes in.

DannyDaemonic · 2023-04-23T12:00:26Z

I was trying to compare all the options to see if we could merge the two. I did find that his has instructions for --seed and --ignore-eos, which mine doesn't have and which should certainly be in there. But our READMEs are structured so differently that I don't know if we could really merge them.

Mine is missing:
-h, --help
-s SEED, --seed SEED
--ignore-eos
-b N, --batch_size N

I did have --batch_size in there before but I accidentally cut it out of the final draft when I was restructuring things. I'm going to edit it back in again. I don't have any issue adding in the other missing options but I'll wait until we have a better idea of what we're doing.

mgroeber9110 · 2023-04-23T12:09:31Z

@DannyDaemonic Sorry for the duplicate work - I should have probably made a note on the ticket that I intend to work on it, but your text definitely feels much more comprehensive, while I only took a quick stab at giving people a starting point.

From my point of view, it probably makes more sense to merge yours first, and then I can perhaps transfer a few tidbits of information from mine (such as what exactly --instruct injects, and perhaps a sentence or two on stopping conditions).

Just wondering: was it intentional to not mention --ignore-eos? I found this quite powerful, but apparently it does not work everywhere (e.g. #990).

DannyDaemonic · 2023-04-23T12:34:03Z

Yeah, I feel bad that time was wasted either way.

The --ignore-eos was just an oversight. It really should be in there. I can make a stub for it in ## Generation Flags and you can fill it in?

I intended to put all the options in there. Even --help is worth mentioning as it shows the latest options and default values, which can and do change frequently.

DannyDaemonic · 2023-04-23T13:03:51Z

I added the missing options with the exception of --ignore-eos since @mgroeber9110 has more experience with that than me. I was going to put a stub in there but it might not even be best under ## Generation Flags. I was thinking maybe it belongs in ## Interaction if that's how it's most commonly used? Either way, I could leave that up to @mgroeber9110. Even if it does turn out to fit better in the generative section, it could certainly be mentioned in the instruct section.

On a bit of a tangent here, but I have had problems with Alpaca ending a response too soon, sometimes immediately, and never tried --ignore-eos to see if that helps. It does have a tendency to go a bit random at the end so I assumed it might make that worse. I suspect the real issue might be due to some early Alpaca models being trained with a context of 512 and I like to run everything with 2048.

jon-chuang · 2023-04-26T14:19:20Z

examples/main/README.md

+
+These options help improve the performance and memory usage of the LLaMA models:
+
+-   `-t N, --threads N`: Set the number of threads to use during computation. Using the correct number of threads can greatly improve performance. It is recommended to set this value to the number of CPU cores.


I'll change this to mention physical CPU cores, and the number of performance cores on chips with efficiency/performance (E/P) cores, in PR #934.

DannyDaemonic · 2023-04-26

I'm watching that pull request. My plan was to put another pull request through removing it from the ## common section as soon as this is resolved.

jon-chuang · 2023-04-26

I still think we should leave it in. Even with a warning on the command line, the user might miss it, and it's good to have multiple paths to getting the information.
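
For anyone reading along, here is a rough sketch of the thread-count advice discussed above, not anything taken from the README or from PR #934. It assumes that `os.cpu_count()` reports logical CPUs, so the caller has to say how many hardware threads share a physical core; `suggest_threads`, `build_main_args`, the `./main` path, the `-m` model flag, and the model filename are all illustrative assumptions. The `--threads`, `--seed`, `--batch_size`, and `--ignore-eos` flags are the ones named in this conversation.

```python
import os


def suggest_threads(logical_cpus=None, threads_per_core=1):
    """Estimate a value for -t/--threads (hypothetical helper).

    os.cpu_count() counts logical CPUs, so on a machine with SMT or
    hyper-threading the caller passes threads_per_core=2 to approximate
    the number of physical cores. This does not detect E/P cores.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, logical_cpus // threads_per_core)


def build_main_args(model_path, threads, seed=None, batch_size=None,
                    ignore_eos=False):
    """Build an argument list for the main example (hypothetical helper).

    "./main" and "-m" are assumptions about how the binary is invoked;
    the remaining flags are the ones mentioned in this thread.
    """
    args = ["./main", "-m", model_path, "--threads", str(threads)]
    if seed is not None:
        args += ["--seed", str(seed)]
    if batch_size is not None:
        args += ["--batch_size", str(batch_size)]
    if ignore_eos:
        args.append("--ignore-eos")
    return args


if __name__ == "__main__":
    # Assumes two hardware threads per physical core; adjust for your CPU.
    threads = suggest_threads(threads_per_core=2)
    print(" ".join(build_main_args("models/model.bin", threads, seed=42)))
```

Running `--help` on the actual binary remains the authoritative source for flag names and defaults, since those change frequently.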

DannyDaemonic force-pushed the main-readme branch 2 times, most recently from 1a3d66d to e7a09bd Compare April 23, 2023 10:17

DannyDaemonic force-pushed the main-readme branch 5 times, most recently from 2d94c77 to 9d72015 Compare April 23, 2023 10:52

DannyDaemonic mentioned this pull request Apr 23, 2023

Example readme and some light refactoring #1131

Merged

Added README.md for main with examples and explanations

b8cf6b6

DannyDaemonic force-pushed the main-readme branch from 9d72015 to b8cf6b6 Compare April 23, 2023 12:01

sw added the documentation Improvements or additions to documentation label Apr 23, 2023

Added --help and --seed

befd875

DannyDaemonic force-pushed the main-readme branch from 3784be4 to befd875 Compare April 23, 2023 13:12

sw reviewed Apr 23, 2023

Fixed typo and added longer section on n_predict

507207e

sw approved these changes Apr 23, 2023

sw merged commit edce63b into ggml-org:master Apr 23, 2023

DannyDaemonic deleted the main-readme branch April 23, 2023 15:40

jon-chuang reviewed Apr 26, 2023
