Commit 09549b7

bertwagner authored and arthw committed
arg : add env variable for parallel (ggml-org#9513)
* add env variable for parallel
* Update README.md with env: LLAMA_ARG_N_PARALLEL
1 parent e863854 commit 09549b7

2 files changed: 2 additions and 2 deletions


common/arg.cpp

Lines changed: 1 addition & 1 deletion
@@ -1312,7 +1312,7 @@ gpt_params_context gpt_params_parser_init(gpt_params & params, llama_example ex,
         [](gpt_params & params, int value) {
             params.n_parallel = value;
         }
-    ));
+    ).set_env("LLAMA_ARG_N_PARALLEL"));
     add_opt(llama_arg(
         {"-ns", "--sequences"}, "N",
         format("number of sequences to decode (default: %d)", params.n_sequences),
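
The hunk above attaches an environment variable name to the `-np, --parallel` option. As a rough illustration of what such a fallback amounts to, here is a minimal, standalone sketch. It is not the llama.cpp implementation: `params_sketch`, `parse_int`, and `apply_env_n_parallel` are hypothetical names invented for this example, and the validation rules (reject empty or partially numeric values) are assumptions.

```cpp
#include <cstdlib>
#include <exception>
#include <optional>
#include <string>

// Hypothetical stand-in for the real parameter struct; it carries only
// the field this commit touches. The default of 1 matches the README row.
struct params_sketch {
    int n_parallel = 1;
};

// Parse an entire C string as an int. Returns std::nullopt for a null
// pointer, an empty string, trailing garbage, or an out-of-range value.
std::optional<int> parse_int(const char * text) {
    if (text == nullptr || *text == '\0') {
        return std::nullopt;
    }
    try {
        std::size_t used = 0;
        const std::string s(text);
        const int value = std::stoi(s, &used);
        if (used != s.size()) {
            return std::nullopt; // e.g. "4x": only part of the string parsed
        }
        return value;
    } catch (const std::exception &) {
        return std::nullopt; // std::invalid_argument or std::out_of_range
    }
}

// Assumed behavior for this sketch: if the variable is set and valid,
// copy it into params and report true; otherwise leave params untouched.
bool apply_env_n_parallel(params_sketch & params,
                          const char * env_name = "LLAMA_ARG_N_PARALLEL") {
    const std::optional<int> value = parse_int(std::getenv(env_name));
    if (!value) {
        return false;
    }
    params.n_parallel = *value;
    return true;
}
```

In this sketch a caller would invoke `apply_env_n_parallel` before processing command-line flags, so that an explicit `-np` value could still overwrite the environment value afterwards. That ordering is a design choice of the example, not something this diff shows.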

examples/server/README.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ The project is under active development, and we are [looking for feedback and co
 | `-ctk, --cache-type-k TYPE` | KV cache data type for K (default: f16) |
 | `-ctv, --cache-type-v TYPE` | KV cache data type for V (default: f16) |
 | `-dt, --defrag-thold N` | KV cache defragmentation threshold (default: -1.0, < 0 - disabled)<br/>(env: LLAMA_ARG_DEFRAG_THOLD) |
-| `-np, --parallel N` | number of parallel sequences to decode (default: 1) |
+| `-np, --parallel N` | number of parallel sequences to decode (default: 1)<br/>(env: LLAMA_ARG_N_PARALLEL) |
 | `-cb, --cont-batching` | enable continuous batching (a.k.a dynamic batching) (default: enabled)<br/>(env: LLAMA_ARG_CONT_BATCHING) |
 | `-nocb, --no-cont-batching` | disable continuous batching<br/>(env: LLAMA_ARG_NO_CONT_BATCHING) |
 | `--mlock` | force system to keep model in RAM rather than swapping or compressing |
