Commit dbfe9ae

danbev authored and arthw committed
llama : use reserve/emplace_back in sampler_sample (ggml-org#9534)
This commit updates the llama_sampler_sample function to use reserve and emplace_back for the vector of llama_token_data structs. The motivation for this change is to avoid creating n_vocab default-constructed llama_token_data structs that are then immediately overwritten.
1 parent 6c0b1b8 · commit dbfe9ae

1 file changed: 3 additions, 2 deletions


src/llama-sampling.cpp

@@ -236,9 +236,10 @@ llama_token llama_sampler_sample(struct llama_sampler * smpl, struct llama_conte
     const int n_vocab = llama_n_vocab(llama_get_model(ctx));
 
     // TODO: do not allocate each time
-    std::vector<llama_token_data> cur(n_vocab);
+    std::vector<llama_token_data> cur;
+    cur.reserve(n_vocab);
     for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
-        cur[token_id] = llama_token_data{token_id, logits[token_id], 0.0f};
+        cur.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
     }
 
     llama_token_data_array cur_p = {
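
As an aside, here is a minimal standalone sketch of the before/after pattern the commit message describes. It is not from the commit: token_data is a simplified stand-in for llama_token_data, and n_vocab and the logits vector are placeholders.

#include <cstdint>
#include <vector>

// Simplified stand-in for llama_token_data: token id, raw logit, probability.
struct token_data {
    int32_t id;
    float   logit;
    float   p;
};

int main() {
    const int32_t n_vocab = 32000;            // placeholder vocab size
    std::vector<float> logits(n_vocab, 0.0f); // placeholder logits

    // Before: value-initializes n_vocab elements up front, then overwrites
    // every one of them in the loop.
    std::vector<token_data> before(n_vocab);
    for (int32_t id = 0; id < n_vocab; id++) {
        before[id] = token_data{id, logits[id], 0.0f};
    }

    // After: allocates capacity once with reserve, then constructs each
    // element at the end of the vector, skipping the redundant
    // initialization pass.
    std::vector<token_data> after;
    after.reserve(n_vocab);
    for (int32_t id = 0; id < n_vocab; id++) {
        after.emplace_back(token_data{id, logits[id], 0.0f});
    }
    return 0;
}

Note that because emplace_back is passed an already-built temporary here, it behaves like push_back; the saving comes from reserve eliminating the up-front value-initialization of n_vocab elements, not from in-place construction.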

0 commit comments