I was playing around with the server example and wanted to expose the probabilities of the generated tokens to the server client, to implement custom stopping sequences and criteria (similar to OpenAI's API here).
All it should take is creating a variant of "llama_sample_token" and "llama_sample_token_greedy" that returns an object containing the top X tokens and their probabilities.
The only related issue/PR/discussion I was able to find is this PR about logging probabilities. Please give me pointers if similar requests have been discussed elsewhere.
Since I'm relatively new to the repo, what is the protocol here? Should I just make a PR?
Yes. I realized the parameters passed into the sampling functions are references, so there is no need to change the core logic to get the probabilities: just reading directly from the candidate-list object after passing it through the sampling function is enough.
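For reference, a minimal sketch of that approach, assuming the llama.cpp C sampling API of this era (llama_get_logits, llama_token_data_array, llama_sample_softmax, llama_sample_token; exact signatures may differ between versions). This is not the server example's actual code path, just an illustration:

```cpp
#include "llama.h"
#include <vector>
#include <cstdio>

// Sample one token and print the top n_probs candidates with their
// probabilities, read back from the candidates array the sampler mutated.
// Assumes ctx has already evaluated a prompt so llama_get_logits is valid.
static llama_token sample_with_probs(llama_context * ctx, int n_probs) {
    const int     n_vocab = llama_n_vocab(ctx);
    const float * logits  = llama_get_logits(ctx);

    // Build the candidate list from the raw logits of the last evaluated token.
    std::vector<llama_token_data> candidates;
    candidates.reserve(n_vocab);
    for (llama_token id = 0; id < n_vocab; ++id) {
        candidates.push_back({ id, logits[id], 0.0f });
    }
    llama_token_data_array candidates_p = { candidates.data(), candidates.size(), false };

    // llama_sample_softmax sorts the array by probability and fills the .p
    // fields in place, so the top candidates can simply be read afterwards.
    llama_sample_softmax(ctx, &candidates_p);
    const llama_token picked = llama_sample_token(ctx, &candidates_p);

    for (int i = 0; i < n_probs && i < (int) candidates_p.size; ++i) {
        printf("token %d  p=%.4f\n", candidates_p.data[i].id, candidates_p.data[i].p);
    }
    return picked;
}
```

So the existing sampling functions don't need to return anything extra; the per-token probabilities are already available in the candidates array after sampling, and the server example would only need to copy the top X entries into its response.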