"If this is set, the chat will be formatted so that the final "
"message in the chat is open-ended, without any EOS tokens. The "
"model will continue this message rather than starting a new one. "
"This allows you to \"prefill\" part of the model's response for it. "
"Cannot be used at the same time as `add_generation_prompt`."
Hope `llama-server` supports this feature too.
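For reference, a minimal sketch of how the parameter is used with transformers' `apply_chat_template` (the model name and messages are illustrative):

```python
from transformers import AutoTokenizer

# Model name is illustrative; any chat model with a chat template works.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "user", "content": "Write a haiku about the sea."},
    # The final message is a partial assistant message to be continued ("prefill").
    {"role": "assistant", "content": "Waves fold into foam,"},
]

# continue_final_message leaves the last turn open-ended (no EOS / end-of-turn
# tokens), so generation resumes inside that message instead of starting a new
# one. It cannot be combined with add_generation_prompt.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    continue_final_message=True,
)
print(prompt)
```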
Motivation
This is very helpful for user-controllable generation.
When a response is truncated by `max_tokens`, the user can continue generating a longer response.
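As a rough illustration of that workflow against vLLM's OpenAI-compatible server (the base URL, model name, and token limits are assumptions; `continue_final_message` is passed as an extra request parameter via `extra_body`):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

messages = [{"role": "user", "content": "Explain how a KV cache works."}]
first = client.chat.completions.create(model=MODEL, messages=messages, max_tokens=64)
reply = first.choices[0].message.content

if first.choices[0].finish_reason == "length":
    # Feed the truncated text back as the final assistant message and ask the
    # server to continue it rather than open a new assistant turn.
    messages.append({"role": "assistant", "content": reply})
    cont = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        max_tokens=256,
        extra_body={"continue_final_message": True, "add_generation_prompt": False},
    )
    reply += cont.choices[0].message.content

print(reply)
```

If `llama-server` adds the flag, the continuation request would presumably look the same.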
Possible Implementation
No response
I agree that it would be nice to have such functionality, but instead of adding a separate parameter, I think it should be supported by default if the last message provided is an assistant message, just like Claude does.