
Add the endpoints /api/tags and /api/chat #13659


Merged — 3 commits, May 21, 2025

Conversation

@R-Dson (Contributor) commented May 20, 2025

Add the endpoints /api/tags and /api/chat, and improve the model metadata response.

These changes made llama-server work with Copilot Chat again for me. Both /api/tags and /api/chat are added because of a discussion on open-webui's page, where they point out that both need to be handled. The old JSON values are kept as-is for backwards compatibility, in case anyone uses them.

This issue was also pointed out in the comments of #12896.
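For reference, this is roughly the response shape an Ollama-style client expects from GET /api/tags (a sketch based on Ollama's documented API; the model name and values below are illustrative, not taken from this PR):

```json
{
  "models": [
    {
      "name": "llama3:latest",
      "model": "llama3:latest",
      "modified_at": "2025-05-20T00:00:00Z",
      "size": 4920753328,
      "digest": "sha256:…",
      "details": {
        "format": "gguf",
        "family": "llama",
        "parameter_size": "8B",
        "quantization_level": "Q4_K_M"
      }
    }
  ]
}
```

Clients like open-webui and Copilot Chat enumerate the models[] array to populate their model pickers, which is why the endpoint has to exist even if it reports only the single loaded model.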

Add the endpoints /api/tags and /api/chat, and improved the model metadata response
@@ -4086,6 +4101,19 @@ int main(int argc, char ** argv) {
{ "llama.context_length", ctx_server.slots.back().n_ctx, },
}
},
{"modelfile", ""}, // Specific to ollama and does not seem to be needed
{"parameters", ""}, // TODO: add parameters
Collaborator:

Does the API client expect this to be a string or an object?

Contributor Author (@R-Dson):

modelfile is a string. For parameters I am not sure, but it seems to be enough as-is.
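For context, Ollama's model-info endpoint returns both modelfile and parameters as plain strings rather than objects (a sketch of the documented shape; the values are illustrative and not part of this PR):

```json
{
  "modelfile": "# Modelfile generated by \"ollama show\"\nFROM llama3\n",
  "parameters": "num_ctx 4096\nstop \"<|eot_id|>\"",
  "template": "{{ .Prompt }}",
  "details": {
    "format": "gguf",
    "family": "llama"
  }
}
```

Under that assumption, returning empty strings for both fields, as the diff above does, keeps the response type-compatible with clients that parse them.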

R-Dson commented May 21, 2025

I think this is the minimal code that we need for Copilot to work with it.

@@ -3740,7 +3741,7 @@ int main(int argc, char ** argv) {
if (req.path == "/" || tmp.back() == "html") {
res.set_content(reinterpret_cast<const char*>(loading_html), loading_html_len, "text/html; charset=utf-8");
res.status = 503;
} else if (req.path == "/models" || req.path == "/v1/models") {
} else if (req.path == "/models" || req.path == "/v1/models" || req.path == "/api/tags") {
// allow the models endpoint to be accessed during loading
Collaborator:

During loading, the common_chat_templates_source call will fail.

Contributor Author (@R-Dson) commented May 21, 2025:

Do you want me to remove that endpoint from the if-case?

Edit: after the latest changes, the current code no longer uses common_chat_templates_source in that endpoint.

Collaborator:

Hmm, I think it's OK to keep this then.

@R-Dson R-Dson requested a review from ngxson May 21, 2025 11:34
@ngxson ngxson merged commit 0d5c742 into ggml-org:master May 21, 2025
46 checks passed
infil00p pushed a commit to baseweight/llama.cpp that referenced this pull request May 22, 2025
* Add the endpoints /api/tags and /api/chat

Add the endpoints /api/tags and /api/chat, and improved the model metadata response

* Remove trailing whitespaces

* Removed code that is not needed for copilot to work.
3 participants