
Feature Addition: updated server/public_simplechat with 0 setup builtin tool calls, show reasoning, cleanup #17040

@hanishkvc

Description


Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

The alternate tools/server/public_simplechat web client UI has been updated to

  • support a bunch of builtin tool calls (a subset of which need no additional setup, by using the flexibility and power of the browser),
  • show reasoning as it is being generated, for AI models that support it, and
  • clean up the code and flow, both to help with the above and to make it easier to add features like multimodal support in future.

The PR which provides the above is #17038

When running this alternate web client UI, AI models with tool-calling support can implicitly make use of the tool calls listed below.

Direct builtin tool calls supported (using the browser's web worker context, no additional setup needed):

  • javascript code runner
  • calculator
  • data store
  • system date time
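For context, builtin tools like these are typically advertised to a tool-calling model using an OpenAI-style tools schema, and the client dispatches the model's tool-call requests to local implementations. A minimal sketch (the exact schema and dispatch logic used by public_simplechat are assumptions here; field names follow the standard chat-completions tools format):

```python
# Hypothetical sketch: how a builtin "calculator" tool might be advertised
# to a tool-calling model via the standard OpenAI-style tools schema.
# The actual schema used by public_simplechat may differ.
calculator_tool = {
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Evaluate a simple arithmetic expression and return the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Arithmetic expression, e.g. '2 * (3 + 4)'",
                }
            },
            "required": ["expression"],
        },
    },
}

def handle_tool_call(name: str, args: dict) -> str:
    """Dispatch a model-issued tool call to its builtin implementation."""
    if name == "calculator":
        # eval() on model output is unsafe in general; shown only as a sketch.
        # In the real client this runs sandboxed in a web worker context.
        return str(eval(args["expression"], {"__builtins__": {}}, {}))
    raise ValueError(f"unknown tool: {name}")
```

The tool result string would then be sent back to the model as a tool-role message so it can continue the conversation.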

Additional builtin tool calls supported when running the included simpleproxy.py helper:

  • web fetch raw and text
  • web search text
  • fetch pdf as text
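Since the helper gates web access through a list of allowed sites (see the simpleproxy.json note further below), the core check can be sketched roughly as follows. This is an illustrative sketch, not the actual simpleproxy.py code, and the host list is made up:

```python
from urllib.parse import urlparse

# Hypothetical sketch of how a helper like simpleproxy.py might gate web
# fetches through an allowlist of sites; the real implementation may differ.
ALLOWED_HOSTS = {"arxiv.org", "en.wikipedia.org"}  # illustrative list

def is_allowed(url: str) -> bool:
    """Allow a fetch only if the URL's host is on the allowlist."""
    host = urlparse(url).hostname or ""
    # Also accept subdomains of an allowed host, e.g. export.arxiv.org.
    return any(host == h or host.endswith("." + h) for h in ALLOWED_HOSTS)
```

Gating on the parsed hostname (rather than substring-matching the URL) avoids trivially bypassing the allowlist with URLs like `https://evil.example/?x=arxiv.org`.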

See the included readme.md for additional info.

NOTE: Refer to the previous PRs in this series to see the evolution, or to the git commits of this PR for the evolution in finer detail.

Motivation

With this PR, support for showing reasoning, as well as builtin client-side tool calling with a useful bunch of ready-to-use tool calls, has been added to tools/server/public_simplechat.

With this, one can get the local AI to

  • collate news from multiple sites and summarise it,
  • explore and fetch the latest research papers / details from the web, including arxiv, for a topic of interest and summarise them,
  • search for a topic and prepare a summary of the search results, and/or automatically fetch additional pages to provide more detailed info,
  • generate javascript code snippets and test them out, or use them to validate mathematical statements it might make,
  • and/or answer queries around these, or ... it's up to you and the AI model ...

While the thinking / work on backend / server-side MCP support and the like for tool calling is a good thing for users targeting backend server deployments or custom device setups, for normal end users, using the browser's flexibility and capability to expose a bunch of builtin tool calls with zero additional setup, as provided here, is a more practically and immediately useful and usable way of enhancing GenAI/LLM value in productive ways.

This PR series cleans up the existing tools/server/public_simplechat flow and adds further functionality and flexibility to it.

Possible Implementation

The latest PR which provides the above features is #17038

One could get going with

build/bin/llama-server -m ../llama.cpp.models/gpt-oss-20b-mxfp4.gguf --jinja --path tools/server/public_simplechat/ -fa on

NOTE: Checked with GptOss, Granite4 and Gemma3N. Larger models are more flexible, understand the natural language request, map it to the needed tool calls, and work around issues more readily, while smaller models may require more hand-holding wrt tool calling.

NOTE: The default context size should be good enough for simple tasks. However, explicitly set --ctx-size to a larger value if working with many web site / pdf contents, as needed. Also don't forget --n-gpu-layers if wanting to use the GPU.

If one additionally needs the web search, web fetch and pdf-related tool calls, then also run

cd tools/server/public_simplechat/local.tools; python3 ./simpleproxy.py --config simpleproxy.json

NOTE: Remember to edit simpleproxy.json with the list of sites you want to allow access to, as well as to disable local file access, if needed.
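As a loose illustration of an allowlist-style proxy config: the actual simpleproxy.json keys are not shown in this issue, so every field name below is hypothetical, sketched here via Python so the intent can be annotated:

```python
import json

# Hypothetical illustration only: the real simpleproxy.json schema is not
# documented in this issue, so both key names below are assumptions.
config = {
    "allowed.sites": ["arxiv.org", "en.wikipedia.org"],  # hypothetical key: sites the proxy may fetch
    "allow.localfile": False,                            # hypothetical key: disable local file access
}
print(json.dumps(config, indent=2))
```

Whatever the real key names are, the point stands: keep the site list tight and keep local file access off unless you need it.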

By default, tool calls and their responses aren't auto-triggered. One can cross-check the tool calls before allowing their execution, and similarly cross-check the responses before submitting them to the AI model, just to be on the safe side.
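This confirm-before-execute flow can be sketched roughly as follows. This is a simplified model of the idea, not the actual web client code:

```python
from dataclasses import dataclass

# Simplified sketch of a confirm-before-execute flow for tool calls;
# the actual public_simplechat client logic may differ.
@dataclass
class PendingToolCall:
    name: str
    args: dict
    approved: bool = False  # flipped only after the user reviews the call

def run_tool_call(call: PendingToolCall, tools: dict) -> str:
    """Execute a tool call only after the user has approved it."""
    if not call.approved:
        raise PermissionError(f"tool call '{call.name}' awaiting user approval")
    return tools[call.name](**call.args)

# Usage: the user inspects name/args in the UI, then approves execution.
tools = {"system_date_time": lambda: "2025-01-01T00:00:00"}
call = PendingToolCall("system_date_time", {})
call.approved = True
result = run_tool_call(call, tools)
```

The same gate applies in the other direction: the tool's result is shown to the user before it is submitted back to the model.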

One can look into the other PRs in this series to see how this feature set evolved.
