Server allow /completion and /embedding #3815


Closed
4 tasks done
christianwengert opened this issue Oct 27, 2023 · 13 comments · Fixed by #3876
Labels
enhancement New feature or request

Comments

@christianwengert

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

When I start the server as follows:

./server -m wizardlm-70b-v1.0.q4_K_S.gguf --threads 8 -ngl 100  -c 4096 --embedding

and make a request to /embedding

curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"content": "Building a website can be done in 10 simple steps:"}'

I get back, as expected, the vector of embeddings. Now if I make a request to /completion as follows:

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128}'

I'd expect the normal completion to still work, but all I get is the embedding of the prompt (I tested it with the above examples, and the same vector is returned in both cases).
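To illustrate, both requests come back with an embedding-style payload of roughly this shape (values truncated and invented here for illustration):

{"embedding": [0.0123, -0.0456, 0.0789, ...]}

rather than the completion endpoint's usual response containing the generated text.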

Motivation

I guess having both normal completion and the possibility to just get embeddings makes sense in a lot of applications that use the server.

@christianwengert christianwengert added the enhancement New feature or request label Oct 27, 2023
@elohmeier

This is a regression; it still worked in b1359.

@a-h
Contributor

a-h commented Oct 31, 2023

It looks like the issue was introduced in commit 438c2ca, which references pull request #3677.

Before the change

Previously, the /embedding endpoint in server.cpp did this:

        const json data = format_embedding_response(llama);
        return res.set_content(data.dump(), "application/json"); });

Which went through to:

static json format_embedding_response(llama_server_context &llama)
{
    return json{
        {"embedding", llama.getEmbedding()},
    };
}

Which went to:

std::vector<float> getEmbedding()
    {
        static const int n_embd = llama_n_embd(model);
        if (!params.embedding)
        {
            LOG_WARNING("embedding disabled", {
                                                  {"params.embedding", params.embedding},
                                              });
            return std::vector<float>(n_embd, 0.0f);
        }
        const float *data = llama_get_embeddings(ctx);
        std::vector<float> embedding(data, data + n_embd);
        return embedding;
    }
};

That's what I'd expect, since it matches the behaviour of the command line in calling llama_get_embeddings - https://github.com/ggerganov/llama.cpp/blob/207b51900e15cc7f89763a3bb1c565fe11cbb45d/examples/embedding/embedding.cpp#L91C31-L91C51

After the commit

What it appears to be doing is adding a task to a queue by calling request_completion, but not specifying that the task should be an embedding task.

https://github.com/ggerganov/llama.cpp/blob/07178c98e1b61a5e2af39d347add12e7eb9e08e1/examples/server/server.cpp#L2436-L2438

The change accidentally makes the server always respond with embeddings if it is started with the --embedding param:


                // prompt evaluated for embedding
                if (params.embedding)
                {
                    send_embedding(slot);
                    slot.release();
                    slot.i_batch = -1;
                    return true;
                }

What I think should have happened is that:

  • The task_server struct needs an additional field called embedding_mode which is set to true by the /embedding endpoint, and defaults to false by the others.
  • The if (params.embedding) check needs changing to use slot.embedding_mode instead (see the sketch below).
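A minimal sketch of that idea (field name and placement are illustrative, not the actual patch):

// Illustrative only: carry an explicit per-request flag instead of relying on
// the server-wide params.embedding setting.
struct task_server
{
    // ... existing fields ...
    bool embedding_mode = false; // set to true only by the /embedding handler
};

// In the slot-processing loop, decide per slot rather than per server:
if (slot.embedding_mode)
{
    send_embedding(slot);
    slot.release();
    slot.i_batch = -1;
    return true;
}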

Possible accidental removal of feature?

It looks to me like the feature was accidentally removed in the refactor.

@ggerganov
Member

Yes, very likely it was accidentally removed.
PRs welcome to restore the behavior

@a-h
Contributor

a-h commented Oct 31, 2023

Thanks, the Nix flake made it very easy to get a development environment and build setup together.

@qiisziilbash

qiisziilbash commented Feb 26, 2024

Where can I see which embedding model is being used? Or how can I pass a different embedding model (different from the chat completion one)?
I read the README files about the --embedding flag but could not find answers to my questions.
Thank you

@phymbert
Collaborator

phymbert commented Feb 26, 2024

Where can I see what embedding model is being used? or how can I pass a different embedding model? I did read the Readme files to use --embedding flag but could not find answers to my questions. Thank you

The server only supports one model, which is used both for embeddings and completions. If you need two different models, you need two servers. Feel free to open an enhancement issue if you would like to have different models for embeddings and completions.
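For example, a rough sketch of that two-server setup (model file names and ports are placeholders):

./server -m chat-model.gguf --port 8080
./server -m embedding-model.gguf --embedding --port 8081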

@temetski

Hi, I'm running 01245f5 right now and I get the error

{"error":{"code":501,
    "message":"This server does not support completions. Start it without `--embeddings`",
    "type":"not_supported_error"}
}

when starting llama-server with --embeddings. Is this another regression or has there been a shift in llama.cpp's workflow?

@a-h
Contributor

a-h commented Jul 30, 2024

Looks like an accident, based on reading the commit where it was changed: c3ebcfa#diff-87355a1a297a9f0fdc86af5e2a59cae153290f58d68822cd10c30fee4f7f7076L2005

Looks like it's supposed to check that all the requests in a batch are either embedding or completion, not to prevent use of embeddings and completion within the same server process.

I haven't looked into it more than that, but it doesn't look right at first glance.

@a-h
Contributor

a-h commented Jul 30, 2024

Although, reading #8420, it looks like the workflow has changed: to enable both embedding and completion, you must now omit the --embedding flag, i.e. --embedding now means "embedding only".

Reading through the rest of the comments, it seems the docs are being updated by @okigan.

@temetski

Thanks @a-h! You're right, the embeddings endpoint does work without --embeddings. I think it's just a matter of updating the documentation for how the flag works.

@edwin0cheng

Could anyone tell me how to enable embedding AND completion at the same time in the current code base?
I tried:

  1. With --embedding, only /embedding works, but /completion fails with a 501 code.
  2. Without --embedding, only /completion works, but /embedding fails with a 501 code.

@ggerganov
Member

@edwin0cheng After #10135, running without the --embedding flag should enable both the /completion and /embedding endpoints.
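For example, something like the following should then work against a single server (model path and prompt are placeholders):

./llama-server -m model.gguf -c 4096

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 16}'

curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"content": "Building a website can be done in 10 simple steps:"}'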

@edwin0cheng
Copy link

I just tested it and it works now, thank you.
