Merged
8 changes: 8 additions & 0 deletions chart/Chart.yaml
@@ -25,3 +25,11 @@ dependencies:
    version: "6.4.0"
    repository: "https://helm.openwebui.com/"
    condition: open-webui.enabled
  - name: envoy-gateway
    version: v1.3.2
    repository: oci://docker.io/envoyproxy/gateway-helm
    condition: envoy-gateway.enabled
  - name: envoy-ai-gateway
    version: v0.1.5
    repository: oci://docker.io/envoyproxy/ai-gateway-helm
    condition: envoy-ai-gateway.enabled
5 changes: 5 additions & 0 deletions chart/values.global.yaml
@@ -47,3 +47,8 @@ open-webui:
  enabled: false
redis-cluster:
  enabled: false

envoy-gateway:
  enabled: true
envoy-ai-gateway:
  enabled: true
101 changes: 101 additions & 0 deletions docs/examples/envoy-ai-gateway/README.md
@@ -0,0 +1,101 @@
# Envoy AI Gateway

[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
to handle request traffic from application clients to Generative AI services.

## How to use

### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm

Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file; both are enabled by default.

```yaml
envoy-gateway:
  enabled: true
envoy-ai-gateway:
  enabled: true
```

Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/) can be done standalone.

### 2. Check Envoy Gateway and Envoy AI Gateway

Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to become ready.

Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to become ready.

### 3. Basic AI Gateway example

To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.

This example uses the [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway.
The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model load-balancing Service is `qwen2-0--5b-lb`.

- Playground: [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
- GatewayClass: [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)

Wait for the gateway pod to become ready:

```bash
kubectl wait pods --timeout=2m \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -n envoy-gateway-system \
  --for=condition=Ready
```

### 4. Check Envoy AI Gateway APIs

- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
- To use the external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.

See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.

`$GATEWAY_URL/v1/models` lists the models available through the Envoy AI Gateway. The response will look like this:

```json
{
  "data": [
    {
      "id": "some-cool-self-hosted-model",
      "created": 1744880950,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    },
    {
      "id": "qwen2-0.5b",
      "created": 1744880950,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    }
  ],
  "object": "list"
}
```
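The model list can also be consumed programmatically. Below is a minimal sketch (Python standard library only) that extracts the model IDs from a response of the shape shown above; the sample body is copied from this example rather than fetched from a live gateway:

```python
import json

# Sample /v1/models response, copied from the example above.
response_body = """
{
  "data": [
    {"id": "some-cool-self-hosted-model", "created": 1744880950,
     "object": "model", "owned_by": "Envoy AI Gateway"},
    {"id": "qwen2-0.5b", "created": 1744880950,
     "object": "model", "owned_by": "Envoy AI Gateway"}
  ],
  "object": "list"
}
"""

def model_ids(body: str) -> list[str]:
    """Extract the model IDs from an OpenAI-style model list response."""
    return [m["id"] for m in json.loads(body)["data"]]

print(model_ids(response_body))  # ['some-cool-self-hosted-model', 'qwen2-0.5b']
```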

`$GATEWAY_URL/v1/chat/completions` serves chat completions for the model. The request will look like this:

```bash
curl -H "Content-Type: application/json" -d '{
  "model": "qwen2-0.5b",
  "messages": [
    {
      "role": "system",
      "content": "Hi."
    }
  ]
}' $GATEWAY_URL/v1/chat/completions
```

Expected response will look like this:

```json
{
  "choices": [
    {
      "message": {
        "content": "I'll be back."
      }
    }
  ]
}
```
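The same call can be made from code. The following is a sketch using only the Python standard library; `GATEWAY_URL`, the model name, and the message are assumptions carried over from this example, and the actual send is commented out so the snippet does not require a running gateway:

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080"  # assumed: the port-forwarded gateway from step 4

def build_chat_request(model: str, content: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "system", "content": content}],
    }
    return urllib.request.Request(
        f"{GATEWAY_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen2-0.5b", "Hi.")
# Sending requires a running gateway:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)  # http://localhost:8080/v1/chat/completions
```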

54 changes: 54 additions & 0 deletions docs/examples/envoy-ai-gateway/basic.yaml
@@ -0,0 +1,54 @@
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy-ai-gateway-basic
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: envoy-ai-gateway-basic
  namespace: default
spec:
  gatewayClassName: envoy-ai-gateway-basic
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-basic
  namespace: default
spec:
  schema:
    name: OpenAI
  targetRefs:
    - name: envoy-ai-gateway-basic
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    # Above is the basic configuration for Envoy AI Gateway.
    # Below is the example for qwen2-0.5b: a matching rule and its AIServiceBackend.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: qwen2-0.5b
      backendRefs:
        - name: envoy-ai-gateway-llmaz-model-1
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: envoy-ai-gateway-llmaz-model-1
  namespace: default
spec:
  schema:
    name: OpenAI
  backendRef:
    name: qwen2-0--5b-lb
    kind: Service
102 changes: 102 additions & 0 deletions docs/examples/envoy-ai-gateway/envoy-ai-gateway.md
@@ -0,0 +1,102 @@
# Envoy AI Gateway

[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
to handle request traffic from application clients to Generative AI services.

## How to use

### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm

Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file; both are enabled by default.

```yaml
envoy-gateway:
  enabled: true
envoy-ai-gateway:
  enabled: true
```

Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/) can be done standalone.

### 2. Check Envoy Gateway and Envoy AI Gateway

Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to become ready.

Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to become ready.

### 3. Basic AI Gateway example

To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.

This example uses the [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway.
The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model load-balancing Service is `qwen2-0--5b-lb`.

- Playground: [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
- GatewayClass: [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)

Wait for the gateway pod to become ready:

```bash
kubectl wait pods --timeout=2m \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -n envoy-gateway-system \
  --for=condition=Ready
```

### 4. Check Envoy AI Gateway APIs

- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
- To use the external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.

See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.

`$GATEWAY_URL/v1/models` lists the models available through the Envoy AI Gateway. The response will look like this:

```json
{
  "data": [
    {
      "id": "some-cool-self-hosted-model",
      "created": 1744880950,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    },
    {
      "id": "qwen2-0.5b",
      "created": 1744880950,
      "object": "model",
      "owned_by": "Envoy AI Gateway"
    }
  ],
  "object": "list"
}
```
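The listing can also be consumed from code. A minimal sketch (Python standard library only) that pulls the model IDs out of a response of the shape shown above; the sample body is copied from this example rather than fetched from a live gateway:

```python
import json

# Sample /v1/models response, copied from the example above.
response_body = """
{
  "data": [
    {"id": "some-cool-self-hosted-model", "created": 1744880950,
     "object": "model", "owned_by": "Envoy AI Gateway"},
    {"id": "qwen2-0.5b", "created": 1744880950,
     "object": "model", "owned_by": "Envoy AI Gateway"}
  ],
  "object": "list"
}
"""

def model_ids(body: str) -> list[str]:
    """Extract the model IDs from an OpenAI-style model list response."""
    return [m["id"] for m in json.loads(body)["data"]]

print(model_ids(response_body))  # ['some-cool-self-hosted-model', 'qwen2-0.5b']
```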

`$GATEWAY_URL/v1/chat/completions` serves chat completions for the model. The request will look like this:

```bash
curl -H "Content-Type: application/json" -d '{
  "model": "qwen2-0.5b",
  "messages": [
    {
      "role": "system",
      "content": "Hi."
    }
  ]
}' $GATEWAY_URL/v1/chat/completions
```

Expected response will look like this:

```json
{
  "choices": [
    {
      "message": {
        "content": "I'll be back."
      }
    }
  ]
}
```
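The same request can be issued programmatically. The following sketch uses only the Python standard library; `GATEWAY_URL`, the model name, and the message are assumptions carried over from this example, and the actual send is commented out so the snippet does not require a running gateway:

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080"  # assumed: the port-forwarded gateway from step 4

def build_chat_request(model: str, content: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "system", "content": content}],
    }
    return urllib.request.Request(
        f"{GATEWAY_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen2-0.5b", "Hi.")
# Sending requires a running gateway:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)  # http://localhost:8080/v1/chat/completions
```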

2 changes: 1 addition & 1 deletion docs/open-webui.md
@@ -5,7 +5,7 @@
## Prerequisites

- Make sure you're in the **llmaz-system** namespace; other namespaces haven't been tested.
- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed, both of them are installed by default in llmaz.
- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed; both are installed by default in llmaz. See [Envoy AI Gateway](docs/examples/envoy-ai-gateway/envoy-ai-gateway.md) for more details.

## How to use
