diff --git a/chart/Chart.yaml b/chart/Chart.yaml
index 02128132..f452fc8e 100644
--- a/chart/Chart.yaml
+++ b/chart/Chart.yaml
@@ -25,3 +25,11 @@ dependencies:
     version: "6.4.0"
     repository: "https://helm.openwebui.com/"
     condition: open-webui.enabled
+  - name: envoy-gateway
+    version: v1.3.2
+    repository: oci://docker.io/envoyproxy/gateway-helm
+    condition: envoy-gateway.enabled
+  - name: envoy-ai-gateway
+    version: v0.1.5
+    repository: oci://docker.io/envoyproxy/ai-gateway-helm
+    condition: envoy-ai-gateway.enabled
diff --git a/chart/values.global.yaml b/chart/values.global.yaml
index 8d6ed9a3..b46415a6 100644
--- a/chart/values.global.yaml
+++ b/chart/values.global.yaml
@@ -47,3 +47,8 @@ open-webui:
   enabled: false
 redis-cluster:
   enabled: false
+
+envoy-gateway:
+  enabled: true
+envoy-ai-gateway:
+  enabled: true
diff --git a/docs/examples/envoy-ai-gateway/README.md b/docs/examples/envoy-ai-gateway/README.md
new file mode 100644
index 00000000..1222dacd
--- /dev/null
+++ b/docs/examples/envoy-ai-gateway/README.md
@@ -0,0 +1,101 @@
+# Envoy AI Gateway
+
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
+to handle request traffic from application clients to Generative AI services.
+
+## How to use
+
+### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm
+
+Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file; both are enabled by default.
+
+```yaml
+envoy-gateway:
+  enabled: true
+envoy-ai-gateway:
+  enabled: true
+```
+
+Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxio.io/docs/getting-started/) can also be done standalone.
+
+### 2. Check Envoy Gateway and Envoy AI Gateway
+
+Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to be ready.
+
+Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to be ready.
+
+### 3. Basic AI Gateway example
+
+To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.
+
+Example [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway.
+The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model's load-balancing Service is `qwen2-0--5b-lb`.
+- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
+- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)
+
+Wait for the gateway pod to be ready:
+
+```bash
+kubectl wait pods --timeout=2m \
+  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
+  -n envoy-gateway-system \
+  --for=condition=Ready
+```
+
+### 4. Check Envoy AI Gateway APIs
+
+- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
+- When using an external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.
+
+See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.
+
+`$GATEWAY_URL/v1/models` lists the models available in the Envoy AI Gateway. The response will look like this:
+
+```json
+{
+  "data": [
+    {
+      "id": "some-cool-self-hosted-model",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    },
+    {
+      "id": "qwen2-0.5b",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    }
+  ],
+  "object": "list"
+}
+```
+
+`$GATEWAY_URL/v1/chat/completions` serves chat completions for the model.
+The request will look like this:
+
+```bash
+curl -H "Content-Type: application/json" -d '{
+    "model": "qwen2-0.5b",
+    "messages": [
+      {
+        "role": "system",
+        "content": "Hi."
+      }
+    ]
+  }' $GATEWAY_URL/v1/chat/completions
+```
+
+The expected response will look like this:
+
+```json
+{
+  "choices": [
+    {
+      "message": {
+        "content": "I'll be back."
+      }
+    }
+  ]
+}
+```
+
diff --git a/docs/examples/envoy-ai-gateway/basic.yaml b/docs/examples/envoy-ai-gateway/basic.yaml
new file mode 100644
index 00000000..2e2f79e1
--- /dev/null
+++ b/docs/examples/envoy-ai-gateway/basic.yaml
@@ -0,0 +1,54 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: GatewayClass
+metadata:
+  name: envoy-ai-gateway-basic
+spec:
+  controllerName: gateway.envoyproxy.io/gatewayclass-controller
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: envoy-ai-gateway-basic
+  namespace: default
+spec:
+  gatewayClassName: envoy-ai-gateway-basic
+  listeners:
+    - name: http
+      protocol: HTTP
+      port: 80
+---
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIGatewayRoute
+metadata:
+  name: envoy-ai-gateway-basic
+  namespace: default
+spec:
+  schema:
+    name: OpenAI
+  targetRefs:
+    - name: envoy-ai-gateway-basic
+      kind: Gateway
+      group: gateway.networking.k8s.io
+  rules:
+
+# Above is the basic configuration for Envoy AI Gateway.
+# Below is an example for qwen2-0.5b: a matching rule and its AIServiceBackend.
+    - matches:
+        - headers:
+            - type: Exact
+              name: x-ai-eg-model
+              value: qwen2-0.5b
+      backendRefs:
+        - name: envoy-ai-gateway-llmaz-model-1
+---
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIServiceBackend
+metadata:
+  name: envoy-ai-gateway-llmaz-model-1
+  namespace: default
+spec:
+  schema:
+    name: OpenAI
+  backendRef:
+    name: qwen2-0--5b-lb
+    kind: Service
\ No newline at end of file
diff --git a/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md b/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md
new file mode 100644
index 00000000..5681d61a
--- /dev/null
+++ b/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md
@@ -0,0 +1,102 @@
+# Envoy AI Gateway
+
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
+to handle request traffic from application clients to Generative AI services.
+
+## How to use
+
+### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm
+
+Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file; both are enabled by default.
+
+```yaml
+envoy-gateway:
+  enabled: true
+envoy-ai-gateway:
+  enabled: true
+```
+
+Note: [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/) can also be done standalone.
+
+### 2. Check Envoy Gateway and Envoy AI Gateway
+
+Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to be ready.
+
+Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to be ready.
+
+### 3. Basic AI Gateway example
+
+To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.
+
+Example [qwen playground](docs/examples/llamacpp/playground.yaml) configuration for a basic AI Gateway.
+The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model's load-balancing Service is `qwen2-0--5b-lb`.
+
+- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
+- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)
+
+Wait for the gateway pod to be ready:
+
+```bash
+kubectl wait pods --timeout=2m \
+  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
+  -n envoy-gateway-system \
+  --for=condition=Ready
+```
+
+### 4. Check Envoy AI Gateway APIs
+
+- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
+- When using an external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.
+
+See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.
+
+`$GATEWAY_URL/v1/models` lists the models available in the Envoy AI Gateway. The response will look like this:
+
+```json
+{
+  "data": [
+    {
+      "id": "some-cool-self-hosted-model",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    },
+    {
+      "id": "qwen2-0.5b",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    }
+  ],
+  "object": "list"
+}
+```
+
+`$GATEWAY_URL/v1/chat/completions` serves chat completions for the model. The request will look like this:
+
+```bash
+curl -H "Content-Type: application/json" -d '{
+    "model": "qwen2-0.5b",
+    "messages": [
+      {
+        "role": "system",
+        "content": "Hi."
+      }
+    ]
+  }' $GATEWAY_URL/v1/chat/completions
+```
+
+The expected response will look like this:
+
+```json
+{
+  "choices": [
+    {
+      "message": {
+        "content": "I'll be back."
+      }
+    }
+  ]
+}
+```
+
diff --git a/docs/open-webui.md b/docs/open-webui.md
index 638a2310..c673be08 100644
--- a/docs/open-webui.md
+++ b/docs/open-webui.md
@@ -5,7 +5,7 @@
 
 ## Prerequisites
 
 - Make sure you're located in **llmaz-system** namespace, haven't tested with other namespaces.
-- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed, both of them are installed by default in llmaz.
+- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed; both are installed by default in llmaz. See [Envoy AI Gateway](docs/examples/envoy-ai-gateway/README.md) for more details.
 
 ## How to use
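
A note on the naming used in the docs above: the examples map the model name `qwen2-0.5b` to the backend ref `qwen2-0--5b` and the load-balancing Service `qwen2-0--5b-lb`, which suggests that dots in model names are escaped so the resulting Kubernetes object names are valid DNS-1123 labels. The sketch below illustrates that apparent convention; the helper names are hypothetical and the dot-to-`--` rule is an assumption inferred from the example, not a documented llmaz API.

```python
# Hypothetical helpers illustrating the naming convention implied by the
# examples: Kubernetes object names cannot contain dots, so "qwen2-0.5b"
# appears as "qwen2-0--5b", and its load-balancing Service as
# "qwen2-0--5b-lb". The dot -> "--" mapping is an assumption inferred
# from the docs, not a documented llmaz API.

def backend_name_for(model_name: str) -> str:
    """Escape dots so the name can be used as a Kubernetes object name."""
    return model_name.replace(".", "--")

def lb_service_for(model_name: str) -> str:
    """The examples pair each backend with a '<backend>-lb' Service."""
    return backend_name_for(model_name) + "-lb"

if __name__ == "__main__":
    print(backend_name_for("qwen2-0.5b"))  # qwen2-0--5b
    print(lb_service_for("qwen2-0.5b"))    # qwen2-0--5b-lb
```

If this inference holds, the `x-ai-eg-model` header in the AIGatewayRoute keeps the original model name (`qwen2-0.5b`), while the AIServiceBackend's `backendRef` uses the escaped Service name.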