InftyAI · InftyAI-Agent · Apr 22, 2025
diff --git a/chart/Chart.yaml b/chart/Chart.yaml
@@ -25,3 +25,11 @@ dependencies:
     version: "6.4.0"
     repository: "https://helm.openwebui.com/"
     condition: open-webui.enabled
+  - name: envoy-gateway
+    version: v1.3.2
+    repository: oci://docker.io/envoyproxy/gateway-helm
+    condition: envoy-gateway.enabled
+  - name: envoy-ai-gateway
+    version: v0.1.5
+    repository: oci://docker.io/envoyproxy/ai-gateway-helm
+    condition: envoy-ai-gateway.enabled
diff --git a/chart/values.global.yaml b/chart/values.global.yaml
@@ -47,3 +47,8 @@ open-webui:
     enabled: false
   redis-cluster:
     enabled: false
+
+envoy-gateway:
+  enabled: true
+envoy-ai-gateway:
+  enabled: true
diff --git a/docs/examples/envoy-ai-gateway/README.md b/docs/examples/envoy-ai-gateway/README.md
@@ -0,0 +1,101 @@
+# Envoy AI Gateway
+
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
+to handle request traffic from application clients to Generative AI services.
+
+## How to use
+
+### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm
+
+Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file. Both are enabled by default.
+
+```yaml
+envoy-gateway:
+    enabled: true
+envoy-ai-gateway:
+    enabled: true
+```
+
+Note: Envoy Gateway and Envoy AI Gateway can also be installed standalone. See [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/).
+
+### 2. Check Envoy Gateway and Envoy AI Gateway
+
+Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to be ready.
+
+Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to be ready.
+
+### 3. Basic AI Gateway example
+
+To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.
+
+The example builds a basic AI Gateway for the [qwen playground](docs/examples/llamacpp/playground.yaml).
+The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model's load-balancer Service is `qwen2-0--5b-lb`.
+- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
+- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)
+
+Wait for the gateway pod to be ready:
+
+```bash
+kubectl wait pods --timeout=2m \
+    -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
+    -n envoy-gateway-system \
+    --for=condition=Ready
+```
+
+### 4. Check Envoy AI Gateway APIs
+
+- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
+- To use the external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.
+
+See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.
+
+`$GATEWAY_URL/v1/models` lists the models available through the Envoy AI Gateway. The response looks like this:
+
+```json
+{
+  "data": [
+    {
+      "id": "some-cool-self-hosted-model",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    },
+    {
+      "id": "qwen2-0.5b",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    }
+  ],
+  "object": "list"
+}
+```
+
+`$GATEWAY_URL/v1/chat/completions` returns chat completions for the model. The request looks like this:
+
+```bash
+curl -H "Content-Type: application/json" -d '{
+        "model": "qwen2-0.5b",
+        "messages": [
+            {
+                "role": "system",
+                "content": "Hi."
+            }
+        ]
+    }' "$GATEWAY_URL/v1/chat/completions"
+```
+
+The expected response looks like this:
+
+```json
+{
+    "choices": [
+        {
+            "message": {
+                "content": "I'll be back."
+            }
+        }
+    ]
+}
+```
+
diff --git a/docs/examples/envoy-ai-gateway/basic.yaml b/docs/examples/envoy-ai-gateway/basic.yaml
@@ -0,0 +1,54 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: GatewayClass
+metadata:
+  name: envoy-ai-gateway-basic
+spec:
+  controllerName: gateway.envoyproxy.io/gatewayclass-controller
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: envoy-ai-gateway-basic
+  namespace: default
+spec:
+  gatewayClassName: envoy-ai-gateway-basic
+  listeners:
+    - name: http
+      protocol: HTTP
+      port: 80
+---
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIGatewayRoute
+metadata:
+  name: envoy-ai-gateway-basic
+  namespace: default
+spec:
+  schema:
+    name: OpenAI
+  targetRefs:
+    - name: envoy-ai-gateway-basic
+      kind: Gateway
+      group: gateway.networking.k8s.io
+  rules:
+
+# The configuration above is the basic setup for Envoy AI Gateway.
+# Below is an example for qwen2-0.5b: a matching rule with its backend ref, plus the AIServiceBackend.
+    - matches:
+        - headers:
+            - type: Exact
+              name: x-ai-eg-model
+              value: qwen2-0.5b
+      backendRefs:
+        - name: envoy-ai-gateway-llmaz-model-1
+---
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIServiceBackend
+metadata:
+  name: envoy-ai-gateway-llmaz-model-1
+  namespace: default
+spec:
+  schema:
+    name: OpenAI
+  backendRef:
+    name: qwen2-0--5b-lb
+    kind: Service
diff --git a/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md b/docs/examples/envoy-ai-gateway/envoy-ai-gateway.md
@@ -0,0 +1,102 @@
+# Envoy AI Gateway
+
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is an open source project for using Envoy Gateway
+to handle request traffic from application clients to Generative AI services.
+
+## How to use
+
+### 1. Enable Envoy Gateway and Envoy AI Gateway in llmaz Helm
+
+Enable Envoy Gateway and Envoy AI Gateway in the `values.global.yaml` file. Both are enabled by default.
+
+```yaml
+envoy-gateway:
+    enabled: true
+envoy-ai-gateway:
+    enabled: true
+```
+
+Note: Envoy Gateway and Envoy AI Gateway can also be installed standalone. See [Envoy Gateway installation](https://gateway.envoyproxy.io/latest/install/install-helm/) and [Envoy AI Gateway installation](https://aigateway.envoyproxy.io/docs/getting-started/).
+
+### 2. Check Envoy Gateway and Envoy AI Gateway
+
+Run `kubectl wait --timeout=5m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available` to wait for Envoy Gateway to be ready.
+
+Run `kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available` to wait for Envoy AI Gateway to be ready.
+
+### 3. Basic AI Gateway example
+
+To expose your model (Playground) through Envoy Gateway, you need to create a GatewayClass, a Gateway, and an AIGatewayRoute. The following example shows how to do this.
+
+The example builds a basic AI Gateway for the [qwen playground](docs/examples/llamacpp/playground.yaml).
+The model name is `qwen2-0.5b`, so the backend ref name is `qwen2-0--5b` and the model's load-balancer Service is `qwen2-0--5b-lb`.
+
+- Playground in [docs/examples/llamacpp/playground.yaml](docs/examples/llamacpp/playground.yaml)
+- GatewayClass in [docs/examples/envoy-ai-gateway/basic.yaml](docs/examples/envoy-ai-gateway/basic.yaml)
+
+Wait for the gateway pod to be ready:
+
+```bash
+kubectl wait pods --timeout=2m \
+    -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
+    -n envoy-gateway-system \
+    --for=condition=Ready
+```
+
+### 4. Check Envoy AI Gateway APIs
+
+- For a local test with port forwarding, use `export GATEWAY_URL="http://localhost:8080"`.
+- To use the external IP, use `export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')`.
+
+See https://aigateway.envoyproxy.io/docs/getting-started/basic-usage for more details.
+
+`$GATEWAY_URL/v1/models` lists the models available through the Envoy AI Gateway. The response looks like this:
+
+```json
+{
+  "data": [
+    {
+      "id": "some-cool-self-hosted-model",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    },
+    {
+      "id": "qwen2-0.5b",
+      "created": 1744880950,
+      "object": "model",
+      "owned_by": "Envoy AI Gateway"
+    }
+  ],
+  "object": "list"
+}
+```
+
+`$GATEWAY_URL/v1/chat/completions` returns chat completions for the model. The request looks like this:
+
+```bash
+curl -H "Content-Type: application/json" -d '{
+        "model": "qwen2-0.5b",
+        "messages": [
+            {
+                "role": "system",
+                "content": "Hi."
+            }
+        ]
+    }' "$GATEWAY_URL/v1/chat/completions"
+```
+
+The expected response looks like this:
+
+```json
+{
+    "choices": [
+        {
+            "message": {
+                "content": "I'll be back."
+            }
+        }
+    ]
+}
+```
+
diff --git a/docs/open-webui.md b/docs/open-webui.md
@@ -5,7 +5,7 @@
 ## Prerequisites
 
 - Make sure you're located in **llmaz-system** namespace, haven't tested with other namespaces.
-- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed, both of them are installed by default in llmaz.
+- Make sure [EnvoyGateway](https://github.com/envoyproxy/gateway) and [Envoy AI Gateway](https://github.com/envoyproxy/ai-gateway) are installed. Both are installed by default in llmaz. See [Envoy AI Gateway](docs/envoy-ai-gateway.md) for more details.
 
 ## How to use