Merged
Changes from 1 commit
51 changes: 0 additions & 51 deletions config/manifests/bbr-example/httproute_bbr.yaml

This file was deleted.

71 changes: 0 additions & 71 deletions config/manifests/bbr-example/httproute_bbr_lora.yaml

This file was deleted.

88 changes: 0 additions & 88 deletions config/manifests/bbr-example/vllm-phi4-mini.yaml

This file was deleted.

10 changes: 10 additions & 0 deletions config/manifests/bbr/configmap.yaml
@@ -0,0 +1,10 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: vllm-llama3-8b-instruct-adapters-allowlist
  labels:
    inference-gateway.k8s.io/managed: "true"
data:
  baseModel: meta-llama/Llama-3.1-8B-Instruct
  adapters: |
    - food-review-1
@@ -42,3 +42,15 @@ spec:
resources:
requests:
cpu: 10m
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: deepseek-adapters-allowlist
  labels:
    inference-gateway.k8s.io/managed: "true"
data:
  baseModel: deepseek/vllm-deepseek-r1
  adapters: |
    - ski-resorts
    - movie-critique
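
Each allowlist ConfigMap in this change pairs one `baseModel` with an `adapters` block scalar holding one `- name` entry per line. The following is a minimal sketch of how a consumer might turn that data into a lookup from a requested model name to its base model. It is an assumption about usage, not BBR's implementation (which this diff does not show), and the names `parse_adapters` and `build_lookup` are hypothetical.

```python
"""Illustrative only: map adapter names to their base model from allowlist data."""


def parse_adapters(block: str) -> list[str]:
    """Parse the `adapters` block scalar: one '- name' entry per line."""
    names = []
    for line in block.splitlines():
        entry = line.strip()
        if not entry:
            continue
        # Drop the leading YAML list dash, if present.
        if entry.startswith("-"):
            entry = entry[1:].strip()
        if entry:
            names.append(entry)
    return names


def build_lookup(configmaps: list[dict[str, str]]) -> dict[str, str]:
    """Map every base model and every adapter name to its base model."""
    lookup: dict[str, str] = {}
    for data in configmaps:
        base = data["baseModel"]
        lookup[base] = base
        for adapter in parse_adapters(data.get("adapters", "")):
            lookup[adapter] = base
    return lookup


# The `data` sections of the two ConfigMaps added in this diff.
LLAMA = {
    "baseModel": "meta-llama/Llama-3.1-8B-Instruct",
    "adapters": "- food-review-1\n",
}
DEEPSEEK = {
    "baseModel": "deepseek/vllm-deepseek-r1",
    "adapters": "- ski-resorts\n- movie-critique\n",
}

LOOKUP = build_lookup([LLAMA, DEEPSEEK])
```

Parsing the block by hand keeps the sketch free of third-party dependencies; a real consumer would more likely hand the string to a YAML parser.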
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -67,7 +67,7 @@ nav:
   - Getting started (Released): guides/index.md
   - Getting started (Latest/Main): guides/getting-started-latest.md
   - Use Cases:
-    - Serve Multiple GenAI models: guides/serve-multiple-genai-models.md
+    - Serving Multiple Inference Pools (Latest/Main): guides/serving-multiple-inference-pools-latest.md
   - Rollout:
     - Adapter Rollout: guides/adapter-rollout.md
     - InferencePool Rollout: guides/inferencepool-rollout.md
2 changes: 1 addition & 1 deletion site-src/_includes/bbr.md
@@ -1,3 +1,3 @@
### Deploy the Body Based Router Extension (Optional)

-This guide has shown how to get started with serving a single base model type per L7 URL path. If after this exercise, you wish to continue on to exercise model-aware routing such that more than 1 base model is served at the same L7 url path, that requires use of the (optional) Body Based Routing (BBR) extension which is described in a separate section of the documentation, namely the [`Serving Multiple GenAI Models`](serve-multiple-genai-models.md) section. If you wish to exercise that function, then retain the setup you have deployed so far from this guide and move on to the additional steps described in [that guide](serve-multiple-genai-models.md) or else move on to the following section to cleanup your setup.
+This guide has shown how to get started with serving a single `InferencePool`. If after this exercise, you wish to continue on to exercise model-aware routing such that more than one `InferencePool` is served at the same L7 url path, that requires use of the (optional) Body Based Routing (BBR) extension which is described in a separate section of the documentation.
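
To make the idea of body-based routing concrete: when several pools share one L7 URL path, the path cannot tell them apart, so a router has to read the model name from the request body. Below is a minimal sketch, assuming an OpenAI-style JSON body with a top-level `model` field. The function `model_from_body` is hypothetical and does not reproduce the BBR extension's actual code.

```python
import json
from typing import Optional


def model_from_body(body: bytes) -> Optional[str]:
    """Return the top-level `model` field of a JSON request body, if any."""
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Not valid JSON: there is no model name to route on.
        return None
    if not isinstance(payload, dict):
        return None
    model = payload.get("model")
    # Accept only a non-empty string as a model name.
    return model if isinstance(model, str) and model else None


# Example: a completion request that names a LoRA adapter as its model.
REQUEST = b'{"model": "food-review-1", "prompt": "hello"}'
ROUTED_MODEL = model_from_body(REQUEST)
```

A router could then use the extracted name to choose among the pools that share the path; how BBR passes that decision on to the gateway is outside what this diff shows.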
5 changes: 0 additions & 5 deletions site-src/_includes/infobj.md

This file was deleted.

19 changes: 0 additions & 19 deletions site-src/_includes/model-server.md

This file was deleted.

4 changes: 2 additions & 2 deletions site-src/guides/getting-started-latest.md
@@ -16,8 +16,6 @@

### Deploy Sample Model Server

---8<-- "site-src/_includes/model-server-intro.md"
-
--8<-- "site-src/_includes/model-server-gpu.md"

```bash
@@ -222,6 +220,8 @@ Deploy the sample InferenceObjective which allows you to specify priority of req

--8<-- "site-src/_includes/bbr.md"

+If you wish to exercise that function, then retain the setup you have deployed so far from this guide and move on to the additional steps described in [Serving Multiple Inference Pools](serving-multiple-inference-pools-latest.md) or else move on to the following section to cleanup your setup.
+
### Cleanup

The following instructions assume you would like to cleanup ALL resources that were created in this quickstart guide.
2 changes: 0 additions & 2 deletions site-src/guides/index.md
@@ -21,8 +21,6 @@ IGW_LATEST_RELEASE=$(curl -s https://api.github.com/repos/kubernetes-sigs/gatewa

### Deploy Sample Model Server

---8<-- "site-src/_includes/model-server-intro.md"
-
--8<-- "site-src/_includes/model-server-gpu.md"

```bash