Add envoy ai gateway #353
pkg/controller/inference/gateway.go
Outdated
| func IsAIGatewayRouteExist(ctx context.Context, client client.Client) (bool, error) { | ||
| var route aigv1a1.AIGatewayRoute | ||
| err := client.Get(ctx, types.NamespacedName{ | ||
| Name: "envoy-ai-gateway-basic", |
Do we really need to hardcode the name and namespace here? 🤔
|
In the future, we should combine the gateway configuration with Playground for an easy start. However, Envoy AI Gateway is still alpha right now, so let's have users configure Envoy themselves. What I mean is that we won't have any Envoy configuration in our code base; all we need is documentation on how to use the AI gateway. WDYT? The reason is that Envoy has a lot of configuration options, such as weights and token limits, and hardcoding them doesn't make sense. All of this should be exposed one day through Playground, as I mentioned, but not today. |
2a364a2 to
080aa64
Compare
|
I've listed my example here; however, it doesn't work. |
|
It works ... I don't know why. |
|
Once this is ready, I'll write a post for open-webui presenting it as a Kubernetes integration. |
|
It would also be another adopter of Envoy AI Gateway. |
080aa64 to
75877f1
Compare
…y ai gateway basic quick start
e934511 to
872d02b
Compare
|
Almost ready for review:
This may be a todo item. |
| sigs.k8s.io/json v0.0.0-20241014173422-cfa47c3a1cc8 // indirect | ||
| ) | ||
|
|
||
| replace github.com/google/cel-go => github.com/google/cel-go v0.22.1 |
Without this, there will be a compiler error, IIRC.
516efca to
72a42eb
Compare
| IMAGE_REPO := $(IMAGE_REGISTRY)/$(IMAGE_NAME) | ||
| GIT_TAG ?= $(shell git describe --tags --dirty --always) | ||
| GOPROXY=${GOPROXY:-""} | ||
| ifeq ($(origin GOPROXY), undefined) |
Please submit this in a separate PR; it's unrelated to this change.
| .PHONY: install | ||
| install: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config. | ||
| $(KUSTOMIZE) build config/crd | $(KUBECTL) apply -f - | ||
| $(KUSTOMIZE) build config/crd | $(KUBECTL) apply --server-side --force-conflicts -f - |
Will --force-conflicts cause other problems? We should be careful here; I'd prefer to keep only --server-side.
| @@ -0,0 +1,107 @@ | |||
| {{- if .Values.envoyAIGateway.enabled -}} | |||
I think we shouldn't include these files in the Helm chart; let's make them an example instead, because most of them should be user-defined.
| // name: qwen2-0--5b-lb # model name | ||
| // kind: Service | ||
| // port: 8080 | ||
| func CreateAIServiceBackend(ctx context.Context, client client.Client, backendRefName, namespace string, port int) error { |
I don't think we can orchestrate the Envoy AI Gateway yet, because we haven't exposed any of its configuration from Playground. For example, LLM serving is usually very slow, so we need to define a timeout in the AIServiceBackend; otherwise most requests will time out.
So my suggestion is to deploy these configurations manually and provide an example for users to follow. Once envoy-ai-gateway is mature, we'll add some fields to Playground for quick integration, similar to what we do here. So what we need is:
- an example
- a Helm dependency, enabled by default; users who want to disable the component should append the disable arguments to the install command.
72a42eb to
1fcbcff
Compare
|
Use #360 instead. This PR tried to create and update the AI gateway resources based on the current Playground. However, we may later add extra attributes to Playground, or a new resource, to do that. /close |
What this PR does / why we need it
Action Item:
Which issue(s) this PR fixes
Fixes #339
Special notes for your reviewer
Does this PR introduce a user-facing change?