Closed
Labels
- feature: Categorizes issue or PR as related to a new feature.
- needs-kind: Indicates a PR lacks a label and requires one.
- needs-priority: Indicates a PR lacks a label and requires one.
- needs-triage: Indicates an issue or PR lacks a label and requires one.
Description
What would you like to be added:
- Integrate with Envoy AI Gateway (https://github.com/envoyproxy/ai-gateway) to get its advanced features for GenAI services.
- Support dispatching requests to the corresponding inference service. For example, a request looks like:
curl -H "Content-Type: application/json" \
-d '{
"model": "deepseek-r3",
"messages": [
{
"role": "system",
"content": "Hi."
}
]
}' \
$GATEWAY_URL/v1/chat/completions
The model name deepseek-r3 refers to the playground/inference service name. The request is routed to the corresponding service (by default, we'll create a service named <service>-lb that routes to the real workloads).
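The name-based dispatch described above can be sketched as follows. This is only an illustration of the proposed convention, not the gateway's implementation: the function `backend_service_for` and the constant `LB_SUFFIX` are hypothetical names, and the `-lb` suffix is taken from the default naming mentioned in this issue.

```python
import json

# Assumption: the per-model load-balancing Service is named "<service>-lb",
# as proposed in this issue. The constant name is hypothetical.
LB_SUFFIX = "-lb"


def backend_service_for(body: bytes) -> str:
    """Return the Service name a chat-completion request would be routed to.

    The "model" field of the request body is treated as the
    playground/inference service name.
    """
    payload = json.loads(body)
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        raise ValueError("request body must carry a non-empty 'model' string")
    return model + LB_SUFFIX


# Same body as the curl example in this issue.
request_body = json.dumps(
    {
        "model": "deepseek-r3",
        "messages": [{"role": "system", "content": "Hi."}],
    }
).encode("utf-8")

print(backend_service_for(request_body))  # deepseek-r3-lb
```

In the real integration this mapping would presumably be expressed through ai-gateway routing resources rather than application code; the sketch only shows the lookup rule.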
See the ai-gateway basic example for the resources we need to create: https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/basic.yaml
Note that the API follows the OpenAI schema: https://platform.openai.com/docs/api-reference/chat/create
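For clients that do not use curl, a request in the same OpenAI chat-completions shape could be built like this. It is a minimal sketch using only the Python standard library: `build_chat_request` is a hypothetical helper, the fallback gateway URL is a placeholder, and only the `model` and `messages` fields shown in this issue are included.

```python
import json
import os
import urllib.request


def build_chat_request(gateway_url: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the gateway's chat-completions path."""
    if not model:
        raise ValueError("'model' must be a non-empty string")
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: GATEWAY_URL is exported as in the curl example; the fallback
# value below is only a placeholder for local experiments.
gateway_url = os.environ.get("GATEWAY_URL", "http://localhost:8080")
req = build_chat_request(
    gateway_url,
    "deepseek-r3",
    [{"role": "system", "content": "Hi."}],
)
print(req.get_method(), req.full_url)
# To actually send the request: urllib.request.urlopen(req)
```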
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
- Design doc
- API change
- Docs update
The artifacts should be linked in subsequent comments.