Open
Labels
- feature: Categorizes issue or PR as related to a new feature.
- needs-priority: Indicates a PR lacks a label and requires one.
- needs-triage: Indicates an issue or PR lacks a label and requires one.
Description
What would you like to be added:
See the whole list: https://github.com/InftyAI/llmaz/milestone/3
We'll focus on three main things:
- xPyD serving with heterogeneous devices. This needs a new orchestration layer built on top of lws.
  - disaggregated PD serving
  - aggregated PD serving
- More advanced routing policies, e.g. based on request profile and GPU type
- GPU spot instance scaling that is ready for production environments
Nice to have:
- Advanced Pod scaling with a dedicated scaler
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
- Design doc
- API change
- Docs update
The artifacts should be linked in subsequent comments.