Background and motivation
Multimodal workloads can send the same image content across different requests, for example:
We recently added support for encoder-cache-aware routing in EPP (llm-d/llm-d-router@489b77c), which optimizes traffic in scenarios like the above.
Requirement
Given the above use cases, it would be good to have a workload catalog for a dataset with shared multimodal content (images). The underlying work may be addressed by #498.
This issue primarily tracks translating that work into a workload catalog.
References
Related issue for the well-lit path in llm-d: llm-d/llm-d#1611