Workload catalog for multimodal workloads #508

@rahulgurnani

Description

Background and motivation

Multimodal use cases can involve the same image content being sent across different requests, for example:

Recently, we added support for encoder-cache-aware routing in EPP (llm-d/llm-d-router@489b77c), which optimizes traffic in scenarios like the above.

Requirement

Given the above use cases, it would be good to have a workload catalog entry backed by a dataset of shared multimodal content (images). The underlying dataset work may be addressed by #498.
This issue primarily tracks translating that work into a workload catalog.
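
To make the requirement concrete, here is a minimal sketch of what a shared-image workload could look like. It is not taken from #498 or from any existing catalog format: the function names, the `request_id` / `image_ids` fields, and the reuse metric are all hypothetical and only illustrate the idea that requests draw images from a small common pool so that the same content recurs across requests.

```python
import random
from collections import Counter


def build_shared_image_workload(num_requests, pool_size, images_per_request, seed=0):
    """Build a list of synthetic requests that share images from a fixed pool.

    Hypothetical illustration only: the field names are assumptions,
    not an existing workload catalog schema.
    Requires images_per_request <= pool_size.
    """
    rng = random.Random(seed)  # seeded so the workload is reproducible
    pool = [f"image-{i:04d}" for i in range(pool_size)]
    requests = []
    for request_id in range(num_requests):
        # Each request picks distinct images from the shared pool, so the
        # same image shows up in many different requests.
        image_ids = rng.sample(pool, images_per_request)
        requests.append({"request_id": request_id, "image_ids": image_ids})
    return requests


def image_reuse_ratio(requests):
    """Fraction of image references that repeat an image already referenced."""
    counts = Counter(img for req in requests for img in req["image_ids"])
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1 - len(counts) / total


if __name__ == "__main__":
    workload = build_shared_image_workload(
        num_requests=100, pool_size=10, images_per_request=2
    )
    print(f"reuse ratio: {image_reuse_ratio(workload):.2f}")
```

A smaller `pool_size` relative to the total number of image references gives a higher reuse ratio, which is the knob such a catalog entry would likely need to expose to exercise encoder-cache-aware routing.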

References
Related issue for the well-lit path in llm-d: llm-d/llm-d#1611

Labels

needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
