Description
Is your feature request related to a problem? Please describe.
Currently, there is no way to distinguish individual user usage when OpenShift Lightspeed (OLS) makes requests to the LLM provider.
This creates the following issues:
- Cost Allocation: We cannot track which users or teams are consuming the most tokens.
- Rate Limiting: We cannot apply rate limits per user (e.g., via LiteLLM) to prevent a single user from draining the organization's quota.
Describe the solution you'd like
We would like to add an optional tracking_header (or similar configuration) to the OLS Custom Resource (CR).
This configuration should let us specify a custom HTTP header name (e.g., x-user-id or llm-user) that OLS injects into every request it sends to the LLM provider. The header value should be populated automatically with the authenticated OpenShift user's ID or username.
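To illustrate the requested behavior, here is a minimal sketch of how the header could be built per request. The function name `build_tracking_headers` and its parameters are hypothetical and are not part of OLS today; `header_name` stands for the value of the proposed `tracking_header` setting.

```python
from typing import Optional


def build_tracking_headers(header_name: Optional[str], username: str) -> dict[str, str]:
    """Return extra headers to attach to the outbound LLM request.

    header_name: value of the proposed (hypothetical) `tracking_header`
        setting; None or an empty string leaves the feature disabled.
    username: the authenticated OpenShift user's ID or username.
    """
    if not header_name:
        # The setting is optional, so unset means no extra header.
        return {}
    # Drop CR/LF so a crafted username cannot inject additional headers.
    safe_value = username.replace("\r", "").replace("\n", "")
    return {header_name: safe_value}
```

With `tracking_header` set to `x-user-id`, a request made on behalf of user `alice` would carry `x-user-id: alice`; with the setting unset, requests are unchanged.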
Describe alternatives you've considered
- Deploying per-team OLS instances: This is operationally heavy and inefficient.
- Intermediate Proxy: Routing OLS through a custom proxy that injects the headers adds unnecessary latency and complexity.
Additional context
We use LiteLLM as a gateway to manage costs and quotas. LiteLLM supports a user parameter/header to track and limit usage by end-user ID. If OLS forwarded the OpenShift username in a header, we could apply these governance features per user.
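For comparison, the parameter form of the same idea can be sketched as below, assuming the gateway accepts an OpenAI-style chat completion body in which a `user` field identifies the end user. The helper `with_end_user` is hypothetical and only shows the shape of the payload.

```python
def with_end_user(payload: dict, username: str) -> dict:
    """Return a copy of an OpenAI-style chat payload tagged with the end user."""
    tagged = dict(payload)      # shallow copy; the caller's dict is not modified
    tagged["user"] = username   # assumed end-user identifier field
    return tagged
```

A header is preferable for this request because it would not require OLS to alter the request body it already builds for each provider.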