Commit 062aff8

refactor: inherit provider keys from global client in semanticcache plugin (#3079)
## Summary

The semantic cache plugin previously managed its own internal Bifrost client and required API keys to be specified directly in the plugin config. This PR removes that self-contained client and replaces it with an injected `EmbeddingRequestExecutor`, a function reference pointing to the global Bifrost client's `EmbeddingRequest` method. As a result, provider API keys are inherited automatically from the global provider configuration and no longer need to be (or can be) specified inside the plugin config.

## Changes

- Removed the `Keys []schemas.Key` field from `semanticcache.Config` and all related JSON unmarshalling, test fixtures, and UI type definitions.
- Removed the internal `PluginAccount` struct and the self-contained `bifrost.Bifrost` client that the plugin previously initialized for embedding requests.
- Introduced `EmbeddingRequestExecutor` as a function type (`func(*schemas.BifrostContext, *schemas.BifrostEmbeddingRequest) (*schemas.BifrostEmbeddingResponse, *schemas.BifrostError)`) and added `SetEmbeddingRequestExecutor` on the plugin to wire it to the global client at startup and on plugin reload.
- The server's `Bootstrap` and `ReloadPlugin` paths now call `SetEmbeddingRequestExecutor(s.Client.EmbeddingRequest)` after the plugin is loaded.
- Embedding requests generated by the plugin now run in a child context with `BifrostContextKeySkipPluginPipeline` set to `true`, preventing recursive plugin pipeline execution.
- Renamed `AddProviderKeysToSemanticCacheConfig` → `ValidateSemanticCacheConfig` to reflect that the function no longer injects keys; it only validates that the referenced provider exists in the global config.
- Removed `RemoveProviderKeysFromSemanticCacheConfig` entirely, since keys are no longer stored in the plugin config.
- Removed the `keys` field from `config.schema.json` and the UI type system (`CacheConfig`, `EditorCacheConfig`, `DirectCacheConfig`, `ProviderBackedCacheConfig`, and Zod schemas).
- Updated documentation to reflect that keys are inherited automatically, clarify UI configuration steps, and correct the direct-only mode setup instructions (omit `provider` and `embedding_model`, not `keys`).
- Updated tests to remove `Keys` from all config fixtures and adjust `TestInvalidProviderRejection` to expect `Init` to succeed (provider validation now happens at request time via the global client).

## Type of change

- [ ] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (React)
- [x] Docs

## How to test

```sh
# Core/Transports
go test ./plugins/semanticcache/...
go test ./transports/bifrost-http/...

# UI
cd ui
pnpm i
pnpm build
```

1. Configure a provider (e.g. OpenAI) in `config.json` without specifying `keys` inside the semantic cache plugin config.
2. Enable the semantic cache plugin via the UI or `config.json`.
3. Send requests through Bifrost and confirm that semantic cache hits are returned using the global provider's keys.
4. Confirm that specifying `keys` inside the plugin config is silently ignored (field no longer exists).

## Breaking changes

- [x] Yes
- [ ] No

The `Keys` field has been removed from `semanticcache.Config`. Any existing `config.json` files or Go code that set `Keys` inside the semantic cache plugin config must remove that field. Keys are now inherited automatically from the global provider configuration and no longer need to be specified.

## Security considerations

API keys are no longer duplicated into the plugin config or persisted to the config store as part of the plugin's configuration blob. This reduces the surface area for accidental key exposure in stored configs or API responses.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
1 parent 02d0079 commit 062aff8

15 files changed

Lines changed: 1173 additions & 1206 deletions

docs/features/semantic-caching.mdx

Lines changed: 24 additions & 15 deletions
@@ -118,7 +118,6 @@ import (
 cacheConfig := &semanticcache.Config{
     // Embedding model configuration (Required)
     Provider:       schemas.OpenAI,
-    Keys:           []schemas.Key{{Value: "sk-..."}},
     EmbeddingModel: "text-embedding-3-small",
     Dimension:      1536,
 
@@ -155,22 +154,32 @@ bifrostConfig := schemas.BifrostConfig{
 
 ![Semantic Cache Plugin Configuration](../media/ui-semantic-cache-config.png)
 
-**Note**: Make sure you have a vector store setup (using `config.json`) before configuring the semantic cache plugin.
+**Prerequisites**: A vector store must be configured and enabled in `config.json`, and at least one provider must be configured, before the toggle becomes available.
 
-1. **Navigate to Settings**
-- Open Bifrost UI at `http://localhost:8080`
-- Go to Settings.
+1. **Navigate to the Config page** in the Bifrost UI and find the **Plugins** section.
 
-2. **Configure Semantic Cache Plugin**
+2. **Toggle** the **Enable Semantic Caching** switch to enable it. The configuration form expands below.
 
-- Toggle the plugin switch to enable it, and fill in the required fields.
+3. **Fill in the fields** across the four sections:
 
-**Required Fields:**
-- **Provider**: The provider to use for caching.
-- **Embedding Model**: The embedding model to use for caching.
-- **Dimension**: The embedding dimension for the configured embedding model.
+**Provider and Model Settings** (required for semantic mode):
+- **Configured Providers**: Dropdown of providers already set up in Bifrost. The selected provider's API keys are inherited automatically.
+- **Embedding Model**: The embedding model to use (e.g. `text-embedding-3-small`).
 
-**Note**: Changes will need a restart of the Bifrost server to take effect, because the plugin is loaded on startup only.
+**Cache Settings**:
+- **TTL (seconds)**: How long cached responses are kept (default: 300 s).
+- **Similarity Threshold**: Cosine similarity cutoff for a cache hit (0–1, default: 0.8).
+- **Dimension**: Vector dimension matching your embedding model (e.g. 1536 for `text-embedding-3-small`).
+
+**Conversation Settings**:
+- **Conversation History Threshold**: Skip caching when the conversation has more than this many messages (default: 3).
+- **Exclude System Prompt** (toggle): Exclude system messages from cache-key generation.
+
+**Cache Behavior**:
+- **Cache by Model** (toggle): Include the model name in the cache key (default: on).
+- **Cache by Provider** (toggle): Include the provider name in the cache key (default: on).
+
+4. Click **Save**. Changes are persisted and applied immediately for enabled plugins via the API reload path; other plugin changes (e.g. via `config.json`) may still require a restart.
 
 </Tab>
 
@@ -202,7 +211,7 @@ bifrostConfig := schemas.BifrostConfig{
 }
 ```
 
-> **Note**: In `config.json` setups, provider keys are taken from the provider config on initialization, so you do not need to duplicate `keys` inside the plugin config. Any updates to the provider keys will not be reflected until next restart.
+> **Note**: Provider API keys are inherited automatically from the global provider configuration. You do not need to (and cannot) specify keys inside the plugin config.
 
 **TTL Format Options:**
 - Duration strings: `"30s"`, `"5m"`, `"1h"`, `"24h"`
@@ -228,7 +237,7 @@ Exact-match direct entries are stored and retrieved using a deterministic cache
 
 ### Setup
 
-To enable direct-only mode globally, set `dimension: 1` and omit the `provider` and `keys` fields from the plugin config. The plugin will automatically fall back to direct search only.
+To enable direct-only mode globally, set `dimension: 1` and omit the `provider` and `embedding_model` fields from the plugin config. The plugin will automatically fall back to direct search only.
 
 > **Important**: If you specify `dimension: 1` and also provide a `provider`, Bifrost treats the config as provider-backed semantic mode, not direct-only mode. To use direct-only mode, omit the `provider` field entirely.
 
@@ -246,7 +255,7 @@ import (
 )
 
 cacheConfig := &semanticcache.Config{
-    // No Provider, Keys, or EmbeddingModel -- direct hash mode only
+    // No Provider or EmbeddingModel -- direct hash mode only
     Dimension: 1, // Placeholder; entries are stored as metadata-only (no embedding vectors). Change dimension before switching to dual-layer mode to avoid mixed-dimension issues.
 
     TTL: 5 * time.Minute,
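
The mode rules stated in the documentation hunks above (direct-only when `provider` is omitted and `dimension` is 1, a fallback to direct search when `provider` is missing otherwise, semantic mode whenever a provider is set) can be summarized as a small decision function. This is only a sketch of those rules: `CacheMode`, its constants, and `resolveMode` are hypothetical names that do not appear in the plugin.

```go
package main

import "fmt"

// CacheMode and its values are hypothetical labels for the three outcomes.
type CacheMode string

const (
	ModeDirectOnly     CacheMode = "direct-only"     // provider omitted, dimension 1
	ModeDirectFallback CacheMode = "direct-fallback" // provider omitted, any other dimension
	ModeSemantic       CacheMode = "semantic"        // provider set, regardless of dimension
)

// resolveMode applies the documented rules in order. Note that a provider
// combined with dimension 1 still resolves to semantic mode.
func resolveMode(provider string, dimension int) CacheMode {
	switch {
	case provider == "" && dimension == 1:
		return ModeDirectOnly
	case provider == "":
		return ModeDirectFallback
	default:
		return ModeSemantic
	}
}

func main() {
	fmt.Println(resolveMode("", 1))
	fmt.Println(resolveMode("", 1536))
	fmt.Println(resolveMode("openai", 1))
}
```

The third call illustrates the case the **Important** note warns about: supplying a provider together with `dimension: 1` does not select direct-only mode.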

plugins/semanticcache/main.go

Lines changed: 48 additions & 101 deletions
@@ -15,7 +15,6 @@ import (
 
 	bifrost "github.com/maximhq/bifrost/core"
 	"github.com/maximhq/bifrost/core/schemas"
-	"github.com/maximhq/bifrost/framework"
 	"github.com/maximhq/bifrost/framework/vectorstore"
 )
 
@@ -25,7 +24,6 @@ import (
 type Config struct {
 	// Embedding Model settings - REQUIRED for semantic caching
 	Provider       schemas.ModelProvider `json:"provider"`
-	Keys           []schemas.Key         `json:"keys"`
 	EmbeddingModel string                `json:"embedding_model,omitempty"` // Model to use for generating embeddings (optional)
 
 	// Plugin behavior settings
@@ -48,19 +46,18 @@ type Config struct {
 func (c *Config) UnmarshalJSON(data []byte) error {
 	// Define a temporary struct to avoid infinite recursion
 	type TempConfig struct {
-		Provider                     string        `json:"provider"`
-		Keys                         []schemas.Key `json:"keys"`
-		EmbeddingModel               string        `json:"embedding_model,omitempty"`
-		CleanUpOnShutdown            bool          `json:"cleanup_on_shutdown,omitempty"`
-		Dimension                    int           `json:"dimension"`
-		TTL                          interface{}   `json:"ttl,omitempty"`
-		Threshold                    float64       `json:"threshold,omitempty"`
-		VectorStoreNamespace         string        `json:"vector_store_namespace,omitempty"`
-		DefaultCacheKey              string        `json:"default_cache_key,omitempty"`
-		ConversationHistoryThreshold int           `json:"conversation_history_threshold,omitempty"`
-		CacheByModel                 *bool         `json:"cache_by_model,omitempty"`
-		CacheByProvider              *bool         `json:"cache_by_provider,omitempty"`
-		ExcludeSystemPrompt          *bool         `json:"exclude_system_prompt,omitempty"`
+		Provider                     string      `json:"provider"`
+		EmbeddingModel               string      `json:"embedding_model,omitempty"`
+		CleanUpOnShutdown            bool        `json:"cleanup_on_shutdown,omitempty"`
+		Dimension                    int         `json:"dimension"`
+		TTL                          interface{} `json:"ttl,omitempty"`
+		Threshold                    float64     `json:"threshold,omitempty"`
+		VectorStoreNamespace         string      `json:"vector_store_namespace,omitempty"`
+		DefaultCacheKey              string      `json:"default_cache_key,omitempty"`
+		ConversationHistoryThreshold int         `json:"conversation_history_threshold,omitempty"`
+		CacheByModel                 *bool       `json:"cache_by_model,omitempty"`
+		CacheByProvider              *bool       `json:"cache_by_provider,omitempty"`
+		ExcludeSystemPrompt          *bool       `json:"exclude_system_prompt,omitempty"`
 	}
 
 	var temp TempConfig
@@ -70,7 +67,6 @@ func (c *Config) UnmarshalJSON(data []byte) error {
 
 	// Set simple fields
 	c.Provider = schemas.ModelProvider(temp.Provider)
-	c.Keys = temp.Keys
 	c.EmbeddingModel = temp.EmbeddingModel
 	c.CleanUpOnShutdown = temp.CleanUpOnShutdown
 	c.Dimension = temp.Dimension
@@ -129,6 +125,10 @@ type StreamAccumulator struct {
 	mu sync.Mutex // Protects chunk operations
 }
 
+// EmbeddingRequestExecutor is a function that executes a request and returns a response and an error.
+// It maps to .EmbeddingRequest() of the bifrost client.
+type EmbeddingRequestExecutor func(ctx *schemas.BifrostContext, req *schemas.BifrostEmbeddingRequest) (*schemas.BifrostEmbeddingResponse, *schemas.BifrostError)
+
 // Plugin implements the schemas.LLMPlugin interface for semantic caching.
 // It caches responses using a two-tier approach: direct hash matching for exact requests
 // and semantic similarity search for related content. The plugin supports configurable caching behavior
@@ -139,12 +139,12 @@ type StreamAccumulator struct {
 //   - config: Plugin configuration including semantic cache and caching settings
 //   - logger: Logger instance for plugin operations
 type Plugin struct {
-	store              vectorstore.VectorStore
-	config             *Config
-	logger             schemas.Logger
-	client             *bifrost.Bifrost
-	streamAccumulators sync.Map // Track stream accumulators by request ID
-	waitGroup          sync.WaitGroup
+	store                    vectorstore.VectorStore
+	config                   *Config
+	logger                   schemas.Logger
+	embeddingRequestExecutor EmbeddingRequestExecutor
+	streamAccumulators       sync.Map // Track stream accumulators by request ID
+	waitGroup                sync.WaitGroup
 }
 
 // Plugin constants
@@ -201,45 +201,6 @@ var VectorStoreProperties = map[string]vectorstore.VectorStoreProperties{
 	},
 }
 
-type PluginAccount struct {
-	provider schemas.ModelProvider
-	keys     []schemas.Key
-}
-
-func (pa *PluginAccount) GetConfiguredProviders() ([]schemas.ModelProvider, error) {
-	return []schemas.ModelProvider{pa.provider}, nil
-}
-
-func (pa *PluginAccount) GetKeysForProvider(ctx context.Context, providerKey schemas.ModelProvider) ([]schemas.Key, error) {
-	return pa.keys, nil
-}
-
-func (pa *PluginAccount) GetConfigForProvider(providerKey schemas.ModelProvider) (*schemas.ProviderConfig, error) {
-	return &schemas.ProviderConfig{
-		NetworkConfig:            schemas.DefaultNetworkConfig,
-		ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
-	}, nil
-}
-
-// Dependencies is a list of dependencies that the plugin requires.
-var Dependencies []framework.FrameworkDependency = []framework.FrameworkDependency{framework.FrameworkDependencyVectorStore}
-
-// ProvidersWithEmbeddingSupport lists all providers that support embedding operations.
-// Providers not in this list will return UnsupportedOperationError for embedding requests.
-var ProvidersWithEmbeddingSupport = map[schemas.ModelProvider]bool{
-	schemas.OpenAI:      true,
-	schemas.Azure:       true,
-	schemas.Bedrock:     true,
-	schemas.Cohere:      true,
-	schemas.Gemini:      true,
-	schemas.Vertex:      true,
-	schemas.Mistral:     true,
-	schemas.Ollama:      true,
-	schemas.Nebius:      true,
-	schemas.HuggingFace: true,
-	schemas.SGL:         true,
-}
-
 const (
 	CacheKey    schemas.BifrostContextKey = "semantic_cache_key" // To set the cache key for a request - REQUIRED for all requests
 	CacheTTLKey schemas.BifrostContextKey = "semantic_cache_ttl" // To explicitly set the TTL for a request
@@ -323,26 +284,8 @@ func Init(ctx context.Context, config *Config, logger schemas.Logger, store vect
 
 	if config.Provider == "" && config.Dimension == 1 {
 		logger.Info(PluginLoggerPrefix + " Starting in direct-only mode (dimension=1, no embedding provider)")
-	} else if config.Provider == "" || len(config.Keys) == 0 {
-		logger.Warn(PluginLoggerPrefix + " Incomplete semantic mode config: missing provider or keys, falling back to direct search only")
-	} else {
-		// Validate that the provider supports embeddings
-		if bifrost.IsStandardProvider(config.Provider) && !ProvidersWithEmbeddingSupport[config.Provider] {
-			return nil, fmt.Errorf("provider '%s' does not support embedding operations required for semantic cache. Supported providers: openai, azure, bedrock, cohere, gemini, vertex, mistral, ollama, nebius, huggingface, sgl. Note: custom providers based on embedding-capable providers are also supported", config.Provider)
-		}
-
-		bifrost, err := bifrost.Init(ctx, schemas.BifrostConfig{
-			Logger: logger,
-			Account: &PluginAccount{
-				provider: config.Provider,
-				keys:     config.Keys,
-			},
-		})
-		if err != nil {
-			return nil, fmt.Errorf("failed to initialize bifrost for semantic cache: %w", err)
-		}
-
-		plugin.client = bifrost
+	} else if config.Provider == "" {
+		logger.Warn(PluginLoggerPrefix + " Incomplete semantic mode config: missing provider, falling back to direct search only")
 	}
 
 	createCtx, cancel := context.WithTimeout(ctx, CreateNamespaceTimeout)
@@ -378,19 +321,6 @@ func (plugin *Plugin) HTTPTransportStreamChunkHook(ctx *schemas.BifrostContext,
 	return chunk, nil
 }
 
-func (plugin *Plugin) clearRequestScopedContext(ctx *schemas.BifrostContext) {
-	ctx.ClearValue(requestIDKey)
-	ctx.ClearValue(requestStorageIDKey)
-	ctx.ClearValue(requestHashKey)
-	ctx.ClearValue(requestParamsHashKey)
-	ctx.ClearValue(requestModelKey)
-	ctx.ClearValue(requestProviderKey)
-	ctx.ClearValue(requestEmbeddingKey)
-	ctx.ClearValue(requestEmbeddingTokensKey)
-	ctx.ClearValue(isCacheHitKey)
-	ctx.ClearValue(cacheHitTypeKey)
-}
-
 // PreLLMHook is called before a request is processed by Bifrost.
 // It performs a two-stage cache lookup: first direct hash matching, then semantic similarity search.
 // Uses UUID-based keys for entries stored in the VectorStore.
@@ -465,7 +395,7 @@ func (plugin *Plugin) PreLLMHook(ctx *schemas.BifrostContext, req *schemas.Bifro
 		}
 	}
 
-	if performSemanticSearch && plugin.client != nil {
+	if performSemanticSearch && plugin.embeddingRequestExecutor != nil {
 		if req.EmbeddingRequest != nil || req.TranscriptionRequest != nil {
 			plugin.logger.Debug(PluginLoggerPrefix + " Skipping semantic search for embedding/transcription input")
 			// For vector stores that require vectors, set a zero vector placeholder
@@ -488,7 +418,7 @@ func (plugin *Plugin) PreLLMHook(ctx *schemas.BifrostContext, req *schemas.Bifro
 		if shortCircuit != nil {
 			return req, shortCircuit, nil
 		}
-	} else if !performSemanticSearch && plugin.store.RequiresVectors() && plugin.client != nil {
+	} else if !performSemanticSearch && plugin.store.RequiresVectors() && plugin.embeddingRequestExecutor != nil {
 		// Vector store requires vectors but we're in direct-only mode
 		// Generate embeddings for storage purposes (not for searching)
 		if req.EmbeddingRequest != nil || req.TranscriptionRequest != nil {
@@ -759,11 +689,6 @@ func (plugin *Plugin) Cleanup() error {
 	// Clean up old stream accumulators first
 	plugin.cleanupOldStreamAccumulators()
 
-	// Shutdown the internal Bifrost client used for embeddings
-	if plugin.client != nil {
-		plugin.client.Shutdown()
-	}
-
 	// Only clean up cache entries if configured to do so
 	if !plugin.config.CleanUpOnShutdown {
 		plugin.logger.Debug(PluginLoggerPrefix + " Cleanup on shutdown is disabled, skipping cache cleanup")
@@ -804,6 +729,15 @@ func (plugin *Plugin) Cleanup() error {
 	return nil
 }
 
+// SetEmbeddingRequestExecutor sets the embedding request executor for the plugin.
+// Needs to be set before the plugin is used.
+//
+// Parameters:
+//   - executor: The embedding request executor to set
+func (plugin *Plugin) SetEmbeddingRequestExecutor(executor EmbeddingRequestExecutor) {
+	plugin.embeddingRequestExecutor = executor
+}
+
 // Public Methods for External Use
 
 // ClearCacheForKey deletes cache entries for a specific cache key.
@@ -869,3 +803,16 @@ func (plugin *Plugin) ClearCacheForRequestID(requestID string) error {
 
 	return nil
 }
+
+func (plugin *Plugin) clearRequestScopedContext(ctx *schemas.BifrostContext) {
+	ctx.ClearValue(requestIDKey)
+	ctx.ClearValue(requestStorageIDKey)
+	ctx.ClearValue(requestHashKey)
+	ctx.ClearValue(requestParamsHashKey)
+	ctx.ClearValue(requestModelKey)
+	ctx.ClearValue(requestProviderKey)
+	ctx.ClearValue(requestEmbeddingKey)
+	ctx.ClearValue(requestEmbeddingTokensKey)
+	ctx.ClearValue(isCacheHitKey)
+	ctx.ClearValue(cacheHitTypeKey)
+}
