## Summary
Adds a full named cached content lifecycle (create, list, retrieve, update, delete) for Gemini (Google AI Studio) and Vertex AI, exposes it through the HTTP transport layer, and extends the `Provider` interface so all existing providers satisfy it with explicit "unsupported" stubs. Also fixes Bedrock Converse and Vertex-Anthropic to auto-fetch and inline remote URL images and documents (both APIs only accept inline base64 bytes), and adds a JSON unmarshal normalizer that transparently converts Anthropic-style `{type:"document", source:{...}}` content blocks into bifrost's canonical `{type:"file", file:{...}}` shape.
## Changes
- **`core/schemas/cachedcontents.go`** — New file defining `CachedContentObject` and the five request/response pairs (`BifrostCachedContentCreate/List/Retrieve/Update/Delete Request/Response`). TTL and `expireTime` are documented as mutually exclusive.
- **`core/schemas/bifrost.go`** — Added five new `RequestType` constants, wired the new request/response structs into `BifrostRequest` / `BifrostResponse`, and extended `GetRequestFields` to cover all five operations.
- **`core/schemas/provider.go`** — Added `CachedContentCreate`, `CachedContentList`, `CachedContentRetrieve`, `CachedContentUpdate`, and `CachedContentDelete` to the `Provider` interface.
- **`core/bifrost.go`** — Added five public `CachedContent*Request` methods on `Bifrost` with nil/empty-field guards, and wired the new request types into `handleProviderRequest`.
- **`core/providers/gemini/cachedcontents.go`** — Full implementation against Google AI Studio's `/v1beta/cachedContents` endpoints. List/retrieve/update/delete iterate over keys and return on first success. `normalizeCachedContentName` ensures the `cachedContents/` prefix is always present.
- **`core/providers/vertex/cachedcontents.go`** — Full implementation against Vertex AI's `/v1/projects/{p}/locations/{l}/cachedContents` endpoints. `expandVertexCachedContentName` and `expandVertexModelPath` rewrite short IDs to full resource paths. OAuth bearer tokens are applied per-request via the existing `getAuthTokenSource` helper.
- **`core/providers/{anthropic,azure,bedrock,cerebras,cohere,elevenlabs,fireworks,groq,huggingface,mistral,nebius,ollama,openai,openrouter,parasail,perplexity,replicate,runway,sgl,vllm,xai}/cachedcontents.go`** — Unsupported-operation stubs for every other provider so the interface is satisfied.
- **`core/providers/utils/fetch.go`** — New `FetchAndEncodeURL` utility: downloads a remote resource with a 20 s timeout and 25 MiB cap, returns the response `Content-Type` and base64-encoded body. Used by Bedrock and Vertex-Anthropic converters.
- **`core/providers/bedrock/utils.go`** — `convertImageToBedrockSource` now fetches remote `http(s)://` image URLs and inlines them instead of rejecting them. `convertContentBlock` gains the same fetch-and-inline path for URL-sourced document blocks, with `Content-Type`-driven format detection.
- **`core/providers/vertex/vertex.go`** — `inlineDocumentURLs` pre-processes chat requests for Anthropic-on-Vertex, replacing URL document sources with fetched base64 bytes before the Anthropic converter runs. Called in both `ChatCompletion` and `ChatCompletionStream`.
- **`core/schemas/chatcompletions.go`** — `ChatContentBlock.UnmarshalJSON` now transparently rewrites Anthropic-style `{type:"document", source:{...}}` blocks to `{type:"file", file:{...}}` using `gjson`/`sjson`, covering `base64`, `text`, `url`, and `file` source variants. Sibling fields (`citations`, `cache_control`, `cachePoint`, `title`) are preserved.
- **`transports/bifrost-http/integrations/router.go`** — Added `CachedContentRequest` wrapper struct, five converter/response-converter function types, corresponding `RouteConfig` fields, route-type detection, and `handleCachedContentRequest` dispatch method.
- **`transports/bifrost-http/integrations/genai.go`** — `CreateGenAICachedContentRouteConfigs` registers `POST/GET/PATCH/DELETE /v1beta/cachedContents[/{cached_id}]` routes under the `/genai` prefix with pre-callbacks for path and query parameter extraction.
- **`transports/bifrost-http/integrations/openai.go`** — Added `/responses` path detection and `OpenAIResponsesRequest` type dispatch so the Responses API route is handled correctly alongside chat completions; added `ResponsesResponseConverter` and `ResponsesStreamResponseConverter` to the route config.
- **Docs / harness** — `test-harness-coverage.mdx` rewritten as per-provider tables with `✅*` for preview-gated rows and a `[PREVIEW]` tag explanation. `HARNESS_COVERAGE_BACKLOG.md` marks cached content CRUD as complete. `provider-harness.json` promotes "Gemini: list cached contents" from `[PREVIEW]` to a standard test and fixes the Anthropic skills/container body shape.
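The name handling described for the Gemini and Vertex providers can be illustrated roughly as below. The function names match those mentioned in the list, but the signatures and bodies here are assumptions made for illustration, not the code in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeCachedContentName makes sure the "cachedContents/" prefix is
// present. Assumed signature; the real helper may take other arguments.
func normalizeCachedContentName(name string) string {
	if strings.HasPrefix(name, "cachedContents/") {
		return name
	}
	return "cachedContents/" + name
}

// expandVertexCachedContentName turns a short ID (or "cachedContents/<id>")
// into a full Vertex resource path. Names that are already full paths are
// returned unchanged. The project and location parameters are hypothetical.
func expandVertexCachedContentName(project, location, name string) string {
	if strings.HasPrefix(name, "projects/") {
		return name
	}
	id := strings.TrimPrefix(name, "cachedContents/")
	return fmt.Sprintf("projects/%s/locations/%s/cachedContents/%s", project, location, id)
}

func main() {
	fmt.Println(normalizeCachedContentName("abc123"))
	fmt.Println(expandVertexCachedContentName("my-proj", "us-central1", "abc123"))
}
```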
## Type of change
- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI
## Affected areas
- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [x] Docs
## How to test
```sh
# Build and unit tests
go build ./...
go test ./...
# End-to-end harness (requires GENAI_API_KEY and VERTEX credentials)
make run-provider-harness-test
# Include preview-gated tests (cached content reference, MCP, preview deployments)
make run-provider-harness-test INCLUDE_PREVIEW=1
```
To exercise the cached content lifecycle directly:
```sh
# Create a cached content (Gemini)
curl -X POST http://localhost:8080/genai/v1beta/cachedContents \
  -H "x-goog-api-key: $GENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"models/gemini-2.5-flash","contents":[...],"ttl":"3600s"}'
# List
curl http://localhost:8080/genai/v1beta/cachedContents \
  -H "x-goog-api-key: $GENAI_API_KEY"
# Retrieve / Update / Delete follow the same /v1beta/cachedContents/{id} path
```
## Breaking changes
- [x] Yes
- [ ] No
The `Provider` interface gains five new methods. Any external implementation of `Provider` must add stubs for `CachedContentCreate`, `CachedContentList`, `CachedContentRetrieve`, `CachedContentUpdate`, and `CachedContentDelete`. The provided unsupported-operation pattern can be copied directly from any of the stub files.
## Security considerations
`FetchAndEncodeURL` issues outbound HTTP requests to URLs supplied by callers, which makes it a potential SSRF vector. Each fetch is bounded by a 20 s timeout and a 25 MiB body cap to limit the blast radius, but operators running bifrost in environments with strict egress controls should be aware that URL-sourced document and image inputs for Bedrock and Vertex-Anthropic now trigger outbound fetches from the bifrost process.
## Checklist
- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable