Skip to content

gemini named content cache support#3194

Merged
akshaydeo merged 1 commit intomainfrom
05-03-gemini_named_content_cache_support
May 5, 2026
Merged

gemini named content cache support#3194
akshaydeo merged 1 commit intomainfrom
05-03-gemini_named_content_cache_support

Conversation

@akshaydeo
Copy link
Copy Markdown
Contributor

@akshaydeo akshaydeo commented May 3, 2026

Summary

Adds a full named cached content lifecycle (create, list, retrieve, update, delete) for Gemini (Google AI Studio) and Vertex AI, exposes it through the HTTP transport layer, and extends the Provider interface so all existing providers satisfy it with explicit "unsupported" stubs. Also fixes Bedrock Converse and Vertex-Anthropic to auto-fetch and inline remote URL images and documents (both APIs only accept inline base64 bytes), and adds a JSON unmarshal normalizer that transparently converts Anthropic-style {type:"document", source:{...}} content blocks into bifrost's canonical {type:"file", file:{...}} shape.

Changes

  • core/schemas/cachedcontents.go — New file defining CachedContentObject and the five request/response pairs (BifrostCachedContentCreate/List/Retrieve/Update/Delete Request/Response). TTL and expireTime are documented as mutually exclusive.
  • core/schemas/bifrost.go — Added five new RequestType constants, wired the new request/response structs into BifrostRequest / BifrostResponse, and extended GetRequestFields to cover all five operations.
  • core/schemas/provider.go — Added CachedContentCreate, CachedContentList, CachedContentRetrieve, CachedContentUpdate, and CachedContentDelete to the Provider interface.
  • core/bifrost.go — Added five public CachedContent*Request methods on Bifrost with nil/empty-field guards, and wired the new request types into handleProviderRequest.
  • core/providers/gemini/cachedcontents.go — Full implementation against Google AI Studio's /v1beta/cachedContents endpoints. List/retrieve/update/delete iterate over keys and return on first success. normalizeCachedContentName ensures the cachedContents/ prefix is always present.
  • core/providers/vertex/cachedcontents.go — Full implementation against Vertex AI's /v1/projects/{p}/locations/{l}/cachedContents endpoints. expandVertexCachedContentName and expandVertexModelPath rewrite short IDs to full resource paths. OAuth bearer tokens are applied per-request via the existing getAuthTokenSource helper.
  • core/providers/{anthropic,azure,bedrock,cerebras,cohere,elevenlabs,fireworks,groq,huggingface,mistral,nebius,ollama,openai,openrouter,parasail,perplexity,replicate,runway,sgl,vllm,xai}/cachedcontents.go — Unsupported-operation stubs for every other provider so the interface is satisfied.
  • core/providers/utils/fetch.go — New FetchAndEncodeURL utility: downloads a remote resource with a 20 s timeout and 25 MiB cap, returns the response Content-Type and base64-encoded body. Used by Bedrock and Vertex-Anthropic converters.
  • core/providers/bedrock/utils.goconvertImageToBedrockSource now fetches remote http(s):// image URLs and inlines them instead of rejecting them. convertContentBlock gains the same fetch-and-inline path for URL-sourced document blocks, with Content-Type-driven format detection.
  • core/providers/vertex/vertex.goinlineDocumentURLs pre-processes chat requests for Anthropic-on-Vertex, replacing URL document sources with fetched base64 bytes before the Anthropic converter runs. Called in both ChatCompletion and ChatCompletionStream.
  • core/schemas/chatcompletions.goChatContentBlock.UnmarshalJSON now transparently rewrites Anthropic-style {type:"document", source:{...}} blocks to {type:"file", file:{...}} using gjson/sjson, covering base64, text, url, and file source variants. Sibling fields (citations, cache_control, cachePoint, title) are preserved.
  • transports/bifrost-http/integrations/router.go — Added CachedContentRequest wrapper struct, five converter/response-converter function types, corresponding RouteConfig fields, route-type detection, and handleCachedContentRequest dispatch method.
  • transports/bifrost-http/integrations/genai.goCreateGenAICachedContentRouteConfigs registers POST/GET/PATCH/DELETE /v1beta/cachedContents[/{cached_id}] routes under the /genai prefix with pre-callbacks for path and query parameter extraction.
  • transports/bifrost-http/integrations/openai.go — Added /responses path detection and OpenAIResponsesRequest type dispatch so the Responses API route is handled correctly alongside chat completions; added ResponsesResponseConverter and ResponsesStreamResponseConverter to the route config.
  • Docs / harnesstest-harness-coverage.mdx rewritten as per-provider tables with ✅* for preview-gated rows and a [PREVIEW] tag explanation. HARNESS_COVERAGE_BACKLOG.md marks cached content CRUD as complete. provider-harness.json promotes "Gemini: list cached contents" from [PREVIEW] to a standard test and fixes the Anthropic skills/container body shape.

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (React)
  • Docs

How to test

# Build and unit tests
go build ./...
go test ./...

# End-to-end harness (requires GENAI_API_KEY and VERTEX credentials)
make run-provider-harness-test

# Include preview-gated tests (cached content reference, MCP, preview deployments)
make run-provider-harness-test INCLUDE_PREVIEW=1

To exercise the cached content lifecycle directly:

# Create a cached content (Gemini)
curl -X POST http://localhost:8080/genai/v1beta/cachedContents \
  -H "x-goog-api-key: $GENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"models/gemini-2.5-flash","contents":[...],"ttl":"3600s"}'

# List
curl http://localhost:8080/genai/v1beta/cachedContents \
  -H "x-goog-api-key: $GENAI_API_KEY"

# Retrieve / Update / Delete follow the same /v1beta/cachedContents/{id} path

Breaking changes

  • Yes
  • No

The Provider interface gains five new methods. Any external implementation of Provider must add stubs for CachedContentCreate, CachedContentList, CachedContentRetrieve, CachedContentUpdate, and CachedContentDelete. The provided unsupported-operation pattern can be copied directly from any of the stub files.

Security considerations

FetchAndEncodeURL issues outbound HTTP requests to URLs supplied by callers. It is bounded by a 20 s timeout and a 25 MiB body cap to limit SSRF blast radius, but operators running bifrost in environments with strict egress controls should be aware that Bedrock and Vertex-Anthropic document/image URL inputs will now trigger outbound fetches from the bifrost process.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 3, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1398c3f1-b1be-476f-8450-eb922f13e540

📥 Commits

Reviewing files that changed from the base of the PR and between 2719786 and b4a6335.

📒 Files selected for processing (42)
  • core/bifrost.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/openai/cachedcontents.go
  • core/providers/openrouter/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/schemas/bifrost.go
  • core/schemas/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/schemas/provider.go
  • core/utils.go
  • docs/providers/test-harness-coverage.mdx
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • tests/e2e/api/collections/provider-harness.json
  • transports/bifrost-http/integrations/genai.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/governance/virtual-keys/page.tsx

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added cached content management (create, list, retrieve, update, delete) for Gemini and Vertex AI providers with full CRUD support.
    • Enhanced remote document and image handling across providers with secure URL fetching, context-aware timeouts, and IP validation.
  • Improvements

    • Improved document content block support in Bedrock for both base64-encoded and remote URL sources.

Walkthrough

Add typed cached-content lifecycle (create/list/retrieve/update/delete) across schemas, core API, transports, and providers; Gemini and Vertex provide full implementations, many providers add unsupported-operation stubs; router, GenAI routes, Bedrock/Vertex inlining, and URL fetch utilities updated to support cached-content flows.

Changes

Cached Content Lifecycle

Layer / File(s) Summary
Domain Schemas / Types
core/schemas/cachedcontents.go, core/schemas/bifrost.go
Add CachedContentObject, new BifrostCachedContent*Request/Response types, five RequestType constants, and extend BifrostRequest/BifrostResponse getters/setters and extra-fields plumbing.
Provider Interface & Permissions
core/schemas/provider.go
Add five cached-content methods to Provider interface and boolean flags in AllowedRequests; extend IsOperationAllowed for cached-content types.
Core API Wiring
core/bifrost.go, core/utils.go
Add Bifrost.CachedContent{Create,List,Retrieve,Update,Delete}Request public methods that validate inputs, set request type/payload and call handleRequest; extend handleProviderRequest to dispatch to provider cached-content methods; treat non-create cached-content ops as multi-key; reset clears cached-content fields; add isCachedContentRequestType.
Transport Router
transports/bifrost-http/integrations/router.go
Introduce CachedContentRequest wrapper and converter types, extend RouteConfig, short-circuit createHandler to handleCachedContentRequest, and wire response converters/post-callbacks for cached-content flows.
GenAI (Gemini) HTTP Routes
transports/bifrost-http/integrations/genai.go
Add Gemini wire-body structs for create/update, PreCallbacks to extract path/query/provider, implement CreateGenAICachedContentRouteConfigs registering POST/GET/PATCH/DELETE routes, and append these routes to the GenAI router.
Provider Implementations — Gemini (Full)
core/providers/gemini/cachedcontents.go
Full Gemini cached-content CRUD: wire request/response shapes, normalize names/models, validate ttl/expireTime, support pagination, updateMask, x-goog-api-key, perform HTTP calls, decode JSON, map into Bifrost responses, attach latency, and multi-key retry semantics.
Provider Implementations — Vertex (Full)
core/providers/vertex/cachedcontents.go, core/providers/vertex/vertex.go
Full Vertex CachedContents CRUD with OAuth auth headers, model/name expansion, error parsing, updateMask, pagination, latency; add inlineDocumentURLs to inline remote FileURL bytes into chat requests (used before Anthropic conversions).
Provider Implementations — Stubs
core/providers/*/cachedcontents.go (anthropic, azure, bedrock, cerebras, cohere, elevenlabs, fireworks, groq, huggingface, mistral, nebius, ollama, openai, openrouter, parasail, perplexity, replicate, runway, sgl, vllm, xai)
Add cached-content lifecycle methods for many providers; the majority return standardized unsupported-operation errors via providerUtils.NewUnsupportedOperationError(...).
Fetch Utility & Bedrock Inlining
core/providers/utils/fetch.go, core/providers/bedrock/utils.go, core/providers/bedrock/responses.go
FetchAndEncodeURL(ctx, ...) now accepts context, performs hardened HTTP GET (20s timeout, redirect checks, SSRF/IP filtering, 25 MiB cap) and returns media-type + base64 body. Bedrock conversion helpers now accept context.Context and inline remote image/document URLs.
Chat content JSON handling
core/schemas/chatcompletions.go
Add ChatContentBlock.UnmarshalJSON to rewrite Anthropic-style document blocks into canonical file blocks during JSON unmarshal.
OpenAI Responses Routing
transports/bifrost-http/integrations/openai.go
Map deployment suffixes to ResponsesRequest/CountTokensRequest and add response/stream converters that return raw OpenAI responses when present.
Docs & Tests
docs/providers/test-harness-coverage.mdx, tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md, tests/e2e/api/collections/provider-harness.json
Update harness docs/backlog to reflect Gemini/Vertex cached-content support; adjust Postman collection test ordering/payloads and response-shape assertions to recognize Gemini cached-contents.
Minor UI change
ui/app/workspace/governance/virtual-keys/page.tsx
Formatting changes and removal of max-w-7xl wrapper; no API changes.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client as Client
    participant Router as HTTP_Router
    participant Core as Bifrost_Core
    participant Provider as Provider
    participant External as External_API
    Client->>Router: HTTP cached-content request (e.g., POST /genai/v1beta/cachedContents)
    Router->>Core: convert → BifrostCachedContentCreateRequest
    Core->>Provider: CachedContentCreate(ctx, key, request)
    Provider->>External: HTTP call to upstream provider cachedContents endpoint
    External-->>Provider: 200 OK + JSON
    Provider-->>Core: BifrostCachedContentCreateResponse (includes latency)
    Core-->>Router: apply response converter / post-callback
    Router-->>Client: HTTP 200 + provider-formatted response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

"I nibble bytes and hum a tune,
Five new doors beneath the moon.
Gemini builds, Vertex brings light,
Stubs keep watch through quiet night.
Hop through code — the carrots wait!" 🐰✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 05-03-gemini_named_content_cache_support

@akshaydeo akshaydeo marked this pull request as ready for review May 3, 2026 15:07
Copy link
Copy Markdown
Contributor Author

akshaydeo commented May 3, 2026

@akshaydeo akshaydeo mentioned this pull request May 3, 2026
18 tasks
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 3, 2026

Confidence Score: 3/5

Not safe to merge until the PATCH routing regression and Vertex CheckOperationAllowed gaps (flagged in earlier review rounds) are resolved.

Two previously-flagged P1 issues remain unresolved in the current HEAD: (1) fasthttp.MethodPatch is absent from the RegisterRoutes switch, so the update route is silently registered as POST and all PATCH requests return 405; (2) Vertex cached-content methods skip CheckOperationAllowed. New findings in this round are P2 only (CGNAT SSRF gap, per-request transport allocation), which do not further lower the score.

transports/bifrost-http/integrations/router.go (PATCH case missing), core/providers/vertex/cachedcontents.go (CheckOperationAllowed absent), core/providers/utils/fetch.go (CGNAT range gap)

Security Review

  • FetchAndEncodeURL (core/providers/utils/fetch.go): SSRF mitigations (private-range block, dial-time IP validation, redirect scheme check) are solid, but the isPublicIP guard omits 100.64.0.0/10 (RFC 6598 CGNAT/shared-address space). Loopback, RFC-1918, link-local, and cloud-metadata addresses are all correctly blocked.
  • FetchAndEncodeURL is triggered by caller-supplied URLs in Bedrock Converse and Vertex-Anthropic request bodies. Operators with strict egress controls should audit this as documented in the PR security section.

Important Files Changed

Filename Overview
core/providers/utils/fetch.go New FetchAndEncodeURL utility with SSRF mitigations; CGNAT range 100.64.0.0/10 is not blocked and a new http.Transport is created per call (no connection reuse)
transports/bifrost-http/integrations/router.go PATCH method still missing from RegisterRoutes switch — the cached-content update route falls through to default and is registered as POST, making PATCH requests return 405
transports/bifrost-http/integrations/genai.go Five cached-content routes with response converters, PreCallbacks, and correct provider resolution; create route now has setGeminiCachedContentCreateProvider PreCallback for Vertex routing
core/providers/gemini/cachedcontents.go Full Gemini cached content lifecycle implementation with camelCase wire-shape response converters; uses url.Values.Encode for pageToken query params
core/providers/vertex/cachedcontents.go Full Vertex cached-content lifecycle; still missing CheckOperationAllowed calls (previously flagged)
core/providers/bedrock/utils.go Remote URL images and documents are now fetched and inlined for Bedrock Converse; context propagation through the full call chain is correct
core/schemas/chatcompletions.go UnmarshalJSON normalizer correctly rewrites Anthropic-style document blocks to canonical file shape; unknown source types now return an explicit error instead of silently dropping content
core/providers/vertex/vertex.go inlineDocumentURLs pre-processes file/document URL blocks for Anthropic-on-Vertex in both ChatCompletion and ChatCompletionStream paths
core/bifrost.go Five new CachedContent* public methods with nil/empty-field guards and proper wiring into handleRequest

Reviews (6): Last reviewed commit: "gemini named content cache support" | Re-trigger Greptile

Comment thread core/providers/gemini/cachedcontents.go
Comment thread core/providers/vertex/cachedcontents.go
Comment thread core/schemas/chatcompletions.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/providers/test-harness-coverage.mdx (1)

7-360: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Add the required Mintlify tab structure to this MDX page.

This page is still a single un-tabbed doc. For docs/**/*.mdx, please restructure with Web UI / API / config.json tabs (and validate any config.json examples against transports/config.schema.json).

As per coding guidelines docs/**/*.mdx: "Mintlify MDX documentation must have Web UI / API / config.json tabs; validate config.json examples against transports/config.schema.json".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/test-harness-coverage.mdx` around lines 7 - 360, The document
lacks the required Mintlify tab layout; wrap the entire MDX content (the
top-level headings like "Overview" and "Per-provider coverage") into a Mintlify
tabs component with three tabs titled "Web UI", "API", and "config.json",
placing user-facing narrative and examples under the appropriate "Web UI" or
"API" tabs and moving any JSON config examples into the "config.json" tab;
ensure every config.json code block in that tab matches
transports/config.schema.json (validate and fix example keys/types) and update
any references to preview toggles or run instructions to remain visible in the
appropriate tab.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 2316-2431: resetBifrostRequest currently doesn't clear the new
cached-content request fields leading to pooled-object leakage; update
resetBifrostRequest to explicitly nil out CachedContentCreateRequest,
CachedContentListRequest, CachedContentRetrieveRequest,
CachedContentUpdateRequest, and CachedContentDeleteRequest (and any related
cached-content subfields) on the BifrostRequest before calling pool.Put(),
ensuring all fields set by
CachedContentCreateRequest/CachedContentListRequest/CachedContentRetrieveRequest/CachedContentUpdateRequest/CachedContentDeleteRequest
are reset.
- Around line 6159-6182: The cached-content branches call
provider.CachedContent* with `keys` which may be nil because requestWorker only
populates `keys` for batch/file/container paths; update the key-selection logic
in requestWorker to also populate `keys` for cached-content requests
(schemas.CachedContentListRequest, CachedContentRetrieveRequest,
CachedContentUpdateRequest, CachedContentDeleteRequest) by extracting the key
set from the corresponding BifrostRequest payload (e.g., from
req.BifrostRequest.CachedContentListRequest.Keys /
CachedContentRetrieveRequest.Keys / CachedContentUpdateRequest.Keys /
CachedContentDeleteRequest.Keys or the equivalent field used for cached-content
requests), reuse the same multi-key resolution code path used for
batch/file/container, and ensure you set an empty slice (not nil) when no keys
are present so provider calls receive a stable, non-nil slice.

In `@core/providers/bedrock/utils.go`:
- Around line 922-926: convertContentBlock and convertImageToBedrockSource
currently perform network I/O by calling providerUtils.FetchAndEncodeURL;
refactor so these converters are pure: remove any calls to FetchAndEncodeURL
from convertContentBlock and convertImageToBedrockSource and instead perform URL
fetching in a new pre-conversion pass (e.g., a prepare/normalize step) that
resolves FileURL into an inline payload (fetched mime/type + base64 bytes) and
stores it on the block (or a separate prepared struct) before calling the
converters; update callers to run the pre-conversion fetch step and pass the
enriched blocks into convertContentBlock/convertImageToBedrockSource so those
functions only transform data without logging or HTTP calls.
- Around line 931-954: The switch over fetchedMediaType is too strict and misses
content-type parameters and case variations; normalize fetchedMediaType before
the switch (e.g., lowercase it and strip any charset/parameters by splitting at
';' and trimming) and then switch on the normalized value so assignments to
documentSource.Format and isText are correct; apply the same normalization to
the other identical switch instance that handles formats (the one mirroring
lines ~1046-1055) so both places use the cleaned content-type string.

In `@core/providers/gemini/cachedcontents.go`:
- Around line 171-180: The code builds requestURL by appending query args
including the opaque pagination token raw (request.PageToken); URL-encode the
token before appending to avoid reserved-character issues — replace the raw
concatenation "pageToken="+*request.PageToken with an encoded form using net/url
(e.g., url.QueryEscape(*request.PageToken) or use url.Values to set "pageToken"
and encode) so that queryArgs contains a safely escaped pageToken and the final
requestURL is constructed from properly encoded query parameters.

In `@core/providers/utils/fetch.go`:
- Around line 20-30: The fetch code currently performs unrestricted requests
using resourceURL with http.NewRequest and client.Do; add SSRF guards by first
parsing resourceURL and rejecting non-http/https schemes, then resolve the
target hostname (e.g., via net.LookupIP) and validate that none of its IPs fall
into disallowed CIDRs (private 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,
loopback 127.0.0.0/8, link-local 169.254.0.0/16, IPv6 equivalents, etc.); also
set client.CheckRedirect to re-validate each redirect target with the same
hostname/IP checks and reject redirects that go to disallowed addresses or
change scheme, and return an explicit error when a check fails; apply the same
validation before calling http.NewRequest and client.Do so resourceURL and all
redirect targets are blocked from internal addresses.

In `@core/providers/vertex/cachedcontents.go`:
- Around line 108-109: Add an operation-allowed guard at the start of each
public Vertex cached-content entrypoint (CachedContentCreate, CachedContentList,
CachedContentRetrieve, CachedContentUpdate, CachedContentDelete) by calling
providerUtils.CheckOperationAllowed(...) with the function context and the
matching request object (BifrostCachedContentCreateRequest,
BifrostCachedContentListRequest, BifrostCachedContentRetrieveRequest,
BifrostCachedContentUpdateRequest, BifrostCachedContentDeleteRequest) and
immediately return the resulting *schemas.BifrostError if non-nil; place this
check before any business logic (e.g., before validateVertexTTLExpireMutex in
CachedContentCreate) to mirror the Gemini pattern.
- Around line 214-224: The code appends raw request.PageToken into requestURL
which can break when the token contains reserved characters; update the
construction to URL-encode the page token (or build the query with url.Values
and call Encode()) before appending. Specifically, change the logic around
request.PageToken handling in the block that builds requestURL (references:
requestURL, vertexCachedContentBaseURL, request.PageSize, request.PageToken) to
use url.QueryEscape(*request.PageToken) or use a url.Values map and call
Encode(), and add the required net/url import if not present.

In `@core/providers/vertex/vertex.go`:
- Around line 366-391: inlineDocumentURLs performs remote fetches via
providerUtils.FetchAndEncodeURL without using the request's context, so update
the flow to propagate cancellation/deadlines: modify
providerUtils.FetchAndEncodeURL to accept a context.Context (or add a new
providerUtils.FetchAndEncodeURLWithContext) and change inlineDocumentURLs to
extract the request context (e.g., request.Context / request.Ctx) and pass it
into the fetch call for each block; ensure error handling and setting of
block.File.FileData, FileType, and niling FileURL remain unchanged.

In `@core/schemas/bifrost.go`:
- Around line 419-423: The generic mutators (SetProvider, SetModel,
SetRawRequestBody) are not handling the cached-content variants, so add handling
for BifrostCachedContentCreateRequest, BifrostCachedContentListRequest,
BifrostCachedContentRetrieveRequest, BifrostCachedContentUpdateRequest, and
BifrostCachedContentDeleteRequest in the same switch/field-dispatch that
GetRequestFields() uses; update the logic in the methods named SetProvider,
SetModel, and SetRawRequestBody (and any helper that switches on
BifrostRequest's concrete request field) to treat those five cached-content
request fields the same as other request types (rewrite provider/model or store
raw bytes) so cached-content operations no longer no-op.
- Around line 796-800: GetExtraFields and PopulateExtraFields need to be
extended to handle the new cached-content response types so generic metadata is
preserved; add switch cases for BifrostCachedContentCreateResponse,
BifrostCachedContentListResponse, BifrostCachedContentRetrieveResponse,
BifrostCachedContentUpdateResponse, and BifrostCachedContentDeleteResponse in
both functions. For GetExtraFields return the embedded ExtraFields from each
specific response (and for list response iterate items to collect/return their
ExtraFields as appropriate to your existing list handling), and for
PopulateExtraFields set the ExtraFields field on each concrete response type
using the provided map/object. Use the exact type names (e.g.,
BifrostCachedContentCreateResponse) to locate where to add the cases and mirror
the existing pattern used for other response types.

In `@core/schemas/chatcompletions.go`:
- Around line 1172-1200: The switch on srcType (the variable computed from
gjson.GetBytes(data, "source.type")) currently ignores unknown or missing values
and proceeds to rewrite the object to a file block, which can drop or corrupt
the original payload; update the logic in the document rewrite handling (the
switch over srcType that writes to out using sjson.SetBytes and reads from data)
to validate srcType first (allow only "base64","text","url","file"), and if
srcType is empty or not one of those values return a descriptive error (e.g.,
"unsupported or missing document.source.type") instead of falling through, so
the function fails fast and preserves/report original payload issues.

In `@core/schemas/provider.go`:
- Around line 639-648: The new cached-content methods (CachedContentCreate,
CachedContentList, CachedContentRetrieve, CachedContentUpdate,
CachedContentDelete) were added to the provider interface but were not added to
the allowlist checks, so a non-nil AllowedRequests causes them to be denied by
IsOperationAllowed; update the allowlist mapping used by IsOperationAllowed (and
any AllowedRequests construction in ProviderConfig/CustomProviderConfig) to
include explicit entries for these five operations and ensure the logic treats
nil AllowedRequests as "allow all" while non-nil only permits operations set to
true.

In `@tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md`:
- Line 268: The checklist entry currently marks "Cached content CRUD" as fully
covered but the test harness only exercises the list path; change the checkbox
from "[x]" to "[~]" to indicate partial coverage and update the description to
note that only the "list" path (harness `Gemini: list cached contents`) runs
against real upstream while create/retrieve/update/delete are not yet exercised
so the backlog accurately reflects coverage.

In `@transports/bifrost-http/integrations/genai.go`:
- Around line 870-894: The pre-callback extractGeminiCachedContentNameFromPath
isn't populating Name for the PATCH request body type used in the route; update
the type switch in extractGeminiCachedContentNameFromPath to also handle
*schemas.GeminiCachedContentUpdateBody (and any other route body types that
deserialize differently) and set r.Name = nameStr and r.Provider = provider for
that case so the converter receives the Name; locate the function and add the
new case alongside the existing
*schemas.BifrostCachedContentUpdateRequest/*RetrieveRequest/*DeleteRequest
branches.

In `@transports/bifrost-http/integrations/openai.go`:
- Around line 355-357: The wildcard deployment route is missing handling for the
"/responses/input_tokens" suffix which causes requests to fall through to
UnknownRequest (and later be misrouted into chat parsing); add a check alongside
the existing strings.HasSuffix(path, "/responses") case to detect
strings.HasSuffix(path, "/responses/input_tokens") and return the correct schema
(e.g., schemas.InputTokensRequest) so token-counting calls are routed correctly
instead of hitting the fallback; apply the same addition at the other similar
spots mentioned (around the blocks that currently return
schemas.ResponsesRequest / fall back to UnknownRequest).

---

Outside diff comments:
In `@docs/providers/test-harness-coverage.mdx`:
- Around line 7-360: The document lacks the required Mintlify tab layout; wrap
the entire MDX content (the top-level headings like "Overview" and "Per-provider
coverage") into a Mintlify tabs component with three tabs titled "Web UI",
"API", and "config.json", placing user-facing narrative and examples under the
appropriate "Web UI" or "API" tabs and moving any JSON config examples into the
"config.json" tab; ensure every config.json code block in that tab matches
transports/config.schema.json (validate and fix example keys/types) and update
any references to preview toggles or run instructions to remain visible in the
appropriate tab.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 97356ccb-09a9-4165-abf0-9744e7568783

📥 Commits

Reviewing files that changed from the base of the PR and between 6d4e381 and f46d289.

📒 Files selected for processing (38)
  • core/bifrost.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/bedrock/utils.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/openai/cachedcontents.go
  • core/providers/openrouter/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/schemas/bifrost.go
  • core/schemas/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/schemas/provider.go
  • docs/providers/test-harness-coverage.mdx
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • tests/e2e/api/collections/provider-harness.json
  • transports/bifrost-http/integrations/genai.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/governance/virtual-keys/page.tsx

Comment thread core/bifrost.go
Comment thread core/bifrost.go
Comment thread core/providers/bedrock/utils.go
Comment thread core/providers/bedrock/utils.go
Comment thread core/providers/gemini/cachedcontents.go Outdated
Comment thread core/schemas/chatcompletions.go
Comment thread core/schemas/provider.go
Comment thread tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md Outdated
Comment thread transports/bifrost-http/integrations/genai.go
Comment thread transports/bifrost-http/integrations/openai.go
@akshaydeo akshaydeo force-pushed the 05-03-gemini_named_content_cache_support branch from f46d289 to 3cc7c32 Compare May 3, 2026 19:50
Comment thread transports/bifrost-http/integrations/genai.go
Comment thread transports/bifrost-http/integrations/genai.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/providers/test-harness-coverage.mdx (1)

7-34: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add required Mintlify tabs (Web UI / API / config.json) to this MDX page.

This page currently uses plain markdown sections/tables only; it should include the required tabbed structure per repo docs standard.

As per coding guidelines docs/**/*.mdx: "Mintlify MDX documentation must have Web UI / API / config.json tabs; validate config.json examples against transports/config.schema.json".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/providers/test-harness-coverage.mdx` around lines 7 - 34, The MDX page
currently uses plain markdown sections/tables; update
docs/providers/test-harness-coverage.mdx to wrap the content in the required
Mintlify tabbed structure by adding the three tabs "Web UI", "API", and
"config.json" (move the existing overview and tables into the appropriate "Web
UI" or "API" tabs), add a config.json tab containing the example config
snippets, and validate those config.json examples against
transports/config.schema.json so they conform to the repo schema; ensure
headings like "Overview" and "Per-provider coverage" remain inside the correct
tabs and that the file follows the docs/**/*.mdx Mintlify tab convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/providers/test-harness-coverage.mdx`:
- Line 198: The table rows currently claim full CRUD for "Cached contents
lifecycle (`POST/GET/PATCH/DELETE /v1beta/cachedContents`)" but the harness only
shows list/reference coverage; update those entries (the row text "Cached
contents lifecycle (`POST/GET/PATCH/DELETE /v1beta/cachedContents`)") to reflect
the actual coverage (e.g., "List/Reference only" or similar) in both occurrences
so the docs no longer overstate POST/PATCH/DELETE support.

In `@tests/e2e/api/collections/provider-harness.json`:
- Line 1325: The generic response-shape assertion fails for the newly unskipped
"Gemini: list cached contents" request (path /genai/v1beta/cachedContents);
update the shared test harness function that enforces response shapes to
recognize cachedContents list responses—modify the list/whitelist (or the
detection logic) to include "cachedContents" or the route
"/genai/v1beta/cachedContents" so valid 2xx list payloads bypass the generic
object-shape assertion; locate and change the shared validation helper (e.g.,
the generic response-shape checker used by the collection runner) to accept this
endpoint as a list response.

---

Outside diff comments:
In `@docs/providers/test-harness-coverage.mdx`:
- Around line 7-34: The MDX page currently uses plain markdown sections/tables;
update docs/providers/test-harness-coverage.mdx to wrap the content in the
required Mintlify tabbed structure by adding the three tabs "Web UI", "API", and
"config.json" (move the existing overview and tables into the appropriate "Web
UI" or "API" tabs), add a config.json tab containing the example config
snippets, and validate those config.json examples against
transports/config.schema.json so they conform to the repo schema; ensure
headings like "Overview" and "Per-provider coverage" remain inside the correct
tabs and that the file follows the docs/**/*.mdx Mintlify tab convention.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4f0b6a7f-5ee1-404e-9658-c53e12468220

📥 Commits

Reviewing files that changed from the base of the PR and between f46d289 and 3cc7c32.

📒 Files selected for processing (38)
  • core/bifrost.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/bedrock/utils.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/openai/cachedcontents.go
  • core/providers/openrouter/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/schemas/bifrost.go
  • core/schemas/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/schemas/provider.go
  • docs/providers/test-harness-coverage.mdx
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • tests/e2e/api/collections/provider-harness.json
  • transports/bifrost-http/integrations/genai.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
✅ Files skipped from review due to trivial changes (14)
  • core/providers/fireworks/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • core/providers/bedrock/utils.go
  • core/schemas/bifrost.go
🚧 Files skipped from review as they are similar to previous changes (17)
  • core/providers/openai/cachedcontents.go
  • core/schemas/provider.go
  • core/providers/replicate/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/schemas/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/sgl/cachedcontents.go
  • transports/bifrost-http/integrations/genai.go
  • core/providers/ollama/cachedcontents.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
  • core/providers/vertex/vertex.go
  • core/providers/gemini/cachedcontents.go
  • core/bifrost.go
  • core/providers/vertex/cachedcontents.go

Comment thread docs/providers/test-harness-coverage.mdx Outdated
Comment thread tests/e2e/api/collections/provider-harness.json
@akshaydeo akshaydeo force-pushed the 05-03-gemini_named_content_cache_support branch from 3cc7c32 to d369c95 Compare May 4, 2026 05:24
@akshaydeo akshaydeo force-pushed the 05-02-test_harness_for_quick_checks branch from 6d4e381 to 6d08e1c Compare May 4, 2026 05:24
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
core/providers/vertex/vertex.go (1)

366-381: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Harden URL inlining fetches: add request-context propagation and SSRF guardrails.

Line 380 fetches request-provided URLs directly. Right now this path (a) does not use request ctx for cancellation/deadlines and (b) does not enforce network safety restrictions (private/link-local/loopback/internal targets), which opens a real SSRF risk in addition to reliability issues under cancellation.

Suggested direction
-func inlineDocumentURLs(request *schemas.BifrostChatRequest) error {
+func inlineDocumentURLs(ctx context.Context, request *schemas.BifrostChatRequest) error {
@@
-            mediaType, encoded, err := providerUtils.FetchAndEncodeURL(*block.File.FileURL)
+            mediaType, encoded, err := providerUtils.FetchAndEncodeURLWithContextAndPolicy(
+                ctx,
+                *block.File.FileURL,
+                providerUtils.URLFetchPolicy{
+                    // e.g. allow only http/https and block loopback/link-local/private CIDRs
+                    RestrictToPublicNetwork: true,
+                },
+            )
- if err := inlineDocumentURLs(request); err != nil {
+ if err := inlineDocumentURLs(ctx, request); err != nil {

Also applies to: 408-412, 720-724

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/vertex/vertex.go` around lines 366 - 381, The URL inlining in
inlineDocumentURLs calls providerUtils.FetchAndEncodeURL without propagating the
request context or validating the target, which risks missing
cancellation/deadlines and SSRF to private/loopback/link-local hosts; update
inlineDocumentURLs to pass the request's context (e.g., request.Ctx or
request.Context()) into the fetch path and add an SSRF guard before fetching:
ensure the scheme is http/https, resolve the host to IP(s) and reject
private/loopback/link-local/rfc1918 ranges, and enforce a safe HTTP client with
context-aware timeout; implement the same context propagation and SSRF
validation for the other FetchAndEncodeURL call sites (the analogous blocks
around the other occurrences of providerUtils.FetchAndEncodeURL).
transports/bifrost-http/integrations/genai.go (1)

870-894: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

PATCH cached-content update still drops cached_id/Name due to type mismatch.

On Line 1007, PATCH deserializes into *GeminiCachedContentUpdateBody, but on Line 886 the pre-callback only hydrates *schemas.BifrostCachedContentUpdateRequest. As a result, Name from path never reaches the converter, and the update request built on Lines 1017-1021 is missing the target resource identity.

💡 Suggested fix
@@
 		GetRequestTypeInstance: func(ctx context.Context) interface{} {
-			return &GeminiCachedContentUpdateBody{}
+			return &schemas.BifrostCachedContentUpdateRequest{}
+		},
+		RequestParser: func(ctx *fasthttp.RequestCtx, req interface{}) error {
+			updateReq, ok := req.(*schemas.BifrostCachedContentUpdateRequest)
+			if !ok {
+				return errors.New("invalid cached content update request type")
+			}
+			if body := ctx.Request.Body(); len(body) > 0 {
+				var wire GeminiCachedContentUpdateBody
+				if err := sonic.Unmarshal(body, &wire); err != nil {
+					return err
+				}
+				updateReq.TTL = wire.TTL
+				updateReq.ExpireTime = wire.ExpireTime
+			}
+			return nil
 		},
 		CachedContentRequestConverter: func(ctx *schemas.BifrostContext, req interface{}) (*CachedContentRequest, error) {
-			body, ok := req.(*GeminiCachedContentUpdateBody)
+			updateReq, ok := req.(*schemas.BifrostCachedContentUpdateRequest)
 			if !ok {
 				return nil, errors.New("invalid cached content update request type")
 			}
-			// Name is set via PreCallback (extractGeminiCachedContentNameFromPath); we
-			// stage it via a dedicated update-request struct here.
-			updateReq := &schemas.BifrostCachedContentUpdateRequest{
-				Provider:   schemas.Gemini,
-				TTL:        body.TTL,
-				ExpireTime: body.ExpireTime,
-			}
+			if updateReq.Provider == "" {
+				updateReq.Provider = schemas.Gemini
+			}
 			return &CachedContentRequest{Type: schemas.CachedContentUpdateRequest, UpdateRequest: updateReq}, nil
 		},

Also applies to: 1007-1023

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/integrations/genai.go` around lines 870 - 894, The
pre-callback extractGeminiCachedContentNameFromPath currently only sets
Name/Provider for
schemas.BifrostCachedContentRetrieveRequest/UpdateRequest/DeleteRequest, but the
PATCH handler deserializes into *GeminiCachedContentUpdateBody so the
path-derived cached_id never propagates; update
extractGeminiCachedContentNameFromPath to also handle
*GeminiCachedContentUpdateBody (set its Name and Provider) or add a conversion
step that copies Name/Provider from the request body into
schemas.BifrostCachedContentUpdateRequest so the converter receives the target
identity (refer to extractGeminiCachedContentNameFromPath,
GeminiCachedContentUpdateBody, and schemas.BifrostCachedContentUpdateRequest).
tests/e2e/api/collections/provider-harness.json (1)

1325-1325: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unskipping cached-contents list will still fail the shared response-shape check.

After enabling this request on Line 1325, the generic test script still doesn’t treat cachedContents arrays as a valid list shape, so valid 2xx responses can fail the hasContent assertion.

💡 Suggested fix
@@
     } else if (Array.isArray(j.files)) {
         shape = 'gemini-list-files';
         hasContent = j.files.length >= 0;
+    } else if (Array.isArray(j.cachedContents)) {
+        shape = 'gemini-list-cached-contents';
+        hasContent = j.cachedContents.length >= 0;
     } else if (Array.isArray(j.invocationJobSummaries)) {
         shape = 'bedrock-list-invocation-jobs';
         hasContent = j.invocationJobSummaries.length >= 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/api/collections/provider-harness.json` at line 1325, The new
request "Gemini: list cached contents" (GET /genai/v1beta/cachedContents) is
failing the shared response-shape check because the generic test script's
hasContent assertion doesn't recognize the body property "cachedContents" as a
valid list shape; update the shared response-shape checker (the function/method
that implements hasContent) to treat "cachedContents" as a list/array response
(or generically accept arrays at top-level list fields), so valid 2xx responses
with body.cachedContents as an array pass the hasContent assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@core/providers/vertex/vertex.go`:
- Around line 366-381: The URL inlining in inlineDocumentURLs calls
providerUtils.FetchAndEncodeURL without propagating the request context or
validating the target, which risks missing cancellation/deadlines and SSRF to
private/loopback/link-local hosts; update inlineDocumentURLs to pass the
request's context (e.g., request.Ctx or request.Context()) into the fetch path
and add an SSRF guard before fetching: ensure the scheme is http/https, resolve
the host to IP(s) and reject private/loopback/link-local/rfc1918 ranges, and
enforce a safe HTTP client with context-aware timeout; implement the same
context propagation and SSRF validation for the other FetchAndEncodeURL call
sites (the analogous blocks around the other occurrences of
providerUtils.FetchAndEncodeURL).

In `@tests/e2e/api/collections/provider-harness.json`:
- Line 1325: The new request "Gemini: list cached contents" (GET
/genai/v1beta/cachedContents) is failing the shared response-shape check because
the generic test script's hasContent assertion doesn't recognize the body
property "cachedContents" as a valid list shape; update the shared
response-shape checker (the function/method that implements hasContent) to treat
"cachedContents" as a list/array response (or generically accept arrays at
top-level list fields), so valid 2xx responses with body.cachedContents as an
array pass the hasContent assertion.

In `@transports/bifrost-http/integrations/genai.go`:
- Around line 870-894: The pre-callback extractGeminiCachedContentNameFromPath
currently only sets Name/Provider for
schemas.BifrostCachedContentRetrieveRequest/UpdateRequest/DeleteRequest, but the
PATCH handler deserializes into *GeminiCachedContentUpdateBody so the
path-derived cached_id never propagates; update
extractGeminiCachedContentNameFromPath to also handle
*GeminiCachedContentUpdateBody (set its Name and Provider) or add a conversion
step that copies Name/Provider from the request body into
schemas.BifrostCachedContentUpdateRequest so the converter receives the target
identity (refer to extractGeminiCachedContentNameFromPath,
GeminiCachedContentUpdateBody, and schemas.BifrostCachedContentUpdateRequest).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ac63ef5-d18d-4e19-8f3d-3c2cef603f14

📥 Commits

Reviewing files that changed from the base of the PR and between 3cc7c32 and d369c95.

📒 Files selected for processing (38)
  • core/bifrost.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/bedrock/utils.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/openai/cachedcontents.go
  • core/providers/openrouter/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/schemas/bifrost.go
  • core/schemas/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/schemas/provider.go
  • docs/providers/test-harness-coverage.mdx
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • tests/e2e/api/collections/provider-harness.json
  • transports/bifrost-http/integrations/genai.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
✅ Files skipped from review due to trivial changes (16)
  • core/providers/utils/fetch.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • core/providers/huggingface/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/xai/cachedcontents.go
🚧 Files skipped from review as they are similar to previous changes (15)
  • core/providers/cerebras/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/vllm/cachedcontents.go
  • core/schemas/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/schemas/provider.go
  • core/schemas/bifrost.go
  • core/providers/openrouter/cachedcontents.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/openai/cachedcontents.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
  • core/providers/gemini/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • docs/providers/test-harness-coverage.mdx
  • core/bifrost.go

@akshaydeo akshaydeo force-pushed the 05-03-gemini_named_content_cache_support branch from d369c95 to 2719786 Compare May 4, 2026 14:19
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/providers/bedrock/responses.go (2)

2922-2936: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Propagate conversion errors instead of dropping whole messages.

At Line 2935 this helper returns nil on conversion error, and at Line 2813 the caller treats nil as “skip message”. That silently removes user/assistant content when image/file conversion fails.

Suggested fix
-func convertBifrostMessageToBedrockMessage(ctx context.Context, msg *schemas.ResponsesMessage) *BedrockMessage {
+func convertBifrostMessageToBedrockMessage(ctx context.Context, msg *schemas.ResponsesMessage) (*BedrockMessage, error) {
 	// Ensure Content is present
 	if msg.Content == nil {
-		return nil
+		return nil, nil
 	}
@@
 	contentBlocks, err := convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks(ctx, *msg.Content)
 	if err != nil {
-		return nil
+		return nil, err
 	}
 	bedrockMsg.Content = contentBlocks
 
-	return &bedrockMsg
+	return &bedrockMsg, nil
}
-				bedrockMsg := convertBifrostMessageToBedrockMessage(ctx, &msg)
-				if bedrockMsg != nil {
-					bedrockMessages = append(bedrockMessages, *bedrockMsg)
-				}
+				bedrockMsg, err := convertBifrostMessageToBedrockMessage(ctx, &msg)
+				if err != nil {
+					return nil, nil, fmt.Errorf("failed to convert Responses message at index %d: %w", i, err)
+				}
+				if bedrockMsg != nil {
+					bedrockMessages = append(bedrockMessages, *bedrockMsg)
+				}

Also applies to: 2813-2816

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/responses.go` around lines 2922 - 2936, The helper
convertBifrostMessageToBedrockMessage currently swallows conversion errors by
returning nil when
convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks fails; change
convertBifrostMessageToBedrockMessage to return (*BedrockMessage, error),
propagate and return the underlying error from
convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks instead of
nil, and update the caller (the function that currently treats a nil return as
"skip message") to handle the error case explicitly (log/collect the error and
decide whether to include a fallback/partial message) so messages aren't
silently dropped.

2626-2632: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not silently drop tool-result images on conversion failure.

At Line 2631, the error is created and discarded, so failed image conversion is invisible and the tool output is truncated. This causes silent correctness loss.

Suggested fix
-								imageSource, err := convertImageToBedrockSource(ctx, *block.ResponsesInputMessageContentBlockImage.ImageURL)
-								if err != nil {
-									// Bedrock only supports base64 data URIs for images. If conversion
-									// fails (e.g. remote URL), the image is dropped from the tool result
-									// which silently degrades the model's ability to see tool output.
-									_ = fmt.Errorf("bedrock: converting tool result image: %w", err)
-								} else {
+								imageSource, err := convertImageToBedrockSource(ctx, *block.ResponsesInputMessageContentBlockImage.ImageURL)
+								if err != nil {
+									return nil, nil, fmt.Errorf("bedrock: converting tool result image: %w", err)
+								} else {
 									resultContent = append(resultContent, BedrockContentBlock{Image: imageSource})
 								}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/providers/bedrock/responses.go` around lines 2626 - 2632, The code
currently discards the error from convertImageToBedrockSource (the "_ =
fmt.Errorf(...)" line) which silently drops tool-result images; replace that
discard with proper error propagation or logging: either return the wrapped
error from the current function (so callers can surface/fail the tool result) or
call the appropriate logger to record the conversion failure (include the error
and the original block.ResponsesInputMessageContentBlockImage.ImageURL value),
ensuring convertImageToBedrockSource failures are visible and not silently
ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/bifrost.go`:
- Around line 5728-5730: The branch sets isMultiKeyCachedContentOp but can still
pass a nil keys slice downstream when providerRequiresKey(...) is false; to fix,
ensure that when isMultiKeyCachedContentOp is true you initialize a stable,
non-nil slice for keys (e.g., keys = []string{} or make([]string, 0)) before
routing even if providerRequiresKey(req.Provider...) returns false—update the
logic around isMultiKeyCachedContentOp and the variables used by request routing
(req.RequestType, isMultiKeyCachedContentOp, and keys) so cached-content
list/retrieve/update/delete handlers always receive a non-nil keys slice.

In `@core/providers/bedrock/responses.go`:
- Line 2473: ConvertBifrostMessagesToBedrockMessages is no longer pure because
it accepts ctx and calls image conversion that can perform HTTP fetches; remove
ctx from the ConvertBifrostMessagesToBedrockMessages signature and eliminate any
direct calls to network-capable image helpers inside it. Instead call or create
a pure image-conversion helper (e.g., a ConvertImagesToBedrockImagesPure that
only transforms provided image metadata/URLs into Bedrock image structs without
fetching) or require callers to pre-resolve/download images before invoking this
converter; ensure no logging, HTTP calls, or context usage remain in
ConvertBifrostMessagesToBedrockMessages (and the other message-converter
functions in this file) so the function is a pure transformation that only maps
schemas.ResponsesMessage -> []BedrockMessage and []BedrockSystemMessage.

In `@transports/bifrost-http/integrations/openai.go`:
- Around line 529-536: The wildcard route only defines
ResponsesResponseConverter so CountTokensRequest flows don't get the same OpenAI
raw-response passthrough; add a CountTokensResponseConverter for the same route
that mirrors ResponsesResponseConverter behavior: when the response's
ExtraFields.Provider == schemas.OpenAI and resp.ExtraFields.RawResponse != nil
return that raw response, otherwise return the response with defaults. Update
the route to include CountTokensResponseConverter (matching the signature used
for ResponsesResponseConverter) so token-count responses use the same OpenAI
passthrough logic.

---

Outside diff comments:
In `@core/providers/bedrock/responses.go`:
- Around line 2922-2936: The helper convertBifrostMessageToBedrockMessage
currently swallows conversion errors by returning nil when
convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks fails; change
convertBifrostMessageToBedrockMessage to return (*BedrockMessage, error),
propagate and return the underlying error from
convertBifrostResponsesMessageContentBlocksToBedrockContentBlocks instead of
nil, and update the caller (the function that currently treats a nil return as
"skip message") to handle the error case explicitly (log/collect the error and
decide whether to include a fallback/partial message) so messages aren't
silently dropped.
- Around line 2626-2632: The code currently discards the error from
convertImageToBedrockSource (the "_ = fmt.Errorf(...)" line) which silently
drops tool-result images; replace that discard with proper error propagation or
logging: either return the wrapped error from the current function (so callers
can surface/fail the tool result) or call the appropriate logger to record the
conversion failure (include the error and the original
block.ResponsesInputMessageContentBlockImage.ImageURL value), ensuring
convertImageToBedrockSource failures are visible and not silently ignored.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1b08d64-7835-4b5a-aa2e-21dedcf041b5

📥 Commits

Reviewing files that changed from the base of the PR and between d369c95 and 2719786.

📒 Files selected for processing (42)
  • core/bifrost.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/bedrock/cachedcontents.go
  • core/providers/bedrock/chat.go
  • core/providers/bedrock/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/cohere/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/huggingface/cachedcontents.go
  • core/providers/mistral/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/ollama/cachedcontents.go
  • core/providers/openai/cachedcontents.go
  • core/providers/openrouter/cachedcontents.go
  • core/providers/parasail/cachedcontents.go
  • core/providers/perplexity/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/providers/utils/fetch.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/vertex/vertex.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/xai/cachedcontents.go
  • core/schemas/bifrost.go
  • core/schemas/cachedcontents.go
  • core/schemas/chatcompletions.go
  • core/schemas/provider.go
  • core/utils.go
  • docs/providers/test-harness-coverage.mdx
  • tests/e2e/api/HARNESS_COVERAGE_BACKLOG.md
  • tests/e2e/api/collections/provider-harness.json
  • transports/bifrost-http/integrations/genai.go
  • transports/bifrost-http/integrations/openai.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
✅ Files skipped from review due to trivial changes (17)
  • core/providers/mistral/cachedcontents.go
  • core/providers/groq/cachedcontents.go
  • core/providers/elevenlabs/cachedcontents.go
  • core/providers/replicate/cachedcontents.go
  • core/providers/azure/cachedcontents.go
  • ui/app/workspace/governance/virtual-keys/page.tsx
  • core/providers/ollama/cachedcontents.go
  • core/providers/anthropic/cachedcontents.go
  • core/providers/cerebras/cachedcontents.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/vllm/cachedcontents.go
  • core/providers/fireworks/cachedcontents.go
  • core/providers/nebius/cachedcontents.go
  • core/providers/runway/cachedcontents.go
  • core/providers/bedrock/cachedcontents.go
  • docs/providers/test-harness-coverage.mdx
  • core/providers/openrouter/cachedcontents.go
🚧 Files skipped from review as they are similar to previous changes (11)
  • core/providers/openai/cachedcontents.go
  • core/providers/sgl/cachedcontents.go
  • core/schemas/provider.go
  • core/providers/xai/cachedcontents.go
  • core/providers/bedrock/utils.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/vertex/vertex.go
  • core/schemas/cachedcontents.go
  • core/providers/vertex/cachedcontents.go
  • core/providers/gemini/cachedcontents.go
  • core/providers/cohere/cachedcontents.go

Comment thread core/bifrost.go
Comment thread core/providers/bedrock/responses.go
Comment thread transports/bifrost-http/integrations/openai.go
@akshaydeo akshaydeo force-pushed the 05-02-test_harness_for_quick_checks branch from 6d08e1c to 5f06bd8 Compare May 4, 2026 15:25
@akshaydeo akshaydeo force-pushed the 05-03-gemini_named_content_cache_support branch from 2719786 to b135039 Compare May 4, 2026 15:25
@akshaydeo akshaydeo mentioned this pull request May 5, 2026
18 tasks
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 5, 2026
Copy link
Copy Markdown
Contributor Author

akshaydeo commented May 5, 2026

Merge activity

  • May 5, 6:47 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • May 5, 6:49 AM UTC: Graphite rebased this pull request as part of a merge.
  • May 5, 6:50 AM UTC: @akshaydeo merged this pull request with Graphite.

@akshaydeo akshaydeo changed the base branch from 05-02-test_harness_for_quick_checks to graphite-base/3194 May 5, 2026 06:48
@akshaydeo akshaydeo changed the base branch from graphite-base/3194 to main May 5, 2026 06:48
@akshaydeo akshaydeo dismissed coderabbitai[bot]’s stale review May 5, 2026 06:48

The base branch was changed.

@akshaydeo akshaydeo requested a review from a team as a code owner May 5, 2026 06:48
@akshaydeo akshaydeo force-pushed the 05-03-gemini_named_content_cache_support branch from b135039 to b4a6335 Compare May 5, 2026 06:48
@akshaydeo akshaydeo merged commit d61cb55 into main May 5, 2026
13 of 17 checks passed
@akshaydeo akshaydeo deleted the 05-03-gemini_named_content_cache_support branch May 5, 2026 06:50
soby added a commit to soby/bifrost that referenced this pull request May 7, 2026
Resolves textual conflicts in core/bifrost.go and core/schemas/bifrost.go
arising from upstream's v1.5.0 changes. Notable upstream changes since
the prior merge base 734f02d that interact with this PR:

* PR maximhq#3241 removed the `BifrostContextKeyDirectKey` constant and its
  callers — the "direct key bypass" is gone from both the Go SDK and HTTP
  gateway. Resolution: drop the parallel DirectKey block in
  getAllSupportedKeys, getKeysForBatchAndFileOps, and
  selectKeyFromProviderForModelWithPool, leaving only the
  ProviderOverride path (the plugin-driven replacement). Renamed the
  `OverrideWinsOverDirectKey` test to `OverrideWinsOverPinnedAPIKey` and
  rewired it to use `BifrostContextKeyAPIKeyID` (upstream's recommended
  replacement), preserving the precedence assertion.
* PR maximhq#3194 (gemini named content cache support) added five
  CachedContent* request types and extended SetProvider/SetModel.
  Resolution: combined our File*/Batch*/Container*/Passthrough cases
  with upstream's CachedContent* cases, ordered to mirror BifrostRequest
  field declaration order. SetModel keeps each side's chosen idiom for
  *string fields — our PR's unconditional `Model = &model` for
  File*/Batch*/Passthrough, upstream's guarded
  `if Model != nil { Model = new(model) }` for CachedContent*.
* Upstream added `BifrostContextKeyPassthroughOverridesPresent`. Kept
  alongside our existing context-key additions.

`resolveModelAndProvider` in transports/bifrost-http/handlers/inference.go
auto-merged correctly: upstream's relaxation (no longer error on empty
provider) is aligned with this PR's deferred-provider-check work — the
empty-provider case is now handled post-plugin in tryRequest.

go build ./... and go vet ./... pass on the merged tree. The full core
test suite passes except for three pre-existing
TestAnthropicStructuredOutput* failures in core/providers/bedrock that
also fail on a clean upstream main checkout, so are unrelated to this
merge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants