Skip to content

feat(providers): add Google Vertex AI provider#364

Closed
qhkm wants to merge 9 commits intomainfrom
feat/vertex-provider
Closed

feat(providers): add Google Vertex AI provider#364
qhkm wants to merge 9 commits intomainfrom
feat/vertex-provider

Conversation

@qhkm
Copy link
Copy Markdown
Owner

@qhkm qhkm commented Mar 15, 2026

Summary

  • Adds first-class Google Vertex AI provider for Gemini models with enterprise-grade bearer token auth
  • Zero new dependencies — reuses reqwest + GeminiProvider's response parsing (extract_text, extract_usage)
  • Auth via VERTEX_ACCESS_TOKEN env var (from gcloud auth print-access-token)
  • Project/location resolved from config (api_key/api_base) or env vars (GOOGLE_CLOUD_PROJECT, VERTEX_LOCATION)

Ported from zeroclaw-labs/zeroclaw#3506, adapted to ZeptoClaw's architecture without the heavy Google Cloud SDK crates.

Config example

{
  "providers": {
    "vertex": {
      "api_key": "my-gcp-project-id",
      "api_base": "us-central1"
    }
  }
}

Files changed

  • src/providers/vertex.rs — New VertexProvider (407 lines, 12 tests)
  • src/providers/mod.rs — Register module + runtime support
  • src/providers/registry.rs — Add to provider registry + config lookup
  • src/providers/gemini.rs — Make extract_usage public for reuse
  • src/config/types.rs — Add vertex field to ProvidersConfig
  • src/kernel/provider.rs — Wire vertex backend routing

Closes #363

Test plan

  • cargo fmt -- --check passes
  • cargo clippy -- -D warnings passes
  • cargo nextest run --lib — 3184 passed (21 new vertex tests), 6 skipped
  • Manual: provider status shows vertex when configured
  • Manual: agent -m "Hello" works with vertex provider

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Google Vertex AI (Gemini) added as a new model provider.
    • r8r workflow bridge with WebSocket event streaming, approvals flow, at-least-once delivery, deduplication, reconnection, and health/status reporting.
  • Improvements

    • Global system prompt support via environment override and higher precedence in agent configuration.
    • Provider resolution updated for async runtime backends.
  • Tests

    • New unit and integration tests covering the r8r bridge, approval flows, deduplication, and Vertex provider.

qhkm and others added 6 commits March 15, 2026 12:55
…ridge

Mirror the 8 bridge event types from r8r with identical JSON wire format
(BridgeEventEnvelope, BridgeEvent, Ack) as independent Rust code. Add an
LRU-based Deduplicator for at-least-once delivery, R8rBridgeConfig with
channel routing, and placeholder modules for approval (Task 8) and health
(Task 9).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tting

Implements parse_approval_response (6 approve / 6 reject keywords,
case-insensitive, reason extraction), format_approval_message,
format_timeout_message (Xs / Xm / Xh Ym elapsed), and format_help_message.
Adds 10 integration tests in tests/r8r_bridge_approval_test.rs; all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pport

Implement R8rBridge struct that connects to r8r's /api/ws/events endpoint
with bearer token auth, automatic deduplication, event acknowledgments,
health status tracking, and exponential backoff reconnection. Add url
crate for endpoint host extraction and integration test with a mock
WebSocket server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements start_health_ping_loop (tokio background task with configurable
interval), format_health_status (multiline CLI output with version, uptime,
and counters), and the format_uptime helper (d/h/m bucketing). Includes unit
tests covering all branches.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Allows setting the agent system prompt via config JSON or env var
(ZEPTOCLAW_AGENTS_DEFAULTS_SYSTEM_PROMPT). Takes priority over
template and hand system prompts. Enables gateway/headless mode
to use config-driven system prompts without CLI flags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add first-class Vertex AI support, enabling Gemini models with enterprise
bearer token authentication through the Vertex AI regional endpoint.

Zero new dependencies — reuses reqwest for HTTP and GeminiProvider's
response parsing (extract_text, extract_usage). Auth via VERTEX_ACCESS_TOKEN
env var (from `gcloud auth print-access-token`). Project/location resolved
from config or GOOGLE_CLOUD_PROJECT/VERTEX_LOCATION env vars.

Closes #363

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 15, 2026

📝 Walkthrough

Walkthrough

Adds Google Vertex AI provider, introduces agent-level system_prompt with env override and precedence, and implements an r8r WebSocket bridge (events, dedup, approvals, health) with tests and a small Cargo dependency for URL parsing.

Changes

Cohort / File(s) Summary
Vertex AI provider
src/providers/vertex.rs, src/providers/mod.rs, src/providers/registry.rs, src/kernel/provider.rs
New VertexProvider implementation, public re-export, registry entry, runtime selection branch, async provider resolution changes, and config/env wiring for project/location/token/model.
Gemini provider export
src/providers/gemini.rs
Made extract_usage public so Vertex provider can reuse Gemini response parsing.
Agent system_prompt & config
src/config/types.rs, src/config/mod.rs, src/cli/common.rs
Added system_prompt: Option<String> to AgentDefaults, env-override support (ZEPTOCLAW_AGENTS_DEFAULTS_SYSTEM_PROMPT), and changed precedence to prefer agent defaults over template/manifest prompts.
R8r bridge core
src/r8r_bridge/mod.rs, src/r8r_bridge/events.rs, src/r8r_bridge/approval.rs, src/r8r_bridge/dedup.rs, src/r8r_bridge/health.rs
New reconnecting WebSocket R8rBridge: envelope (de)serialization, acking, deduplication, approval parsing/formatting, health pings, reconnection/backoff, and public API (connect/send/run/etc.).
Bridge tests
tests/r8r_bridge_approval_test.rs, tests/r8r_bridge_events_test.rs, tests/r8r_bridge_client_test.rs
Unit and integration tests for approval parsing/formatting, event envelope round-trips, deduplicator, and WebSocket client health-ping exchange.
Lib export & small deps
src/lib.rs, Cargo.toml
Exported r8r_bridge module and added url = "2" dependency (used for WebSocket endpoint parsing).
Config: r8r bridge & providers field
src/config/types.rs
Added R8rBridgeConfig and ChannelTarget, added r8r_bridge field on Config, and added vertex: Option<ProviderConfig> under ProvidersConfig.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant R8rBridge
    participant WebSocketServer
    participant Dedup
    participant Handler

    Client->>R8rBridge: connect()
    R8rBridge->>WebSocketServer: WebSocket upgrade + Auth
    WebSocketServer-->>R8rBridge: Connected
    R8rBridge->>R8rBridge: spawn reader/writer

    loop Event loop
        WebSocketServer->>R8rBridge: BridgeEventEnvelope(id, type, data)
        R8rBridge->>Dedup: is_new(id)?
        alt seen
            Dedup-->>R8rBridge: false
            R8rBridge->>R8rBridge: discard
        else new
            Dedup-->>R8rBridge: true
            R8rBridge->>WebSocketServer: Ack(id)
            R8rBridge->>Handler: dispatch(event_type, data)
            Handler-->>R8rBridge: handled
        end
    end

    Client->>R8rBridge: send_health_ping()
    R8rBridge->>WebSocketServer: HealthPing envelope
    WebSocketServer-->>R8rBridge: HealthStatus envelope
    R8rBridge->>R8rBridge: store last_health_status
Loading
sequenceDiagram
    participant User
    participant ZeptoClaw
    participant VertexProvider
    participant GoogleVertex

    User->>ZeptoClaw: chat request
    ZeptoClaw->>VertexProvider: chat(messages, options)
    VertexProvider->>GoogleVertex: POST /v1/projects/{p}/locations/{l}:generateContent
    GoogleVertex-->>VertexProvider: response (content, usageMetadata)
    VertexProvider->>VertexProvider: extract_usage(response)
    VertexProvider-->>ZeptoClaw: LLMResponse (text + usage)
    ZeptoClaw-->>User: chat response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped a bridge from Zepto to r8r,
Pings and acks on websockets near and far.
Vertex hums Gemini answers in the night,
Prompts now follow defaults, set just right.
Dedup keeps echoes low — rabbit hops delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes some out-of-scope changes beyond issue #363: r8r bridge implementation (approval, dedup, events, health modules), system prompt configuration changes, and environment variable overrides for Vertex/Gemini config. These extend beyond the stated Vertex AI provider objective. Separate r8r bridge features and system prompt management into independent PRs. Focus this PR solely on Vertex AI provider (config types, provider impl, registry, kernel routing) as specified in issue #363.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: adding Google Vertex AI provider support. It is concise, specific, and directly reflects the primary objective of the changeset.
Linked Issues check ✅ Passed The code changes comprehensively fulfill all requirements from issue #363: Vertex AI provider implementation with bearer token auth, project/location config via env vars or settings, zero new dependencies (reuses reqwest and Gemini parsing), provider registry integration, and runtime routing.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Important

Merge conflicts detected (Beta)

  • Resolve merge conflict in branch feat/vertex-provider
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/vertex-provider
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can enforce grammar and style rules using `languagetool`.

Configure the reviews.tools.languagetool setting to enable/disable rules and categories. Refer to the LanguageTool Community to learn more.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (2)
src/kernel/provider.rs (1)

97-123: Consider logging a warning when VERTEX_ACCESS_TOKEN is missing.

When VERTEX_ACCESS_TOKEN is not set, the function silently returns None at line 111. Unlike other providers where the API key comes from config (and absence is expected), Vertex specifically requires this environment variable from gcloud auth print-access-token. Users selecting the vertex provider may not understand why it's being skipped.

💡 Proposed improvement to add a warning
-            let bearer_token = std::env::var("VERTEX_ACCESS_TOKEN").ok()?;
+            let bearer_token = match std::env::var("VERTEX_ACCESS_TOKEN") {
+                Ok(token) if !token.is_empty() => token,
+                _ => {
+                    tracing::warn!(
+                        provider = "vertex",
+                        "VERTEX_ACCESS_TOKEN not set; run `gcloud auth print-access-token` and export the result"
+                    );
+                    return None;
+                }
+            };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/provider.rs` around lines 97 - 123, The code currently silently
returns None when VERTEX_ACCESS_TOKEN is missing in the "vertex" match arm;
update that branch to log a warning before returning None so users know Vertex
was skipped due to a missing token. Specifically, in the "vertex" arm where
bearer_token is obtained via std::env::var("VERTEX_ACCESS_TOKEN").ok()?, replace
the early None-return path with a call to the existing logger (e.g.,
process_logger or similar) to emit a clear warning mentioning
VERTEX_ACCESS_TOKEN and gcloud auth print-access-token, then return None; keep
creating VertexProvider::new(&project_id, location, &bearer_token, model)
unchanged when the token is present.
tests/r8r_bridge_client_test.rs (1)

92-93: Consider using a polling approach instead of a fixed sleep.

The 500ms sleep at line 93 assumes the health status response will be processed within that time. In CI environments under load, this could occasionally fail. A polling approach with timeout would be more robust.

🔧 Optional: Use polling with timeout
-    // Wait briefly for the bridge to receive the health status response.
-    tokio::time::sleep(Duration::from_millis(500)).await;
-
-    // The mock server echoes a health status; we may also receive the ack first.
-    // Drain the ack message from rx (the bridge sends an ack for the health status).
-    // The health status should now be stored.
-    let health = bridge.last_health_status().await;
+    // Poll for health status with timeout instead of fixed sleep.
+    let health = tokio::time::timeout(Duration::from_secs(5), async {
+        loop {
+            if let Some(status) = bridge.last_health_status().await {
+                return Some(status);
+            }
+            tokio::time::sleep(Duration::from_millis(50)).await;
+        }
+    })
+    .await
+    .expect("should receive health status within timeout");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/r8r_bridge_client_test.rs` around lines 92 - 93, Replace the fixed
500ms sleep (tokio::time::sleep(Duration::from_millis(500)).await) with a
polling loop that repeatedly checks for the bridge having received the health
status response until a deadline; implement this by recording a start Instant
and looping with short sleeps (e.g., 10–50ms) or by using tokio::time::timeout
to await a condition future, and fail the test if the timeout elapses. Ensure
you poll the same condition the test expects (the bridge's received-health flag
or response buffer) and keep the overall timeout conservative (e.g., a few
seconds) to be CI-friendly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/types.rs`:
- Around line 1356-1358: The Vertex provider (providers.vertex) was added to
types.rs but apply_provider_env_overrides in src/config/mod.rs does not wire
ZEPTOCLAW_PROVIDERS_VERTEX_API_KEY, ZEPTOCLAW_PROVIDERS_VERTEX_API_BASE, or
ZEPTOCLAW_PROVIDERS_VERTEX_MODEL; update apply_provider_env_overrides to read
those three env vars and populate or merge them into providers.vertex (creating
a ProviderConfig when None or overriding fields when Some) using the same
env-name parsing and precedence as other providers; ensure you reference the
ProviderConfig field names used elsewhere so the API key -> api_key, API base ->
api_base, and model -> model are set consistently.
- Around line 135-137: Add the new top-level config key "r8r_bridge" to the
KNOWN_TOP_LEVEL allowlist used during validation: update the KNOWN_TOP_LEVEL
array/constant to include "r8r_bridge" so that the validator recognizes the
R8rBridgeConfig field (refer to the r8r_bridge field in types.rs and the
KNOWN_TOP_LEVEL symbol in validate.rs); ensure spelling matches exactly and run
tests to confirm validation accepts configs with r8r_bridge.

In `@src/providers/vertex.rs`:
- Around line 195-258: VertexProvider currently ignores the tools parameter in
chat() and inherits a streaming wrapper via chat_stream(), so it must explicitly
refuse unsupported features: update VertexProvider::chat to validate the tools
argument and immediately return a ZeptoError::Provider (or appropriate error
type) when tools is non-empty (mentioning that tool calling is unsupported by
Vertex), and implement/override VertexProvider::chat_stream to return an
immediate error indicating streaming is not supported instead of falling back to
the default wrapper; reference the methods LLMProvider::chat,
VertexProvider::chat, and VertexProvider::chat_stream and the ToolDefinition
type when making these checks and error returns.
- Around line 88-89: In from_config() ensure the VERTEX_ACCESS_TOKEN is not only
present but non-empty by replacing let bearer_token =
std::env::var("VERTEX_ACCESS_TOKEN").ok()? with code that checks the string and
returns None if empty (e.g., get var, if bearer_token.is_empty() { return None;
}); also remove the .expect("failed to build HTTP client") used when
constructing the HTTP client (the Client::builder().build().expect(...) call)
and propagate the error instead using the ? operator (adjust the function
signature to return a Result if needed) so failures building the HTTP client
bubble up rather than panicking.

In `@src/r8r_bridge/approval.rs`:
- Around line 42-50: The decision token extraction currently keeps trailing
punctuation so tokens like "approve," or "reject." fail to match; update the
logic after computing `first` to strip surrounding punctuation before
lowercasing (e.g., trim punctuation characters from `first`), then assign that
result to `keyword` and use it against `APPROVE_KEYWORDS`/`REJECT_KEYWORDS` when
constructing `ApprovalDecision` (ensure you still handle the None/no-whitespace
case and empty strings from trimming). This change should be applied where
`first`, `rest`, and `keyword` are defined so punctuation-normalized tokens
match correctly.

In `@src/r8r_bridge/dedup.rs`:
- Around line 23-28: The constructor new currently allows storing entries even
when max_entries == 0; explicitly treat zero-capacity deduplicators as
non-storing/non-suppressing by adding a fast-path: set a zero_capacity flag (or
check max_entries == 0) in new and then short-circuit the dedup logic in the
methods that add/check entries (the code around lines 56-62 — e.g.,
insert/contains or push_entry functions) so they never push to seen or mark
entries as seen when max_entries == 0, and always report “not seen” (no
suppression) for incoming entries; ensure ttl_secs logic is also bypassed for
this case.

In `@src/r8r_bridge/mod.rs`:
- Around line 141-149: disconnect() currently only clears the outbound sender
and flips a flag but does not stop the spawned reader/writer tasks or close the
underlying websocket, so reconnections can leave multiple live tasks; modify the
code that spawns the tasks (the tokio::spawn block which reads from rx and uses
ws_write_clone to send, and the corresponding reader task spawned in run()) to
return and store their JoinHandle(s) (e.g., store in a struct field like
reader_handle/writer_handle), and update disconnect() to actively abort those
handles (JoinHandle::abort) and close the websocket (call close on the
sink/stream or send a Close frame via writer.send(WsMessage::Close(_))). Also
ensure run() checks/cleans these handles before creating new tasks so no
orphaned tasks remain; locate usages of rx, ws_write_clone, writer.send, and the
reader spawn to implement these changes.
- Around line 187-205: The ack is only sent after the dedup check so duplicates
are dropped without acknowledging; move the ack send logic to occur before the
dedup block (but keep the dedup check and the early continue to skip further
processing), i.e., create Ack { event_id: envelope.id.clone() }, serialize it
with serde_json::to_string, lock ws_write and send the WsMessage::Text as
currently done (the writer.send call), handling send errors as before, then
perform the dedup.lock().await and if !dd.is_new(&envelope.id) continue; so
retried events are always acknowledged even when deduplicated.
- Around line 110-118: In connect() avoid panics from header parsing: replace
the two parse().unwrap() calls on Host (host_with_port.parse()) and
Authorization (format!("Bearer {token}").parse()) with fallible handling that
returns Err(String) on failure; use the same error-return style as the function
(e.g., map_err or match) and include context like "invalid Host header" or
"invalid Authorization header" so request.headers_mut().insert(...) only runs
with valid HeaderValue instances; update references in connect(), self.token,
and request.headers_mut() accordingly.
- Around line 278-283: The code is awaiting tx.send(...) while holding the mutex
guard on self.sender, which can deadlock under backpressure; fix by scoping the
lock to clone the sender (e.g., take the Option<Sender> from sender_guard and
clone or .to_owned() the contained tx) inside a short block, drop the guard,
then call send(json).await on the cloned tx (and keep the existing map_err
handling), similar to the pattern used in connect() and ensuring disconnect()
and connect() can acquire the mutex.

---

Nitpick comments:
In `@src/kernel/provider.rs`:
- Around line 97-123: The code currently silently returns None when
VERTEX_ACCESS_TOKEN is missing in the "vertex" match arm; update that branch to
log a warning before returning None so users know Vertex was skipped due to a
missing token. Specifically, in the "vertex" arm where bearer_token is obtained
via std::env::var("VERTEX_ACCESS_TOKEN").ok()?, replace the early None-return
path with a call to the existing logger (e.g., process_logger or similar) to
emit a clear warning mentioning VERTEX_ACCESS_TOKEN and gcloud auth
print-access-token, then return None; keep creating
VertexProvider::new(&project_id, location, &bearer_token, model) unchanged when
the token is present.

In `@tests/r8r_bridge_client_test.rs`:
- Around line 92-93: Replace the fixed 500ms sleep
(tokio::time::sleep(Duration::from_millis(500)).await) with a polling loop that
repeatedly checks for the bridge having received the health status response
until a deadline; implement this by recording a start Instant and looping with
short sleeps (e.g., 10–50ms) or by using tokio::time::timeout to await a
condition future, and fail the test if the timeout elapses. Ensure you poll the
same condition the test expects (the bridge's received-health flag or response
buffer) and keep the overall timeout conservative (e.g., a few seconds) to be
CI-friendly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d06e5471-3b35-494d-a6e4-adfbca3b5aa1

📥 Commits

Reviewing files that changed from the base of the PR and between 624ad11 and 11c179d.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • Cargo.toml
  • src/cli/common.rs
  • src/config/mod.rs
  • src/config/types.rs
  • src/kernel/provider.rs
  • src/lib.rs
  • src/providers/gemini.rs
  • src/providers/mod.rs
  • src/providers/registry.rs
  • src/providers/vertex.rs
  • src/r8r_bridge/approval.rs
  • src/r8r_bridge/dedup.rs
  • src/r8r_bridge/events.rs
  • src/r8r_bridge/health.rs
  • src/r8r_bridge/mod.rs
  • tests/r8r_bridge_approval_test.rs
  • tests/r8r_bridge_client_test.rs
  • tests/r8r_bridge_events_test.rs

Comment thread src/config/types.rs
Comment on lines +135 to +137
/// r8r workflow-engine bridge configuration.
#[serde(default)]
pub r8r_bridge: R8rBridgeConfig,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Checking src/config/validate.rs for KNOWN_TOP_LEVEL + r8r_bridge"
fd validate.rs src/config --exec sh -c '
  echo "== $1 ==";
  rg -n "KNOWN_TOP_LEVEL|r8r_bridge" "$1";
' sh {}

Repository: qhkm/zeptoclaw

Length of output: 495


🏁 Script executed:

sed -n '7,50p' src/config/validate.rs

Repository: qhkm/zeptoclaw

Length of output: 805


Add r8r_bridge to the KNOWN_TOP_LEVEL allowlist in src/config/validate.rs.

The new r8r_bridge config field added to src/config/types.rs is missing from the KNOWN_TOP_LEVEL array. Without it, configuration validation will reject valid r8r_bridge configurations. Per the coding guidelines, new top-level config fields must be added to KNOWN_TOP_LEVEL in src/config/validate.rs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/types.rs` around lines 135 - 137, Add the new top-level config key
"r8r_bridge" to the KNOWN_TOP_LEVEL allowlist used during validation: update the
KNOWN_TOP_LEVEL array/constant to include "r8r_bridge" so that the validator
recognizes the R8rBridgeConfig field (refer to the r8r_bridge field in types.rs
and the KNOWN_TOP_LEVEL symbol in validate.rs); ensure spelling matches exactly
and run tests to confirm validation accepts configs with r8r_bridge.

Comment thread src/config/types.rs
Comment thread src/providers/vertex.rs Outdated
Comment on lines +88 to +89
// Bearer token is required
let bearer_token = std::env::var("VERTEX_ACCESS_TOKEN").ok()?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove .expect() and validate empty config in from_config().

Line 88-89: std::env::var("VERTEX_ACCESS_TOKEN").ok()? accepts empty strings; add an explicit check like if bearer_token.is_empty() { return None; } to match the pattern in other providers.

Line 107: .expect("failed to build HTTP client") should be replaced with ? operator to propagate the error cleanly. As per coding guidelines, do not use expect() in production paths unless failure is truly unrecoverable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/vertex.rs` around lines 88 - 89, In from_config() ensure the
VERTEX_ACCESS_TOKEN is not only present but non-empty by replacing let
bearer_token = std::env::var("VERTEX_ACCESS_TOKEN").ok()? with code that checks
the string and returns None if empty (e.g., get var, if bearer_token.is_empty()
{ return None; }); also remove the .expect("failed to build HTTP client") used
when constructing the HTTP client (the Client::builder().build().expect(...)
call) and propagate the error instead using the ? operator (adjust the function
signature to return a Result if needed) so failures building the HTTP client
bubble up rather than panicking.

Comment thread src/providers/vertex.rs
Comment on lines +195 to +258
#[async_trait]
impl LLMProvider for VertexProvider {
async fn chat(
&self,
messages: Vec<Message>,
_tools: Vec<ToolDefinition>,
model: Option<&str>,
options: ChatOptions,
) -> Result<LLMResponse> {
let model = model.unwrap_or(&self.model);
let body = self.build_messages_body(&messages, &options);

debug!("Vertex AI request to model {} in {}", model, self.location);

let response = self
.client
.post(self.api_url(model))
.header("Content-Type", "application/json")
.header("Authorization", format!("Bearer {}", self.bearer_token))
.json(&body)
.send()
.await
.map_err(|e| ZeptoError::Provider(format!("Vertex AI request failed: {}", e)))?;

if response.status().is_success() {
let json: Value = response.json().await.map_err(|e| {
ZeptoError::Provider(format!("Failed to parse Vertex AI response: {}", e))
})?;

// Reuse Gemini's response parsing — the format is identical.
let content = GeminiProvider::extract_text(&json).unwrap_or_default();
let usage = GeminiProvider::extract_usage(&json);

let mut llm_response = LLMResponse::text(&content);
if let Some(u) = usage {
llm_response = llm_response.with_usage(u);
}
return Ok(llm_response);
}

let status = response.status().as_u16();
let error_text = response.text().await.unwrap_or_default();

// Try to extract a useful message from the Vertex AI error body.
let body_msg = serde_json::from_str::<Value>(&error_text)
.ok()
.and_then(|v| {
v["error"]["message"]
.as_str()
.map(|s| format!("Vertex AI API error: {}", s))
})
.unwrap_or_else(|| format!("Vertex AI API error: {}", error_text));

Err(ZeptoError::from(parse_provider_error(status, &body_msg)))
}

fn default_model(&self) -> &str {
&self.model
}

fn name(&self) -> &str {
"vertex"
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find the LLMProvider trait definition
fd LLMProvider --type f -e rs | head -20

Repository: qhkm/zeptoclaw

Length of output: 40


🏁 Script executed:

# Search for LLMProvider trait definition
rg -A 30 "trait LLMProvider" -t rs

Repository: qhkm/zeptoclaw

Length of output: 84


🏁 Script executed:

# Check if VertexProvider implements chat_stream
rg -A 10 "impl.*VertexProvider" src/providers/vertex.rs

Repository: qhkm/zeptoclaw

Length of output: 1154


🏁 Script executed:

# Look for any default implementations in the trait
rg -B 5 -A 15 "fn chat_stream" -t rs

Repository: qhkm/zeptoclaw

Length of output: 84


🏁 Script executed:

# Search for LLMProvider trait definition without file type filter
rg "trait LLMProvider" -A 30

Repository: qhkm/zeptoclaw

Length of output: 1750


🏁 Script executed:

# Check if VertexProvider implements chat_stream in vertex.rs
rg "chat_stream" src/providers/vertex.rs -A 10

Repository: qhkm/zeptoclaw

Length of output: 40


🏁 Script executed:

# Check all implementations in vertex.rs
cat src/providers/vertex.rs | grep -A 5 "impl LLMProvider"

Repository: qhkm/zeptoclaw

Length of output: 227


🏁 Script executed:

# Find where LLMProvider trait is defined
find . -name "*.rs" -type f -exec grep -l "trait LLMProvider" {} \;

Repository: qhkm/zeptoclaw

Length of output: 82


🏁 Script executed:

# Get the complete LLMProvider trait definition from types.rs
cat src/providers/types.rs | grep -A 100 "trait LLMProvider"

Repository: qhkm/zeptoclaw

Length of output: 3373


🏁 Script executed:

# Check if there's a default implementation for chat_stream
rg "chat_stream" src/providers/ -B 2 -A 10

Repository: qhkm/zeptoclaw

Length of output: 19167


🏁 Script executed:

# Check what methods are actually required vs optional in the trait
rg "fn chat_stream\|fn chat\|fn default_model\|fn name" src/providers/types.rs -B 1 -A 3

Repository: qhkm/zeptoclaw

Length of output: 40


VertexProvider silently advertises unsupported tool calling and streaming.

chat() ignores the tools parameter, and VertexProvider doesn't override chat_stream(), so it inherits the default implementation that wraps chat(). This means agents using Vertex appear to support tool calling and streaming but actually lose both capabilities. Either implement proper support through Vertex endpoints or return an explicit error until available.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/vertex.rs` around lines 195 - 258, VertexProvider currently
ignores the tools parameter in chat() and inherits a streaming wrapper via
chat_stream(), so it must explicitly refuse unsupported features: update
VertexProvider::chat to validate the tools argument and immediately return a
ZeptoError::Provider (or appropriate error type) when tools is non-empty
(mentioning that tool calling is unsupported by Vertex), and implement/override
VertexProvider::chat_stream to return an immediate error indicating streaming is
not supported instead of falling back to the default wrapper; reference the
methods LLMProvider::chat, VertexProvider::chat, and VertexProvider::chat_stream
and the ToolDefinition type when making these checks and error returns.

Comment on lines +42 to +50
let (first, rest) = match text.find(char::is_whitespace) {
Some(idx) => (&text[..idx], text[idx..].trim()),
None => (text, ""),
};

let keyword = first.to_lowercase();

if APPROVE_KEYWORDS.contains(&keyword.as_str()) {
return Some(ApprovalDecision {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Normalize punctuation on the decision token.

Replies like approve, looks good and reject. won’t parse because punctuation remains part of the first token.

🔧 Proposed fix
-    let keyword = first.to_lowercase();
+    let keyword = first
+        .trim_matches(|c: char| !c.is_alphanumeric())
+        .to_ascii_lowercase();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let (first, rest) = match text.find(char::is_whitespace) {
Some(idx) => (&text[..idx], text[idx..].trim()),
None => (text, ""),
};
let keyword = first.to_lowercase();
if APPROVE_KEYWORDS.contains(&keyword.as_str()) {
return Some(ApprovalDecision {
let (first, rest) = match text.find(char::is_whitespace) {
Some(idx) => (&text[..idx], text[idx..].trim()),
None => (text, ""),
};
let keyword = first
.trim_matches(|c: char| !c.is_alphanumeric())
.to_ascii_lowercase();
if APPROVE_KEYWORDS.contains(&keyword.as_str()) {
return Some(ApprovalDecision {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/approval.rs` around lines 42 - 50, The decision token
extraction currently keeps trailing punctuation so tokens like "approve," or
"reject." fail to match; update the logic after computing `first` to strip
surrounding punctuation before lowercasing (e.g., trim punctuation characters
from `first`), then assign that result to `keyword` and use it against
`APPROVE_KEYWORDS`/`REJECT_KEYWORDS` when constructing `ApprovalDecision`
(ensure you still handle the None/no-whitespace case and empty strings from
trimming). This change should be applied where `first`, `rest`, and `keyword`
are defined so punctuation-normalized tokens match correctly.

Comment thread src/r8r_bridge/dedup.rs
Comment on lines +23 to +28
pub fn new(max_entries: usize, ttl_secs: u64) -> Self {
Self {
seen: VecDeque::new(),
max_entries,
ttl_secs,
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle zero-capacity deduplicators explicitly.

With max_entries == 0, entries are still stored and duplicates can still be suppressed, which is unexpected for a zero-capacity setting.

🔧 Proposed fix
 pub fn is_new(&mut self, event_id: &str) -> bool {
+    if self.max_entries == 0 {
+        return true;
+    }
     let now = Instant::now();

Also applies to: 56-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/dedup.rs` around lines 23 - 28, The constructor new currently
allows storing entries even when max_entries == 0; explicitly treat
zero-capacity deduplicators as non-storing/non-suppressing by adding a
fast-path: set a zero_capacity flag (or check max_entries == 0) in new and then
short-circuit the dedup logic in the methods that add/check entries (the code
around lines 56-62 — e.g., insert/contains or push_entry functions) so they
never push to seen or mark entries as seen when max_entries == 0, and always
report “not seen” (no suppression) for incoming entries; ensure ttl_secs logic
is also bypassed for this case.

Comment thread src/r8r_bridge/mod.rs Outdated
Comment on lines +110 to +118
request
.headers_mut()
.insert("Host", host_with_port.parse().unwrap());

if let Some(ref token) = self.token {
request.headers_mut().insert(
"Authorization",
format!("Bearer {token}").parse().unwrap(),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and read the file around lines 110-118
head -n 125 src/r8r_bridge/mod.rs | tail -n 30

Repository: qhkm/zeptoclaw

Length of output: 1011


🏁 Script executed:

# Get full context of the function containing these lines
head -n 150 src/r8r_bridge/mod.rs | tail -n 80

Repository: qhkm/zeptoclaw

Length of output: 2966


🏁 Script executed:

# Check the full file structure to understand function signatures
wc -l src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 83


Return a connection error instead of panicking on bad headers.

The connect() method is a public async API that already returns Result<(), String>. Two .parse().unwrap() calls on header values (Host and Authorization) can panic if the token or derived host contains invalid HTTP header characters, violating the guideline against unwrap()/expect() in production paths. Propagate these errors consistently with the existing error handling in this method.

Suggested change
-        request
-            .headers_mut()
-            .insert("Host", host_with_port.parse().unwrap());
+        let host_header = host_with_port
+            .parse()
+            .map_err(|e| format!("Invalid Host header: {e}"))?;
+        request.headers_mut().insert("Host", host_header);

         if let Some(ref token) = self.token {
+            let auth_header = format!("Bearer {token}")
+                .parse()
+                .map_err(|e| format!("Invalid Authorization header: {e}"))?;
             request.headers_mut().insert(
                 "Authorization",
-                format!("Bearer {token}").parse().unwrap(),
+                auth_header,
             );
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/mod.rs` around lines 110 - 118, In connect() avoid panics from
header parsing: replace the two parse().unwrap() calls on Host
(host_with_port.parse()) and Authorization (format!("Bearer {token}").parse())
with fallible handling that returns Err(String) on failure; use the same
error-return style as the function (e.g., map_err or match) and include context
like "invalid Host header" or "invalid Authorization header" so
request.headers_mut().insert(...) only runs with valid HeaderValue instances;
update references in connect(), self.token, and request.headers_mut()
accordingly.

Comment thread src/r8r_bridge/mod.rs
Comment on lines +141 to +149
tokio::spawn(async move {
while let Some(msg) = rx.recv().await {
let mut writer = ws_write_clone.lock().await;
if let Err(e) = writer.send(WsMessage::Text(msg.into())).await {
warn!("r8r bridge send error: {e}");
break;
}
}
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

disconnect() leaves the live socket and reader task behind.

This only clears the outbound sender and flips the flag. The spawned reader keeps consuming until the peer closes, so run() can reconnect and end up with multiple live connections. Store shutdown/task handles and actively close or abort them here.

Also applies to: 157-258, 264-269

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/mod.rs` around lines 141 - 149, disconnect() currently only
clears the outbound sender and flips a flag but does not stop the spawned
reader/writer tasks or close the underlying websocket, so reconnections can
leave multiple live tasks; modify the code that spawns the tasks (the
tokio::spawn block which reads from rx and uses ws_write_clone to send, and the
corresponding reader task spawned in run()) to return and store their
JoinHandle(s) (e.g., store in a struct field like reader_handle/writer_handle),
and update disconnect() to actively abort those handles (JoinHandle::abort) and
close the websocket (call close on the sink/stream or send a Close frame via
writer.send(WsMessage::Close(_))). Also ensure run() checks/cleans these handles
before creating new tasks so no orphaned tasks remain; locate usages of rx,
ws_write_clone, writer.send, and the reader spawn to implement these changes.

Comment thread src/r8r_bridge/mod.rs
Comment on lines +187 to +205
// Dedup check.
{
let mut dd = dedup.lock().await;
if !dd.is_new(&envelope.id) {
info!("r8r bridge: skipping duplicate event {}", envelope.id);
continue;
}
}

// Send ack.
let ack = Ack {
event_id: envelope.id.clone(),
};
if let Ok(ack_json) = serde_json::to_string(&ack) {
let mut writer = ws_write.lock().await;
if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await {
warn!("r8r bridge: failed to send ack: {e}");
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Ack duplicates before dropping them.

A retried event currently hits the dedup check and returns before the ack is written, so the sender never learns it can stop retrying.

Suggested change
-                // Dedup check.
-                {
-                    let mut dd = dedup.lock().await;
-                    if !dd.is_new(&envelope.id) {
-                        info!("r8r bridge: skipping duplicate event {}", envelope.id);
-                        continue;
-                    }
-                }
-
-                // Send ack.
+                let is_new = {
+                    let mut dd = dedup.lock().await;
+                    dd.is_new(&envelope.id)
+                };
+
+                // Ack even duplicates so the sender can stop retrying.
                 let ack = Ack {
                     event_id: envelope.id.clone(),
                 };
                 if let Ok(ack_json) = serde_json::to_string(&ack) {
                     let mut writer = ws_write.lock().await;
                     if let Err(e) = writer.send(WsMessage::Text(ack_json.into())).await {
                         warn!("r8r bridge: failed to send ack: {e}");
                     }
                 }
+
+                if !is_new {
+                    info!("r8r bridge: skipping duplicate event {}", envelope.id);
+                    continue;
+                }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/mod.rs` around lines 187 - 205, The ack is only sent after the
dedup check so duplicates are dropped without acknowledging; move the ack send
logic to occur before the dedup block (but keep the dedup check and the early
continue to skip further processing), i.e., create Ack { event_id:
envelope.id.clone() }, serialize it with serde_json::to_string, lock ws_write
and send the WsMessage::Text as currently done (the writer.send call), handling
send errors as before, then perform the dedup.lock().await and if
!dd.is_new(&envelope.id) continue; so retried events are always acknowledged
even when deduplicated.

Comment thread src/r8r_bridge/mod.rs
Comment on lines +278 to +283
let sender_guard = self.sender.lock().await;
match sender_guard.as_ref() {
Some(tx) => tx
.send(json)
.await
.map_err(|e| format!("Failed to send message: {e}")),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file
find . -name "mod.rs" -path "*/r8r_bridge/*" -type f

Repository: qhkm/zeptoclaw

Length of output: 81


🏁 Script executed:

# Check if the file exists and get basic info
ls -la src/r8r_bridge/mod.rs 2>/dev/null || echo "File not found at src/r8r_bridge/mod.rs"
wc -l src/r8r_bridge/mod.rs 2>/dev/null || echo "Cannot read file"

Repository: qhkm/zeptoclaw

Length of output: 155


🏁 Script executed:

# Read the specific lines mentioned in the review (278-283)
sed -n '270,290p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 901


🏁 Script executed:

# Also need to find the broader context - look for the function containing this code
sed -n '250,300p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 1830


🏁 Script executed:

# Find the struct definition and sender field
sed -n '1,100p' src/r8r_bridge/mod.rs | head -80

Repository: qhkm/zeptoclaw

Length of output: 2939


🏁 Script executed:

# Search for the struct definition
grep -n "struct\|sender:" src/r8r_bridge/mod.rs | head -20

Repository: qhkm/zeptoclaw

Length of output: 191


🏁 Script executed:

# Look for the full struct definition
sed -n '1,150p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 5616


🏁 Script executed:

# Verify mpsc::Sender is Clone - check Tokio documentation pattern
# Also check for other potential lock-across-await patterns
grep -n "\.lock()\.await" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 618


🏁 Script executed:

# Check the actual channel creation to confirm it's bounded
grep -n "mpsc::channel" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 118


🏁 Script executed:

# Look for other methods that might have similar issues
grep -n "pub async fn" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 444


🏁 Script executed:

# Let's see the disconnect() method in full context
sed -n '264,273p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 427


🏁 Script executed:

# And also check the connect() method where sender_guard is used properly
sed -n '128,135p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 352


🏁 Script executed:

# Let's verify the complete send() method with context
sed -n '274,288p' src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 623


🏁 Script executed:

# Verify that mpsc::Sender implements Clone by checking Tokio documentation patterns
# and also check if there are any usages of send() in tests or elsewhere that might be affected
grep -n "\.send(" src/r8r_bridge/mod.rs

Repository: qhkm/zeptoclaw

Length of output: 306


Hold the sender lock scope tightly—avoid awaiting on send() while holding the mutex.

The send().await call can block indefinitely if the bounded channel (capacity 256) is under backpressure. Holding the mutex during this wait blocks disconnect() and connect() from acquiring the lock, creating a potential deadlock. The connect() method at line 131 already demonstrates the correct pattern: acquire the lock in a scoped block, release it, then proceed with async operations.

Clone the sender inside a scoped block to release the lock before awaiting:

Suggested change
-        let sender_guard = self.sender.lock().await;
-        match sender_guard.as_ref() {
-            Some(tx) => tx
+        let tx = {
+            let sender_guard = self.sender.lock().await;
+            sender_guard.clone()
+        };
+        match tx {
+            Some(tx) => tx
                 .send(json)
                 .await
                 .map_err(|e| format!("Failed to send message: {e}")),
             None => Err("Not connected".to_string()),
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let sender_guard = self.sender.lock().await;
match sender_guard.as_ref() {
Some(tx) => tx
.send(json)
.await
.map_err(|e| format!("Failed to send message: {e}")),
let tx = {
let sender_guard = self.sender.lock().await;
sender_guard.clone()
};
match tx {
Some(tx) => tx
.send(json)
.await
.map_err(|e| format!("Failed to send message: {e}")),
None => Err("Not connected".to_string()),
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/mod.rs` around lines 278 - 283, The code is awaiting
tx.send(...) while holding the mutex guard on self.sender, which can deadlock
under backpressure; fix by scoping the lock to clone the sender (e.g., take the
Option<Sender> from sender_guard and clone or .to_owned() the contained tx)
inside a short block, drop the guard, then call send(json).await on the cloned
tx (and keep the existing map_err handling), similar to the pattern used in
connect() and ensuring disconnect() and connect() can acquire the mutex.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/r8r_bridge/events.rs (1)

118-266: Consider extracting inline deserializer structs to reduce duplication.

Each type-string branch defines an identical helper struct D that mirrors the BridgeEvent variant fields. While functionally correct, this creates maintenance burden when fields change. You could use a macro or extract shared helper structs to reduce boilerplate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/r8r_bridge/events.rs` around lines 118 - 266, The from_type_and_data
function contains repeated inline deserializer structs D for each branch (e.g.,
r8r.approval.requested, r8r.execution.failed, zeptoclaw.workflow.trigger) which
duplicates the BridgeEvent variant fields; extract each inline struct into named
Deserialize types (e.g., ApprovalRequestedData, ExecutionFailedData,
WorkflowTriggerData) or create a small macro to generate them, then replace
serde_json::from_value(data.clone()) with deserialization into those named types
and map their fields into the corresponding BridgeEvent::... variants to remove
boilerplate and centralize field definitions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/r8r_bridge/events.rs`:
- Around line 118-266: The from_type_and_data function contains repeated inline
deserializer structs D for each branch (e.g., r8r.approval.requested,
r8r.execution.failed, zeptoclaw.workflow.trigger) which duplicates the
BridgeEvent variant fields; extract each inline struct into named Deserialize
types (e.g., ApprovalRequestedData, ExecutionFailedData, WorkflowTriggerData) or
create a small macro to generate them, then replace
serde_json::from_value(data.clone()) with deserialization into those named types
and map their fields into the corresponding BridgeEvent::... variants to remove
boilerplate and centralize field definitions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 08c8d093-4db0-4554-a81f-ad0098a12042

📥 Commits

Reviewing files that changed from the base of the PR and between 11c179d and 6c296c7.

📒 Files selected for processing (5)
  • src/r8r_bridge/events.rs
  • src/r8r_bridge/mod.rs
  • tests/r8r_bridge_approval_test.rs
  • tests/r8r_bridge_client_test.rs
  • tests/r8r_bridge_events_test.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/r8r_bridge_approval_test.rs
  • tests/r8r_bridge_client_test.rs

@booya
Copy link
Copy Markdown

booya commented Mar 16, 2026

@qhkm How does it work with access tokens being short lived? While the google crates are maybe a bit heavy, they do all the heavy lifting of managing access token lifecycle. If gcloud cli is needed to get the access tokens that is a much much bigger overhead than the google crates.

@qhkm
Copy link
Copy Markdown
Owner Author

qhkm commented Mar 16, 2026

@qhkm How does it work with access tokens being short lived? While the google crates are maybe a bit heavy, they do all the heavy lifting of managing access token lifecycle. If gcloud cli is needed to get the access tokens that is a much much bigger overhead than the google crates.

yeah i was thinking of migrating that, will run a bit of internal test first and see how. Will update again soon

Address CodeRabbit review feedback:
- Add ZEPTOCLAW_PROVIDERS_VERTEX_{API_KEY,API_BASE,MODEL} env overrides
- Filter empty VERTEX_ACCESS_TOKEN strings in from_config()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace static VERTEX_ACCESS_TOKEN with proper Application Default
Credentials (ADC) via google-cloud-auth. Tokens are now automatically
refreshed before expiry, fixing the short-lived token problem for
long-running gateways.

Auth priority: VERTEX_ACCESS_TOKEN (static, backward compat) → ADC
(service account JSON or gcloud default credentials, auto-refresh).

Binary size impact: +33 KB (shares most deps with existing stack).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/kernel/provider.rs`:
- Around line 39-42: The callers still treat the now-async functions
provider_from_runtime_selection and build_runtime_provider_chain as synchronous;
update all call sites (e.g., in src/cli/common.rs and src/kernel/mod.rs and any
locations around the mentioned lines) to await the futures and propagate async:
mark the caller functions as async (or .await where inside an async context) and
add .await after calls to build_runtime_provider_chain() and
provider_from_runtime_selection(); ensure the call chain is updated consistently
(or restore a sync wrapper) so compilation succeeds.

In `@src/providers/vertex.rs`:
- Around line 183-189: The api_url method currently prefixes the hostname with
"{location}-" unconditionally which produces an invalid host for the global
endpoint; update api_url (function api_url, using self.location and
self.project_id) to special-case when self.location == "global" and use
"aiplatform.googleapis.com" as the host, otherwise continue to use
"{location}-aiplatform.googleapis.com", and ensure the rest of the path still
uses the project and model values unchanged.
- Around line 41-42: The VertexAuth enum currently uses the private type
google_cloud_auth::token::Token (Adc variant) which won't compile; replace that
variant to use the public credentials API instead (for example
Arc<google_cloud_auth::credentials::Credentials> or
Arc<google_cloud_auth::credentials::AccessTokenCredentials> depending on whether
you need full ADC behavior or just an access token). Update the Adc variant
declaration in VertexAuth and any construction/usage sites (the places around
the previous token/DefaultTokenProvider usage) to accept and call the public
Credentials/AccessTokenCredentials API so the code uses
google_cloud_auth::credentials::{Credentials, AccessTokenCredentials} rather
than the private token module.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 000cf824-b77a-4e9b-abd8-4be266f6d135

📥 Commits

Reviewing files that changed from the base of the PR and between 53a727f and 50447ab.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • Cargo.toml
  • src/kernel/provider.rs
  • src/providers/vertex.rs

Comment thread src/kernel/provider.rs
Comment on lines +39 to 42
pub async fn provider_from_runtime_selection(
selection: &RuntimeProviderSelection,
configured_model: &str,
) -> Option<Box<dyn LLMProvider>> {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Propagate the async conversion to the remaining callers.

provider_from_runtime_selection() and build_runtime_provider_chain() now return futures, but the same file still calls build_runtime_provider_chain() synchronously on Lines 418, 430, 453, 480, and 498. The provided snippets show the same missing .await in src/cli/common.rs Lines 373-379 and src/kernel/mod.rs Lines 113-117. As written, this change leaves the tree in a non-compiling state until those callers are updated or the sync surface is restored.

Also applies to: 134-146

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/provider.rs` around lines 39 - 42, The callers still treat the
now-async functions provider_from_runtime_selection and
build_runtime_provider_chain as synchronous; update all call sites (e.g., in
src/cli/common.rs and src/kernel/mod.rs and any locations around the mentioned
lines) to await the futures and propagate async: mark the caller functions as
async (or .await where inside an async context) and add .await after calls to
build_runtime_provider_chain() and provider_from_runtime_selection(); ensure the
call chain is updated consistently (or restore a sync wrapper) so compilation
succeeds.

Comment thread src/providers/vertex.rs
Comment on lines +41 to +42
Adc(Arc<google_cloud_auth::token::Token>),
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In google-cloud-auth1.7.x, aregoogle_cloud_auth::token::Tokenandgoogle_cloud_auth::token::DefaultTokenProvider public APIs? If not, what public ADC/access-token API should be used instead?

💡 Result:

In google-cloud-auth 1.7.x, google_cloud_auth::token::Token and google_cloud_auth::token::DefaultTokenProvider are not public APIs. No documentation or source snippets explicitly show a token module or these types as publicly exported in the current version. The crate's public modules are credentials, errors, signer, and build_errors, with the primary public ADC API being google_cloud_auth::credentials::Credentials (and its provider traits). SDK clients consume implementations of credentials::Credentials to authenticate RPCs. For access tokens, use AccessToken from the credentials module or AccessTokenCredentials, which provides direct access to tokens via the AccessTokenCredentialsProvider trait.

Citations:


Rewrite VertexAuth to use the public google-cloud-auth credentials API.

google_cloud_auth::token::Token and google_cloud_auth::token::DefaultTokenProvider are not public in google-cloud-auth 1.7.x; the token module is pub(crate) only. This code will not compile. Use the public credentials module instead: google_cloud_auth::credentials::Credentials for the main ADC abstraction, or AccessToken / AccessTokenCredentials for token-specific handling.

Also applies to: 94–99

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/vertex.rs` around lines 41 - 42, The VertexAuth enum currently
uses the private type google_cloud_auth::token::Token (Adc variant) which won't
compile; replace that variant to use the public credentials API instead (for
example Arc<google_cloud_auth::credentials::Credentials> or
Arc<google_cloud_auth::credentials::AccessTokenCredentials> depending on whether
you need full ADC behavior or just an access token). Update the Adc variant
declaration in VertexAuth and any construction/usage sites (the places around
the previous token/DefaultTokenProvider usage) to accept and call the public
Credentials/AccessTokenCredentials API so the code uses
google_cloud_auth::credentials::{Credentials, AccessTokenCredentials} rather
than the private token module.

Comment thread src/providers/vertex.rs
Comment on lines +183 to +189
fn api_url(&self, model: &str) -> String {
format!(
"https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent",
location = self.location,
project = self.project_id,
model = model,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For Vertex AI generateContent, what hostname should be used when the resource location is global?

💡 Result:

For Vertex AI generateContent when the resource location is global, the hostname to use is aiplatform.googleapis.com.

Citations:


🏁 Script executed:

# Check the actual current state of vertex.rs
cat -n src/providers/vertex.rs | head -200

Repository: qhkm/zeptoclaw

Length of output: 8896


🏁 Script executed:

# Search for tests related to api_url or vertex provider
find . -type f -name "*.rs" | xargs grep -l "api_url\|vertex" | head -20

Repository: qhkm/zeptoclaw

Length of output: 272


🏁 Script executed:

# Check for any existing handling of "global" location
rg "global" src/providers/vertex.rs -B 2 -A 2

Repository: qhkm/zeptoclaw

Length of output: 40


🏁 Script executed:

# Check how location is used elsewhere in vertex.rs
rg "location" src/providers/vertex.rs -B 1 -A 1

Repository: qhkm/zeptoclaw

Length of output: 2587


Handle the global Vertex endpoint correctly.

api_url() constructs the hostname by always prefixing {location}-, but Vertex AI's global endpoint is https://aiplatform.googleapis.com/... without a location prefix. Setting VERTEX_LOCATION=global will currently generate https://global-aiplatform.googleapis.com/..., which is incorrect and will fail API calls.

Suggested fix
 fn api_url(&self, model: &str) -> String {
+    let host = if self.location == "global" {
+        "aiplatform.googleapis.com".to_string()
+    } else {
+        format!("{}-aiplatform.googleapis.com", self.location)
+    };
+
     format!(
-        "https://{location}-aiplatform.googleapis.com/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent",
-        location = self.location,
+        "https://{host}/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent",
+        host = host,
+        location = self.location,
         project = self.project_id,
         model = model,
     )
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/vertex.rs` around lines 183 - 189, The api_url method currently
prefixes the hostname with "{location}-" unconditionally which produces an
invalid host for the global endpoint; update api_url (function api_url, using
self.location and self.project_id) to special-case when self.location ==
"global" and use "aiplatform.googleapis.com" as the host, otherwise continue to
use "{location}-aiplatform.googleapis.com", and ensure the rest of the path
still uses the project and model values unchanged.

@qhkm
Copy link
Copy Markdown
Owner Author

qhkm commented Mar 26, 2026

Hey — this is solid work but it needs to be split before merge. The PR bundles 3 unrelated features into one:

1. Vertex AI provider (matches title + issue #363)

  • src/providers/vertex.rs, gemini.rs, mod.rs, registry.rs
  • src/kernel/provider.rs (async changes)
  • Config: vertex field + env overrides
  • Deps: google-cloud-auth

2. R8r bridge (completely separate feature)

  • src/r8r_bridge/approval.rs, dedup.rs, events.rs
  • src/lib.rs (module export)
  • Config: R8rBridgeConfig, ChannelTarget

3. System prompt config (independent concern)

  • src/config/types.rs (system_prompt field on AgentDefaults)
  • src/config/mod.rs (env override)
  • src/cli/common.rs (priority change)

Also:

  • PR description says "zero new dependencies" but adds google-cloud-auth and url to Cargo.toml
  • Description lists 6 files but PR touches 14
  • CI is stale — needs rebase onto main (24+ commits behind)

I'll split this into separate PRs so each can be reviewed and merged independently. Vertex provider (issue #363) goes first.

qhkm added a commit that referenced this pull request Mar 26, 2026
Adds first-class Vertex AI provider for Gemini models via regional
endpoint with ADC auto-refresh and static bearer token auth.

- Zero-config auth: VERTEX_ACCESS_TOKEN env var or google-cloud-auth ADC
- Reuses GeminiProvider response parsing (extract_text, extract_usage)
- 14 unit tests, all passing
- Config: providers.vertex.api_key (project ID), providers.vertex.api_base (location)
- Env overrides: ZEPTOCLAW_PROVIDERS_VERTEX_{API_KEY,API_BASE,MODEL}

Split from PR #364 — Vertex provider only, no r8r_bridge or system_prompt changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qhkm
Copy link
Copy Markdown
Owner Author

qhkm commented Mar 26, 2026

Superseded by #447 (Vertex provider only). R8r bridge and system prompt config will be separate PRs.

@qhkm qhkm closed this Mar 26, 2026
qhkm added a commit that referenced this pull request Mar 26, 2026
## Summary

Closes #363

Adds first-class Google Vertex AI provider for Gemini models via
regional endpoint with enterprise-grade auth.

- **Auth priority:** `VERTEX_ACCESS_TOKEN` env var (static bearer) →
`google-cloud-auth` ADC (auto-refresh via service account or `gcloud
auth application-default login`)
- **Reuses** GeminiProvider response parsing (`extract_text`,
`extract_usage`) — no duplicated logic
- **Config:** `providers.vertex.api_key` (GCP project ID),
`providers.vertex.api_base` (location)
- **Env overrides:**
`ZEPTOCLAW_PROVIDERS_VERTEX_{API_KEY,API_BASE,MODEL}`

Split from #364 — Vertex provider only. R8r bridge and system prompt
changes will be separate PRs.

## Test plan

- [x] `cargo fmt -- --check` clean
- [x] `cargo clippy -- -D warnings` clean
- [x] 14 new Vertex unit tests pass
- [x] Full lib suite: 3353 passed, 0 failed
- [ ] Manual: `provider status` shows vertex when configured
- [ ] Manual: `agent -m "Hello"` works with vertex provider

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added support for Google Vertex AI as a new provider for chat
operations
* Vertex AI configuration available via environment variables for API
key, endpoint, and model selection

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(providers): add Google Vertex AI provider

2 participants