fix: harden openai-compatible provider, approval replay, and embeddings defaults (#237)
Conversation
…ddings

# Conflicts:
#	CHANGELOG.md
#	src/agent/agent_loop.rs
#	src/config.rs
#	src/llm/mod.rs
- Make ChatCompletionResponse.id Option<String> to handle providers that omit or null the field
- Propagate HTTP client builder errors instead of silently dropping timeout configuration (openai_compatible_chat, nearai_chat)
- Add EMBEDDING_DIMENSION env var with smart per-model defaults instead of hardcoding 768/1536 everywhere
- Remove duplicated dimension inference logic from main.rs

Co-Authored-By: panosAthDBX <panosAthDBX@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of Changes

Hello @ilblackdragon, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the robustness and flexibility of the LLM integration by hardening the OpenAI-compatible chat provider, introducing a multi-tool approval replay mechanism, and expanding embedding support to include Ollama and flexible dimensions. These changes improve compatibility with various LLM services, streamline complex agentic workflows, and give greater control over embedding configuration.

Highlights
Changelog
Activity
Code Review
This pull request introduces a substantial set of features and robustness improvements across the LLM provider landscape, including the addition of the OpenAiCompatibleChatProvider, support for Ollama embeddings, flexible embedding dimensions, and deferred tool call replay for multi-tool approvals. However, a high-severity security vulnerability was identified in the web UI's HTML sanitization logic. The custom regular expression-based sanitization is bypassable, which could lead to Cross-Site Scripting (XSS) and the theft of sensitive authentication tokens. It is strongly recommended to use a dedicated sanitization library like DOMPurify to mitigate this risk. Additionally, there are suggestions to improve maintainability by refactoring duplicated code and to enhance error handling for more robust provider interactions.
```rust
let mut deferred_queue = std::collections::VecDeque::from(deferred_tool_calls);
while let Some(tc) = deferred_queue.pop_front() {
    // Re-check approval for each deferred tool call
    if let Some(tool) = self.tools().get(&tc.name).await
        && tool.requires_approval()
    {
        let is_auto_approved = {
            let sess = session.lock().await;
            let mut approved = sess.is_tool_auto_approved(&tc.name);
            if approved && tool.requires_approval_for(&tc.arguments) {
                approved = false;
            }
            approved
        };

        if !is_auto_approved {
            let new_pending = PendingApproval {
                request_id: Uuid::new_v4(),
                tool_name: tc.name.clone(),
                parameters: tc.arguments.clone(),
                description: tool.description().to_string(),
                tool_call_id: tc.id.clone(),
                context_messages: context_messages.clone(),
                deferred_tool_calls: deferred_queue.iter().cloned().collect(),
            };

            let request_id = new_pending.request_id;
            let tool_name = new_pending.tool_name.clone();
            let description = new_pending.description.clone();
            let parameters = new_pending.parameters.clone();

            {
                let mut sess = session.lock().await;
                if let Some(thread) = sess.threads.get_mut(&thread_id) {
                    thread.await_approval(new_pending);
                }
            }

            let _ = self
                .channels
                .send_status(
                    &message.channel,
                    StatusUpdate::Status("Awaiting approval".into()),
                    &message.metadata,
                )
                .await;

            return Ok(SubmissionResult::NeedApproval {
                request_id,
                tool_name,
                description,
                parameters,
            });
        }
    }

    let _ = self
        .channels
        .send_status(
            &message.channel,
            StatusUpdate::ToolStarted {
                name: tc.name.clone(),
            },
            &message.metadata,
        )
        .await;

    let deferred_result = self
        .execute_chat_tool(&tc.name, &tc.arguments, &job_ctx)
        .await;

    let _ = self
        .channels
        .send_status(
            &message.channel,
            StatusUpdate::ToolCompleted {
                name: tc.name.clone(),
                success: deferred_result.is_ok(),
            },
            &message.metadata,
        )
        .await;

    if let Ok(ref output) = deferred_result
        && !output.is_empty()
    {
        let _ = self
            .channels
            .send_status(
                &message.channel,
                StatusUpdate::ToolResult {
                    name: tc.name.clone(),
                    preview: output.clone(),
                },
                &message.metadata,
            )
            .await;
    }

    // Record in thread
    {
        let mut sess = session.lock().await;
        if let Some(thread) = sess.threads.get_mut(&thread_id)
            && let Some(turn) = thread.last_turn_mut()
        {
            match &deferred_result {
                Ok(output) => turn.record_tool_result(serde_json::json!(output)),
                Err(e) => turn.record_tool_error(e.to_string()),
            }
        }
    }

    // Auth detection for deferred tools
    if let Some((ext_name, instructions)) =
        detect_auth_awaiting(&tc.name, &deferred_result)
    {
        let auth_data = parse_auth_result(&deferred_result);
        {
            let mut sess = session.lock().await;
            if let Some(thread) = sess.threads.get_mut(&thread_id) {
                thread.enter_auth_mode(ext_name.clone());
                thread.complete_turn(&instructions);
            }
        }
        let _ = self
            .channels
            .send_status(
                &message.channel,
                StatusUpdate::AuthRequired {
                    extension_name: ext_name,
                    instructions: Some(instructions.clone()),
                    auth_url: auth_data.auth_url,
                    setup_url: auth_data.setup_url,
                },
                &message.metadata,
            )
            .await;
        return Ok(SubmissionResult::response(instructions));
    }

    let deferred_content = match deferred_result {
        Ok(output) => {
            let sanitized =
                self.safety().sanitize_tool_output(&tc.name, &output);
            self.safety().wrap_for_llm(
                &tc.name,
                &sanitized.content,
                sanitized.was_modified,
            )
        }
        Err(e) => format!("Error: {}", e),
    };

    context_messages
        .push(ChatMessage::tool_result(&tc.id, &tc.name, deferred_content));
}
```
There is significant code duplication between this block for handling deferred tool calls and the main tool execution loop in dispatcher.rs. Both sections handle iterating through tool calls, checking for approvals, executing tools, and processing results.
This duplication can make maintenance difficult, as changes in one place might need to be manually synchronized in the other. Consider refactoring this shared logic into a helper function that can be called from both dispatcher.rs and here. This would centralize the tool execution flow, improving maintainability and reducing the risk of inconsistencies.
The deferred tool call duplication is a valid observation. However, refactoring this into a shared helper is a larger architectural change that touches the core agent loop — it warrants its own PR. Filed as a follow-up.
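As a rough illustration of what such a helper might factor out, here is a hedged sketch: a pure function that splits a tool-call queue at the first call still needing approval, which both the dispatcher and the replay path could share. The `ToolCall` struct and the `split_at_first_unapproved` name are hypothetical, not the crate's actual API.

```rust
// Hypothetical sketch only; types and names are illustrative.
#[derive(Clone, Debug, PartialEq)]
struct ToolCall {
    name: String,
    arguments: String,
}

/// Split tool calls into (ready-to-run, deferred) at the first call that
/// still needs approval, preserving order on both sides so replay resumes
/// exactly where execution paused.
fn split_at_first_unapproved(
    calls: Vec<ToolCall>,
    is_auto_approved: impl Fn(&ToolCall) -> bool,
) -> (Vec<ToolCall>, Vec<ToolCall>) {
    let mut ready = Vec::new();
    let mut iter = calls.into_iter();
    while let Some(tc) = iter.next() {
        if is_auto_approved(&tc) {
            ready.push(tc);
        } else {
            // The unapproved call heads the deferred queue, followed by
            // everything after it, matching the PendingApproval layout.
            let mut deferred = vec![tc];
            deferred.extend(iter);
            return (ready, deferred);
        }
    }
    (ready, Vec::new())
}
```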
```rust
};

let status = response.status();
let response_text = response.text().await.unwrap_or_default();
```
The use of unwrap_or_default() here can hide potential errors when reading the response body. If response.text().await fails, this will proceed with an empty string, leading to a potentially misleading "JSON parse error" later on. It would be more robust to handle the Result from .text() and return a more specific error, which would aid in debugging issues with the response stream from the provider.
```rust
let response_text = match response.text().await {
    Ok(text) => text,
    Err(e) => {
        return Err(LlmError::RequestFailed {
            provider: "openai_compatible_chat".to_string(),
            reason: format!("Failed to read response body: {}", e),
        });
    }
};
```

References
- Ensuring robust error handling when reading API response bodies helps prevent client-side parsing failures, aligning with the principle that API endpoints should provide valid and consistent responses.
Pull request overview
This PR hardens LLM provider robustness and embeddings flexibility while adding support for OpenAI-compatible endpoints and Ollama embeddings. It's a continuation of #112 that addresses reviewer feedback and merges with the latest main branch architecture changes.
Changes:
- Adds OpenAI-compatible chat provider with retry logic, tool-call normalization, and robust usage parsing
- Implements multi-tool approval replay with deferred tool calls to prevent orphaned tool_result protocol errors
- Adds Ollama embeddings provider with configurable dimensions and removes fixed 1536-dim PostgreSQL constraint
- Improves provider configuration with base URL overrides for proxies (OpenAI, Anthropic)
- Adds tool message sanitization across all LLM providers to prevent HTTP 400 errors
- Adds approval command aliases (`/approve`, `/always`, `/deny`, `a`, `n`) for better UX
- Changes default sandbox image from `ghcr.io/nearai/sandbox:latest` to `ironclaw-worker:latest`
- Adds thread_id to approval SSE events for better web gateway filtering
- Fixes REPL quit forwarding to gracefully exit the agent loop
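The orphaned-tool_result sanitization mentioned in the changes above can be illustrated with a simplified sketch. The `Msg` type here is hypothetical; the real `sanitize_tool_messages` in src/llm/provider.rs operates on the crate's own message types.

```rust
// Simplified sketch of the orphaned tool_result idea; `Msg` is illustrative.
use std::collections::HashSet;

#[derive(Clone, Debug, PartialEq)]
enum Msg {
    Assistant { tool_call_ids: Vec<String> },
    ToolResult { tool_call_id: String },
    User(String),
}

/// Drop tool results whose id was never issued by a preceding assistant
/// message; providers typically reject such orphans with HTTP 400.
fn sanitize_tool_messages(messages: Vec<Msg>) -> Vec<Msg> {
    let mut issued = HashSet::new();
    messages
        .into_iter()
        .filter(|m| match m {
            Msg::Assistant { tool_call_ids } => {
                issued.extend(tool_call_ids.iter().cloned());
                true
            }
            Msg::ToolResult { tool_call_id } => issued.contains(tool_call_id),
            _ => true,
        })
        .collect()
}
```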
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/llm/openai_compatible_chat.rs | New OpenAI-compatible chat provider with retry, tool normalization, and robust usage parsing |
| src/llm/provider.rs | Added sanitize_tool_messages function with comprehensive tests to prevent orphaned tool_result errors |
| src/llm/nearai_chat.rs | Added configurable tool-message flattening and improved URL construction |
| src/llm/nearai.rs | Integrated sanitize_tool_messages for robustness |
| src/llm/rig_adapter.rs | Integrated sanitize_tool_messages before sending to Rig adapters |
| src/llm/mod.rs | Wired OpenAI-compatible provider and added base URL support for OpenAI/Anthropic |
| src/workspace/embeddings.rs | Added OllamaEmbeddings provider with configurable model and dimension |
| src/workspace/mod.rs | Exported OllamaEmbeddings |
| src/config/embeddings.rs | Added dimension inference from model names with EMBEDDING_DIMENSION override |
| src/config/llm.rs | Added optional base_url to OpenAI and Anthropic configs for proxy support |
| src/config/sandbox.rs | Changed default sandbox image to ironclaw-worker:latest |
| src/sandbox/config.rs | Changed default sandbox image to ironclaw-worker:latest |
| src/settings.rs | Changed default sandbox image to ironclaw-worker:latest |
| src/agent/dispatcher.rs | Collects deferred tool calls when approval is needed for multi-tool responses |
| src/agent/thread_ops.rs | Replays deferred tool calls after approval with proper approval re-checking |
| src/agent/session.rs | Added deferred_tool_calls field to PendingApproval |
| src/agent/submission.rs | Added approval aliases for better UX |
| src/channels/web/types.rs | Added thread_id to ApprovalNeeded SSE event |
| src/channels/web/mod.rs | Wired thread_id in approval events and added with_llm_provider |
| src/channels/web/static/app.js | Added thread filtering for approval events |
| src/channels/repl.rs | Forwards /quit command through agent loop for graceful shutdown |
| src/main.rs | Updated embeddings setup with dimension config and Ollama support |
| migrations/V9__flexible_embedding_dimension.sql | Removes fixed dimension constraint and HNSW index for flexible embeddings |
| CHANGELOG.md | Documents all changes |
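The dimension inference described for src/config/embeddings.rs can be sketched roughly as follows. Only the 3072 (text-embedding-3-large) and 1536 defaults appear in this PR; the 768 nomic-embed case and the function shapes are assumptions for illustration.

```rust
// Hedged sketch of EMBEDDING_DIMENSION resolution; names and the model
// table beyond text-embedding-3-large are illustrative assumptions.
fn default_dimension_for_model(model: &str) -> usize {
    match model {
        "text-embedding-3-large" => 3072,
        m if m.starts_with("nomic-embed") => 768, // common Ollama default (assumed)
        _ => 1536,                                // OpenAI-style default
    }
}

/// Resolve the embedding dimension: an explicit override wins, else infer
/// from the model name. Rejects zero, which no vector store can accept.
fn resolve_dimension(override_var: Option<&str>, model: &str) -> Result<usize, String> {
    match override_var {
        Some(s) => {
            let dim: usize = s
                .parse()
                .map_err(|e| format!("EMBEDDING_DIMENSION must be a positive integer: {e}"))?;
            if dim == 0 {
                return Err("EMBEDDING_DIMENSION must be greater than zero".to_string());
            }
            Ok(dim)
        }
        None => Ok(default_dimension_for_model(model)),
    }
}
```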
```rust
let result: OllamaEmbedResponse = response.json().await.map_err(|e| {
    EmbeddingError::InvalidResponse(format!("Failed to parse Ollama response: {}", e))
})?;
```
The Ollama embeddings implementation doesn't validate that the returned embedding dimensions match the configured dimension. If the model returns embeddings with a different dimension than expected, this could cause silent failures when storing to the database or performing vector operations. Consider adding validation after receiving the response to ensure each embedding has the expected dimension.
```rust
// Validate that all returned embeddings match the configured dimension.
let expected_dim = self.dimension;
if let Some((index, actual_dim)) = result
    .embeddings
    .iter()
    .enumerate()
    .find_map(|(i, emb)| {
        let len = emb.len();
        if len != expected_dim {
            Some((i, len))
        } else {
            None
        }
    })
{
    return Err(EmbeddingError::InvalidResponse(format!(
        "Ollama returned embedding with incorrect dimension at index {}: expected {}, got {}",
        index, expected_dim, actual_dim
    )));
}
```
Good catch. Added dimension validation in OllamaEmbeddings::embed_batch() — returns EmbeddingError::InvalidResponse if any returned embedding doesn't match the configured dimension. Fixed in 0c42321.
```rust
        .with_model(&config.embeddings.model, config.embeddings.dimension),
)),
"ollama" => Some(Arc::new(
    ironclaw::workspace::OllamaEmbeddings::new(
        &config.embeddings.ollama_base_url,
    )
    .with_model(&config.embeddings.model, config.embeddings.dimension),
)),
_ => {
    if let Some(api_key) = config.embeddings.openai_api_key() {
        let dim = match config.embeddings.model.as_str() {
            "text-embedding-3-large" => 3072,
            _ => 1536,
        };
        Some(Arc::new(ironclaw::workspace::OpenAiEmbeddings::with_model(
            api_key,
            &config.embeddings.model,
            dim,
            config.embeddings.dimension,
```
The libSQL migrations still use F32_BLOB(1536) with a fixed dimension, meaning libSQL deployments cannot use embedding models with different dimensions (e.g., Ollama's 768-dim models or OpenAI's 3072-dim models). While the PR description notes this as a "known limitation," users who configure non-1536-dim models with libSQL will experience silent failures or dimension mismatches when storing embeddings. Consider adding runtime validation that checks if the configured embedding dimension matches the database schema dimension, and emitting a clear error or warning when there's a mismatch for libSQL deployments.
Added a runtime tracing::warn! at startup when libSQL backend is configured with a non-1536 embedding dimension, explaining the F32_BLOB(1536) schema limitation and suggesting PostgreSQL or EMBEDDING_DIMENSION=1536. Fixed in 0c42321.
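A minimal sketch of that startup check, assuming a string backend name and the fixed F32_BLOB(1536) libSQL schema; the function name is illustrative, not the crate's actual code.

```rust
// Illustrative sketch of the libSQL dimension-mismatch warning.
const LIBSQL_SCHEMA_DIM: usize = 1536;

/// Returns a warning message when a libSQL deployment is configured with an
/// embedding dimension the schema cannot store, else None.
fn libsql_dimension_warning(backend: &str, dimension: usize) -> Option<String> {
    if backend == "libsql" && dimension != LIBSQL_SCHEMA_DIM {
        Some(format!(
            "libSQL schema uses F32_BLOB({LIBSQL_SCHEMA_DIM}) but embedding dimension is {dimension}; \
             use PostgreSQL or set EMBEDDING_DIMENSION={LIBSQL_SCHEMA_DIM}"
        ))
    } else {
        None
    }
}
```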
- Replace 9x .expect() on RwLock with graceful poison recovery (nearai.rs: 7, nearai_chat.rs: 2) — eliminates production panics
- Propagate HTTP client builder errors in nearai.rs instead of silently dropping timeout config (NearAiProvider::new now returns Result)
- Make nearai_chat ChatCompletionResponse.id Option<String> (mirrors openai_compatible_chat.rs fix for providers that omit id)
- Make nearai_chat usage fields optional with defensive parse_usage() helper (was required u32 fields that crash on null/missing)
- Truncate error responses to 512 chars in nearai_chat.rs error messages to prevent log bloat and potential data leakage
- Delegate 4 missing LlmProvider methods in FailoverProvider (model_metadata, seed_response_chain, get_response_chain_id, calculate_cost) to last-used provider instead of trait defaults

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…en decorators

- Add composable RetryProvider decorator wrapping any LlmProvider with exponential backoff + jitter, respecting RateLimited retry_after hints
- Remove openai_compatible_chat.rs — replaced by rig adapter + RetryProvider
- Remove internal retry loop from nearai.rs (was causing double-retry with external RetryProvider, up to 16 attempts instead of 4)
- Remove internal retry loop from nearai_chat.rs (same issue)
- Wire RetryProvider into main.rs composition chain: each provider gets its own retry wrapper before failover
- Move normalize_tool_name to rig_adapter.rs for all rig-based providers
- Reconcile is_retryable() vs is_transient() error classification: ModelNotAvailable no longer retryable, Json no longer transient
- Fix unchecked Duration subtraction panic in circuit_breaker.rs
- Make failover.rs use shared is_retryable() from retry.rs
- Remove stale #[allow(dead_code)] on NearAiResponse::id (field is used)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
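The backoff schedule a RetryProvider-style decorator might use can be sketched as follows. Constants are illustrative, not the crate's actual values, and jitter is omitted here to keep the sketch deterministic; the retry_after hint taking precedence mirrors the behavior described above.

```rust
use std::time::Duration;

// Hedged sketch: exponential doubling from a base delay, capped, with an
// explicit provider retry_after hint winning over the computed backoff.
fn backoff_delay(attempt: u32, retry_after: Option<Duration>) -> Duration {
    const BASE: Duration = Duration::from_secs(1);
    const CAP: Duration = Duration::from_secs(30);
    if let Some(hint) = retry_after {
        return hint.min(CAP); // RateLimited hint takes precedence
    }
    // 1s, 2s, 4s, 8s, ... capped at 30s; saturating to avoid overflow.
    BASE.saturating_mul(1u32 << attempt.min(16)).min(CAP)
}
```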
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
```rust
    EmbeddingError::InvalidResponse(format!("Failed to parse Ollama response: {}", e))
})?;

Ok(result.embeddings)
```
The OllamaEmbeddings provider doesn't validate that the returned embedding dimension matches the configured dimension. If the Ollama model returns embeddings of a different dimension than what was configured (e.g., due to model misconfiguration), this mismatch will only be caught later when storing to the database, leading to a confusing error. Consider adding validation after line 456 to check that each embedding in result.embeddings has length equal to self.dimension, and return an EmbeddingError::InvalidResponse if the dimension doesn't match.
```rust
let embeddings = result.embeddings;
for (i, emb) in embeddings.iter().enumerate() {
    if emb.len() != self.dimension {
        return Err(EmbeddingError::InvalidResponse(format!(
            "Ollama returned embedding of dimension {}, expected {} at index {}",
            emb.len(),
            self.dimension,
            i
        )));
    }
}
Ok(embeddings)
```
Duplicate of the Copilot comment above — fixed in 0c42321 with dimension validation in embed_batch().
…n, libSQL warning

- Replace response.text().await.unwrap_or_default() with proper error propagation in nearai.rs and nearai_chat.rs (4 call sites). Failures now return LlmError::RequestFailed with context instead of silently proceeding with an empty string.
- Add embedding dimension validation in OllamaEmbeddings::embed_batch(): returns EmbeddingError if Ollama returns embeddings with a dimension that doesn't match the configured value.
- Add runtime warning when libSQL backend is used with non-1536 embedding dimension, since the libSQL schema uses F32_BLOB(1536) and cannot store different-dimension vectors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-replay-embeddings

# Conflicts:
#	src/llm/failover.rs
#	src/llm/nearai_chat.rs
#	src/llm/rig_adapter.rs
#	src/main.rs
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 26 out of 26 changed files in this pull request and generated 5 comments.
```rust
let mut idx = 0usize;
while idx < tool_calls.len() {
    let mut tc = tool_calls[idx].clone();

    // Check if tool requires approval
```
Switching to a manual while idx < tool_calls.len() loop requires ensuring idx is advanced on every control-flow path. There are continue branches later in this loop body (e.g. hook rejection / policy error) that will now skip idx += 1 and can cause an infinite loop on the same tool call.
```rust
// Wait for at least 1 retry attempt (backoff is ~1s, so 1.5s should be enough)
tokio::time::sleep(Duration::from_millis(1500)).await;
```
This test introduces a 1.5s wall-clock sleep to wait for retry backoff, which will noticeably slow the suite and can be flaky under load. Consider making backoff delays configurable for tests, using an error with retry_after=Duration::ZERO, or using Tokio time control to advance time without real sleeps.
Suggested change:

```rust
// Wait briefly before flipping; this should allow at least one failure before retry
tokio::time::sleep(Duration::from_millis(10)).await;
```
```rust
let dimension = optional_env("EMBEDDING_DIMENSION")?
    .map(|s| s.parse::<usize>())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "EMBEDDING_DIMENSION".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?
    .unwrap_or_else(|| default_dimension_for_model(&model));
```
EMBEDDING_DIMENSION is parsed as usize, but the code (and error message) implies it must be a positive integer. As written, 0 is accepted and will propagate into embedding providers / DB writes where a zero-dimension vector is invalid. Add an explicit > 0 validation and return ConfigError::InvalidValue when the parsed dimension is 0.
Suggested change:

```rust
let dimension_opt = optional_env("EMBEDDING_DIMENSION")?
    .map(|s| s.parse::<usize>())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "EMBEDDING_DIMENSION".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?;
if let Some(0) = dimension_opt {
    return Err(ConfigError::InvalidValue {
        key: "EMBEDDING_DIMENSION".to_string(),
        message: "must be a positive integer greater than zero".to_string(),
    });
}
let dimension = dimension_opt.unwrap_or_else(|| default_dimension_for_model(&model));
```
```rust
let ollama_base_url = optional_env("OLLAMA_BASE_URL")?
    .or_else(|| settings.ollama_base_url.clone())
    .unwrap_or_else(|| "http://localhost:11434".to_string());
```
New env vars (OLLAMA_BASE_URL, EMBEDDING_DIMENSION) were added for embeddings resolution. The env-mutating tests in this module should also clear these vars in the clear_embedding_env() helper to avoid flakiness when they are set in the surrounding test environment.
```rust
    input: texts,
};

let url = format!("{}/api/embed", self.base_url);
```
OllamaEmbeddings builds the endpoint with format!("{}/api/embed", self.base_url). If base_url is configured with a trailing slash, this produces a //api/embed path which can 404 (depending on server path normalization / redirect behavior). Consider normalizing base_url (e.g., trimming trailing /) before formatting the endpoint URL.
Suggested change:

```rust
let base_url = self.base_url.trim_end_matches('/');
let url = format!("{}/api/embed", base_url);
```
Port relevant changes from PR #112 that were not carried over to #237:

- Add persist_turn calls in process_approval for the response, error, and auth-required paths. Previously, turns completed after tool approval were never persisted to DB — if the process crashed after approval the entire turn (user message + assistant response) was lost.
- Add agent-level unit tests: StaticLlmProvider mock, make_test_agent helper, tests for auto-approval logic, destructive shell command detection, and PendingApproval backward-compatible deserialization (without deferred_tool_calls field).
- Remove unused _thread_state binding in process_approval.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: persist turns after approval and add agent-level tests

* fix: address 14 audit findings in src/agent/

  Audit of the agent module found 2 High, 7 Medium, 3 Low, and 2 Nit severity issues. This commit fixes all of them:

  High:
  - Remove 4 `.expect()` calls in session.rs (entry API, match, direct indexing, if-let) to eliminate panic paths in production
  - Add typed RoutineError enum replacing Result<_, String> across routine.rs, routine_engine.rs, and callers in history/store.rs and db/libsql/mod.rs

  Medium:
  - Sanitize routine names in path construction to prevent directory traversal (routine_engine.rs)
  - Log warnings for 5 silently-swallowed errors in scheduler.rs, compaction.rs, and worker.rs
  - Extract shared handle_auth_intercept helper to deduplicate auth interception in thread_ops.rs
  - Add session count warning threshold in session_manager.rs
  - Make FullJob stub degradation visible via warn-level log and prepended warning in output

  Low:
  - Restrict dead code visibility with #[cfg(test)] on 19 unused items in submission.rs, task.rs, and undo.rs
  - Narrow pub to pub(crate) on self_repair.rs builder methods
  - Remove TaskStatus from mod.rs re-exports (test-only type)

* fix: address PR review comments

  - Reorder persist_turn before persist_response_chain so the conversation row exists before the metadata UPDATE runs
  - Add persist_response_chain call to handle_auth_intercept so auth-required paths preserve the response chain
  - Harden sanitize_routine_name to use allowlist (alphanumeric, dash, underscore) instead of denylist replacements
  - Fix stale active_thread ID in get_or_create_thread: fall back to create_thread() when the stored ID is missing from the map
  - Persist turn on approval rejection so user messages survive crashes after a tool is rejected

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gs defaults (nearai#237)

* fix: harden openai-compatible tool flow and local defaults
* fix: close approval replay gaps and harden openai-compatible flow
* fix: address review feedback and code improvements (takeover nearai#112)
* fix: harden src/llm/ module from crate audit findings
* refactor(llm): add RetryProvider, remove openai_compatible_chat, harden decorators
* fix: address PR review feedback — error handling, dimension validation, libSQL warning
* Apply suggestions from code review

Co-authored-by: panosAthDbx <packux@gmail.com>
Co-authored-by: panosAthDBX <127238517+panosAthDBX@users.noreply.github.com>
Co-authored-by: panosAthDBX <panosAthDBX@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: persist turns after approval and add agent-level tests Port relevant changes from PR nearai#112 that were not carried over to nearai#237: - Add persist_turn calls in process_approval for the response, error, and auth-required paths. Previously, turns completed after tool approval were never persisted to DB — if the process crashed after approval the entire turn (user message + assistant response) was lost. - Add agent-level unit tests: StaticLlmProvider mock, make_test_agent helper, tests for auto-approval logic, destructive shell command detection, and PendingApproval backward-compatible deserialization (without deferred_tool_calls field). - Remove unused _thread_state binding in process_approval. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address 14 audit findings in src/agent/ Audit of the agent module found 2 High, 7 Medium, 3 Low, and 2 Nit severity issues. This commit fixes all of them: High: - Remove 4 `.expect()` calls in session.rs (entry API, match, direct indexing, if-let) to eliminate panic paths in production - Add typed RoutineError enum replacing Result<_, String> across routine.rs, routine_engine.rs, and callers in history/store.rs and db/libsql/mod.rs Medium: - Sanitize routine names in path construction to prevent directory traversal (routine_engine.rs) - Log warnings for 5 silently-swallowed errors in scheduler.rs, compaction.rs, and worker.rs - Extract shared handle_auth_intercept helper to deduplicate auth interception in thread_ops.rs - Add session count warning threshold in session_manager.rs - Make FullJob stub degradation visible via warn-level log and prepended warning in output Low: - Restrict dead code visibility with #[cfg(test)] on 19 unused items in submission.rs, task.rs, and undo.rs - Narrow pub to pub(crate) on self_repair.rs builder methods - Remove TaskStatus from mod.rs re-exports (test-only type) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review comments - Reorder 
persist_turn before persist_response_chain so the conversation row exists before the metadata UPDATE runs - Add persist_response_chain call to handle_auth_intercept so auth-required paths preserve the response chain - Harden sanitize_routine_name to use allowlist (alphanumeric, dash, underscore) instead of denylist replacements - Fix stale active_thread ID in get_or_create_thread: fall back to create_thread() when the stored ID is missing from the map - Persist turn on approval rejection so user messages survive crashes after a tool is rejected Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
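The sanitize_routine_name hardening described in the commits above (allowlist of alphanumerics, dashes, and underscores instead of denylist replacements) can be sketched roughly as follows. The function signature and the path-joining helper are illustrative, not the crate's actual API:

```rust
use std::path::PathBuf;

// Allowlist sanitizer: keep only characters that can never introduce a
// path separator or a `..` segment, so a routine name cannot traverse
// out of the routines directory.
fn sanitize_routine_name(name: &str) -> String {
    name.chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
        .collect()
}

// Hypothetical helper showing where the sanitizer would be applied.
fn routine_path(base: &str, name: &str) -> PathBuf {
    PathBuf::from(base).join(sanitize_routine_name(name))
}

fn main() {
    // Traversal attempts collapse to a harmless file name.
    assert_eq!(sanitize_routine_name("../../etc/passwd"), "etcpasswd");
    // Ordinary names pass through unchanged.
    assert_eq!(sanitize_routine_name("daily-report_v2"), "daily-report_v2");
    println!("{}", routine_path("/routines", "../../x").display());
}
```

An allowlist is preferable to replacing known-bad characters because any character the denylist forgets is rejected by default.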
Summary
Continuation of #112 by @panosAthDBX.
Hardens the OpenAI-compatible chat provider, adds multi-tool approval replay with deferred tool calls, wires Ollama embeddings, and fixes several robustness issues across LLM providers.
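A rough sketch of the deferred-tool-call mechanic: when one call in a batch requires approval, the calls after it are stashed on the pending approval and replayed once the user responds. The `PendingApproval` and `deferred_tool_calls` names follow the PR's description, but the exact shapes and the `split_for_approval` helper are assumptions for illustration:

```rust
#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    name: String,
}

// Hypothetical shape: the real struct also carries arguments, ids, etc.
#[derive(Debug)]
struct PendingApproval {
    awaiting: ToolCall,
    // Older persisted records lack this field, so deserialization should
    // default it to empty (the backward-compatibility case the PR tests).
    deferred_tool_calls: Vec<ToolCall>,
}

// Split a batch at the first call needing approval: everything before it
// can run now; the flagged call plus the remainder wait for the user.
fn split_for_approval(
    calls: Vec<ToolCall>,
    needs_approval: impl Fn(&ToolCall) -> bool,
) -> (Vec<ToolCall>, Option<PendingApproval>) {
    let mut runnable = Vec::new();
    let mut iter = calls.into_iter();
    while let Some(call) = iter.next() {
        if needs_approval(&call) {
            return (
                runnable,
                Some(PendingApproval {
                    awaiting: call,
                    deferred_tool_calls: iter.collect(),
                }),
            );
        }
        runnable.push(call);
    }
    (runnable, None)
}

fn main() {
    let batch = vec![
        ToolCall { name: "read_file".into() },
        ToolCall { name: "shell".into() },      // needs approval
        ToolCall { name: "write_file".into() }, // deferred, replayed later
    ];
    let (now, pending) = split_for_approval(batch, |c| c.name == "shell");
    assert_eq!(now.len(), 1);
    let pending = pending.unwrap();
    assert_eq!(pending.awaiting.name, "shell");
    assert_eq!(pending.deferred_tool_calls.len(), 1);
}
```

After approval, replay resumes from `deferred_tool_calls` rather than re-asking the model, which is what makes persisting the turn around the approval boundary matter.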
Changes included (from original PR)
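Among the changes listed below, the tool-message sanitization can be sketched as a single pass over the history: a `tool_result` whose call id has no preceding tool call is rewritten as a plain user message, so strict providers do not reject the request with HTTP 400. The message enum here is illustrative, not the crate's actual types:

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
enum Message {
    User(String),
    ToolCall { call_id: String },
    ToolResult { call_id: String, output: String },
}

// Rewrite orphaned tool results (no matching earlier tool call) as
// user messages; everything else passes through unchanged.
fn sanitize_tool_messages(history: Vec<Message>) -> Vec<Message> {
    let mut seen_calls = HashSet::new();
    history
        .into_iter()
        .map(|msg| match msg {
            Message::ToolCall { call_id } => {
                seen_calls.insert(call_id.clone());
                Message::ToolCall { call_id }
            }
            Message::ToolResult { call_id, output } if !seen_calls.contains(&call_id) => {
                // Orphaned result: downgrade to a user message.
                Message::User(format!("[tool result] {output}"))
            }
            other => other,
        })
        .collect()
}

fn main() {
    // Orphaned result is rewritten.
    let clean = sanitize_tool_messages(vec![Message::ToolResult {
        call_id: "t1".into(),
        output: "ok".into(),
    }]);
    assert_eq!(clean, vec![Message::User("[tool result] ok".into())]);

    // A properly paired result is kept as-is.
    let kept = sanitize_tool_messages(vec![
        Message::ToolCall { call_id: "t1".into() },
        Message::ToolResult { call_id: "t1".into(), output: "ok".into() },
    ]);
    assert_eq!(kept.len(), 2);
}
```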
- `sanitize_tool_messages`: rewrites orphaned `tool_result` messages as user messages to prevent HTTP 400 from Anthropic/others. Applied consistently across all providers
- `OllamaEmbeddings` provider with configurable model/dimension
- `/approve`, `/always`, `/deny`, `a`, `n`, etc. in the submission parser
- `OPENAI_BASE_URL`, `ANTHROPIC_BASE_URL` for proxy support; `OLLAMA_BASE_URL` for embeddings
- `ghcr.io/nearai/sandbox:latest` changed to `ironclaw-worker:latest`
- `thread_id` on `ApprovalNeeded` SSE events for thread-scoped filtering
- `/quit` handled as a message so the agent loop exits when other channels are active
- `with_llm_provider` for OpenAI-compatible API proxy

Composable RetryProvider & LLM module hardening
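A minimal sketch of the retry-delay policy this section describes: exponential backoff with a cap plus jitter, where an explicit `retry_after` hint from a `RateLimited` error takes precedence over the schedule. The constants and the deterministic jitter parameter are illustrative, not `RetryProvider`'s actual internals (the real wrapper would draw the jitter fraction at random):

```rust
use std::time::Duration;

// Compute the delay before retry number `attempt` (0-based).
// `retry_after` is the provider's hint, if any; `jitter_frac` stands in
// for a random value in [0, 0.25) in the real implementation.
fn retry_delay(attempt: u32, retry_after: Option<Duration>, jitter_frac: f64) -> Duration {
    if let Some(hint) = retry_after {
        return hint; // honor the server's explicit hint
    }
    let base_ms = 500u64.saturating_mul(1u64 << attempt.min(6)); // 500ms, 1s, 2s, ...
    let capped = base_ms.min(30_000); // cap at 30s
    let jitter = (capped as f64 * jitter_frac) as u64;
    Duration::from_millis(capped + jitter)
}

fn main() {
    // A RateLimited hint wins over the backoff schedule.
    assert_eq!(
        retry_delay(3, Some(Duration::from_secs(7)), 0.0),
        Duration::from_secs(7)
    );
    // Without a hint, the delay doubles per attempt (jitter disabled here).
    assert_eq!(retry_delay(0, None, 0.0), Duration::from_millis(500));
    assert_eq!(retry_delay(2, None, 0.0), Duration::from_millis(2000));
}
```

Keeping this policy in one composable wrapper is what removes the double-retry problem: inner providers no longer loop on their own, so the total attempt count stays bounded by the wrapper alone.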
- `RetryProvider` (`retry.rs`): composable wrapper for any `LlmProvider` with exponential backoff + jitter. Respects `RateLimited { retry_after }` hints. Wired into the main.rs composition chain so each provider retries independently before failover.
- `openai_compatible_chat.rs`: replaced by rig adapter + RetryProvider. The rig-based adapter now handles all OpenAI-compatible endpoints (including custom base URLs), with tool name normalization moved to `rig_adapter.rs`.
- `nearai.rs` and `nearai_chat.rs` no longer have internal retry loops — retries are handled by the external RetryProvider wrapper, eliminating double-retry (was up to 16 attempts, now correctly 4).
- `is_retryable()` (retry/failover) and `is_transient()` (circuit breaker) now have clear, documented semantics. `ModelNotAvailable` is no longer retryable; `Json` no longer trips the circuit breaker.
- `circuit_breaker.rs` uses `checked_sub` to avoid a panic on a TOCTOU race.
- `failover.rs` now uses the shared `is_retryable()` from `retry.rs` instead of duplicating classification logic.

Changes from original
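The per-model embedding-dimension defaults might look like the sketch below. The `EMBEDDING_DIMENSION` override and the 768/1536 defaults come from this PR; the specific model-name substrings and the 3072 entry are assumptions for illustration, not the crate's exact table:

```rust
use std::env;

// Resolve the embedding dimension: an EMBEDDING_DIMENSION env var wins
// if set and parseable; otherwise infer a default from the model name.
fn embedding_dimension(model: &str) -> usize {
    if let Ok(v) = env::var("EMBEDDING_DIMENSION") {
        if let Ok(dim) = v.parse::<usize>() {
            return dim;
        }
    }
    match model {
        m if m.contains("nomic-embed-text") => 768,        // common Ollama default
        m if m.contains("text-embedding-3-large") => 3072, // OpenAI large model
        _ => 1536, // e.g. text-embedding-3-small / ada-002
    }
}

fn main() {
    // With EMBEDDING_DIMENSION unset, per-model defaults apply.
    assert_eq!(embedding_dimension("nomic-embed-text"), 768);
    assert_eq!(embedding_dimension("text-embedding-3-large"), 3072);
    assert_eq!(embedding_dimension("text-embedding-3-small"), 1536);
}
```

Centralizing this lookup is what allowed the duplicated inference logic to be dropped from main.rs, and it pairs with the runtime warning for the libSQL `F32_BLOB(1536)` schema when a non-1536 dimension is configured.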
- `ChatCompletionResponse.id` changed to `Option<String>` (reviewer feedback: some providers omit this field)
- `unwrap_or_else(|_| Client::new())` removed to properly propagate errors (reviewer feedback: silently dropped timeout config)
- `EMBEDDING_DIMENSION` env var with smart per-model defaults instead of hardcoding 768/1536
- `/review-crate @src/llm/` audit — fixed all Medium findings

Original PR
#112 - fix: harden openai-compatible provider, approval replay, and embeddings defaults
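The circuit-breaker hardening mentioned above (an unchecked `Duration` subtraction that could panic on a TOCTOU race) amounts to saturating the "time until half-open" computation to zero. A sketch with illustrative names, using `checked_duration_since` in the same spirit as the PR's `checked_sub` fix:

```rust
use std::time::{Duration, Instant};

// How long until the breaker may transition to half-open. If the
// deadline already passed between the caller's check and this
// computation (the TOCTOU race), return zero instead of panicking.
fn remaining_cooldown(opened_at: Instant, cooldown: Duration, now: Instant) -> Duration {
    let deadline = opened_at + cooldown;
    deadline.checked_duration_since(now).unwrap_or(Duration::ZERO)
}

fn main() {
    let t0 = Instant::now();
    // Deadline still in the future: some cooldown remains.
    assert!(remaining_cooldown(t0, Duration::from_secs(60), t0) > Duration::ZERO);
    // Deadline already passed: saturates to zero instead of panicking.
    assert_eq!(
        remaining_cooldown(t0, Duration::from_secs(1), t0 + Duration::from_secs(5)),
        Duration::ZERO
    );
}
```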
Review comments addressed
- `ChatCompletionResponse.id` changed from `String` to `Option<String>` with `#[serde(default)]`
- Embedding dimension is read from the `EMBEDDING_DIMENSION` env var or inferred from the model name
- `unwrap_or_else(|_| Client::new())` replaced with proper error propagation
- libSQL `F32_BLOB(1536)` schema noted as a known limitation

Test plan
Co-Authored-By: panosAthDBX <panosAthDBX@users.noreply.github.com>
Generated with Claude Code