refactor: simplify config resolution and consolidate main.rs init #287

ilblackdragon merged 7 commits into main from
Conversation
refactor: simplify config resolution and consolidate main.rs init into AppBuilder

- Add parse_bool_env() and parse_string_env() helpers to eliminate repetitive 5-line optional_env/parse/map_err/unwrap_or boilerplate across 12 config files
- Add EmbeddingsConfig::create_provider() to centralize embeddings construction (fixes hardcoded 1536 dimensions and missing Ollama provider in app.rs)
- Extract init_cli_tracing(), setup_wasm_channels(), start_tunnel(), run_memory_command(), run_worker(), run_claude_bridge() from main.rs
- Replace ~600 lines of inline init in main.rs with AppBuilder::build_all()
- Expose catalog_entries from AppComponents for gateway registry entries
- Net reduction: ~738 lines across 15 files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of Changes

This pull request significantly refactors the application's configuration resolution and main initialization logic. By introducing dedicated helper functions for environment variable parsing and centralizing complex component creation, the codebase becomes more maintainable and less prone to duplication.
Code Review
The pull request successfully refactors the configuration resolution and consolidates the application initialization logic into a reusable AppBuilder. This significantly reduces boilerplate and improves maintainability. However, I identified a regression where the names of development WASM tools loaded from build artifacts are no longer tracked or returned by the builder, which breaks hook registration for those tools. Additionally, some configuration fields in AgentConfig were missed during the transition to the new environment parsing helpers.
```rust
    Ok((
        mcp_session_manager,
        wasm_tool_runtime,
        extension_manager,
        catalog_entries,
    ))
}
```
The return tuple should include the list of development tool names collected during the WASM loading phase to ensure they can be passed to the hook bootstrapping process.
References
- When implementing mutually exclusive logic or refactoring shared state, ensure that all relevant state flags and identifiers are updated consistently to prevent incorrect behavior.
Fixed in bd5523e. init_extensions() now captures and returns dev_loaded_tool_names from the WASM loading future. The names are exposed on AppComponents and passed through to bootstrap_hooks() in main.rs.
```rust
max_parallel_jobs: crate::config::helpers::optional_env("AGENT_MAX_PARALLEL_JOBS")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "AGENT_MAX_PARALLEL_JOBS".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?
    .unwrap_or(settings.agent.max_parallel_jobs as usize),
job_timeout: Duration::from_secs(
    optional_env("AGENT_JOB_TIMEOUT_SECS")?
        .map(|s| s.parse())
        .transpose()
        .map_err(|e| ConfigError::InvalidValue {
            key: "AGENT_JOB_TIMEOUT_SECS".to_string(),
            message: format!("must be a positive integer: {e}"),
        })?
        .unwrap_or(settings.agent.job_timeout_secs),
),
stuck_threshold: Duration::from_secs(
    optional_env("AGENT_STUCK_THRESHOLD_SECS")?
        .map(|s| s.parse())
        .transpose()
        .map_err(|e| ConfigError::InvalidValue {
            key: "AGENT_STUCK_THRESHOLD_SECS".to_string(),
            message: format!("must be a positive integer: {e}"),
        })?
        .unwrap_or(settings.agent.stuck_threshold_secs),
),
repair_check_interval: Duration::from_secs(
    optional_env("SELF_REPAIR_CHECK_INTERVAL_SECS")?
        .map(|s| s.parse())
        .transpose()
        .map_err(|e| ConfigError::InvalidValue {
            key: "SELF_REPAIR_CHECK_INTERVAL_SECS".to_string(),
            message: format!("must be a positive integer: {e}"),
        })?
        .unwrap_or(settings.agent.repair_check_interval_secs),
),
max_repair_attempts: optional_env("SELF_REPAIR_MAX_ATTEMPTS")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "SELF_REPAIR_MAX_ATTEMPTS".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?
    .unwrap_or(settings.agent.max_repair_attempts),
use_planning: optional_env("AGENT_USE_PLANNING")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "AGENT_USE_PLANNING".to_string(),
        message: format!("must be 'true' or 'false': {e}"),
    })?
    .unwrap_or(settings.agent.use_planning),
session_idle_timeout: Duration::from_secs(
    optional_env("SESSION_IDLE_TIMEOUT_SECS")?
        .map(|s| s.parse())
        .transpose()
        .map_err(|e| ConfigError::InvalidValue {
            key: "SESSION_IDLE_TIMEOUT_SECS".to_string(),
            message: format!("must be a positive integer: {e}"),
        })?
        .unwrap_or(settings.agent.session_idle_timeout_secs),
),
allow_local_tools: optional_env("ALLOW_LOCAL_TOOLS")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e| ConfigError::InvalidValue {
        key: "ALLOW_LOCAL_TOOLS".to_string(),
        message: format!("must be 'true' or 'false': {e}"),
    })?
    .unwrap_or(false),
max_cost_per_day_cents: optional_env("MAX_COST_PER_DAY_CENTS")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e: std::num::ParseIntError| ConfigError::InvalidValue {
        key: "MAX_COST_PER_DAY_CENTS".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?,
max_actions_per_hour: crate::config::helpers::optional_env("MAX_ACTIONS_PER_HOUR")?
    .map(|s| s.parse())
    .transpose()
    .map_err(|e: std::num::ParseIntError| ConfigError::InvalidValue {
        key: "MAX_ACTIONS_PER_HOUR".to_string(),
        message: format!("must be a positive integer: {e}"),
    })?,
```
These fields are still using the verbose optional_env/map/transpose/map_err boilerplate that this PR aims to eliminate. While parse_optional_env requires a default value, a new helper like parse_option_env could be added to helpers.rs to handle Option<T> types consistently across the configuration files.
Fixed in bd5523e. Added parse_option_env<T>() helper to config/helpers.rs for Option<T> fields. The two remaining verbose blocks (max_cost_per_day_cents, max_actions_per_hour) now use it, reducing them from 5 lines each to a single call.
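For illustration, such a helper might look like the following sketch. The signature and `ConfigError` shape here are hypothetical stand-ins for the crate's actual types, and the env lookup is abstracted into an `Option<String>` argument so the parsing logic is testable without touching the process environment:

```rust
// Simplified stand-in for the crate's ConfigError type.
#[derive(Debug)]
enum ConfigError {
    InvalidValue { key: String, message: String },
}

// Hedged sketch of a parse_option_env<T>()-style helper: a missing variable
// yields Ok(None); a present-but-unparsable value yields an error naming the key.
fn parse_option<T>(key: &str, raw: Option<String>) -> Result<Option<T>, ConfigError>
where
    T: std::str::FromStr,
    T::Err: std::fmt::Display,
{
    raw.map(|s| s.parse::<T>())
        .transpose()
        .map_err(|e| ConfigError::InvalidValue {
            key: key.to_string(),
            message: format!("must be a valid value: {e}"),
        })
}

fn main() {
    // Present and valid: Some(parsed value).
    let v: Option<u64> = parse_option("MAX_COST_PER_DAY_CENTS", Some("250".into())).unwrap();
    assert_eq!(v, Some(250));
    // Absent: None, no error.
    let none: Option<u64> = parse_option("MAX_COST_PER_DAY_CENTS", None).unwrap();
    assert_eq!(none, None);
    // Present but invalid: error carrying the key name.
    let err: Result<Option<u64>, _> = parse_option("MAX_COST_PER_DAY_CENTS", Some("abc".into()));
    assert!(err.is_err());
}
```

This collapses the repeated map/transpose/map_err chain into one call site per `Option<T>` field.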
Pull request overview
This PR refactors startup/config plumbing to reduce duplication and centralize component construction, primarily by introducing small config parsing helpers, moving embeddings provider construction behind EmbeddingsConfig, and switching main.rs over to AppBuilder::build_all() for core initialization.
Changes:
- Add config helper functions (`parse_bool_env`, `parse_string_env`) and update multiple config `resolve()` implementations to use them.
- Add `EmbeddingsConfig::create_provider()` and use it from both `main.rs` and `AppBuilder` to standardize embeddings provider creation.
- Consolidate `main.rs` initialization by delegating core component setup to `AppBuilder::build_all()` and extracting helper functions for CLI/worker/tunnel/WASM-channel setup.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/main.rs | Replaces large inline init with AppBuilder::build_all() and adds helper functions (CLI tracing, worker tracing, tunnel, WASM channels, memory/worker/bridge runners). |
| src/app.rs | Extends AppComponents (exports catalog_entries) and updates tool/embeddings init to use centralized embeddings provider creation + credential-aware tool registry. |
| src/config/helpers.rs | Adds parse_bool_env / parse_string_env helpers for consistent env parsing. |
| src/config/embeddings.rs | Adds EmbeddingsConfig::create_provider() and refactors env parsing to use new helpers. |
| src/config/agent.rs | Simplifies env parsing via new helper functions. |
| src/config/builder.rs | Simplifies env parsing via new helper functions. |
| src/config/channels.rs | Simplifies env parsing (ports/bools) via new helper functions. |
| src/config/heartbeat.rs | Simplifies env parsing via new helper functions. |
| src/config/hygiene.rs | Simplifies env parsing via new helper functions. |
| src/config/routines.rs | Simplifies env parsing via new helper functions. |
| src/config/safety.rs | Simplifies env parsing via new helper functions. |
| src/config/sandbox.rs | Simplifies env parsing via new helper functions (bools/strings). |
| src/config/skills.rs | Simplifies env parsing via new helper functions. |
| src/config/wasm.rs | Simplifies env parsing via new helper functions. |
| src/llm/reasoning.rs | Updates the agent identity string used in the reasoning prompt. |
Comments suppressed due to low confidence (1)
src/config/helpers.rs:63
- The `parse_bool_env` error message says the value must be 'true' or 'false', but the function also explicitly accepts '1' and '0'. Updating the message to reflect the accepted set (e.g., "'true'/'false' or '1'/'0'") will make configuration errors less confusing.
```rust
pub(crate) fn parse_bool_env(key: &str, default: bool) -> Result<bool, ConfigError> {
    match optional_env(key)? {
        Some(s) => match s.to_lowercase().as_str() {
            "true" | "1" => Ok(true),
            "false" | "0" => Ok(false),
            _ => Err(ConfigError::InvalidValue {
                key: key.to_string(),
                message: format!("must be 'true' or 'false', got '{s}'"),
            }),
        },
        None => Ok(default),
    }
}
```
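A minimal standalone sketch of the message fix the suppressed comment suggests, with the env lookup abstracted into an `Option<&str>` argument and a simplified stand-in for the crate's `ConfigError` (names here are illustrative, not the project's actual fix):

```rust
// Simplified stand-in for the crate's ConfigError type.
#[derive(Debug)]
enum ConfigError {
    InvalidValue { key: String, message: String },
}

// Same accept set as parse_bool_env ('true'/'false'/'1'/'0', case-insensitive),
// but with the error message widened to name everything that is accepted.
fn parse_bool(key: &str, raw: Option<&str>, default: bool) -> Result<bool, ConfigError> {
    match raw {
        Some(s) => match s.to_lowercase().as_str() {
            "true" | "1" => Ok(true),
            "false" | "0" => Ok(false),
            _ => Err(ConfigError::InvalidValue {
                key: key.to_string(),
                message: format!("must be 'true'/'false' or '1'/'0', got '{s}'"),
            }),
        },
        None => Ok(default),
    }
}

fn main() {
    assert_eq!(parse_bool("X", Some("TRUE"), false).unwrap(), true);
    assert_eq!(parse_bool("X", Some("0"), true).unwrap(), false);
    // Missing variable falls back to the supplied default.
    assert_eq!(parse_bool("X", None, true).unwrap(), true);
    // Values outside the accepted set produce a keyed error.
    assert!(parse_bool("X", Some("yes"), false).is_err());
}
```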
```rust
/// Create the appropriate embedding provider based on configuration.
///
/// Returns `None` if embeddings are disabled or the required credentials
/// are missing. The `nearai_base_url` and `session` are needed only for
/// the NEAR AI provider but must be passed unconditionally.
pub fn create_provider(
    &self,
    nearai_base_url: &str,
    session: Arc<SessionManager>,
) -> Option<Arc<dyn EmbeddingProvider>> {
    if !self.enabled {
        tracing::info!("Embeddings disabled (set EMBEDDING_ENABLED=true to enable)");
        return None;
    }

    match self.provider.as_str() {
        "nearai" => {
            tracing::info!(
                "Embeddings enabled via NEAR AI (model: {}, dim: {})",
                self.model,
                self.dimension,
            );
            Some(Arc::new(
                crate::workspace::NearAiEmbeddings::new(nearai_base_url, session)
                    .with_model(&self.model, self.dimension),
            ))
        }
        "ollama" => {
            tracing::info!(
                "Embeddings enabled via Ollama (model: {}, url: {}, dim: {})",
                self.model,
                self.ollama_base_url,
                self.dimension,
            );
            Some(Arc::new(
                crate::workspace::OllamaEmbeddings::new(&self.ollama_base_url)
                    .with_model(&self.model, self.dimension),
            ))
        }
        _ => {
            if let Some(api_key) = self.openai_api_key() {
                tracing::info!(
                    "Embeddings enabled via OpenAI (model: {}, dim: {})",
                    self.model,
                    self.dimension,
                );
                Some(Arc::new(crate::workspace::OpenAiEmbeddings::with_model(
                    api_key,
                    &self.model,
                    self.dimension,
                )))
            } else {
                tracing::warn!("Embeddings configured but OPENAI_API_KEY not set");
                None
            }
        }
    }
}
```
EmbeddingsConfig::create_provider() centralizes provider selection and dimensions, but there are currently no tests covering its branching behavior (nearai/ollama/openai fallback, disabled case, missing OPENAI_API_KEY). Since this file already has config-resolution tests, adding focused unit tests for create_provider() would help prevent regressions like the previously hardcoded 1536-dimension bug.
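One way the suggested tests could avoid real HTTP clients is to exercise the branch selection as a pure function over the same inputs. The names below are hypothetical, not the crate's API; this only sketches what such a test could assert:

```rust
// Hypothetical enum naming the outcome of create_provider()'s branching.
#[derive(Debug, PartialEq)]
enum ProviderChoice {
    Disabled,
    NearAi,
    Ollama,
    OpenAi,
    MissingOpenAiKey,
}

// Pure decision function mirroring the match in create_provider(): disabled
// short-circuits, "nearai"/"ollama" select directly, anything else falls
// back to OpenAI if and only if an API key is configured.
fn choose_provider(enabled: bool, provider: &str, openai_api_key: Option<&str>) -> ProviderChoice {
    if !enabled {
        return ProviderChoice::Disabled;
    }
    match provider {
        "nearai" => ProviderChoice::NearAi,
        "ollama" => ProviderChoice::Ollama,
        _ => match openai_api_key {
            Some(_) => ProviderChoice::OpenAi,
            None => ProviderChoice::MissingOpenAiKey,
        },
    }
}

fn main() {
    assert_eq!(choose_provider(false, "nearai", None), ProviderChoice::Disabled);
    assert_eq!(choose_provider(true, "nearai", None), ProviderChoice::NearAi);
    assert_eq!(choose_provider(true, "ollama", None), ProviderChoice::Ollama);
    assert_eq!(choose_provider(true, "openai", Some("sk-test")), ProviderChoice::OpenAi);
    assert_eq!(choose_provider(true, "openai", None), ProviderChoice::MissingOpenAiKey);
}
```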
Acknowledged. The provider constructors (NearAiEmbeddings, OllamaEmbeddings, OpenAiEmbeddings) require real HTTP endpoints or API keys, making them integration-test territory rather than unit tests. The branching logic itself is straightforward pattern matching on a config string. Adding mock-based tests here would add complexity without proportional value — the real protection is that any misuse will fail at startup with a clear log message.
```diff
     &active_tool_names,
     &loaded_wasm_channel_names,
-    &dev_loaded_tool_names,
+    &[], // dev_loaded_tool_names tracked inside AppBuilder now
```
bootstrap_hooks relies on dev_loaded_tool_names to decide whether to load capabilities from installed WASM tools vs dev tool artifacts (see hooks/bootstrap.rs logic). Passing an empty slice here means dev tool capability files will never be registered as plugin hooks, which can break hook loading when dev tools are active. Consider returning the dev_loaded_tool_names from AppBuilder::build_all() (e.g., on AppComponents) and passing it through here instead of &[].
Suggested change:

```diff
-    &[], // dev_loaded_tool_names tracked inside AppBuilder now
+    &components.dev_loaded_tool_names, // dev_loaded_tool_names tracked inside AppBuilder now
```
Fixed in bd5523e. dev_loaded_tool_names is now captured from the WASM loading future in init_extensions(), returned via the tuple, stored on AppComponents, and passed to bootstrap_hooks() in main.rs. Dev tool plugin hooks will now register correctly.
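The shape of that fix can be illustrated with a minimal sketch; every name below is a hypothetical stand-in for the crate's actual types:

```rust
// Stand-in for the components struct the builder returns; the fix adds this
// field so the names collected during WASM loading survive to main.rs.
struct AppComponents {
    dev_loaded_tool_names: Vec<String>,
}

// Stand-in for bootstrap_hooks: registers one plugin hook per dev tool name
// and returns how many were registered.
fn bootstrap_hooks(dev_tools: &[String]) -> usize {
    dev_tools.len()
}

fn main() {
    let components = AppComponents {
        dev_loaded_tool_names: vec!["my_dev_tool".to_string()],
    };
    // Before the fix: an empty slice meant dev tool hooks were silently skipped.
    assert_eq!(bootstrap_hooks(&[]), 0);
    // After the fix: the names flow from the builder through AppComponents.
    assert_eq!(bootstrap_hooks(&components.dev_loaded_tool_names), 1);
}
```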
fix: propagate dev_loaded_tool_names from AppBuilder and add parse_option_env helper

Address PR review feedback:

- Capture dev_loaded_tool_names from WASM loading in init_extensions() and expose via AppComponents so bootstrap_hooks receives the actual dev tool names instead of an empty slice (fixes silent hook skip)
- Add parse_option_env<T>() helper for Option<T> config fields, simplifying max_cost_per_day_cents and max_actions_per_hour in agent.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: fetch real NEAR AI pricing and unify cost calculation path

CostGuard was independently looking up pricing via costs::model_cost(), falling back to GPT-4o default rates when NEAR AI model names didn't match the static table — causing ~3x cost overestimates in logs.

- Add pricing map to NearAiChatProvider that fetches real rates from /v1/model/list at startup (background, non-blocking)
- Update cost_per_token() to check fetched pricing first, then static table, then default
- Add cost_per_token parameter to CostGuard::record_llm_call() so the dispatcher passes provider-sourced rates directly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
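The lookup order this commit describes (fetched rates first, then the static table, then a default) can be sketched as follows; the struct and field names are illustrative stand-ins, not the crate's actual API:

```rust
use std::collections::HashMap;

// Hypothetical per-token pricing store mirroring the commit's three tiers.
struct Pricing {
    fetched: HashMap<String, f64>,      // rates fetched from /v1/model/list at startup
    static_table: HashMap<String, f64>, // compiled-in fallback table
    default_rate: f64,                  // last-resort default
}

impl Pricing {
    // Check fetched pricing first, then the static table, then the default.
    fn cost_per_token(&self, model: &str) -> f64 {
        self.fetched
            .get(model)
            .or_else(|| self.static_table.get(model))
            .copied()
            .unwrap_or(self.default_rate)
    }
}

fn main() {
    let mut fetched = HashMap::new();
    fetched.insert("zai-org/GLM-latest".to_string(), 0.000_001);
    let mut static_table = HashMap::new();
    static_table.insert("gpt-4o".to_string(), 0.000_005);
    let pricing = Pricing { fetched, static_table, default_rate: 0.000_01 };

    // Fetched rate wins when present.
    assert_eq!(pricing.cost_per_token("zai-org/GLM-latest"), 0.000_001);
    // The static table covers models the fetch did not return.
    assert_eq!(pricing.cost_per_token("gpt-4o"), 0.000_005);
    // Unknown models fall through to the default instead of a wrong table hit.
    assert_eq!(pricing.cost_per_token("unknown-model"), 0.000_01);
}
```

Passing the result of this lookup into the cost guard (rather than having it re-resolve the model name) is what removes the mismatched-name overestimate.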
chore: update default NEAR AI model to GLM-latest

Replace fireworks llama4-maverick-instruct-basic with zai-org/GLM-latest as the default model in config and setup wizard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…fig-and-main

# Conflicts:
#	src/main.rs
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 1 comment.
```diff
-        .into(),
-        "Llama 4 Maverick (default, fast)".into(),
-    ),
+    ("zai/GLM-latest".into(), "GLM Latest (default, fast)".into()),
```
The default model name is inconsistent between files. In src/config/llm.rs it's set to "zai-org/GLM-latest" while in src/setup/wizard.rs it's set to "zai/GLM-latest" (missing the "-org" suffix). This inconsistency will cause the wizard's default model to not match the actual config default, potentially confusing users or causing mismatched behavior.
Suggested change:

```diff
-    ("zai/GLM-latest".into(), "GLM Latest (default, fast)".into()),
+    ("zai-org/GLM-latest".into(), "GLM Latest (default, fast)".into()),
```
Fixed in ff3e043. Changed "zai/GLM-latest" to "zai-org/GLM-latest" in wizard.rs to match config/llm.rs.
Change "zai/GLM-latest" to "zai-org/GLM-latest" in wizard.rs to match the default in config/llm.rs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arai#287)

* refactor: simplify config resolution and consolidate main.rs init into AppBuilder

  - Add parse_bool_env() and parse_string_env() helpers to eliminate repetitive 5-line optional_env/parse/map_err/unwrap_or boilerplate across 12 config files
  - Add EmbeddingsConfig::create_provider() to centralize embeddings construction (fixes hardcoded 1536 dimensions and missing Ollama provider in app.rs)
  - Extract init_cli_tracing(), setup_wasm_channels(), start_tunnel(), run_memory_command(), run_worker(), run_claude_bridge() from main.rs
  - Replace ~600 lines of inline init in main.rs with AppBuilder::build_all()
  - Expose catalog_entries from AppComponents for gateway registry entries
  - Net reduction: ~738 lines across 15 files

* fix: propagate dev_loaded_tool_names from AppBuilder and add parse_option_env helper

  - Capture dev_loaded_tool_names from WASM loading in init_extensions() and expose via AppComponents so bootstrap_hooks receives the actual dev tool names instead of an empty slice (fixes silent hook skip)
  - Add parse_option_env<T>() helper for Option<T> config fields, simplifying max_cost_per_day_cents and max_actions_per_hour in agent.rs

* fix: fetch real NEAR AI pricing and unify cost calculation path

  CostGuard was independently looking up pricing via costs::model_cost(), falling back to GPT-4o default rates when NEAR AI model names didn't match the static table — causing ~3x cost overestimates in logs.

  - Add pricing map to NearAiChatProvider that fetches real rates from /v1/model/list at startup (background, non-blocking)
  - Update cost_per_token() to check fetched pricing first, then static table, then default
  - Add cost_per_token parameter to CostGuard::record_llm_call() so the dispatcher passes provider-sourced rates directly

* chore: update default NEAR AI model to GLM-latest

  Replace fireworks llama4-maverick-instruct-basic with zai-org/GLM-latest as the default model in config and setup wizard.

* fix: align wizard default model name with config

  Change "zai/GLM-latest" to "zai-org/GLM-latest" in wizard.rs to match the default in config/llm.rs.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Add `parse_bool_env()` and `parse_string_env()` helpers to eliminate ~40 repetitive 5-line `optional_env/parse/map_err/unwrap_or` patterns across 12 config files
- Centralize embeddings provider construction in `EmbeddingsConfig::create_provider()`, fixing bugs where `app.rs` had hardcoded 1536 dimensions and was missing Ollama provider support
- Consolidate `main.rs` initialization into `AppBuilder::build_all()` (net reduction: ~730 lines)
- Fetch real NEAR AI pricing from `/v1/model/list` at startup instead of hardcoding rates; unify cost calculation so `CostGuard` uses provider-sourced rates instead of independently falling back to GPT-4o defaults (~3x overestimate)
- Update the default NEAR AI model to `zai-org/GLM-latest`

Key changes
| File | Change |
|---|---|
| src/config/helpers.rs | New `parse_bool_env`, `parse_string_env`, `parse_option_env` helpers |
| src/config/*.rs (11 files) | `resolve()` methods rewritten using the new helpers |
| src/config/embeddings.rs | New `create_provider()` method |
| src/main.rs | Init consolidated into `AppBuilder::build_all()` |
| src/llm/nearai_chat.rs | Pricing fetch and `cost_per_token()` |
| src/agent/cost_guard.rs | New `cost_per_token` param |
| src/agent/dispatcher.rs | Passes `llm.cost_per_token()` to the cost guard |
| src/config/llm.rs, src/setup/wizard.rs | Default model updated to `zai-org/GLM-latest` |

Test plan
- `cargo fmt --check` passes
- `cargo clippy -- -D warnings` passes
- `cargo test` — all 1384 tests pass
- … (`None` fallback path)

🤖 Generated with Claude Code