tracking: provider modules refactor — Cycle 2 of transport/provider infrastructure #14418

@kshitijk4poor

Overview

Cycle 2 of the provider infrastructure refactor. Cycle 1 (#13473, PRs 1-7 merged) extracted format conversion into agent/transports/. This cycle consolidates per-provider quirks from 5+ files into single-file provider modules.

Problem: Adding or modifying a provider touches auth.py + runtime_provider.py + models.py + auxiliary_client.py + run_agent.py + transport. Kimi and Copilot each touch 7 files despite having ~130-200 lines of quirk code. ChatCompletionsTransport takes 20+ boolean flag params because every provider's quirks are passed individually.

Solution: providers/<name>.py modules — each declares auth, endpoints, headers, temperature, max_tokens, message preprocessing, and extra_body in one place. Transport receives a provider object instead of flags.
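As a rough illustration of the idea above, a provider module could declare its quirks as data plus a few overridable hooks, and the transport could build request kwargs from that one object. This is a minimal sketch, not the project's actual interface: the class name ProviderProfile, its fields, the build_kwargs helper, and the extra_headers/extra_body kwarg names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProviderProfile:
    """Hypothetical shape of a providers/<name>.py declaration (illustrative only)."""

    name: str
    base_url: str
    env_vars: tuple = ()
    headers: dict = field(default_factory=dict)
    default_max_tokens: Optional[int] = None
    extra_body: dict = field(default_factory=dict)

    def temperature(self, requested: Optional[float]) -> Optional[float]:
        """Return the temperature to send, or None to omit the field."""
        return requested

    def preprocess_messages(self, messages: list) -> list:
        """Hook for provider-specific message rewriting; identity by default."""
        return messages


def build_kwargs(profile, model, messages, temperature=None, max_tokens=None):
    """Assemble request kwargs from one profile object instead of many flags."""
    kwargs = {"model": model, "messages": profile.preprocess_messages(messages)}
    temp = profile.temperature(temperature)
    if temp is not None:
        kwargs["temperature"] = temp
    limit = max_tokens if max_tokens is not None else profile.default_max_tokens
    if limit is not None:
        kwargs["max_tokens"] = limit
    if profile.headers:
        kwargs["extra_headers"] = dict(profile.headers)
    if profile.extra_body:
        kwargs["extra_body"] = dict(profile.extra_body)
    return kwargs
```

With this shape, a new quirk becomes a field or an overridden method on one profile rather than another boolean parameter on the transport.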

Workstreams

WS1: Transport Cleanup (from Cycle 1 gaps)

Prerequisite work before provider modules can absorb the adapters.

WS2: Provider Module ABC + Registry

Design the provider interface and migrate providers.
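One possible shape for the ABC and registry is sketched below. The names list_providers() and get_provider_profile() appear elsewhere in this issue, but their real signatures are not shown here, so the signatures, the register_provider decorator, the build_headers method, and the "chat_completions" api_mode value are assumptions for illustration.

```python
from abc import ABC, abstractmethod

# Hypothetical module-level registry: provider name -> profile instance.
_REGISTRY = {}


class ProviderProfile(ABC):
    """Assumed base class; each provider module would subclass it once."""

    name = ""
    api_mode = "chat_completions"  # assumed default value

    @abstractmethod
    def build_headers(self, api_key):
        """Return the HTTP headers this provider needs."""


def register_provider(cls):
    """Class decorator (hypothetical) that instantiates and registers a profile."""
    instance = cls()
    if not instance.name:
        raise ValueError("provider profile must set a name")
    if instance.name in _REGISTRY:
        raise ValueError(f"duplicate provider: {instance.name}")
    _REGISTRY[instance.name] = instance
    return cls


def get_provider_profile(name):
    """Look up a registered profile, or None if the name is unknown."""
    return _REGISTRY.get(name)


def list_providers():
    """Return registered provider names in sorted order."""
    return sorted(_REGISTRY)


@register_provider
class ExampleProvider(ProviderProfile):
    name = "example"

    def build_headers(self, api_key):
        return {"Authorization": f"Bearer {api_key}"}
```

Returning None for unknown names lets callers fall back to a legacy code path while migrations are still in progress.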

WS3: Provider Migrations (ordered by scatter/complexity)

Migrate actual providers, most-scattered first.

Additional work completed (PR #14424)

Beyond the original WS2/WS3 scope, PR #14424 also delivered full auto-wiring
infrastructure: adding providers/<name>.py now hooks into every integration
point below without edits to any other file.

| Integration point | File | Mechanism |
| --- | --- | --- |
| PROVIDER_REGISTRY | hermes_cli/auth.py | Loop over list_providers() at import; skips names already declared |
| CANONICAL_PROVIDERS | hermes_cli/models.py | Appends ProviderEntry for api_key providers at import |
| --provider CLI choices | hermes_cli/main.py | _build_provider_choices() derives from CANONICAL_PROVIDERS |
| api_key provider catch-all | hermes_cli/main.py | _is_profile_api_key_provider() routes new providers to model flow without new elif |
| provider_model_ids() | hermes_cli/models.py | Calls profile.fetch_models() then profile.fallback_models for api_key profiles |
| Doctor health checks | hermes_cli/doctor.py | Auto-appends to _apikey_providers_static at runtime |
| OPTIONAL_ENV_VARS | hermes_cli/config.py | _inject_profile_env_vars() populates from env_vars at import |
| _URL_TO_PROVIDER domain map | agent/model_metadata.py | profile.get_hostname() auto-derived from base_url |
| Transport kwargs | agent/transports/chat_completions.py | _build_kwargs_from_profile(); hooks handle all quirks |
| api_mode resolution | hermes_cli/runtime_provider.py | profile.api_mode read directly |
| Aux model selection | agent/auxiliary_client.py | profile.default_aux_model read first |
| Request routing | run_agent.py | get_provider_profile(); all 30 registered providers go through the profile path |
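The "skips names already declared" rule for PROVIDER_REGISTRY can be pictured roughly as follows. This is a sketch under assumptions: profiles are shown as plain dicts, and the extend_registry helper and the shape of the registry entries are hypothetical, not the code in hermes_cli/auth.py.

```python
def extend_registry(registry, profiles):
    """Add an entry for each profile whose name is not already declared.

    Existing entries (for example hand-written OAuth flows) are left alone.
    Returns the list of names that were added.
    """
    added = []
    for profile in profiles:
        name = profile["name"]
        if name in registry:
            continue  # a bespoke entry already exists; do not overwrite it
        registry[name] = {
            "auth_type": "api_key",  # assumed entry shape
            "env_vars": list(profile["env_vars"]),
        }
        added.append(name)
    return added


# Usage: one pre-declared entry survives, one new provider is auto-added.
registry = {"legacy": {"auth_type": "oauth", "env_vars": []}}
profiles = [
    {"name": "legacy", "env_vars": ["LEGACY_API_KEY"]},
    {"name": "example", "env_vars": ["EXAMPLE_API_KEY"]},
]
added = extend_registry(registry, profiles)
```

Skipping declared names keeps the auto-extend step additive, so providers with bespoke auth code keep their existing behavior.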

New profile fields added beyond the original spec:

  • display_name — human label for picker / setup wizard
  • description — picker subtitle
  • signup_url — shown during first-run setup
  • fallback_models — curated list for model picker when live fetch fails
  • hostname — base hostname for URL→provider reverse-mapping (auto-derived from base_url)
  • default_aux_model — cheap model for auxiliary tasks
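Two of these fields lend themselves to a short sketch: deriving hostname from base_url, and falling back to fallback_models when the live fetch fails. The helper names derive_hostname and resolve_model_ids, and the exact fallback conditions, are assumptions for illustration; only urllib.parse.urlparse is a real standard-library call.

```python
from urllib.parse import urlparse


def derive_hostname(base_url):
    """Return the bare hostname of a base URL, or "" if none can be parsed."""
    return urlparse(base_url).hostname or ""


def resolve_model_ids(fetch_models, fallback_models):
    """Try the live fetch first; use the curated list if it fails or is empty."""
    try:
        models = fetch_models()
    except Exception:
        models = None  # treated the same as an empty result
    return list(models) if models else list(fallback_models)


def _failing_fetch():
    # Stand-in for a network call that raises.
    raise RuntimeError("network unavailable")


host = derive_hostname("https://api.example.com/v1")
live = resolve_model_ids(lambda: ["model-a", "model-b"], ["curated-1"])
fallback = resolve_model_ids(_failing_fetch, ["curated-1"])
```

A reverse map from hostname to provider name can then be built by applying derive_hostname to each registered profile's base_url.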

Provider Quirk Scatter (Current State)

| Provider | Files touched | Approx lines | Key quirks |
| --- | --- | --- | --- |
| Anthropic | 6 | ~1510 | OAuth identity spoof, model-gated features, thinking modes, prompt caching |
| Bedrock | 5 | ~1000 | boto3 client, dual api_mode (Converse vs Anthropic), region/guardrail |
| OpenAI Codex | 6 | ~940 | Responses API, Cloudflare headers, encrypted reasoning |
| Nous Portal | 6 | ~875 | OAuth device flow, agent keys, rate guard, tags |
| OpenRouter | 6 | ~340 | HTTP headers, provider preferences, reasoning extra_body |
| Copilot/GitHub | 7 | ~205 | Editor headers, dynamic api_mode per model, reasoning |
| Kimi/Moonshot | 7 | ~130 | Dual endpoint, User-Agent spoof, temp OMIT, thinking/effort |
| Custom/Ollama | 4 | ~110 | Auto-detect model, num_ctx, think=false |
| Qwen Portal | 4 | ~95 | OAuth, message preprocessing, vl_high_resolution |
| ZAI/GLM | 3 | ~85 | 4-endpoint probing, billing plan detection |
| MiniMax | 4 | ~75 | Anthropic-compat, Bearer auth, beta header strip |
| xAI/Grok | 4 | ~45 | Responses API, encrypted reasoning, conv headers |
| NVIDIA | 3 | ~30 | max_tokens 16384 |
| Alibaba | 2 | ~25 | DashScope URL |
| HuggingFace | 2 | ~25 | Case-sensitive model IDs |
| DeepSeek | 3 | ~20 | reasoning_content field |
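To make two of the smaller rows concrete, quirks such as "temp OMIT" and "max_tokens 16384" could be expressed as small hooks applied to the request kwargs. This is only a sketch: the hook functions, the QUIRK_HOOKS mapping, the "kimi" and "nvidia" keys, and the reading of 16384 as a default (rather than a hard cap) are all assumptions, since the issue does not spell out the exact semantics.

```python
def omit_temperature(kwargs):
    """Drop the temperature field entirely so the provider's default applies."""
    kwargs.pop("temperature", None)
    return kwargs


def default_max_tokens(limit):
    """Build a hook that fills in max_tokens only when the caller set none."""
    def hook(kwargs):
        kwargs.setdefault("max_tokens", limit)
        return kwargs
    return hook


# Hypothetical mapping from provider name to its ordered quirk hooks.
QUIRK_HOOKS = {
    "kimi": [omit_temperature],
    "nvidia": [default_max_tokens(16384)],
}


def apply_quirks(provider, kwargs):
    """Run a provider's hooks over a copy of kwargs; unknown providers pass through."""
    out = dict(kwargs)
    for hook in QUIRK_HOOKS.get(provider, []):
        out = hook(out)
    return out
```

Working on a copy keeps the caller's kwargs unchanged, which matters if the same request is retried against a fallback provider.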

What Stays on AIAgent (NOT moving to providers)

  • Client lifecycle (construction, interrupt, rebuild)
  • Streaming orchestration
  • Credential rotation / fallback
  • Prompt caching
  • Message history / _build_assistant_message
  • Tool dispatch

Relationship to Cycle 1

Open items

  • auth.py PROVIDER_REGISTRY full replacement — complex OAuth flows
    (Anthropic, Nous device-code, Copilot, OpenAI Codex, Bedrock) are still handled
    by bespoke code in hermes_cli/auth.py. Auto-extend covers all simple api_key
    providers, but OAuth flows need their own migration work.
  • Docs: website/docs/developer-guide/adding-providers.md exists but
    needs a full pass, plus integration docs on the new auto-wiring contract.

Labels

  • P2: Medium, degraded but workaround exists
  • area/config: Config system, migrations, profiles
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • type/refactor: Code restructuring, no behavior change
