feat: add vLLM/local LLM support #4
Merged
Re-bin merged 1 commit into HKUDS:main on Feb 2, 2026
Conversation
- Add vllm provider configuration in config schema
- Auto-detect vLLM endpoints and use hosted_vllm/ prefix for LiteLLM
- Pass api_base directly to acompletion for custom endpoints
- Add vLLM status display in CLI status command
- Add vLLM setup documentation in README
Collaborator
Hi ZhihaoZhang97, This is great! Thanks for the PR :) Best regards,
vLLM shouldn't be hard-coded here; communication should go through the API. Configure it as OpenAI-compatible so that every service supporting the API can be used, e.g. ollama and LM Studio. The model-name in the agents settings can be *, so the config never needs changing no matter which model the backend swaps in. With OpenAI-compatible communication the frontend doesn't need to care which model it is, and OpenAI-compatible endpoints also support loading or switching a model via command.
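The commenter's suggestion might look something like the following, reusing the apiKey/apiBase shape from this PR's own config example but under a generic provider. Note the `openai_compatible` key and the `*` wildcard model are illustrative assumptions from the comment, not confirmed nanobot config options:

```json
{
  "providers": {
    "openai_compatible": {
      "apiKey": "dummy",
      "apiBase": "http://localhost:11434/v1"
    }
  },
  "agents": {
    "defaults": { "model": "*" }
  }
}
```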
anchapin added a commit to anchapin/nanobot that referenced this pull request on Feb 6, 2026
…support

This commit fixes 6 categories of issues identified during code review:

**Security Fixes (Task HKUDS#1, HKUDS#2):**
- Fix LiteLLMProvider API key race condition by passing api_key directly to litellm instead of modifying os.environ (prevents credential leakage)
- Fix RateLimiter defaultdict thread-safety issue by using explicit dict.get()
- Add TTL-based cleanup to RateLimiter (max_age_seconds, max_entries) to prevent memory exhaustion DoS from unbounded user ID growth

**Resource Management (Task HKUDS#3, HKUDS#4):**
- Implement ProcessRegistry for tracking spawned ffmpeg/ffprobe processes
- Add signal handlers (SIGTERM, SIGINT) for graceful process cleanup
- Make video processing timeouts configurable (frame, audio, info)
- Add timeout-safe process.wait() after process.kill()
- Implement periodic background cleanup for media files
- Add signal handlers to MediaCleanupRegistry for reliable cleanup
- Add thread-safe file registration with get_stats() monitoring

**Configuration Improvements (Task HKUDS#5):**
- Make TTS model, max_text_length, and timeout configurable
- Add validation for TTS config parameters (model, max_length, timeout)
- Improve error messages with text preview for debugging
- Return tuple[bool, str | None] from synthesize() to warn about truncation
- Update TelegramChannel to handle TTS truncation warnings

**Code Quality (Task HKUDS#6):**
- Refactor rate limiters to use factory functions (cleaner API)
- Keep backwards-compatible class aliases marked as deprecated
- Update imports to use factory functions (tts_rate_limiter, etc.)

**Documentation:**
- Add CODE_REVIEW_ISSUES.md with detailed analysis and fix summary
- Update CLAUDE.md with new multi-modal patterns and utilities

**Testing:**
- Add tests for rate limiter cleanup functionality
- All 35 tests passing

Core line count: 4,833 lines (security/reliability improvements added ~478 lines)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
orgoj pushed a commit to orgoj/nanobot that referenced this pull request on Feb 13, 2026
feat(router): add CODING tier, per-tier secondary models, greeting patterns
StreamAzure pushed a commit to StreamAzure/nanobot_Theo that referenced this pull request on Feb 18, 2026
feat: add vLLM/local LLM support
KinglittleQ pushed a commit to KinglittleQ/nanobot that referenced this pull request on Feb 19, 2026
## Fixes (from subagent code review)

### P0: Subagent tool context isolation (HKUDS#1)
- Subagent now calls set_tool_context() before executing tools
- Prevents message routing to wrong session in concurrent scenarios

### P0: Atomic session writes (HKUDS#2)
- Session save() now writes to temp file then os.replace() (atomic on POSIX)
- Crash mid-write no longer corrupts/empties session file

### P1: Consolidation saves last_consolidated (HKUDS#3)
- After successful consolidation, session is saved to persist the pointer
- On failure, still advances pointer to avoid retrying same messages

### P1: admin.json mtime cache (HKUDS#4)
- _load_admin_config() now caches by file mtime
- Avoids re-reading disk on every tool call in high-frequency scenarios

### P1: Subagent registry atomic write (HKUDS#5)
- _save_registry() uses temp file + os.replace() like session save

### P2: LLM error responses not saved to session (HKUDS#17)
- When finish_reason == "error", the error is returned to the user but NOT added to session history, preventing context pollution

### P2: Subagent timeout protection (HKUDS#15)
- Max iterations reduced from 500 to 200
- Added 30-minute wall-clock timeout to prevent runaway tasks

### P2: SSRF protection for web_fetch (HKUDS#10)
- _validate_url() now blocks private/loopback/link-local/reserved IPs

### P3: ReadFileTool binary file handling (HKUDS#13)
- Catches UnicodeDecodeError and returns friendly message instead of crash
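The atomic-write pattern used for both the session file and the subagent registry (temp file + `os.replace()`) can be sketched as follows. The function name is hypothetical; the technique is standard: `os.replace` is atomic on POSIX, so a crash mid-write leaves either the old file or the new one, never a truncated mix.

```python
import json
import os
import tempfile


def save_json_atomic(path: str, data: dict) -> None:
    """Write JSON to a temp file in the target directory, fsync it,
    then atomically swap it over the destination with os.replace()."""
    dir_ = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the swap
        os.replace(tmp, path)     # atomic on POSIX; replaces existing file
    except BaseException:
        os.unlink(tmp)            # don't leave stray temp files behind
        raise
```

Creating the temp file in the same directory as the target matters: `os.replace` is only atomic within a single filesystem.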
KinglittleQ pushed a commit to KinglittleQ/nanobot that referenced this pull request on Feb 26, 2026
- HKUDS#1: Move MODEL_CONTEXT_WINDOWS/MODEL_PRICING to registry.py; add get_context_window() and get_pricing() module-level helpers; remove duplicate dicts from loop.py
- HKUDS#2: Extract _build_status() into module-level _build_status_report(); pure function with no AgentLoop dependency; 80 lines → 12 lines in class
- HKUDS#3: Extract on_cron_job closure into _make_cron_job_handler(); gateway() is now ~50 lines shorter and easier to read
- HKUDS#4: Extract _build_tools() factory function shared by AgentLoop + SubagentManager; eliminates ~15 lines of duplicated tool registration in subagent.py
- HKUDS#8: Replace 6-tuple return from _run_agent_loop() with LoopResult dataclass; named fields instead of positional unpacking

Also:
- Move _strip_think/_tool_hint/_format_tool_detail to module level
- Remove verbose=False branch from _format_tool_detail (unused)
- Remove ToolRegistry import from subagent.py (no longer needed directly)
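The HKUDS#8 refactor above is the classic tuple-to-dataclass move; a sketch of what LoopResult might look like (field names are assumptions, only the class name comes from the commit):

```python
from dataclasses import dataclass


@dataclass
class LoopResult:
    """Named result of one agent-loop run, replacing a 6-tuple.

    Field names are hypothetical; the point is that call sites read
    result.finish_reason instead of result[1]."""
    reply: str
    finish_reason: str
    iterations: int
    tool_calls: int
    input_tokens: int
    output_tokens: int
```

Positional unpacking of a 6-tuple silently breaks when a field is added or reordered; with a dataclass, new fields can be appended with defaults and every existing call site keeps working.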
ollie-dev-ops pushed a commit to mics8128/nanobot that referenced this pull request on Feb 27, 2026
README fix
JiajunBernoulli pushed a commit to JiajunBernoulli/nanobot that referenced this pull request on Mar 15, 2026
fix: cannot import name '_apply_patches'; update v0.1.6
WTHDonghai pushed a commit to WTHDonghai/nanobot that referenced this pull request on Mar 22, 2026
…actions/setup-go-6

Bump actions/setup-go from 5 to 6
LeslieMiau added a commit to LeslieMiau/nanobot that referenced this pull request on Mar 29, 2026
Add a user-facing CLI entrypoint for creating persisted coding tasks and reuse the shared coding-task runtime loader across gateway and CLI flows. Update harness state to mark feature HKUDS#4 complete and record the verification checkpoint. Co-authored-by: Codex <noreply@openai.com>
weitongtong added a commit to weitongtong/nanobot that referenced this pull request on Apr 11, 2026
Supports updating the mutable fields of existing scheduled tasks: name, schedule, message content, delivery configuration, etc. System tasks (system_event) are protected and cannot be edited. Includes full unit test coverage.
Made-with: Cursor
Co-authored-by: weitongtong <tongtong.wei@nodeskai.com>
dragosroua pushed a commit to dragosroua/aigernon that referenced this pull request on Apr 13, 2026
feat: add vLLM/local LLM support
mohamed-elkholy95 added a commit to mohamed-elkholy95/nanobot that referenced this pull request on Apr 20, 2026
Remove the two blank-line-after-import additions in transcribe_audio that slipped in with the bus-bounding change. Same class of unrelated formatting churn the reviewer flagged in point HKUDS#4; now the PR-vs-base diff for base.py contains only the functional drop-feedback branch.
liflovs added a commit to liflovs/nanobot that referenced this pull request on Apr 29, 2026
Resolves conflicts in:
- session/manager.py: kept upstream's media-breadcrumb + timestamp logic, added thinking_blocks to history allowlist (upstream had reasoning_content, we extended). Patch HKUDS#6 effectively absorbed.
- agent/loop.py: dropped consolidation_model param (upstream removed), took upstream's expanded constructor signature, ExecTool sandbox/allowed_env_keys, conditional WebSearch/WebFetch via web_config.enable. Patch HKUDS#1 (disable web_search) now achieved via config, not code.
- agent/tools/message.py: combined upstream's path resolution + metadata with our patch HKUDS#3 (zero-byte/missing media validation), running validation after path resolution.
- cli/commands.py: dropped consolidation_model arg from agent ctor, took upstream's expanded args. Patch HKUDS#4 (cron prefix) merged with upstream's wording; kept the 'do not create new cron reminders' note.
- config/schema.py: removed consolidation_model field (upstream dropped it in favor of provider_retry_mode + max_tool_result_chars + others).

All six fork patches still apply or have been absorbed by upstream:
1. web_search → now config-driven (web_config.enable=false)
2. HTML unescape → still in shell.py (auto-merged)
3. Media validation → integrated into message.py
4. Cron prefix → wording merged
5. ${VAR} interpolation → loader.py auto-merged (upstream added its own resolve_config_env_vars; ours runs at load time, theirs is opt-in)
6. reasoning_content/thinking_blocks → upstream landed reasoning_content; we kept the thinking_blocks extension
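The ${VAR} interpolation patch mentioned in point 5 is a common config-loader idiom; a minimal sketch, assuming substitution happens at load time against os.environ (the helper name is hypothetical, not the fork's actual code):

```python
import os
import re

# Matches ${NAME} where NAME is an uppercase env-var-style identifier.
_VAR = re.compile(r"\$\{([A-Z_][A-Z0-9_]*)\}")


def resolve_env_vars(value: str) -> str:
    """Replace each ${NAME} with os.environ['NAME'] at config-load time.

    Unset variables are left verbatim so the loader can surface a
    clearer error later instead of silently inserting empty strings."""
    return _VAR.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)
```

Leaving unset variables untouched (rather than blanking them) is the design choice that makes misconfiguration visible downstream.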
ctmmit pushed a commit to ctmmit/foreman that referenced this pull request on Apr 29, 2026
Phase 2 of the bootstrap plan: extends nanobot's free-form Dream memory with the seven structured slots from CLAUDE.md (shop_profile, equipment, customers, materials, routing_memory, pricing_corrections, audit_log).

Files added under foreman/:
- memory/models.py: Pydantic models for the seven slots, with reversal fields on PricingCorrection (corrections are marked reversed, never edited).
- memory/store.py: ForemanMemoryStore subclass of nanobot.MemoryStore. Per-slot CRUD with audit-log-on-write at the data layer; the audit entry cannot be skipped because it's emitted from inside every mutating method, not from the calling tool. Atomic writes via tmp+os.replace.
- memory/resolver.py: customer-id resolver with confidence-gated escalation per CLAUDE.md non-negotiable HKUDS#4. Resolution chain: exact email_address match → unique-domain match → fuzzy display_name. Below 0.9 it never returns a match; it always returns escalate=True with a candidate list for owner pick. Reply-To-different-from-From triggers escalation even when both individually match (forwarded-RFQ scenario).
- hooks/personality.py: PersonalityWriteHook for telemetry on personality-mutating tool calls. Observability only; the actual audit log lives at the data layer.
- tests/test_store.py + tests/test_resolver.py: 29 tests covering CRUD round-trips, audit-log integrity (the non-negotiable), atomic-write guarantees, reversal flow, and every documented resolver scenario including the property-test that confidence < 0.9 NEVER returns a match.

pyproject.toml updated:
- packages = ["nanobot", "foreman"]
- testpaths includes foreman/tests
- coverage source includes foreman

All 29 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
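The confidence-gated escalation described for memory/resolver.py can be sketched as below. All names and the scoring input are hypothetical; only the rule itself (below the 0.9 threshold, never return a match, always escalate with candidates) comes from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Resolution:
    match: str | None                 # customer id, or None when escalating
    escalate: bool
    candidates: list[str] = field(default_factory=list)


def resolve(scored: dict[str, float], threshold: float = 0.9) -> Resolution:
    """Gate resolution on confidence: below threshold, never match.

    `scored` maps candidate customer ids to confidence scores, as a
    stand-in for the exact-email / unique-domain / fuzzy-name chain."""
    if not scored:
        return Resolution(match=None, escalate=True)
    best_id, best_score = max(scored.items(), key=lambda kv: kv[1])
    if best_score < threshold:
        # Never guess: hand the ranked candidate list to the owner.
        ranked = sorted(scored, key=scored.get, reverse=True)
        return Resolution(match=None, escalate=True, candidates=ranked)
    return Resolution(match=best_id, escalate=False)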
Summary
This PR adds support for vLLM and other OpenAI-compatible local LLM endpoints.
Changes
- vllm provider configuration
- hosted_vllm/ prefix for LiteLLM
- nanobot status command
Usage
```json
{
  "providers": {
    "vllm": {
      "apiKey": "dummy",
      "apiBase": "http://your-vllm-server:8000/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "your-model-name"
    }
  }
}
```
Testing
Tested with vLLM server running gpt-oss-120b model.