Skip to content

Latest commit

 

History

History
87 lines (59 loc) · 2.46 KB

File metadata and controls

87 lines (59 loc) · 2.46 KB

AI Stack

Local, offline, private AI running entirely on-device — zero cloud dependency. Total model footprint: ~2.8 GB, stored at /opt/volt/ai-models/.


Pipeline

Text → Qwen3 1.7B (llama-server) → Response text

Audio → Whisper (STT) → Text → Qwen3 → OuteTTS (TTS) → Audio

Models

Model Format Size Role
Qwen3-1.7B-Q8_0.gguf GGUF (Q8_0) 1.7 GB LLM — conversation, reasoning, code
ggml-small.bin GGML (f16) 465 MB Speech recognition (Whisper)
OuteTTS-0.2-500M-Q8_0.gguf GGUF (Q8_0) 512 MB Text-to-speech generation
WavTokenizer-Large-75-F16.gguf GGUF (F16) 124 MB Audio vocoder

Resource Management

Orbit receives dedicated resource reservations to ensure interactive performance:

Resource Value
DWP workers 4 dedicated inference workers
Tesseract reservation 512 MiB reserved memory
VRM memory quota 2048 MiB (critical priority, T1 tier)
Burst window 100ms with 75% worker allocation during inference
Max consecutive bursts 3

Qwen3 1.7B Parameters

Parameter Fast Smart Code Data-science
Temperature 0.7 0.6 0.5 0.5
top_p 0.8 0.95 0.95 0.95
top_k 20 20 20 20
min_p 0 0 0 0
presence_penalty 1.5 1.5 1.5 1.5
num_ctx 8192 8192 8192 8192
n_predict 1024 4096 4096 4096

Inference Details

  • All inference runs locally via llama.cpp (llama-server for LLM, llama-tts for TTS)
  • GPU acceleration via Metal (macOS) or CUDA / Vulkan (Linux)
  • Models idle-shutdown after 45 seconds of inactivity (configurable)
  • First call incurs ~8s Metal shader init (subsequent calls use cached pipeline)
  • Lazy loading: model loaded on first request (or at Orbit startup)

Integration

Orbit runs as a service within Kernel A:

  • LLM inference managed by orbitRuntime.js interfacing with llama.cpp
  • STT handled by sttService.js using the Whisper binary
  • TTS handled by ttsService.js using OuteTTS
  • The Orbit bridge (orbitBridge.js) in the Volt Unifier handles desktop communication

Related



← Back: Architecture Overview