AI Models

Included Models

All models are stored at /opt/volt/ai-models/ and are included in the ISO image.

Model	File	Size	Format	License
Qwen3 1.7B (Q8_0)	`ai-models/Qwen3-1.7B-Q8_0.gguf`	1.7 GB	GGUF	Apache 2.0
Whisper Small	`ai-models/whisper/ggml-small.bin`	465 MB	GGML	MIT
OuteTTS 0.2 (Q8_0)	`ai-models/outetts/OuteTTS-0.2-500M-Q8_0.gguf`	512 MB	GGUF	Apache 2.0
WavTokenizer Large	`ai-models/outetts/WavTokenizer-Large-75-F16.gguf`	124 MB	GGUF	MIT

Total AI model footprint: ~2.8 GB

Model Directory Structure

/opt/volt/ai-models/
├── Qwen3-1.7B-Q8_0.gguf
├── outetts/
│   ├── OuteTTS-0.2-500M-Q8_0.gguf
│   └── WavTokenizer-Large-75-F16.gguf
├── whisper/
│   └── ggml-small.bin
└── bin/
    └── whisper

Inference Runtimes

All models run via llama.cpp:

LLM: llama-server with Qwen3 GGUF — conversation, reasoning, code
STT: whisper.cpp with ggml-small — speech recognition
TTS: llama-tts with OuteTTS GGUF + WavTokenizer — text-to-speech

TTS Quality

OuteTTS 0.2 provides significantly better quality than the previous Piper TTS engine:

82M parameter StyleTTS2 architecture
24 kHz output sample rate
Near-human speech quality
WavTokenizer vocoder for audio decoding

Resource Reservations

Model	VRM Quota	DWP Workers	Priority
Qwen3 (Orbit)	2048 MB	4 dedicated	Critical
Whisper (STT)	512 MB	Shared	High
OuteTTS (TTS)	512 MB	Shared	High

← Back: Documentation Index

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AI Models

Included Models

Model Directory Structure

Inference Runtimes

TTS Quality

Resource Reservations

Related

FilesExpand file tree

ai-models.md

Latest commit

History

ai-models.md

File metadata and controls

AI Models

Included Models

Model Directory Structure

Inference Runtimes

TTS Quality

Resource Reservations

Related