
Local whisper transcription #3723

Open

dilidin2 wants to merge 3 commits into HKUDS:nightly from dilidin2:feat/local-whisper-transcription

Conversation

@dilidin2

Summary

Adds support for local voice transcription via faster-whisper, a reimplementation of Whisper built on the CTranslate2 inference engine that runs entirely on the host machine; no API key or network access is required.

This is useful for users who:

  • Do not want to depend on Groq or OpenAI for transcription
  • Need offline or air-gapped deployments
  • Want to reduce costs by avoiding transcription API calls

Changes

  • nanobot/providers/transcription.py — adds FasterWhisperTranscriptionProvider (sketched after this list) with:
    • Lazy model loading (the model is loaded on first use, not at startup)
    • Class-level model cache with a 10-minute idle TTL — the model is automatically unloaded after inactivity to free RAM/VRAM
    • asyncio.Lock to prevent concurrent calls from loading the model twice
    • Auto-detection of CUDA via torch (falls back to CPU if torch is not installed)
  • nanobot/channels/base.py — adds transcription_model_size and transcription_device fields to BaseChannel
  • nanobot/config/schema.py — adds the same two fields to ChannelsConfig; transcription_provider now accepts "local" in addition to "groq" and "openai"
  • pyproject.toml — adds optional dependency group local-transcription
  • tests/providers/test_transcription.py — adds tests for missing file, missing package, device auto-detection, model caching, and concurrent call safety
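
A minimal sketch of how such a provider can be structured, based only on the bullet points above. The class name and config fields come from this PR description; the method and attribute names (_detect_device, _get_model, _schedule_unload, _do_unload, transcribe) are assumptions for illustration and may differ from the actual code:

import asyncio
import logging

logger = logging.getLogger(__name__)

_IDLE_TTL_SECONDS = 600  # unload the model after 10 minutes without use


class FasterWhisperTranscriptionProvider:
    _model = None            # class-level cache shared by all instances
    _lock = asyncio.Lock()   # serializes loading so concurrent calls load the model once
    _unload_handle = None    # timer handle for the idle-TTL unload

    def __init__(self, model_size: str = "small", device: str = "auto"):
        self.model_size = model_size
        self.device = device

    @staticmethod
    def _detect_device() -> str:
        # CUDA auto-detection via torch; fall back to CPU when torch is absent.
        try:
            import torch
        except ImportError:
            return "cpu"
        return "cuda" if torch.cuda.is_available() else "cpu"

    async def _get_model(self):
        cls = type(self)
        async with cls._lock:
            if cls._model is None:
                # Lazy import and lazy load: nothing happens at startup.
                from faster_whisper import WhisperModel
                device = self.device if self.device != "auto" else self._detect_device()
                logger.info("Loading faster-whisper model %s on %s", self.model_size, device)
                cls._model = WhisperModel(self.model_size, device=device)
            self._schedule_unload()
            return cls._model

    def _schedule_unload(self) -> None:
        cls = type(self)
        loop = asyncio.get_running_loop()
        if cls._unload_handle is not None:
            cls._unload_handle.cancel()
        cls._unload_handle = loop.call_later(_IDLE_TTL_SECONDS, cls._do_unload)

    @classmethod
    def _do_unload(cls) -> None:
        # Free RAM/VRAM after the idle TTL and clear the stale timer handle.
        cls._model = None
        cls._unload_handle = None

    async def transcribe(self, audio_path: str) -> str:
        model = await self._get_model()

        def _run() -> str:
            # transcribe() returns a lazy generator; consume it off the event loop.
            segments, _info = model.transcribe(audio_path)
            return " ".join(segment.text.strip() for segment in segments)

        return await asyncio.get_running_loop().run_in_executor(None, _run)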

Installation

faster-whisper is an optional dependency and must be explicitly installed:

uv tool install "nanobot-ai[local-transcription]"

The Whisper model weights (~500 MB for small) are downloaded automatically on first use and cached in ~/.cache/huggingface/hub/.
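
For air-gapped deployments, the cache can be populated on a connected machine first and then copied to the offline host. A minimal sketch using only the documented faster-whisper API (constructing WhisperModel downloads the weights if they are not cached yet):

from faster_whisper import WhisperModel

# Triggers the download into ~/.cache/huggingface/hub/ if the weights are not
# present; copy that directory to the offline machine afterwards.
WhisperModel("small", device="cpu", compute_type="int8")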

Configuration

Add the following to your nanobot config:

{
  "transcriptionProvider": "local",
  "transcriptionModelSize": "small",
  "transcriptionDevice": "cpu"
}

Available model sizes (trade-off between quality, speed and RAM):

Model      RAM       Notes
tiny       ~390 MB   Fastest, lower accuracy
base       ~500 MB   Good for simple speech
small      ~960 MB   Recommended default
medium     ~3 GB     Higher accuracy
large-v3   ~6 GB     Best quality, GPU recommended
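
For reference, the two config fields map onto faster-whisper's WhisperModel constructor roughly as follows. This illustrates the upstream library's documented API, not the provider's internal code, and the audio filename is just an example:

from faster_whisper import WhisperModel

# "transcriptionModelSize" -> model size, "transcriptionDevice" -> device
model = WhisperModel("small", device="cpu")
segments, info = model.transcribe("voice_message.ogg")
print(info.language, info.language_probability)
print(" ".join(segment.text.strip() for segment in segments))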

Notes

  • No changes to the existing Groq and OpenAI provider paths
  • If transcriptionProvider is set to "local" but faster-whisper is not installed, transcription returns an empty string and logs a clear install instruction (see the sketch after these notes)
  • torch is not a required dependency — CUDA auto-detection gracefully falls back to CPU if torch is not present
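
A minimal sketch of that fallback path, assuming a hypothetical helper name; the actual function in the PR may be shaped differently:

import logging

logger = logging.getLogger(__name__)

def _load_whisper_or_none(model_size: str, device: str):
    # Return a WhisperModel, or None with an install hint when the optional
    # dependency is missing, so callers can return "" instead of raising.
    try:
        from faster_whisper import WhisperModel
    except ImportError:
        logger.error(
            "faster-whisper is not installed. Enable local transcription with: "
            'uv tool install "nanobot-ai[local-transcription]"'
        )
        return None
    return WhisperModel(model_size, device=device)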

This implementation was developed with AI assistance.

dilidin2 added 3 commits May 11, 2026 00:36
- Replace deprecated asyncio.get_event_loop() with get_running_loop()
  in _schedule_unload() and transcribe()
- Clear _unload_handle in _do_unload() to avoid stale reference
- Remove autouse=True from _reset_faster_whisper_cache fixture
- Rewrite auto_device tests using sys.modules patching instead of
  patch('torch.cuda.is_available') — no torch dependency needed
- Add test_faster_whisper_auto_device_torch_not_installed case
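
The sys.modules approach mentioned in these commit messages can be illustrated roughly as follows; the _detect_device name is an assumption carried over from the earlier sketch, while the import path follows the module listed under Changes:

import sys
from unittest.mock import patch

from nanobot.providers.transcription import FasterWhisperTranscriptionProvider

def test_faster_whisper_auto_device_torch_not_installed():
    # Setting sys.modules["torch"] to None makes `import torch` raise
    # ImportError, so auto-detection must fall back to "cpu" without torch.
    with patch.dict(sys.modules, {"torch": None}):
        assert FasterWhisperTranscriptionProvider._detect_device() == "cpu"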
