feat(skills): add use-local-whisper skill package #702

Merged
gavrielc merged 3 commits into qwibitai:main from glifocat:upstream/skill-use-local-whisper
Mar 4, 2026

glifocat commented Mar 4, 2026

Type of Change

  • Skill - adds a new skill in .claude/skills/
  • Fix - bug fix or security fix to source code
  • Simplification - reduces or simplifies source code

Description

Closes #699

Adds the use-local-whisper skill that replaces OpenAI Whisper API transcription with local whisper.cpp. Runs entirely on-device — no API key, no network, no cost. WhatsApp channel only for now.

Requires the voice-transcription skill to be applied first. Preserves the public API (transcribeAudioMessage, isVoiceMessage) while swapping the internals to use whisper-cli + ffmpeg.
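
The local pipeline described above (convert the voice note with ffmpeg, then run whisper-cli on the result) might look roughly like the sketch below. This is an illustration under stated assumptions, not the skill's actual `transcription.ts`: the helper names (`buildFfmpegArgs`, `buildWhisperArgs`, `transcribeLocally`), the `WHISPER_MODEL` environment variable, the default model path, and the temporary file names are all hypothetical. It assumes `ffmpeg` and `whisper-cli` are on the `PATH` and that whisper-cli is fed 16 kHz mono WAV.

```typescript
import { execFile } from "node:child_process";
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical configuration: the real skill's model location is not shown in this PR.
const MODEL_PATH = process.env.WHISPER_MODEL ?? "models/ggml-base.bin";

// ffmpeg arguments: overwrite output, resample to 16 kHz, downmix to mono, write WAV.
export function buildFfmpegArgs(input: string, output: string): string[] {
  return ["-y", "-i", input, "-ar", "16000", "-ac", "1", "-f", "wav", output];
}

// whisper-cli arguments: model file, input file, no timestamps, no extra log prints.
export function buildWhisperArgs(model: string, wav: string): string[] {
  return ["-m", model, "-f", wav, "-nt", "-np"];
}

// Transcribe an audio buffer fully on-device; returns null on failure or empty output.
export async function transcribeLocally(audio: Buffer): Promise<string | null> {
  const dir = await mkdtemp(join(tmpdir(), "voice-"));
  const input = join(dir, "input.ogg"); // assumed container; ffmpeg probes the content
  const wav = join(dir, "input.wav");
  try {
    await writeFile(input, audio);
    await execFileAsync("ffmpeg", buildFfmpegArgs(input, wav));
    const { stdout } = await execFileAsync(
      "whisper-cli",
      buildWhisperArgs(MODEL_PATH, wav),
    );
    const text = stdout.trim();
    return text.length > 0 ? text : null;
  } catch {
    // A missing binary, missing model, or undecodable audio all end up here.
    return null;
  } finally {
    // Always remove the temporary audio files so voice data does not linger on disk.
    await rm(dir, { recursive: true, force: true });
  }
}
```

Keeping the argument builders separate from the process calls makes the command lines unit-testable without having ffmpeg or a model installed. A public function such as `transcribeAudioMessage` could then delegate to a helper like this without changing its signature.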

Also fixes a stale Baileys test mock in the upstream add-voice-transcription skill — adds missing fetchLatestWaWebVersion and normalizeMessageContent exports that were causing all socket-dependent tests to fail after skill application.
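
To illustrate the mock fix, here is a minimal sketch in plain TypeScript of a stale mock versus one that provides the two missing exports, plus a small check that reports which exports a mock lacks. It is not the skill's actual `whatsapp.test.ts` template: the `missingExports` helper, the shape of both mock objects, and the stubbed return values are assumptions made for illustration. In a real test file an object like `fixedMock` would be returned from the test runner's module-mock factory.

```typescript
type MockModule = Record<string, unknown>;

// The two named exports the PR description says were missing from the mock.
const REQUIRED_EXPORTS = [
  "fetchLatestWaWebVersion",
  "normalizeMessageContent",
] as const;

// Return the names of required exports that the mock does not provide as functions.
export function missingExports(
  mock: MockModule,
  required: readonly string[] = REQUIRED_EXPORTS,
): string[] {
  return required.filter((name) => typeof mock[name] !== "function");
}

// Hypothetical stale mock: only stubs socket creation.
export const staleMock: MockModule = {
  makeWASocket: () => ({}),
};

// Hypothetical fixed mock: adds the two exports with stubbed behavior.
export const fixedMock: MockModule = {
  ...staleMock,
  // Assumed stub values; a test only needs a resolved promise here.
  fetchLatestWaWebVersion: async () => ({ version: [2, 3000, 0], isLatest: true }),
  // Identity stub: hands the message content back unchanged.
  normalizeMessageContent: <T>(content: T): T => content,
};

console.log(missingExports(staleMock)); // lists both missing export names
console.log(missingExports(fixedMock)); // empty list
```

A check like this could run once at the top of a test suite so that a stale mock fails fast with a named export instead of surfacing later as a timeout.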

Comment thread .claude/skills/use-local-whisper/modify/src/transcription.ts Outdated
Ethan M and others added 3 commits March 4, 2026 15:55
Replaces OpenAI Whisper API transcription with local whisper.cpp,
eliminating the external API dependency and keeping voice data on-device.

Preserves the public API (transcribeAudioMessage, isVoiceMessage) while
swapping the internals to use whisper-cli + ffmpeg for local processing.
Requires the voice-transcription skill to be applied first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The add-voice-transcription skill's modify template for whatsapp.test.ts
was missing fetchLatestWaWebVersion and normalizeMessageContent in the
Baileys mock, causing all socket-dependent tests to fail with ETIMEDOUT
after applying the skill.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gavrielc gavrielc force-pushed the upstream/skill-use-local-whisper branch from 82f3d05 to e98a1d2 Compare March 4, 2026 13:55
@gavrielc gavrielc merged commit 03f792b into qwibitai:main Mar 4, 2026
5 checks passed

gavrielc commented Mar 4, 2026

Thanks for the great contribution @glifocat! This is a really well-structured skill — clean package, thorough docs, and solid test coverage. Hope to see more skills like this from you!


gavrielc commented Mar 4, 2026

> Thanks for the great contribution @glifocat! This is a really well-structured skill — clean package, thorough docs, and solid test coverage. Hope to see more skills like this from you!

em dash is claude's, words are mine lol


glifocat commented Mar 4, 2026

Thank you so much for reviewing and merging, @gavrielc! I'm a new contributor to open source projects, so every piece of feedback means the world to me! I've got more stuff coming; I'm just making sure my main instance is upgraded to the post-PR #500 refactor before pushing :D

garettmd added a commit to garettmd/nanoclaw that referenced this pull request Mar 4, 2026
Merge upstream/main which includes:
- fix: scheduler race condition preventing duplicate task execution (qwibitai#657)
- refactor: multi-channel architecture with self-registering channels (qwibitai#500)
- feat: local whisper skill (qwibitai#702)
- fix: WhatsApp error handling (qwibitai#695)

Adapted our Telegram channel to use the new channel registry pattern
(registerChannel + factory). Removed direct channel instantiation from
index.ts in favor of the upstream barrel import approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@glifocat glifocat deleted the upstream/skill-use-local-whisper branch March 4, 2026 21:23
terrylica pushed a commit to terrylica/nanoclaw that referenced this pull request Mar 8, 2026
# [1.3.0](v1.2.0...v1.3.0) (2026-03-08)

### Bug Fixes

* add-voice-transcription skill drops WhatsApp registerChannel call ([qwibitai#766](https://github.com/terrylica/nanoclaw/issues/766)) ([47ad2e6](47ad2e6))
* aggressive false positive prevention — 5-layer MiniMax pipeline, devil's advocate round, FP learning ([8bfa372](8bfa372))
* atomic claim prevents scheduled tasks from executing twice ([qwibitai#657](https://github.com/terrylica/nanoclaw/issues/657)) ([f794185](f794185)), closes [qwibitai#138](https://github.com/terrylica/nanoclaw/issues/138) [qwibitai#138](https://github.com/terrylica/nanoclaw/issues/138) [qwibitai#211](https://github.com/terrylica/nanoclaw/issues/211) [qwibitai#300](https://github.com/terrylica/nanoclaw/issues/300) [qwibitai#578](https://github.com/terrylica/nanoclaw/issues/578) [qwibitai#601](https://github.com/terrylica/nanoclaw/issues/601) [qwibitai#138](https://github.com/terrylica/nanoclaw/issues/138) [qwibitai#300](https://github.com/terrylica/nanoclaw/issues/300) [qwibitai#138](https://github.com/terrylica/nanoclaw/issues/138)
* cc-skills now reads label strategy + content types; Claude JSON parsing hardened ([fd7fc7f](fd7fc7f))
* correct misleading send_message tool description for scheduled tasks ([qwibitai#729](https://github.com/terrylica/nanoclaw/issues/729)) ([ec0e42b](ec0e42b))
* **db:** add LIMIT to unbounded message history queries ([qwibitai#692](https://github.com/terrylica/nanoclaw/issues/692)) ([qwibitai#735](https://github.com/terrylica/nanoclaw/issues/735)) ([74b02c8](74b02c8))
* format src/index.ts to pass CI prettier check ([qwibitai#711](https://github.com/terrylica/nanoclaw/issues/711)) ([df2bac6](df2bac6)), closes [qwibitai#710](https://github.com/terrylica/nanoclaw/issues/710)
* grant write permissions to CLAUDE.md maintenance claude -p call ([9ddb433](9ddb433))
* rename _chatJid to chatJid in onMessage callback ([1436186](1436186))
* use 'state' instead of 'stateReason' for gh compatibility on bigblack ([a4f2e92](a4f2e92))
* **whatsapp:** add error handling to messages.upsert handler ([qwibitai#695](https://github.com/terrylica/nanoclaw/issues/695)) ([5e3d8b6](5e3d8b6))
* **whatsapp:** write pairing code to file for immediate access ([qwibitai#745](https://github.com/terrylica/nanoclaw/issues/745)) ([be19911](be19911))

### Features

* add /add-ollama skill for local model inference ([qwibitai#712](https://github.com/terrylica/nanoclaw/issues/712)) ([298c3ea](298c3ea))
* add ast-grep rules for Python static analysis ([a548761](a548761))
* add mise deploy task for bigblack deployment ([c39a1f4](c39a1f4))
* add NDJSON telemetry logging for all Telegram messages ([7f64ea6](7f64ea6))
* add update_task tool and return task ID from schedule_task ([68123fd](68123fd))
* cc-skills integration — enhanced issue creation with taxonomy-aware labels, type-specific templates, and discovery provenance ([602e65d](602e65d))
* CLAUDE.md maintenance creates GitHub issues with full link to Telegram ([ba34620](ba34620))
* CLAUDE.md maintenance, devil's advocate fix, OpenGrep + proactive scanning ([ce66e88](ce66e88))
* confidence scoring, verification scripts, log rotation — 3 more FP prevention layers ([0ff2c3c](0ff2c3c))
* iterative MiniMax self-validation (3 adversarial rounds) ([fc05aff](fc05aff))
* Phase 0 — enable Telegram channel and Docker Compose deployment ([ebbf59c](ebbf59c))
* Phase 2 — MiniMax orchestrator loop for continuous validation ([17e90a3](17e90a3))
* proactive algo correctness scanning with full Telegram + GitHub issue reporting ([4b68c3e](4b68c3e))
* **skills:** add image vision skill for WhatsApp ([qwibitai#770](https://github.com/terrylica/nanoclaw/issues/770)) ([af937d6](af937d6))
* **skills:** add pdf-reader skill ([qwibitai#772](https://github.com/terrylica/nanoclaw/issues/772)) ([0b260ec](0b260ec))
* **skills:** add use-local-whisper skill package ([qwibitai#702](https://github.com/terrylica/nanoclaw/issues/702)) ([03f792b](03f792b))
* timezone-aware context injection for agent prompts ([qwibitai#691](https://github.com/terrylica/nanoclaw/issues/691)) ([632713b](632713b)), closes [qwibitai#483](https://github.com/terrylica/nanoclaw/issues/483) [qwibitai#483](https://github.com/terrylica/nanoclaw/issues/483) [qwibitai#526](https://github.com/terrylica/nanoclaw/issues/526)
* whole-repo scanning instead of 3-file batches ([1ace951](1ace951))
* wire trace UUIDs into all Telegram notifications ([b48f0e9](b48f0e9))

Development

Successfully merging this pull request may close these issues.

Support local voice transcription with whisper.cpp instead of OpenAI API

2 participants