Voice assistant example - the "command" tool #171

There seems to be significant interest in a voice assistant application of Whisper, similar to "Ok, Google", "Hey Siri", "Alexa", etc. The existing [stream](https://github.com/ggerganov/whisper.cpp/tree/master/examples/stream) tool is not well suited to this use case, because voice assistant commands are usually short (e.g. `play some music`, `turn on the TV`, `kill all humans`, `feed the baby`, etc.), while `stream` expects a continuous stream of speech.

Therefore, implement a basic command-line tool called `command` that does the following:

- Upon start, asks the person to say a "key phrase". The phrase should be an average sentence that normally takes 2-3 seconds to pronounce. We want to have enough "training" data of the person's voice
- If the transcribed text matches the expected phrase, then we "remember" this audio and use it later. Otherwise, we ask the person to say it again until the match succeeds
- We start listening continuously for voice activity using my [VAD](https://github.com/ggerganov/whisper.cpp/blob/2f596f5b3329ccc415253e386e96a5c57498c3e2/examples/talk.wasm/emscripten.cpp#L1133-L1153) that I implemented for [talk.wasm](https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk.wasm) - I think it works very well given its simplicity
- When we detect speech, we prepend the recorded key-phrase to the last 2-3 seconds of the live audio and transcribe
- The result should be: `[key phrase][command]`, so by knowing the key phrase we can extract only the `[command]`
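
The enrollment check, the audio prepending and the command extraction described above can be sketched roughly as follows. This is only an illustration in Python, not the proposed implementation: the helper names (`normalize`, `matches_key_phrase`, `build_prompted_audio`, `extract_command`), the 3-second default tail and the example key phrase are all assumptions, and the actual transcription call is left out entirely.

```python
import re

SAMPLE_RATE = 16000  # Whisper models expect 16 kHz mono audio


def normalize(text):
    """Lowercase, drop punctuation and split into words for loose comparison."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).split()


def matches_key_phrase(transcript, key_phrase):
    """Enrollment check: does the transcript equal the expected key phrase?"""
    return normalize(transcript) == normalize(key_phrase)


def build_prompted_audio(key_audio, live_audio, tail_seconds=3.0):
    """Prepend the stored key-phrase samples to the last few seconds of live audio."""
    n_tail = int(tail_seconds * SAMPLE_RATE)
    tail = list(live_audio[-n_tail:]) if n_tail > 0 else []
    return list(key_audio) + tail


def extract_command(transcript, key_phrase):
    """Return the words after the key phrase, or None if the phrase is not the prefix."""
    words = normalize(transcript)
    key = normalize(key_phrase)
    if words[:len(key)] != key:
        return None
    return " ".join(words[len(key):])


# Hypothetical key phrase and transcript, for illustration only.
KEY_PHRASE = "Ok Whisper, start listening for commands"
print(extract_command(KEY_PHRASE + ". Turn on the TV.", KEY_PHRASE))
```

The exact word comparison is deliberately strict; a real tool would probably want a fuzzier match (for example an edit-distance threshold), since the transcription of the key phrase may vary slightly between runs.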

This should work on the Web and on Raspberry Pi, and thanks to the VAD, it will be energy efficient.
It should be a good starting example for creating a voice assistant.
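
To illustrate why a VAD gate keeps the cost low: the expensive transcription only runs when a cheap energy check fires. The sketch below is a generic energy-ratio detector in Python, not the linked talk.wasm code; the function names, the 250 ms window, the ratio threshold and the noise floor are all assumed values picked for illustration.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_speech(samples, sample_rate=16000, window_ms=250,
                  ratio_threshold=2.0, floor=1e-4):
    """Return True if the most recent window is much louder than the audio before it."""
    n_recent = sample_rate * window_ms // 1000
    if n_recent <= 0 or len(samples) <= n_recent:
        return False  # not enough history to compare against
    background = rms(samples[:-n_recent])
    recent = rms(samples[-n_recent:])
    # The floor avoids triggering on tiny fluctuations in near-silent input.
    return recent > max(floor, ratio_threshold * background)
```

A main loop would call `detect_speech` on a rolling buffer of microphone samples and only hand the audio to Whisper when it returns `True`.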
