This is a demo of realtime speech recognition on macOS with wav2letter++ that works out of the box.
It is based on my C API in #326.
There's a src dir in the w2l_cli tarball with the frontend source (w2l_cli.cpp) and scripts/instructions for building this all from scratch.
To install:
wget https://talonvoice.com/research/w2l_cli.tar.gz
tar -xf w2l_cli.tar.gz && rm w2l_cli.tar.gz
cd w2l_cli
wget https://talonvoice.com/research/epoch186-ls3_14.tar.gz
tar -xf epoch186-ls3_14.tar.gz && rm epoch186-ls3_14.tar.gz
To run:
./bin/w2l emit epoch186-ls3_14/model.bin epoch186-ls3_14/tokens.txt
Then speak. After each utterance you should see emissions (letter predictions) in the terminal output, for example:
$ ./bin/w2l emit epoch186-ls3_14/model.bin epoch186-ls3_14/tokens.txt
helow|world
this|is|a|test|of|wave|to|leter
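The emissions in the example output use | between words. As a minimal sketch of post-processing them into plain text, assuming | is only ever a word separator (the helper name emission_to_text is made up for illustration and is not part of w2l_cli):

```python
WORD_SEP = "|"  # assumption: "|" separates words in an emission line


def emission_to_text(line: str) -> str:
    """Turn an emission line such as 'helow|world' into 'helow world'."""
    # Drop empty pieces so leading, trailing, or doubled separators
    # do not produce extra spaces.
    words = [w for w in line.strip().split(WORD_SEP) if w]
    return " ".join(words)


print(emission_to_text("helow|world"))  # prints "helow world"
```

This only reformats the raw letter predictions. It does not correct spellings such as "helow" or "leter", which is what language model decoding is for.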
Language model decoding is also wired up via ./bin/w2l decode am tokens lm lexicon (acoustic model, tokens file, language model, and lexicon, in that order), but as per #326 it currently segfaults when setting up the Trie.
There are more pretrained English acoustic models at https://talonvoice.com/research/ that you can try as well.