docs/en/speech_command_recognition/README.rst
14 additions & 49 deletions
@@ -47,14 +47,8 @@ Format of Speech Commands
 
 Different MultiNets support different format:
 
-- Chinese
-
-MultiNet5 and MultiNet6 sse Pinyin for Chinese speech commands. Please use :project_file:`tool/multinet_pinyin.py` to get pinyin of Chinese.
-
-- English
-
-MultiNet5 use phonemes for English speech commands. Simplicity, we use chats to denote different phoneme.Please use :project_file:`tool/multinet_g2p.py` to do the convention.
-MultiNet6 use grapheme for English speech commands. You do not need any convention.
+- MultiNet5 use phonemes for English speech commands. For simplicity, we use characters to denote different phonemes. Please use :project_file:`tool/multinet_g2p.py` to do the convention.
+- MultiNet6 use grapheme for English speech commands. You do not need any conversion.
 
 Suggestions on Customizing Speech Commands
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -63,45 +57,28 @@ When customizing speech command words, please pay attention to the following sug
 
 .. list::
 
-- The recommended length of Chinese speech commands is generally 4-6 Chinese characters. Too short leads to high false recognition rate and too long is inconvenient for users to remember
 :esp32s3: - The recommended length of English speech commands is generally 4-6 words
 - Mixed Chinese and English is not supported in command words
 - The command word cannot contain Arabic numerals and special characters
-- Avoid common command words like "hello"
-- The greater the pronunciation difference of each Chinese character / word in the command words, the better the performance
 
 Speech Commands Customization Methods
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-MultiNet6 customize speech commands:
-
-- For English, words are used as units. Please modify a text file :project_file:`model/multinet_model/fst/commands_en.txt` by the following format:
+MultiNet6 customize speech commands
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+- Words are used as units. Please modify a text file :project_file:`model/multinet_model/fst/commands_en.txt` by the following format:
 
 ::
 
 # command_id command_sentence
 1 TELL ME A JOKE
 2 MAKE A COFFEE
 
-- For Chinese, pinyin are used as units. Please modify a text file :project_file:`model/multinet_model/fst/commands_cn.txt` by the following format. :project_file:`tool/multinet_pinyin.py` help tp get Pinyin of Chinese.
 
-::
+MultiNet5 customize speech commands
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-# command_id command_sentence
-1 da kai kong tiao
-2 guan bi kong tiao
-
-Multinet5 supports flexible methods to customize speech commands. Users can do it either online or offline and can also add/delete/modify speech commands dynamically.
-
-.. only:: latex
-
-.. figure:: ../../_static/QR_multinet_g2p.png
-:alt:menuconfig_add_speech_commands
-
-Customize Speech Commands Offline
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-There are two methods for users to customize speech commands offline:
+There are two methods to customize speech commands offline:
 
 - Via ``menuconfig``
 
@@ -114,7 +91,7 @@ There are two methods for users to customize speech commands offline:
 
 Please note that a single ``Command ID`` can correspond to more than one commands. For example, "da kai kong tiao" and "kai kong tiao" have the same meaning. Therefore, users can assign the same command id to these two commands and separate them with "," (no space required before and after).
 
-1. Call the following API:
+2. Call the following API:
 
 ::
 
@@ -135,19 +112,12 @@ There are two methods for users to customize speech commands offline:
 
 - Via modifying code
 
-Users directly customize the speech commands in the code and pass these commands to the MultiNet. In the actual user scenarios, users can pass these commands via various interfaces including network / UART / SPI. For details, see the example described in ESP-Skainet.
-
-Customize speech commands online
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-MultiNet allows users to add/delete/modify speech commands dynamically during the operation, without the need to change models or modifying parameters. For details, see the example described in ESP-Skainet.
-
-For detailed description of APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` .
+Users directly customize the speech commands in the code and pass these commands to the MultiNet. In the actual user scenarios, users can pass these commands via various interfaces including network / UART / SPI. For detailed description of APIs. Please refer to :project_file:`src/esp_mn_speech_commands.c` and examples described in ESP-Skainet.
 
 Use MultiNet
 ------------
 
-MultiNet speech commands recognition must be used together with audio front-end (AFE) in ESP-SR (What's more, AFE must be used together with WakeNet). For details, see Section :doc:`AFE Introduction and Use <../audio_front_end/README>` .
+We suggest to use MultiNet together with audio front-end (AFE) in ESP-SR. For details, see Section :doc:`AFE Introduction and Use <../audio_front_end/README>` .
 
 After configuring AFE, users can follow the steps below to configure and run MultiNet.
 
@@ -187,11 +157,6 @@ Users can start MultiNet after enabling AFE and WakeNet, but must pay attention
 MultiNet Output
 ~~~~~~~~~~~~~~~
 
-Speech commands recognition supports two basic modes:
-
-* Single recognition
-* Continuous recognition
-
 Speech command recognition must be used with WakeNet. After wake-up, MultiNet detection can start.
 
 Afer running, MultiNet returns the recognition output of the current frame in real time ``mn_state``, which is currently divided into the following identification states:
@@ -228,13 +193,13 @@ Afer running, MultiNet returns the recognition output of the current frame in re
 
 Users can use ``phrase_id[0]`` and ``prob[0]`` get the recognition result with the highest probability.
 
-- ESP_MN_STATE_TIMEOUT
+- ESP_MN_STATE_TIMEOUT
 
 Indicates the speech commands has not been detected for a long time and will exit automatically and wait to be waked up again.
 
-Therefore:
+Single recognition mode and Continuous recognition mode:
 * Single recognition mode: exit the speech recognition when the return status is ``ESP_MN_STATE_DETECTED``
-* Continuous recognition: exit the speech recognition when the return status is ``ESP_MN_STATE_TIMEOUT``
+* Continuous recognition mode: exit the speech recognition when the return status is ``ESP_MN_STATE_TIMEOUT``
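
Note on the two exit rules in the hunk above: the difference between single and continuous recognition can be sketched as a small simulation. This is only an illustration in Python, not ESP-SR code. The ``MnState`` enum and ``run_session`` function are hypothetical names introduced here; ``DETECTED`` and ``TIMEOUT`` mirror the states named in the diff, and ``DETECTING`` is assumed to be the state returned for frames with no result yet.

```python
from enum import Enum, auto


class MnState(Enum):
    """Hypothetical stand-ins for the per-frame ``mn_state`` values."""
    DETECTING = auto()  # assumed: no command recognized in this frame yet
    DETECTED = auto()   # mirrors ESP_MN_STATE_DETECTED
    TIMEOUT = auto()    # mirrors ESP_MN_STATE_TIMEOUT


def run_session(states, continuous):
    """Consume per-frame states and return how many commands were detected
    before the recognition session exits."""
    detected = 0
    for state in states:
        if state is MnState.DETECTED:
            detected += 1
            if not continuous:
                break  # single recognition mode: exit on the first detection
        elif state is MnState.TIMEOUT:
            break  # continuous mode exits here and waits for the next wake-up
    return detected


frames = [MnState.DETECTING, MnState.DETECTED, MnState.DETECTING,
          MnState.DETECTED, MnState.TIMEOUT]
single_count = run_session(frames, continuous=False)
continuous_count = run_session(frames, continuous=True)
print(single_count, continuous_count)
```

With the sample frame sequence, single mode stops after the first detected command, while continuous mode keeps counting until the timeout frame.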
tool/README.md
1 addition & 1 deletion
@@ -10,7 +10,7 @@ For English, words are used as units. Please prepare a list of commands written
 2 MAKE A COFFEE
 ```
 
-For Chinese, pinyin are used as units. [multinet_pinyin.py](./multinet_pinyin.py) help tp get Pinyin of Chinese. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
+For Chinese, pinyin are used as units. [multinet_pinyin.py](./multinet_pinyin.py) help to get Pinyin of Chinese. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
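
Note on the command file format touched by this change: a file laid out as `# command_id command_sentence` could be sanity-checked before use with a few lines of Python. The sketch below is not part of ESP-SR or the `tool/` scripts. `parse_commands` is a hypothetical helper, and it assumes that `#` lines are comments and that several lines may share one command ID.

```python
from collections import defaultdict


def parse_commands(text):
    """Group command sentences by integer command ID.

    Blank lines and lines starting with '#' are skipped. Every other line
    is expected to look like '<command_id> <command_sentence>'.
    """
    commands = defaultdict(list)
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or not parts[0].isdigit():
            raise ValueError(
                f"line {lineno}: expected '<id> <sentence>', got {raw!r}")
        commands[int(parts[0])].append(parts[1])
    return dict(commands)


sample = """\
# command_id command_sentence
1 TELL ME A JOKE
2 MAKE A COFFEE
"""
parsed = parse_commands(sample)
print(parsed)
```

Raising on malformed lines keeps a typo in the file from silently dropping a command.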