Commit 8982693

Merge branch 'feat/update_mn6_doc' into 'release/v1.2.0'
Fix some typos

See merge request speech-recognition-framework/esp-sr!23
2 parents 3610431 + e786bcb commit 8982693

3 files changed: +20 -85 lines changed

docs/en/speech_command_recognition/README.rst

Lines changed: 14 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -47,14 +47,8 @@ Format of Speech Commands
4747

4848
Different MultiNets support different formats:
4949

50-
- Chinese
51-
52-
MultiNet5 and MultiNet6 sse Pinyin for Chinese speech commands. Please use :project_file:`tool/multinet_pinyin.py` to get pinyin of Chinese.
53-
54-
- English
55-
56-
MultiNet5 use phonemes for English speech commands. Simplicity, we use chats to denote different phoneme.Please use :project_file:`tool/multinet_g2p.py` to do the convention.
57-
MultiNet6 use grapheme for English speech commands. You do not need any convention.
50+
- MultiNet5 uses phonemes for English speech commands. For simplicity, we use characters to denote different phonemes. Please use :project_file:`tool/multinet_g2p.py` to do the conversion.
51+
- MultiNet6 uses graphemes for English speech commands. You do not need any conversion.
5852

5953
Suggestions on Customizing Speech Commands
6054
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -63,45 +57,28 @@ When customizing speech command words, please pay attention to the following sug
6357

6458
.. list::
6559

66-
- The recommended length of Chinese speech commands is generally 4-6 Chinese characters. Too short leads to high false recognition rate and too long is inconvenient for users to remember
6760
:esp32s3: - The recommended length of English speech commands is generally 4-6 words
6861
- Mixed Chinese and English is not supported in command words
6962
- The command word cannot contain Arabic numerals and special characters
70-
- Avoid common command words like "hello"
71-
- The greater the pronunciation difference of each Chinese character / word in the command words, the better the performance
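The suggestions above can be checked mechanically before a command list is deployed. The following is a minimal sketch, not part of ESP-SR: the function name ``check_english_command`` is hypothetical, and treating every character other than ASCII letters and whitespace as "special" is an assumption.

```python
import re


def check_english_command(sentence):
    """Return a list of problems found in one English command phrase.

    Hypothetical helper: it only mirrors the suggestions listed above.
    """
    problems = []
    # Arabic numerals are not allowed in command words.
    if re.search(r"\d", sentence):
        problems.append("contains Arabic numerals")
    # Assumption: anything besides ASCII letters, digits and whitespace
    # counts as a special (or non-English) character.
    if re.search(r"[^A-Za-z\d\s]", sentence):
        problems.append("contains special or non-English characters")
    # The recommended length is 4-6 words.
    word_count = len(sentence.split())
    if not 4 <= word_count <= 6:
        problems.append(f"has {word_count} words; 4-6 are recommended")
    return problems


ok_result = check_english_command("TELL ME A JOKE")
bad_result = check_english_command("TURN ON LIGHT 2")
```

An empty list means the phrase passed all three checks. The word-count check is only a recommendation, so a caller may prefer to treat it as a warning.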
7263

7364
Speech Commands Customization Methods
7465
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
7566

76-
MultiNet6 customize speech commands:
77-
78-
- For English, words are used as units. Please modify a text file :project_file:`model/multinet_model/fst/commands_en.txt` by the following format:
67+
MultiNet6 customize speech commands
68+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
69+
- Words are used as units. Please edit the text file :project_file:`model/multinet_model/fst/commands_en.txt` using the following format:
7970

8071
::
8172

8273
# command_id command_sentence
8374
1 TELL ME A JOKE
8475
2 MAKE A COFFEE
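To illustrate the ``command_id command_sentence`` format shown above, here is a minimal sketch that reads such text into ``(id, sentence)`` pairs. The function name ``parse_commands`` is hypothetical and is not a tool shipped with ESP-SR. It assumes that lines starting with ``#`` are comments and that the ID is separated from the sentence by the first space.

```python
def parse_commands(text):
    """Parse 'command_id command_sentence' lines into (id, sentence) pairs."""
    commands = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        # The ID is everything before the first space; the rest is the phrase.
        command_id, _, sentence = line.partition(" ")
        commands.append((int(command_id), sentence.strip()))
    return commands


sample = """# command_id command_sentence
1 TELL ME A JOKE
2 MAKE A COFFEE
"""
commands = parse_commands(sample)
```

A quick parse like this can catch a missing or non-numeric ID (``int`` raises ``ValueError``) before the file is built into the model.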
8576

86-
- For Chinese, pinyin are used as units. Please modify a text file :project_file:`model/multinet_model/fst/commands_cn.txt` by the following format. :project_file:`tool/multinet_pinyin.py` help tp get Pinyin of Chinese.
8777

88-
::
78+
MultiNet5 customize speech commands
79+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8980

90-
# command_id command_sentence
91-
1 da kai kong tiao
92-
2 guan bi kong tiao
93-
94-
Multinet5 supports flexible methods to customize speech commands. Users can do it either online or offline and can also add/delete/modify speech commands dynamically.
95-
96-
.. only:: latex
97-
98-
.. figure:: ../../_static/QR_multinet_g2p.png
99-
:alt: menuconfig_add_speech_commands
100-
101-
Customize Speech Commands Offline
102-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
103-
104-
There are two methods for users to customize speech commands offline:
81+
There are two methods to customize speech commands offline:
10582

10683
- Via ``menuconfig``
10784

@@ -114,7 +91,7 @@ There are two methods for users to customize speech commands offline:
11491

11592
Please note that a single ``Command ID`` can correspond to more than one command. For example, "da kai kong tiao" and "kai kong tiao" have the same meaning. Therefore, users can assign the same ``Command ID`` to these two commands and separate them with "," (no space required before or after).
11693

117-
1. Call the following API:
94+
2. Call the following API:
11895

11996
::
12097

@@ -135,19 +112,12 @@ There are two methods for users to customize speech commands offline:
135112

136113
- Via modifying code
137114

138-
Users directly customize the speech commands in the code and pass these commands to the MultiNet. In the actual user scenarios, users can pass these commands via various interfaces including network / UART / SPI. For details, see the example described in ESP-Skainet.
139-
140-
Customize speech commands online
141-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
142-
143-
MultiNet allows users to add/delete/modify speech commands dynamically during the operation, without the need to change models or modifying parameters. For details, see the example described in ESP-Skainet.
144-
145-
For detailed description of APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` .
115+
Users directly customize the speech commands in the code and pass these commands to MultiNet. In actual user scenarios, users can pass these commands via various interfaces including network / UART / SPI. For a detailed description of the APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` and the examples described in ESP-Skainet.
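As noted for the ``menuconfig`` method, one ``Command ID`` may carry several phrases separated by ",". The sketch below shows how such an entry could be split into individual phrases on the host side. The helper ``split_phrases`` is hypothetical and does not reflect how ESP-SR parses the entry internally.

```python
def split_phrases(entry):
    """Split one comma-separated Command ID entry into its phrases."""
    # No spaces are required around ",", but stray ones are tolerated here.
    return [phrase.strip() for phrase in entry.split(",") if phrase.strip()]


phrases = split_phrases("da kai kong tiao,kai kong tiao")
```

Both resulting phrases would then be associated with the same ``Command ID``.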
146116

147117
Use MultiNet
148118
------------
149119

150-
MultiNet speech commands recognition must be used together with audio front-end (AFE) in ESP-SR (What's more, AFE must be used together with WakeNet). For details, see Section :doc:`AFE Introduction and Use <../audio_front_end/README>` .
120+
We suggest using MultiNet together with the audio front-end (AFE) in ESP-SR. For details, see Section :doc:`AFE Introduction and Use <../audio_front_end/README>` .
151121

152122
After configuring AFE, users can follow the steps below to configure and run MultiNet.
153123

@@ -187,11 +157,6 @@ Users can start MultiNet after enabling AFE and WakeNet, but must pay attention
187157
MultiNet Output
188158
~~~~~~~~~~~~~~~
189159

190-
Speech commands recognition supports two basic modes:
191-
192-
* Single recognition
193-
* Continuous recognition
194-
195160
Speech command recognition must be used with WakeNet. After wake-up, MultiNet detection can start.
196161

197162
After running, MultiNet returns the recognition state of the current frame, ``mn_state``, in real time. It is currently one of the following states:
@@ -228,13 +193,13 @@ After running, MultiNet returns the recognition output of the current frame in re
228193

229194
Users can use ``phrase_id[0]`` and ``prob[0]`` to get the recognition result with the highest probability.
230195

231-
- ESP_MN_STATE_TIMEOUT
196+
- ESP_MN_STATE_TIMEOUT
232197

233198
Indicates that no speech command has been detected for a long time. MultiNet exits automatically and waits to be woken up again.
234199

235-
Therefore:
200+
Single recognition mode and Continuous recognition mode:
236201
* Single recognition mode: exit the speech recognition when the return status is ``ESP_MN_STATE_DETECTED``
237-
* Continuous recognition: exit the speech recognition when the return status is ``ESP_MN_STATE_TIMEOUT``
202+
* Continuous recognition mode: exit the speech recognition when the return status is ``ESP_MN_STATE_TIMEOUT``
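The difference between the two modes is only when the application leaves the recognition loop. The following is a host-side simulation in Python, not ESP-SR code: the ``MnState`` enum merely mirrors the ``ESP_MN_STATE_*`` names, ``DETECTING`` is an assumed name for the "still listening" state, and ``run_session`` is a hypothetical helper.

```python
from enum import Enum


class MnState(Enum):
    """Stand-ins for the ESP_MN_STATE_* values (simulation only)."""
    DETECTING = 0  # assumed name for "no result yet"
    DETECTED = 1
    TIMEOUT = 2


def run_session(frames, continuous):
    """Consume (state, phrase_id) frames and return the recognized IDs."""
    recognized = []
    for state, phrase_id in frames:
        if state is MnState.DETECTED:
            recognized.append(phrase_id)
            if not continuous:
                break  # single recognition mode: exit on first detection
        elif state is MnState.TIMEOUT:
            break  # both modes exit on timeout and wait for wake-up
    return recognized


frames = [
    (MnState.DETECTING, None),
    (MnState.DETECTED, 1),
    (MnState.DETECTING, None),
    (MnState.DETECTED, 2),
    (MnState.TIMEOUT, None),
]
single = run_session(frames, continuous=False)
multi = run_session(frames, continuous=True)
```

With the sample frames, the single-mode session stops after the first command, while the continuous session keeps collecting commands until the timeout frame.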
238203

239204
Resource Occupancy
240205
------------------

docs/zh_CN/speech_command_recognition/README.rst

Lines changed: 5 additions & 35 deletions
@@ -47,16 +47,8 @@ The input of MultiNet is the audio processed by the audio front-end (AFE) algorithm (format
4747

4848
Different versions of MultiNet use different speech command formats. Speech commands must follow a specific format, as described below:
4949

50-
- Chinese
51-
5250
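MultiNet5 and MultiNet6 use Pinyin as the basic recognition unit, with the Pinyin of each character separated by a space. For example, “打开空调” should be written as “da kai kong tiao”. Please use the following tool to convert Chinese characters to Pinyin: :project_file:`tool/multinet_pinyin.py` .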
5351

54-
- English
55-
56-
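MultiNet5: uses phonemes as the basic recognition unit. For simplicity, each phoneme is mapped to a single character. For example, “turn on the light” must be written as “TkN nN jc LiT”. Please use the tool we provide for the conversion. For details, see :project_file:`tool/multinet_g2p.py` .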
57-
MultiNet6: uses subwords as the recognition unit, so users can enter the desired phrase directly. For example, “turn on the light” is simply written as “turn on the light”.
58-
59-
6052
Customization Requirements
5361
~~~~~~~~~~~~~~~~~~~~~~~~~~
6254

@@ -96,17 +88,7 @@ Methods for MultiNet6 to set speech commands offline:
9688
1 da kai kong tiao
9789
2 guan bi kong tiao
9890

99-
- For English, modify :project_file:`model/multinet_model/fst/commands_en.txt`
100-
101-
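The format is as follows: the first number is the command ID, followed by the English phrase of the command. The two are separated by a space, and words are also separated by spaces.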
102-
103-
::
104-
105-
# command_id command_sentence
106-
1 TELL ME A JOKE
107-
2 MAKE A COFFEE
108-
109-
MultiNet5 supports two methods to set speech commands offline:
91+
Methods for MultiNet5 to set speech commands offline:
11092

11193
- Via ``menuconfig``
11294

@@ -119,7 +101,7 @@ MultiNet5 supports two methods to set speech commands offline:
119101

120102
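Note that a single Command ID can support multiple phrases. For example, “打开空调” and “开空调” have the same meaning, so they can be written in the entry for the same Command ID, with adjacent phrases separated by the ASCII comma “,” (no spaces are needed before or after “,”).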
121103

122-
1. Call the following API in the code:
104+
2. Call the following API in the code:
123105

124106
::
125107

@@ -140,19 +122,12 @@ MultiNet5 supports two methods to set speech commands offline:
140122

141123
- Via modifying code
142124

143-
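With this method, users write the speech commands directly in the code and pass them to MultiNet. In actual product development and use, users can pass the required speech commands through various interfaces such as network / UART / SPI and change them at any time. For details, see the example in ESP-Skainet.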
144-
145-
Set Speech Commands Online
146-
^^^^^^^^^^^^^^^^^^^^^^^^^^
147-
148-
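MultiNet also supports setting speech commands online and dynamically (add/delete/modify) while running, without changing the model or adjusting parameters. For details, see the example in ESP-Skainet.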
149-
150-
For a detailed description of the APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` .
125+
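With this method, users write the speech commands directly in the code and pass them to MultiNet. In actual product development and use, users can pass the required speech commands through various interfaces such as network / UART / SPI and change them at any time. For a detailed description of the APIs, please refer to :project_file:`src/esp_mn_speech_commands.c` and the example in ESP-Skainet.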
151126

152127
Use MultiNet
153128
----------------
154129

155-
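MultiNet speech command recognition must run together with the AFE acoustic algorithm module in ESP-SR (in addition, AFE requires WakeNet to be enabled. For details, see :doc:`AFE Introduction and Use <../audio_front_end/README>` ).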
130+
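It is recommended to run MultiNet speech command recognition together with the AFE acoustic algorithm module in ESP-SR. For details, see :doc:`AFE Introduction and Use <../audio_front_end/README>` .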
156131

157132
After configuring AFE, follow the steps below to configure and run MultiNet.
158133

@@ -192,11 +167,6 @@ Run MultiNet
192167
MultiNet Recognition Results
193168
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
194169

195-
MultiNet speech command recognition supports two basic modes:
196-
197-
* Single recognition
198-
* Continuous recognition
199-
200170
Speech command recognition must be used together with wake-up. After wake-up, speech command detection can run.
201171

202172
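While running, the speech command model returns the recognition state of the current frame, ``mn_state`` , in real time. It is currently one of the following states: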
@@ -237,7 +207,7 @@ MultiNet speech command recognition supports two basic modes:
237207

238208
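This state indicates that no speech command has been detected for a long time, so MultiNet exits automatically and waits for the next wake-up.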
239209

240-
Therefore
210+
Single recognition mode and continuous recognition mode
241211
If speech command recognition exits when the returned state is ``ESP_MN_STATE_DETECTED`` , it is single recognition mode;
242212
If speech command recognition exits when the returned state is ``ESP_MN_STATE_TIMEOUT`` , it is continuous recognition mode.
243213

tool/README.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ For English, words are used as units. Please prepare a list of commands written
1010
2 MAKE A COFFEE
1111
```
1212

13-
For Chinese, pinyin are used as units. [multinet_pinyin.py](./multinet_pinyin.py) help tp get Pinyin of Chinese. Please prepare a list of commands written in a text file `commands_cn.txt` of the following format:
13+
For Chinese, Pinyin syllables are used as units. [multinet_pinyin.py](./multinet_pinyin.py) helps to get the Pinyin of Chinese text. Please prepare a list of commands written in a text file `commands_cn.txt` in the following format:
1414
```
1515
# command_id command_sentence
1616
1 da kai kong tiao
