
Commit 1c0503e

committed
Merge branch 'rename_webrtc' into 'master'
Rename webrtc See merge request speech-recognition-framework/esp-sr!141
2 parents 79556de + b4f1200 commit 1c0503e

32 files changed

+24409
-536
lines changed

docs/_static/kconfig.png

27.8 KB

docs/en/flash_model/README.rst

Lines changed: 50 additions & 186 deletions
Original file line numberDiff line numberDiff line change
@@ -1,216 +1,80 @@
1-
Flashing Models
2-
===============
1+
Model Selection and Loading
2+
===========================
33

44
:link_to_translation:`zh_CN:[中文]`
55

6-
In the AI industry, a model refers to a mathematical representation of a system or process. It is used to make predictions or decisions based on input data. There are many types of models, such as decision trees, neural networks, and support vector machines, each with their own strengths and weaknesses. Esprssif also provides our trained models such as WakeNet and MultiNet (see the model data used in :project:`model`)
6+
This document explains how to select and load models for ESP-SR.
77

8-
To use our models in your project, you need to flash these models. Currently, ESP-SR supports the following methods to flash models:
8+
Model Selection
9+
---------------
910

10-
.. only:: esp32
11+
ESP-SR allows you to choose required models through the ``menuconfig`` interface. To configure models:
1112

12-
ESP32: Load directly from Flash
13+
1. Run ``idf.py menuconfig``
14+
2. Navigate to **ESP Speech Recognition**
15+
3. Configure the following options:
16+
- **Noise Suppression Model**
17+
- **VAD Model**
18+
- **WakeNet Model**
19+
- **MultiNet Model**
1320

14-
.. only:: esp32s3
21+
.. figure:: ../../_static/kconfig.png
22+
:alt: kconfig
1523

16-
ESP32-S3:
1724

18-
- Load directly from SIP Flash File System (flash)
19-
- Load from external SD card
25+
Updating Partition Table
26+
------------------------
27+
You must add a ``partitions.csv`` file to your project and ensure that it reserves enough space for the selected models.
28+
Add the following line to your project's ``partitions.csv`` file to allocate space for models:
2029

21-
So that on ESP32-S3 you can:
30+
.. code-block::
2231
23-
- Greatly reduce the size of the user application APP BIN
24-
- Supports the selection of up to two wake words
25-
- Support online switching of Chinese and English Speech Command Recognition
26-
- Convenient for users to perform OTA
27-
- Supports reading and changing models from SD card, which is more convenient and can reduce the size of module Flash used in the project
28-
- When the user is developing the code, when the modification does not involve the model, it can avoid flashing the model data every time, greatly reducing the flashing time and improving the development efficiency
32+
model, data, , , 6000K
2933
30-
Configuration
31-
-------------
32-
33-
Run ``idf.py menuconfig`` to navigate to ``ESP Speech Recognition``:
34-
35-
.. figure:: ../../_static/model-1.png
36-
:alt: overview
37-
38-
overview
39-
40-
.. only:: esp32s3
41-
42-
Model Data Path
43-
~~~~~~~~~~~~~~~
44-
45-
This option indicates the storage location of the model data: ``Read model data from flash`` or ``Read model data from SD card``.
46-
47-
- ``Read model data from flash`` means that the model data is stored in the flash, and the model data will be loaded from the flash partition
48-
- ``Read model data from SD card`` means that the model data is stored in the SD card, and the model data will be loaded from the SD card
49-
50-
Use AFE
51-
~~~~~~~
52-
53-
This option is enabled by default. Users do not need to modify it. Please keep the default configuration.
54-
55-
Use WakeNet
56-
~~~~~~~~~~~
57-
58-
This option is enabled by default. When the user only uses ``AEC`` or ``BSS``, etc., and does not need ``WakeNet`` or ``MultiNet``, please disable this option, which reduces the size of the project firmware.
59-
60-
Select wake words by via ``menuconfig`` by navigating to ``ESP Speech Recognition`` > ``Select wake words``. The model name of wake word in parentheses must be used to initialize WakeNet handle.
61-
62-
|select wake wake|
63-
64-
If you want to select multiple wake words, please select ``Load Multiple Wake Words``
65-
66-
|multi wake wake|
67-
68-
Then you can select multiple wake words at the same time:
69-
70-
|image1|
71-
72-
.. only:: esp32
73-
74-
.. note::
75-
ESP32 doesn't support multiple wake words.
76-
77-
.. only:: esp32s3
78-
79-
.. note::
80-
ESP32-S3 does support multiple wake words. Users can select more than one wake words according to the hardware flash size.
81-
82-
For more details, please refer to :doc:`WakeNet <../wake_word_engine/README>` .
83-
84-
Use Multinet
85-
~~~~~~~~~~~~
86-
87-
This option is enabled by default. When users only use WakeNet or other algorithm modules, please disable this option, which reduces the size of the project firmware in some cases.
88-
89-
Chinese Speech Commands Model
90-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
91-
92-
.. only:: esp32
34+
- Replace ``6000K`` with your custom partition size according to the selected models.
35+
- ``model`` is the partition label (fixed value).
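
As a rough sanity check for the partition entry above, the following Python sketch reads a partition table and reports the size of the ``model`` partition in bytes. The helper names ``parse_size`` and ``model_partition_size`` and the sample table are illustrative assumptions and are not part of ESP-SR; the ``K`` and ``M`` suffixes and the column order follow the ESP-IDF partition table CSV format.

```python
import csv
import io


def parse_size(text):
    """Convert a size such as '6000K', '4M' or '0x10000' to bytes."""
    text = text.strip()
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:].upper()
    if suffix in multipliers:
        # Base 0 lets int() accept both decimal and 0x-prefixed values.
        return int(text[:-1], 0) * multipliers[suffix]
    return int(text, 0)


def model_partition_size(csv_text, label="model"):
    """Return the size in bytes of the partition named `label`, or None."""
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        fields = [field.strip() for field in row]
        # Skip blank lines and '#' comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        # Columns: Name, Type, SubType, Offset, Size, Flags
        if fields[0] == label and len(fields) >= 5:
            return parse_size(fields[4])
    return None


# Hypothetical table used only for illustration.
example = """\
# Name, Type, SubType, Offset, Size, Flags
factory, app, factory, , 3M
model, data, , , 6000K
"""

size = model_partition_size(example)
print(size)
```

Comparing the returned value with the size of the generated model binary (for example via ``os.path.getsize``) is one way to confirm that the partition is large enough before flashing.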
9336

94-
ESP32 only supports command words in Chinese:
95-
96-
- None
97-
- Chinese single recognition (MultiNet2)
98-
99-
.. only:: esp32s3
100-
101-
ESP32-S3 supports command words in both Chinese and English:
102-
103-
- None
104-
- Chinese single recognition (MultiNet4.5)
105-
- Chinese single recognition (MultiNet4.5 quantized with 8-bit)
106-
- English Speech Commands Model
107-
108-
The user needs to add Chinese Speech Command words to this item when ``Chinese Speech Commands Model`` is not ``None``.
109-
110-
.. only:: esp32s3
111-
112-
English Speech Commands Model
113-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
114-
115-
ESP32-S3 supports command words in both Chinese and English, and allows users to switch between these two languages.
116-
117-
- None
118-
- English recognition (MultiNet5 quantized with 8-bit, depends on WakeNet8)
119-
- Add Chinese speech commands
120-
121-
The user needs to add English Speech Command words to this item when ``English Speech Commands Model`` is not ``None``.
122-
123-
For more details, please refer to Section :doc:`MultiNet <../speech_command_recognition/README>` .
124-
125-
How To Use
126-
----------
127-
128-
After the above-mentioned configuration, users can initialize and start using the models following the examples described in the `ESP-Skainet <https://github.com/espressif/esp-skainet>`_ repo.
129-
130-
Here, we only introduce the code implementation, which can also be found in :project_file:`src/model_path.c`.
131-
132-
.. only:: esp32
133-
134-
ESP32 can only load model data from flash. Therefore, the model data in the code will automatically read the required data from the Flash according to the address. Note that, ESP32 and ESP32-S3 APIs are compatible.
135-
136-
.. only:: esp32s3
137-
138-
ESP32-S3 can load model data from flash or SD card.
139-
140-
Load Model Data from flash
141-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
142-
143-
#. Write a partition table:
144-
145-
::
146-
147-
model, data, spiffs, , SIZE,
148-
149-
Among them, ``SIZE`` can refer to the recommended size when the user uses ``idf.py build`` to compile, for example: ``Recommended model partition size: 500K``
150-
151-
#. Initialize the flash partition: User can use ``esp_srmodel_init(partition_label)`` API to initialize flash and return all loaded models.
152-
153-
- base_path: The model storage ``base_path`` is ``srmodel`` and cannot be changed
154-
- partition_label: The partition label of the model is ``model``, which needs to be consistent with the ``Name`` in the above partition table
155-
156-
After completing the above configuration, the project will automatically generate ``model.bin`` after the project is compiled, and flash it to the flash partition.
37+
Model Loading
38+
-------------
15739

158-
.. only:: esp32s3
40+
ESP-IDF Framework
41+
~~~~~~~~~~~~~~~~~
15942

160-
Load Model Data from SD Card
161-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43+
ESP-SR automatically handles model loading through its CMake scripts:
16244

163-
When configured to load model data from ``Read model data from SD card``, users need to:
45+
1. Flash the device with all components:
46+
``idf.py flash``
47+
*This command automatically loads the selected models.*
16448

165-
- Manually load model data from SD card
166-
After the above-mentioned configuration, users can compile the code, and copy the files in ``model/target`` to the root directory of the SD card.
49+
2. For code debugging (without re-flashing models):
50+
``idf.py app-flash``
16751

168-
- Initialize SD card
169-
Users must initialize SD card so the chip can load SD card. Users of `ESP-Skainet <https://github.com/espressif/esp-skainet>`_ can call ``esp_sdcard_init("/sdcard", num);`` to initialize any board supported SD cards. Otherwise, users need to write the initialization code themselves.
170-
After the above-mentioned steps, users can flash the project.
52+
.. note::
53+
The model loading script is defined in ``esp-sr/CMakeLists.txt``. Models are flashed to the partition labeled ``model`` during initial flashing.
17154

172-
- Read models
173-
User use ``esp_srmodel_init(model_path)`` to read models in ``model_path`` of SD card.
55+
Arduino Framework
56+
~~~~~~~~~~~~~~~~~
17457

58+
To manually generate and load models:
17559

176-
.. |select wake wake| image:: ../../_static/wn_menu1.png
177-
.. |multi wake wake| image:: ../../_static/wn_menu2.png
178-
.. |image1| image:: ../../_static/wn_menu3.png
60+
1. Use the provided Python script to generate ``srmodels.bin``:
17961

62+
.. code-block:: bash
18063
181-
.. only:: html
64+
python {esp-sr_path}/movemodel.py -d1 {sdkconfig_path} -d2 {esp-sr_path} -d3 {build_path}
18265
183-
Model initialization and Usage
184-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
66+
**Parameters:**
18567

186-
::
68+
- ``esp-sr_path``: Path to your ESP-SR component directory
18769

188-
//
189-
// step1: return models in flash or in sdcard
190-
//
191-
char *model_path = your_model_path: // partition_label or model_path in sdcard;
192-
models = esp_srmodel_init(model_path);
70+
- ``sdkconfig_path``: Project's ``sdkconfig`` file path
19371

194-
//
195-
// step2: select the specific model by keywords
196-
//
197-
char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, NULL); // select WakeNet model
198-
char *nm_name = esp_srmodel_filter(models, ESP_MN_PREFIX, NULL); // select MultiNet model
199-
char *alexa_wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, "alexa"); // select WakeNet with "alexa" wake word.
200-
char *en_mn_name = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_ENGLISH); // select english MultiNet model
201-
char *cn_mn_name = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_CHINESE); // select english MultiNet model
72+
- ``build_path``: Project's build directory (typically ``your_project_path/build``)
20273

203-
// It also works if you use the model name directly in your code.
204-
char *my_wn_name = "wn9_hilexin"
205-
// we recommend you to check that it is loaded correctly
206-
if (!esp_srmodel_exists(models, my_wn_name))
207-
printf("%s can not be loaded correctly\n")
74+
2. The generated ``srmodels.bin`` will be located at:
75+
``{build_path}/srmodels/srmodels.bin``
20876

209-
//
210-
// step3: initialize model
211-
//
212-
esp_wn_iface_t *wakenet = esp_wn_handle_from_name(wn_name);
213-
model_iface_data_t *wn_model_data = wakenet->create(wn_name, DET_MODE_2CH_90);
77+
3. Flash the generated binary to your device.
21478

215-
esp_mn_iface_t *multinet = esp_mn_handle_from_name(mn_name);
216-
model_iface_data_t *mn_model_data = multinet->create(mn_name, 6000);
79+
.. important::
80+
After changing model configurations in ``menuconfig``, regenerate ``srmodels.bin``.
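
For scripted builds, the ``movemodel.py`` call from step 1 can be assembled programmatically. The sketch below only builds the argument list and the expected output location from the three paths documented above; the function names and the example paths are hypothetical, and it assumes ``movemodel.py`` accepts exactly the ``-d1``, ``-d2`` and ``-d3`` arguments shown in this document.

```python
import shlex
from pathlib import PurePosixPath


def movemodel_command(esp_sr_path, sdkconfig_path, build_path, python="python"):
    """Build the argument list for the movemodel.py call described above."""
    esp_sr = PurePosixPath(esp_sr_path)
    return [
        python,
        str(esp_sr / "movemodel.py"),
        "-d1", str(sdkconfig_path),  # project's sdkconfig file
        "-d2", str(esp_sr),          # ESP-SR component directory
        "-d3", str(build_path),      # project's build directory
    ]


def srmodels_bin_path(build_path):
    """Return the documented location of the generated srmodels.bin."""
    return PurePosixPath(build_path) / "srmodels" / "srmodels.bin"


# Example paths are placeholders; substitute your own project layout.
cmd = movemodel_command("components/esp-sr", "sdkconfig", "build")
print(shlex.join(cmd))
print(srmodels_bin_path("build"))
```

Passing ``cmd`` to ``subprocess.run(cmd, check=True)`` would execute the script; keeping the arguments as a list avoids shell quoting problems when paths contain spaces.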

docs/en/index.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ ESP-SR User Guide
1919
VAD Model vadnet <vadnet/README>
2020
Speech Command Word MultiNet <speech_command_recognition/README>
2121
Speech Synthesis (Only Supports Chinese Language) <speech_synthesis/readme>
22-
Flashing Models <flash_model/README>
22+
Model Selection and Loading <flash_model/README>
2323
Benchmark <benchmark/README>
2424
Test Methods <test_report/README>
2525
Glossary <glossary/glossary>
