docs/toolchain/manual_4_bie.md (+6, -5)
@@ -26,11 +26,12 @@ Args:
 * percentile (float, optional): used under 'mmse' mode. The range to search. The larger the value, the larger the search range, which improves performance but lengthens simulation time. Defaults to 0.001.
 * outlier_factor (float, optional): used under 'mmse' mode. The factor applied to outliers. For example, if your model is sensitive to clamped data, set outlier_factor to 2 or higher. A higher outlier_factor reduces outlier removal by widening the range. Defaults to 1.0.
 * percentage (float, optional): used under 'percentage' mode. A value between 0.999 and 1.0 is suggested; use 1.0 for detection models. Defaults to 0.999.
-* datapath_bitwidth_mode: choose from "int8"/"int16"/"mix balance"/"mix light". ("int16" is not supported in kdp520. "mix balance" and "mix light" are combines of int8 and int16 mode. "mix balance" prefers int16 while "mix light" prefers int8.)
-* weight_bitwidth_mode: choose from "int8"/"int16"/"int4"/"mix balance"/"mix light". ("int16" is not supported in kdp520. "int4" is not supported in kdp720. "mix balance" and "mix light" are combines of int8 and int16 mode. "mix balance" prefers int16 while "mix light" prefers int8.)
-* model_in_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520.)
-* model_out_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520.)
-* cpu_node_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520.)
+* datapath_bitwidth_mode: choose from "int8"/"int16"/"mix balance"/"mix light"/"mixbw". ("int16" is not supported in kdp520. "mixbw", "mix balance", and "mix light" are combinations of the int8 and int16 modes: "mix balance" prefers int16, "mix light" prefers int8, and "mixbw" automatically selects the best bitwidth for each layer.)
+* weight_bitwidth_mode: choose from "int8"/"int16"/"int4"/"mix balance"/"mix light". ("int16" is not supported in kdp520. "int4" is not supported in kdp720. "mixbw", "mix balance", and "mix light" are combinations of the int8 and int16 modes: "mix balance" prefers int16, "mix light" prefers int8, and "mixbw" automatically selects the best bitwidth for each layer.)
+* model_in_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520. When "mixbw" is set, this parameter is ignored.)
+* model_out_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520. When "mixbw" is set, this parameter is ignored.)
+* cpu_node_bitwidth_mode: choose from "int8"/"int16". ("int16" is not supported in kdp520. When "mixbw" is set, this parameter is ignored.)
+* flops_ratio (float, optional): the ratio of the model's FLOPs. The larger the value, the better the performance but the longer the simulation time. Defaults to 0.2.
 * compiler_tiling (str, optional): either "default" or "deep_search". "deep_search" finds a better image-cut method through a deeper search, improving NPU efficiency. Defaults to "default".
 * mode (int, optional): running mode for the analysis. Defaults to 1.
   - 0: run ip_evaluator only. This mode does not output a bie file.
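
For orientation, here is a minimal usage sketch of `km.analysis` with the arguments documented above. It assumes `km` is an already-configured toolchain model object and `input_images` is a prepared calibration input mapping; both come from earlier toolchain steps and are assumptions here, not part of this diff.

```python
# Minimal sketch, not a definitive invocation: `km` (a configured toolchain
# model object) and `input_images` (a calibration input mapping) are assumed
# to exist from earlier toolchain steps. Only the keyword arguments below are
# taken from the Args list in this section.
bie_path = km.analysis(
    input_images,
    percentile=0.001,                    # 'mmse' search range (default)
    outlier_factor=1.0,                  # raise to 2+ if clamping hurts the model
    datapath_bitwidth_mode="mixbw",      # per-layer bitwidth chosen automatically
    weight_bitwidth_mode="mix balance",  # prefers int16 for weights
    flops_ratio=0.2,                     # larger: better performance, longer simulation
    compiler_tiling="default",           # or "deep_search" for a deeper image-cut search
    mode=1,                              # 0 runs ip_evaluator only and outputs no bie file
)
```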
docs/toolchain/quantization/1.3_Optimizing_Quantization_Modes.md (+1, -4)
@@ -66,7 +66,7 @@ bie_path = km.analysis(
 ### 3.2.3 Use `mixbw` for Sensitivity-Guided Quantization
-If `mix light` precision is insufficient, use `mixbw`. This mode analyzes Conv node sensitivity and automatically prioritizes 16-bit quantization for sensitive Conv layers. Control compute overhead with flops_ratio (default=0.2). `mixbw` mode may need more time and disk space to evaluate quant sensitivity, but its fps is still faster than all int16.
+If `mix light` precision is insufficient, use `mixbw`. This mode analyzes Conv-node sensitivity and automatically prioritizes 16-bit quantization for the sensitive Conv layers. Control the compute overhead with `flops_ratio` (default 0.2). `mixbw` mode may need more time and disk space to evaluate quantization sensitivity, but its fps is still higher than running everything in int16. When `mixbw` is used, `model_in_bitwidth_mode`, `model_out_bitwidth_mode`, and `cpu_node_bitwidth_mode` are always `int16` and cannot be changed.
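
As a quick sketch under the same assumptions as the earlier example (hypothetical `km` and `input_images` from earlier toolchain steps), enabling `mixbw` needs only the datapath mode and, optionally, `flops_ratio`; the input/output and CPU-node bitwidth modes are omitted because `mixbw` fixes them to `int16`.

```python
# Sketch under the same assumptions as the earlier example: `km` and
# `input_images` are hypothetical stand-ins from earlier toolchain steps.
# model_in/model_out/cpu_node bitwidth modes are not passed: with "mixbw"
# they are fixed to int16 and would be ignored anyway.
bie_path = km.analysis(
    input_images,
    datapath_bitwidth_mode="mixbw",
    flops_ratio=0.3,  # above the 0.2 default: better performance, longer simulation
)
```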