Commit 0b48f47

Update quantization-overview.md

1 parent f184329 commit 0b48f47
File tree

1 file changed: +6 −0 lines
docs/source/quantization-overview.md

Lines changed: 6 additions & 0 deletions
@@ -14,3 +14,9 @@ Backend developers will need to implement their own ``Quantizer`` to express how
 Modeling users will use the ``Quantizer`` specific to their target backend to quantize their model, e.g. ``XNNPACKQuantizer``.
 
 For an example quantization flow with ``XNNPACKQuantizer``, more documentation, and tutorials, please see the ``Performing Quantization`` section in the [ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
+
+## Source Quantization: Int8DynActInt4WeightQuantizer
+
+In addition to export-based quantization (described above), ExecuTorch also provides source-based quantization. This is accomplished via [torchao](https://github.com/pytorch/ao). One specific example is `Int8DynActInt4WeightQuantizer`.
+
+Imported with ``from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer``, this class uses a quantization object constructed with the specified precision and group size to mutate a provided ``nn.Module``.
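The class name encodes the scheme: activations are quantized to int8 dynamically (scales computed from live values at runtime), while weights are quantized to int4 with one scale per group of elements. The sketch below illustrates that arithmetic in pure Python; it is an assumption-laden illustration of the general scheme, not torchao's implementation, and all function names in it are hypothetical.

```python
# Illustrative sketch of "int8 dynamic activations + int4 group-wise weights".
# NOT torchao code: function names and layout here are hypothetical.

def quantize_weights_int4_groupwise(weights, groupsize):
    """Symmetric int4 quantization (range [-8, 7]) with one scale per group."""
    assert len(weights) % groupsize == 0
    qweights, scales = [], []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        # The scale maps the group's largest magnitude onto the int4 range.
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        qweights.extend(max(-8, min(7, round(w / scale))) for w in group)
    return qweights, scales

def quantize_activation_int8_dynamic(activations):
    """Per-tensor *dynamic* int8 quantization: scale comes from the live values."""
    scale = max(abs(a) for a in activations) / 127 or 1.0
    qacts = [max(-128, min(127, round(a / scale))) for a in activations]
    return qacts, scale

def dequantize(qvalues, scale):
    return [q * scale for q in qvalues]

# Weights are quantized once, ahead of time, group by group.
w = [0.5, -1.0, 0.25, 0.75, 2.0, -2.0, 1.0, 0.0]
qw, w_scales = quantize_weights_int4_groupwise(w, groupsize=4)

# Activations are quantized on the fly, per input tensor.
a, a_scale = quantize_activation_int8_dynamic([0.1, -0.3, 0.2, 0.05])
```

Group-wise weight scales keep the int4 error local: one large outlier only coarsens the scale of its own group rather than the whole tensor, which is why a `groupsize` parameter appears in this family of quantizers.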
