From 0b48f47eb0923d006d1e1001e1c3e03cc12af1a2 Mon Sep 17 00:00:00 2001
From: Jack-Khuu
Date: Tue, 4 Jun 2024 19:02:43 -0700
Subject: [PATCH 1/3] Update quantization-overview.md

---
 docs/source/quantization-overview.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/source/quantization-overview.md b/docs/source/quantization-overview.md
index 3a56fb4577f..0a2d67c5015 100644
--- a/docs/source/quantization-overview.md
+++ b/docs/source/quantization-overview.md
@@ -14,3 +14,9 @@ Backend developers will need to implement their own ``Quantizer`` to express how
 Modeling users will use the ``Quantizer`` specific to their target backend to quantize their model, e.g. ``XNNPACKQuantizer``.
 
 For an example quantization flow with ``XNNPACKQuantizer``, more documentation and tutorials, please see the ``Performing Quantization`` section in the [ExecuTorch tutorial](./tutorials/export-to-executorch-tutorial).
+
+## Source Quantization: Int8DynActInt4WeightQuantizer
+
+In addition to export-based quantization (described above), ExecuTorch also provides source-based quantization. This is accomplished via [torchao](https://github.com/pytorch/ao). One specific example is `Int8DynActInt4WeightQuantizer`.
+
+Imported with ``from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer``, this class uses a quantization object constructed with the specified precision and groupsize to mutate a provided ``nn.Module``.

From 9d80b628b2b50a22999d8833ab5d9f41501c629b Mon Sep 17 00:00:00 2001
From: Jack-Khuu
Date: Wed, 5 Jun 2024 00:15:46 -0700
Subject: [PATCH 2/3] Update quantization-overview.md

---
 docs/source/quantization-overview.md | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/docs/source/quantization-overview.md b/docs/source/quantization-overview.md
index 0a2d67c5015..94a6f93c32b 100644
--- a/docs/source/quantization-overview.md
+++ b/docs/source/quantization-overview.md
@@ -17,6 +17,15 @@ For an example quantization flow with ``XNNPACKQuantizer``, more documentation an
 
 ## Source Quantization: Int8DynActInt4WeightQuantizer
 
-In addition to export-based quantization (described above), ExecuTorch also provides source-based quantization. This is accomplished via [torchao](https://github.com/pytorch/ao). One specific example is `Int8DynActInt4WeightQuantizer`.
+In addition to export-based quantization (described above), ExecuTorch also supports source-based quantization, accomplished via [torchao](https://github.com/pytorch/ao). Unlike export-based quantization, source-based quantization directly modifies the model prior to export. One specific example is `Int8DynActInt4WeightQuantizer`.
+
+This scheme represents 4-bit weight quantization with 8-bit dynamic quantization of activations during inference.
+
+Imported with ``from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer``, this class uses a quantization instance constructed with a specified dtype precision and groupsize to mutate a provided ``nn.Module``.
+
+```
+from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer
+
+model = Int8DynActInt4WeightQuantizer(precision=torch_dtype, groupsize=group_size).quantize(model)
+```
 
-Imported with ``from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer``, this class uses a quantization object constructed with the specified precision and groupsize to mutate a provided ``nn.Module``.
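The snippet added in PATCH 2/3 assumes a model and the ``torch_dtype``/``group_size`` values are already defined. A minimal self-contained sketch of the same call, with a toy module and placeholder hyperparameters (the model, dtype, and groupsize below are illustrative assumptions, not part of the patch), might look like:

```
# Hedged sketch only: the toy model and the precision/groupsize values are
# illustrative assumptions, not part of the patch.
import torch
import torch.nn as nn
from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer

# Toy model standing in for a real network; in_features should be divisible
# by the chosen groupsize for grouped 4-bit weight quantization.
model = nn.Sequential(
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
)

# precision is the dtype used for computation; groupsize is the number of
# weight elements that share one quantization scale.
quantizer = Int8DynActInt4WeightQuantizer(precision=torch.float32, groupsize=128)
model = quantizer.quantize(model)
```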
From be9b90c7963a0d9cd67ee6df627d847c1a5df298 Mon Sep 17 00:00:00 2001
From: Jack-Khuu
Date: Wed, 5 Jun 2024 13:22:03 -0700
Subject: [PATCH 3/3] Added info on lowering

---
 docs/source/quantization-overview.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/docs/source/quantization-overview.md b/docs/source/quantization-overview.md
index 94a6f93c32b..e80cfd2eb83 100644
--- a/docs/source/quantization-overview.md
+++ b/docs/source/quantization-overview.md
@@ -24,8 +24,15 @@ This scheme represents 4-bit weight quantization with 8-bit dynamic quantization
 Imported with ``from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer``, this class uses a quantization instance constructed with a specified dtype precision and groupsize to mutate a provided ``nn.Module``.
 
 ```
+# Source Quant
 from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer
 
 model = Int8DynActInt4WeightQuantizer(precision=torch_dtype, groupsize=group_size).quantize(model)
-```
 
+# Export to ExecuTorch
+from executorch.exir import to_edge
+from torch.export import export
+
+exported_model = export(model, ...)
+et_program = to_edge(exported_model, ...).to_executorch(...)
+```
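The lowering snippet in PATCH 3/3 leaves its remaining arguments elided (``...``). One plausible way to fill them in, assuming default ``to_edge``/``to_executorch`` behavior and ``torch.export``'s requirement of example inputs (the toy model, inputs, and output filename below are illustrative assumptions, not part of the patch), is:

```
# Hedged sketch of the end-to-end quantize-then-lower flow; the toy model,
# example inputs, and output path are illustrative assumptions.
import torch
import torch.nn as nn
from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer
from torch.export import export
from executorch.exir import to_edge

# Source quantization, as in the patch above.
model = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
model = Int8DynActInt4WeightQuantizer(precision=torch.float32, groupsize=128).quantize(model)

# torch.export traces the model given a tuple of example inputs.
example_inputs = (torch.randn(1, 128),)
exported_model = export(model, example_inputs)

# Lower the exported program to an ExecuTorch program and serialize it.
et_program = to_edge(exported_model).to_executorch()
with open("model.pte", "wb") as f:
    f.write(et_program.buffer)
```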