Autoquant #82

HDCharles · 2024-03-25T23:29:20Z

Summary: Adding autoquantization functionality. Using the do_quant API, we can test kernel speeds and pick the best quantization type (or no quantization) for each layer.
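
To make the per-layer selection in the summary concrete, here is a minimal, library-free sketch of the idea: time each candidate on a sample input for a layer and keep the fastest. It is not the PR's implementation. The names `benchmark`, `pick_fastest`, `choose_per_layer`, and the candidate labels are hypothetical, and plain Python callables stand in for the quantized and unquantized kernels.

```python
import time
from typing import Callable, Dict

# Hypothetical stand-in for a kernel: any callable that consumes a sample input.
Kernel = Callable[[list], object]


def benchmark(fn: Kernel, sample: list, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds of fn(sample) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(sample)
        best = min(best, time.perf_counter() - start)
    return best


def pick_fastest(
    candidates: Dict[str, Kernel],
    sample: list,
    timer: Callable[[Kernel, list], float] = benchmark,
) -> str:
    """Time every candidate on `sample` and return the name of the fastest one."""
    timings = {name: timer(fn, sample) for name, fn in candidates.items()}
    return min(timings, key=timings.get)


def choose_per_layer(
    layer_samples: Dict[str, list],
    candidates: Dict[str, Kernel],
    timer: Callable[[Kernel, list], float] = benchmark,
) -> Dict[str, str]:
    """Pick one candidate per layer; an unquantized candidate can win too."""
    return {
        layer: pick_fastest(candidates, sample, timer)
        for layer, sample in layer_samples.items()
    }
```

The `timer` argument is injectable so that the selection logic can be checked with a deterministic cost model instead of real timings. Including an unquantized candidate among `candidates` is what allows "no quantization" to be chosen for a layer where it is fastest.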

Test Plan: python test/test.py -k "autoquant"

Also tested on SAM and SDXL:
pytorch-labs/segment-anything-fast#114, HDCharles/sdxl-fast@8d9942a

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]


cpuhrsch · 2024-03-26T06:38:58Z

torchao/__init__.py

 from . import dtypes
-from .quantization.quant_api import apply_dynamic_quant
-from .quantization.quant_api import apply_weight_only_int8_quant

 __all__ = [


I don't think we should make all of these public right away. For example, apply_dynamic_quant is a bit of a duplicate of change_linear_weights_to_int8_dqtensors, and swap_conv2d_1x1_to_linear might be something we just want to do automatically instead of making it a top-level API.

Also, dtypes appears twice.

Is it possible to only add autoquant for now?

cpuhrsch · 2024-03-26T06:39:25Z

torchao/quantization/quant_api.py

@@ -136,10 +143,14 @@ def apply_dynamic_quant(model, filter_fn=None):


 def _get_subclass_inserter(cls, **kwargs):
-
+    # pyre-fixme[53]: Captured variable `cls` is not annotated.


You don't need the pyre-fixmes anymore


facebook-github-bot added the CLA Signed label Mar 25, 2024

HDCharles merged commit 8119319 into main Mar 25, 2024

cpuhrsch added a commit that referenced this pull request Mar 26, 2024

Revert "Autoquant (#82)"

a6c7367

This reverts commit 8119319.

cpuhrsch mentioned this pull request Mar 26, 2024

Revert "Autoquant" #83

Merged

cpuhrsch added a commit that referenced this pull request Mar 26, 2024

Revert "Autoquant (#82)" (#83)

62c7871

This reverts commit 8119319.

cpuhrsch reviewed Mar 26, 2024


cpuhrsch added a commit that referenced this pull request Apr 1, 2024

Reapply "Autoquant (#82)"

3118a3d

This reverts commit a6c7367.

cpuhrsch added a commit that referenced this pull request Apr 5, 2024

Reapply Autoquant (#82) (#109)

c403580

dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024

Revert "Autoquant (pytorch#82)" (pytorch#83)

1770675

This reverts commit 8119319.

dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024

Reapply Autoquant (pytorch#82) (pytorch#109)

37ae1d2
