Commit 4aae2a3

Update on "Autoquant"
Summary: Adding autoquantization functionality. Using the do_quant API, we can test kernel speeds and pick the best quantization type (or no quantization) for each layer.

Test Plan: python test/test.py -k "autoquant"

Also tested on SAM and SDXL:
pytorch-labs/segment-anything-fast#114
HDCharles/sdxl-fast@8d9942a

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D55103983](https://our.internmc.facebook.com/intern/diff/D55103983)

[ghstack-poisoned]
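
The selection idea in the summary (time each option per layer, keep the fastest, with "no quantization" as one of the options) can be sketched roughly as follows. This is an illustration only, not the code in this commit: `time_call`, `pick_fastest`, `choose_per_layer`, and the candidate names are hypothetical, and the do_quant API itself is not shown.

```python
import time
from typing import Callable, Dict

Candidate = Callable[[], object]


def time_call(fn: Candidate, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def pick_fastest(candidates: Dict[str, Candidate],
                 timer: Callable[[Candidate], float] = time_call) -> str:
    """Benchmark every candidate and return the name of the fastest one."""
    if not candidates:
        raise ValueError("need at least one candidate")
    timings = {name: timer(fn) for name, fn in candidates.items()}
    return min(timings, key=timings.get)


def choose_per_layer(layers: Dict[str, Dict[str, Candidate]],
                     timer: Callable[[Candidate], float] = time_call) -> Dict[str, str]:
    """Map each (hypothetical) layer name to its fastest candidate."""
    return {layer: pick_fastest(cands, timer) for layer, cands in layers.items()}
```

Passing the timer in as a parameter keeps the choice logic testable without real kernels; in practice each candidate would wrap a call to the layer with one quantization type applied.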
2 parents bc2deb7 + 490c7c1 commit 4aae2a3

1 file changed: +0 -6 lines changed
test/test.py

Lines changed: 0 additions & 6 deletions
@@ -894,12 +894,6 @@ def test_aq_int8_dynamic_quant_subclass(self):
                 AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
             )
 
-    def test_aq_int8_weight_only_quant_subclass(self):
-        for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
-            self._test_lin_weight_subclass_impl(
-                AQInt8DynamicallyQuantizedLinearWeight.from_float, 35, test_dtype
-            )
-
     def test_aq_int8_weight_only_quant_subclass(self):
         for test_dtype in [torch.float32, torch.float16, torch.bfloat16]:
             self._test_lin_weight_subclass_impl(
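
The removed method had the same name as the definition that follows it, so it was already dead code: in a Python class body, a later `def` with the same name rebinds the attribute, and unittest only ever sees the last one. A minimal sketch of that behavior, using a hypothetical `ShadowingExample` class rather than the real test file:

```python
import unittest


class ShadowingExample(unittest.TestCase):
    # Hypothetical stand-in for the duplicated test removed in the diff.
    def test_weight_only(self):
        raise AssertionError("first definition: never collected")

    # Same name: this rebinds the class attribute and replaces the method above.
    def test_weight_only(self):
        self.assertTrue(True)


def collected_test_names(case):
    """Return the test method names unittest discovers on `case`."""
    return unittest.TestLoader().getTestCaseNames(case)


def run_case(case):
    """Run every test on `case` without printing and return the TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Here only one test is collected and it passes, even though the first definition would fail if it ever ran. Deleting the shadowed copy therefore does not change which tests execute.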
