[ET-VK][int4] patch 4-bit source transformation quantizer to support linear modules with biases #8224
Conversation
…linear modules with biases

While LLaMA does not have biases, some models do have biases in their linear modules. Add support for biases in the source transform quantizer.

Differential Revision: [D69072087](https://our.internmc.facebook.com/intern/diff/D69072087/)

[ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D69072087
Merged commit ab10ab3 into gh/nathanaelsee/1/base
…linear modules with biases

Pull Request resolved: #8224

While LLaMA does not have biases, some models do have biases in their linear modules. Add support for biases in the source transform quantizer.

ghstack-source-id: 264952608
exported-using-ghexport

Differential Revision: [D69072087](https://our.internmc.facebook.com/intern/diff/D69072087/)

Co-authored-by: Nathanael See <[email protected]>
Stack from ghstack (oldest at bottom):
While LLaMA does not have biases, some models do have biases in their linear modules.

Add support for biases in the source transform quantizer.
Differential Revision: D69072087
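For context, here is a minimal sketch of what such a source transform looks like: recursively swapping `nn.Linear` modules for a 4-bit group-quantized replacement that carries the original bias through instead of assuming it is absent. This is an illustrative assumption, not the actual ExecuTorch Vulkan implementation; the names `Int4WeightLinear`, `group_quantize_4bit`, and `quantize_linear_modules`, and the dequantize-then-matmul forward pass, are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def group_quantize_4bit(w: torch.Tensor, group_size: int = 32):
    """Symmetric per-group 4-bit quantization of a 2D weight tensor."""
    out_features, in_features = w.shape
    wg = w.reshape(out_features, in_features // group_size, group_size)
    # int4 range is [-8, 7]; clamp scales to avoid division by zero.
    scales = wg.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(wg / scales), -8, 7).to(torch.int8)
    return q.reshape(out_features, in_features), scales.squeeze(-1)


class Int4WeightLinear(nn.Module):
    """Linear with a 4-bit group-quantized weight and an *optional* bias."""

    def __init__(self, linear: nn.Linear, group_size: int = 32):
        super().__init__()
        q, scales = group_quantize_4bit(linear.weight.detach(), group_size)
        self.register_buffer("q_weight", q)
        self.register_buffer("scales", scales)
        self.group_size = group_size
        # The point of the patch: keep the bias when the source module
        # has one, instead of assuming bias-free linears as in LLaMA.
        if linear.bias is not None:
            self.register_buffer("bias", linear.bias.detach().clone())
        else:
            self.bias = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_f, in_f = self.q_weight.shape
        wg = self.q_weight.reshape(out_f, in_f // self.group_size, self.group_size)
        # Dequantize per group, then apply a plain linear with the bias.
        w = (wg.to(torch.float32) * self.scales.unsqueeze(-1)).reshape(out_f, in_f)
        bias = self.bias.to(x.dtype) if self.bias is not None else None
        return F.linear(x, w.to(x.dtype), bias)


def quantize_linear_modules(model: nn.Module, group_size: int = 32) -> nn.Module:
    """Recursively replace eligible nn.Linear modules in place."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear) and child.in_features % group_size == 0:
            setattr(model, name, Int4WeightLinear(child, group_size))
        else:
            quantize_linear_modules(child, group_size)
    return model
```

With a transform shaped like this, calling `quantize_linear_modules(model)` on a model whose linear modules carry biases preserves them; a version that only copied the weight would silently drop the bias term, which is the gap this PR describes closing.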