[GPU] Recognize parameters as valid inputs for compressed weights #32276

mdvoretc-intel · 2025-10-02T12:10:00Z

Details:

The change allows parameters to be recognized alongside constants as valid weight inputs for the transformations that produce FullyConnectedCompressed nodes.

Description of the issue:

At present, the FC_COMPRESSED_WEIGHT_PATTERN macro contains a pattern for dequantization of a constant integer weight. This pattern is used to recognize and fold cases where fused weight dequantization can be used, replacing them with FullyConnectedCompressed nodes. Because it expects a constant weight input, the pattern fails to recognize quantized LoRA weights, which are provided as parameters.

With the changes in this patch, these weights are recognized, so the transformations can proceed and produce nodes that leverage oneDNN fused QGEMM for execution.
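To make the matching change concrete, here is a minimal standalone sketch. It does not use the OpenVINO pattern API or the real FC_COMPRESSED_WEIGHT_PATTERN macro: the Node, Kind, matches_decompression and build_decompression names are hypothetical, and the chain shape (Convert, optional Subtract, Multiply) is an assumed simplification of the decompression subgraph.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy graph node. NOT an OpenVINO class; every name here is hypothetical.
enum class Kind { Constant, Parameter, Convert, Subtract, Multiply };

struct Node {
    Kind kind;
    std::vector<std::shared_ptr<Node>> inputs;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(Kind kind, std::vector<NodePtr> inputs = {}) {
    return std::make_shared<Node>(Node{kind, std::move(inputs)});
}

// A weight source is a Constant, or (when allowed) a Parameter.
bool is_weight_source(const NodePtr& n, bool allow_parameter) {
    return n->kind == Kind::Constant ||
           (allow_parameter && n->kind == Kind::Parameter);
}

// Matches Multiply(Subtract(Convert(w), zp), scale) or Multiply(Convert(w), scale).
bool matches_decompression(const NodePtr& mul, bool allow_parameter) {
    if (mul->kind != Kind::Multiply || mul->inputs.size() != 2) return false;
    NodePtr x = mul->inputs[0];
    if (x->kind == Kind::Subtract) {  // the zero-point subtraction is optional
        if (x->inputs.size() != 2) return false;
        x = x->inputs[0];
    }
    if (x->kind != Kind::Convert || x->inputs.size() != 1) return false;
    return is_weight_source(x->inputs[0], allow_parameter);
}

// Builds weight -> Convert -> Subtract(zp) -> Multiply(scale) for a given weight kind.
NodePtr build_decompression(Kind weight_kind) {
    auto weight = make(weight_kind);
    auto convert = make(Kind::Convert, {weight});
    auto subtract = make(Kind::Subtract, {convert, make(Kind::Constant)});
    return make(Kind::Multiply, {subtract, make(Kind::Constant)});
}
```

With allow_parameter set to false the sketch behaves like a constant-only pattern and rejects a chain rooted at a Parameter; setting it to true accepts both kinds of weight source.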

Tickets:

CVS-172090

mdvoretc-intel · 2025-10-28T16:15:42Z

build_jenkins

mdvoretc-intel · 2025-10-29T11:36:33Z

build_jenkins

mklimenk

The two branches of the if (pattern_map.count(weights_const_m)) { condition share a lot of similarities. Please consider refactoring them to avoid code duplication.
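One possible shape for such a refactor, sketched outside of OpenVINO: branch only on which pattern label matched, then run a single shared code path. The MatchedNode and PatternMap types, the string keys, and the weights_param_m label are assumptions made for illustration; only weights_const_m appears in the review comment.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a matched graph node; not an OpenVINO type.
struct MatchedNode {
    std::string name;
    bool is_constant;
};

// Hypothetical pattern map keyed by label name.
using PatternMap = std::map<std::string, MatchedNode>;

// The only place that cares whether the constant or the parameter label matched.
// Throws std::out_of_range if neither label is present.
const MatchedNode& select_weights(const PatternMap& pattern_map) {
    const auto it = pattern_map.find("weights_const_m");
    return it != pattern_map.end() ? it->second
                                   : pattern_map.at("weights_param_m");
}

// Shared path: everything after weight selection is written once.
std::string describe_fc_compressed(const PatternMap& pattern_map) {
    const MatchedNode& weights = select_weights(pattern_map);
    return "FullyConnectedCompressed(weights=" + weights.name + ")";
}
```

The design choice is to keep the constant/parameter distinction inside select_weights so that later steps do not need their own if/else.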

src/plugins/intel_gpu/src/plugin/transformations/convert_fc_to_compressed.cpp

This change enables quantized LoRA weights, passed as parameters during execution, to be recognized by the transformations that produce FullyConnectedCompressed nodes for QGEMM execution.

The test previously expected the transformation to fail due to the use of input2 as a weight. The new logic allows use of parameters as weights, so the test has been adjusted to expect a successful transformation.

mklimenk

Looks much cleaner now, thanks!

mdvoretc-intel · 2025-11-03T16:19:39Z

@CuriousPanCake please review.

mdvoretc-intel · 2025-11-11T14:25:49Z

@CuriousPanCake please review.

CuriousPanCake · 2025-11-14T09:38:42Z

@Lyamin-Roman please take a look

This required handling a previously uncovered use case in which a reshape follows the float conversion, as well as adding logic to reshape the parameter weight when the reshape covers the entire decompression pattern.
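As an illustration of why a reshape that covers the whole decompression pattern can be applied to the raw weight instead, the sketch below dequantizes grouped weights in two ways: as a [N, G, S] tensor with per-group scale and zero point, and as a flat [N, G*S] tensor where the group index is k / S. The function names, the row-major layout, and the per-group [N, G] scale and zero-point layout are assumptions for this example and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dequantize weights viewed as [N, G, S]; scale and zp are [N, G], row-major.
std::vector<float> dequantize_grouped(const std::vector<std::uint8_t>& q,
                                      const std::vector<float>& scale,
                                      const std::vector<std::uint8_t>& zp,
                                      std::size_t N, std::size_t G, std::size_t S) {
    std::vector<float> out(N * G * S);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t g = 0; g < G; ++g)
            for (std::size_t s = 0; s < S; ++s) {
                const std::size_t i = (n * G + g) * S + s;
                out[i] = (static_cast<float>(q[i]) - static_cast<float>(zp[n * G + g])) *
                         scale[n * G + g];
            }
    return out;
}

// Same data viewed as [N, K] with K = G * S; the group of column k is k / S.
std::vector<float> dequantize_flat(const std::vector<std::uint8_t>& q,
                                   const std::vector<float>& scale,
                                   const std::vector<std::uint8_t>& zp,
                                   std::size_t N, std::size_t K, std::size_t S) {
    const std::size_t G = K / S;
    std::vector<float> out(N * K);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t k = 0; k < K; ++k) {
            const std::size_t group = n * G + k / S;
            out[n * K + k] = (static_cast<float>(q[n * K + k]) - static_cast<float>(zp[group])) *
                             scale[group];
        }
    return out;
}
```

Because merging the last two dimensions does not move any data in a row-major buffer, both functions produce the same values; only the indexing of scale and zero point changes.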

mdvoretc-intel · 2025-11-28T14:21:48Z

Consider adding new tests

Tests added.

mdvoretc-intel · 2025-11-28T14:23:08Z

@Lyamin-Roman please review.

src/plugins/intel_gpu/src/plugin/transformations/convert_fc_to_compressed.cpp

Lyamin-Roman · 2025-11-28T16:09:25Z

src/plugins/intel_gpu/tests/unit/transformations/convert_fc_to_compressed_test.cpp

    }
 }

+TEST_F(TransformationTestsF, ConvertFCToCompressed11) {


[random spot] Please add functional accuracy tests
It looks like you can extend an existing test with additional parameters
src/plugins/intel_gpu/tests/functional/subgraph_tests/dynamic/matmul_weights_decompression.cpp

Attempted this. The tests have a single parameter for input precision, which prevents weight parameters from being created in compressed types.

Tests added with a configure_model() override to ensure weight parameters are provided in the appropriate type. The tests currently fail due to the lack of u4 transposition support; a fix is in progress.
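For context on why u4 transposition needs dedicated support, here is a small reference sketch of transposing a packed 4-bit matrix on the host. It is not the GPU plugin's implementation. The packing order (even element index in the low nibble, odd index in the high nibble) is an assumption of this example, and get_u4, set_u4 and transpose_u4 are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// ASSUMPTION: element i lives in byte i / 2, low nibble for even i, high nibble for odd i.
std::uint8_t get_u4(const std::vector<std::uint8_t>& buf, std::size_t i) {
    const std::uint8_t byte = buf[i / 2];
    return static_cast<std::uint8_t>((i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
}

void set_u4(std::vector<std::uint8_t>& buf, std::size_t i, std::uint8_t value) {
    std::uint8_t& byte = buf[i / 2];
    if (i % 2 == 0)
        byte = static_cast<std::uint8_t>((byte & 0xF0) | (value & 0x0F));
    else
        byte = static_cast<std::uint8_t>((byte & 0x0F) | ((value & 0x0F) << 4));
}

// Transposes a row-major [rows, cols] u4 matrix into a row-major [cols, rows] one.
// Elements must be moved nibble by nibble: with an odd row length, rows start
// in the middle of a byte, so whole bytes cannot simply be permuted.
std::vector<std::uint8_t> transpose_u4(const std::vector<std::uint8_t>& src,
                                       std::size_t rows, std::size_t cols) {
    std::vector<std::uint8_t> dst((rows * cols + 1) / 2, 0);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            set_u4(dst, c * rows + r, get_u4(src, r * cols + c));
    return dst;
}
```

For example, under the assumed packing a 2x3 matrix with elements 1 to 6 is stored as bytes 0x21, 0x43, 0x65, and its 3x2 transpose as 0x41, 0x52, 0x63.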

src/plugins/intel_gpu/src/plugin/transformations/compressed_weights_pattern.hpp

mryzhov

Please address the requested changes

p-durandin · 2025-12-03T05:55:27Z

build_jenkins

susbhere · 2025-12-06T06:06:44Z

build_jenkins

p-durandin · 2025-12-08T05:31:49Z

build_jenkins

mklimenk · 2025-12-10T16:56:54Z

src/plugins/intel_gpu/tests/functional/subgraph_tests/dynamic/matmul_weights_decompression.cpp

+    void configure_model() override {
+        ov::preprocess::PrePostProcessor p(function);
+        {
+            auto& params = function->get_parameters();


Please remove this unused variable; it causes the build to fail.
@ZackyLake

github-actions · 2025-12-29T00:32:05Z

This PR will be closed in a week because of 2 weeks of no activity.

mdvoretc-intel · 2026-01-09T13:04:54Z

@mryzhov All requested changes have been addressed.

github-actions bot added the category: GPU and category: transformations labels Oct 2, 2025

sys-openvino-ci added the ExternalIntelPR label Oct 2, 2025

mdvoretc-intel force-pushed the param_quant_weight branch from 522f237 to 6a6a649 Compare October 28, 2025 16:14

mdvoretc-intel force-pushed the param_quant_weight branch from 6a6a649 to 16d03b4 Compare October 29, 2025 11:35

mdvoretc-intel marked this pull request as ready for review October 29, 2025 11:36

mdvoretc-intel requested review from a team as code owners October 29, 2025 11:36

mdvoretc-intel requested review from CuriousPanCake and removed request for a team October 29, 2025 11:36

mdvoretc-intel force-pushed the param_quant_weight branch 2 times, most recently from b82b902 to 476f80f Compare October 30, 2025 09:25

mklimenk reviewed Oct 31, 2025

src/plugins/intel_gpu/src/plugin/transformations/convert_fc_to_compressed.cpp (outdated, resolved)

mdvoretc-intel added 2 commits November 3, 2025 13:40

[gpu] Recognize parameters as valid inputs for compressed weights

f557a05

This change enables quantized LoRA weights, passed as parameters during execution, to be recognized by the transformations that produce FullyConnectedCompressed nodes for QGEMM execution.

Adjust ConvertMatMulToFullyConnectedExceptionTest_sibling_matmul

1c45696

The test previously expected the transformation to fail due to the use of input2 as a weight. The new logic allows use of parameters as weights, so the test has been adjusted to expect a successful transformation.

mdvoretc-intel force-pushed the param_quant_weight branch from e9e889a to 1c45696 Compare November 3, 2025 13:40

Address review comments

a714620

mklimenk reviewed Nov 3, 2025

Merge branch 'master' into param_quant_weight

53d9074

github-actions bot removed the category: transformations label Nov 11, 2025

Restore convert_matmul_to_fc change after refactor

ea48b32

CuriousPanCake requested review from Lyamin-Roman, evkotov and mryzhov November 13, 2025 12:21

Add unit tests per review comments

30dd0ef

This required handling a previously uncovered use case in which a reshape follows the float conversion, as well as adding logic to reshape the parameter weight when the reshape covers the entire decompression pattern.

Merge branch 'master' into param_quant_weight

d80ce82

Lyamin-Roman approved these changes Nov 28, 2025

Lyamin-Roman self-requested a review November 28, 2025 16:10

mryzhov self-assigned this Dec 1, 2025

Remove redundant code

515f0d0

mryzhov reviewed Dec 1, 2025

src/plugins/intel_gpu/src/plugin/transformations/compressed_weights_pattern.hpp (outdated, resolved)

mryzhov requested changes Dec 1, 2025

Generalize immediate weight reshape

83a0b55

mdvoretc-intel added 2 commits December 4, 2025 04:08

Add test cases

d8d9e22

Skip cases that would require 4-bit transpose

6f49a0d

mklimenk reviewed Dec 10, 2025

github-actions bot added the Stale label Dec 29, 2025

mdvoretc-intel added 3 commits January 6, 2026 01:04

Remove unused variable

9831c47

Merge branch 'master' into param_quant_weight

9f75ca4

Cover the added test case

ae5c996

github-actions bot added the category: Core label Jan 6, 2026

mdvoretc-intel added 2 commits January 7, 2026 05:49

Remove incorrect assertions

9f96b7f

Fix test API issue

d443c08

github-actions bot removed the category: Core label Jan 8, 2026

Update logic for squeezing parameter weights

b510b91

mdvoretc-intel force-pushed the param_quant_weight branch from e2a50ee to b510b91 Compare January 8, 2026 15:14

github-actions bot removed the Stale label Jan 9, 2026

9 participants