Contributor

@hyunback hyunback commented Jun 14, 2024

Details:

  • Stable Diffusion on DPAS platforms has poor first-inference latency because all oneDNN convolutions are compiled during the first inference. A shape-agnostic kernel removes this bottleneck. The target kernel is convolution_fsv16_1x1.
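
To illustrate why a shape-agnostic kernel helps here, the following is a minimal toy sketch, not the OpenVINO or oneDNN implementation. It assumes a kernel cache whose key either includes the input shape (so every new shape triggers a compile) or ignores it (so one compile serves all shapes, with the shape passed at dispatch time). All names (`Shape`, `CompiledKernel`, `StaticShapeCache`, `ShapeAgnosticCache`, `compiles_for`, `example_shapes`) and the example shapes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical input shape of a 1x1 convolution: batch, channels, height, width.
struct Shape {
    std::size_t n, c, h, w;
    bool operator<(const Shape& o) const {
        return std::tie(n, c, h, w) < std::tie(o.n, o.c, o.h, o.w);
    }
};

// Hypothetical stand-in for a compiled GPU kernel.
struct CompiledKernel {
    std::string name;
};

// Cache keyed on (kernel name, shape): each unseen shape costs one compile.
class StaticShapeCache {
public:
    const CompiledKernel& get(const std::string& name, const Shape& shape) {
        assert(!name.empty());
        auto key = std::make_pair(name, shape);
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            ++compile_count_;  // stands in for an expensive compilation
            it = cache_.emplace(key, CompiledKernel{name}).first;
        }
        return it->second;
    }
    std::size_t compile_count() const { return compile_count_; }

private:
    std::map<std::pair<std::string, Shape>, CompiledKernel> cache_;
    std::size_t compile_count_ = 0;
};

// Cache keyed on the kernel name only: the shape is treated as a runtime
// argument, so it never triggers a new compile.
class ShapeAgnosticCache {
public:
    const CompiledKernel& get(const std::string& name, const Shape& /*runtime_shape*/) {
        assert(!name.empty());
        auto it = cache_.find(name);
        if (it == cache_.end()) {
            ++compile_count_;  // compiled once, reused for every shape
            it = cache_.emplace(name, CompiledKernel{name}).first;
        }
        return it->second;
    }
    std::size_t compile_count() const { return compile_count_; }

private:
    std::map<std::string, CompiledKernel> cache_;
    std::size_t compile_count_ = 0;
};

// Made-up shapes for illustration: four requests, three of them distinct.
std::vector<Shape> example_shapes() {
    return {{1, 320, 64, 64}, {1, 640, 32, 32}, {1, 1280, 16, 16}, {1, 320, 64, 64}};
}

// Runs every shape through a fresh cache and reports how many compiles occurred.
template <typename Cache>
std::size_t compiles_for(const std::vector<Shape>& shapes) {
    Cache cache;
    for (const auto& s : shapes) {
        cache.get("conv_1x1", s);
    }
    return cache.compile_count();
}
```

With these example shapes, `compiles_for<StaticShapeCache>(example_shapes())` counts three compiles and `compiles_for<ShapeAgnosticCache>(example_shapes())` counts one. In this toy model the saving grows with the number of distinct convolution shapes a network presents.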

Tickets:

  • 143317

@hyunback hyunback added the category: GPU (OpenVINO GPU plugin) and WIP (work in progress) labels Jun 14, 2024
@hyunback hyunback requested review from a team as code owners June 14, 2024 05:47
@hyunback hyunback force-pushed the sa_conv_fsv16_poc branch 3 times, most recently from 36192b8 to d5da6d5, June 20, 2024 00:28
@hyunback hyunback force-pushed the sa_conv_fsv16_poc branch from d5da6d5 to 01e85b5, June 20, 2024 01:03
hyunback added 2 commits June 21, 2024 21:22
Signed-off-by: hyunback <[email protected]>
Signed-off-by: hyunback <[email protected]>
@hyunback hyunback force-pushed the sa_conv_fsv16_poc branch 2 times, most recently from 1100ab5 to f3fa3d3, June 21, 2024 14:05
hyunback added 3 commits June 25, 2024 19:15
Signed-off-by: hyunback <[email protected]>
Signed-off-by: hyunback <[email protected]>
@hyunback hyunback removed the WIP (work in progress) label Jun 26, 2024
Contributor

@e-ddykim e-ddykim left a comment

Looks good to me

Signed-off-by: hyunback <[email protected]>
Comment on lines 443 to 445
kd.internalBufferSizes.clear();
kd.internalBufferSizes.push_back(prim_params.inputs[0].PhysicalSizeInBytes());
kd.internalBufferDataType = prim_params.inputs[0].GetDType();
Contributor

Why do we need this internal buffer?

Contributor Author

Applied. Frankly, it is not needed; it was carried over from convolution_kernel_base.

@yeonbok yeonbok added this pull request to the merge queue Jun 28, 2024
Merged via the queue into openvinotoolkit:master with commit a0d195d Jun 28, 2024
AsyaPronina pushed a commit to AsyaPronina/openvino that referenced this pull request Jul 1, 2024
### Details:
- Stable Diffusion in dpas has bad first inference latency because all
onednn convolutions are compiled at first inference. We can resolve this
bottleneck with shape agnostic kernel. Target kernel is
convolution_fsv16_1x1


### Tickets:
 - *143317*

---------

Signed-off-by: hyunback <[email protected]>
@hyunback hyunback deleted the sa_conv_fsv16_poc branch April 8, 2025 08:36
Labels

category: GPU (OpenVINO GPU plugin)

4 participants