[CPU] enable f16 inference precision #16500
Conversation
Signed-off-by: HU Yuan2 <[email protected]>
This PR will be closed in 2 weeks in case of no activity.
Signed-off-by: HU Yuan2 <[email protected]>
@usstq can we go on with the PR review?
Is this PR ready for review? Thanks!
@wenjiew Yes, I think it's ready!
@luo-cheng2021 @tiger100256-hu Could you please review? Thanks!
luo-cheng2021
left a comment
LGTM. We'd better link the ticket about adding test cases to cover the FP16 functionality.
I found some performance regression after rebasing; debugging...
        uni_vpslld(vmm_src, vmm_src, 16);
        break;
    case Precision::FP16:
        assert(mayiuse(x64::avx512_core_fp16));
BTW, from my understanding, conversion instructions like vcvtph2ps don't actually require the avx512_core_fp16 ISA, just F16C + AVX512VL/AVX512F (or plain AVX2), which is available on all modern Intel CPUs. Given that, we can relax the ISA limitation for all operations that use only FP32<->FP16 conversion and keep the real math in FP32 (like Eltwise, MVN, Interpolate, etc.). By doing that we can enable FP16 tests for such layers in precommit already now.
What do you think? Sounds worth adding in a follow-up PR.
OK, let me remove these asserts; tests can be added in a follow-up PR.
BTW, one exception is vcvtss2sh/vcvtsh2ss used in load_scalar/store_scalar, which require AVX512-FP16. Can we change them to use vcvtps2ph/vcvtph2ps instead? That may pollute the higher bits in xmm_src; is it safe?
For Eltwise it is safe.
Not sure about the load/store emitters. Would ask @chenhu-wang to comment.
OK, that's enough; the load/store emitters actually already use the vector versions vcvtps2ph/vcvtph2ps with a mask to handle variable-length load/store.
You are still using the avx512_core_fp16 check which, from my understanding, is available on SPR only. In other words, avx512_core_fp16 is not equal to f16c + avx512f + avx512vl. So to enable single-layer tests in precommit we need to relax the ISA limitation.
Yes, indeed, some avx512_core_fp16 checks are still there; sorry I didn't remove them completely. We can do that in the single-layer tests PR.
Just a reminder that SSE4 does not support f16<->f32 convert instructions. Instead of reporting exceptions or asserting, there should be some alignment to fall back to f32, either at the property (precision hint) reading stage or in the createSupportedPrimitive() stage of the nodes.
@usstq Could you please create a PR with the corresponding changes in the oneDNN fork? I would also ask you to have only one commit for FP16 enabling there.
@usstq We also need to create a ticket for FP16 single-layer tests (for Convolutions, Matmuls, etc.) enabling activities once GNR becomes available.
@usstq Please also check the binary size impact. We need to understand the change caused by the FP16 instances.
OK, will do.
Done: openvinotoolkit/oneDNN#197
libopenvino_intel_cpu_plugin.so has increased from 47813192 to 48566856 bytes, i.e. by 736KB, a relative increase of 1.58%.
@dmitry-gorokhov I have fixed the fp16 brgconv issue and validated it locally, and the recent review comments have also been addressed; please review again. Thanks!
@dmitry-gorokhov I just found that the MHA node will throw an exception when FP16 is enforced. Should I change its behaviour to fall back to FP32 automatically?
Yes. MHA behavior should be updated.
@dmitry-gorokhov MHA's behavior is changed and the avx512_fp16 assertions are completely removed. I validated the following models on a local machine using FP16 inference precision and found no regression in accuracy & performance:
dmitry-gorokhov
left a comment
Merging the PR as the major functionality is completed.
There are two follow-ups:
- Enable FP16 single-layer tests on HW with AVX512 support.
- Clarify the lacking functionality with oneDNN.
OK, no problem, will follow up on these tasks.
### Details:
- *#16500 (comment)*
- *add test case for conv dconv fullconnect matmul mvn pad pooling subgraph softmax*

### Tickets:
- *CVS-110112*

Signed-off-by: HU Yuan2 <[email protected]>
Details:
- replace enforceBF16 in config with inferencePrecision
- FP16 support in jit_convert_truncation_emitter, jit_convert_saturation_emitter, jit_load_emitter & jit_store_emitter
- oneDNN fork PR: openvinotoolkit/oneDNN#197

Tickets: