Commit 55908c4

dmitrygo committed
[CPU] FullyConnected acceleration for BF16 compressed weights
1 parent 0d95325 commit 55908c4

2 files changed: +2 -2 lines changed

src/plugins/intel_cpu/src/nodes/executors/dnnl/dnnl_fullyconnected_primitive.cpp

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@ bool DnnlFCPrimitive::useWeightsDecompressionImpl(const ov::element::Type inputT
             // f16c kernel saves memory footprint with additional decompression computational overhead
             // which is only meaningful on LLM with small batch-size.
             // TODO: fall-back to use f32 weights on large batch-size
-            if (inputType == f32 && weightsType == f16)
+            if (inputType == f32 && one_of(weightsType, f16, bf16))
                 return true;
         }
     }
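
The changed condition can be illustrated in isolation. The following is a minimal sketch, not the plugin's code: `ElemType` is a hypothetical stand-in for `ov::element::Type`, the `one_of` helper is a simplified re-implementation modeled only on how it is called in the diff, and `acceptsCompressedFloatWeights` is an invented name covering just the one changed line. The real `useWeightsDecompressionImpl` contains further checks that the hunk does not show.

```cpp
#include <cassert>

// Hypothetical stand-in for ov::element::Type (illustration only).
enum class ElemType { f32, f16, bf16, i8, u8 };

// Simplified membership helper, modeled on the call in the diff.
// Assumption: this is NOT the plugin's actual one_of implementation.
template <typename T, typename... Args>
constexpr bool one_of(T val, Args... candidates) {
    // C++17 fold expression: true if val equals any candidate.
    return ((val == candidates) || ...);
}

// Sketch of the changed condition only (hypothetical function name).
// Before the commit, only f16 weights passed; after it, bf16 passes too.
constexpr bool acceptsCompressedFloatWeights(ElemType inputType, ElemType weightsType) {
    return inputType == ElemType::f32 &&
           one_of(weightsType, ElemType::f16, ElemType::bf16);
}

// Compile-time checks of the behavior described above.
static_assert(acceptsCompressedFloatWeights(ElemType::f32, ElemType::f16));
static_assert(acceptsCompressedFloatWeights(ElemType::f32, ElemType::bf16));
static_assert(!acceptsCompressedFloatWeights(ElemType::f32, ElemType::i8));
```

In this sketch, an f32 input with bf16 weights now takes the same branch that f16 weights already took, while other combinations are left to whatever checks surround the condition in the real function.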