s390x: widening multiplication does not optimize #129705

folkertdev · 2025-03-04T13:34:07Z

This LLVM IR:

define range(i32 0, -131070) <4 x i32> @manual_mule(<8 x i16> %a, <8 x i16> %b) unnamed_addr {
start:
  %0 = shufflevector <8 x i16> %a, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %1 = zext <4 x i16> %0 to <4 x i32>
  %2 = shufflevector <8 x i16> %b, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %3 = zext <4 x i16> %2 to <4 x i32>
  %4 = mul nuw <4 x i32> %3, %1
  ret <4 x i32> %4
}

does not optimize to the expected output of vec_mule, a single vmleh instruction. The same is true for the other multiplication flavors (low, high, odd).
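For reference, the computation in the IR above can be written as a scalar model. This is only a sketch: `mule_u16_model` is a hypothetical helper name, not part of stdarch or LLVM, and it models the unsigned 16-bit case shown in the IR rather than the intrinsic itself. The lanes are numbered as in the `shufflevector` mask (0, 2, 4, 6).

```rust
/// Scalar reference model of the IR above.
/// Hypothetical helper for illustration; not a stdarch or LLVM API.
fn mule_u16_model(a: [u16; 8], b: [u16; 8]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        // Take the even-indexed lanes (0, 2, 4, 6), zero-extend them to
        // 32 bits, then multiply. The product of two u16 values is at most
        // 65535 * 65535 = 4294836225, so it always fits in a u32.
        out[i] = u32::from(a[2 * i]) * u32::from(b[2 * i]);
    }
    out
}
```

The bound in the comment matches the IR: the `mul` can carry `nuw` because the product never wraps, and the return range `range(i32 0, -131070)` has an exclusive upper bound of 2^32 - 131070 = 65535^2 + 1.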

folkertdev · 2025-03-04T16:50:13Z

By extension, I think the widening multiplication should also merge into a subsequent addition, so that vec_mule(a, b) + c compiles to the same code as vec_meadd(a, b, c). That doesn't seem to be happening today either.

https://godbolt.org/z/n5cvaqfb5
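As a rough sketch of what the fused form is expected to compute, here is a scalar model of the even multiply followed by an addition. `meadd_u16_model` is a hypothetical name used only for illustration. The model covers the unsigned 16-bit case and assumes the addition wraps modulo 2^32, the same as a plain vector `add` in LLVM IR.

```rust
/// Scalar model of "widening even multiply, then add".
/// Hypothetical helper for illustration; not a stdarch or LLVM API.
/// Assumption: the addition wraps modulo 2^32.
fn meadd_u16_model(a: [u16; 8], b: [u16; 8], c: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        // Widening multiply of the even-indexed lanes; this cannot overflow.
        let product = u32::from(a[2 * i]) * u32::from(b[2 * i]);
        // The addition may exceed 32 bits, so wrap explicitly.
        out[i] = product.wrapping_add(c[i]);
    }
    out
}
```

The product is exact, but the sum can exceed 32 bits, so the model uses `wrapping_add` rather than `+`, which would panic on overflow in a debug build.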

…ation

Detect (non-intrinsic) IR patterns corresponding to the semantics of the various widening and high-word multiplication instructions. Specifically, this is done by:

- Recognizing even/odd widening multiplication patterns in DAGCombine
- Recognizing widening multiply-and-add on top during ISel
- Implementing the standard MULHS/MULHU IR opcodes
- Detecting high-word multiply-and-add (which common code does not)

Depending on architecture level, this can support all integer vector types as well as the scalar i128 type.

Fixes: llvm/llvm-project#129705

llvmbot added the new issue label Mar 4, 2025

frederick-vs-ja added llvm:instcombine missed-optimization and removed new issue labels Mar 4, 2025

uweigand added the backend:SystemZ label Mar 4, 2025

folkertdev mentioned this issue Mar 4, 2025

s390x: another batch of intrinsics rust-lang/stdarch#1738

Merged

uweigand closed this as completed in cdc7864 Mar 15, 2025

EugeneZelenko removed the llvm:instcombine label Mar 15, 2025
