reassociate multiplies with fast-math #22142

rotateright · 2014-12-05T21:50:55Z


Bugzilla Link	21768
Resolution	FIXED
Resolved on	Dec 10, 2017 09:38
Version	trunk
OS	All
Blocks	#34959
CC	@hfinkel,@RKSimon,@sanjoy

Extended Description

define float @foo(float %f0, float %f1, float %f2, float %f3) {
  %mul0 = fmul fast float %f1, %f0
  %mul1 = fmul fast float %mul0, %f2
  %mul2 = fmul fast float %mul1, %f3
  ret float %mul2
}

This should be optimized to:

define float @foo(float %f0, float %f1, float %f2, float %f3) {
  %mul0 = fmul fast float %f0, %f1
  %mul1 = fmul fast float %f2, %f3
  %mul2 = fmul fast float %mul0, %mul1
  ret float %mul2
}

Ie, instead of 3 dependent multiplies, we should have 2 independent fmuls followed by the fmul of those results.

This can be generalized for N fmuls to form the optimal binary tree of independent ops.

rotateright · 2014-12-05T23:41:29Z

We apparently don't do this for integers either:

define i32 @foo(i32 %f0, i32 %f1, i32 %f2, i32 %f3) #0 {
  %mul = mul i32 %f1, %f0
  %mul1 = mul i32 %mul, %f2
  %mul2 = mul i32 %mul1, %f3
  ret i32 %mul2
}

Also, related bug 17305 - looks like we're not reassociating any math ops like this.

rotateright · 2015-07-29T23:11:21Z

We don't do this optimization in IR because it can increase register pressure, but fast-math FP multiplies on x86 are now reassociated by the machine combiner pass:

http://llvm.org/viewvc/llvm-project?view=revision&revision=239486
http://llvm.org/viewvc/llvm-project?view=revision&revision=240361
http://llvm.org/viewvc/llvm-project?view=revision&revision=241752
http://llvm.org/viewvc/llvm-project?view=revision&revision=241873

As with bug 17305, we don't produce the optimal binary tree for larger cases, because it could make compile time explode, but this is probably as good as it gets.

I'll leave this open until integer multiplies are implemented too.

rotateright · 2015-07-31T18:54:01Z

x86 reg-reg integer multiplies can now be reassociated:
http://llvm.org/viewvc/llvm-project?view=revision&revision=243756

There's no reason that the reassociation logic can't be hoisted/extended to other architectures, so I hope this optimization will eventually make it to other targets. There's a 'TODO' comment about this in the code.

RKSimon · 2021-11-27T00:04:53Z

mentioned in issue #34959

RKSimon mentioned this issue Dec 10, 2017

[meta] Missed fast-math optimizations #34959

Open

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reassociate multiplies with fast-math #22142

reassociate multiplies with fast-math #22142

rotateright commented Dec 5, 2014 •

edited by RKSimon

Loading

rotateright commented Dec 5, 2014 •

edited by RKSimon

Loading

rotateright commented Jul 29, 2015

rotateright commented Jul 31, 2015

RKSimon commented Nov 27, 2021

reassociate multiplies with fast-math #22142

reassociate multiplies with fast-math #22142

Comments

rotateright commented Dec 5, 2014 • edited by RKSimon Loading

Extended Description

rotateright commented Dec 5, 2014 • edited by RKSimon Loading

rotateright commented Jul 29, 2015

rotateright commented Jul 31, 2015

RKSimon commented Nov 27, 2021

rotateright commented Dec 5, 2014 •

edited by RKSimon

Loading

rotateright commented Dec 5, 2014 •

edited by RKSimon

Loading