[TailRecElim] TRE fails to optimize `shl` accumulators (phase ordering issue with `InstCombine`)

While LLVM can optimize recursive functions with accumulators in the presence of commutative or associative instructions, such as ([compiler explorer -O1](https://godbolt.org/z/sPT1TMc6Y)):
```c
int f(int x) {
    if (x == 1) return 1;
    return f(x-1) * 3;
}
```

The Tail Recursion Elimination (TRE) pass fails to transform recursive functions into loops when the accumulator operation is a left shift (`shl`). This is particularly problematic because `InstCombine` often strength-reduces multiplications into `shl`, effectively blocking TRE from optimizing the function: a phase-ordering problem.
Even with -O3, the following function remains recursive, whereas GCC successfully transforms it ([compiler explorer -O3](https://godbolt.org/z/ee4fM597K)):
```c
int f(int x) {
    if (x == 1) return 1;
    return f(x-1) * 2; // --> %mul = shl nsw i32 %call, 1
}
```
While `shl` is neither associative nor commutative, a chain of shifts by a constant amount C is equivalent to a single shift by the sum of the amounts: $$(\dots (Base \ll C) \ll C) \dots \ll C = Base \ll (C \times Iterations)$$ (say a pseudo-associative relation $$(Base \ll C) \ll C  = Base \ll (C + C)$$?)
This is subject to overflow, but the transformation is still valid: signed integer overflow is UB in C and it is lowered to `shl nsw`, for instance.

**Improvements**
TRE should be extended to support `shl` (and potentially `lshr`/`ashr`) when the shift amount is an invariant constant. The transformation can initialize the accumulator with the base case value (more base cases could be handled, as in the link above) and then apply the current logic as is.
Of course, this fix would improve the robustness of TRE against common canonicalizations performed by, for instance, `InstCombine`, but it requires a little bit of compilation time.

Am I missing something? Is there a fundamental reason to limit TRE to associative/commutative instructions?

https://github.com/llvm/llvm-project/blob/05e908609227e1e8d993659e604a63668dfd2825/llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp#L377-L379

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TailRecElim] TRE fails to optimize `shl` accumulators (phase ordering issue with `InstCombine`) #178805

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	static bool canTransformAccumulatorRecursion(Instruction I, CallInst CI) {
	if (!I->isAssociative() \|\| !I->isCommutative())
	return false;

[TailRecElim] TRE fails to optimize shl accumulators (phase ordering issue with InstCombine) #178805

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

[TailRecElim] TRE fails to optimize `shl` accumulators (phase ordering issue with `InstCombine`) #178805