-
Notifications
You must be signed in to change notification settings - Fork 15.9k
Description
While LLVM can optimize recursive functions with accumulators in the presence of commutative or associative instructions, such as (compiler explorer -O1):
int f(int x) {
if (x == 1) return 1;
return f(x-1) * 3;
}The Tail Recursion Elimination (TRE) pass fails to transform recursive functions into loops when the accumulator operation is a left shift (shl). This is particularly problematic because InstCombine often strength-reduces multiplications into shl, effectively blocking TRE from optimizing the function: a phase-ordering problem.
Even with -O3, the following function remains recursive, whereas GCC successfully transforms it (compiler explorer -O3):
int f(int x) {
if (x == 1) return 1;
return f(x-1) * 2; // --> %mul = shl nsw i32 %call, 1
}While shl is neither associative nor commutative, a chain of shifts by a constant amount C is equivalent to a single shift by the sum of the amounts:
This is subject to overflow, but the transformation is still valid: signed integer overflow is UB in C and it is lowered to shl nsw, for instance.
Improvements
TRE should be extended to support shl (and potentially lshr/ashr) when the shift amount is an invariant constant. The transformation can initialize the accumulator with the base case value (more base cases could be handled, as in the link above) and then apply the current logic as is.
Of course, this fix would improve the robustness of TRE against common canonicalizations performed by, for instance, InstCombine, but it requires a little bit of compilation time.
Am I missing something? Is there a fundamental reason to limit TRE to associative/commutative instructions?
llvm-project/llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
Lines 377 to 379 in 05e9086
| static bool canTransformAccumulatorRecursion(Instruction *I, CallInst *CI) { | |
| if (!I->isAssociative() || !I->isCommutative()) | |
| return false; |