Commit 5b657f5

AMDGPU: Move LICM after AMDGPUCodeGenPrepare
The commit that added this LICM run says it is there to hoist the uniform parts of the integer division expansion. That expansion is performed later, in AMDGPUCodeGenPrepare, so the run did nothing in that case. Move LICM after AMDGPUCodeGenPrepare so the original test shows the improvement. This also saves a run of "Canonicalize natural loops".

Not sure why this still appears to get a separate loop pass manager run. It also feels a bit heavy to run LICM just for divide. Is there a way to specifically hoist the divide sequence when it expands?
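For readers unfamiliar with what "hoisting the uniform parts" of a division buys, the following is a minimal sketch in plain C++, not the actual AMDGPU expansion. It assumes a nonzero 32-bit unsigned divisor that is constant across the loop, and the function names are hypothetical. The reciprocal depends only on the divisor, so it can be computed once outside the loop; this is the kind of loop-invariant work LICM can only move after the division has been expanded into its parts.

```cpp
#include <cstdint>
#include <vector>

// Naive form: the division is opaque, so nothing inside it can be hoisted.
std::vector<uint32_t> divide_all_naive(const std::vector<uint32_t> &values,
                                       uint32_t d) {
  std::vector<uint32_t> out;
  out.reserve(values.size());
  for (uint32_t n : values)
    out.push_back(n / d);
  return out;
}

// Expanded form with the invariant part hoisted by hand.
// Assumption for this sketch: d != 0.
std::vector<uint32_t> divide_all_hoisted(const std::vector<uint32_t> &values,
                                         uint32_t d) {
  // Loop-invariant: floor(2^32 / d) depends only on the divisor.
  const uint64_t recip = (uint64_t{1} << 32) / d;

  std::vector<uint32_t> out;
  out.reserve(values.size());
  for (uint32_t n : values) {
    // Per-iteration part: the estimate is either the quotient or one below it.
    uint32_t q = static_cast<uint32_t>((uint64_t{n} * recip) >> 32);
    uint32_t r = n - q * d;
    if (r >= d)
      ++q; // single correction step
    out.push_back(q);
  }
  return out;
}
```

Both functions return the same quotients; the second simply makes visible the split between per-divisor and per-element work that the expansion exposes.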
1 parent 8fdedcd commit 5b657f5

9 files changed, +438 -534 lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp

+5-1
@@ -967,7 +967,6 @@ void AMDGPUPassConfig::addEarlyCSEOrGVNPass() {
 }

 void AMDGPUPassConfig::addStraightLineScalarOptimizationPasses() {
-  addPass(createLICMPass());
   addPass(createSeparateConstOffsetFromGEPPass());
   // ReassociateGEPs exposes more opportunities for SLSR. See
   // the example in reassociate-geps-and-slsr.ll.
@@ -1039,6 +1038,11 @@ void AMDGPUPassConfig::addIRPasses() {
       // TODO: May want to move later or split into an early and late one.
       addPass(createAMDGPUCodeGenPreparePass());
     }
+
+    // Try to hoist loop invariant parts of divisions AMDGPUCodeGenPrepare may
+    // have expanded.
+    if (TM.getOptLevel() > CodeGenOpt::Less)
+      addPass(createLICMPass());
   }

   TargetPassConfig::addIRPasses();
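
The ordering and gating in the second hunk can be modeled with a toy pipeline builder. This is only an illustrative sketch: `OptLevel` is a local stand-in for LLVM's `CodeGenOpt::Level`, `buildIRPasses` is a hypothetical helper, the strings are just labels, and the surrounding conditions from the real `addIRPasses()` are omitted.

```cpp
#include <string>
#include <vector>

// Local stand-in for CodeGenOpt::Level (None < Less < Default < Aggressive).
enum class OptLevel { None, Less, Default, Aggressive };

// Hypothetical helper: returns pass labels in the order they would be added.
std::vector<std::string> buildIRPasses(OptLevel level) {
  std::vector<std::string> passes;

  // The division expansion happens here, so it must come first.
  passes.push_back("amdgpu-codegenprepare");

  // LICM runs afterwards, and only above the "Less" optimization level,
  // mirroring the TM.getOptLevel() > CodeGenOpt::Less check in the diff.
  if (level > OptLevel::Less)
    passes.push_back("licm");

  return passes;
}
```

With this model, `OptLevel::Less` yields only the prepare step, while `OptLevel::Default` and above append LICM after it.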
