Commit 25f9d5b

author
wilco
committed
[AArch64] Use more LDP/STP in shrinkwrapping
The shrinkwrap optimization added in GCC 7 allows each callee-save to be
delayed and done only across blocks which need a particular callee-save.
Although this reduces unnecessary memory traffic on code paths that need
few callee-saves, it typically uses LDR/STR rather than LDP/STP. This
means more memory accesses and increased codesize, ~1.0% on average.

To improve this, if a particular callee-save must be saved/restored, also
add the adjacent callee-save to allow use of LDP/STP. This significantly
reduces codesize (for example gcc_r, povray_r, parest_r, xalancbmk_r are
1% smaller). This is a simple fix which can be backported. A more advanced
approach would scan blocks for pairs of callee-saves, but that requires a
full rewrite of all the callee-save code which is too late at this stage.

An example epilog in a shrinkwrapped function before:

        ldp     x21, x22, [sp,#16]
        ldr     x23, [sp,#32]
        ldr     x24, [sp,#40]
        ldp     x25, x26, [sp,#48]
        ldr     x27, [sp,#64]
        ldr     x28, [sp,#72]
        ldr     x30, [sp,#80]
        ldr     d8, [sp,#88]
        ldp     x19, x20, [sp],#96
        ret

And after this patch:

        ldr     d8, [sp,#88]
        ldp     x21, x22, [sp,#16]
        ldp     x23, x24, [sp,#32]
        ldp     x25, x26, [sp,#48]
        ldp     x27, x28, [sp,#64]
        ldr     x30, [sp,#80]
        ldp     x19, x20, [sp],#96
        ret

gcc/
        * config/aarch64/aarch64.c (aarch64_components_for_bb):
        Increase LDP/STP opportunities by adding adjacent callee-saves.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@257482 138bc75d-0d04-0410-961f-82ee72b054a4
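As a rough check of the code-size effect, the two epilogues quoted in the commit message can be tallied directly. The sketch below is illustrative only: the array names and the helpers `count_loads` and `code_bytes` are made up for this example and are not part of GCC. It relies on every A64 instruction being 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Epilogue quoted in the commit message, before the patch. */
static const char *const epilogue_before[] = {
    "ldp x21, x22, [sp,#16]", "ldr x23, [sp,#32]", "ldr x24, [sp,#40]",
    "ldp x25, x26, [sp,#48]", "ldr x27, [sp,#64]", "ldr x28, [sp,#72]",
    "ldr x30, [sp,#80]",      "ldr d8, [sp,#88]",
    "ldp x19, x20, [sp],#96", "ret",
};

/* Epilogue quoted in the commit message, after the patch. */
static const char *const epilogue_after[] = {
    "ldr d8, [sp,#88]",       "ldp x21, x22, [sp,#16]",
    "ldp x23, x24, [sp,#32]", "ldp x25, x26, [sp,#48]",
    "ldp x27, x28, [sp,#64]", "ldr x30, [sp,#80]",
    "ldp x19, x20, [sp],#96", "ret",
};

#define ARRAY_LEN(a) (sizeof (a) / sizeof ((a)[0]))

/* Count load instructions (LDR or LDP) in a listing. */
static size_t count_loads(const char *const *insns, size_t n)
{
    assert(insns != NULL);
    size_t loads = 0;
    for (size_t i = 0; i < n; i++)
        if (strncmp(insns[i], "ld", 2) == 0)
            loads++;
    return loads;
}

/* Every A64 instruction is 4 bytes, so code size is 4 bytes per instruction. */
static size_t code_bytes(size_t n_insns)
{
    return n_insns * 4;
}
```

For these two listings the tally is 9 load instructions before and 7 after, so this particular epilogue shrinks by two instructions (8 bytes) while restoring the same set of registers.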
1 parent 4327f0d commit 25f9d5b

2 files changed: 21 additions, 1 deletion


gcc/ChangeLog

Lines changed: 5 additions & 0 deletions
@@ -1,3 +1,8 @@
+2018-02-08  Wilco Dijkstra  <[email protected]>
+
+        * config/aarch64/aarch64.c (aarch64_components_for_bb):
+        Increase LDP/STP opportunities by adding adjacent callee-saves.
+
 2018-02-08  Wilco Dijkstra  <[email protected]>
 
         PR rtl-optimization/84068

gcc/config/aarch64/aarch64.c

Lines changed: 16 additions & 1 deletion
@@ -4552,7 +4552,22 @@ aarch64_components_for_bb (basic_block bb)
           && (bitmap_bit_p (in, regno)
               || bitmap_bit_p (gen, regno)
               || bitmap_bit_p (kill, regno)))
-        bitmap_set_bit (components, regno);
+        {
+          unsigned regno2, offset, offset2;
+          bitmap_set_bit (components, regno);
+
+          /* If there is a callee-save at an adjacent offset, add it too
+             to increase the use of LDP/STP.  */
+          offset = cfun->machine->frame.reg_offset[regno];
+          regno2 = ((offset & 8) == 0) ? regno + 1 : regno - 1;
+
+          if (regno2 <= LAST_SAVED_REGNUM)
+            {
+              offset2 = cfun->machine->frame.reg_offset[regno2];
+              if ((offset & ~8) == (offset2 & ~8))
+                bitmap_set_bit (components, regno2);
+            }
+        }
 
   return components;
 }
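
The pairing test in this hunk can be tried outside the compiler. The following is a minimal standalone sketch, not GCC code: the `reg_offset` table, the `NOT_SAVED` sentinel, `LAST_SAVED`, the register numbering (x0-x30 as 0-30, d8 as 40) and the helpers `init_example_frame` and `pair_partner` are all assumptions made for illustration. The table is filled to match the frame layout of the example epilogue in the commit message.

```c
#include <assert.h>

#define NUM_REGS   64    /* hypothetical register file size */
#define LAST_SAVED 63    /* hypothetical stand-in for LAST_SAVED_REGNUM */
#define NOT_SAVED  (-1)  /* hypothetical marker: register has no save slot */

static int reg_offset[NUM_REGS];

/* Frame layout of the example epilogue: x19..x28 at 0..72,
   x30 at 80, d8 at 88.  Everything else is not saved.  */
static void init_example_frame(void)
{
    for (int i = 0; i < NUM_REGS; i++)
        reg_offset[i] = NOT_SAVED;
    for (int r = 19; r <= 28; r++)
        reg_offset[r] = (r - 19) * 8;
    reg_offset[30] = 80;  /* x30 */
    reg_offset[40] = 88;  /* d8, assuming vector registers start at 32 */
}

/* Return the register that shares regno's 16-byte slot, or -1 if the
   neighbouring register is not saved next to it.  */
static int pair_partner(unsigned regno)
{
    assert(regno <= LAST_SAVED);

    unsigned offset = (unsigned) reg_offset[regno];

    /* A slot in the low half of a 16-byte pair is partnered with the
       next register, a slot in the high half with the previous one.
       For regno 0 in a high half, regno - 1 wraps to UINT_MAX and is
       rejected by the range check below.  */
    unsigned regno2 = ((offset & 8) == 0) ? regno + 1 : regno - 1;
    if (regno2 > LAST_SAVED)
        return -1;

    /* Clearing bit 3 maps both halves of a pair to the same base.  */
    unsigned offset2 = (unsigned) reg_offset[regno2];
    if ((offset & ~8u) == (offset2 & ~8u))
        return (int) regno2;
    return -1;
}
```

With this layout, `pair_partner(23)` yields 24 and `pair_partner(24)` yields 23, so a block that needs either of x23 or x24 ends up with both components and can use a single LDP/STP. `pair_partner(30)` and `pair_partner(40)` yield -1 because the neighbouring register numbers have no save slot, which is consistent with x30 and d8 still being restored by separate LDR instructions in the "after" epilogue.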
