Commit 5f6a6c9

Author: Richard Earnshaw
gimple: allow more folding of memcpy [PR102125]
The current restriction on folding memcpy to a single element of size
MOVE_MAX is excessively cautious on most machines and limits some
significant further optimizations.  So relax the restriction provided
the copy size does not exceed MOVE_MAX * MOVE_RATIO and that a SET
insn exists for moving the value into machine registers.

Note that there were already checks in place for having misaligned
move operations when one or more of the operands were unaligned.

On Arm this now permits optimizing

    uint64_t bar64(const uint8_t *rData1)
    {
        uint64_t buffer;
        memcpy(&buffer, rData1, sizeof(buffer));
        return buffer;
    }

from

        ldr     r2, [r0]        @ unaligned
        sub     sp, sp, #8
        ldr     r3, [r0, #4]    @ unaligned
        strd    r2, [sp]
        ldrd    r0, [sp]
        add     sp, sp, #8

to

        mov     r3, r0
        ldr     r0, [r0]        @ unaligned
        ldr     r1, [r3, #4]    @ unaligned

PR target/102125 - (ARM Cortex-M3 and newer) missed optimization. memcpy not needed operations

gcc/ChangeLog:

        PR target/102125
        * gimple-fold.c (gimple_fold_builtin_memory_op): Allow folding
        memcpy if the size is not more than MOVE_MAX * MOVE_RATIO.
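To try this out, the example above can be turned into a self-contained
translation unit (a reproduction sketch of mine, not part of the commit;
the toolchain and flags below are assumptions):

    /* repro.c - the bar64 example from the commit message, with the
       headers it needs to build standalone.  */
    #include <stdint.h>
    #include <string.h>

    uint64_t bar64 (const uint8_t *rData1)
    {
      uint64_t buffer;
      /* With this patch GCC folds this fixed-size 8-byte memcpy into
         direct (possibly unaligned) loads instead of a stack round trip.  */
      memcpy (&buffer, rData1, sizeof (buffer));
      return buffer;
    }

Compiling with something like "arm-none-eabi-gcc -O2 -mcpu=cortex-m3 -S
repro.c" and inspecting the output should show the shorter instruction
sequence quoted above rather than the stack spill.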
Parent: f0cfd07

1 file changed (+11, -5)

gcc/gimple-fold.c

@@ -67,6 +67,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vector-builder.h"
 #include "tree-ssa-strlen.h"
 #include "varasm.h"
+#include "memmodel.h"
+#include "optabs.h"
 
 enum strlen_range_kind {
   /* Compute the exact constant string length.  */
@@ -957,14 +959,17 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	= build_int_cst (build_pointer_type_for_mode (char_type_node,
 						      ptr_mode, true), 0);
 
-      /* If we can perform the copy efficiently with first doing all loads
-	 and then all stores inline it that way.  Currently efficiently
-	 means that we can load all the memory into a single integer
-	 register which is what MOVE_MAX gives us.  */
+      /* If we can perform the copy efficiently with first doing all loads and
+	 then all stores inline it that way.  Currently efficiently means that
+	 we can load all the memory with a single set operation and that the
+	 total size is less than MOVE_MAX * MOVE_RATIO.  */
       src_align = get_pointer_alignment (src);
       dest_align = get_pointer_alignment (dest);
       if (tree_fits_uhwi_p (len)
-	  && compare_tree_int (len, MOVE_MAX) <= 0
+	  && (compare_tree_int
+	      (len, (MOVE_MAX
+		     * MOVE_RATIO (optimize_function_for_size_p (cfun))))
+	      <= 0)
 	  /* FIXME: Don't transform copies from strings with known length.
 	     Until GCC 9 this prevented a case in gcc.dg/strlenopt-8.c
 	     from being handled, and the case was XFAILed for that reason.
@@ -1000,6 +1005,7 @@ gimple_fold_builtin_memory_op (gimple_stmt_iterator *gsi,
 	  if (type
 	      && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
 	      && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
+	      && have_insn_for (SET, mode)
 	      /* If the destination pointer is not aligned we must be able
 		 to emit an unaligned store.  */
 	      && (dest_align >= GET_MODE_ALIGNMENT (mode)
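
Read together, the hunks above mean the fold is now guarded roughly as
follows (a simplified paraphrase of gimple-fold.c after the patch, with
the intervening alignment and string-length checks elided; the new
optabs.h include supplies have_insn_for):

    /* Sketch: fold memcpy to a single load/store pair only if ...  */
    if (tree_fits_uhwi_p (len)
        /* ... the length is a compile-time constant no larger than
           MOVE_MAX * MOVE_RATIO, instead of just MOVE_MAX ...  */
        && compare_tree_int (len,
                             MOVE_MAX
                             * MOVE_RATIO (optimize_function_for_size_p (cfun)))
             <= 0
        /* ... the whole copy fits exactly in one scalar integer mode ...  */
        && is_a <scalar_int_mode> (TYPE_MODE (type), &mode)
        && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
        /* ... and the target has a plain SET insn for that mode (the
           newly added have_insn_for check).  */
        && have_insn_for (SET, mode))
      /* ... then emit the load and the store directly.  */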
