aarch64: Fix overly optimistic LDP/STP matching [PR104005]
In g:526e1639aa76b0a8496b0dc3a3ff2c450229544e I'd added support for finding more consecutive MEMs. However, the check was too eager, in that it matched MEM_REFs with the same base address even if that base address was an arbitrary SSA name. This can give wrong results if a MEM_REF from one loop iteration is compared with a MEM_REF from another (e.g. after rtl unrolling). In principle, we could still accept MEM_REFs based on the same incoming SSA name, but there seems to be no out-of-the-box API for doing that. Adding a new one at this stage in GCC 12 doesn't feel like a good risk/reward trade-off. This patch therefore restricts the MEM_EXPR comparison to base decls only, excluding all MEM_REFs. It means we lose all the new STPs in the PR testcase but keep the ones in the original stp_1.c testcase. gcc/ PR target/104005 * config/aarch64/aarch64.cc (aarch64_check_consecutive_mems): When using MEM_EXPR, require the base to be a decl. gcc/testsuite/ PR target/104005 * gcc.target/aarch64/pr104005.c: New test.
This commit is contained in:
parent
d21db05b6f
commit
38ec23fafb
@ -24747,6 +24747,7 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed)
|
||||
&expr_offset2);
|
||||
if (!expr_base1
|
||||
|| !expr_base2
|
||||
|| !DECL_P (expr_base1)
|
||||
|| !operand_equal_p (expr_base1, expr_base2, OEP_ADDRESS_OF))
|
||||
return false;
|
||||
|
||||
|
17
gcc/testsuite/gcc.target/aarch64/pr104005.c
Normal file
17
gcc/testsuite/gcc.target/aarch64/pr104005.c
Normal file
@ -0,0 +1,17 @@
|
||||
/* { dg-options "-O2 -funroll-loops" } */
|
||||
|
||||
typedef int v2 __attribute__((vector_size(8)));
|
||||
|
||||
void f(void) {
|
||||
v2 v[1024];
|
||||
v2 *ptr = v;
|
||||
for (int i = 0; i < 512; ++i)
|
||||
{
|
||||
ptr[0][0] = 0;
|
||||
asm volatile ("":::"memory");
|
||||
ptr[0][1] = 1;
|
||||
ptr += 2;
|
||||
}
|
||||
}
|
||||
|
||||
/* { dg-final { scan-assembler-not {\tstp\t} } } */
|
Loading…
Reference in New Issue
Block a user