aarch64: Fix missed shrink-wrapping opportunity

wb_candidate1 and wb_candidate2 exist for two overlapping cases:
when we use an STR or STP with writeback to allocate the frame,
and when we set up a frame chain record (either using writeback
allocation or not).

However, aarch64_layout_frame was leaving these fields with
legitimate register numbers even if we decided to do neither
of those things.  This prevented those registers from being
shrink-wrapped, even though we were otherwise treating them
as normal saves and restores.

The case this patch handles isn't the common case, so it might
not be worth going out of our way to optimise it.  But I think
the patch actually makes the output of aarch64_layout_frame more
consistent.
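
As a rough illustration of the effect described above, the following
standalone C sketch models the state involved.  It is not GCC code:
toy_frame, toy_finish_layout, toy_can_shrink_wrap and TOY_INVALID_REGNUM
are hypothetical names made up for this example, and the eligibility rule
(a register named in a wb_candidate field is excluded from shrink-wrapping)
is an assumed simplification of the behaviour the message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for INVALID_REGNUM; the value is arbitrary.  */
#define TOY_INVALID_REGNUM (~0u)

/* Toy model of the frame fields that matter here.  */
struct toy_frame
{
  bool emit_frame_chain;
  long callee_adjust;
  unsigned wb_candidate1;
  unsigned wb_candidate2;
};

/* Mirror the condition added by the patch: if no saves are tied to the
   initial stack allocation, forget the candidates.  */
void
toy_finish_layout (struct toy_frame *frame)
{
  if (!frame->emit_frame_chain && frame->callee_adjust == 0)
    {
      frame->wb_candidate1 = TOY_INVALID_REGNUM;
      frame->wb_candidate2 = TOY_INVALID_REGNUM;
    }
}

/* Assumed rule: a register still recorded as a candidate cannot be
   shrink-wrapped on its own.  */
bool
toy_can_shrink_wrap (const struct toy_frame *frame, unsigned regno)
{
  return regno != frame->wb_candidate1 && regno != frame->wb_candidate2;
}
```

In this model, a frame with no frame chain and a zero callee_adjust
that still names a register in wb_candidate1 blocks that register until
toy_finish_layout clears the field.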

2020-05-28  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64.h (aarch64_frame): Add a comment above
	wb_candidate1 and wb_candidate2.
	* config/aarch64/aarch64.c (aarch64_layout_frame): Invalidate
	wb_candidate1 and wb_candidate2 if we decided not to use them.

gcc/testsuite/
	* gcc.target/aarch64/shrink_wrap_1.c: New test.
Richard Sandiford 2020-05-28 13:18:13 +01:00
parent 1ccbfffb0f
commit 59a3d73d50
3 changed files with 44 additions and 0 deletions

gcc/config/aarch64/aarch64.c

@@ -6749,6 +6749,14 @@ aarch64_layout_frame (void)
                        + frame.sve_callee_adjust
                        + frame.final_adjust, frame.frame_size));

  if (!frame.emit_frame_chain && frame.callee_adjust == 0)
    {
      /* We've decided not to associate any register saves with the initial
         stack allocation.  */
      frame.wb_candidate1 = INVALID_REGNUM;
      frame.wb_candidate2 = INVALID_REGNUM;
    }

  frame.laid_out = true;
}

gcc/config/aarch64/aarch64.h

@@ -842,6 +842,23 @@ struct GTY (()) aarch64_frame
  /* Store FP,LR and setup a frame pointer.  */
  bool emit_frame_chain;

  /* In each frame, we can associate up to two register saves with the
     initial stack allocation.  This happens in one of two ways:

     (1) Using an STR or STP with writeback to perform the initial
         stack allocation.  When EMIT_FRAME_CHAIN, the registers will
         be those needed to create a frame chain.

         Indicated by CALLEE_ADJUST != 0.

     (2) Using a separate STP to set up the frame record, after the
         initial stack allocation but before setting up the frame pointer.
         This is used if the offset is too large to use writeback.

         Indicated by CALLEE_ADJUST == 0 && EMIT_FRAME_CHAIN.

     These fields indicate which registers we've decided to handle using
     (1) or (2), or INVALID_REGNUM if none.  */
  unsigned wb_candidate1;
  unsigned wb_candidate2;
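
The two cases in the comment above, plus the "neither" case that the
patch now marks with INVALID_REGNUM, can be sketched as a small classifier.
This is only an illustrative model: toy_wb_mode and toy_classify are
hypothetical names that do not exist in GCC, and callee_adjust is
simplified to a plain long.

```c
#include <stdbool.h>

/* Hypothetical labels for the cases described in the comment.  */
enum toy_wb_mode
{
  TOY_WB_NONE,          /* Neither case: candidates should be invalid.  */
  TOY_WB_WRITEBACK,     /* Case (1): STR/STP with writeback.  */
  TOY_WB_SEPARATE_STP   /* Case (2): separate STP for the frame record.  */
};

/* Classify a frame using the conditions stated in the comment:
   (1) CALLEE_ADJUST != 0, (2) CALLEE_ADJUST == 0 && EMIT_FRAME_CHAIN.  */
enum toy_wb_mode
toy_classify (bool emit_frame_chain, long callee_adjust)
{
  if (callee_adjust != 0)
    return TOY_WB_WRITEBACK;
  if (emit_frame_chain)
    return TOY_WB_SEPARATE_STP;
  return TOY_WB_NONE;
}
```

The TOY_WB_NONE result corresponds to the condition tested in
aarch64_layout_frame before the wb_candidate fields are invalidated.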

gcc/testsuite/gcc.target/aarch64/shrink_wrap_1.c

@@ -0,0 +1,19 @@
/* { dg-do compile { target { aarch64*-*-* } } } */
/* { dg-options "-O2" } */
/* { dg-final { check-function-bodies "**" "" } } */

/*
** foo:
**	...
**	str	d8, \[sp\]
**	ldr	d8, \[sp\]
**	...
*/
void
foo (int x)
{
  int tmp[0x1000];
  asm volatile ("" : "=m" (tmp));
  if (x == 1)
    asm volatile ("" ::: "d8");
}