common.opt (flra-remat): New.

2014-11-12  Vladimir Makarov  <vmakarov@redhat.com>

	* common.opt (flra-remat): New.
	* opts.c (default_options_table): Add entry for flra_remat.
	* timevar_def (TV_LRA_REMAT): New.
	* doc/invoke.texi (-flra-remat): Add description of the new
	option.
	* doc/passes.texi (-flra-remat): Remove lra-equivs.c and
	lra-saves.c.  Add lra-remat.c.
	* Makefile.in (OBJS): Add lra-remat.o.
	* lra-remat.c: New file.
	* lra.c: Add info about the rematerialization pass in the top
	comment.
	(collect_non_operand_hard_regs, add_regs_to_insn_regno_info):
	Process unallocatable regs too.
	(lra_constraint_new_insn_uid_start): Remove.
	(lra): Add code for calling rematerialization sub-pass.
	* lra-int.h (lra_constraint_new_insn_uid_start): Remove.
	(lra_constrain_insn, lra_remat): New prototypes.
	(lra_eliminate_regs_1): Add parameter.
	* lra-lives.c (make_hard_regno_born, make_hard_regno_dead):
	Process unallocatable hard regs too.
	(process_bb_lives): Ditto.
	* lra-spills.c (remove_pseudos): Add argument to
	lra_eliminate_regs_1 call.
	* lra-eliminations.c (lra_eliminate_regs_1): Add parameter.  Use it
	for sp offset calculation.
	(lra_eliminate_regs): Add argument for lra_eliminate_regs_1 call.
	(eliminate_regs_in_insn): Add parameter.  Use it for sp offset
	calculation.
	(process_insn_for_elimination): Add argument for
	eliminate_regs_in_insn call.
	* lra-constraints.c (get_equiv_with_elimination):  Add argument
	for lra_eliminate_regs_1 call.
	(process_addr_reg): Add parameter.  Use it.
	(process_address_1): Ditto.  Add argument for process_addr_reg
	call.
	(process_address): Ditto.
	(curr_insn_transform): Add parameter.  Use it.  Add argument for
	process_address calls.
	(lra_constrain_insn): New function.
	(lra_constraints): Add argument for curr_insn_transform call.

From-SVN: r217458
Vladimir Makarov 2014-11-13 03:02:49 +00:00 committed by Vladimir Makarov
parent 778e02fdc4
commit d9cf932c33
14 changed files with 1532 additions and 162 deletions

gcc/ChangeLog

@ -1,3 +1,46 @@
2014-11-12 Vladimir Makarov <vmakarov@redhat.com>
* common.opt (flra-remat): New.
* opts.c (default_options_table): Add entry for flra_remat.
* timevar_def (TV_LRA_REMAT): New.
* doc/invoke.texi (-flra-remat): Add description of the new
option.
* doc/passes.texi (-flra-remat): Remove lra-equivs.c and
lra-saves.c. Add lra-remat.c.
* Makefile.in (OBJS): Add lra-remat.o.
* lra-remat.c: New file.
* lra.c: Add info about the rematerialization pass in the top
comment.
(collect_non_operand_hard_regs, add_regs_to_insn_regno_info):
Process unallocatable regs too.
(lra_constraint_new_insn_uid_start): Remove.
(lra): Add code for calling rematerialization sub-pass.
* lra-int.h (lra_constraint_new_insn_uid_start): Remove.
(lra_constrain_insn, lra_remat): New prototypes.
(lra_eliminate_regs_1): Add parameter.
* lra-lives.c (make_hard_regno_born, make_hard_regno_dead):
Process unallocatable hard regs too.
(process_bb_lives): Ditto.
* lra-spills.c (remove_pseudos): Add argument to
lra_eliminate_regs_1 call.
* lra-eliminations.c (lra_eliminate_regs_1): Add parameter. Use it
for sp offset calculation.
(lra_eliminate_regs): Add argument for lra_eliminate_regs_1 call.
(eliminate_regs_in_insn): Add parameter. Use it for sp offset
calculation.
(process_insn_for_elimination): Add argument for
eliminate_regs_in_insn call.
* lra-constraints.c (get_equiv_with_elimination): Add argument
for lra_eliminate_regs_1 call.
(process_addr_reg): Add parameter. Use it.
(process_address_1): Ditto. Add argument for process_addr_reg
call.
(process_address): Ditto.
(curr_insn_transform): Add parameter. Use it. Add argument for
process_address calls.
(lra_constrain_insn): New function.
(lra_constraints): Add argument for curr_insn_transform call.
2014-11-13 Manuel López-Ibáñez <manu@gcc.gnu.org>
* opts-global.c (postpone_unknown_option_warning): Fix spelling.

gcc/Makefile.in

@ -1304,6 +1304,7 @@ OBJS = \
lra-constraints.o \
lra-eliminations.o \
lra-lives.o \
lra-remat.o \
lra-spills.o \
lto-cgraph.o \
lto-streamer.o \

gcc/common.opt

@ -1551,6 +1551,10 @@ floop-optimize
Common Ignore
Does nothing. Preserved for backward compatibility.
flra-remat
Common Report Var(flag_lra_remat) Optimization
Do CFG-sensitive rematerialization in LRA
flto
Common
Enable link-time optimization.

gcc/doc/invoke.texi

@ -392,7 +392,7 @@ Objective-C and Objective-C++ Dialects}.
-fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
-fivopts -fkeep-inline-functions -fkeep-static-consts -flive-range-shrinkage @gol
-floop-block -floop-interchange -floop-strip-mine -floop-nest-optimize @gol
-floop-parallelize-all -flto -flto-compression-level @gol
-floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
-flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants @gol
-fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
-fmove-loop-invariants -fno-branch-count-reg @gol
@ -7183,6 +7183,7 @@ also turns on the following optimization flags:
-fipa-sra @gol
-fipa-icf @gol
-fisolate-erroneous-paths-dereference @gol
-flra-remat @gol
-foptimize-sibling-calls @gol
-foptimize-strlen @gol
-fpartial-inlining @gol
@ -7811,6 +7812,14 @@ Control the verbosity of the dump file for the integrated register allocator.
The default value is 5. If the value @var{n} is greater or equal to 10,
the dump output is sent to stderr using the same format as @var{n} minus 10.
@item -flra-remat
@opindex flra-remat
Enable CFG-sensitive rematerialization in LRA. Instead of loading
values of spilled pseudos, LRA tries to rematerialize (recalculate)
values if it is profitable.
Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
@item -fdelayed-branch
@opindex fdelayed-branch
If supported for the target machine, attempt to reorder instructions

gcc/doc/passes.texi

@ -911,10 +911,10 @@ Source files are @file{reload.c} and @file{reload1.c}, plus the header
This pass is a modern replacement of the reload pass. Source files
are @file{lra.c}, @file{lra-assign.c}, @file{lra-coalesce.c},
@file{lra-constraints.c}, @file{lra-eliminations.c},
@file{lra-equivs.c}, @file{lra-lives.c}, @file{lra-saves.c},
@file{lra-spills.c}, the header @file{lra-int.h} used for
communication between them, and the header @file{lra.h} used for
communication between LRA and the rest of compiler.
@file{lra-lives.c}, @file{lra-remat.c}, @file{lra-spills.c}, the
header @file{lra-int.h} used for communication between them, and the
header @file{lra.h} used for communication between LRA and the rest of
compiler.
Unlike the reload pass, intermediate LRA decisions are reflected in
RTL as much as possible. This reduces the number of target-dependent

gcc/lra-constraints.c

@ -506,7 +506,8 @@ get_equiv_with_elimination (rtx x, rtx_insn *insn)
if (x == res || CONSTANT_P (res))
return res;
return lra_eliminate_regs_1 (insn, res, GET_MODE (res), false, false, true);
return lra_eliminate_regs_1 (insn, res, GET_MODE (res),
false, false, 0, true);
}
/* Set up curr_operand_mode. */
@ -1243,12 +1244,16 @@ static bool no_input_reloads_p, no_output_reloads_p;
insn. */
static int curr_swapped;
/* Arrange for address element *LOC to be a register of class CL.
Add any input reloads to list BEFORE. AFTER is nonnull if *LOC is an
automodified value; handle that case by adding the required output
reloads to list AFTER. Return true if the RTL was changed. */
/* If CHECK_ONLY_P is false, arrange for address element *LOC to be a
register of class CL. Add any input reloads to list BEFORE. AFTER
is nonnull if *LOC is an automodified value; handle that case by
adding the required output reloads to list AFTER. Return true if
the RTL was changed.
If CHECK_ONLY_P is true, check that *LOC is a correct address
register. Return false if the address register is correct. */
static bool
process_addr_reg (rtx *loc, rtx_insn **before, rtx_insn **after,
process_addr_reg (rtx *loc, bool check_only_p, rtx_insn **before, rtx_insn **after,
enum reg_class cl)
{
int regno;
@ -1265,6 +1270,8 @@ process_addr_reg (rtx *loc, rtx_insn **before, rtx_insn **after,
mode = GET_MODE (reg);
if (! REG_P (reg))
{
if (check_only_p)
return true;
/* Always reload memory in an address even if the target supports
such addresses. */
new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
@ -1274,7 +1281,8 @@ process_addr_reg (rtx *loc, rtx_insn **before, rtx_insn **after,
{
regno = REGNO (reg);
rclass = get_reg_class (regno);
if ((*loc = get_equiv_with_elimination (reg, curr_insn)) != reg)
if (! check_only_p
&& (*loc = get_equiv_with_elimination (reg, curr_insn)) != reg)
{
if (lra_dump_file != NULL)
{
@ -1288,6 +1296,8 @@ process_addr_reg (rtx *loc, rtx_insn **before, rtx_insn **after,
}
if (*loc != reg || ! in_class_p (reg, cl, &new_class))
{
if (check_only_p)
return true;
reg = *loc;
if (get_reload_reg (after == NULL ? OP_IN : OP_INOUT,
mode, reg, cl, subreg_p, "address", &new_reg))
@ -1295,6 +1305,8 @@ process_addr_reg (rtx *loc, rtx_insn **before, rtx_insn **after,
}
else if (new_class != NO_REGS && rclass != new_class)
{
if (check_only_p)
return true;
lra_change_class (regno, new_class, " Change to", true);
return false;
}
@ -2740,8 +2752,9 @@ equiv_address_substitution (struct address_info *ad)
return change_p;
}
/* Major function to make reloads for an address in operand NOP.
The supported cases are:
/* Major function to make reloads for an address in operand NOP or
check its correctness (if CHECK_ONLY_P is true). The supported
cases are:
1) an address that existed before LRA started, at which point it
must have been valid. These addresses are subject to elimination
@ -2761,18 +2774,19 @@ equiv_address_substitution (struct address_info *ad)
address. Return true for any RTL change.
The function is a helper function which does not produce all
transformations which can be necessary. It does just basic steps.
To do all necessary transformations use function
process_address. */
transformations (when CHECK_ONLY_P is false) which can be
necessary. It does just basic steps. To do all necessary
transformations use function process_address. */
static bool
process_address_1 (int nop, rtx_insn **before, rtx_insn **after)
process_address_1 (int nop, bool check_only_p,
rtx_insn **before, rtx_insn **after)
{
struct address_info ad;
rtx new_reg;
rtx op = *curr_id->operand_loc[nop];
const char *constraint = curr_static_id->operand[nop].constraint;
enum constraint_num cn = lookup_constraint (constraint);
bool change_p;
bool change_p = false;
if (insn_extra_address_constraint (cn))
decompose_lea_address (&ad, curr_id->operand_loc[nop]);
@ -2783,10 +2797,11 @@ process_address_1 (int nop, rtx_insn **before, rtx_insn **after)
decompose_mem_address (&ad, SUBREG_REG (op));
else
return false;
change_p = equiv_address_substitution (&ad);
if (! check_only_p)
change_p = equiv_address_substitution (&ad);
if (ad.base_term != NULL
&& (process_addr_reg
(ad.base_term, before,
(ad.base_term, check_only_p, before,
(ad.autoinc_p
&& !(REG_P (*ad.base_term)
&& find_regno_note (curr_insn, REG_DEAD,
@ -2800,7 +2815,8 @@ process_address_1 (int nop, rtx_insn **before, rtx_insn **after)
*ad.base_term2 = *ad.base_term;
}
if (ad.index_term != NULL
&& process_addr_reg (ad.index_term, before, NULL, INDEX_REG_CLASS))
&& process_addr_reg (ad.index_term, check_only_p,
before, NULL, INDEX_REG_CLASS))
change_p = true;
/* Target hooks sometimes don't treat extra-constraint addresses as
@ -2809,6 +2825,9 @@ process_address_1 (int nop, rtx_insn **before, rtx_insn **after)
&& satisfies_address_constraint_p (&ad, cn))
return change_p;
if (check_only_p)
return change_p;
/* There are three cases where the shape of *AD.INNER may now be invalid:
1) the original address was valid, but either elimination or
@ -2977,15 +2996,24 @@ process_address_1 (int nop, rtx_insn **before, rtx_insn **after)
return true;
}
/* Do address reloads until it is necessary. Use process_address_1 as
a helper function. Return true for any RTL changes. */
/* If CHECK_ONLY_P is false, do address reloads for as long as they are
necessary. Use process_address_1 as a helper function. Return true
for any RTL changes.
If CHECK_ONLY_P is true, just check address correctness. Return
false if the address is correct. */
static bool
process_address (int nop, rtx_insn **before, rtx_insn **after)
process_address (int nop, bool check_only_p,
rtx_insn **before, rtx_insn **after)
{
bool res = false;
while (process_address_1 (nop, before, after))
res = true;
while (process_address_1 (nop, check_only_p, before, after))
{
if (check_only_p)
return true;
res = true;
}
return res;
}
@ -3157,9 +3185,15 @@ swap_operands (int nop)
model can be changed in future. Make commutative operand exchange
if it is chosen.
Return true if some RTL changes happened during function call. */
If CHECK_ONLY_P is false, do RTL changes to satisfy the
constraints. Return true if any change happened during function
call.
If CHECK_ONLY_P is true then don't do any transformation. Just
check that the insn satisfies all constraints. If the insn fails
to satisfy some constraint, return true. */
static bool
curr_insn_transform (void)
curr_insn_transform (bool check_only_p)
{
int i, j, k;
int n_operands;
@ -3226,50 +3260,53 @@ curr_insn_transform (void)
curr_swapped = false;
goal_alt_swapped = false;
/* Make equivalence substitution and memory subreg elimination
before address processing because an address legitimacy can
depend on memory mode. */
for (i = 0; i < n_operands; i++)
{
rtx op = *curr_id->operand_loc[i];
rtx subst, old = op;
bool op_change_p = false;
if (GET_CODE (old) == SUBREG)
old = SUBREG_REG (old);
subst = get_equiv_with_elimination (old, curr_insn);
if (subst != old)
{
subst = copy_rtx (subst);
lra_assert (REG_P (old));
if (GET_CODE (op) == SUBREG)
SUBREG_REG (op) = subst;
else
*curr_id->operand_loc[i] = subst;
if (lra_dump_file != NULL)
{
fprintf (lra_dump_file,
"Changing pseudo %d in operand %i of insn %u on equiv ",
REGNO (old), i, INSN_UID (curr_insn));
dump_value_slim (lra_dump_file, subst, 1);
if (! check_only_p)
/* Make equivalence substitution and memory subreg elimination
before address processing because an address legitimacy can
depend on memory mode. */
for (i = 0; i < n_operands; i++)
{
rtx op = *curr_id->operand_loc[i];
rtx subst, old = op;
bool op_change_p = false;
if (GET_CODE (old) == SUBREG)
old = SUBREG_REG (old);
subst = get_equiv_with_elimination (old, curr_insn);
if (subst != old)
{
subst = copy_rtx (subst);
lra_assert (REG_P (old));
if (GET_CODE (op) == SUBREG)
SUBREG_REG (op) = subst;
else
*curr_id->operand_loc[i] = subst;
if (lra_dump_file != NULL)
{
fprintf (lra_dump_file,
"Changing pseudo %d in operand %i of insn %u on equiv ",
REGNO (old), i, INSN_UID (curr_insn));
dump_value_slim (lra_dump_file, subst, 1);
fprintf (lra_dump_file, "\n");
}
op_change_p = change_p = true;
}
if (simplify_operand_subreg (i, GET_MODE (old)) || op_change_p)
{
change_p = true;
lra_update_dup (curr_id, i);
}
}
}
op_change_p = change_p = true;
}
if (simplify_operand_subreg (i, GET_MODE (old)) || op_change_p)
{
change_p = true;
lra_update_dup (curr_id, i);
}
}
/* Reload address registers and displacements. We do it before
finding an alternative because of memory constraints. */
before = after = NULL;
for (i = 0; i < n_operands; i++)
if (! curr_static_id->operand[i].is_operator
&& process_address (i, &before, &after))
&& process_address (i, check_only_p, &before, &after))
{
if (check_only_p)
return true;
change_p = true;
lra_update_dup (curr_id, i);
}
@ -3279,13 +3316,13 @@ curr_insn_transform (void)
we chose previously may no longer be valid. */
lra_set_used_insn_alternative (curr_insn, -1);
if (curr_insn_set != NULL_RTX
if (! check_only_p && curr_insn_set != NULL_RTX
&& check_and_process_move (&change_p, &sec_mem_p))
return change_p;
try_swapped:
reused_alternative_num = curr_id->used_insn_alternative;
reused_alternative_num = check_only_p ? -1 : curr_id->used_insn_alternative;
if (lra_dump_file != NULL && reused_alternative_num >= 0)
fprintf (lra_dump_file, "Reusing alternative %d for insn #%u\n",
reused_alternative_num, INSN_UID (curr_insn));
@ -3293,6 +3330,9 @@ curr_insn_transform (void)
if (process_alt_operands (reused_alternative_num))
alt_p = true;
if (check_only_p)
return ! alt_p || best_losers != 0;
/* If insn is commutative (it's safe to exchange a certain pair of
operands) then we need to try each alternative twice, the second
time matching those two operands as if we had exchanged them. To
@ -3522,7 +3562,7 @@ curr_insn_transform (void)
*curr_id->operand_loc[i] = tem;
lra_update_dup (curr_id, i);
process_address (i, &before, &after);
process_address (i, false, &before, &after);
/* If the alternative accepts constant pool refs directly
there will be no reload needed at all. */
@ -3746,6 +3786,26 @@ curr_insn_transform (void)
return change_p;
}
/* Return true if INSN satisfies all constraints. In other words, no
reload insns are needed. */
bool
lra_constrain_insn (rtx_insn *insn)
{
int saved_new_regno_start = new_regno_start;
int saved_new_insn_uid_start = new_insn_uid_start;
bool change_p;
curr_insn = insn;
curr_id = lra_get_insn_recog_data (curr_insn);
curr_static_id = curr_id->insn_static_data;
new_insn_uid_start = get_max_uid ();
new_regno_start = max_reg_num ();
change_p = curr_insn_transform (true);
new_regno_start = saved_new_regno_start;
new_insn_uid_start = saved_new_insn_uid_start;
return ! change_p;
}
/* Return true if X is in LIST. */
static bool
in_list_p (rtx x, rtx list)
@ -4238,7 +4298,7 @@ lra_constraints (bool first_p)
curr_static_id = curr_id->insn_static_data;
init_curr_insn_input_reloads ();
init_curr_operand_mode ();
if (curr_insn_transform ())
if (curr_insn_transform (false))
changed_p = true;
/* Check non-transformed insns too for equiv change as USE
or CLOBBER don't need reloads but can contain pseudos
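
Reader's note: the lra-constraints.c hunks above thread a `check_only_p` flag through the address and insn transformation routines, so that the same code can either repair an insn or only report whether repair would be needed; `lra_constrain_insn` then returns the negation of that report. The toy below sketches the pattern on a single integer "operand". The constraint (non-negative value), the repair step, and all function names are assumptions made for illustration and are not GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* One repair step.  The toy constraint is "*OP is non-negative".
   Return true if *OP violated the constraint on entry.  */
static bool
fix_operand_1 (int *op, bool check_only_p)
{
  if (*op >= 0)
    return false;      /* Already valid; nothing to do.  */
  if (check_only_p)
    return true;       /* Report the violation, leave *OP untouched.  */
  *op += 10;           /* Arbitrary partial repair.  */
  return true;
}

/* Repeat the basic step while it keeps reporting work.  In check-only
   mode return on the first report; otherwise the loop would never
   terminate, because nothing is repaired.  */
static bool
fix_operand (int *op, bool check_only_p)
{
  bool res = false;

  while (fix_operand_1 (op, check_only_p))
    {
      if (check_only_p)
        return true;
      res = true;
    }
  return res;
}

/* Query wrapper in the style of lra_constrain_insn: true means the
   operand already satisfies the constraint.  */
static bool
operand_ok_p (int op)
{
  return ! fix_operand (&op, true);
}
```

The early return inside the loop corresponds to the one added to `process_address` in the hunk above; the inverted result of the wrapper corresponds to `return ! change_p;`.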

gcc/lra-eliminations.c

@ -298,7 +298,8 @@ get_elimination (rtx reg)
a change in the offset between the eliminable register and its
substitution if UPDATE_P, or the full offset if FULL_P, or
otherwise zero. If FULL_P, we also use the SP offsets for
elimination to SP.
elimination to SP. If UPDATE_P, use UPDATE_SP_OFFSET for updating
offsets of registers eliminable to SP.
MEM_MODE is the mode of an enclosing MEM. We need this to know how
much to adjust a register for, e.g., PRE_DEC. Also, if we are
@ -311,7 +312,8 @@ get_elimination (rtx reg)
sp offset. */
rtx
lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
bool subst_p, bool update_p, bool full_p)
bool subst_p, bool update_p,
HOST_WIDE_INT update_sp_offset, bool full_p)
{
enum rtx_code code = GET_CODE (x);
struct lra_elim_table *ep;
@ -346,7 +348,10 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
if (update_p)
return plus_constant (Pmode, to, ep->offset - ep->previous_offset);
return plus_constant (Pmode, to,
ep->offset - ep->previous_offset
+ (ep->to_rtx == stack_pointer_rtx
? update_sp_offset : 0));
else if (full_p)
return plus_constant (Pmode, to,
ep->offset
@ -373,7 +378,10 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
return gen_rtx_PLUS (Pmode, to, XEXP (x, 1));
offset = (update_p
? ep->offset - ep->previous_offset : ep->offset);
? ep->offset - ep->previous_offset
+ (ep->to_rtx == stack_pointer_rtx
? update_sp_offset : 0)
: ep->offset);
if (full_p && insn != NULL_RTX && ep->to_rtx == stack_pointer_rtx)
offset -= lra_get_insn_recog_data (insn)->sp_offset;
if (CONST_INT_P (XEXP (x, 1))
@ -402,9 +410,11 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
{
rtx new0 = lra_eliminate_regs_1 (insn, XEXP (x, 0), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
rtx new1 = lra_eliminate_regs_1 (insn, XEXP (x, 1), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new0 != XEXP (x, 0) || new1 != XEXP (x, 1))
return form_sum (new0, new1);
@ -423,11 +433,12 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
rtx to = subst_p ? ep->to_rtx : ep->from_rtx;
if (update_p)
return
plus_constant (Pmode,
gen_rtx_MULT (Pmode, to, XEXP (x, 1)),
(ep->offset - ep->previous_offset)
* INTVAL (XEXP (x, 1)));
return plus_constant (Pmode,
gen_rtx_MULT (Pmode, to, XEXP (x, 1)),
(ep->offset - ep->previous_offset
+ (ep->to_rtx == stack_pointer_rtx
? update_sp_offset : 0))
* INTVAL (XEXP (x, 1)));
else if (full_p)
{
HOST_WIDE_INT offset = ep->offset;
@ -459,10 +470,12 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
case LE: case LT: case LEU: case LTU:
{
rtx new0 = lra_eliminate_regs_1 (insn, XEXP (x, 0), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
rtx new1 = XEXP (x, 1)
? lra_eliminate_regs_1 (insn, XEXP (x, 1), mem_mode,
subst_p, update_p, full_p) : 0;
subst_p, update_p,
update_sp_offset, full_p) : 0;
if (new0 != XEXP (x, 0) || new1 != XEXP (x, 1))
return gen_rtx_fmt_ee (code, GET_MODE (x), new0, new1);
@ -475,7 +488,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
if (XEXP (x, 0))
{
new_rtx = lra_eliminate_regs_1 (insn, XEXP (x, 0), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XEXP (x, 0))
{
/* If this is a REG_DEAD note, it is not valid anymore.
@ -484,7 +498,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
if (REG_NOTE_KIND (x) == REG_DEAD)
return (XEXP (x, 1)
? lra_eliminate_regs_1 (insn, XEXP (x, 1), mem_mode,
subst_p, update_p, full_p)
subst_p, update_p,
update_sp_offset, full_p)
: NULL_RTX);
x = alloc_reg_note (REG_NOTE_KIND (x), new_rtx, XEXP (x, 1));
@ -501,7 +516,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
if (XEXP (x, 1))
{
new_rtx = lra_eliminate_regs_1 (insn, XEXP (x, 1), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XEXP (x, 1))
return
gen_rtx_fmt_ee (GET_CODE (x), GET_MODE (x),
@ -528,8 +544,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
&& XEXP (XEXP (x, 1), 0) == XEXP (x, 0))
{
rtx new_rtx = lra_eliminate_regs_1 (insn, XEXP (XEXP (x, 1), 1),
mem_mode,
subst_p, update_p, full_p);
mem_mode, subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XEXP (XEXP (x, 1), 1))
return gen_rtx_fmt_ee (code, GET_MODE (x), XEXP (x, 0),
@ -553,14 +569,16 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
case PARITY:
case BSWAP:
new_rtx = lra_eliminate_regs_1 (insn, XEXP (x, 0), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XEXP (x, 0))
return gen_rtx_fmt_e (code, GET_MODE (x), new_rtx);
return x;
case SUBREG:
new_rtx = lra_eliminate_regs_1 (insn, SUBREG_REG (x), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != SUBREG_REG (x))
{
@ -598,12 +616,12 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
replace_equiv_address_nv
(x,
lra_eliminate_regs_1 (insn, XEXP (x, 0), GET_MODE (x),
subst_p, update_p, full_p));
subst_p, update_p, update_sp_offset, full_p));
case USE:
/* Handle insn_list USE that a call to a pure function may generate. */
new_rtx = lra_eliminate_regs_1 (insn, XEXP (x, 0), VOIDmode,
subst_p, update_p, full_p);
subst_p, update_p, update_sp_offset, full_p);
if (new_rtx != XEXP (x, 0))
return gen_rtx_USE (GET_MODE (x), new_rtx);
return x;
@ -624,7 +642,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
if (*fmt == 'e')
{
new_rtx = lra_eliminate_regs_1 (insn, XEXP (x, i), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XEXP (x, i) && ! copied)
{
x = shallow_copy_rtx (x);
@ -638,7 +657,8 @@ lra_eliminate_regs_1 (rtx_insn *insn, rtx x, machine_mode mem_mode,
for (j = 0; j < XVECLEN (x, i); j++)
{
new_rtx = lra_eliminate_regs_1 (insn, XVECEXP (x, i, j), mem_mode,
subst_p, update_p, full_p);
subst_p, update_p,
update_sp_offset, full_p);
if (new_rtx != XVECEXP (x, i, j) && ! copied_vec)
{
rtvec new_v = gen_rtvec_v (XVECLEN (x, i),
@ -665,7 +685,7 @@ rtx
lra_eliminate_regs (rtx x, machine_mode mem_mode,
rtx insn ATTRIBUTE_UNUSED)
{
return lra_eliminate_regs_1 (NULL, x, mem_mode, true, false, true);
return lra_eliminate_regs_1 (NULL, x, mem_mode, true, false, 0, true);
}
/* Stack pointer offset before the current insn relative to one at the
@ -850,13 +870,15 @@ remove_reg_equal_offset_note (rtx insn, rtx what)
If REPLACE_P is false, just update the offsets while keeping the
base register the same. If FIRST_P, use the sp offset for
elimination to sp. Attach the note about used elimination for
insns setting frame pointer to update elimination easy (without
parsing already generated elimination insns to find offset
previously used) in future. */
elimination to sp. Otherwise, use UPDATE_SP_OFFSET for this.
Attach the note about used elimination for insns setting frame
pointer to update elimination easy (without parsing already
generated elimination insns to find offset previously used) in
future. */
static void
eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p)
void
eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p,
HOST_WIDE_INT update_sp_offset)
{
int icode = recog_memoized (insn);
rtx old_set = single_set (insn);
@ -986,8 +1008,13 @@ eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p)
if (! replace_p)
{
offset += (ep->offset - ep->previous_offset);
if (first_p && ep->to_rtx == stack_pointer_rtx)
offset -= lra_get_insn_recog_data (insn)->sp_offset;
if (ep->to_rtx == stack_pointer_rtx)
{
if (first_p)
offset -= lra_get_insn_recog_data (insn)->sp_offset;
else
offset += update_sp_offset;
}
offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
}
@ -1061,7 +1088,7 @@ eliminate_regs_in_insn (rtx_insn *insn, bool replace_p, bool first_p)
substed_operand[i]
= lra_eliminate_regs_1 (insn, *id->operand_loc[i], VOIDmode,
replace_p, ! replace_p && ! first_p,
first_p);
update_sp_offset, first_p);
if (substed_operand[i] != orig_operand[i])
validate_p = true;
}
@ -1349,7 +1376,7 @@ lra_eliminate_reg_if_possible (rtx *loc)
static void
process_insn_for_elimination (rtx_insn *insn, bool final_p, bool first_p)
{
eliminate_regs_in_insn (insn, final_p, first_p);
eliminate_regs_in_insn (insn, final_p, first_p, 0);
if (! final_p)
{
/* Check that insn changed its code. This is a case when a move

gcc/lra-int.h

@ -328,7 +328,6 @@ extern bitmap_head lra_inheritance_pseudos;
extern bitmap_head lra_split_regs;
extern bitmap_head lra_subreg_reload_pseudos;
extern bitmap_head lra_optional_reload_pseudos;
extern int lra_constraint_new_insn_uid_start;
/* lra-constraints.c: */
@ -339,6 +338,7 @@ extern int lra_constraint_iter;
extern bool lra_risky_transformations_p;
extern int lra_inheritance_iter;
extern int lra_undo_inheritance_iter;
extern bool lra_constrain_insn (rtx_insn *);
extern bool lra_constraints (bool);
extern void lra_constraints_init (void);
extern void lra_constraints_finish (void);
@ -389,13 +389,17 @@ extern bool lra_need_for_spills_p (void);
extern void lra_spill (void);
extern void lra_final_code_change (void);
/* lra-remat.c: */
extern bool lra_remat (void);
/* lra-elimination.c: */
extern void lra_debug_elim_table (void);
extern int lra_get_elimination_hard_regno (int);
extern rtx lra_eliminate_regs_1 (rtx_insn *, rtx, machine_mode, bool,
bool, bool);
extern rtx lra_eliminate_regs_1 (rtx_insn *, rtx, machine_mode,
bool, bool, HOST_WIDE_INT, bool);
extern void eliminate_regs_in_insn (rtx_insn *insn, bool, bool, HOST_WIDE_INT);
extern void lra_eliminate (bool, bool);
extern void lra_eliminate_reg_if_possible (rtx *);

gcc/lra-lives.c

@ -252,8 +252,7 @@ make_hard_regno_born (int regno)
unsigned int i;
lra_assert (regno < FIRST_PSEUDO_REGISTER);
if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)
|| TEST_HARD_REG_BIT (hard_regs_live, regno))
if (TEST_HARD_REG_BIT (hard_regs_live, regno))
return;
SET_HARD_REG_BIT (hard_regs_live, regno);
sparseset_set_bit (start_living, regno);
@ -267,8 +266,7 @@ static void
make_hard_regno_dead (int regno)
{
lra_assert (regno < FIRST_PSEUDO_REGISTER);
if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)
|| ! TEST_HARD_REG_BIT (hard_regs_live, regno))
if (! TEST_HARD_REG_BIT (hard_regs_live, regno))
return;
sparseset_set_bit (start_dying, regno);
CLEAR_HARD_REG_BIT (hard_regs_live, regno);
@ -662,7 +660,6 @@ process_bb_lives (basic_block bb, int &curr_point)
sparseset_clear (pseudos_live_through_setjumps);
REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out);
AND_COMPL_HARD_REG_SET (hard_regs_live, eliminable_regset);
AND_COMPL_HARD_REG_SET (hard_regs_live, lra_no_alloc_regs);
EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi)
mark_pseudo_live (j, curr_point);

1212
gcc/lra-remat.c Normal file

File diff suppressed because it is too large.

gcc/lra-spills.c

@ -445,7 +445,7 @@ remove_pseudos (rtx *loc, rtx_insn *insn)
{
rtx x = lra_eliminate_regs_1 (insn, pseudo_slots[i].mem,
GET_MODE (pseudo_slots[i].mem),
false, false, true);
false, false, 0, true);
*loc = x != pseudo_slots[i].mem ? x : copy_rtx (x);
}
return;

109
gcc/lra.c

@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. If not see
generated;
o Some pseudos might be spilled to assign hard registers to
new reload pseudos;
o Recalculating spilled pseudo values (rematerialization);
o Changing spilled pseudos to stack memory or their equivalences;
o Allocation stack memory changes the address displacement and
new iteration is needed.
@ -57,19 +58,26 @@ along with GCC; see the file COPYING3. If not see
----------- | ---------------- |
| | |
| V New |
---------------- No ------------ pseudos -------------------
| Spilled pseudo | change |Constraints:| or insns | Inheritance/split |
| to memory |<-------| RTL |--------->| transformations |
| substitution | | transfor- | | in EBB scope |
---------------- | mations | -------------------
| ------------
V
-------------------------
| Hard regs substitution, |
| devirtalization, and |------> Finish
| restoring scratches got |
| memory |
-------------------------
| ------------ pseudos -------------------
| |Constraints:| or insns | Inheritance/split |
| | RTL |--------->| transformations |
| | transfor- | | in EBB scope |
| substi- | mations | -------------------
| tutions ------------
| | No change
---------------- V
| Spilled pseudo | -------------------
| to memory |<----| Rematerialization |
| substitution | -------------------
----------------
| No substitutions
V
-------------------------
| Hard regs substitution, |
| devirtalization, and |------> Finish
| restoring scratches got |
| memory |
-------------------------
To speed up the process:
o We process only insns affected by changes on previous
@ -849,38 +857,38 @@ collect_non_operand_hard_regs (rtx *x, lra_insn_recog_data_t data,
{
if ((regno = REGNO (op)) >= FIRST_PSEUDO_REGISTER)
return list;
/* Process all regs even unallocatable ones as we need info
about all regs for rematerialization pass. */
for (last = regno + hard_regno_nregs[regno][mode];
regno < last;
regno++)
if (! TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)
|| TEST_HARD_REG_BIT (eliminable_regset, regno))
{
for (curr = list; curr != NULL; curr = curr->next)
if (curr->regno == regno && curr->subreg_p == subreg_p
&& curr->biggest_mode == mode)
{
if (curr->type != type)
curr->type = OP_INOUT;
if (curr->early_clobber != early_clobber)
curr->early_clobber = true;
break;
}
if (curr == NULL)
{
for (curr = list; curr != NULL; curr = curr->next)
if (curr->regno == regno && curr->subreg_p == subreg_p
&& curr->biggest_mode == mode)
{
/* This is a new hard regno or the info can not be
integrated into the found structure. */
#ifdef STACK_REGS
early_clobber
= (early_clobber
/* This clobber is to inform popping floating
point stack only. */
&& ! (FIRST_STACK_REG <= regno
&& regno <= LAST_STACK_REG));
#endif
list = new_insn_reg (data->insn, regno, type, mode, subreg_p,
early_clobber, list);
if (curr->type != type)
curr->type = OP_INOUT;
if (curr->early_clobber != early_clobber)
curr->early_clobber = true;
break;
}
}
if (curr == NULL)
{
/* This is a new hard regno or the info can not be
integrated into the found structure. */
#ifdef STACK_REGS
early_clobber
= (early_clobber
/* This clobber is to inform popping floating
point stack only. */
&& ! (FIRST_STACK_REG <= regno
&& regno <= LAST_STACK_REG));
#endif
list = new_insn_reg (data->insn, regno, type, mode, subreg_p,
early_clobber, list);
}
}
return list;
}
switch (code)
@ -1456,10 +1464,8 @@ add_regs_to_insn_regno_info (lra_insn_recog_data_t data, rtx x, int uid,
if (REG_P (x))
{
regno = REGNO (x);
if (regno < FIRST_PSEUDO_REGISTER
&& TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)
&& ! TEST_HARD_REG_BIT (eliminable_regset, regno))
return;
/* Process all regs even unallocatable ones as we need info about
all regs for rematerialization pass. */
expand_reg_info ();
if (bitmap_set_bit (&lra_reg_info[regno].insn_bitmap, uid))
{
@ -2152,9 +2158,6 @@ bitmap_head lra_optional_reload_pseudos;
pass. */
bitmap_head lra_subreg_reload_pseudos;
/* First UID of insns generated before a new spill pass. */
int lra_constraint_new_insn_uid_start;
/* File used for output of LRA debug information. */
FILE *lra_dump_file;
@ -2252,7 +2255,6 @@ lra (FILE *f)
lra_curr_reload_num = 0;
push_insns (get_last_insn (), NULL);
/* It is needed for the 1st coalescing. */
lra_constraint_new_insn_uid_start = get_max_uid ();
bitmap_initialize (&lra_inheritance_pseudos, &reg_obstack);
bitmap_initialize (&lra_split_regs, &reg_obstack);
bitmap_initialize (&lra_optional_reload_pseudos, &reg_obstack);
@ -2345,12 +2347,21 @@ lra (FILE *f)
lra_create_live_ranges (lra_reg_spill_p);
live_p = true;
}
/* Now we know what pseudos should be spilled. Try to
rematerialize them first. */
if (lra_remat ())
{
/* We need full live info -- see the comment above. */
lra_create_live_ranges (lra_reg_spill_p);
live_p = true;
if (! lra_need_for_spills_p ())
break;
}
lra_spill ();
/* Assignment of stack slots changes elimination offsets for
some eliminations. So update the offsets here. */
lra_eliminate (false, false);
lra_constraint_new_regno_start = max_reg_num ();
lra_constraint_new_insn_uid_start = get_max_uid ();
lra_assignment_iter_after_spill = 0;
}
restore_scratches ();

gcc/opts.c

@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
/* -O3 optimizations. */
{ OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },

gcc/timevar.def

@ -237,6 +237,7 @@ DEFTIMEVAR (TV_LRA_INHERITANCE , "LRA reload inheritance")
DEFTIMEVAR (TV_LRA_CREATE_LIVE_RANGES, "LRA create live ranges")
DEFTIMEVAR (TV_LRA_ASSIGN , "LRA hard reg assignment")
DEFTIMEVAR (TV_LRA_COALESCE , "LRA coalesce pseudo regs")
DEFTIMEVAR (TV_LRA_REMAT , "LRA rematerialization")
DEFTIMEVAR (TV_RELOAD , "reload")
DEFTIMEVAR (TV_RELOAD_CSE_REGS , "reload CSE regs")
DEFTIMEVAR (TV_GCSE_AFTER_RELOAD , "load CSE after reload")