params.def (PARAM_IRA_LOOP_RESERVED_REGS): New.

2009-09-26  Vladimir Makarov  <vmakarov@redhat.com>

	* params.def (PARAM_IRA_LOOP_RESERVED_REGS): New.
	* params.h (IRA_LOOP_RESERVED_REGS): New.
	* tree-pass.h (pass_subregs_of_mode_init,
	pass_subregs_of_mode_finish): Remove.
	* passes.c (pass_subregs_of_mode_init,
	pass_subregs_of_mode_finish): Remove.
	(pass_reginfo_init): Move before loop optimizations.
	* config/i386/i386.h (STACK_REG_COVER_CLASS): Define.
	* common.opt (fira-loop-pressure): New.
	* toplev.h (flag_ira_loop_pressure): New.
	* rtl.h (init_subregs_of_mode, finish_subregs_of_mode): New
	externals.
	* reginfo.c (init_subregs_of_mode, finish_subregs_of_mode):
	Make external and void type functions.
	(gate_subregs_of_mode_init, pass_subregs_of_mode_init,
	pass_subregs_of_mode_finish): Remove.
	* ira-costs.c (init_costs): Call init_subregs_of_mode.
	* regmove.c: Include ira.h.
	(regmove_optimize): Call ira_set_pseudo_classes after IRA based
	register pressure calculation in loops.
	* loop-invariant.c: Include REGS_H and ira.h.
	(struct loop_data): New members max_reg_pressure, regs_ref, and
	regs_live.
	(struct invariant): New member orig_regno.
	(curr_loop): New variable.
	(find_exits): Initialize regs_ref and regs_live.
	(create_new_invariant): Initialize orig_regno.
	(get_cover_class_and_nregs): New.
	(get_inv_cost): Make aregs_needed an array.  Use regs_needed as an
	array.  Add code for flag_ira_loop_pressure.
	(gain_for_invariant): Make new_regs an array.  Add code for
	flag_ira_loop_pressure.
	(best_gain_for_invariant): Ditto.
	(set_move_mark): New parameter gain.  Use it for debugging output.
	(find_invariants_to_move): Make regs_needed and new_regs an array.
	Add code for flag_ira_loop_pressure.
	(move_invariant_reg): Set up orig_regno.
	(move_invariants): Set up reg classes for pseudos for
	flag_ira_loop_pressure.
	(free_loop_data): Clear regs_ref and regs_live.
	(curr_regs_live, curr_reg_pressure, regs_set, n_regs_set,
	get_regno_cover_class, change_pressure, mark_regno_live,
	mark_regno_death, mark_reg_store, mark_reg_clobber,
	mark_reg_death, mark_ref_regs, calculate_loop_reg_pressure): New.
	(move_loop_invariants): Calculate pressure.  Initialize curr_loop.
	* ira.c (ira): Call ira_set_pseudo_classes after IRA based
	register pressure calculation in loops if new regs were added.
	Call finish_subregs_of_mode.
	* opts.c (decode_options): Set up flag_ira_loop_pressure.
	* Makefile.in (loop-invariant.o): Add ira.h.
	(regmove.o): Ditto.
	* doc/invoke.texi (-fira-loop-pressure, ira-loop-reserved-regs):
	Describe.
	* doc/tm.texi (STACK_REG_COVER_CLASS): Describe.

From-SVN: r152770
Vladimir Makarov 2009-10-14 16:24:11 +00:00 committed by Vladimir Makarov
parent 200c8750d6
commit 1833192f30
18 changed files with 695 additions and 114 deletions


@ -1,3 +1,60 @@
2009-09-26 Vladimir Makarov <vmakarov@redhat.com>
* params.def (PARAM_IRA_LOOP_RESERVED_REGS): New.
* params.h (IRA_LOOP_RESERVED_REGS): New.
* tree-pass.h (pass_subregs_of_mode_init,
pass_subregs_of_mode_finish): Remove.
* passes.c (pass_subregs_of_mode_init,
pass_subregs_of_mode_finish): Remove.
(pass_reginfo_init): Move before loop optimizations.
* config/i386/i386.h (STACK_REG_COVER_CLASS): Define.
* common.opt (fira-loop-pressure): New.
* toplev.h (flag_ira_loop_pressure): New.
* rtl.h (init_subregs_of_mode, finish_subregs_of_mode): New
externals.
* reginfo.c (init_subregs_of_mode, finish_subregs_of_mode):
Make external and void type functions.
(gate_subregs_of_mode_init, pass_subregs_of_mode_init,
pass_subregs_of_mode_finish): Remove.
* ira-costs.c (init_costs): Call init_subregs_of_mode.
* regmove.c: Include ira.h.
(regmove_optimize): Call ira_set_pseudo_classes after IRA based
register pressure calculation in loops.
* loop-invariant.c: Include REGS_H and ira.h.
(struct loop_data): New members max_reg_pressure, regs_ref, and
regs_live.
(struct invariant): New member orig_regno.
(curr_loop): New variable.
(find_exits): Initialize regs_ref and regs_live.
(create_new_invariant): Initialize orig_regno.
(get_cover_class_and_nregs): New.
(get_inv_cost): Make aregs_needed an array. Use regs_needed as an
array. Add code for flag_ira_loop_pressure.
(gain_for_invariant): Make new_regs an array. Add code for
flag_ira_loop_pressure.
(best_gain_for_invariant): Ditto.
(set_move_mark): New parameter gain. Use it for debugging output.
(find_invariants_to_move): Make regs_needed and new_regs an array.
Add code for flag_ira_loop_pressure.
(move_invariant_reg): Set up orig_regno.
(move_invariants): Set up reg classes for pseudos for
flag_ira_loop_pressure.
(free_loop_data): Clear regs_ref and regs_live.
(curr_regs_live, curr_reg_pressure, regs_set, n_regs_set,
get_regno_cover_class, change_pressure, mark_regno_live,
mark_regno_death, mark_reg_store, mark_reg_clobber,
mark_reg_death, mark_ref_regs, calculate_loop_reg_pressure): New.
(move_loop_invariants): Calculate pressure. Initialize curr_loop.
* ira.c (ira): Call ira_set_pseudo_classes after IRA based
register pressure calculation in loops if new regs were added.
Call finish_subregs_of_mode.
* opts.c (decode_options): Set up flag_ira_loop_pressure.
* Makefile.in (loop-invariant.o): Add ira.h.
(regmove.o): Ditto.
* doc/invoke.texi (-fira-loop-pressure, ira-loop-reserved-regs):
Describe.
* doc/tm.texi (STACK_REG_COVER_CLASS): Describe.
2009-10-14 Richard Guenther <rguenther@suse.de>
* lto-symtab.c (lto_symtab_compatible): Fold in ...


@ -3076,9 +3076,9 @@ loop-iv.o : loop-iv.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) $(BASIC_BLOCK_H) \
hard-reg-set.h $(CFGLOOP_H) $(EXPR_H) coretypes.h $(TM_H) $(OBSTACK_H) \
output.h intl.h $(TOPLEV_H) $(DF_H) $(HASHTAB_H)
loop-invariant.o : loop-invariant.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
$(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(EXPR_H) $(RECOG_H) coretypes.h \
$(TM_H) $(TM_P_H) $(FUNCTION_H) $(FLAGS_H) $(DF_H) $(OBSTACK_H) output.h \
$(HASHTAB_H) $(EXCEPT_H) $(PARAMS_H)
$(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(EXPR_H) $(RECOG_H) \
coretypes.h $(TM_H) $(TM_P_H) $(FUNCTION_H) $(FLAGS_H) $(DF_H) \
$(OBSTACK_H) output.h $(HASHTAB_H) $(EXCEPT_H) $(PARAMS_H) $(REGS_H) ira.h
cfgloopmanip.o : cfgloopmanip.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
$(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(CFGLAYOUT_H) output.h \
coretypes.h $(TM_H) cfghooks.h $(OBSTACK_H) $(TREE_FLOW_H)
@ -3192,7 +3192,7 @@ ira.o: ira.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
regmove.o : regmove.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(TIMEVAR_H) $(TREE_PASS_H) $(DF_H)\
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TOPLEV_H) $(TM_P_H) $(EXCEPT_H) reload.h
$(EXPR_H) $(BASIC_BLOCK_H) $(TOPLEV_H) $(TM_P_H) $(EXCEPT_H) ira.h reload.h
combine-stack-adj.o : combine-stack-adj.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(RTL_H) insn-config.h $(TIMEVAR_H) $(TREE_PASS_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \


@ -717,6 +717,11 @@ fira-coalesce
Common Report Var(flag_ira_coalesce) Init(0)
Do optimistic coalescing.
fira-loop-pressure
Common Report Var(flag_ira_loop_pressure)
Use IRA based register pressure calculation
in RTL loop optimizations.
fira-share-save-slots
Common Report Var(flag_ira_share_save_slots) Init(1)
Share slots for saving different hard registers.


@ -873,6 +873,9 @@ enum target_cpu_default
|| ((MODE) == DFmode && (!TARGET_SSE2 || !TARGET_SSE_MATH)) \
|| (MODE) == XFmode)
/* Cover class containing the stack registers. */
#define STACK_REG_COVER_CLASS FLOAT_REGS
/* Number of actual hardware registers.
The hardware registers are assigned numbers for the compiler
from 0 to just below FIRST_PSEUDO_REGISTER.


@ -346,7 +346,8 @@ Objective-C and Objective-C++ Dialects}.
-finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
-fipa-pure-const -fipa-reference -fipa-struct-reorg @gol
-fipa-type-escape -fira-algorithm=@var{algorithm} @gol
-fira-region=@var{region} -fira-coalesce -fno-ira-share-save-slots @gol
-fira-region=@var{region} -fira-coalesce @gol
-fira-loop-pressure -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
-fivopts -fkeep-inline-functions -fkeep-static-consts @gol
-floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
@ -5719,7 +5720,8 @@ invoking @option{-O2} on programs that use computed gotos.
Optimize yet more. @option{-O3} turns on all optimizations specified
by @option{-O2} and also turns on the @option{-finline-functions},
@option{-funswitch-loops}, @option{-fpredictive-commoning},
@option{-fgcse-after-reload} and @option{-ftree-vectorize} options.
@option{-fgcse-after-reload}, @option{-ftree-vectorize} and
@option{-fira-loop-pressure} options.
@item -O0
@opindex O0
@ -6216,6 +6218,14 @@ give the best results in most cases and for most architectures.
Do optimistic register coalescing. This option might be profitable for
architectures with big regular register files.
@item -fira-loop-pressure
@opindex fira-loop-pressure
Use IRA to evaluate register pressure in loops for decision to move
loop invariants. Usage of this option usually results in generation
of faster and smaller code but can slow compiler down.
This option is enabled at level @option{-O3}.
@item -fno-ira-share-save-slots
@opindex fno-ira-share-save-slots
Switch off sharing stack slots used for saving call used hard
@ -8387,6 +8397,14 @@ lower quality register allocation algorithm will be used. The
algorithm do not use pseudo-register conflicts. The default value of
the parameter is 2000.
@item ira-loop-reserved-regs
IRA can be used to evaluate more accurate register pressure in loops
for decision to move loop invariants (see @option{-O3}). The number
of available registers reserved for some other purposes is described
by this parameter. The default value of the parameter is 2 which is
minimal number of registers needed for execution of typical
instruction. This value is the best found from numerous experiments.
@item loop-invariant-max-bbs-in-loop
Loop invariant motion can be very expensive, both in compile time and
in amount of needed compile time memory, with very large loops. Loops


@ -2349,6 +2349,11 @@ with it, as well as defining these macros.
Define this if the machine has any stack-like registers.
@end defmac
@defmac STACK_REG_COVER_CLASS
This is a cover class containing the stack registers. Define this if
the machine has any stack-like registers.
@end defmac
@defmac FIRST_STACK_REG
The number of the first stack-like register. This one is the top
of the stack.


@ -1665,6 +1665,7 @@ ira_finish_costs_once (void)
static void
init_costs (void)
{
init_subregs_of_mode ();
costs = (struct costs *) ira_allocate (max_struct_costs_size
* cost_elements_num);
pref_buffer


@ -3132,6 +3132,9 @@ ira (FILE *f)
epilogue thus changing register elimination offsets. */
current_function_is_leaf = leaf_function_p ();
if (resize_reg_info () && flag_ira_loop_pressure)
ira_set_pseudo_classes (ira_dump_file);
rebuild_p = update_equiv_regs ();
#ifndef IRA_NO_OBSTACK
@ -3158,7 +3161,6 @@ ira (FILE *f)
}
max_regno_before_ira = allocated_reg_info_size = max_reg_num ();
resize_reg_info ();
ira_setup_eliminable_regset ();
ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
@ -3272,6 +3274,8 @@ ira (FILE *f)
reload_completed = !reload (get_insns (), ira_conflicts_p);
finish_subregs_of_mode ();
timevar_pop (TV_RELOAD);
timevar_push (TV_IRA);


@ -39,9 +39,9 @@ along with GCC; see the file COPYING3. If not see
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "hard-reg-set.h"
#include "rtl.h"
#include "tm_p.h"
#include "hard-reg-set.h"
#include "obstack.h"
#include "basic-block.h"
#include "cfgloop.h"
@ -54,6 +54,8 @@ along with GCC; see the file COPYING3. If not see
#include "hashtab.h"
#include "except.h"
#include "params.h"
#include "regs.h"
#include "ira.h"
/* The data stored for the loop. */
@ -61,6 +63,12 @@ struct loop_data
{
struct loop *outermost_exit; /* The outermost exit of the loop. */
bool has_call; /* True if the loop contains a call. */
/* Maximal register pressure inside loop for given register class
(defined only for the cover classes). */
int max_reg_pressure[N_REG_CLASSES];
/* Loop regs referenced and live pseudo-registers. */
bitmap_head regs_ref;
bitmap_head regs_live;
};
#define LOOP_DATA(LOOP) ((struct loop_data *) (LOOP)->aux)
@ -100,6 +108,10 @@ struct invariant
value. */
rtx reg;
/* If we moved the invariant out of the loop, the original regno
that contained its value. */
int orig_regno;
/* The definition of the invariant. */
struct def *def;
@ -126,6 +138,9 @@ struct invariant
unsigned stamp;
};
/* Currently processed loop. */
static struct loop *curr_loop;
/* Table of invariants indexed by the df_ref uid field. */
static unsigned int invariant_table_size = 0;
@ -615,7 +630,12 @@ find_exits (struct loop *loop, basic_block *body,
}
}
loop->aux = xcalloc (1, sizeof (struct loop_data));
if (loop->aux == NULL)
{
loop->aux = xcalloc (1, sizeof (struct loop_data));
bitmap_initialize (&LOOP_DATA (loop)->regs_ref, &reg_obstack);
bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
}
LOOP_DATA (loop)->outermost_exit = outermost_exit;
LOOP_DATA (loop)->has_call = has_call;
}
@ -696,6 +716,7 @@ create_new_invariant (struct def *def, rtx insn, bitmap depends_on,
inv->move = false;
inv->reg = NULL_RTX;
inv->orig_regno = -1;
inv->stamp = 0;
inv->insn = insn;
@ -982,14 +1003,46 @@ free_use_list (struct use *use)
}
}
/* Return cover class and number of hard registers (through *NREGS)
for destination of INSN. */
static enum reg_class
get_cover_class_and_nregs (rtx insn, int *nregs)
{
rtx reg;
enum reg_class cover_class;
rtx set = single_set (insn);
/* Considered invariant insns have only one set. */
gcc_assert (set != NULL_RTX);
reg = SET_DEST (set);
if (GET_CODE (reg) == SUBREG)
reg = SUBREG_REG (reg);
if (MEM_P (reg))
{
*nregs = 0;
cover_class = NO_REGS;
}
else
{
if (! REG_P (reg))
reg = NULL_RTX;
if (reg == NULL_RTX)
cover_class = GENERAL_REGS;
else
cover_class = reg_cover_class (REGNO (reg));
*nregs = ira_reg_class_nregs[cover_class][GET_MODE (SET_SRC (set))];
}
return cover_class;
}
/* Calculates cost and number of registers needed for moving invariant INV
out of the loop and stores them to *COST and *REGS_NEEDED. */
static void
get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
{
int acomp_cost;
unsigned aregs_needed;
int i, acomp_cost;
unsigned aregs_needed[N_REG_CLASSES];
unsigned depno;
struct invariant *dep;
bitmap_iterator bi;
@ -998,13 +1051,30 @@ get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
inv = VEC_index (invariant_p, invariants, inv->eqto);
*comp_cost = 0;
*regs_needed = 0;
if (! flag_ira_loop_pressure)
regs_needed[0] = 0;
else
{
for (i = 0; i < ira_reg_class_cover_size; i++)
regs_needed[ira_reg_class_cover[i]] = 0;
}
if (inv->move
|| inv->stamp == actual_stamp)
return;
inv->stamp = actual_stamp;
(*regs_needed)++;
if (! flag_ira_loop_pressure)
regs_needed[0]++;
else
{
int nregs;
enum reg_class cover_class;
cover_class = get_cover_class_and_nregs (inv->insn, &nregs);
regs_needed[cover_class] += nregs;
}
if (!inv->cheap_address
|| inv->def->n_addr_uses < inv->def->n_uses)
(*comp_cost) += inv->cost;
@ -1029,19 +1099,35 @@ get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
on floating point constants is unlikely to ever occur. */
rtx set = single_set (inv->insn);
if (set
&& IS_STACK_MODE (GET_MODE (SET_SRC (set)))
&& constant_pool_constant_p (SET_SRC (set)))
(*regs_needed) += 2;
&& IS_STACK_MODE (GET_MODE (SET_SRC (set)))
&& constant_pool_constant_p (SET_SRC (set)))
{
if (flag_ira_loop_pressure)
regs_needed[STACK_REG_COVER_CLASS] += 2;
else
regs_needed[0] += 2;
}
}
#endif
EXECUTE_IF_SET_IN_BITMAP (inv->depends_on, 0, depno, bi)
{
bool check_p;
dep = VEC_index (invariant_p, invariants, depno);
get_inv_cost (dep, &acomp_cost, &aregs_needed);
get_inv_cost (dep, &acomp_cost, aregs_needed);
if (aregs_needed
if (! flag_ira_loop_pressure)
check_p = aregs_needed[0] != 0;
else
{
for (i = 0; i < ira_reg_class_cover_size; i++)
if (aregs_needed[ira_reg_class_cover[i]] != 0)
break;
check_p = i < ira_reg_class_cover_size;
}
if (check_p
/* We need to check always_executed, since if the original value of
the invariant may be preserved, we may need to keep it in a
separate register. TODO check whether the register has an
@ -1051,10 +1137,26 @@ get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
{
/* If this is a single use, after moving the dependency we will not
need a new register. */
aregs_needed--;
if (! flag_ira_loop_pressure)
aregs_needed[0]--;
else
{
int nregs;
enum reg_class cover_class;
cover_class = get_cover_class_and_nregs (inv->insn, &nregs);
aregs_needed[cover_class] -= nregs;
}
}
(*regs_needed) += aregs_needed;
if (! flag_ira_loop_pressure)
regs_needed[0] += aregs_needed[0];
else
{
for (i = 0; i < ira_reg_class_cover_size; i++)
regs_needed[ira_reg_class_cover[i]]
+= aregs_needed[ira_reg_class_cover[i]];
}
(*comp_cost) += acomp_cost;
}
}
@ -1066,15 +1168,62 @@ get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
static int
gain_for_invariant (struct invariant *inv, unsigned *regs_needed,
unsigned new_regs, unsigned regs_used, bool speed)
unsigned *new_regs, unsigned regs_used, bool speed)
{
int comp_cost, size_cost;
get_inv_cost (inv, &comp_cost, regs_needed);
actual_stamp++;
size_cost = (estimate_reg_pressure_cost (new_regs + *regs_needed, regs_used, speed)
- estimate_reg_pressure_cost (new_regs, regs_used, speed));
get_inv_cost (inv, &comp_cost, regs_needed);
if (! flag_ira_loop_pressure)
{
size_cost = (estimate_reg_pressure_cost (new_regs[0] + regs_needed[0],
regs_used, speed)
- estimate_reg_pressure_cost (new_regs[0],
regs_used, speed));
}
else
{
int i;
enum reg_class cover_class;
for (i = 0; i < ira_reg_class_cover_size; i++)
{
cover_class = ira_reg_class_cover[i];
if ((int) new_regs[cover_class]
+ (int) regs_needed[cover_class]
+ LOOP_DATA (curr_loop)->max_reg_pressure[cover_class]
+ IRA_LOOP_RESERVED_REGS
> ira_available_class_regs[cover_class])
break;
}
if (i < ira_reg_class_cover_size)
/* There will be register pressure excess and we want not to
make this loop invariant motion. All loop invariants with
non-positive gains will be rejected in function
find_invariants_to_move. Therefore we return the negative
number here.
One could think that this rejects also expensive loop
invariant motions and this will hurt code performance.
However numerous experiments with different heuristics
taking invariant cost into account did not confirm this
assumption. There are possible explanations for this
result:
o probably all expensive invariants were already moved out
of the loop by PRE and gimple invariant motion pass.
o expensive invariant execution will be hidden by insn
scheduling or OOO processor hardware because usually such
invariants have a lot of freedom to be executed
out-of-order.
Another reason for ignoring invariant cost vs spilling cost
heuristics is also in difficulties to evaluate accurately
spill cost at this stage. */
return -1;
else
size_cost = 0;
}
return comp_cost - size_cost;
}
@ -1087,11 +1236,11 @@ gain_for_invariant (struct invariant *inv, unsigned *regs_needed,
static int
best_gain_for_invariant (struct invariant **best, unsigned *regs_needed,
unsigned new_regs, unsigned regs_used, bool speed)
unsigned *new_regs, unsigned regs_used, bool speed)
{
struct invariant *inv;
int gain = 0, again;
unsigned aregs_needed, invno;
int i, gain = 0, again;
unsigned aregs_needed[N_REG_CLASSES], invno;
for (invno = 0; VEC_iterate (invariant_p, invariants, invno, inv); invno++)
{
@ -1102,13 +1251,20 @@ best_gain_for_invariant (struct invariant **best, unsigned *regs_needed,
if (inv->eqto != inv->invno)
continue;
again = gain_for_invariant (inv, &aregs_needed, new_regs, regs_used,
again = gain_for_invariant (inv, aregs_needed, new_regs, regs_used,
speed);
if (again > gain)
{
gain = again;
*best = inv;
*regs_needed = aregs_needed;
if (! flag_ira_loop_pressure)
regs_needed[0] = aregs_needed[0];
else
{
for (i = 0; i < ira_reg_class_cover_size; i++)
regs_needed[ira_reg_class_cover[i]]
= aregs_needed[ira_reg_class_cover[i]];
}
}
}
@ -1118,7 +1274,7 @@ best_gain_for_invariant (struct invariant **best, unsigned *regs_needed,
/* Marks invariant INVNO and all its dependencies for moving. */
static void
set_move_mark (unsigned invno)
set_move_mark (unsigned invno, int gain)
{
struct invariant *inv = VEC_index (invariant_p, invariants, invno);
bitmap_iterator bi;
@ -1131,11 +1287,18 @@ set_move_mark (unsigned invno)
inv->move = true;
if (dump_file)
fprintf (dump_file, "Decided to move invariant %d\n", invno);
{
if (gain >= 0)
fprintf (dump_file, "Decided to move invariant %d -- gain %d\n",
invno, gain);
else
fprintf (dump_file, "Decided to move dependent invariant %d\n",
invno);
};
EXECUTE_IF_SET_IN_BITMAP (inv->depends_on, 0, invno, bi)
{
set_move_mark (invno);
set_move_mark (invno, -1);
}
}
@ -1144,32 +1307,54 @@ set_move_mark (unsigned invno)
static void
find_invariants_to_move (bool speed)
{
unsigned i, regs_used, regs_needed = 0, new_regs;
int gain;
unsigned i, regs_used, regs_needed[N_REG_CLASSES], new_regs[N_REG_CLASSES];
struct invariant *inv = NULL;
unsigned int n_regs = DF_REG_SIZE (df);
if (!VEC_length (invariant_p, invariants))
return;
/* We do not really do a good job in estimating number of registers used;
we put some initial bound here to stand for induction variables etc.
that we do not detect. */
regs_used = 2;
for (i = 0; i < n_regs; i++)
if (flag_ira_loop_pressure)
/* REGS_USED is actually never used when the flag is on. */
regs_used = 0;
else
/* We do not really do a good job in estimating number of
registers used; we put some initial bound here to stand for
induction variables etc. that we do not detect. */
{
if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
unsigned int n_regs = DF_REG_SIZE (df);
regs_used = 2;
for (i = 0; i < n_regs; i++)
{
/* This is a value that is used but not changed inside loop. */
regs_used++;
if (!DF_REGNO_FIRST_DEF (i) && DF_REGNO_LAST_USE (i))
{
/* This is a value that is used but not changed inside loop. */
regs_used++;
}
}
}
new_regs = 0;
while (best_gain_for_invariant (&inv, &regs_needed, new_regs, regs_used, speed) > 0)
if (! flag_ira_loop_pressure)
new_regs[0] = regs_needed[0] = 0;
else
{
set_move_mark (inv->invno);
new_regs += regs_needed;
for (i = 0; (int) i < ira_reg_class_cover_size; i++)
new_regs[ira_reg_class_cover[i]] = 0;
}
while ((gain = best_gain_for_invariant (&inv, regs_needed,
new_regs, regs_used, speed)) > 0)
{
set_move_mark (inv->invno, gain);
if (! flag_ira_loop_pressure)
new_regs[0] += regs_needed[0];
else
{
for (i = 0; (int) i < ira_reg_class_cover_size; i++)
new_regs[ira_reg_class_cover[i]]
+= regs_needed[ira_reg_class_cover[i]];
}
}
}
@ -1186,11 +1371,13 @@ move_invariant_reg (struct loop *loop, unsigned invno)
rtx reg, set, dest, note;
struct use *use;
bitmap_iterator bi;
int regno;
if (inv->reg)
return true;
if (!repr->move)
return false;
regno = -1;
/* If this is a representative of the class of equivalent invariants,
really move the invariant. Otherwise just replace its use with
the register used for the representative. */
@ -1211,7 +1398,12 @@ move_invariant_reg (struct loop *loop, unsigned invno)
would not be dominated by it, we may just move it (TODO). Otherwise we
need to create a temporary register. */
set = single_set (inv->insn);
dest = SET_DEST (set);
reg = dest = SET_DEST (set);
if (GET_CODE (reg) == SUBREG)
reg = SUBREG_REG (reg);
if (REG_P (reg))
regno = REGNO (reg);
reg = gen_reg_rtx_and_attrs (dest);
/* Try replacing the destination by a new pseudoregister. */
@ -1237,6 +1429,7 @@ move_invariant_reg (struct loop *loop, unsigned invno)
if (!move_invariant_reg (loop, repr->invno))
goto fail;
reg = repr->reg;
regno = repr->orig_regno;
set = single_set (inv->insn);
emit_insn_after (gen_move_insn (SET_DEST (set), reg), inv->insn);
delete_insn (inv->insn);
@ -1244,6 +1437,7 @@ move_invariant_reg (struct loop *loop, unsigned invno)
inv->reg = reg;
inv->orig_regno = regno;
/* Replace the uses we know to be dominated. It saves work for copy
propagation, and also it is necessary so that dependent invariants
@ -1266,6 +1460,7 @@ fail:
fprintf (dump_file, "Failed to move invariant %d\n", invno);
inv->move = false;
inv->reg = NULL_RTX;
inv->orig_regno = -1;
return false;
}
@ -1281,6 +1476,21 @@ move_invariants (struct loop *loop)
for (i = 0; VEC_iterate (invariant_p, invariants, i, inv); i++)
move_invariant_reg (loop, i);
if (flag_ira_loop_pressure && resize_reg_info ())
{
for (i = 0; VEC_iterate (invariant_p, invariants, i, inv); i++)
if (inv->reg != NULL_RTX)
{
if (inv->orig_regno >= 0)
setup_reg_classes (REGNO (inv->reg),
reg_preferred_class (inv->orig_regno),
reg_alternate_class (inv->orig_regno),
reg_cover_class (inv->orig_regno));
else
setup_reg_classes (REGNO (inv->reg),
GENERAL_REGS, NO_REGS, GENERAL_REGS);
}
}
}
/* Initializes invariant motion data. */
@ -1346,10 +1556,317 @@ free_loop_data (struct loop *loop)
{
struct loop_data *data = LOOP_DATA (loop);
bitmap_clear (&LOOP_DATA (loop)->regs_ref);
bitmap_clear (&LOOP_DATA (loop)->regs_live);
free (data);
loop->aux = NULL;
}
/* Registers currently living. */
static bitmap_head curr_regs_live;
/* Current reg pressure for each cover class. */
static int curr_reg_pressure[N_REG_CLASSES];
/* Record all regs that are set in any one insn. Communication from
mark_reg_{store,clobber} and global_conflicts. Asm can refer to
all hard-registers. */
static rtx regs_set[(FIRST_PSEUDO_REGISTER > MAX_RECOG_OPERANDS
? FIRST_PSEUDO_REGISTER : MAX_RECOG_OPERANDS) * 2];
/* Number of regs stored in the previous array. */
static int n_regs_set;
/* Return cover class and number of needed hard registers (through
*NREGS) of register REGNO. */
static enum reg_class
get_regno_cover_class (int regno, int *nregs)
{
if (regno >= FIRST_PSEUDO_REGISTER)
{
enum reg_class cover_class = reg_cover_class (regno);
*nregs = ira_reg_class_nregs[cover_class][PSEUDO_REGNO_MODE (regno)];
return cover_class;
}
else if (! TEST_HARD_REG_BIT (ira_no_alloc_regs, regno)
&& ! TEST_HARD_REG_BIT (eliminable_regset, regno))
{
*nregs = 1;
return ira_class_translate[REGNO_REG_CLASS (regno)];
}
else
{
*nregs = 0;
return NO_REGS;
}
}
/* Increase (if INCR_P) or decrease current register pressure for
register REGNO. */
static void
change_pressure (int regno, bool incr_p)
{
int nregs;
enum reg_class cover_class;
cover_class = get_regno_cover_class (regno, &nregs);
if (! incr_p)
curr_reg_pressure[cover_class] -= nregs;
else
{
curr_reg_pressure[cover_class] += nregs;
if (LOOP_DATA (curr_loop)->max_reg_pressure[cover_class]
< curr_reg_pressure[cover_class])
LOOP_DATA (curr_loop)->max_reg_pressure[cover_class]
= curr_reg_pressure[cover_class];
}
}
/* Mark REGNO birth. */
static void
mark_regno_live (int regno)
{
struct loop *loop;
for (loop = curr_loop;
loop != current_loops->tree_root;
loop = loop_outer (loop))
bitmap_set_bit (&LOOP_DATA (loop)->regs_live, regno);
if (bitmap_bit_p (&curr_regs_live, regno))
return;
bitmap_set_bit (&curr_regs_live, regno);
change_pressure (regno, true);
}
/* Mark REGNO death. */
static void
mark_regno_death (int regno)
{
if (! bitmap_bit_p (&curr_regs_live, regno))
return;
bitmap_clear_bit (&curr_regs_live, regno);
change_pressure (regno, false);
}
/* Mark setting register REG. */
static void
mark_reg_store (rtx reg, const_rtx setter ATTRIBUTE_UNUSED,
void *data ATTRIBUTE_UNUSED)
{
int regno;
if (GET_CODE (reg) == SUBREG)
reg = SUBREG_REG (reg);
if (! REG_P (reg))
return;
regs_set[n_regs_set++] = reg;
regno = REGNO (reg);
if (regno >= FIRST_PSEUDO_REGISTER)
mark_regno_live (regno);
else
{
int last = regno + hard_regno_nregs[regno][GET_MODE (reg)];
while (regno < last)
{
mark_regno_live (regno);
regno++;
}
}
}
/* Mark clobbering register REG. */
static void
mark_reg_clobber (rtx reg, const_rtx setter, void *data)
{
if (GET_CODE (setter) == CLOBBER)
mark_reg_store (reg, setter, data);
}
/* Mark register REG death. */
static void
mark_reg_death (rtx reg)
{
int regno = REGNO (reg);
if (regno >= FIRST_PSEUDO_REGISTER)
mark_regno_death (regno);
else
{
int last = regno + hard_regno_nregs[regno][GET_MODE (reg)];
while (regno < last)
{
mark_regno_death (regno);
regno++;
}
}
}
/* Mark occurrence of registers in X for the current loop. */
static void
mark_ref_regs (rtx x)
{
RTX_CODE code;
int i;
const char *fmt;
if (!x)
return;
code = GET_CODE (x);
if (code == REG)
{
struct loop *loop;
for (loop = curr_loop;
loop != current_loops->tree_root;
loop = loop_outer (loop))
bitmap_set_bit (&LOOP_DATA (loop)->regs_ref, REGNO (x));
return;
}
fmt = GET_RTX_FORMAT (code);
for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
if (fmt[i] == 'e')
mark_ref_regs (XEXP (x, i));
else if (fmt[i] == 'E')
{
int j;
for (j = 0; j < XVECLEN (x, i); j++)
mark_ref_regs (XVECEXP (x, i, j));
}
}
/* Calculate register pressure in the loops. */
static void
calculate_loop_reg_pressure (void)
{
int i;
unsigned int j;
bitmap_iterator bi;
basic_block bb;
rtx insn, link;
struct loop *loop, *parent;
loop_iterator li;
FOR_EACH_LOOP (li, loop, 0)
if (loop->aux == NULL)
{
loop->aux = xcalloc (1, sizeof (struct loop_data));
bitmap_initialize (&LOOP_DATA (loop)->regs_ref, &reg_obstack);
bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
}
ira_setup_eliminable_regset ();
bitmap_initialize (&curr_regs_live, &reg_obstack);
FOR_EACH_BB (bb)
{
curr_loop = bb->loop_father;
if (curr_loop == current_loops->tree_root)
continue;
for (loop = curr_loop;
loop != current_loops->tree_root;
loop = loop_outer (loop))
bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_IN (bb));
bitmap_copy (&curr_regs_live, DF_LR_IN (bb));
for (i = 0; i < ira_reg_class_cover_size; i++)
curr_reg_pressure[ira_reg_class_cover[i]] = 0;
EXECUTE_IF_SET_IN_BITMAP (&curr_regs_live, 0, j, bi)
change_pressure (j, true);
FOR_BB_INSNS (bb, insn)
{
if (! INSN_P (insn))
continue;
mark_ref_regs (PATTERN (insn));
n_regs_set = 0;
note_stores (PATTERN (insn), mark_reg_clobber, NULL);
/* Mark any registers dead after INSN as dead now. */
for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
if (REG_NOTE_KIND (link) == REG_DEAD)
mark_reg_death (XEXP (link, 0));
/* Mark any registers set in INSN as live,
and mark them as conflicting with all other live regs.
Clobbers are processed again, so they conflict with
the registers that are set. */
note_stores (PATTERN (insn), mark_reg_store, NULL);
#ifdef AUTO_INC_DEC
for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
if (REG_NOTE_KIND (link) == REG_INC)
mark_reg_store (XEXP (link, 0), NULL_RTX, NULL);
#endif
while (n_regs_set-- > 0)
{
rtx note = find_regno_note (insn, REG_UNUSED,
REGNO (regs_set[n_regs_set]));
if (! note)
continue;
mark_reg_death (XEXP (note, 0));
}
}
}
bitmap_clear (&curr_regs_live);
if (flag_ira_region == IRA_REGION_MIXED
|| flag_ira_region == IRA_REGION_ALL)
FOR_EACH_LOOP (li, loop, 0)
{
EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
if (! bitmap_bit_p (&LOOP_DATA (loop)->regs_ref, j))
{
enum reg_class cover_class;
int nregs;
cover_class = get_regno_cover_class (j, &nregs);
LOOP_DATA (loop)->max_reg_pressure[cover_class] -= nregs;
}
}
if (dump_file == NULL)
return;
FOR_EACH_LOOP (li, loop, 0)
{
parent = loop_outer (loop);
fprintf (dump_file, "\n Loop %d (parent %d, header bb%d, depth %d)\n",
loop->num, (parent == NULL ? -1 : parent->num),
loop->header->index, loop_depth (loop));
fprintf (dump_file, "\n ref. regnos:");
EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_ref, 0, j, bi)
fprintf (dump_file, " %d", j);
fprintf (dump_file, "\n live regnos:");
EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi)
fprintf (dump_file, " %d", j);
fprintf (dump_file, "\n Pressure:");
for (i = 0; (int) i < ira_reg_class_cover_size; i++)
{
enum reg_class cover_class;
cover_class = ira_reg_class_cover[i];
if (LOOP_DATA (loop)->max_reg_pressure[cover_class] == 0)
continue;
fprintf (dump_file, " %s=%d", reg_class_names[cover_class],
LOOP_DATA (loop)->max_reg_pressure[cover_class]);
}
fprintf (dump_file, "\n");
}
}
/* Move the invariants out of the loops. */
void
@@ -1358,10 +1875,17 @@ move_loop_invariants (void)
struct loop *loop;
loop_iterator li;
if (flag_ira_loop_pressure)
{
df_analyze ();
ira_set_pseudo_classes (dump_file);
calculate_loop_reg_pressure ();
}
df_set_flags (DF_EQ_NOTES + DF_DEFER_INSN_RESCAN);
/* Process the loops, innermost first. */
FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
{
curr_loop = loop;
/* move_single_loop_invariants for very large loops
is time consuming and might need a lot of memory. */
if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
@@ -1373,6 +1897,10 @@ move_loop_invariants (void)
free_loop_data (loop);
}
if (flag_ira_loop_pressure)
/* There is no sense to keep this info because it was most
probably outdated by subsequent passes. */
free_reg_info ();
free (invariant_table);
invariant_table = NULL;
invariant_table_size = 0;
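
The pressure bookkeeping in calculate_loop_reg_pressure can be hard to follow from the diff alone, so here is a minimal standalone sketch of the idea. It is not the GCC code: the `toy_` names, the two-class enum, and the struct layout are all hypothetical stand-ins for the cover classes and LOOP_DATA fields. It shows a per-class high-water mark that rises as registers become live, plus the final adjustment that discounts registers that are live through a loop but never referenced in it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC cover classes (assumption, not GCC's enum).  */
enum toy_class { TOY_GENERAL, TOY_FLOAT, TOY_N_CLASSES };

struct toy_pressure
{
  int curr[TOY_N_CLASSES];	/* Registers of each class live right now.  */
  int max[TOY_N_CLASSES];	/* Highest value CURR has reached so far.  */
};

/* Record that NREGS registers of class CL become live (INCR_P true)
   or die (INCR_P false), updating the high-water mark on births.  */
static void
toy_change_pressure (struct toy_pressure *p, enum toy_class cl,
		     int nregs, bool incr_p)
{
  assert (nregs >= 0);
  if (incr_p)
    {
      p->curr[cl] += nregs;
      if (p->max[cl] < p->curr[cl])
	p->max[cl] = p->curr[cl];
    }
  else
    p->curr[cl] -= nregs;
}

/* Discount NREGS registers of class CL that are live through the loop
   but not referenced inside it, mirroring the subtraction made after
   the scan in the diff above.  */
static void
toy_discount_live_through (struct toy_pressure *p, enum toy_class cl,
			   int nregs)
{
  assert (nregs >= 0);
  p->max[cl] -= nregs;
}
```

A caller would zero-initialize a `struct toy_pressure`, call `toy_change_pressure` for each register birth and death while walking the insns, and then call `toy_discount_live_through` once per unreferenced live-through register.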

@@ -917,6 +917,7 @@ decode_options (unsigned int argc, const char **argv)
flag_ipa_cp_clone = opt3;
if (flag_ipa_cp_clone)
flag_ipa_cp = 1;
flag_ira_loop_pressure = opt3;
/* Just -O1/-O0 optimizations. */
opt1_max = (optimize <= 1);

@@ -719,6 +719,11 @@ DEFPARAM (PARAM_IRA_MAX_CONFLICT_TABLE_SIZE,
"max size of conflict table in MB",
1000, 0, 0)
DEFPARAM (PARAM_IRA_LOOP_RESERVED_REGS,
"ira-loop-reserved-regs",
"The number of registers in each class kept unused by loop invariant motion",
2, 0, 0)
/* Switch initialization conversion will refuse to create arrays that are
bigger than this parameter times the number of switch branches. */
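
To illustrate what "kept unused" could mean for PARAM_IRA_LOOP_RESERVED_REGS (default 2), the sketch below checks whether hoisting an invariant still leaves the reserved number of registers free in one class. This is an assumption about how the parameter is consumed: the real test lives in loop-invariant.c (gain_for_invariant), which this part of the diff does not show, and the function name and argument list here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if, for one register class, the loop's maximum pressure
   plus the registers taken by already-moved invariants (NEW_REGS) plus
   those needed by the candidate (REGS_NEEDED) still leaves RESERVED
   registers free out of AVAILABLE.  Hypothetical helper, not a GCC
   function.  */
static bool
toy_fits_with_reserve_p (int max_pressure, int new_regs, int regs_needed,
			 int reserved, int available)
{
  assert (max_pressure >= 0 && new_regs >= 0 && regs_needed >= 0);
  assert (reserved >= 0 && available >= 0);
  return max_pressure + new_regs + regs_needed + reserved <= available;
}
```

With 16 available registers and a reserve of 2, a loop at pressure 12 can take a move that needs 2 more registers, while a loop at pressure 13 cannot.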

@@ -160,6 +160,8 @@ typedef enum compiler_param
PARAM_VALUE (PARAM_IRA_MAX_LOOPS_NUM)
#define IRA_MAX_CONFLICT_TABLE_SIZE \
PARAM_VALUE (PARAM_IRA_MAX_CONFLICT_TABLE_SIZE)
#define IRA_LOOP_RESERVED_REGS \
PARAM_VALUE (PARAM_IRA_LOOP_RESERVED_REGS)
#define SWITCH_CONVERSION_BRANCH_RATIO \
PARAM_VALUE (PARAM_SWITCH_CONVERSION_BRANCH_RATIO)
#define LOOP_INVARIANT_MAX_BBS_IN_LOOP \

@@ -943,6 +943,7 @@ init_optimization_passes (void)
NEXT_PASS (pass_rtl_store_motion);
NEXT_PASS (pass_cse_after_global_opts);
NEXT_PASS (pass_rtl_ifcvt);
NEXT_PASS (pass_reginfo_init);
/* Perform loop optimizations. It might be better to do them a bit
sooner, but we want the profile feedback to work more
efficiently. */
@@ -962,7 +963,6 @@ init_optimization_passes (void)
NEXT_PASS (pass_cse2);
NEXT_PASS (pass_rtl_dse1);
NEXT_PASS (pass_rtl_fwprop_addr);
NEXT_PASS (pass_reginfo_init);
NEXT_PASS (pass_inc_dec);
NEXT_PASS (pass_initialize_regs);
NEXT_PASS (pass_ud_rtl_dce);
@@ -978,10 +978,8 @@ init_optimization_passes (void)
NEXT_PASS (pass_mode_switching);
NEXT_PASS (pass_match_asm_constraints);
NEXT_PASS (pass_sms);
NEXT_PASS (pass_subregs_of_mode_init);
NEXT_PASS (pass_sched);
NEXT_PASS (pass_ira);
NEXT_PASS (pass_subregs_of_mode_finish);
NEXT_PASS (pass_postreload);
{
struct opt_pass **p = &pass_postreload.pass.sub;

@@ -904,6 +904,9 @@ struct reg_pref
run. */
static struct reg_pref *reg_pref;
/* Current size of reg_info. */
static int reg_info_size;
/* Return the reg_class in which pseudo reg number REGNO is best allocated.
This function is sometimes called before the info has been computed.
When that happens, just return GENERAL_REGS, which is innocuous. */
@@ -937,9 +940,6 @@ reg_cover_class (int regno)
/* Current size of reg_info. */
static int reg_info_size;
/* Allocate space for reg info. */
static void
allocate_reg_info (void)
@@ -1040,6 +1040,7 @@ setup_reg_classes (int regno,
{
if (reg_pref == NULL)
return;
gcc_assert (reg_info_size == max_reg_num ());
reg_pref[regno].prefclass = prefclass;
reg_pref[regno].altclass = altclass;
reg_pref[regno].coverclass = coverclass;
@@ -1321,7 +1322,7 @@ find_subregs_of_mode (rtx x)
}
}
static unsigned int
void
init_subregs_of_mode (void)
{
basic_block bb;
@@ -1336,8 +1337,6 @@ init_subregs_of_mode (void)
FOR_BB_INSNS (bb, insn)
if (INSN_P (insn))
find_subregs_of_mode (PATTERN (insn));
return 0;
}
/* Return 1 if REGNO has had an invalid mode change in CLASS from FROM
@@ -1367,74 +1366,22 @@ invalid_mode_change_p (unsigned int regno,
return false;
}
static unsigned int
void
finish_subregs_of_mode (void)
{
htab_delete (subregs_of_mode);
subregs_of_mode = 0;
return 0;
}
#else
static unsigned int
void
init_subregs_of_mode (void)
{
return 0;
}
static unsigned int
void
finish_subregs_of_mode (void)
{
return 0;
}
#endif /* CANNOT_CHANGE_MODE_CLASS */
static bool
gate_subregs_of_mode_init (void)
{
#ifdef CANNOT_CHANGE_MODE_CLASS
return true;
#else
return false;
#endif
}
struct rtl_opt_pass pass_subregs_of_mode_init =
{
{
RTL_PASS,
"subregs_of_mode_init", /* name */
gate_subregs_of_mode_init, /* gate */
init_subregs_of_mode, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
TV_NONE, /* tv_id */
0, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
0 /* todo_flags_finish */
}
};
struct rtl_opt_pass pass_subregs_of_mode_finish =
{
{
RTL_PASS,
"subregs_of_mode_finish", /* name */
gate_subregs_of_mode_init, /* gate */
finish_subregs_of_mode, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
TV_NONE, /* tv_id */
0, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
0 /* todo_flags_finish */
}
};
#include "gt-reginfo.h"

@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see
#include "timevar.h"
#include "tree-pass.h"
#include "df.h"
#include "ira.h"
static int optimize_reg_copy_1 (rtx, rtx, rtx);
static void optimize_reg_copy_2 (rtx, rtx, rtx);
@@ -1226,6 +1227,9 @@ regmove_optimize (void)
df_note_add_problem ();
df_analyze ();
if (flag_ira_loop_pressure)
ira_set_pseudo_classes (dump_file);
regstat_init_n_sets_and_refs ();
regstat_compute_ri ();
@@ -1248,6 +1252,8 @@ regmove_optimize (void)
}
regstat_free_n_sets_and_refs ();
regstat_free_ri ();
if (flag_ira_loop_pressure)
free_reg_info ();
return 0;
}

@@ -1930,6 +1930,8 @@ extern void init_move_cost (enum machine_mode);
extern bool resize_reg_info (void);
/* Free up register info memory. */
extern void free_reg_info (void);
extern void init_subregs_of_mode (void);
extern void finish_subregs_of_mode (void);
/* recog.c */
extern rtx extract_asm_operands (rtx);

@@ -142,6 +142,7 @@ extern int flag_unroll_all_loops;
extern int flag_unswitch_loops;
extern int flag_cprop_registers;
extern int time_report;
extern int flag_ira_loop_pressure;
extern int flag_ira_coalesce;
extern int flag_ira_move_spills;
extern int flag_ira_share_save_slots;

@@ -497,8 +497,6 @@ extern struct rtl_opt_pass pass_cse2;
extern struct rtl_opt_pass pass_df_initialize_opt;
extern struct rtl_opt_pass pass_df_initialize_no_opt;
extern struct rtl_opt_pass pass_reginfo_init;
extern struct rtl_opt_pass pass_subregs_of_mode_init;
extern struct rtl_opt_pass pass_subregs_of_mode_finish;
extern struct rtl_opt_pass pass_inc_dec;
extern struct rtl_opt_pass pass_stack_ptr_mod;
extern struct rtl_opt_pass pass_initialize_regs;