dbxout.c (dbxout_symbol_location): Pass new argument to alter_subreg.

2012-10-23  Vladimir Makarov  <vmakarov@redhat.com>

	* dbxout.c (dbxout_symbol_location): Pass new argument to
	alter_subreg.
	* dwarf2out.c: Include ira.h and lra.h.
	(based_loc_descr, compute_frame_pointer_to_fb_displacement): Use
	lra_eliminate_regs for LRA instead of eliminate_regs.
	* expr.c (emit_move_insn_1): Pass an additional argument to
	emit_move_via_integer.  Use emit_move_via_integer for LRA only if
	the insn is recognized.
	* emit-rtl.c (gen_rtx_REG): Add lra_in_progress.
	(validate_subreg): Don't check offset for LRA and floating point
	modes.
	* final.c (final_scan_insn, cleanup_subreg_operands): Pass new
	argument to alter_subreg.
	(walk_alter_subreg, output_operand): Ditto.
	(alter_subreg): Add new argument.
	* gcse.c (calculate_bb_reg_pressure): Add parameter to
	ira_setup_eliminable_regset call.
	* ira.c: Include lra.h.
	(ira_init_once, ira_init, ira_finish_once): Call lra_start_once,
	lra_init, lra_finish_once unconditionally.
	(ira_setup_eliminable_regset): Add parameter.  Remove need_fp.
	Call lra_init_elimination and mark HARD_FRAME_POINTER_REGNUM as
	living forever if frame_pointer_needed.
	(setup_reg_class_relations): Set up ira_reg_class_subset.
	(ira_reg_equiv_invariant_p, ira_reg_equiv_const): Remove.
	(find_reg_equiv_invariant_const): Ditto.
	(setup_reg_renumber): Use ira_equiv_no_lvalue_p instead of
	ira_reg_equiv_invariant_p.  Skip caps for LRA.
	(setup_reg_equiv_init, ira_update_equiv_info_by_shuffle_insn): New
	functions.
	(ira_reg_equiv_len, ira_reg_equiv): New externals.
	(ira_reg_equiv): New.
	(ira_expand_reg_equiv, init_reg_equiv, finish_reg_equiv): New
	functions.
	(no_equiv, update_equiv_regs): Use ira_reg_equiv instead of
	reg_equiv_init.
	(setup_reg_equiv): New function.
	(ira_use_lra_p): New global.
	(ira): Set up lra_simple_p and ira_conflicts_p.  Set up and
	restore flag_caller_saves and flag_ira_region.  Move
	initialization of ira_obstack and ira_bitmap_obstack upper.  Call
	init_reg_equiv, setup_reg_equiv, and setup_reg_equiv_init instead
	of initialization of ira_reg_equiv_len, ira_reg_equiv_invariant_p,
	and ira_reg_equiv_const.  Call ira_setup_eliminable_regset with a
	new argument.  Don't flatten IRA for LRA.  Don't reassign
	conflict allocnos for LRA.  Call finish_reg_equiv.
	(do_reload): Prepare code for LRA call.  Call LRA.
	* ira.h (ira_use_lra_p): New external.
	(struct target_ira): Add members x_ira_class_subset_p
	x_ira_reg_class_subset, and x_ira_reg_classes_intersect_p.
	(ira_class_subset_p, ira_reg_class_subset): New macros.
	(ira_reg_classes_intersect_p): New macro.
	(struct ira_reg_equiv): New.
	(ira_setup_eliminable_regset): Add an argument.
	(ira_expand_reg_equiv, ira_update_equiv_info_by_shuffle_insn): New
	prototypes.
	* ira-color.c (color_pass, move_spill_restore, coalesce_allocnos):
	Use ira_equiv_no_lvalue_p.
	(coalesce_spill_slots, ira_sort_regnos_for_alter_reg): Ditto.
	* ira-emit.c (ira_create_new_reg): Call ira_expand_reg_equiv.
	(generate_edge_moves, change_loop): Use ira_equiv_no_lvalue_p.
	(emit_move_list): Simplify code.  Call
	ira_update_equiv_info_by_shuffle_insn.  Use ira_reg_equiv instead
	of ira_reg_equiv_invariant_p and ira_reg_equiv_const.  Change
	assert.
	* ira-int.h (struct target_ira_int): Remove x_ira_class_subset_p
	and x_ira_reg_classes_intersect_p.
	(ira_class_subset_p, ira_reg_classes_intersect_p): Remove.
	(ira_reg_equiv_len, ira_reg_equiv_invariant_p): Ditto.
	(ira_reg_equiv_const): Ditto.
	(ira_equiv_no_lvalue_p): New function.
	* jump.c (true_regnum): Always use hard_regno for subreg_get_info
	when lra is in progress.
	* haifa-sched.c (sched_init): Pass new argument to
	ira_setup_eliminable_regset.
	* loop-invariant.c (calculate_loop_reg_pressure): Pass new
	argument to ira_setup_eliminable_regset.
	* lra.h: New.
	* lra-int.h: Ditto.
	* lra.c: Ditto.
	* lra-assigns.c: Ditto.
	* lra-constraints.c: Ditto.
	* lra-coalesce.c: Ditto.
	* lra-eliminations.c: Ditto.
	* lra-lives.c: Ditto.
	* lra-spills.c: Ditto.
	* Makefile.in (LRA_INT_H): New.
	(OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o,
	lra-constraints.o, lra-eliminations.o, lra-lives.o, and
	lra-spills.o.
	(dwarf2out.o): Add dependence on ira.h and lra.h.
	(ira.o): Add dependence on lra.h.
	(lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New
	entries.
	(lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto.
	* output.h (alter_subreg): Add new argument.
	* rtlanal.c (simplify_subreg_regno): Permit mode changes for LRA.
	Permit ARG_POINTER_REGNUM and STACK_POINTER_REGNUM for LRA.
	* recog.c (general_operand, register_operand): Accept paradoxical
	FLOAT_MODE subregs for LRA.
	(scratch_operand): Accept pseudos for LRA.
	* rtl.h (lra_in_progress): New external.
	(debug_bb_n_slim, debug_bb_slim, print_value_slim): New
	prototypes.
	(debug_rtl_slim, debug_insn_slim): Ditto.
	* sdbout.c (sdbout_symbol): Pass new argument to alter_subreg.
	* sched-vis.c (print_value_slim): New.
	* target.def (lra_p): New hook.
	(register_priority): Ditto.
	(different_addr_displacement_p): Ditto.
	(spill_class): Ditto.
	* target-globals.h (this_target_lra_int): New external.
	(target_globals): New member lra_int.
	(restore_target_globals): Restore this_target_lra_int.
	* target-globals.c: Include lra-int.h.
	(default_target_globals): Add &default_target_lra_int.
	* targhooks.c (default_lra_p): New function.
	(default_register_priority): Ditto.
	(default_different_addr_displacement_p): Ditto.
	* targhooks.h (default_lra_p): Declare.
	(default_register_priority): Ditto.
	(default_different_addr_displacement_p): Ditto.
	* timevar.def (TV_LRA, TV_LRA_ELIMINATE, TV_LRA_INHERITANCE): New.
	(TV_LRA_CREATE_LIVE_RANGES, TV_LRA_ASSIGN, TV_LRA_COALESCE): New.
	* config/arm/arm.c (load_multiple_sequence): Pass new argument to
	alter_subreg.
	(store_multiple_sequence): Ditto.
	* config/i386/i386.h (enum ix86_tune_indices): Add
	X86_TUNE_GENERAL_REGS_SSE_SPILL.
	(TARGET_GENERAL_REGS_SSE_SPILL): New macro.
	* config/i386/i386.c (initial_ix86_tune_features): Set up
	X86_TUNE_GENERAL_REGS_SSE_SPILL for m_COREI7 and m_CORE2I7.
	(ix86_lra_p, ix86_register_priority): New functions.
	(ix86_secondary_reload): Add NON_Q_REGS, SIREG, DIREG.
	(inline_secondary_memory_needed): Change assert.
	(ix86_spill_class): New function.
	(TARGET_LRA_P, TARGET_REGISTER_PRIORITY, TARGET_SPILL_CLASS): New
	macros.
	* config/m68k/m68k.c (emit_move_sequence): Pass new argument to
	alter_subreg.
	* config/m32r/m32r.c (gen_split_move_double): Ditto.
	* config/pa/pa.c (pa_emit_move_sequence): Ditto.
	* config/sh/sh.md: Ditto.
	* config/v850/v850.c (v850_reorg): Ditto.
	* config/xtensa/xtensa.c (fixup_subreg_mem): Ditto.
	* doc/md.texi: Add new interpretation of hint * for LRA.
	* doc/passes.texi: Describe LRA pass.
	* doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_PRIORITY,
	TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, and TARGET_SPILL_CLASS.
	* doc/tm.texi: Update.

From-SVN: r192719
Vladimir Makarov 2012-10-23 15:51:41 +00:00 committed by Vladimir Makarov
parent 6acf25e4b3
commit 55a2c3226a
50 changed files with 13722 additions and 282 deletions
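To see what the new LRA target hooks look like from a backend's point of view, here is a self-contained C sketch modeled on the ix86_lra_p and ix86_register_priority functions this commit adds to i386.c. The enum values below are simplified stand-ins, not GCC's actual hard register numbering, and the functions are plain C rather than entries wired into targetm:

```c
#include <stdbool.h>

/* Stand-in hard register numbers (hypothetical values, not GCC's).  */
enum { AX_REG = 0, BP_REG = 6,
       FIRST_REX_INT_REG = 8, LAST_REX_INT_REG = 15,
       R12_REG = 12, R13_REG = 13 };

/* Tell the allocator to run LRA instead of reload, mirroring
   ix86_lra_p from this commit.  */
static bool lra_p (void) { return true; }

/* Higher number = more preferable hard register, mirroring
   ix86_register_priority: discourage regs whose use as a base needs
   a displacement (ebp, r13) or an index (r12), discourage regs that
   need a REX prefix, and prefer eax for smaller code.  */
static int
register_priority (int hard_regno)
{
  if (hard_regno == R12_REG || hard_regno == R13_REG)
    return 0;
  if (hard_regno == BP_REG)
    return 1;
  if (FIRST_REX_INT_REG <= hard_regno && hard_regno <= LAST_REX_INT_REG)
    return 2;
  if (hard_regno == AX_REG)
    return 4;
  return 3;
}
```

LRA consults this priority when all other costs are equal, so the sketch only has to impose a total preference order on hard registers.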



@@ -940,6 +940,7 @@ TREE_DATA_REF_H = tree-data-ref.h $(OMEGA_H) graphds.h $(SCEV_H)
TREE_INLINE_H = tree-inline.h vecir.h
REAL_H = real.h $(MACHMODE_H)
IRA_INT_H = ira.h ira-int.h $(CFGLOOP_H) alloc-pool.h
LRA_INT_H = lra.h $(BITMAP_H) $(RECOG_H) $(INSN_ATTR_H) insn-codes.h lra-int.h
DBGCNT_H = dbgcnt.h dbgcnt.def
EBITMAP_H = ebitmap.h sbitmap.h
LTO_STREAMER_H = lto-streamer.h $(LINKER_PLUGIN_API_H) $(TARGET_H) \
@@ -1272,6 +1273,13 @@ OBJS = \
loop-unroll.o \
loop-unswitch.o \
lower-subreg.o \
lra.o \
lra-assigns.o \
lra-coalesce.o \
lra-constraints.o \
lra-eliminations.o \
lra-lives.o \
lra-spills.o \
lto-cgraph.o \
lto-streamer.o \
lto-streamer-in.o \
@@ -2783,7 +2791,7 @@ dwarf2out.o : dwarf2out.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h \
toplev.h $(DIAGNOSTIC_CORE_H) $(DWARF2OUT_H) reload.h \
$(GGC_H) $(EXCEPT_H) dwarf2asm.h $(TM_P_H) langhooks.h $(HASHTAB_H) \
gt-dwarf2out.h $(TARGET_H) $(CGRAPH_H) $(MD5_H) $(INPUT_H) $(FUNCTION_H) \
$(GIMPLE_H) $(TREE_FLOW_H) \
$(GIMPLE_H) ira.h lra.h $(TREE_FLOW_H) \
$(TREE_PRETTY_PRINT_H) $(COMMON_TARGET_H) $(OPTS_H)
dwarf2cfi.o : dwarf2cfi.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
version.h $(RTL_H) $(EXPR_H) $(REGS_H) $(FUNCTION_H) output.h \
@@ -3217,7 +3225,43 @@ ira.o: ira.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(REGS_H) $(RTL_H) $(TM_P_H) $(TARGET_H) $(FLAGS_H) $(OBSTACK_H) \
$(BITMAP_H) hard-reg-set.h $(BASIC_BLOCK_H) $(DBGCNT_H) $(FUNCTION_H) \
$(EXPR_H) $(RECOG_H) $(PARAMS_H) $(TREE_PASS_H) output.h \
$(EXCEPT_H) reload.h toplev.h $(DIAGNOSTIC_CORE_H) $(DF_H) $(GGC_H) $(IRA_INT_H)
$(EXCEPT_H) reload.h toplev.h $(DIAGNOSTIC_CORE_H) \
$(DF_H) $(GGC_H) $(IRA_INT_H) lra.h
lra.o : lra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) insn-config.h insn-codes.h $(TIMEVAR_H) $(TREE_PASS_H) \
$(DF_H) $(RECOG_H) output.h addresses.h $(REGS_H) hard-reg-set.h \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) \
$(EXCEPT_H) ira.h $(LRA_INT_H)
lra-assigns.o : lra-assigns.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \
rtl-error.h sparseset.h $(LRA_INT_H)
lra-coalesce.o : lra-coalesce.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \
rtl-error.h ira.h $(LRA_INT_H)
lra-constraints.o : lra-constraints.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(RTL_H) $(REGS_H) insn-config.h insn-codes.h $(DF_H) \
$(RECOG_H) output.h addresses.h $(REGS_H) hard-reg-set.h $(FLAGS_H) \
$(FUNCTION_H) $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \
ira.h rtl-error.h $(LRA_INT_H)
lra-eliminations.o : lra-eliminations.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \
rtl-error.h $(LRA_INT_H)
lra-lives.o : lra-lives.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \
$(LRA_INT_H)
lra-spills.o : lra-spills.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) insn-config.h $(DF_H) \
$(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \
$(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \
ira.h $(LRA_INT_H)
regmove.o : regmove.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(TREE_PASS_H) $(DF_H) \
$(RECOG_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \


@@ -10328,7 +10328,7 @@ load_multiple_sequence (rtx *operands, int nops, int *regs, int *saved_order,
/* Convert a subreg of a mem into the mem itself. */
if (GET_CODE (operands[nops + i]) == SUBREG)
operands[nops + i] = alter_subreg (operands + (nops + i));
operands[nops + i] = alter_subreg (operands + (nops + i), true);
gcc_assert (MEM_P (operands[nops + i]));
@@ -10480,7 +10480,7 @@ store_multiple_sequence (rtx *operands, int nops, int nops_total,
/* Convert a subreg of a mem into the mem itself. */
if (GET_CODE (operands[nops + i]) == SUBREG)
operands[nops + i] = alter_subreg (operands + (nops + i));
operands[nops + i] = alter_subreg (operands + (nops + i), true);
gcc_assert (MEM_P (operands[nops + i]));


@@ -2267,7 +2267,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
/* X86_TUNE_REASSOC_FP_TO_PARALLEL: Try to produce parallel computations
during reassociation of fp computation. */
m_ATOM
m_ATOM,
/* X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
regs instead of memory. */
m_COREI7 | m_CORE2I7
};
/* Feature tests against the various architecture variations. */
@@ -32046,6 +32050,38 @@ ix86_free_from_memory (enum machine_mode mode)
}
}
/* Return true if we use LRA instead of reload pass. */
static bool
ix86_lra_p (void)
{
return true;
}
/* Return a register priority for hard reg REGNO. */
static int
ix86_register_priority (int hard_regno)
{
/* ebp and r13 as the base always wants a displacement, r12 as the
base always wants an index. So discourage their usage in an
address. */
if (hard_regno == R12_REG || hard_regno == R13_REG)
return 0;
if (hard_regno == BP_REG)
return 1;
/* New x86-64 int registers result in bigger code size. Discourage
them. */
if (FIRST_REX_INT_REG <= hard_regno && hard_regno <= LAST_REX_INT_REG)
return 2;
/* New x86-64 SSE registers result in bigger code size. Discourage
them. */
if (FIRST_REX_SSE_REG <= hard_regno && hard_regno <= LAST_REX_SSE_REG)
return 2;
/* Usage of AX register results in smaller code. Prefer it. */
if (hard_regno == 0)
return 4;
return 3;
}
/* Implement TARGET_PREFERRED_RELOAD_CLASS.
Put float CONST_DOUBLE in the constant pool instead of fp regs.
@@ -32179,6 +32215,9 @@ ix86_secondary_reload (bool in_p, rtx x, reg_class_t rclass,
&& !in_p && mode == QImode
&& (rclass == GENERAL_REGS
|| rclass == LEGACY_REGS
|| rclass == NON_Q_REGS
|| rclass == SIREG
|| rclass == DIREG
|| rclass == INDEX_REGS))
{
int regno;
@@ -32288,7 +32327,7 @@ inline_secondary_memory_needed (enum reg_class class1, enum reg_class class2,
|| MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1)
|| MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2))
{
gcc_assert (!strict);
gcc_assert (!strict || lra_in_progress);
return true;
}
@@ -40839,6 +40878,22 @@ ix86_autovectorize_vector_sizes (void)
return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
}
/* Return class of registers which could be used for pseudo of MODE
and of class RCLASS for spilling instead of memory. Return NO_REGS
if it is not possible or non-profitable. */
static reg_class_t
ix86_spill_class (reg_class_t rclass, enum machine_mode mode)
{
if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
&& hard_reg_set_subset_p (reg_class_contents[rclass],
reg_class_contents[GENERAL_REGS])
&& (mode == SImode || (TARGET_64BIT && mode == DImode)))
return SSE_REGS;
return NO_REGS;
}
/* Implement targetm.vectorize.init_cost. */
static void *
@@ -41241,6 +41296,12 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
#undef TARGET_LEGITIMATE_ADDRESS_P
#define TARGET_LEGITIMATE_ADDRESS_P ix86_legitimate_address_p
#undef TARGET_LRA_P
#define TARGET_LRA_P ix86_lra_p
#undef TARGET_REGISTER_PRIORITY
#define TARGET_REGISTER_PRIORITY ix86_register_priority
#undef TARGET_LEGITIMATE_CONSTANT_P
#define TARGET_LEGITIMATE_CONSTANT_P ix86_legitimate_constant_p
@@ -41264,6 +41325,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
#define TARGET_INIT_LIBFUNCS darwin_rename_builtins
#endif
#undef TARGET_SPILL_CLASS
#define TARGET_SPILL_CLASS ix86_spill_class
struct gcc_target targetm = TARGET_INITIALIZER;
#include "gt-i386.h"
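The spill-class decision added above can be exercised outside GCC with a small model. Everything here is a simplified stand-in: the enums and the boolean flag parameters are hypothetical replacements for GCC's `reg_class_contents` subset test and target tuning flags, and only the shape of the ix86_spill_class decision is preserved:

```c
#include <stdbool.h>

typedef enum { NO_REGS, GENERAL_REGS, SSE_REGS } reg_class_t;
typedef enum { QImode, HImode, SImode, DImode } machine_mode;

/* Simplified model of ix86_spill_class: allow spilling
   general-purpose pseudos to SSE registers instead of memory, but
   only for word-size integer modes (SImode, or DImode on 64-bit)
   and only when the tuning flag is on and MMX is off.  */
static reg_class_t
spill_class (reg_class_t rclass, machine_mode mode,
             bool sse, bool general_regs_sse_spill, bool mmx, bool x64)
{
  if (sse && general_regs_sse_spill && !mmx
      && rclass == GENERAL_REGS   /* stand-in for the subset test */
      && (mode == SImode || (x64 && mode == DImode)))
    return SSE_REGS;
  return NO_REGS;                 /* fall back to memory */
}
```

Returning `NO_REGS` means LRA spills to a stack slot as usual; any other class tells LRA it may use a register of that class as the spill location.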


@@ -327,6 +327,7 @@ enum ix86_tune_indices {
X86_TUNE_AVX128_OPTIMAL,
X86_TUNE_REASSOC_INT_TO_PARALLEL,
X86_TUNE_REASSOC_FP_TO_PARALLEL,
X86_TUNE_GENERAL_REGS_SSE_SPILL,
X86_TUNE_LAST
};
@@ -431,6 +432,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
ix86_tune_features[X86_TUNE_REASSOC_INT_TO_PARALLEL]
#define TARGET_REASSOC_FP_TO_PARALLEL \
ix86_tune_features[X86_TUNE_REASSOC_FP_TO_PARALLEL]
#define TARGET_GENERAL_REGS_SSE_SPILL \
ix86_tune_features[X86_TUNE_GENERAL_REGS_SSE_SPILL]
/* Feature tests against the various architecture variations. */
enum ix86_arch_indices {


@@ -1030,9 +1030,9 @@ gen_split_move_double (rtx operands[])
subregs to make this code simpler. It is safe to call
alter_subreg any time after reload. */
if (GET_CODE (dest) == SUBREG)
alter_subreg (&dest);
alter_subreg (&dest, true);
if (GET_CODE (src) == SUBREG)
alter_subreg (&src);
alter_subreg (&src, true);
start_sequence ();
if (REG_P (dest))


@@ -3658,7 +3658,7 @@ emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg)
rtx temp = gen_rtx_SUBREG (GET_MODE (operand0),
reg_equiv_mem (REGNO (SUBREG_REG (operand0))),
SUBREG_BYTE (operand0));
operand0 = alter_subreg (&temp);
operand0 = alter_subreg (&temp, true);
}
if (scratch_reg
@@ -3675,7 +3675,7 @@ emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg)
rtx temp = gen_rtx_SUBREG (GET_MODE (operand1),
reg_equiv_mem (REGNO (SUBREG_REG (operand1))),
SUBREG_BYTE (operand1));
operand1 = alter_subreg (&temp);
operand1 = alter_subreg (&temp, true);
}
if (scratch_reg && reload_in_progress && GET_CODE (operand0) == MEM


@@ -1616,7 +1616,7 @@ pa_emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg)
rtx temp = gen_rtx_SUBREG (GET_MODE (operand0),
reg_equiv_mem (REGNO (SUBREG_REG (operand0))),
SUBREG_BYTE (operand0));
operand0 = alter_subreg (&temp);
operand0 = alter_subreg (&temp, true);
}
if (scratch_reg
@@ -1633,7 +1633,7 @@ pa_emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg)
rtx temp = gen_rtx_SUBREG (GET_MODE (operand1),
reg_equiv_mem (REGNO (SUBREG_REG (operand1))),
SUBREG_BYTE (operand1));
operand1 = alter_subreg (&temp);
operand1 = alter_subreg (&temp, true);
}
if (scratch_reg && reload_in_progress && GET_CODE (operand0) == MEM


@@ -7366,7 +7366,7 @@ label:
rtx regop = operands[store_p], word0 ,word1;
if (GET_CODE (regop) == SUBREG)
alter_subreg (&regop);
alter_subreg (&regop, true);
if (REGNO (XEXP (addr, 0)) == REGNO (XEXP (addr, 1)))
offset = 2;
else
@@ -7374,9 +7374,9 @@ label:
mem = copy_rtx (mem);
PUT_MODE (mem, SImode);
word0 = gen_rtx_SUBREG (SImode, regop, 0);
alter_subreg (&word0);
alter_subreg (&word0, true);
word1 = gen_rtx_SUBREG (SImode, regop, 4);
alter_subreg (&word1);
alter_subreg (&word1, true);
if (store_p || ! refers_to_regno_p (REGNO (word0),
REGNO (word0) + 1, addr, 0))
{
@@ -7834,7 +7834,7 @@ label:
else
{
x = gen_rtx_SUBREG (V2SFmode, operands[0], i * 8);
alter_subreg (&x);
alter_subreg (&x, true);
}
if (MEM_P (operands[1]))
@@ -7843,7 +7843,7 @@ label:
else
{
y = gen_rtx_SUBREG (V2SFmode, operands[1], i * 8);
alter_subreg (&y);
alter_subreg (&y, true);
}
emit_insn (gen_movv2sf_i (x, y));


@@ -1301,11 +1301,11 @@ v850_reorg (void)
if (GET_CODE (dest) == SUBREG
&& (GET_CODE (SUBREG_REG (dest)) == MEM
|| GET_CODE (SUBREG_REG (dest)) == REG))
alter_subreg (&dest);
alter_subreg (&dest, true);
if (GET_CODE (src) == SUBREG
&& (GET_CODE (SUBREG_REG (src)) == MEM
|| GET_CODE (SUBREG_REG (src)) == REG))
alter_subreg (&src);
alter_subreg (&src, true);
if (GET_CODE (dest) == MEM && GET_CODE (src) == MEM)
mem = NULL_RTX;


@@ -1087,7 +1087,7 @@ fixup_subreg_mem (rtx x)
gen_rtx_SUBREG (GET_MODE (x),
reg_equiv_mem (REGNO (SUBREG_REG (x))),
SUBREG_BYTE (x));
x = alter_subreg (&temp);
x = alter_subreg (&temp, true);
}
return x;
}


@@ -2994,7 +2994,7 @@ dbxout_symbol_location (tree decl, tree type, const char *suffix, rtx home)
if (REGNO (value) >= FIRST_PSEUDO_REGISTER)
return 0;
}
home = alter_subreg (&home);
home = alter_subreg (&home, true);
}
if (REG_P (home))
{


@@ -1,5 +1,5 @@
@c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001,
@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.
@@ -1622,7 +1622,9 @@ register preferences.
@item *
Says that the following character should be ignored when choosing
register preferences. @samp{*} has no effect on the meaning of the
constraint as a constraint, and no effect on reloading.
constraint as a constraint, and no effect on reloading. For LRA,
@samp{*} additionally slightly disparages the alternative if the
following character matches the operand.
@ifset INTERNALS
Here is an example: the 68000 has an instruction to sign-extend a


@@ -771,7 +771,7 @@ branch instructions. The source file for this pass is @file{gcse.c}.
This pass attempts to replace conditional branches and surrounding
assignments with arithmetic, boolean value producing comparison
instructions, and conditional move instructions. In the very last
invocation after reload, it will generate predicated instructions
invocation after reload/LRA, it will generate predicated instructions
when supported by the target. The code is located in @file{ifcvt.c}.
@item Web construction
@@ -842,9 +842,9 @@ source file is @file{regmove.c}.
The integrated register allocator (@acronym{IRA}). It is called
integrated because coalescing, register live range splitting, and hard
register preferencing are done on-the-fly during coloring. It also
has better integration with the reload pass. Pseudo-registers spilled
by the allocator or the reload have still a chance to get
hard-registers if the reload evicts some pseudo-registers from
has better integration with the reload/LRA pass. Pseudo-registers spilled
by the allocator or the reload/LRA have still a chance to get
hard-registers if the reload/LRA evicts some pseudo-registers from
hard-registers. The allocator helps to choose better pseudos for
spilling based on their live ranges and to coalesce stack slots
allocated for the spilled pseudo-registers. IRA is a regional
@@ -875,6 +875,23 @@ instructions to save and restore call-clobbered registers around calls.
Source files are @file{reload.c} and @file{reload1.c}, plus the header
@file{reload.h} used for communication between them.
@cindex Local Register Allocator (LRA)
@item
This pass is a modern replacement of the reload pass. Source files
are @file{lra.c}, @file{lra-assigns.c}, @file{lra-coalesce.c},
@file{lra-constraints.c}, @file{lra-eliminations.c},
@file{lra-lives.c}, @file{lra-spills.c}, the header
@file{lra-int.h} used for
communication between them, and the header @file{lra.h} used for
communication between LRA and the rest of compiler.
Unlike the reload pass, intermediate LRA decisions are reflected in
RTL as much as possible. This reduces the number of target-dependent
macros and hooks, leaving instruction constraints as the primary
source of control.
LRA is run on targets for which TARGET_LRA_P returns true.
@end itemize
@item Basic block reordering


@@ -2893,6 +2893,22 @@ as below:
@end smallexample
@end defmac
@deftypefn {Target Hook} bool TARGET_LRA_P (void)
A target hook which returns true if we use LRA instead of the reload pass. It means that LRA has been ported to the target. The default version of this target hook always returns false.
@end deftypefn
@deftypefn {Target Hook} int TARGET_REGISTER_PRIORITY (int)
A target hook which returns the register priority number to which the register @var{hard_regno} belongs. The bigger the number, the more preferable the hard register usage (when all other conditions are the same). This hook can be used to prefer some hard registers over others in LRA. For example, some x86-64 register usages need an additional prefix which makes instructions longer. The hook can return a lower priority number for such registers to make them less favorable and, as a result, make the generated code smaller. The default version of this target hook always returns zero.
@end deftypefn
@deftypefn {Target Hook} bool TARGET_DIFFERENT_ADDR_DISPLACEMENT_P (void)
A target hook which returns true if an address with the same structure can have a different maximal legitimate displacement. For example, the displacement can depend on the memory mode or on operand combinations in the insn. The default version of this target hook always returns false.
@end deftypefn
@deftypefn {Target Hook} reg_class_t TARGET_SPILL_CLASS (reg_class_t, enum @var{machine_mode})
This hook defines a class of registers which could be used for spilling pseudos of the given mode and class, or @code{NO_REGS} if only memory should be used. Not defining this hook is equivalent to returning @code{NO_REGS} for all inputs.
@end deftypefn
@node Old Constraints
@section Obsolete Macros for Defining Constraints
@cindex defining constraints, obsolete method


@@ -2869,6 +2869,14 @@ as below:
@end smallexample
@end defmac
@hook TARGET_LRA_P
@hook TARGET_REGISTER_PRIORITY
@hook TARGET_DIFFERENT_ADDR_DISPLACEMENT_P
@hook TARGET_SPILL_CLASS
@node Old Constraints
@section Obsolete Macros for Defining Constraints
@cindex defining constraints, obsolete method


@@ -90,6 +90,8 @@ along with GCC; see the file COPYING3. If not see
#include "cgraph.h"
#include "input.h"
#include "gimple.h"
#include "ira.h"
#include "lra.h"
#include "dumpfile.h"
#include "opts.h"
@@ -10162,7 +10164,9 @@ based_loc_descr (rtx reg, HOST_WIDE_INT offset,
argument pointer and soft frame pointer rtx's. */
if (reg == arg_pointer_rtx || reg == frame_pointer_rtx)
{
rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
rtx elim = (ira_use_lra_p
? lra_eliminate_regs (reg, VOIDmode, NULL_RTX)
: eliminate_regs (reg, VOIDmode, NULL_RTX));
if (elim != reg)
{
@ -15020,7 +15024,9 @@ compute_frame_pointer_to_fb_displacement (HOST_WIDE_INT offset)
offset += ARG_POINTER_CFA_OFFSET (current_function_decl);
#endif
elim = eliminate_regs (reg, VOIDmode, NULL_RTX);
elim = (ira_use_lra_p
? lra_eliminate_regs (reg, VOIDmode, NULL_RTX)
: eliminate_regs (reg, VOIDmode, NULL_RTX));
if (GET_CODE (elim) == PLUS)
{
offset += INTVAL (XEXP (elim, 1));


@ -578,7 +578,7 @@ gen_rtx_REG (enum machine_mode mode, unsigned int regno)
Also don't do this when we are making new REGs in reload, since
we don't want to get confused with the real pointers. */
if (mode == Pmode && !reload_in_progress)
if (mode == Pmode && !reload_in_progress && !lra_in_progress)
{
if (regno == FRAME_POINTER_REGNUM
&& (!reload_completed || frame_pointer_needed))
@ -720,7 +720,14 @@ validate_subreg (enum machine_mode omode, enum machine_mode imode,
(subreg:SI (reg:DF) 0) isn't. */
else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
{
if (isize != osize)
if (! (isize == osize
/* LRA can use a subreg to store a floating point value in
an integer mode. Although the floating point and the
integer modes need the same number of hard registers, the
size of the floating point mode can be less than that of
the integer mode. LRA also uses subregs for a register
that should be used in different modes in one insn. */
|| lra_in_progress))
return false;
}
@ -753,7 +760,8 @@ validate_subreg (enum machine_mode omode, enum machine_mode imode,
of a subword. A subreg does *not* perform arbitrary bit extraction.
Given that we've already checked mode/offset alignment, we only have
to check subword subregs here. */
if (osize < UNITS_PER_WORD)
if (osize < UNITS_PER_WORD
&& ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))))
{
enum machine_mode wmode = isize > UNITS_PER_WORD ? word_mode : imode;
unsigned int low_off = subreg_lowpart_offset (omode, wmode);
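The relaxed size check introduced above can be modeled as a pure predicate. This is a simplified sketch of just that one condition, with mode sizes and flags reduced to plain values (the real validate_subreg performs many more checks):

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the patched size check: when either mode is a
   floating-point mode, unequal inner/outer sizes are rejected unless
   LRA is in progress, since LRA may wrap a floating-point value in an
   integer-mode subreg of a different size.  */
static bool
float_subreg_size_ok (unsigned isize, unsigned osize,
                      bool float_mode_p, bool lra_in_progress)
{
  if (!float_mode_p)
    return true;                /* this particular check does not apply */
  return isize == osize || lra_in_progress;
}
```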


@ -3448,9 +3448,13 @@ emit_move_insn_1 (rtx x, rtx y)
fits within a HOST_WIDE_INT. */
if (!CONSTANT_P (y) || GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
{
rtx ret = emit_move_via_integer (mode, x, y, false);
rtx ret = emit_move_via_integer (mode, x, y, lra_in_progress);
if (ret)
return ret;
{
if (! lra_in_progress || recog (PATTERN (ret), ret, 0) >= 0)
return ret;
}
}
return emit_move_multi_word (mode, x, y);
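The control flow of the change above can be sketched abstractly: a candidate produced by an emit_move_via_integer-style helper is accepted unconditionally outside LRA, but under LRA only when a recog-style validator accepts it; otherwise the caller falls back to the multi-word path. The types and helpers below are stand-ins, not GCC's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int id; } insn_t;

/* Keep CAND only if it is usable: always outside LRA, and only when
   RECOGNIZED_P accepts it during LRA.  NULL means "fall back".  */
static const insn_t *
accept_candidate (const insn_t *cand, bool lra_in_progress,
                  bool (*recognized_p) (const insn_t *))
{
  if (cand == NULL)
    return NULL;
  if (!lra_in_progress || recognized_p (cand))
    return cand;
  return NULL;
}

static bool always_yes (const insn_t *i) { (void) i; return true; }
static bool always_no (const insn_t *i) { (void) i; return false; }
static const insn_t k_insn = { 1 };
```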


@ -2560,7 +2560,7 @@ final_scan_insn (rtx insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
{
rtx src1, src2;
if (GET_CODE (SET_SRC (set)) == SUBREG)
SET_SRC (set) = alter_subreg (&SET_SRC (set));
SET_SRC (set) = alter_subreg (&SET_SRC (set), true);
src1 = SET_SRC (set);
src2 = NULL_RTX;
@ -2568,10 +2568,10 @@ final_scan_insn (rtx insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
{
if (GET_CODE (XEXP (SET_SRC (set), 0)) == SUBREG)
XEXP (SET_SRC (set), 0)
= alter_subreg (&XEXP (SET_SRC (set), 0));
= alter_subreg (&XEXP (SET_SRC (set), 0), true);
if (GET_CODE (XEXP (SET_SRC (set), 1)) == SUBREG)
XEXP (SET_SRC (set), 1)
= alter_subreg (&XEXP (SET_SRC (set), 1));
= alter_subreg (&XEXP (SET_SRC (set), 1), true);
if (XEXP (SET_SRC (set), 1)
== CONST0_RTX (GET_MODE (XEXP (SET_SRC (set), 0))))
src2 = XEXP (SET_SRC (set), 0);
@ -2974,7 +2974,7 @@ cleanup_subreg_operands (rtx insn)
expression directly. */
if (GET_CODE (*recog_data.operand_loc[i]) == SUBREG)
{
recog_data.operand[i] = alter_subreg (recog_data.operand_loc[i]);
recog_data.operand[i] = alter_subreg (recog_data.operand_loc[i], true);
changed = true;
}
else if (GET_CODE (recog_data.operand[i]) == PLUS
@ -2987,7 +2987,7 @@ cleanup_subreg_operands (rtx insn)
{
if (GET_CODE (*recog_data.dup_loc[i]) == SUBREG)
{
*recog_data.dup_loc[i] = alter_subreg (recog_data.dup_loc[i]);
*recog_data.dup_loc[i] = alter_subreg (recog_data.dup_loc[i], true);
changed = true;
}
else if (GET_CODE (*recog_data.dup_loc[i]) == PLUS
@ -2999,11 +2999,11 @@ cleanup_subreg_operands (rtx insn)
df_insn_rescan (insn);
}
/* If X is a SUBREG, replace it with a REG or a MEM,
based on the thing it is a subreg of. */
/* If X is a SUBREG, try to replace it with a REG or a MEM, based on
the thing it is a subreg of. Do it anyway if FINAL_P. */
rtx
alter_subreg (rtx *xp)
alter_subreg (rtx *xp, bool final_p)
{
rtx x = *xp;
rtx y = SUBREG_REG (x);
@ -3027,16 +3027,19 @@ alter_subreg (rtx *xp)
offset += difference % UNITS_PER_WORD;
}
*xp = adjust_address (y, GET_MODE (x), offset);
if (final_p)
*xp = adjust_address (y, GET_MODE (x), offset);
else
*xp = adjust_address_nv (y, GET_MODE (x), offset);
}
else
{
rtx new_rtx = simplify_subreg (GET_MODE (x), y, GET_MODE (y),
SUBREG_BYTE (x));
SUBREG_BYTE (x));
if (new_rtx != 0)
*xp = new_rtx;
else if (REG_P (y))
else if (final_p && REG_P (y))
{
/* Simplify_subreg can't handle some REG cases, but we have to. */
unsigned int regno;
@ -3076,7 +3079,7 @@ walk_alter_subreg (rtx *xp, bool *changed)
case SUBREG:
*changed = true;
return alter_subreg (xp);
return alter_subreg (xp, true);
default:
break;
@ -3682,7 +3685,7 @@ void
output_operand (rtx x, int code ATTRIBUTE_UNUSED)
{
if (x && GET_CODE (x) == SUBREG)
x = alter_subreg (&x);
x = alter_subreg (&x, true);
/* X must not be a pseudo reg. */
gcc_assert (!x || !REG_P (x) || REGNO (x) < FIRST_PSEUDO_REGISTER);


@ -3377,7 +3377,7 @@ calculate_bb_reg_pressure (void)
bitmap_iterator bi;
ira_setup_eliminable_regset ();
ira_setup_eliminable_regset (false);
curr_regs_live = BITMAP_ALLOC (&reg_obstack);
FOR_EACH_BB (bb)
{


@ -6548,7 +6548,7 @@ sched_init (void)
sched_pressure = SCHED_PRESSURE_NONE;
if (sched_pressure != SCHED_PRESSURE_NONE)
ira_setup_eliminable_regset ();
ira_setup_eliminable_regset (false);
/* Initialize SPEC_INFO. */
if (targetm.sched.set_sched_flags)


@ -2834,8 +2834,7 @@ color_pass (ira_loop_tree_node_t loop_tree_node)
exit_freq = ira_loop_edge_freq (subloop_node, regno, true);
enter_freq = ira_loop_edge_freq (subloop_node, regno, false);
ira_assert (regno < ira_reg_equiv_len);
if (ira_reg_equiv_invariant_p[regno]
|| ira_reg_equiv_const[regno] != NULL_RTX)
if (ira_equiv_no_lvalue_p (regno))
{
if (! ALLOCNO_ASSIGNED_P (subloop_allocno))
{
@ -2940,9 +2939,7 @@ move_spill_restore (void)
copies and the reload pass can spill the allocno set
by copy although the allocno will not get memory
slot. */
|| (regno < ira_reg_equiv_len
&& (ira_reg_equiv_invariant_p[regno]
|| ira_reg_equiv_const[regno] != NULL_RTX))
|| ira_equiv_no_lvalue_p (regno)
|| !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
continue;
mode = ALLOCNO_MODE (a);
@ -3366,9 +3363,7 @@ coalesce_allocnos (void)
a = ira_allocnos[j];
regno = ALLOCNO_REGNO (a);
if (! ALLOCNO_ASSIGNED_P (a) || ALLOCNO_HARD_REGNO (a) >= 0
|| (regno < ira_reg_equiv_len
&& (ira_reg_equiv_const[regno] != NULL_RTX
|| ira_reg_equiv_invariant_p[regno])))
|| ira_equiv_no_lvalue_p (regno))
continue;
for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
{
@ -3383,9 +3378,7 @@ coalesce_allocnos (void)
if ((cp->insn != NULL || cp->constraint_p)
&& ALLOCNO_ASSIGNED_P (cp->second)
&& ALLOCNO_HARD_REGNO (cp->second) < 0
&& (regno >= ira_reg_equiv_len
|| (! ira_reg_equiv_invariant_p[regno]
&& ira_reg_equiv_const[regno] == NULL_RTX)))
&& ! ira_equiv_no_lvalue_p (regno))
sorted_copies[cp_num++] = cp;
}
else if (cp->second == a)
@ -3651,9 +3644,7 @@ coalesce_spill_slots (ira_allocno_t *spilled_coalesced_allocnos, int num)
allocno = spilled_coalesced_allocnos[i];
if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno
|| bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (allocno))
|| (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len
&& (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX
|| ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)])))
|| ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno)))
continue;
for (j = 0; j < i; j++)
{
@ -3661,9 +3652,7 @@ coalesce_spill_slots (ira_allocno_t *spilled_coalesced_allocnos, int num)
n = ALLOCNO_COALESCE_DATA (a)->temp;
if (ALLOCNO_COALESCE_DATA (a)->first == a
&& ! bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (a))
&& (ALLOCNO_REGNO (a) >= ira_reg_equiv_len
|| (! ira_reg_equiv_invariant_p[ALLOCNO_REGNO (a)]
&& ira_reg_equiv_const[ALLOCNO_REGNO (a)] == NULL_RTX))
&& ! ira_equiv_no_lvalue_p (ALLOCNO_REGNO (a))
&& ! slot_coalesced_allocno_live_ranges_intersect_p (allocno, n))
break;
}
@ -3771,9 +3760,7 @@ ira_sort_regnos_for_alter_reg (int *pseudo_regnos, int n,
allocno = spilled_coalesced_allocnos[i];
if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno
|| ALLOCNO_HARD_REGNO (allocno) >= 0
|| (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len
&& (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX
|| ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)])))
|| ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno)))
continue;
if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
fprintf (ira_dump_file, " Slot %d (freq,size):", slot_num);


@ -340,6 +340,7 @@ ira_create_new_reg (rtx original_reg)
if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
fprintf (ira_dump_file, " Creating newreg=%i from oldreg=%i\n",
REGNO (new_reg), REGNO (original_reg));
ira_expand_reg_equiv ();
return new_reg;
}
@ -518,8 +519,7 @@ generate_edge_moves (edge e)
/* Remove unnecessary stores at the region exit. We should do
this for readonly memory for sure and this is guaranteed by
that we never generate moves on region borders (see
checking ira_reg_equiv_invariant_p in function
change_loop). */
checking in function change_loop). */
if (ALLOCNO_HARD_REGNO (dest_allocno) < 0
&& ALLOCNO_HARD_REGNO (src_allocno) >= 0
&& store_can_be_removed_p (src_allocno, dest_allocno))
@ -613,8 +613,7 @@ change_loop (ira_loop_tree_node_t node)
/* don't create copies because reload can spill an
allocno set by copy although the allocno will not
get memory slot. */
|| ira_reg_equiv_invariant_p[regno]
|| ira_reg_equiv_const[regno] != NULL_RTX))
|| ira_equiv_no_lvalue_p (regno)))
continue;
original_reg = allocno_emit_reg (allocno);
if (parent_allocno == NULL
@ -902,17 +901,22 @@ modify_move_list (move_t list)
static rtx
emit_move_list (move_t list, int freq)
{
int cost, regno;
rtx result, insn, set, to;
rtx to, from, dest;
int to_regno, from_regno, cost, regno;
rtx result, insn, set;
enum machine_mode mode;
enum reg_class aclass;
grow_reg_equivs ();
start_sequence ();
for (; list != NULL; list = list->next)
{
start_sequence ();
emit_move_insn (allocno_emit_reg (list->to),
allocno_emit_reg (list->from));
to = allocno_emit_reg (list->to);
to_regno = REGNO (to);
from = allocno_emit_reg (list->from);
from_regno = REGNO (from);
emit_move_insn (to, from);
list->insn = get_insns ();
end_sequence ();
for (insn = list->insn; insn != NULL_RTX; insn = NEXT_INSN (insn))
@ -928,21 +932,22 @@ emit_move_list (move_t list, int freq)
to use the equivalence. */
if ((set = single_set (insn)) != NULL_RTX)
{
to = SET_DEST (set);
if (GET_CODE (to) == SUBREG)
to = SUBREG_REG (to);
ira_assert (REG_P (to));
regno = REGNO (to);
dest = SET_DEST (set);
if (GET_CODE (dest) == SUBREG)
dest = SUBREG_REG (dest);
ira_assert (REG_P (dest));
regno = REGNO (dest);
if (regno >= ira_reg_equiv_len
|| (! ira_reg_equiv_invariant_p[regno]
&& ira_reg_equiv_const[regno] == NULL_RTX))
|| (ira_reg_equiv[regno].invariant == NULL_RTX
&& ira_reg_equiv[regno].constant == NULL_RTX))
continue; /* regno has no equivalence. */
ira_assert ((int) VEC_length (reg_equivs_t, reg_equivs)
>= ira_reg_equiv_len);
> regno);
reg_equiv_init (regno)
= gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno));
}
}
ira_update_equiv_info_by_shuffle_insn (to_regno, from_regno, list->insn);
emit_insn (list->insn);
mode = ALLOCNO_MODE (list->to);
aclass = ALLOCNO_CLASS (list->to);


@ -795,11 +795,6 @@ struct target_ira_int {
/* Map class->true if class is a pressure class, false otherwise. */
bool x_ira_reg_pressure_class_p[N_REG_CLASSES];
/* Register class subset relation: TRUE if the first class is a subset
of the second one considering only hard registers available for the
allocation. */
int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES];
/* Array of the number of hard registers of given class which are
available for allocation. The order is defined by the hard
register numbers. */
@ -852,13 +847,8 @@ struct target_ira_int {
taking all hard-registers including fixed ones into account. */
enum reg_class x_ira_reg_class_intersect[N_REG_CLASSES][N_REG_CLASSES];
/* True if the two classes (that is calculated taking only hard
registers available for allocation into account; are
intersected. */
bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES];
/* Classes with end marker LIM_REG_CLASSES which are intersected with
given class (the first index;. That includes given class itself.
given class (the first index). That includes given class itself.
This is calculated taking only hard registers available for
allocation into account. */
enum reg_class x_ira_reg_class_super_classes[N_REG_CLASSES][N_REG_CLASSES];
@ -875,7 +865,7 @@ struct target_ira_int {
/* For each reg class, table listing all the classes contained in it
(excluding the class itself. Non-allocatable registers are
excluded from the consideration;. */
excluded from the consideration). */
enum reg_class x_alloc_reg_class_subclasses[N_REG_CLASSES][N_REG_CLASSES];
/* Array whose values are hard regset of hard registers for which
@ -908,8 +898,6 @@ extern struct target_ira_int *this_target_ira_int;
(this_target_ira_int->x_ira_reg_allocno_class_p)
#define ira_reg_pressure_class_p \
(this_target_ira_int->x_ira_reg_pressure_class_p)
#define ira_class_subset_p \
(this_target_ira_int->x_ira_class_subset_p)
#define ira_non_ordered_class_hard_regs \
(this_target_ira_int->x_ira_non_ordered_class_hard_regs)
#define ira_class_hard_reg_index \
@ -928,8 +916,6 @@ extern struct target_ira_int *this_target_ira_int;
(this_target_ira_int->x_ira_uniform_class_p)
#define ira_reg_class_intersect \
(this_target_ira_int->x_ira_reg_class_intersect)
#define ira_reg_classes_intersect_p \
(this_target_ira_int->x_ira_reg_classes_intersect_p)
#define ira_reg_class_super_classes \
(this_target_ira_int->x_ira_reg_class_super_classes)
#define ira_reg_class_subunion \
@ -950,17 +936,6 @@ extern void ira_debug_disposition (void);
extern void ira_debug_allocno_classes (void);
extern void ira_init_register_move_cost (enum machine_mode);
/* The length of the two following arrays. */
extern int ira_reg_equiv_len;
/* The element value is TRUE if the corresponding regno value is
invariant. */
extern bool *ira_reg_equiv_invariant_p;
/* The element value is equiv constant of given pseudo-register or
NULL_RTX. */
extern rtx *ira_reg_equiv_const;
/* ira-build.c */
/* The current loop tree node and its regno allocno map. */
@ -1044,6 +1019,20 @@ extern void ira_emit (bool);
/* Return true if equivalence of pseudo REGNO is not a lvalue. */
static inline bool
ira_equiv_no_lvalue_p (int regno)
{
if (regno >= ira_reg_equiv_len)
return false;
return (ira_reg_equiv[regno].constant != NULL_RTX
|| ira_reg_equiv[regno].invariant != NULL_RTX
|| (ira_reg_equiv[regno].memory != NULL_RTX
&& MEM_READONLY_P (ira_reg_equiv[regno].memory)));
}
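The predicate above can be exercised standalone by reducing its rtx fields to booleans. A minimal model, assuming only what the function body shows (an equivalence forbids assignment through the pseudo when it is a constant, an invariant, or read-only memory):

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced model of ira_equiv_no_lvalue_p's decision.  */
struct equiv_model
{
  bool has_constant, has_invariant, has_memory, memory_readonly;
};

static bool
equiv_no_lvalue_p (const struct equiv_model *e)
{
  return (e->has_constant
          || e->has_invariant
          || (e->has_memory && e->memory_readonly));
}

/* Sample equivalences for illustration.  */
static const struct equiv_model k_const  = { true,  false, false, false };
static const struct equiv_model k_mem_rw = { false, false, true,  false };
static const struct equiv_model k_mem_ro = { false, false, true,  true  };
```

Note that ordinary (writable) memory equivalences do not block the pseudo from being assigned; only read-only memory does.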
/* Initialize register costs for MODE if necessary. */
static inline void
ira_init_register_move_cost_if_necessary (enum machine_mode mode)

gcc/ira.c

@ -382,6 +382,7 @@ along with GCC; see the file COPYING3. If not see
#include "function.h"
#include "ggc.h"
#include "ira-int.h"
#include "lra.h"
#include "dce.h"
#include "dbgcnt.h"
@ -1201,6 +1202,7 @@ setup_reg_class_relations (void)
{
ira_reg_classes_intersect_p[cl1][cl2] = false;
ira_reg_class_intersect[cl1][cl2] = NO_REGS;
ira_reg_class_subset[cl1][cl2] = NO_REGS;
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs);
COPY_HARD_REG_SET (temp_set2, reg_class_contents[cl2]);
@ -1248,9 +1250,8 @@ setup_reg_class_relations (void)
COPY_HARD_REG_SET (union_set, reg_class_contents[cl1]);
IOR_HARD_REG_SET (union_set, reg_class_contents[cl2]);
AND_COMPL_HARD_REG_SET (union_set, no_unit_alloc_regs);
for (i = 0; i < ira_important_classes_num; i++)
for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++)
{
cl3 = ira_important_classes[i];
COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl3]);
AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs);
if (hard_reg_set_subset_p (temp_hard_regset, intersection_set))
@ -1258,25 +1259,45 @@ setup_reg_class_relations (void)
/* CL3 allocatable hard register set is inside of
intersection of allocatable hard register sets
of CL1 and CL2. */
if (important_class_p[cl3])
{
COPY_HARD_REG_SET
(temp_set2,
reg_class_contents
[(int) ira_reg_class_intersect[cl1][cl2]]);
AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs);
if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
/* If the allocatable hard register sets are
the same, prefer GENERAL_REGS or the
smallest class for debugging
purposes. */
|| (hard_reg_set_equal_p (temp_hard_regset, temp_set2)
&& (cl3 == GENERAL_REGS
|| ((ira_reg_class_intersect[cl1][cl2]
!= GENERAL_REGS)
&& hard_reg_set_subset_p
(reg_class_contents[cl3],
reg_class_contents
[(int)
ira_reg_class_intersect[cl1][cl2]])))))
ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
}
COPY_HARD_REG_SET
(temp_set2,
reg_class_contents[(int)
ira_reg_class_intersect[cl1][cl2]]);
reg_class_contents[(int) ira_reg_class_subset[cl1][cl2]]);
AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs);
if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
/* If the allocatable hard register sets are the
same, prefer GENERAL_REGS or the smallest
class for debugging purposes. */
if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
/* Ignore unavailable hard registers and prefer
smallest class for debugging purposes. */
|| (hard_reg_set_equal_p (temp_hard_regset, temp_set2)
&& (cl3 == GENERAL_REGS
|| (ira_reg_class_intersect[cl1][cl2] != GENERAL_REGS
&& hard_reg_set_subset_p
(reg_class_contents[cl3],
reg_class_contents
[(int) ira_reg_class_intersect[cl1][cl2]])))))
ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
&& hard_reg_set_subset_p
(reg_class_contents[cl3],
reg_class_contents
[(int) ira_reg_class_subset[cl1][cl2]])))
ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3;
}
if (hard_reg_set_subset_p (temp_hard_regset, union_set))
if (important_class_p[cl3]
&& hard_reg_set_subset_p (temp_hard_regset, union_set))
{
/* CL3 allocatable hard register set is inside of
union of allocatable hard register sets of CL1
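The subset relation being computed here compares classes on their *allocatable* hard registers only, i.e. after masking out no_unit_alloc_regs. A sketch with hard register sets modeled as 64-bit masks (the real code uses HARD_REG_SET and its macros):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if class CL1 is a subset of class CL2 once unallocatable
   registers (NO_ALLOC) are removed from both, mirroring the
   ira_reg_class_subset computation above.  */
static bool
alloc_subset_p (uint64_t cl1, uint64_t cl2, uint64_t no_alloc)
{
  uint64_t a1 = cl1 & ~no_alloc;
  uint64_t a2 = cl2 & ~no_alloc;
  return (a1 & ~a2) == 0;
}
```

The last test below shows why the masking matters: a class can become a subset of another once its only extra register is unallocatable.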
@ -1632,6 +1653,7 @@ void
ira_init_once (void)
{
ira_init_costs_once ();
lra_init_once ();
}
/* Free ira_max_register_move_cost, ira_may_move_in_cost and
@ -1679,6 +1701,7 @@ ira_init (void)
clarify_prohibited_class_mode_regs ();
setup_hard_regno_aclass ();
ira_init_costs ();
lra_init ();
}
/* Function called once at the end of compiler work. */
@ -1687,6 +1710,7 @@ ira_finish_once (void)
{
ira_finish_costs_once ();
free_register_move_costs ();
lra_finish_once ();
}
@ -1823,9 +1847,11 @@ compute_regs_asm_clobbered (void)
}
/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE. */
/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE.
If the function is called from IRA (not from the insn scheduler or
RTL loop invariant motion), FROM_IRA_P is true. */
void
ira_setup_eliminable_regset (void)
ira_setup_eliminable_regset (bool from_ira_p)
{
#ifdef ELIMINABLE_REGS
int i;
@ -1835,7 +1861,7 @@ ira_setup_eliminable_regset (void)
sp for alloca. So we can't eliminate the frame pointer in that
case. At some point, we should improve this by emitting the
sp-adjusting insns for this case. */
int need_fp
frame_pointer_needed
= (! flag_omit_frame_pointer
|| (cfun->calls_alloca && EXIT_IGNORE_STACK)
/* We need the frame pointer to catch stack overflow exceptions
@ -1845,8 +1871,14 @@ ira_setup_eliminable_regset (void)
|| crtl->stack_realign_needed
|| targetm.frame_pointer_required ());
frame_pointer_needed = need_fp;
if (from_ira_p && ira_use_lra_p)
/* It can change FRAME_POINTER_NEEDED. We call it only from IRA
because it is expensive. */
lra_init_elimination ();
if (frame_pointer_needed)
df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true);
COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs);
CLEAR_HARD_REG_SET (eliminable_regset);
@ -1859,7 +1891,7 @@ ira_setup_eliminable_regset (void)
{
bool cannot_elim
= (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
|| (eliminables[i].to == STACK_POINTER_REGNUM && need_fp));
|| (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));
if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
{
@ -1878,10 +1910,10 @@ ira_setup_eliminable_regset (void)
if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM))
{
SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM);
if (need_fp)
if (frame_pointer_needed)
SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM);
}
else if (need_fp)
else if (frame_pointer_needed)
error ("%s cannot be used in asm here",
reg_names[HARD_FRAME_POINTER_REGNUM]);
else
@ -1892,10 +1924,10 @@ ira_setup_eliminable_regset (void)
if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM))
{
SET_HARD_REG_BIT (eliminable_regset, FRAME_POINTER_REGNUM);
if (need_fp)
if (frame_pointer_needed)
SET_HARD_REG_BIT (ira_no_alloc_regs, FRAME_POINTER_REGNUM);
}
else if (need_fp)
else if (frame_pointer_needed)
error ("%s cannot be used in asm here", reg_names[FRAME_POINTER_REGNUM]);
else
df_set_regs_ever_live (FRAME_POINTER_REGNUM, true);
@ -1904,66 +1936,6 @@ ira_setup_eliminable_regset (void)
/* The length of the following two arrays. */
int ira_reg_equiv_len;
/* The element value is TRUE if the corresponding regno value is
invariant. */
bool *ira_reg_equiv_invariant_p;
/* The element value is equiv constant of given pseudo-register or
NULL_RTX. */
rtx *ira_reg_equiv_const;
/* Set up the two arrays declared above. */
static void
find_reg_equiv_invariant_const (void)
{
unsigned int i;
bool invariant_p;
rtx list, insn, note, constant, x;
for (i = FIRST_PSEUDO_REGISTER; i < VEC_length (reg_equivs_t, reg_equivs); i++)
{
constant = NULL_RTX;
invariant_p = false;
for (list = reg_equiv_init (i); list != NULL_RTX; list = XEXP (list, 1))
{
insn = XEXP (list, 0);
note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
if (note == NULL_RTX)
continue;
x = XEXP (note, 0);
if (! CONSTANT_P (x)
|| ! flag_pic || LEGITIMATE_PIC_OPERAND_P (x))
{
/* It can happen that a REG_EQUIV note contains a MEM
that is not a legitimate memory operand. As later
stages of the reload assume that all addresses found
in the reg_equiv_* arrays were originally legitimate,
we ignore such REG_EQUIV notes. */
if (memory_operand (x, VOIDmode))
invariant_p = MEM_READONLY_P (x);
else if (function_invariant_p (x))
{
if (GET_CODE (x) == PLUS
|| x == frame_pointer_rtx || x == arg_pointer_rtx)
invariant_p = true;
else
constant = x;
}
}
}
ira_reg_equiv_invariant_p[i] = invariant_p;
ira_reg_equiv_const[i] = constant;
}
}
/* Vector of substitutions of register numbers,
used to map pseudo regs into hardware regs.
This is set up as a result of register allocation.
@ -1984,6 +1956,8 @@ setup_reg_renumber (void)
caller_save_needed = 0;
FOR_EACH_ALLOCNO (a, ai)
{
if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL)
continue;
/* There are no caps at this point. */
ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
if (! ALLOCNO_ASSIGNED_P (a))
@ -2015,9 +1989,7 @@ setup_reg_renumber (void)
ira_assert (!optimize || flag_caller_saves
|| (ALLOCNO_CALLS_CROSSED_NUM (a)
== ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a))
|| regno >= ira_reg_equiv_len
|| ira_reg_equiv_const[regno]
|| ira_reg_equiv_invariant_p[regno]);
|| ira_equiv_no_lvalue_p (regno));
caller_save_needed = 1;
}
}
@ -2184,6 +2156,109 @@ check_allocation (void)
}
#endif
/* Allocate REG_EQUIV_INIT. Set up it from IRA_REG_EQUIV which should
be already calculated. */
static void
setup_reg_equiv_init (void)
{
int i;
int max_regno = max_reg_num ();
for (i = 0; i < max_regno; i++)
reg_equiv_init (i) = ira_reg_equiv[i].init_insns;
}
/* Update equiv regno from movement of FROM_REGNO to TO_REGNO. INSNS
are insns which were generated for such movement. It is assumed
that FROM_REGNO and TO_REGNO always have the same value at the
point of any move containing such registers. This function is used
to update equiv info for register shuffles on the region borders
and for caller save/restore insns. */
void
ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx insns)
{
rtx insn, x, note;
if (! ira_reg_equiv[from_regno].defined_p
&& (! ira_reg_equiv[to_regno].defined_p
|| ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX
&& ! MEM_READONLY_P (x))))
return;
insn = insns;
if (NEXT_INSN (insn) != NULL_RTX)
{
if (! ira_reg_equiv[to_regno].defined_p)
{
ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX);
return;
}
ira_reg_equiv[to_regno].defined_p = false;
ira_reg_equiv[to_regno].memory
= ira_reg_equiv[to_regno].constant
= ira_reg_equiv[to_regno].invariant
= ira_reg_equiv[to_regno].init_insns = NULL_RTX;
if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
fprintf (ira_dump_file,
" Invalidating equiv info for reg %d\n", to_regno);
return;
}
/* It is possible that FROM_REGNO still has no equivalence because
in shuffles to_regno<-from_regno and from_regno<-to_regno the 2nd
insn was not processed yet. */
if (ira_reg_equiv[from_regno].defined_p)
{
ira_reg_equiv[to_regno].defined_p = true;
if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX)
{
ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX
&& ira_reg_equiv[from_regno].constant == NULL_RTX);
ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX
|| rtx_equal_p (ira_reg_equiv[to_regno].memory, x));
ira_reg_equiv[to_regno].memory = x;
if (! MEM_READONLY_P (x))
/* We don't add the insn to insn init list because memory
equivalence is just to say what memory is better to use
when the pseudo is spilled. */
return;
}
else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX)
{
ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX);
ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX
|| rtx_equal_p (ira_reg_equiv[to_regno].constant, x));
ira_reg_equiv[to_regno].constant = x;
}
else
{
x = ira_reg_equiv[from_regno].invariant;
ira_assert (x != NULL_RTX);
ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX
|| rtx_equal_p (ira_reg_equiv[to_regno].invariant, x));
ira_reg_equiv[to_regno].invariant = x;
}
if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX)
{
note = set_unique_reg_note (insn, REG_EQUIV, x);
gcc_assert (note != NULL_RTX);
if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
{
fprintf (ira_dump_file,
" Adding equiv note to insn %u for reg %d ",
INSN_UID (insn), to_regno);
print_value_slim (ira_dump_file, x, 1);
fprintf (ira_dump_file, "\n");
}
}
}
ira_reg_equiv[to_regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, insn,
ira_reg_equiv[to_regno].init_insns);
if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
fprintf (ira_dump_file,
" Adding equiv init move insn %u to reg %d\n",
INSN_UID (insn), to_regno);
}
/* Fix values of array REG_EQUIV_INIT after live range splitting done
by IRA. */
static void
@ -2221,6 +2296,7 @@ fix_reg_equiv_init (void)
prev = x;
else
{
/* Remove the wrong list element. */
if (prev == NULL_RTX)
reg_equiv_init (i) = next;
else
@ -2360,6 +2436,46 @@ mark_elimination (int from, int to)
/* The length of the following array. */
int ira_reg_equiv_len;
/* Info about equiv. info for each register. */
struct ira_reg_equiv *ira_reg_equiv;
/* Expand ira_reg_equiv if necessary. */
void
ira_expand_reg_equiv (void)
{
int old = ira_reg_equiv_len;
if (ira_reg_equiv_len > max_reg_num ())
return;
ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1;
ira_reg_equiv
= (struct ira_reg_equiv *) xrealloc (ira_reg_equiv,
ira_reg_equiv_len
* sizeof (struct ira_reg_equiv));
gcc_assert (old < ira_reg_equiv_len);
memset (ira_reg_equiv + old, 0,
sizeof (struct ira_reg_equiv) * (ira_reg_equiv_len - old));
}
static void
init_reg_equiv (void)
{
ira_reg_equiv_len = 0;
ira_reg_equiv = NULL;
ira_expand_reg_equiv ();
}
static void
finish_reg_equiv (void)
{
free (ira_reg_equiv);
}
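The growth policy in ira_expand_reg_equiv can be modeled standalone: the array is enlarged to max_regno * 3 / 2 + 1 only when max_regno has reached the current length, and the newly added tail is zero-filled so fresh pseudos start with no recorded equivalence. A sketch using plain realloc in place of xrealloc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct equiv_entry { int defined_p; };

/* Grow VEC (of current length *LEN) if MAX_REGNO no longer fits,
   zero-initializing the new tail; otherwise return it unchanged.  */
static struct equiv_entry *
expand_equiv (struct equiv_entry *vec, int *len, int max_regno)
{
  if (*len > max_regno)
    return vec;                       /* still big enough */
  int old = *len;
  *len = max_regno * 3 / 2 + 1;
  vec = (struct equiv_entry *) realloc (vec, *len * sizeof *vec);
  memset (vec + old, 0, (*len - old) * sizeof *vec);
  return vec;
}

/* Driver: grow from empty for 10 pseudos (length becomes 16), check
   the tail is zeroed, then verify a second call is a no-op.  */
static int
demo_growth (void)
{
  int len = 0;
  struct equiv_entry *vec = expand_equiv (NULL, &len, 10);
  int ok = (len == 16 && vec[15].defined_p == 0);
  struct equiv_entry *again = expand_equiv (vec, &len, 10);
  ok = ok && again == vec && len == 16;
  free (vec);
  return ok;
}
```

The 3/2 factor gives amortized O(1) growth as new pseudos are created during IRA and LRA.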
struct equivalence
{
/* Set when a REG_EQUIV note is found or created. Use to
@ -2733,7 +2849,8 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
should keep their initialization insns. */
if (reg_equiv[regno].is_arg_equivalence)
return;
reg_equiv_init (regno) = NULL_RTX;
ira_reg_equiv[regno].defined_p = false;
ira_reg_equiv[regno].init_insns = NULL_RTX;
for (; list; list = XEXP (list, 1))
{
rtx insn = XEXP (list, 0);
@ -2769,7 +2886,7 @@ static int recorded_label_ref;
value into the using insn. If it succeeds, we can eliminate the
register completely.
Initialize the REG_EQUIV_INIT array of initializing insns.
Initialize init_insns in ira_reg_equiv array.
Return non-zero if jump label rebuilding should be done. */
static int
@ -2844,14 +2961,16 @@ update_equiv_regs (void)
gcc_assert (REG_P (dest));
regno = REGNO (dest);
/* Note that we don't want to clear reg_equiv_init even if there
are multiple sets of this register. */
/* Note that we don't want to clear init_insns in
ira_reg_equiv even if there are multiple sets of this
register. */
reg_equiv[regno].is_arg_equivalence = 1;
/* Record for reload that this is an equivalencing insn. */
if (rtx_equal_p (src, XEXP (note, 0)))
reg_equiv_init (regno)
= gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno));
ira_reg_equiv[regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, insn,
ira_reg_equiv[regno].init_insns);
/* Continue normally in case this is a candidate for
replacements. */
@ -2951,8 +3070,9 @@ update_equiv_regs (void)
/* If we haven't done so, record for reload that this is an
equivalencing insn. */
if (!reg_equiv[regno].is_arg_equivalence)
reg_equiv_init (regno)
= gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno));
ira_reg_equiv[regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, insn,
ira_reg_equiv[regno].init_insns);
/* Record whether or not we created a REG_EQUIV note for a LABEL_REF.
We might end up substituting the LABEL_REF for uses of the
@ -3052,7 +3172,7 @@ update_equiv_regs (void)
{
/* This insn makes the equivalence, not the one initializing
the register. */
reg_equiv_init (regno)
ira_reg_equiv[regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
df_notes_rescan (init_insn);
}
@ -3106,9 +3226,10 @@ update_equiv_regs (void)
/* reg_equiv[REGNO].replace gets set only when
REG_N_REFS[REGNO] is 2, i.e. the register is set
once and used once. (If it were only set, but not used,
flow would have deleted the setting insns.) Hence
there can only be one insn in reg_equiv[REGNO].init_insns. */
once and used once. (If it were only set, but
not used, flow would have deleted the setting
insns.) Hence there can only be one insn in
reg_equiv[REGNO].init_insns. */
gcc_assert (reg_equiv[regno].init_insns
&& !XEXP (reg_equiv[regno].init_insns, 1));
equiv_insn = XEXP (reg_equiv[regno].init_insns, 0);
@ -3155,7 +3276,7 @@ update_equiv_regs (void)
reg_equiv[regno].init_insns
= XEXP (reg_equiv[regno].init_insns, 1);
reg_equiv_init (regno) = NULL_RTX;
ira_reg_equiv[regno].init_insns = NULL_RTX;
bitmap_set_bit (cleared_regs, regno);
}
/* Move the initialization of the register to just before
@ -3188,7 +3309,7 @@ update_equiv_regs (void)
if (insn == BB_HEAD (bb))
BB_HEAD (bb) = PREV_INSN (insn);
reg_equiv_init (regno)
ira_reg_equiv[regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX);
bitmap_set_bit (cleared_regs, regno);
}
@ -3236,6 +3357,88 @@ update_equiv_regs (void)
/* Set up fields memory, constant, and invariant from init_insns in
the structures of array ira_reg_equiv. */
static void
setup_reg_equiv (void)
{
int i;
rtx elem, insn, set, x;
for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++)
for (elem = ira_reg_equiv[i].init_insns; elem; elem = XEXP (elem, 1))
{
insn = XEXP (elem, 0);
set = single_set (insn);
/* Init insns can set up equivalence when the reg is a destination or
a source (in this case the destination is memory). */
if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
{
if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
x = XEXP (x, 0);
else if (REG_P (SET_DEST (set))
&& REGNO (SET_DEST (set)) == (unsigned int) i)
x = SET_SRC (set);
else
{
gcc_assert (REG_P (SET_SRC (set))
&& REGNO (SET_SRC (set)) == (unsigned int) i);
x = SET_DEST (set);
}
if (! function_invariant_p (x)
|| ! flag_pic
/* A function invariant is often CONSTANT_P but may
include a register. We promise to only pass
CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */
|| (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x)))
{
/* It can happen that a REG_EQUIV note contains a MEM
that is not a legitimate memory operand. As later
stages of reload assume that all addresses found in
the lra_regno_equiv_* arrays were originally
legitimate, we ignore such REG_EQUIV notes. */
if (memory_operand (x, VOIDmode))
{
ira_reg_equiv[i].defined_p = true;
ira_reg_equiv[i].memory = x;
continue;
}
else if (function_invariant_p (x))
{
enum machine_mode mode;
mode = GET_MODE (SET_DEST (set));
if (GET_CODE (x) == PLUS
|| x == frame_pointer_rtx || x == arg_pointer_rtx)
/* This is PLUS of frame pointer and a constant,
or fp, or argp. */
ira_reg_equiv[i].invariant = x;
else if (targetm.legitimate_constant_p (mode, x))
ira_reg_equiv[i].constant = x;
else
{
ira_reg_equiv[i].memory = force_const_mem (mode, x);
if (ira_reg_equiv[i].memory == NULL_RTX)
{
ira_reg_equiv[i].defined_p = false;
ira_reg_equiv[i].init_insns = NULL_RTX;
break;
}
}
ira_reg_equiv[i].defined_p = true;
continue;
}
}
}
ira_reg_equiv[i].defined_p = false;
ira_reg_equiv[i].init_insns = NULL_RTX;
break;
}
}
/* Print chain C to FILE. */
static void
print_insn_chain (FILE *file, struct insn_chain *c)
@@ -4130,6 +4333,11 @@ allocate_initial_values (void)
}
}
/* True when we use LRA instead of reload pass for the current
function. */
bool ira_use_lra_p;
/* All natural loops. */
struct loops ira_loops;
@@ -4147,6 +4355,31 @@ ira (FILE *f)
bool loops_p;
int max_regno_before_ira, ira_max_point_before_emit;
int rebuild_p;
bool saved_flag_caller_saves = flag_caller_saves;
enum ira_region saved_flag_ira_region = flag_ira_region;
ira_conflicts_p = optimize > 0;
ira_use_lra_p = targetm.lra_p ();
/* If there are too many pseudos and/or basic blocks (e.g. 10K
pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
use simplified and faster algorithms in LRA. */
lra_simple_p
= (ira_use_lra_p && max_reg_num () >= (1 << 26) / last_basic_block);
if (lra_simple_p)
{
/* It permits to skip live range splitting in LRA. */
flag_caller_saves = false;
/* It makes no sense to do regional allocation when we use
the simplified LRA. */
flag_ira_region = IRA_REGION_ONE;
ira_conflicts_p = false;
}
#ifndef IRA_NO_OBSTACK
gcc_obstack_init (&ira_obstack);
#endif
bitmap_obstack_initialize (&ira_bitmap_obstack);
if (flag_caller_saves)
init_caller_save ();
@@ -4162,7 +4395,6 @@ ira (FILE *f)
ira_dump_file = stderr;
}
- ira_conflicts_p = optimize > 0;
setup_prohibited_mode_move_regs ();
df_note_add_problem ();
@@ -4188,30 +4420,18 @@ ira (FILE *f)
if (resize_reg_info () && flag_ira_loop_pressure)
ira_set_pseudo_classes (true, ira_dump_file);
+ init_reg_equiv ();
rebuild_p = update_equiv_regs ();
+ setup_reg_equiv ();
+ setup_reg_equiv_init ();
- #ifndef IRA_NO_OBSTACK
- gcc_obstack_init (&ira_obstack);
- #endif
- bitmap_obstack_initialize (&ira_bitmap_obstack);
- if (optimize)
+ if (optimize && rebuild_p)
{
- max_regno = max_reg_num ();
- ira_reg_equiv_len = max_regno;
- ira_reg_equiv_invariant_p
- = (bool *) ira_allocate (max_regno * sizeof (bool));
- memset (ira_reg_equiv_invariant_p, 0, max_regno * sizeof (bool));
- ira_reg_equiv_const = (rtx *) ira_allocate (max_regno * sizeof (rtx));
- memset (ira_reg_equiv_const, 0, max_regno * sizeof (rtx));
- find_reg_equiv_invariant_const ();
- if (rebuild_p)
- {
- timevar_push (TV_JUMP);
- rebuild_jump_labels (get_insns ());
- if (purge_all_dead_edges ())
- delete_unreachable_blocks ();
- timevar_pop (TV_JUMP);
- }
+ timevar_push (TV_JUMP);
+ rebuild_jump_labels (get_insns ());
+ if (purge_all_dead_edges ())
+ delete_unreachable_blocks ();
+ timevar_pop (TV_JUMP);
}
allocated_reg_info_size = max_reg_num ();
@@ -4226,7 +4446,7 @@ ira (FILE *f)
find_moveable_pseudos ();
max_regno_before_ira = max_reg_num ();
- ira_setup_eliminable_regset ();
+ ira_setup_eliminable_regset (true);
ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
ira_load_cost = ira_store_cost = ira_shuffle_cost = 0;
@@ -4263,19 +4483,32 @@ ira (FILE *f)
ira_emit (loops_p);
+ max_regno = max_reg_num ();
if (ira_conflicts_p)
{
- max_regno = max_reg_num ();
if (! loops_p)
- ira_initiate_assign ();
+ {
+ if (! ira_use_lra_p)
+ ira_initiate_assign ();
+ }
else
{
expand_reg_info ();
- if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
- fprintf (ira_dump_file, "Flattening IR\n");
- ira_flattening (max_regno_before_ira, ira_max_point_before_emit);
+ if (ira_use_lra_p)
+ {
+ ira_allocno_t a;
+ ira_allocno_iterator ai;
+ FOR_EACH_ALLOCNO (a, ai)
+ ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
+ }
+ else
+ {
+ if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
+ fprintf (ira_dump_file, "Flattening IR\n");
+ ira_flattening (max_regno_before_ira, ira_max_point_before_emit);
+ }
/* New insns were generated: add notes and recalculate live
info. */
df_analyze ();
@@ -4289,9 +4522,12 @@ ira (FILE *f)
current_loops = &ira_loops;
record_loop_exits ();
- setup_allocno_assignment_flags ();
- ira_initiate_assign ();
- ira_reassign_conflict_allocnos (max_regno);
+ if (! ira_use_lra_p)
+ {
+ setup_allocno_assignment_flags ();
+ ira_initiate_assign ();
+ ira_reassign_conflict_allocnos (max_regno);
+ }
}
}
@@ -4338,6 +4574,13 @@ ira (FILE *f)
/* See comment for find_moveable_pseudos call. */
if (ira_conflicts_p)
move_unallocated_pseudos ();
+ /* Restore original values. */
+ if (lra_simple_p)
+ {
+ flag_caller_saves = saved_flag_caller_saves;
+ flag_ira_region = saved_flag_ira_region;
+ }
}
static void
@@ -4349,46 +4592,77 @@ do_reload (void)
if (flag_ira_verbose < 10)
ira_dump_file = dump_file;
- df_set_flags (DF_NO_INSN_RESCAN);
- build_insn_chain ();
timevar_push (TV_RELOAD);
+ if (ira_use_lra_p)
+ {
+ if (current_loops != NULL)
+ {
+ release_recorded_exits ();
+ flow_loops_free (&ira_loops);
+ free_dominance_info (CDI_DOMINATORS);
+ }
+ FOR_ALL_BB (bb)
+ bb->loop_father = NULL;
+ current_loops = NULL;
+ if (ira_conflicts_p)
+ ira_free (ira_spilled_reg_stack_slots);
- need_dce = reload (get_insns (), ira_conflicts_p);
+ ira_destroy ();
+ lra (ira_dump_file);
+ /* ???!!! Move it before lra () when we use ira_reg_equiv in
+ LRA. */
+ VEC_free (reg_equivs_t, gc, reg_equivs);
+ reg_equivs = NULL;
+ need_dce = false;
+ }
+ else
+ {
+ df_set_flags (DF_NO_INSN_RESCAN);
+ build_insn_chain ();
+ need_dce = reload (get_insns (), ira_conflicts_p);
+ }
timevar_pop (TV_RELOAD);
timevar_push (TV_IRA);
- if (ira_conflicts_p)
+ if (ira_conflicts_p && ! ira_use_lra_p)
{
ira_free (ira_spilled_reg_stack_slots);
ira_finish_assign ();
}
if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL
&& overall_cost_before != ira_overall_cost)
fprintf (ira_dump_file, "+++Overall after reload %d\n", ira_overall_cost);
- ira_destroy ();
flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots;
- if (current_loops != NULL)
+ if (! ira_use_lra_p)
{
- release_recorded_exits ();
- flow_loops_free (&ira_loops);
- free_dominance_info (CDI_DOMINATORS);
+ ira_destroy ();
+ if (current_loops != NULL)
+ {
+ release_recorded_exits ();
+ flow_loops_free (&ira_loops);
+ free_dominance_info (CDI_DOMINATORS);
+ }
+ FOR_ALL_BB (bb)
+ bb->loop_father = NULL;
+ current_loops = NULL;
+ regstat_free_ri ();
+ regstat_free_n_sets_and_refs ();
+ }
- FOR_ALL_BB (bb)
- bb->loop_father = NULL;
- current_loops = NULL;
- regstat_free_ri ();
- regstat_free_n_sets_and_refs ();
if (optimize)
- {
- cleanup_cfg (CLEANUP_EXPENSIVE);
+ cleanup_cfg (CLEANUP_EXPENSIVE);
- ira_free (ira_reg_equiv_invariant_p);
- ira_free (ira_reg_equiv_const);
- }
+ finish_reg_equiv ();
bitmap_obstack_release (&ira_bitmap_obstack);
#ifndef IRA_NO_OBSTACK

gcc/ira.h

@@ -20,11 +20,16 @@ You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
+ /* True when we use LRA instead of reload pass for the current
+ function. */
+ extern bool ira_use_lra_p;
/* True if we have allocno conflicts. It is false for non-optimized
mode or when the conflict table is too big. */
extern bool ira_conflicts_p;
- struct target_ira {
+ struct target_ira
+ {
/* Map: hard register number -> allocno class it belongs to. If the
corresponding class is NO_REGS, the hard register is not available
for allocation. */
@@ -79,6 +84,23 @@ struct target_ira {
class. */
int x_ira_class_hard_regs_num[N_REG_CLASSES];
+ /* Register class subset relation: TRUE if the first class is a subset
+ of the second one considering only hard registers available for the
+ allocation. */
+ int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES];
+ /* The biggest class inside the intersection of the two classes
+ (calculated taking into account only hard registers available for
+ allocation). If both classes contain no hard registers available
+ for allocation, the value is calculated taking all hard registers,
+ including fixed ones, into account. */
+ enum reg_class x_ira_reg_class_subset[N_REG_CLASSES][N_REG_CLASSES];
+ /* True if the two classes (calculated taking into account only hard
+ registers available for allocation) intersect. */
+ bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES];
/* If class CL has a single allocatable register of mode M,
index [CL][M] gives the number of that register, otherwise it is -1. */
short x_ira_class_singleton[N_REG_CLASSES][MAX_MACHINE_MODE];
@@ -121,18 +143,48 @@ extern struct target_ira *this_target_ira;
(this_target_ira->x_ira_class_hard_regs)
#define ira_class_hard_regs_num \
(this_target_ira->x_ira_class_hard_regs_num)
+ #define ira_class_subset_p \
+ (this_target_ira->x_ira_class_subset_p)
+ #define ira_reg_class_subset \
+ (this_target_ira->x_ira_reg_class_subset)
+ #define ira_reg_classes_intersect_p \
+ (this_target_ira->x_ira_reg_classes_intersect_p)
#define ira_class_singleton \
(this_target_ira->x_ira_class_singleton)
#define ira_no_alloc_regs \
(this_target_ira->x_ira_no_alloc_regs)
+ /* Major structure describing equivalence info for a pseudo. */
+ struct ira_reg_equiv
+ {
+ /* True if we can use this equivalence. */
+ bool defined_p;
+ /* True if the usage of the equivalence is profitable. */
+ bool profitable_p;
+ /* Equiv. memory, constant, invariant, and initializing insns of
+ given pseudo-register or NULL_RTX. */
+ rtx memory;
+ rtx constant;
+ rtx invariant;
+ /* Always NULL_RTX if defined_p is false. */
+ rtx init_insns;
+ };
+ /* The length of the following array. */
+ extern int ira_reg_equiv_len;
+ /* Info about equiv. info for each register. */
+ extern struct ira_reg_equiv *ira_reg_equiv;
extern void ira_init_once (void);
extern void ira_init (void);
extern void ira_finish_once (void);
- extern void ira_setup_eliminable_regset (void);
+ extern void ira_setup_eliminable_regset (bool);
extern rtx ira_eliminate_regs (rtx, enum machine_mode);
extern void ira_set_pseudo_classes (bool, FILE *);
extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *);
+ extern void ira_expand_reg_equiv (void);
+ extern void ira_update_equiv_info_by_shuffle_insn (int, int, rtx);
extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *);
extern void ira_mark_allocation_change (int);

gcc/rtlanal.c

@@ -1868,7 +1868,8 @@ true_regnum (const_rtx x)
{
if (REG_P (x))
{
- if (REGNO (x) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO (x)] >= 0)
+ if (REGNO (x) >= FIRST_PSEUDO_REGISTER
+ && (lra_in_progress || reg_renumber[REGNO (x)] >= 0))
return reg_renumber[REGNO (x)];
return REGNO (x);
}
@@ -1880,7 +1881,8 @@ true_regnum (const_rtx x)
{
struct subreg_info info;
- subreg_get_info (REGNO (SUBREG_REG (x)),
+ subreg_get_info (lra_in_progress
+ ? (unsigned) base : REGNO (SUBREG_REG (x)),
GET_MODE (SUBREG_REG (x)),
SUBREG_BYTE (x), GET_MODE (x), &info);

gcc/loop-invariant.c

@@ -1824,7 +1824,7 @@ calculate_loop_reg_pressure (void)
bitmap_initialize (&LOOP_DATA (loop)->regs_ref, &reg_obstack);
bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
}
- ira_setup_eliminable_regset ();
+ ira_setup_eliminable_regset (false);
bitmap_initialize (&curr_regs_live, &reg_obstack);
FOR_EACH_BB (bb)
{

gcc/lra-assigns.c (new file, 1398 lines; diff suppressed because it is too large)

gcc/lra-coalesce.c (new file, 351 lines)

@@ -0,0 +1,351 @@
/* Coalesce spilled pseudos.
Copyright (C) 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <vmakarov@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* This file contains a pass making some simple RTL code
transformations by coalescing pseudos to remove some move insns.
Spilling pseudos in LRA can create memory-memory moves. We should
remove potential memory-memory moves before the next constraint
pass because the constraint pass will generate additional insns for
such moves and all these insns will be hard to remove afterwards.
Here we coalesce only spilled pseudos. Coalescing non-spilled
pseudos (with different hard regs) might result in spilling
additional pseudos because of possible conflicts with other
non-spilled pseudos and, as a consequence, in more constraint
passes and even LRA infinite cycling. Moves between the same
hard registers are trivial and will be removed by subsequent
compiler passes.
We don't coalesce special reload pseudos. It complicates LRA code
a lot without visible generated code improvement.
The pseudo live-ranges are used to find conflicting pseudos during
coalescing.
The most frequently executed moves are tried for coalescing first. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tm_p.h"
#include "insn-config.h"
#include "recog.h"
#include "output.h"
#include "regs.h"
#include "hard-reg-set.h"
#include "flags.h"
#include "function.h"
#include "expr.h"
#include "basic-block.h"
#include "except.h"
#include "timevar.h"
#include "ira.h"
#include "lra-int.h"
#include "df.h"
/* Arrays whose elements represent the first and the next pseudo
(regno) in the coalesced pseudos group to which given pseudo (its
regno is the index) belongs. The next of the last pseudo in the
group refers to the first pseudo in the group, in other words the
group is represented by a cyclic list. */
static int *first_coalesced_pseudo, *next_coalesced_pseudo;
/* The function is used to sort moves according to their execution
frequencies. */
static int
move_freq_compare_func (const void *v1p, const void *v2p)
{
rtx mv1 = *(const rtx *) v1p;
rtx mv2 = *(const rtx *) v2p;
int pri1, pri2;
pri1 = BLOCK_FOR_INSN (mv1)->frequency;
pri2 = BLOCK_FOR_INSN (mv2)->frequency;
if (pri2 - pri1)
return pri2 - pri1;
/* If frequencies are equal, sort by moves, so that the results of
qsort leave nothing to chance. */
return (int) INSN_UID (mv1) - (int) INSN_UID (mv2);
}
/* Pseudos which go away after coalescing. */
static bitmap_head coalesced_pseudos_bitmap;
/* Merge two sets of coalesced pseudos given correspondingly by
pseudos REGNO1 and REGNO2 (more accurately merging REGNO2 group
into REGNO1 group). Set up COALESCED_PSEUDOS_BITMAP. */
static void
merge_pseudos (int regno1, int regno2)
{
int regno, first, first2, last, next;
first = first_coalesced_pseudo[regno1];
if ((first2 = first_coalesced_pseudo[regno2]) == first)
return;
for (last = regno2, regno = next_coalesced_pseudo[regno2];;
regno = next_coalesced_pseudo[regno])
{
first_coalesced_pseudo[regno] = first;
bitmap_set_bit (&coalesced_pseudos_bitmap, regno);
if (regno == regno2)
break;
last = regno;
}
next = next_coalesced_pseudo[first];
next_coalesced_pseudo[first] = regno2;
next_coalesced_pseudo[last] = next;
lra_reg_info[first].live_ranges
= (lra_merge_live_ranges
(lra_reg_info[first].live_ranges,
lra_copy_live_range_list (lra_reg_info[first2].live_ranges)));
if (GET_MODE_SIZE (lra_reg_info[first].biggest_mode)
< GET_MODE_SIZE (lra_reg_info[first2].biggest_mode))
lra_reg_info[first].biggest_mode = lra_reg_info[first2].biggest_mode;
}
/* Replace pseudos in *LOC with their coalescing group
representatives. */
static bool
substitute (rtx *loc)
{
int i, regno;
const char *fmt;
enum rtx_code code;
bool res;
if (*loc == NULL_RTX)
return false;
code = GET_CODE (*loc);
if (code == REG)
{
regno = REGNO (*loc);
if (regno < FIRST_PSEUDO_REGISTER
|| first_coalesced_pseudo[regno] == regno)
return false;
*loc = regno_reg_rtx[first_coalesced_pseudo[regno]];
return true;
}
res = false;
fmt = GET_RTX_FORMAT (code);
for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
{
if (fmt[i] == 'e')
{
if (substitute (&XEXP (*loc, i)))
res = true;
}
else if (fmt[i] == 'E')
{
int j;
for (j = XVECLEN (*loc, i) - 1; j >= 0; j--)
if (substitute (&XVECEXP (*loc, i, j)))
res = true;
}
}
return res;
}
/* The current iteration (1, 2, ...) of the coalescing pass. */
int lra_coalesce_iter;
/* Return true if the move involving REGNO1 and REGNO2 is a potential
memory-memory move. */
static bool
mem_move_p (int regno1, int regno2)
{
return reg_renumber[regno1] < 0 && reg_renumber[regno2] < 0;
}
/* Pseudos used instead of the coalesced pseudos. */
static bitmap_head used_pseudos_bitmap;
/* Set up USED_PSEUDOS_BITMAP, and update LR_BITMAP (a BB live info
bitmap). */
static void
update_live_info (bitmap lr_bitmap)
{
unsigned int j;
bitmap_iterator bi;
bitmap_clear (&used_pseudos_bitmap);
EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, lr_bitmap,
FIRST_PSEUDO_REGISTER, j, bi)
bitmap_set_bit (&used_pseudos_bitmap, first_coalesced_pseudo[j]);
if (! bitmap_empty_p (&used_pseudos_bitmap))
{
bitmap_and_compl_into (lr_bitmap, &coalesced_pseudos_bitmap);
bitmap_ior_into (lr_bitmap, &used_pseudos_bitmap);
}
}
/* Return true if pseudo REGNO can be potentially coalesced. Use
SPLIT_ORIGIN_BITMAP to find pseudos whose live ranges were
split. */
static bool
coalescable_pseudo_p (int regno, bitmap split_origin_bitmap)
{
lra_assert (regno >= FIRST_PSEUDO_REGISTER);
/* Don't coalesce inheritance pseudos because spilled inheritance
pseudos will be removed in subsequent 'undo inheritance'
pass. */
return (lra_reg_info[regno].restore_regno < 0
/* We undo splits for spilled pseudos whose live ranges were
split. So don't coalesce them, it is not necessary and
the undo transformations would be wrong. */
&& ! bitmap_bit_p (split_origin_bitmap, regno)
/* We don't want to coalesce regnos with equivalences, at
least without updating this info. */
&& ira_reg_equiv[regno].constant == NULL_RTX
&& ira_reg_equiv[regno].memory == NULL_RTX
&& ira_reg_equiv[regno].invariant == NULL_RTX);
}
/* The major function for aggressive pseudo coalescing of moves, done
only if both pseudos were spilled and are not special reload pseudos. */
bool
lra_coalesce (void)
{
basic_block bb;
rtx mv, set, insn, next, *sorted_moves;
int i, mv_num, sregno, dregno, restore_regno;
unsigned int regno;
int coalesced_moves;
int max_regno = max_reg_num ();
bitmap_head involved_insns_bitmap, split_origin_bitmap;
bitmap_iterator bi;
timevar_push (TV_LRA_COALESCE);
if (lra_dump_file != NULL)
fprintf (lra_dump_file,
"\n********** Pseudos coalescing #%d: **********\n\n",
++lra_coalesce_iter);
first_coalesced_pseudo = XNEWVEC (int, max_regno);
next_coalesced_pseudo = XNEWVEC (int, max_regno);
for (i = 0; i < max_regno; i++)
first_coalesced_pseudo[i] = next_coalesced_pseudo[i] = i;
sorted_moves = XNEWVEC (rtx, get_max_uid ());
mv_num = 0;
/* Collect pseudos whose live ranges were split. */
bitmap_initialize (&split_origin_bitmap, &reg_obstack);
EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, regno, bi)
if ((restore_regno = lra_reg_info[regno].restore_regno) >= 0)
bitmap_set_bit (&split_origin_bitmap, restore_regno);
/* Collect moves. */
coalesced_moves = 0;
FOR_EACH_BB (bb)
{
FOR_BB_INSNS_SAFE (bb, insn, next)
if (INSN_P (insn)
&& (set = single_set (insn)) != NULL_RTX
&& REG_P (SET_DEST (set)) && REG_P (SET_SRC (set))
&& (sregno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER
&& (dregno = REGNO (SET_DEST (set))) >= FIRST_PSEUDO_REGISTER
&& mem_move_p (sregno, dregno)
&& coalescable_pseudo_p (sregno, &split_origin_bitmap)
&& coalescable_pseudo_p (dregno, &split_origin_bitmap)
&& ! side_effects_p (set)
&& !(lra_intersected_live_ranges_p
(lra_reg_info[sregno].live_ranges,
lra_reg_info[dregno].live_ranges)))
sorted_moves[mv_num++] = insn;
}
bitmap_clear (&split_origin_bitmap);
qsort (sorted_moves, mv_num, sizeof (rtx), move_freq_compare_func);
/* Coalesced copies, most frequently executed first. */
bitmap_initialize (&coalesced_pseudos_bitmap, &reg_obstack);
bitmap_initialize (&involved_insns_bitmap, &reg_obstack);
for (i = 0; i < mv_num; i++)
{
mv = sorted_moves[i];
set = single_set (mv);
lra_assert (set != NULL && REG_P (SET_SRC (set))
&& REG_P (SET_DEST (set)));
sregno = REGNO (SET_SRC (set));
dregno = REGNO (SET_DEST (set));
if (first_coalesced_pseudo[sregno] == first_coalesced_pseudo[dregno])
{
coalesced_moves++;
if (lra_dump_file != NULL)
fprintf
(lra_dump_file, " Coalescing move %i:r%d-r%d (freq=%d)\n",
INSN_UID (mv), sregno, dregno,
BLOCK_FOR_INSN (mv)->frequency);
/* We updated involved_insns_bitmap when doing the merge. */
}
else if (!(lra_intersected_live_ranges_p
(lra_reg_info[first_coalesced_pseudo[sregno]].live_ranges,
lra_reg_info[first_coalesced_pseudo[dregno]].live_ranges)))
{
coalesced_moves++;
if (lra_dump_file != NULL)
fprintf
(lra_dump_file,
" Coalescing move %i:r%d(%d)-r%d(%d) (freq=%d)\n",
INSN_UID (mv), sregno, ORIGINAL_REGNO (SET_SRC (set)),
dregno, ORIGINAL_REGNO (SET_DEST (set)),
BLOCK_FOR_INSN (mv)->frequency);
bitmap_ior_into (&involved_insns_bitmap,
&lra_reg_info[sregno].insn_bitmap);
bitmap_ior_into (&involved_insns_bitmap,
&lra_reg_info[dregno].insn_bitmap);
merge_pseudos (sregno, dregno);
}
}
bitmap_initialize (&used_pseudos_bitmap, &reg_obstack);
FOR_EACH_BB (bb)
{
update_live_info (df_get_live_in (bb));
update_live_info (df_get_live_out (bb));
FOR_BB_INSNS_SAFE (bb, insn, next)
if (INSN_P (insn)
&& bitmap_bit_p (&involved_insns_bitmap, INSN_UID (insn)))
{
if (! substitute (&insn))
continue;
lra_update_insn_regno_info (insn);
if ((set = single_set (insn)) != NULL_RTX && set_noop_p (set))
{
/* Coalesced move. */
if (lra_dump_file != NULL)
fprintf (lra_dump_file, " Removing move %i (freq=%d)\n",
INSN_UID (insn), BLOCK_FOR_INSN (insn)->frequency);
lra_set_insn_deleted (insn);
}
}
}
bitmap_clear (&used_pseudos_bitmap);
bitmap_clear (&involved_insns_bitmap);
bitmap_clear (&coalesced_pseudos_bitmap);
if (lra_dump_file != NULL && coalesced_moves != 0)
fprintf (lra_dump_file, "Coalesced Moves = %d\n", coalesced_moves);
free (sorted_moves);
free (next_coalesced_pseudo);
free (first_coalesced_pseudo);
timevar_pop (TV_LRA_COALESCE);
return coalesced_moves != 0;
}

gcc/lra-constraints.c (new file, 5130 lines; diff suppressed because it is too large)

gcc/lra-eliminations.c (new file, 1301 lines; diff suppressed because it is too large)

gcc/lra-int.h (new file, 438 lines)

@@ -0,0 +1,438 @@
/* Local Register Allocator (LRA) intercommunication header file.
Copyright (C) 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <vmakarov@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#include "lra.h"
#include "bitmap.h"
#include "recog.h"
#include "insn-attr.h"
#include "insn-codes.h"
#ifdef ENABLE_CHECKING
#define lra_assert(c) gcc_assert (c)
#else
/* Always define and include C, so that warnings for empty body in an
if statement and unused variable do not occur. */
#define lra_assert(c) ((void)(0 && (c)))
#endif
/* The parameter used to prevent infinite reloading for an insn. Each
insn operand might require a reload and, if it is a memory, its
base and index registers might require a reload too. */
#define LRA_MAX_INSN_RELOADS (MAX_RECOG_OPERANDS * 3)
/* Return the hard register to which the given pseudo REGNO is
assigned. A negative value means that the register got memory or
we don't know the allocation yet. */
static inline int
lra_get_regno_hard_regno (int regno)
{
resize_reg_info ();
return reg_renumber[regno];
}
typedef struct lra_live_range *lra_live_range_t;
/* The structure describes program points where a given pseudo lives.
The live ranges can be used to find conflicts with other pseudos.
If the live ranges of two pseudos are intersected, the pseudos are
in conflict. */
struct lra_live_range
{
/* Pseudo regno whose live range is described by given
structure. */
int regno;
/* Program point range. */
int start, finish;
/* Next structure describing program points where the pseudo
lives. */
lra_live_range_t next;
/* Pointer to structures with the same start. */
lra_live_range_t start_next;
};
typedef struct lra_copy *lra_copy_t;
/* Copy between pseudos which affects assigning hard registers. */
struct lra_copy
{
/* True if regno1 is the destination of the copy. */
bool regno1_dest_p;
/* Execution frequency of the copy. */
int freq;
/* Pseudos connected by the copy. REGNO1 < REGNO2. */
int regno1, regno2;
/* Next copy with correspondingly REGNO1 and REGNO2. */
lra_copy_t regno1_next, regno2_next;
};
/* Common info about a register (pseudo or hard register). */
struct lra_reg
{
/* Bitmap of UIDs of insns (including debug insns) referring the
reg. */
bitmap_head insn_bitmap;
/* The following fields are defined only for pseudos. */
/* Hard registers with which the pseudo conflicts. */
HARD_REG_SET conflict_hard_regs;
/* We assign hard registers to reload pseudos which can occur in few
places. So two hard register preferences are enough for them.
The following fields define the preferred hard registers. If
there are no such hard registers the first field value is
negative. If there is only one preferred hard register, the 2nd
field is negative. */
int preferred_hard_regno1, preferred_hard_regno2;
/* Profits of using the corresponding preferred hard registers. If
both hard registers are defined, the first hard register has at
least as much profit as the second one. */
int preferred_hard_regno_profit1, preferred_hard_regno_profit2;
#ifdef STACK_REGS
/* True if the pseudo should not be assigned to a stack register. */
bool no_stack_p;
#endif
#ifdef ENABLE_CHECKING
/* True if the pseudo crosses a call. It is set up in lra-lives.c
and used to check that a pseudo crossing a call did not get a
call-used hard register. */
bool call_p;
#endif
/* Number of references and execution frequencies of the register in
*non-debug* insns. */
int nrefs, freq;
int last_reload;
/* Regno used to undo the inheritance. It can be non-zero only
between a pair of inheritance and undo-inheritance passes. */
int restore_regno;
/* Value held by the register. If two pseudos have the same value
they do not conflict. */
int val;
/* These members are set up in lra-lives.c and updated in
lra-coalesce.c. */
/* The biggest size mode in which each pseudo reg is referred in
whole function (possibly via subreg). */
enum machine_mode biggest_mode;
/* Live ranges of the pseudo. */
lra_live_range_t live_ranges;
/* This member is set up in lra-lives.c for subsequent
assignments. */
lra_copy_t copies;
};
/* References to the common info about each register. */
extern struct lra_reg *lra_reg_info;
/* Static info about each insn operand (common for all insns with the
same ICODE). Warning: if the structure definition is changed, the
initializer for debug_operand_data in lra.c should be changed
too. */
struct lra_operand_data
{
/* The machine description constraint string of the operand. */
const char *constraint;
/* It is taken only from machine description (which is different
from recog_data.operand_mode) and can be of VOIDmode. */
ENUM_BITFIELD(machine_mode) mode : 16;
/* The type of the operand (in/out/inout). */
ENUM_BITFIELD (op_type) type : 8;
/* True if the operand is accessed through STRICT_LOW_PART. */
unsigned int strict_low : 1;
/* True if the operand is an operator. */
unsigned int is_operator : 1;
/* True if there is an early clobber alternative for this operand.
This field is set up every time when corresponding
operand_alternative in lra_static_insn_data is set up. */
unsigned int early_clobber : 1;
/* True if the operand is an address. */
unsigned int is_address : 1;
};
/* Info about register occurrence in an insn. */
struct lra_insn_reg
{
/* The biggest mode through which the insn refers to the register
occurrence (remember the register can be accessed through a
subreg in the insn). */
ENUM_BITFIELD(machine_mode) biggest_mode : 16;
/* The type of the corresponding operand which is the register. */
ENUM_BITFIELD (op_type) type : 8;
/* True if the reg is accessed through a subreg and the subreg is
just a part of the register. */
unsigned int subreg_p : 1;
/* True if there is an early clobber alternative for this
operand. */
unsigned int early_clobber : 1;
/* The corresponding regno of the register. */
int regno;
/* Next reg info of the same insn. */
struct lra_insn_reg *next;
};
/* Static part (common info for insns with the same ICODE) of LRA
internal insn info. It exists in at most one exemplar for each
non-negative ICODE. The only exception: each asm insn has its
own structure. Warning: if the structure definition is changed,
the initializer for debug_insn_static_data in lra.c should be
changed too. */
struct lra_static_insn_data
{
/* Static info about each insn operand. */
struct lra_operand_data *operand;
/* Each duplication refers to the number of the corresponding
operand which is duplicated. */
int *dup_num;
/* The number of an operand marked as commutative, -1 otherwise. */
int commutative;
/* Number of operands, duplications, and alternatives of the
insn. */
char n_operands;
char n_dups;
char n_alternatives;
/* Insns in machine description (or clobbers in asm) may contain
explicit hard regs which are not operands. The following list
describes such hard registers. */
struct lra_insn_reg *hard_regs;
/* Array [n_alternatives][n_operand] of static constraint info for
given operand in given alternative. This info can be changed if
the target reg info is changed. */
struct operand_alternative *operand_alternative;
};
/* LRA internal info about an insn (LRA internal insn
representation). */
struct lra_insn_recog_data
{
/* The insn code. */
int icode;
/* The insn itself. */
rtx insn;
/* Common data for insns with the same ICODE. Asm insns (their
ICODE is negative) do not share such structures. */
struct lra_static_insn_data *insn_static_data;
/* Two arrays of size correspondingly equal to the operand and the
duplication numbers: */
rtx **operand_loc; /* The operand locations, NULL if no operands. */
rtx **dup_loc; /* The dup locations, NULL if no dups. */
/* Number of hard registers implicitly used in given call insn. The
value can be NULL or point to an array of the hard register
numbers ending with a negative value. */
int *arg_hard_regs;
#ifdef HAVE_ATTR_enabled
/* Alternative enabled for the insn. NULL for debug insns. */
bool *alternative_enabled_p;
#endif
/* The alternative should be used for the insn, -1 if invalid, or we
should try to use any alternative, or the insn is a debug
insn. */
int used_insn_alternative;
/* The following member value is always NULL for a debug insn. */
struct lra_insn_reg *regs;
};
typedef struct lra_insn_recog_data *lra_insn_recog_data_t;
/* lra.c: */
extern FILE *lra_dump_file;
extern bool lra_reg_spill_p;
extern HARD_REG_SET lra_no_alloc_regs;
extern int lra_insn_recog_data_len;
extern lra_insn_recog_data_t *lra_insn_recog_data;
extern int lra_curr_reload_num;
extern void lra_push_insn (rtx);
extern void lra_push_insn_by_uid (unsigned int);
extern void lra_push_insn_and_update_insn_regno_info (rtx);
extern rtx lra_pop_insn (void);
extern unsigned int lra_insn_stack_length (void);
extern rtx lra_create_new_reg_with_unique_value (enum machine_mode, rtx,
enum reg_class, const char *);
extern void lra_set_regno_unique_value (int);
extern void lra_invalidate_insn_data (rtx);
extern void lra_set_insn_deleted (rtx);
extern void lra_delete_dead_insn (rtx);
extern void lra_emit_add (rtx, rtx, rtx);
extern void lra_emit_move (rtx, rtx);
extern void lra_update_dups (lra_insn_recog_data_t, signed char *);
extern void lra_process_new_insns (rtx, rtx, rtx, const char *);
extern lra_insn_recog_data_t lra_set_insn_recog_data (rtx);
extern lra_insn_recog_data_t lra_update_insn_recog_data (rtx);
extern void lra_set_used_insn_alternative (rtx, int);
extern void lra_set_used_insn_alternative_by_uid (int, int);
extern void lra_invalidate_insn_regno_info (rtx);
extern void lra_update_insn_regno_info (rtx);
extern struct lra_insn_reg *lra_get_insn_regs (int);
extern void lra_free_copies (void);
extern void lra_create_copy (int, int, int);
extern lra_copy_t lra_get_copy (int);
extern bool lra_former_scratch_p (int);
extern bool lra_former_scratch_operand_p (rtx, int);
extern int lra_constraint_new_regno_start;
extern bitmap_head lra_inheritance_pseudos;
extern bitmap_head lra_split_regs;
extern bitmap_head lra_optional_reload_pseudos;
extern int lra_constraint_new_insn_uid_start;
/* lra-constraints.c: */
extern int lra_constraint_offset (int, enum machine_mode);
extern int lra_constraint_iter;
extern int lra_constraint_iter_after_spill;
extern bool lra_risky_transformations_p;
extern int lra_inheritance_iter;
extern int lra_undo_inheritance_iter;
extern bool lra_constraints (bool);
extern void lra_constraints_init (void);
extern void lra_constraints_finish (void);
extern void lra_inheritance (void);
extern bool lra_undo_inheritance (void);
/* lra-lives.c: */
extern int lra_live_max_point;
extern int *lra_point_freq;
extern int lra_hard_reg_usage[FIRST_PSEUDO_REGISTER];
extern int lra_live_range_iter;
extern void lra_create_live_ranges (bool);
extern lra_live_range_t lra_copy_live_range_list (lra_live_range_t);
extern lra_live_range_t lra_merge_live_ranges (lra_live_range_t,
lra_live_range_t);
extern bool lra_intersected_live_ranges_p (lra_live_range_t,
lra_live_range_t);
extern void lra_print_live_range_list (FILE *, lra_live_range_t);
extern void lra_debug_live_range_list (lra_live_range_t);
extern void lra_debug_pseudo_live_ranges (int);
extern void lra_debug_live_ranges (void);
extern void lra_clear_live_ranges (void);
extern void lra_live_ranges_init (void);
extern void lra_live_ranges_finish (void);
extern void lra_setup_reload_pseudo_preferenced_hard_reg (int, int, int);
/* lra-assigns.c: */
extern void lra_setup_reg_renumber (int, int, bool);
extern bool lra_assign (void);
/* lra-coalesce.c: */
extern int lra_coalesce_iter;
extern bool lra_coalesce (void);
/* lra-spills.c: */
extern bool lra_need_for_spills_p (void);
extern void lra_spill (void);
extern void lra_hard_reg_substitution (void);
/* lra-elimination.c: */
extern void lra_debug_elim_table (void);
extern int lra_get_elimination_hard_regno (int);
extern rtx lra_eliminate_regs_1 (rtx, enum machine_mode, bool, bool, bool);
extern void lra_eliminate (bool);
extern void lra_eliminate_reg_if_possible (rtx *);
/* Update insn operands which are duplications of operand NOP. The
insn is represented by its LRA internal representation ID. */
static inline void
lra_update_dup (lra_insn_recog_data_t id, int nop)
{
int i;
struct lra_static_insn_data *static_id = id->insn_static_data;
for (i = 0; i < static_id->n_dups; i++)
if (static_id->dup_num[i] == nop)
*id->dup_loc[i] = *id->operand_loc[nop];
}
/* Process operator duplications in insn with ID. We do it after the
operands processing. Generally speaking, we could probably do this
simultaneously with operands processing because a common practice
is to enumerate the operators after their operands. */
static inline void
lra_update_operator_dups (lra_insn_recog_data_t id)
{
int i;
struct lra_static_insn_data *static_id = id->insn_static_data;
for (i = 0; i < static_id->n_dups; i++)
{
int ndup = static_id->dup_num[i];
if (static_id->operand[ndup].is_operator)
*id->dup_loc[i] = *id->operand_loc[ndup];
}
}
/* Return info about INSN. Set up the info if it is not done yet. */
static inline lra_insn_recog_data_t
lra_get_insn_recog_data (rtx insn)
{
lra_insn_recog_data_t data;
unsigned int uid = INSN_UID (insn);
if (lra_insn_recog_data_len > (int) uid
&& (data = lra_insn_recog_data[uid]) != NULL)
{
/* Check that we did not change insn without updating the insn
info. */
lra_assert (data->insn == insn
&& (INSN_CODE (insn) < 0
|| data->icode == INSN_CODE (insn)));
return data;
}
return lra_set_insn_recog_data (insn);
}
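The lookup above is a lazy, UID-indexed cache: return the stored record if it exists and still matches the insn, otherwise build it on first use. The same pattern can be sketched outside of GCC; every `demo_*` name below is hypothetical and only models the shape of `lra_get_insn_recog_data`/`lra_set_insn_recog_data`:

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array indexed by uid, filled lazily on first access. */
typedef struct { int uid; int icode; } demo_data_t;

static demo_data_t **cache;
static int cache_len;

/* Stand-in for lra_set_insn_recog_data: grow the table if needed and
   build the entry for UID. */
static demo_data_t *
demo_set_data (int uid)
{
  if (cache_len <= uid)
    {
      int new_len = uid + 1;
      cache = (demo_data_t **) realloc (cache, new_len * sizeof (demo_data_t *));
      for (int i = cache_len; i < new_len; i++)
        cache[i] = NULL;
      cache_len = new_len;
    }
  demo_data_t *d = (demo_data_t *) malloc (sizeof (demo_data_t));
  d->uid = uid;
  d->icode = uid * 10;   /* pretend recognition result */
  cache[uid] = d;
  return d;
}

/* Mirror of lra_get_insn_recog_data: return cached info if present
   and consistent, otherwise set it up. */
static demo_data_t *
demo_get_data (int uid)
{
  demo_data_t *d;
  if (cache_len > uid && (d = cache[uid]) != NULL)
    {
      assert (d->uid == uid);  /* cache must still match the insn */
      return d;
    }
  return demo_set_data (uid);
}
```

The second lookup for the same uid returns the identical pointer, which is what lets LRA treat the record as the insn's canonical internal representation.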
struct target_lra_int
{
/* Map INSN_UID -> the operand alternative data (NULL if unknown).
We assume that this data is valid until register info is changed
because classes in the data can be changed. */
struct operand_alternative *x_op_alt_data[LAST_INSN_CODE];
};
extern struct target_lra_int default_target_lra_int;
#if SWITCHABLE_TARGET
extern struct target_lra_int *this_target_lra_int;
#else
#define this_target_lra_int (&default_target_lra_int)
#endif
#define op_alt_data (this_target_lra_int->x_op_alt_data)

gcc/lra-lives.c (new file, 1010 lines): diff suppressed because it is too large.

gcc/lra-spills.c (new file, 611 lines):

@@ -0,0 +1,611 @@
/* Change pseudos into memory.
Copyright (C) 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <vmakarov@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* This file contains code for a pass to change spilled pseudos into
memory.
The pass creates the necessary stack slots and assigns spilled
pseudos to the stack slots in the following way:
for all spilled pseudos P most frequently used first do
for all stack slots S do
if P doesn't conflict with pseudos assigned to S then
assign S to P and go to process the next pseudo
end
end
create new stack slot S and assign P to S
end
The actual algorithm is a bit more complicated because of different
pseudo sizes.
After that the code replaces spilled pseudos (except ones created
from scratches) by the corresponding stack slot memory in RTL.
If at least one stack slot was created, we need to run more passes
because we have new addresses which should be checked and because
the old address displacements might change and address constraints
(or insn memory constraints) might not be satisfied any more.
For some targets, the pass can spill some pseudos into hard
registers of a different class (usually into vector registers)
instead of spilling them into memory if that is possible and
profitable. Spilling a GENERAL_REGS pseudo into SSE registers on
an Intel Core i7 is an example of such an optimization, and it is
actually recommended by the Intel optimization guide.
The file also contains code for the final substitution of pseudos
by the hard regs assigned to them. */
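The slot-assignment loop sketched in the comment above can be modeled in a few lines of plain C. This is a simplified illustration, not LRA's implementation: live ranges are single intervals rather than LRA's range lists, the slot's combined range is widened to one conservative interval, and the `assign_slots`/`interval` names are invented for the sketch:

```c
#include <assert.h>

#define MAX_N 16

struct interval { int start, finish; };

static int
intervals_intersect_p (struct interval a, struct interval b)
{
  return a.start <= b.finish && b.start <= a.finish;
}

/* ranges[i] is the live range of pseudo i, already sorted most
   frequently used first.  slot_of[i] receives the slot number
   assigned to pseudo i.  Returns the number of slots created. */
static int
assign_slots (struct interval *ranges, int n, int *slot_of)
{
  struct interval slot_range[MAX_N]; /* union of members, widened */
  int slots_num = 0;
  for (int i = 0; i < n; i++)
    {
      int j;
      /* First-fit: reuse the first slot whose occupants don't
         conflict with this pseudo. */
      for (j = 0; j < slots_num; j++)
        if (! intervals_intersect_p (slot_range[j], ranges[i]))
          break;
      if (j >= slots_num)
        slot_range[slots_num++] = ranges[i];   /* new slot */
      else
        {
          /* Widen the slot's combined live range. */
          if (ranges[i].start < slot_range[j].start)
            slot_range[j].start = ranges[i].start;
          if (ranges[i].finish > slot_range[j].finish)
            slot_range[j].finish = ranges[i].finish;
        }
      slot_of[i] = j;
    }
  return slots_num;
}
```

Widening the union to a single interval may reject some sharings that LRA's per-range check would accept, but it only errs on the safe side.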
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tm_p.h"
#include "insn-config.h"
#include "recog.h"
#include "output.h"
#include "regs.h"
#include "hard-reg-set.h"
#include "flags.h"
#include "function.h"
#include "expr.h"
#include "basic-block.h"
#include "except.h"
#include "timevar.h"
#include "target.h"
#include "lra-int.h"
#include "ira.h"
#include "df.h"
/* Max regno at the start of the pass. */
static int regs_num;
/* Map spilled regno -> hard regno used instead of memory for
spilling. */
static rtx *spill_hard_reg;
/* The structure describes stack slot of a spilled pseudo. */
struct pseudo_slot
{
/* Number (0, 1, ...) of the stack slot to which given pseudo
belongs. */
int slot_num;
/* First or next slot with the same slot number. */
struct pseudo_slot *next, *first;
/* Memory representing the spilled pseudo. */
rtx mem;
};
/* The stack slots for each spilled pseudo. Indexed by regnos. */
static struct pseudo_slot *pseudo_slots;
/* The structure describes a register or a stack slot which can be
used for several spilled pseudos. */
struct slot
{
/* First pseudo with given stack slot. */
int regno;
/* Hard reg into which the slot pseudos are spilled. The value is
negative for pseudos spilled into memory. */
int hard_regno;
/* Memory representing the whole stack slot. It can be different from
the memory representing a pseudo belonging to the given stack slot
because the pseudo can be placed in a part of the corresponding
stack slot. The value is NULL for pseudos spilled into a hard reg. */
rtx mem;
/* Combined live ranges of all pseudos belonging to the given slot. It
is used to figure out whether a new spilled pseudo can use the given
stack slot. */
lra_live_range_t live_ranges;
};
/* Array containing info about the stack slots. The array element is
indexed by the stack slot number in the range [0..slots_num). */
static struct slot *slots;
/* The number of the stack slots currently existing. */
static int slots_num;
/* Set up memory of the spilled pseudo I. The function can allocate
the corresponding stack slot if it is not done yet. */
static void
assign_mem_slot (int i)
{
rtx x = NULL_RTX;
enum machine_mode mode = GET_MODE (regno_reg_rtx[i]);
unsigned int inherent_size = PSEUDO_REGNO_BYTES (i);
unsigned int inherent_align = GET_MODE_ALIGNMENT (mode);
unsigned int max_ref_width = GET_MODE_SIZE (lra_reg_info[i].biggest_mode);
unsigned int total_size = MAX (inherent_size, max_ref_width);
unsigned int min_align = max_ref_width * BITS_PER_UNIT;
int adjust = 0;
lra_assert (regno_reg_rtx[i] != NULL_RTX && REG_P (regno_reg_rtx[i])
&& lra_reg_info[i].nrefs != 0 && reg_renumber[i] < 0);
x = slots[pseudo_slots[i].slot_num].mem;
/* We can use a slot already allocated because it is guaranteed the
slot provides both enough inherent space and enough total
space. */
if (x)
;
/* Each pseudo has an inherent size which comes from its own mode,
and a total size which provides room for paradoxical subregs
which refer to the pseudo reg in wider modes. We allocate a new
slot, making sure that it has enough inherent space and total
space. */
else
{
rtx stack_slot;
/* No known place to spill from => no slot to reuse. */
x = assign_stack_local (mode, total_size,
min_align > inherent_align
|| total_size > inherent_size ? -1 : 0);
x = lra_eliminate_regs_1 (x, GET_MODE (x), false, false, true);
stack_slot = x;
/* Cancel the big-endian correction done in assign_stack_local.
Get the address of the beginning of the slot. This is so we
can do a big-endian correction unconditionally below. */
if (BYTES_BIG_ENDIAN)
{
adjust = inherent_size - total_size;
if (adjust)
stack_slot
= adjust_address_nv (x,
mode_for_size (total_size * BITS_PER_UNIT,
MODE_INT, 1),
adjust);
}
slots[pseudo_slots[i].slot_num].mem = stack_slot;
}
/* On a big endian machine, the "address" of the slot is the address
of the low part that fits its inherent mode. */
if (BYTES_BIG_ENDIAN && inherent_size < total_size)
adjust += (total_size - inherent_size);
x = adjust_address_nv (x, GET_MODE (regno_reg_rtx[i]), adjust);
/* Set all of the memory attributes as appropriate for a spill. */
set_mem_attrs_for_spill (x);
pseudo_slots[i].mem = x;
}
/* Sort pseudos according to their usage frequencies. */
static int
regno_freq_compare (const void *v1p, const void *v2p)
{
const int regno1 = *(const int *) v1p;
const int regno2 = *(const int *) v2p;
int diff;
if ((diff = lra_reg_info[regno2].freq - lra_reg_info[regno1].freq) != 0)
return diff;
return regno1 - regno2;
}
/* Redefine STACK_GROWS_DOWNWARD in terms of 0 or 1. */
#ifdef STACK_GROWS_DOWNWARD
# undef STACK_GROWS_DOWNWARD
# define STACK_GROWS_DOWNWARD 1
#else
# define STACK_GROWS_DOWNWARD 0
#endif
/* Sort pseudos according to their slots, putting the slots in the order
that they should be allocated. Slots with lower numbers have the highest
priority and should get the smallest displacement from the stack or
frame pointer (whichever is being used).
The first allocated slot is always closest to the frame pointer,
so prefer lower slot numbers when frame_pointer_needed. If the stack
and frame grow in the same direction, then the first allocated slot is
always closest to the initial stack pointer and furthest away from the
final stack pointer, so allocate higher numbers first when using the
stack pointer in that case. The reverse is true if the stack and
frame grow in opposite directions. */
static int
pseudo_reg_slot_compare (const void *v1p, const void *v2p)
{
const int regno1 = *(const int *) v1p;
const int regno2 = *(const int *) v2p;
int diff, slot_num1, slot_num2;
int total_size1, total_size2;
slot_num1 = pseudo_slots[regno1].slot_num;
slot_num2 = pseudo_slots[regno2].slot_num;
if ((diff = slot_num1 - slot_num2) != 0)
return (frame_pointer_needed
|| !FRAME_GROWS_DOWNWARD == STACK_GROWS_DOWNWARD ? diff : -diff);
total_size1 = GET_MODE_SIZE (lra_reg_info[regno1].biggest_mode);
total_size2 = GET_MODE_SIZE (lra_reg_info[regno2].biggest_mode);
if ((diff = total_size2 - total_size1) != 0)
return diff;
return regno1 - regno2;
}
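A stripped-down version of the comparator above makes the direction-dependent sign flip easy to test. Here `demo_ascending_p` stands in for the whole `frame_pointer_needed || !FRAME_GROWS_DOWNWARD == STACK_GROWS_DOWNWARD` condition, and the `demo_*` arrays replace `pseudo_slots`/`lra_reg_info`; all of these names are illustrative only:

```c
#include <assert.h>
#include <stdlib.h>

/* Per-regno slot number and biggest-mode size. */
static int demo_slot_num[] = { 0, 0, 1, 2 };
static int demo_size[]     = { 4, 8, 4, 4 };
/* Nonzero: lower slot numbers come first; zero: reversed. */
static int demo_ascending_p = 1;

static int
demo_slot_compare (const void *v1p, const void *v2p)
{
  int r1 = *(const int *) v1p, r2 = *(const int *) v2p;
  int diff = demo_slot_num[r1] - demo_slot_num[r2];
  if (diff != 0)
    return demo_ascending_p ? diff : -diff;
  /* Within a slot: bigger total size first, then regno as a
     deterministic tie break (qsort is not stable). */
  if ((diff = demo_size[r2] - demo_size[r1]) != 0)
    return diff;
  return r1 - r2;
}
```

Sorting `{3, 2, 1, 0}` with this comparator groups the two slot-0 pseudos first, largest size leading, then the slot-1 and slot-2 pseudos.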
/* Assign spill hard registers to N pseudos in PSEUDO_REGNOS which is
sorted in order of highest frequency first. Put the pseudos which
did not get a spill hard register at the beginning of array
PSEUDO_REGNOS. Return the number of such pseudos. */
static int
assign_spill_hard_regs (int *pseudo_regnos, int n)
{
int i, k, p, regno, res, spill_class_size, hard_regno, nr;
enum reg_class rclass, spill_class;
enum machine_mode mode;
lra_live_range_t r;
rtx insn, set;
basic_block bb;
HARD_REG_SET conflict_hard_regs;
bitmap_head ok_insn_bitmap;
bitmap setjump_crosses = regstat_get_setjmp_crosses ();
/* Hard registers which cannot be used for any purpose at a given
program point because they are unallocatable or already allocated
for other pseudos. */
HARD_REG_SET *reserved_hard_regs;
if (! lra_reg_spill_p)
return n;
/* Set up reserved hard regs for every program point. */
reserved_hard_regs = XNEWVEC (HARD_REG_SET, lra_live_max_point);
for (p = 0; p < lra_live_max_point; p++)
COPY_HARD_REG_SET (reserved_hard_regs[p], lra_no_alloc_regs);
for (i = FIRST_PSEUDO_REGISTER; i < regs_num; i++)
if (lra_reg_info[i].nrefs != 0
&& (hard_regno = lra_get_regno_hard_regno (i)) >= 0)
for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next)
for (p = r->start; p <= r->finish; p++)
add_to_hard_reg_set (&reserved_hard_regs[p],
lra_reg_info[i].biggest_mode, hard_regno);
bitmap_initialize (&ok_insn_bitmap, &reg_obstack);
FOR_EACH_BB (bb)
FOR_BB_INSNS (bb, insn)
if (DEBUG_INSN_P (insn)
|| ((set = single_set (insn)) != NULL_RTX
&& REG_P (SET_SRC (set)) && REG_P (SET_DEST (set))))
bitmap_set_bit (&ok_insn_bitmap, INSN_UID (insn));
for (res = i = 0; i < n; i++)
{
regno = pseudo_regnos[i];
rclass = lra_get_allocno_class (regno);
if (bitmap_bit_p (setjump_crosses, regno)
|| (spill_class
= ((enum reg_class)
targetm.spill_class ((reg_class_t) rclass,
PSEUDO_REGNO_MODE (regno)))) == NO_REGS
|| bitmap_intersect_compl_p (&lra_reg_info[regno].insn_bitmap,
&ok_insn_bitmap))
{
pseudo_regnos[res++] = regno;
continue;
}
lra_assert (spill_class != NO_REGS);
COPY_HARD_REG_SET (conflict_hard_regs,
lra_reg_info[regno].conflict_hard_regs);
for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next)
for (p = r->start; p <= r->finish; p++)
IOR_HARD_REG_SET (conflict_hard_regs, reserved_hard_regs[p]);
spill_class_size = ira_class_hard_regs_num[spill_class];
mode = lra_reg_info[regno].biggest_mode;
for (k = 0; k < spill_class_size; k++)
{
hard_regno = ira_class_hard_regs[spill_class][k];
if (! overlaps_hard_reg_set_p (conflict_hard_regs, mode, hard_regno))
break;
}
if (k >= spill_class_size)
{
/* There are no available regs -- assign memory later. */
pseudo_regnos[res++] = regno;
continue;
}
if (lra_dump_file != NULL)
fprintf (lra_dump_file, " Spill r%d into hr%d\n", regno, hard_regno);
/* Update reserved_hard_regs. */
for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next)
for (p = r->start; p <= r->finish; p++)
add_to_hard_reg_set (&reserved_hard_regs[p],
lra_reg_info[regno].biggest_mode, hard_regno);
spill_hard_reg[regno]
= gen_raw_REG (PSEUDO_REGNO_MODE (regno), hard_regno);
for (nr = 0;
nr < hard_regno_nregs[hard_regno][lra_reg_info[regno].biggest_mode];
nr++)
df_set_regs_ever_live (hard_regno + nr, true);
}
bitmap_clear (&ok_insn_bitmap);
free (reserved_hard_regs);
return res;
}
/* Add pseudo REGNO to slot SLOT_NUM. */
static void
add_pseudo_to_slot (int regno, int slot_num)
{
struct pseudo_slot *first;
if (slots[slot_num].regno < 0)
{
/* It is the first pseudo in the slot. */
slots[slot_num].regno = regno;
pseudo_slots[regno].first = &pseudo_slots[regno];
pseudo_slots[regno].next = NULL;
}
else
{
first = pseudo_slots[regno].first = &pseudo_slots[slots[slot_num].regno];
pseudo_slots[regno].next = first->next;
first->next = &pseudo_slots[regno];
}
pseudo_slots[regno].mem = NULL_RTX;
pseudo_slots[regno].slot_num = slot_num;
slots[slot_num].live_ranges
= lra_merge_live_ranges (slots[slot_num].live_ranges,
lra_copy_live_range_list
(lra_reg_info[regno].live_ranges));
}
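The first/next threading used by `add_pseudo_to_slot` is an intrusive linked list rooted at the slot's first pseudo, with new members spliced in right after the head. A self-contained sketch of the same structure (the `demo_*` names are made up; the real code keeps the head regno in `slots[].regno`):

```c
#include <assert.h>
#include <stddef.h>

#define NREGS 8

struct demo_slot_entry
{
  /* First or next entry with the same slot number. */
  struct demo_slot_entry *next, *first;
  int slot_num;
};

static struct demo_slot_entry entries[NREGS];
static int slot_head_regno[4] = { -1, -1, -1, -1 };

static void
demo_add_to_slot (int regno, int slot_num)
{
  if (slot_head_regno[slot_num] < 0)
    {
      /* First pseudo in the slot: it is its own head. */
      slot_head_regno[slot_num] = regno;
      entries[regno].first = &entries[regno];
      entries[regno].next = NULL;
    }
  else
    {
      /* Splice the new entry in right after the head. */
      struct demo_slot_entry *first
        = entries[regno].first = &entries[slot_head_regno[slot_num]];
      entries[regno].next = first->next;
      first->next = &entries[regno];
    }
  entries[regno].slot_num = slot_num;
}

/* Count the members of SLOT_NUM by walking the intrusive list. */
static int
demo_slot_size (int slot_num)
{
  int n = 0;
  if (slot_head_regno[slot_num] < 0)
    return 0;
  for (struct demo_slot_entry *p = &entries[slot_head_regno[slot_num]];
       p != NULL; p = p->next)
    n++;
  return n;
}
```

Because the list is intrusive (the links live inside the per-regno entries), no allocation is needed when a pseudo joins a slot.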
/* Assign stack slot numbers to pseudos in array PSEUDO_REGNOS of
length N. Sort pseudos in PSEUDO_REGNOS for the subsequent
assignment of memory stack slots. */
static void
assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, int n)
{
int i, j, regno;
slots_num = 0;
/* Assign stack slot numbers to spilled pseudos, use smaller numbers
for most frequently used pseudos. */
for (i = 0; i < n; i++)
{
regno = pseudo_regnos[i];
if (! flag_ira_share_spill_slots)
j = slots_num;
else
{
for (j = 0; j < slots_num; j++)
if (slots[j].hard_regno < 0
&& ! (lra_intersected_live_ranges_p
(slots[j].live_ranges,
lra_reg_info[regno].live_ranges)))
break;
}
if (j >= slots_num)
{
/* New slot. */
slots[j].live_ranges = NULL;
slots[j].regno = slots[j].hard_regno = -1;
slots[j].mem = NULL_RTX;
slots_num++;
}
add_pseudo_to_slot (regno, j);
}
/* Sort regnos according to their slot numbers. */
qsort (pseudo_regnos, n, sizeof (int), pseudo_reg_slot_compare);
}
/* Recursively process LOC in INSN and change spilled pseudos to the
corresponding memory or spilled hard reg. Ignore spilled pseudos
created from scratches. */
static void
remove_pseudos (rtx *loc, rtx insn)
{
int i;
rtx hard_reg;
const char *fmt;
enum rtx_code code;
if (*loc == NULL_RTX)
return;
code = GET_CODE (*loc);
if (code == REG && (i = REGNO (*loc)) >= FIRST_PSEUDO_REGISTER
&& lra_get_regno_hard_regno (i) < 0
/* We do not want to assign memory for former scratches because
it might result in an address reload for some targets. In
any case we transform such pseudos, which did not get hard
registers, back into scratches. */
&& ! lra_former_scratch_p (i))
{
hard_reg = spill_hard_reg[i];
*loc = copy_rtx (hard_reg != NULL_RTX ? hard_reg : pseudo_slots[i].mem);
return;
}
fmt = GET_RTX_FORMAT (code);
for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
{
if (fmt[i] == 'e')
remove_pseudos (&XEXP (*loc, i), insn);
else if (fmt[i] == 'E')
{
int j;
for (j = XVECLEN (*loc, i) - 1; j >= 0; j--)
remove_pseudos (&XVECEXP (*loc, i, j), insn);
}
}
}
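`remove_pseudos` walks the insn pattern generically, driven by each code's format string (`'e'` marks a subexpression slot, as in `GET_RTX_FORMAT`), and rewrites through a pointer to the containing location. A toy analogue with a three-code expression language; unlike the real function, which installs a copy of the slot memory or spill hard reg, this sketch just flips the leaf's code in place:

```c
#include <assert.h>
#include <stddef.h>

enum demo_code { DEMO_PSEUDO, DEMO_MEM, DEMO_PLUS };

struct demo_rtx
{
  enum demo_code code;
  struct demo_rtx *op[2];
};

/* Format per code: PLUS has two 'e' slots, leaves have none. */
static const char *
demo_format (enum demo_code code)
{
  return code == DEMO_PLUS ? "ee" : "";
}

/* Recursively replace every PSEUDO reachable from *LOC with MEM,
   rewriting through the pointer to the containing slot. */
static void
demo_replace_pseudos (struct demo_rtx **loc)
{
  if (*loc == NULL)
    return;
  if ((*loc)->code == DEMO_PSEUDO)
    {
      (*loc)->code = DEMO_MEM;   /* "spill" the pseudo */
      return;
    }
  const char *fmt = demo_format ((*loc)->code);
  for (int i = 0; fmt[i] != '\0'; i++)
    if (fmt[i] == 'e')
      demo_replace_pseudos (&(*loc)->op[i]);
}
```

Passing `rtx *` rather than `rtx` is what lets the walker substitute a node without knowing which field of the parent holds it, exactly the trick the real `remove_pseudos` relies on.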
/* Convert spilled pseudos into their stack slots or spill hard regs,
and put the insns to process on the constraint stack (that is, all
insns in which pseudos were changed to memory or spill hard regs). */
static void
spill_pseudos (void)
{
basic_block bb;
rtx insn;
int i;
bitmap_head spilled_pseudos, changed_insns;
bitmap_initialize (&spilled_pseudos, &reg_obstack);
bitmap_initialize (&changed_insns, &reg_obstack);
for (i = FIRST_PSEUDO_REGISTER; i < regs_num; i++)
{
if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0
&& ! lra_former_scratch_p (i))
{
bitmap_set_bit (&spilled_pseudos, i);
bitmap_ior_into (&changed_insns, &lra_reg_info[i].insn_bitmap);
}
}
FOR_EACH_BB (bb)
{
FOR_BB_INSNS (bb, insn)
if (bitmap_bit_p (&changed_insns, INSN_UID (insn)))
{
remove_pseudos (&PATTERN (insn), insn);
if (CALL_P (insn))
remove_pseudos (&CALL_INSN_FUNCTION_USAGE (insn), insn);
if (lra_dump_file != NULL)
fprintf (lra_dump_file,
"Changing spilled pseudos to memory in insn #%u\n",
INSN_UID (insn));
lra_push_insn (insn);
if (lra_reg_spill_p || targetm.different_addr_displacement_p ())
lra_set_used_insn_alternative (insn, -1);
}
else if (CALL_P (insn))
/* The presence of a pseudo in CALL_INSN_FUNCTION_USAGE does
not affect the value of the insn_bitmap of the corresponding
lra_reg_info. That is because we don't need to reload
pseudos in CALL_INSN_FUNCTION_USAGEs. So if we process
only the insns in the insn_bitmap of a given pseudo here,
we can miss the pseudo in some
CALL_INSN_FUNCTION_USAGEs. */
remove_pseudos (&CALL_INSN_FUNCTION_USAGE (insn), insn);
bitmap_and_compl_into (df_get_live_in (bb), &spilled_pseudos);
bitmap_and_compl_into (df_get_live_out (bb), &spilled_pseudos);
}
bitmap_clear (&spilled_pseudos);
bitmap_clear (&changed_insns);
}
/* Return true if we need to change some pseudos into memory. */
bool
lra_need_for_spills_p (void)
{
int i, max_regno = max_reg_num ();
for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0
&& ! lra_former_scratch_p (i))
return true;
return false;
}
/* Change spilled pseudos into memory or spill hard regs. Put changed
insns on the constraint stack (these insns will be considered on
the next constraint pass). The changed insns are all insns in
which pseudos were changed. */
void
lra_spill (void)
{
int i, n, curr_regno;
int *pseudo_regnos;
regs_num = max_reg_num ();
spill_hard_reg = XNEWVEC (rtx, regs_num);
pseudo_regnos = XNEWVEC (int, regs_num);
for (n = 0, i = FIRST_PSEUDO_REGISTER; i < regs_num; i++)
if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0
/* We do not want to assign memory for former scratches. */
&& ! lra_former_scratch_p (i))
{
spill_hard_reg[i] = NULL_RTX;
pseudo_regnos[n++] = i;
}
lra_assert (n > 0);
pseudo_slots = XNEWVEC (struct pseudo_slot, regs_num);
slots = XNEWVEC (struct slot, regs_num);
/* Sort regnos according to their usage frequencies. */
qsort (pseudo_regnos, n, sizeof (int), regno_freq_compare);
n = assign_spill_hard_regs (pseudo_regnos, n);
assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
for (i = 0; i < n; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
assign_mem_slot (pseudo_regnos[i]);
if (lra_dump_file != NULL)
{
for (i = 0; i < slots_num; i++)
{
fprintf (lra_dump_file, " Slot %d regnos (width = %d):", i,
GET_MODE_SIZE (GET_MODE (slots[i].mem)));
for (curr_regno = slots[i].regno;;
curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
{
fprintf (lra_dump_file, " %d", curr_regno);
if (pseudo_slots[curr_regno].next == NULL)
break;
}
fprintf (lra_dump_file, "\n");
}
}
spill_pseudos ();
free (slots);
free (pseudo_slots);
free (pseudo_regnos);
}
/* Final substitution of pseudos that got hard registers by the
corresponding hard registers. */
void
lra_hard_reg_substitution (void)
{
int i, hard_regno;
basic_block bb;
rtx insn;
int max_regno = max_reg_num ();
for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
if (lra_reg_info[i].nrefs != 0
&& (hard_regno = lra_get_regno_hard_regno (i)) >= 0)
SET_REGNO (regno_reg_rtx[i], hard_regno);
FOR_EACH_BB (bb)
FOR_BB_INSNS (bb, insn)
if (INSN_P (insn))
{
lra_insn_recog_data_t id;
bool insn_change_p = false;
id = lra_get_insn_recog_data (insn);
for (i = id->insn_static_data->n_operands - 1; i >= 0; i--)
{
rtx op = *id->operand_loc[i];
if (GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op)))
{
lra_assert (REGNO (SUBREG_REG (op)) < FIRST_PSEUDO_REGISTER);
alter_subreg (id->operand_loc[i], ! DEBUG_INSN_P (insn));
lra_update_dup (id, i);
insn_change_p = true;
}
}
if (insn_change_p)
lra_update_operator_dups (id);
}
}

gcc/lra.c (new file, 2398 lines): diff suppressed because it is too large.

gcc/lra.h (new file, 42 lines):

@@ -0,0 +1,42 @@
/* Communication between the Local Register Allocator (LRA) and
the rest of the compiler.
Copyright (C) 2010, 2011, 2012
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <vmakarov@redhat.com>.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
extern bool lra_simple_p;
/* Return the allocno reg class of REGNO. If it is a reload pseudo,
the pseudo should finally get a hard register of the allocno
class. */
static inline enum reg_class
lra_get_allocno_class (int regno)
{
resize_reg_info ();
return reg_allocno_class (regno);
}
extern rtx lra_create_new_reg (enum machine_mode, rtx, enum reg_class,
const char *);
extern void lra_init_elimination (void);
extern rtx lra_eliminate_regs (rtx, enum machine_mode, rtx);
extern void lra (FILE *);
extern void lra_init_once (void);
extern void lra_init (void);
extern void lra_finish_once (void);


@@ -76,7 +76,7 @@ extern rtx final_scan_insn (rtx, FILE *, int, int, int *);
/* Replace a SUBREG with a REG or a MEM, based on the thing it is a
subreg of. */
extern rtx alter_subreg (rtx *);
extern rtx alter_subreg (rtx *, bool);
/* Print an operand using machine-dependent assembler syntax. */
extern void output_operand (rtx, int);


@@ -993,6 +993,12 @@ general_operand (rtx op, enum machine_mode mode)
/* FLOAT_MODE subregs can't be paradoxical. Combine will occasionally
create such rtl, and we must reject it. */
if (SCALAR_FLOAT_MODE_P (GET_MODE (op))
/* LRA can use subreg to store a floating point value in an
integer mode. Although the floating point and the
integer modes need the same number of hard registers, the
size of the floating point mode can be less than that of
the integer mode. */
&& ! lra_in_progress
&& GET_MODE_SIZE (GET_MODE (op)) > GET_MODE_SIZE (GET_MODE (sub)))
return 0;
@@ -1068,6 +1074,12 @@ register_operand (rtx op, enum machine_mode mode)
/* FLOAT_MODE subregs can't be paradoxical. Combine will occasionally
create such rtl, and we must reject it. */
if (SCALAR_FLOAT_MODE_P (GET_MODE (op))
/* LRA can use subreg to store a floating point value in an
integer mode. Although the floating point and the
integer modes need the same number of hard registers, the
size of the floating point mode can be less than that of
the integer mode. */
&& ! lra_in_progress
&& GET_MODE_SIZE (GET_MODE (op)) > GET_MODE_SIZE (GET_MODE (sub)))
return 0;
@@ -1099,7 +1111,7 @@ scratch_operand (rtx op, enum machine_mode mode)
return (GET_CODE (op) == SCRATCH
|| (REG_P (op)
&& REGNO (op) < FIRST_PSEUDO_REGISTER));
&& (lra_in_progress || REGNO (op) < FIRST_PSEUDO_REGISTER)));
}
/* Return 1 if OP is a valid immediate operand for mode MODE.


@@ -2372,6 +2372,9 @@ extern int epilogue_completed;
extern int reload_in_progress;
/* Set to 1 while in lra. */
extern int lra_in_progress;
/* This macro indicates whether you may create a new
pseudo-register. */
@@ -2490,7 +2493,12 @@ extern rtx make_compound_operation (rtx, enum rtx_code);
extern void delete_dead_jumptables (void);
/* In sched-vis.c. */
extern void dump_insn_slim (FILE *, const_rtx x);
extern void debug_bb_n_slim (int);
extern void debug_bb_slim (struct basic_block_def *);
extern void print_value_slim (FILE *, const_rtx, int);
extern void debug_rtl_slim (FILE *, const_rtx, const_rtx, int, int);
extern void dump_insn_slim (FILE *f, const_rtx x);
extern void debug_insn_slim (const_rtx x);
/* In sched-rgn.c. */
extern void schedule_insns (void);


@@ -3481,7 +3481,9 @@ simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode,
/* Give the backend a chance to disallow the mode change. */
if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT
&& GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT
&& REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode))
&& REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)
/* We can use mode change in LRA for some transformations. */
&& ! lra_in_progress)
return -1;
#endif
@@ -3491,10 +3493,16 @@ simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode,
return -1;
if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
&& xregno == ARG_POINTER_REGNUM)
/* We should convert the arg register in LRA after the
elimination if it is possible. */
&& xregno == ARG_POINTER_REGNUM
&& ! lra_in_progress)
return -1;
if (xregno == STACK_POINTER_REGNUM)
if (xregno == STACK_POINTER_REGNUM
/* We should convert the hard stack pointer register in LRA
if it is possible. */
&& ! lra_in_progress)
return -1;
/* Try to get the register offset. */


@@ -546,6 +546,21 @@ print_value (char *buf, const_rtx x, int verbose)
}
} /* print_value */
/* Print X, an RTL value node, to file F in slim format. Include
additional information if VERBOSE is nonzero.
Value nodes are constants, registers, labels, symbols and
memory. */
void
print_value_slim (FILE *f, const_rtx x, int verbose)
{
char buf[BUF_LEN];
print_value (buf, x, verbose);
fprintf (f, "%s", buf);
}
/* The next step in insn detail, its pattern recognition. */
void


@@ -767,7 +767,7 @@ sdbout_symbol (tree decl, int local)
if (REGNO (value) >= FIRST_PSEUDO_REGISTER)
return;
}
regno = REGNO (alter_subreg (&value));
regno = REGNO (alter_subreg (&value, true));
SET_DECL_RTL (decl, value);
}
/* Don't output anything if an auto variable


@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. If not see
#include "libfuncs.h"
#include "cfgloop.h"
#include "ira-int.h"
#include "lra-int.h"
#include "builtins.h"
#include "gcse.h"
#include "bb-reorder.h"
@@ -55,6 +56,7 @@ struct target_globals default_target_globals = {
&default_target_cfgloop,
&default_target_ira,
&default_target_ira_int,
&default_target_lra_int,
&default_target_builtins,
&default_target_gcse,
&default_target_bb_reorder,


@@ -32,6 +32,7 @@ extern struct target_libfuncs *this_target_libfuncs;
extern struct target_cfgloop *this_target_cfgloop;
extern struct target_ira *this_target_ira;
extern struct target_ira_int *this_target_ira_int;
extern struct target_lra_int *this_target_lra_int;
extern struct target_builtins *this_target_builtins;
extern struct target_gcse *this_target_gcse;
extern struct target_bb_reorder *this_target_bb_reorder;
@@ -49,6 +50,7 @@ struct GTY(()) target_globals {
struct target_cfgloop *GTY((skip)) cfgloop;
struct target_ira *GTY((skip)) ira;
struct target_ira_int *GTY((skip)) ira_int;
struct target_lra_int *GTY((skip)) lra_int;
struct target_builtins *GTY((skip)) builtins;
struct target_gcse *GTY((skip)) gcse;
struct target_bb_reorder *GTY((skip)) bb_reorder;
@@ -73,6 +75,7 @@ restore_target_globals (struct target_globals *g)
this_target_cfgloop = g->cfgloop;
this_target_ira = g->ira;
this_target_ira_int = g->ira_int;
this_target_lra_int = g->lra_int;
this_target_builtins = g->builtins;
this_target_gcse = g->gcse;
this_target_bb_reorder = g->bb_reorder;


@@ -2352,6 +2352,55 @@ DEFHOOK
 tree, (tree type, tree expr),
 hook_tree_tree_tree_null)
 
+/* Return true if we use LRA instead of reload.  */
+DEFHOOK
+(lra_p,
+ "A target hook which returns true if we use LRA instead of the reload pass.\
+It means that LRA was ported to the target.\
+\
+The default version of this target hook always returns false.",
+ bool, (void),
+ default_lra_p)
+/* Return register priority of given hard regno for the current target.  */
+DEFHOOK
+(register_priority,
+ "A target hook which returns the register priority number to which the\
+register @var{hard_regno} belongs.  The greater the number, the\
+more preferable the hard register usage (when all other conditions are\
+the same).  This hook can be used to prefer some hard registers over\
+others in LRA.  For example, some x86-64 register usage needs an\
+additional prefix which makes instructions longer.  The hook can\
+return a lower priority number for such registers, making them less\
+favorable and as a result making the generated code smaller.\
+\
+The default version of this target hook always returns zero.",
+ int, (int),
+ default_register_priority)
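The priority idea can be illustrated with a self-contained sketch (not GCC code; `needs_rex_prefix`, `demo_register_priority`, and the register numbering are hypothetical stand-ins for an x86-64-like target):

```c
#include <stdbool.h>

/* Hypothetical x86-64-like numbering: hard registers 8..15 need a REX
   prefix, which makes each instruction using them one byte longer.  */
static bool
needs_rex_prefix (int hard_regno)
{
  return hard_regno >= 8 && hard_regno <= 15;
}

/* Sketch of a register_priority hook: prefix-free registers get the
   higher priority, so an allocator preferring higher numbers picks
   them when all other costs are equal, keeping the code smaller.  */
static int
demo_register_priority (int hard_regno)
{
  return needs_rex_prefix (hard_regno) ? 0 : 1;
}
```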
+/* Return true if maximal address displacement can be different.  */
+DEFHOOK
+(different_addr_displacement_p,
+ "A target hook which returns true if an address with the same structure\
+can have a different maximal legitimate displacement.  For example, the\
+displacement can depend on the memory mode or on operand combinations in\
+the insn.\
+\
+The default version of this target hook always returns false.",
+ bool, (void),
+ default_different_addr_displacement_p)
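A standalone sketch of the situation this hook describes (the ARM-like displacement limits and all `demo_` names are assumptions for illustration, not GCC code):

```c
#include <stdbool.h>

/* Hypothetical ARM-like limits: word loads accept a 12-bit unsigned
   displacement, halfword loads only an 8-bit one.  */
enum demo_mode { DEMO_HImode, DEMO_SImode };

static int
demo_max_displacement (enum demo_mode mode)
{
  return mode == DEMO_SImode ? 4095 : 255;
}

/* Sketch of a different_addr_displacement_p hook: the maximal
   legitimate displacement depends on the memory mode, so such a
   target would return true.  */
static bool
demo_different_addr_displacement_p (void)
{
  return demo_max_displacement (DEMO_SImode)
	 != demo_max_displacement (DEMO_HImode);
}
```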
+/* Determine class for spilling pseudos of given mode into registers
+   instead of memory.  */
+DEFHOOK
+(spill_class,
+ "This hook defines a class of registers which could be used for spilling\
+pseudos of the given mode and class, or @code{NO_REGS} if only memory\
+should be used.  Not defining this hook is equivalent to returning\
+@code{NO_REGS} for all inputs.",
+ reg_class_t, (reg_class_t, enum machine_mode),
+ NULL)
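A minimal sketch of what a spill_class hook decides, using stand-in enums rather than GCC's real `reg_class_t` and machine modes (the classes, modes, and `demo_spill_class` are all hypothetical):

```c
/* Stand-ins for a target's register classes and machine modes.  */
enum demo_reg_class { NO_REGS, GENERAL_REGS, VECTOR_REGS };
enum demo_mode { DEMO_SImode, DEMO_TImode };

/* Sketch of a spill_class hook: for wide-mode general-register
   pseudos, offer VECTOR_REGS as a spill destination so LRA can spill
   into registers instead of memory; everything else still spills to
   memory, signalled by NO_REGS.  */
static enum demo_reg_class
demo_spill_class (enum demo_reg_class rclass, enum demo_mode mode)
{
  if (rclass == GENERAL_REGS && mode == DEMO_TImode)
    return VECTOR_REGS;
  return NO_REGS;
}
```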
 /* True if a structure, union or array with MODE containing FIELD should
    be accessed using BLKmode.  */
 DEFHOOK


@@ -840,6 +840,24 @@ default_branch_target_register_class (void)
   return NO_REGS;
 }
 
+bool
+default_lra_p (void)
+{
+  return false;
+}
+
+int
+default_register_priority (int hard_regno ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+bool
+default_different_addr_displacement_p (void)
+{
+  return false;
+}
+
 reg_class_t
 default_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x ATTRIBUTE_UNUSED,
 			  reg_class_t reload_class_i ATTRIBUTE_UNUSED,


@@ -132,6 +132,9 @@ extern rtx default_static_chain (const_tree, bool);
 extern void default_trampoline_init (rtx, tree, rtx);
 extern int default_return_pops_args (tree, tree, int);
 extern reg_class_t default_branch_target_register_class (void);
+extern bool default_lra_p (void);
+extern int default_register_priority (int);
+extern bool default_different_addr_displacement_p (void);
 extern reg_class_t default_secondary_reload (bool, rtx, reg_class_t,
 					     enum machine_mode,
 					     secondary_reload_info *);


@@ -223,10 +223,16 @@ DEFTIMEVAR (TV_REGMOVE               , "regmove")
 DEFTIMEVAR (TV_MODE_SWITCH           , "mode switching")
 DEFTIMEVAR (TV_SMS                   , "sms modulo scheduling")
 DEFTIMEVAR (TV_SCHED                 , "scheduling")
-DEFTIMEVAR (TV_IRA                   , "integrated RA")
-DEFTIMEVAR (TV_RELOAD                , "reload")
+DEFTIMEVAR (TV_IRA                   , "integrated RA")
+DEFTIMEVAR (TV_LRA                   , "LRA non-specific")
+DEFTIMEVAR (TV_LRA_ELIMINATE         , "LRA virtuals elimination")
+DEFTIMEVAR (TV_LRA_INHERITANCE       , "LRA reload inheritance")
+DEFTIMEVAR (TV_LRA_CREATE_LIVE_RANGES, "LRA create live ranges")
+DEFTIMEVAR (TV_LRA_ASSIGN            , "LRA hard reg assignment")
+DEFTIMEVAR (TV_LRA_COALESCE          , "LRA coalesce pseudo regs")
+DEFTIMEVAR (TV_RELOAD                , "reload")
 DEFTIMEVAR (TV_RELOAD_CSE_REGS       , "reload CSE regs")
 DEFTIMEVAR (TV_GCSE_AFTER_RELOAD     , "load CSE after reload")
 DEFTIMEVAR (TV_REE                   , "ree")
 DEFTIMEVAR (TV_THREAD_PROLOGUE_AND_EPILOGUE, "thread pro- & epilogue")
 DEFTIMEVAR (TV_IFCVT2                , "if-conversion 2")