i386.c (ix86_expand_prologue): Optimize stack checking for leaf functions without dynamic stack allocation.

* config/i386/i386.c (ix86_expand_prologue): Optimize stack checking for
	leaf functions without dynamic stack allocation.
	* config/ia64/ia64.c (ia64_emit_probe_stack_range): Adjust.
	(ia64_expand_prologue): Likewise.
	* config/mips/mips.c (mips_expand_prologue): Likewise.
	* config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
	* config/sparc/sparc.c (sparc_expand_prologue): Likewise.
	(sparc_flat_expand_prologue): Likewise.

From-SVN: r204450
Eric Botcazou 2013-11-06 10:55:13 +00:00
parent f054ff5b7c
commit 0dca9cd86c
6 changed files with 165 additions and 89 deletions
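
The rule this commit applies in the MIPS, rs6000 and SPARC prologues can be sketched as a small standalone C function. This is only an illustration under stated assumptions: `plan_probe` and `struct probe_plan` are hypothetical names that do not exist in GCC, and the values given to `PROBE_INTERVAL` and `STACK_CHECK_PROTECT` are placeholders, since the real macros are target-dependent.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for illustration; the real GCC macros are
   target-dependent.  */
#define PROBE_INTERVAL      4096L
#define STACK_CHECK_PROTECT 12288L

/* Hypothetical result type: the range the prologue would probe, if any.  */
struct probe_plan
{
  bool emit;   /* true if a probe sequence is emitted */
  long first;  /* start of the range, as an offset from the stack pointer */
  long size;   /* length of the probed range */
};

/* Mirror the decision added to mips_expand_prologue, rs6000_emit_prologue
   and the two SPARC prologues.  SIZE is the static frame size.  */
static struct probe_plan
plan_probe (bool is_leaf, bool calls_alloca, long size)
{
  struct probe_plan plan = { false, 0, 0 };

  assert (size >= 0);

  if (is_leaf && !calls_alloca)
    {
      /* Leaf function without dynamic allocation: probe only when the
         frame exceeds both thresholds, and shorten the range by
         STACK_CHECK_PROTECT.  */
      if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
        {
          plan.emit = true;
          plan.first = STACK_CHECK_PROTECT;
          plan.size = size - STACK_CHECK_PROTECT;
        }
    }
  else if (size > 0)
    {
      /* Every other function with a non-empty frame keeps the full range,
         as before the commit.  */
      plan.emit = true;
      plan.first = STACK_CHECK_PROTECT;
      plan.size = size;
    }

  return plan;
}
```

With these placeholder values, `plan_probe (true, false, 2048)` emits no probe at all, while the same frame in a non-leaf function is still probed over its full 2048 bytes.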

ChangeLog

@@ -1,3 +1,14 @@
+ 2013-11-06 Eric Botcazou <ebotcazou@adacore.com>
+ * config/i386/i386.c (ix86_expand_prologue): Optimize stack checking for
+ leaf functions without dynamic stack allocation.
+ * config/ia64/ia64.c (ia64_emit_probe_stack_range): Adjust.
+ (ia64_expand_prologue): Likewise.
+ * config/mips/mips.c (mips_expand_prologue): Likewise.
+ * config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
+ * config/sparc/sparc.c (sparc_expand_prologue): Likewise.
+ (sparc_flat_expand_prologue): Likewise.
  2013-11-06 James Greenhalgh <james.greenhalgh@arm.com>
  * config/aarch64/arm_neon.h
@@ -28,12 +39,12 @@
  2013-11-06 Christian Bruel <christian.bruel@st.com>
- * gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr, sh_expand_cmpstr):
+ * config/sh/sh-mem.cc (sh_expand_cmpnstr, sh_expand_cmpstr):
  Factorize probabilities, Use adjust_address instead of
  adjust_automodify_address when possible. Enable for optimize.
  (sh_expand_strlen): New function.
- * gcc/config/sh/sh-protos.h (sh_expand_strlen): Declare.
- * gcc/config/sh/sh.md (strlensi): New pattern.
+ * config/sh/sh-protos.h (sh_expand_strlen): Declare.
+ * config/sh/sh.md (strlensi): New pattern.
  (UNSPEC_BUILTIN_STRLEN): Define.
  2013-11-06 Jakub Jelinek <jakub@redhat.com>
@@ -305,19 +316,19 @@
  2013-11-04 Wei Mi <wmi@google.com>
- * gcc/config/i386/i386.c (memory_address_length): Extract a part
+ * config/i386/i386.c (memory_address_length): Extract a part
  of code to rip_relative_addr_p.
  (rip_relative_addr_p): New Function.
  (ix86_macro_fusion_p): Ditto.
  (ix86_macro_fusion_pair_p): Ditto.
- * gcc/config/i386/i386.h: Add new tune features about macro-fusion.
- * gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto.
- * gcc/doc/tm.texi: Generated.
- * gcc/doc/tm.texi.in: Ditto.
- * gcc/haifa-sched.c (try_group_insn): New Function.
+ * config/i386/i386.h: Add new tune features about macro-fusion.
+ * config/i386/x86-tune.def (DEF_TUNE): Ditto.
+ * doc/tm.texi: Generated.
+ * doc/tm.texi.in: Ditto.
+ * haifa-sched.c (try_group_insn): New Function.
  (group_insns_for_macro_fusion): Ditto.
  (sched_init): Call group_insns_for_macro_fusion.
- * gcc/target.def: Add two hooks: macro_fusion_p and
+ * target.def: Add two hooks: macro_fusion_p and
  macro_fusion_pair_p.
  2013-11-04 Kostya Serebryany <kcc@google.com>
@@ -337,17 +348,17 @@
  2013-11-04 Wei Mi <wmi@google.com>
- * gcc/config/i386/i386-c.c (ix86_target_macros_internal): Separate
+ * config/i386/i386-c.c (ix86_target_macros_internal): Separate
  PROCESSOR_COREI7_AVX out from PROCESSOR_COREI7.
- * gcc/config/i386/i386.c (ix86_option_override_internal): Ditto.
+ * config/i386/i386.c (ix86_option_override_internal): Ditto.
  (ix86_issue_rate): Ditto.
  (ix86_adjust_cost): Ditto.
  (ia32_multipass_dfa_lookahead): Ditto.
  (ix86_sched_init_global): Ditto.
  (get_builtin_code_for_version): Ditto.
- * gcc/config/i386/i386.h (enum target_cpu_default): Ditto.
+ * config/i386/i386.h (enum target_cpu_default): Ditto.
  (enum processor_type): Ditto.
- * gcc/config/i386/x86-tune.def (DEF_TUNE): Ditto.
+ * config/i386/x86-tune.def (DEF_TUNE): Ditto.
  2013-11-04 Vladimir Makarov <vmakarov@redhat.com>
@@ -903,7 +914,7 @@
  2013-10-30 Tobias Burnus <burnus@net-b.de>
  PR other/33426
- * gcc/tree-cfg.c (replace_loop_annotate): Replace warning by
+ * tree-cfg.c (replace_loop_annotate): Replace warning by
  warning_at.
  2013-10-30 Jason Merrill <jason@redhat.com>
@@ -1024,10 +1035,10 @@
  2013-10-30 Christian Bruel <christian.bruel@st.com>
- * gcc/config/sh/sh-mem.cc (sh_expand_cmpnstr): New function.
+ * config/sh/sh-mem.cc (sh_expand_cmpnstr): New function.
  (sh_expand_cmpstr): Handle known align and schedule improvements.
- * gcc/config/sh/sh-protos.h (sh_expand_cmpstrn): Declare.
- * gcc/config/sh/sh.md (cmpstrnsi): New pattern.
+ * config/sh/sh-protos.h (sh_expand_cmpstrn): Declare.
+ * config/sh/sh.md (cmpstrnsi): New pattern.
  2013-10-30 Martin Jambor <mjambor@suse.cz>
@@ -2303,7 +2314,7 @@
  2013-10-24 Joern Rennecke <joern.rennecke@embecosm.com>
- * gcc/config/arc/arc.c (arc_ccfsm_post_advance): Also handle
+ * config/arc/arc.c (arc_ccfsm_post_advance): Also handle
  TYPE_UNCOND_BRANCH.
  (arc_ifcvt) <case 1 and 2>: Check that arc_ccfsm_post_advance
  changes statep->state.
@@ -2335,12 +2346,12 @@
  2013-10-25 Christian Bruel <christian.bruel@st.com>
  * config.gcc (sh-*): Add sh-mem.o to extra_obj.
- * gcc/config/sh/t-sh (sh-mem.o): New rule.
- * gcc/config/sh/sh-mem.cc (expand_block_move): Moved here.
+ * config/sh/t-sh (sh-mem.o): New rule.
+ * config/sh/sh-mem.cc (expand_block_move): Moved here.
  (sh_expand_cmpstr): New function.
- * gcc/config/sh/sh.c (force_into, expand_block_move): Move to sh-mem.c.
- * gcc/config/sh/sh-protos.h (sh_expand_cmpstr): Declare.
- * gcc/config/sh/sh.md (cmpstrsi, cmpstr_t): New patterns.
+ * config/sh/sh.c (force_into, expand_block_move): Move to sh-mem.c.
+ * config/sh/sh-protos.h (sh_expand_cmpstr): Declare.
+ * config/sh/sh.md (cmpstrsi, cmpstr_t): New patterns.
  (rotlhi3_8): Rename.
  2013-10-24 Jan-Benedict Glaw <jbglaw@lug-owl.de>
@@ -3184,7 +3195,7 @@
  2013-10-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
- * gcc/config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for
+ * config/rs6000/vector.md (vec_unpacks_hi_v4sf): Correct for
  endianness.
  (vec_unpacks_lo_v4sf): Likewise.
  (vec_unpacks_float_hi_v4si): Likewise.
@@ -3970,8 +3981,8 @@
  (anddi3_insn): Update type attribute.
  (xordi3_insn): Likewise.
  (one_cmpldi2): Likewise.
- * gcc/config/arm/vfp.md (movhf_vfp_neon): Update type attribute.
- * gcc/config/arm/neon.md (neon_mov): Update type attribute.
+ * config/arm/vfp.md (movhf_vfp_neon): Update type attribute.
+ * config/arm/neon.md (neon_mov): Update type attribute.
  (*movmisalign<mode>_neon_store): Likewise.
  (*movmisalign<mode>_neon_load): Likewise.
  (vec_set<mode>_internal): Likewise.

config/i386/i386.c

@@ -10657,8 +10657,12 @@ ix86_expand_prologue (void)
  if (STACK_CHECK_MOVING_SP)
    {
-     ix86_adjust_stack_and_probe (allocate);
-     allocate = 0;
+     if (!(crtl->is_leaf && !cfun->calls_alloca
+           && allocate <= PROBE_INTERVAL))
+       {
+         ix86_adjust_stack_and_probe (allocate);
+         allocate = 0;
+       }
    }
  else
    {
@@ -10668,9 +10672,26 @@ ix86_expand_prologue (void)
          size = 0x80000000 - STACK_CHECK_PROTECT - 1;
        if (TARGET_STACK_PROBE)
-         ix86_emit_probe_stack_range (0, size + STACK_CHECK_PROTECT);
+         {
+           if (crtl->is_leaf && !cfun->calls_alloca)
+             {
+               if (size > PROBE_INTERVAL)
+                 ix86_emit_probe_stack_range (0, size);
+             }
+           else
+             ix86_emit_probe_stack_range (0, size + STACK_CHECK_PROTECT);
+         }
        else
-         ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+         {
+           if (crtl->is_leaf && !cfun->calls_alloca)
+             {
+               if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+                 ix86_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                              size - STACK_CHECK_PROTECT);
+             }
+           else
+             ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+         }
      }
  }

config/ia64/ia64.c

@@ -3206,61 +3206,54 @@ gen_fr_restore_x (rtx dest, rtx src, rtx offset ATTRIBUTE_UNUSED)
  #define BACKING_STORE_SIZE(N) ((N) > 0 ? ((N) + (N)/63 + 1) * 8 : 0)
  /* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
-    inclusive. These are offsets from the current stack pointer. SOL is the
-    size of local registers. ??? This clobbers r2 and r3. */
+    inclusive. These are offsets from the current stack pointer. BS_SIZE
+    is the size of the backing store. ??? This clobbers r2 and r3. */
  static void
- ia64_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size, int sol)
+ ia64_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size,
+                              int bs_size)
  {
-   /* On the IA-64 there is a second stack in memory, namely the Backing Store
-      of the Register Stack Engine. We also need to probe it after checking
-      that the 2 stacks don't overlap. */
-   const int bs_size = BACKING_STORE_SIZE (sol);
    rtx r2 = gen_rtx_REG (Pmode, GR_REG (2));
    rtx r3 = gen_rtx_REG (Pmode, GR_REG (3));
+   rtx p6 = gen_rtx_REG (BImode, PR_REG (6));
-   /* Detect collision of the 2 stacks if necessary. */
-   if (bs_size > 0 || size > 0)
-     {
-       rtx p6 = gen_rtx_REG (BImode, PR_REG (6));
+   /* On the IA-64 there is a second stack in memory, namely the Backing Store
+      of the Register Stack Engine. We also need to probe it after checking
+      that the 2 stacks don't overlap. */
-       emit_insn (gen_bsp_value (r3));
-       emit_move_insn (r2, GEN_INT (-(first + size)));
+   emit_insn (gen_bsp_value (r3));
+   emit_move_insn (r2, GEN_INT (-(first + size)));
-       /* Compare current value of BSP and SP registers. */
-       emit_insn (gen_rtx_SET (VOIDmode, p6,
-                               gen_rtx_fmt_ee (LTU, BImode,
-                                               r3, stack_pointer_rtx)));
+   /* Compare current value of BSP and SP registers. */
+   emit_insn (gen_rtx_SET (VOIDmode, p6,
+                           gen_rtx_fmt_ee (LTU, BImode,
+                                           r3, stack_pointer_rtx)));
-       /* Compute the address of the probe for the Backing Store (which grows
-          towards higher addresses). We probe only at the first offset of
-          the next page because some OS (eg Linux/ia64) only extend the
-          backing store when this specific address is hit (but generate a SEGV
-          on other address). Page size is the worst case (4KB). The reserve
-          size is at least 4096 - (96 + 2) * 8 = 3312 bytes, which is enough.
-          Also compute the address of the last probe for the memory stack
-          (which grows towards lower addresses). */
-       emit_insn (gen_rtx_SET (VOIDmode, r3, plus_constant (Pmode, r3, 4095)));
-       emit_insn (gen_rtx_SET (VOIDmode, r2,
-                               gen_rtx_PLUS (Pmode, stack_pointer_rtx, r2)));
+   /* Compute the address of the probe for the Backing Store (which grows
+      towards higher addresses). We probe only at the first offset of
+      the next page because some OS (eg Linux/ia64) only extend the
+      backing store when this specific address is hit (but generate a SEGV
+      on other address). Page size is the worst case (4KB). The reserve
+      size is at least 4096 - (96 + 2) * 8 = 3312 bytes, which is enough.
+      Also compute the address of the last probe for the memory stack
+      (which grows towards lower addresses). */
+   emit_insn (gen_rtx_SET (VOIDmode, r3, plus_constant (Pmode, r3, 4095)));
+   emit_insn (gen_rtx_SET (VOIDmode, r2,
+                           gen_rtx_PLUS (Pmode, stack_pointer_rtx, r2)));
-       /* Compare them and raise SEGV if the former has topped the latter. */
-       emit_insn (gen_rtx_COND_EXEC (VOIDmode,
-                                     gen_rtx_fmt_ee (NE, VOIDmode, p6,
-                                                     const0_rtx),
-                                     gen_rtx_SET (VOIDmode, p6,
-                                                  gen_rtx_fmt_ee (GEU, BImode,
-                                                                  r3, r2))));
-       emit_insn (gen_rtx_SET (VOIDmode,
-                               gen_rtx_ZERO_EXTRACT (DImode, r3, GEN_INT (12),
-                                                     const0_rtx),
-                               const0_rtx));
-       emit_insn (gen_rtx_COND_EXEC (VOIDmode,
-                                     gen_rtx_fmt_ee (NE, VOIDmode, p6,
-                                                     const0_rtx),
-                                     gen_rtx_TRAP_IF (VOIDmode, const1_rtx,
-                                                      GEN_INT (11))));
-     }
+   /* Compare them and raise SEGV if the former has topped the latter. */
+   emit_insn (gen_rtx_COND_EXEC (VOIDmode,
+                                 gen_rtx_fmt_ee (NE, VOIDmode, p6, const0_rtx),
+                                 gen_rtx_SET (VOIDmode, p6,
+                                              gen_rtx_fmt_ee (GEU, BImode,
+                                                              r3, r2))));
+   emit_insn (gen_rtx_SET (VOIDmode,
+                           gen_rtx_ZERO_EXTRACT (DImode, r3, GEN_INT (12),
+                                                 const0_rtx),
+                           const0_rtx));
+   emit_insn (gen_rtx_COND_EXEC (VOIDmode,
+                                 gen_rtx_fmt_ee (NE, VOIDmode, p6, const0_rtx),
+                                 gen_rtx_TRAP_IF (VOIDmode, const1_rtx,
+                                                  GEN_INT (11))));
    /* Probe the Backing Store if necessary. */
    if (bs_size > 0)
@@ -3444,10 +3437,23 @@ ia64_expand_prologue (void)
      current_function_static_stack_size = current_frame_info.total_size;
    if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
-     ia64_emit_probe_stack_range (STACK_CHECK_PROTECT,
-                                  current_frame_info.total_size,
-                                  current_frame_info.n_input_regs
-                                    + current_frame_info.n_local_regs);
+     {
+       HOST_WIDE_INT size = current_frame_info.total_size;
+       int bs_size = BACKING_STORE_SIZE (current_frame_info.n_input_regs
+                                         + current_frame_info.n_local_regs);
+       if (crtl->is_leaf && !cfun->calls_alloca)
+         {
+           if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+             ia64_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                          size - STACK_CHECK_PROTECT,
+                                          bs_size);
+           else if (size + bs_size > STACK_CHECK_PROTECT)
+             ia64_emit_probe_stack_range (STACK_CHECK_PROTECT, 0, bs_size);
+         }
+       else if (size + bs_size > 0)
+         ia64_emit_probe_stack_range (STACK_CHECK_PROTECT, size, bs_size);
+     }
    if (dump_file)
      {
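
The IA-64 variant differs from the other ports because the caller now computes the backing store size and passes it in. The sketch below reproduces that arithmetic and the three-way decision from `ia64_expand_prologue` in standalone C. It is an illustration under assumptions: `ia64_plan_probe` and `struct ia64_probe_plan` are hypothetical names, and the values of `PROBE_INTERVAL` and `STACK_CHECK_PROTECT` are placeholders rather than the real target settings. Only the `BACKING_STORE_SIZE` formula is copied from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Formula copied from the diff: 8 bytes per register, plus one extra
   8-byte slot per 63 registers and one more, or 0 when N is 0.  */
#define BACKING_STORE_SIZE(N) ((N) > 0 ? ((N) + (N)/63 + 1) * 8 : 0)

/* Placeholder values for illustration; the real GCC macros are
   target-dependent.  */
#define PROBE_INTERVAL      4096L
#define STACK_CHECK_PROTECT 12288L

/* Hypothetical result type: the arguments that would be passed to
   ia64_emit_probe_stack_range after STACK_CHECK_PROTECT, if it is called.  */
struct ia64_probe_plan
{
  bool emit;    /* true if the probe routine is called */
  long size;    /* memory stack range to probe */
  int bs_size;  /* backing store size handed to the probe routine */
};

/* N_REGS stands for n_input_regs + n_local_regs in the real prologue.  */
static struct ia64_probe_plan
ia64_plan_probe (bool is_leaf, bool calls_alloca, long size, int n_regs)
{
  struct ia64_probe_plan plan = { false, 0, 0 };
  int bs_size = BACKING_STORE_SIZE (n_regs);

  assert (size >= 0 && n_regs >= 0);

  if (is_leaf && !calls_alloca)
    {
      if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
        {
          /* Large leaf frame: shortened memory stack range.  */
          plan.emit = true;
          plan.size = size - STACK_CHECK_PROTECT;
          plan.bs_size = bs_size;
        }
      else if (size + bs_size > STACK_CHECK_PROTECT)
        {
          /* Small frame but large combined footprint: the memory stack
             range is empty, the backing store size is still passed.  */
          plan.emit = true;
          plan.size = 0;
          plan.bs_size = bs_size;
        }
    }
  else if (size + bs_size > 0)
    {
      /* Non-leaf, or alloca is used: full range.  */
      plan.emit = true;
      plan.size = size;
      plan.bs_size = bs_size;
    }

  return plan;
}
```

For 96 registers the macro yields (96 + 1 + 1) * 8 = 784 bytes, which matches the "(96 + 2) * 8" term in the comment inside `ia64_emit_probe_stack_range`.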

config/mips/mips.c

@@ -10994,8 +10994,17 @@ mips_expand_prologue (void)
  if (flag_stack_usage_info)
    current_function_static_stack_size = size;
- if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size)
-   mips_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+ if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+   {
+     if (crtl->is_leaf && !cfun->calls_alloca)
+       {
+         if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+           mips_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                        size - STACK_CHECK_PROTECT);
+       }
+     else if (size > 0)
+       mips_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+   }
  /* Save the registers. Allocate up to MIPS_MAX_FIRST_STACK_STEP
     bytes beforehand; this is enough to cover the register save area

config/rs6000/rs6000.c

@@ -21538,8 +21538,19 @@ rs6000_emit_prologue (void)
  if (flag_stack_usage_info)
    current_function_static_stack_size = info->total_size;
- if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && info->total_size)
-   rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT, info->total_size);
+ if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+   {
+     HOST_WIDE_INT size = info->total_size;
+     if (crtl->is_leaf && !cfun->calls_alloca)
+       {
+         if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+           rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                          size - STACK_CHECK_PROTECT);
+       }
+     else if (size > 0)
+       rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+   }
  if (TARGET_FIX_AND_CONTINUE)
    {

config/sparc/sparc.c

@@ -5362,8 +5362,17 @@ sparc_expand_prologue (void)
  if (flag_stack_usage_info)
    current_function_static_stack_size = size;
- if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size)
-   sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+ if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+   {
+     if (crtl->is_leaf && !cfun->calls_alloca)
+       {
+         if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+           sparc_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                         size - STACK_CHECK_PROTECT);
+       }
+     else if (size > 0)
+       sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+   }
  if (size == 0)
    ; /* do nothing. */
@@ -5464,8 +5473,17 @@ sparc_flat_expand_prologue (void)
  if (flag_stack_usage_info)
    current_function_static_stack_size = size;
- if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && size)
-   sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+ if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+   {
+     if (crtl->is_leaf && !cfun->calls_alloca)
+       {
+         if (size > PROBE_INTERVAL && size > STACK_CHECK_PROTECT)
+           sparc_emit_probe_stack_range (STACK_CHECK_PROTECT,
+                                         size - STACK_CHECK_PROTECT);
+       }
+     else if (size > 0)
+       sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+   }
  if (sparc_save_local_in_regs_p)
    emit_save_or_restore_local_in_regs (stack_pointer_rtx, SPARC_STACK_BIAS,