Add SH4 support:

* config/sh/lib1funcs.asm (___movstr_i4_even, ___movstr_i4_odd): Define.
	(___movstrSI12_i4, ___sdivsi3_i4, ___udivsi3_i4): Define.
	* sh.c (reg_class_from_letter, regno_reg_class): Add DF_REGS.
	(fp_reg_names, assembler_dialect): New variables.
	(print_operand_address): Handle SUBREGs.
	(print_operand): Added 'o' case.
	Don't use adj_offsettable_operand on PRE_DEC / POST_INC.
	Name of FP registers depends on mode.
	(expand_block_move): Emit different code for SH4 hardware.
	(prepare_scc_operands): Use emit_sf_insn / emit_df_insn as appropriate.
	(from_compare): Likewise.
	(add_constant): New argument last_value.  Changed all callers.
	(find_barrier): Don't try HImode load for FPUL_REG.
	(machine_dependent_reorg): Likewise.
	(sfunc_uses_reg): A CLOBBER cannot be the address register use.
	(gen_far_branch): Emit a barrier after the new jump.
	(barrier_align): Don't trust instruction lengths before
	fixing up pcloads.
	(machine_dependent_reorg): Add support for FIRST_XD_REG .. LAST_XD_REG.
	Use auto-inc addressing for fp registers if doubles need to
	be loaded in two steps.
	Set sh_flag_remove_dead_before_cse.
	(push): Support for TARGET_FMOVD.  Use gen_push_fpul for fpul.
	(pop): Support for TARGET_FMOVD.  Use gen_pop_fpul for fpul.
	(calc_live_regs): Support for TARGET_FMOVD.  Don't save FPSCR.
	Support for FIRST_XD_REG .. LAST_XD_REG.
	(sh_expand_prologue): Support for FIRST_XD_REG .. LAST_XD_REG.
	(sh_expand_epilogue): Likewise.
	(sh_builtin_saveregs): Use DFmode moves for fp regs on SH4.
	(initial_elimination_offset): Take TARGET_ALIGN_DOUBLE into account.
	(arith_reg_operand): FPUL_REG is OK for SH4.
	(fp_arith_reg_operand, fp_extended_operand) New functions.
	(tertiary_reload_operand, fpscr_operand): Likewise.
	(commutative_float_operator, noncommutative_float_operator): Likewise.
	(binary_float_operator, get_fpscr_rtx, emit_sf_insn): Likewise.
	(emit_df_insn, expand_sf_unop, expand_sf_binop): Likewise.
	(expand_df_unop, expand_df_binop, expand_fp_branch): Likewise.
	(emit_fpscr_use, mark_use, remove_dead_before_cse): Likewise.
	* sh.h (CPP_SPEC): Add support for -m4, m4-single, m4-single-only.
	(CONDITIONAL_REGISTER_USAGE): Likewise.
	(HARD_SH4_BIT, FPU_SINGLE_BIT, SH4_BIT, FMOVD_BIT): Define.
	(TARGET_CACHE32, TARGET_SUPERSCALAR, TARGET_HARWARD): Define.
	(TARGET_HARD_SH4, TARGET_FPU_SINGLE, TARGET_SH4, TARGET_FMOVD): Define.
	(target_flag): Add -m4, m4-single, m4-single-only, -mfmovd.
	(OPTIMIZATION_OPTIONS): If optimizing, set flag_omit_frame_pointer
	to -1 and sh_flag_remove_dead_before_cse to 1.
	(ASSEMBLER_DIALECT): Define to assembler_dialect.
	(assembler_dialect, fp_reg_names): Declare.
	(OVERRIDE_OPTIONS): Add code for TARGET_SH4.
	Hide names of registers that are not accessible.
	(CACHE_LOG): Take TARGET_CACHE32 into account.
	(LOOP_ALIGN): Take TARGET_HARWARD into account.
	(FIRST_XD_REG, LAST_XD_REG, FPSCR_REG): Define.
	(FIRST_PSEUDO_REGISTER: Now 49.
	(FIXED_REGISTERS, CALL_USED_REGISTERS): Include values for registers.
	(HARD_REGNO_NREGS): Special treatment of FIRST_XD_REG .. LAST_XD_REG.
	(HARD_REGNO_MODE_OK): Update.
	(enum reg_class): Add DF_REGS and FPSCR_REGS.
	(REG_CLASS_NAMES, REG_CLASS_CONTENTS, REG_ALLOC_ORDER): Likewise.
	(SECONDARY_OUTPUT_RELOAD_CLASS, SECONDARY_INPUT_RELOAD_CLASS): Update.
	(CLASS_CANNOT_CHANGE_SIZE, DEBUG_REGISTER_NAMES): Define.
	(NPARM_REGS): Eight floating point parameter registers on SH4.
	(BASE_RETURN_VALUE_REG): SH4 also passes double values
	in floating point registers.
	(GET_SH_ARG_CLASS) Likewise.
	Complex float types are also returned in float registers.
	(BASE_ARG_REG): Complex float types are also passes in float registers.
	(FUNCTION_VALUE): Change mode like PROMOTE_MODE does.
	(LIBCALL_VALUE): Remove trailing semicolon.
	(ROUND_REG): Round when double precision value is passed in floating
	point register(s).
	(FUNCTION_ARG_ADVANCE): No change wanted for SH4 when things are
	passed on the stack.
	(FUNCTION_ARG): Little endian adjustment for SH4 SFmode.
	(FUNCTION_ARG_PARTIAL_NREGS): Zero for SH4.
	(TRAMPOLINE_ALIGNMENT): Take TARGET_HARWARD into account.
	(INITIALIZE_TRAMPOLINE): Emit ic_invalidate_line for TARGET_HARWARD.
	(MODE_DISP_OK_8): Not for SH4 DFmode.
	(GO_IF_LEGITIMATE_ADDRESS): No base reg + index reg for SH4 DFmode.
	Allow indexed addressing for PSImode after reload.
	(LEGITIMIZE_ADDRESS): Not for SH4 DFmode.
	(LEGITIMIZE_RELOAD_ADDRESS): Handle SH3E SFmode.
	Don't change SH4 DFmode nor PSImode RELOAD_FOR_INPUT_ADDRESS.
	(DOUBLE_TYPE_SIZE): 64 for SH4.
	(RTX_COSTS): Add PLUS case.
	Increae cost of ASHIFT, ASHIFTRT, LSHIFTRT case.
	(REGISTER_MOVE_COST): Add handling of R0_REGS, FPUL_REGS, T_REGS,
	MAC_REGS, PR_REGS, DF_REGS.
	(REGISTER_NAMES): Use fp_reg_names.
	(enum processor_type): Add PROCESSOR_SH4.
	(sh_flag_remove_dead_before_cse): Declare.
	(rtx_equal_function_value_matters, fpscr_rtx, get_fpscr_rtx): Declare.
	(PREDICATE_CODES): Add binary_float_operator,
	commutative_float_operator, fp_arith_reg_operand, fp_extended_operand,
	fpscr_operand, noncommutative_float_operator.
	(ADJUST_COST): Use different scale for TARGET_SUPERSCALAR.
	(SH_DYNAMIC_SHIFT_COST): Cheaper for SH4.
	* sh.md (attribute cpu): Add value sh4.
	(attrbutes fmovd, issues): Define.
	(attribute type): Add values dfp_arith, dfp_cmp, dfp_conv, dfdiv.
	(function units memory, int, mpy, fp): Make dependent on issue rate.
	(function units issue, single_issue, load_si, load): Define.
	(function units load_store, fdiv, gp_fpul): Define.
	(attribute hit_stack): Provide proper default.
	(use_sfunc_addr+1, udivsi3): Predicated on ! TARGET_SH4.
	(udivsi3_i4, udivsi3_i4_single, divsi3_i4, divsi3_i4_single): New insns.
	(udivsi3, divsi3): Emit special patterns for SH4 hardware,
	(mulsi3_call): Now uses match_operand for function address.
	(mulsi3): Also emit code for SH1 case.  Wrap result in REG_LIBCALL /
	REG_RETVAL notes.
	(push, pop, push_e, pop_e): Now define_expands.
	(push_fpul, push_4, pop_fpul, pop_4, ic_invalidate_line): New expanders.
	(movsi_ie): Added y/i alternative.
	(ic_invalidate_line_i, movdf_i4): New insns.
	(movdf_i4+[123], reload_outdf+[12345], movsi_y+[12]): New splitters.
	(reload_indf, reload_outdf, reload_outsf, reload_insi): New expanders.
	(movdf): Add special code for SH4.
	(movsf_ie, movsf_ie+1, reload_insf, calli): Make use of fpscr visible.
	(call_valuei, calli, call_value): Likewise.
	(movsf): Emit no-op move.
	(mov_nop, movsi_y): New insns.
	(blt, sge): generalize to handle DFmode.
	(return predicate): Call emit_fpscr_use and remove_dead_before_cse.
	(block_move_real, block_lump_real): Predicate on ! TARGET_HARD_SH4.
	(block_move_real_i4, block_lump_real_i4, fpu_switch): New insns.
	(fpu_switch0, fpu_switch1, movpsi): New expanders.
	(fpu_switch+[12], fix_truncsfsi2_i4_2+1): New splitters.
	(toggle_sz): New insn.
	(addsf3, subsf3, mulsf3, divsf3): Now define_expands.
	(addsf3_i, subsf3_i, mulsf3_i4, mulsf3_ie, divsf3_i): New insns.
	(macsf3): Make use of fpscr visible.  Disable for SH4.
	(floatsisf2): Make use of fpscr visible.
	(floatsisf2_i4): New insn.
	(floatsisf2_ie, fixsfsi, cmpgtsf_t, cmpeqsf_t): Disable for SH4.
	(ieee_ccmpeqsf_t): Likewise.
	(fix_truncsfsi2): Emit different code for SH4.
	(fix_truncsfsi2_i4, fix_truncsfsi2_i4_2, cmpgtsf_t_i4): New insns.
	(cmpeqsf_t_i4, ieee_ccmpeqsf_t_4): New insns.
	(negsf2, sqrtsf2, abssf2): Now expanders.
	(adddf3, subdf3i, muldf2, divdf3, floatsidf2): New expanders.
	(negsf2_i, sqrtsf2_i, abssf2_i, adddf3_i, subdf3_i): New insns.
	(muldf3_i, divdf3_i, floatsidf2_i, fix_truncdfsi2_i): New insns.
	(fix_truncdfsi2, cmpdf, negdf2, sqrtdf2, absdf2): New expanders.
	(fix_truncdfsi2_i4, cmpgtdf_t, cmpeqdf_t, ieee_ccmpeqdf_t): New insns.
	(fix_truncdfsi2_i4_2+1): New splitters.
	(negdf2_i, sqrtdf2_i, absdf2_i, extendsfdf2_i4): New insns.
	(extendsfdf2, truncdfsf2): New expanders.
	(truncdfsf2_i4): New insn.
	* t-sh (LIB1ASMFUNCS): Add _movstr_i4, _sdivsi3_i4, _udivsi3_i4.
	(MULTILIB_OPTIONS): Add m4-single-only/m4-single/m4.
	* float-sh.h: When testing for __SH3E__, also test for
	__SH4_SINGLE_ONLY__ .
	* va-sh.h (__va_freg): Define to float.
	(__va_greg, __fa_freg, __gnuc_va_list, va_start):
        Define for __SH4_SINGLE_ONLY__ like for __SH3E__ .
        (__PASS_AS_FLOAT, __TARGET_SH4_P): Likewise.
	(__PASS_AS_FLOAT): Use different definition for __SH4__ and
	 __SH4_SINGLE__.
	(TARGET_SH4_P): Define.
	(va_arg): Use it.
	* sh.md (movdf_k, movsf_i): Tweak the condition so that
	init_expr_once is satisfied about the existence of load / store insns.
	* sh.md (movsi_i, movsi_ie, movsi_i_lowpart, movsf_i, movsf_ie):
        change m constraint in source operand to mr / mf .
	* va-sh.h (__va_arg_sh1): Use __asm instead of asm.
	* (__VA_REEF): Define.
	(__va_arg_sh1): Use it.
	* va-sh.h (va_start, va_arg, va_copy): Add parenteses.

From-SVN: r23777
This commit is contained in:
J"orn Rennecke 1998-11-23 08:50:42 +00:00 committed by Joern Rennecke
parent 57cfc5dd86
commit 225e4f43cc
8 changed files with 2804 additions and 277 deletions

View File

@ -1,3 +1,181 @@
Mon Nov 23 16:46:46 1998 J"orn Rennecke <amylaar@cygnus.co.uk>
Add SH4 support:
* config/sh/lib1funcs.asm (___movstr_i4_even, ___movstr_i4_odd): Define.
(___movstrSI12_i4, ___sdivsi3_i4, ___udivsi3_i4): Define.
* sh.c (reg_class_from_letter, regno_reg_class): Add DF_REGS.
(fp_reg_names, assembler_dialect): New variables.
(print_operand_address): Handle SUBREGs.
(print_operand): Added 'o' case.
Don't use adj_offsettable_operand on PRE_DEC / POST_INC.
Name of FP registers depends on mode.
(expand_block_move): Emit different code for SH4 hardware.
(prepare_scc_operands): Use emit_sf_insn / emit_df_insn as appropriate.
(from_compare): Likewise.
(add_constant): New argument last_value. Changed all callers.
(find_barrier): Don't try HImode load for FPUL_REG.
(machine_dependent_reorg): Likewise.
(sfunc_uses_reg): A CLOBBER cannot be the address register use.
(gen_far_branch): Emit a barrier after the new jump.
(barrier_align): Don't trust instruction lengths before
fixing up pcloads.
(machine_dependent_reorg): Add support for FIRST_XD_REG .. LAST_XD_REG.
Use auto-inc addressing for fp registers if doubles need to
be loaded in two steps.
Set sh_flag_remove_dead_before_cse.
(push): Support for TARGET_FMOVD. Use gen_push_fpul for fpul.
(pop): Support for TARGET_FMOVD. Use gen_pop_fpul for fpul.
(calc_live_regs): Support for TARGET_FMOVD. Don't save FPSCR.
Support for FIRST_XD_REG .. LAST_XD_REG.
(sh_expand_prologue): Support for FIRST_XD_REG .. LAST_XD_REG.
(sh_expand_epilogue): Likewise.
(sh_builtin_saveregs): Use DFmode moves for fp regs on SH4.
(initial_elimination_offset): Take TARGET_ALIGN_DOUBLE into account.
(arith_reg_operand): FPUL_REG is OK for SH4.
(fp_arith_reg_operand, fp_extended_operand) New functions.
(tertiary_reload_operand, fpscr_operand): Likewise.
(commutative_float_operator, noncommutative_float_operator): Likewise.
(binary_float_operator, get_fpscr_rtx, emit_sf_insn): Likewise.
(emit_df_insn, expand_sf_unop, expand_sf_binop): Likewise.
(expand_df_unop, expand_df_binop, expand_fp_branch): Likewise.
(emit_fpscr_use, mark_use, remove_dead_before_cse): Likewise.
* sh.h (CPP_SPEC): Add support for -m4, m4-single, m4-single-only.
(CONDITIONAL_REGISTER_USAGE): Likewise.
(HARD_SH4_BIT, FPU_SINGLE_BIT, SH4_BIT, FMOVD_BIT): Define.
(TARGET_CACHE32, TARGET_SUPERSCALAR, TARGET_HARWARD): Define.
(TARGET_HARD_SH4, TARGET_FPU_SINGLE, TARGET_SH4, TARGET_FMOVD): Define.
(target_flag): Add -m4, m4-single, m4-single-only, -mfmovd.
(OPTIMIZATION_OPTIONS): If optimizing, set flag_omit_frame_pointer
to -1 and sh_flag_remove_dead_before_cse to 1.
(ASSEMBLER_DIALECT): Define to assembler_dialect.
(assembler_dialect, fp_reg_names): Declare.
(OVERRIDE_OPTIONS): Add code for TARGET_SH4.
Hide names of registers that are not accessible.
(CACHE_LOG): Take TARGET_CACHE32 into account.
(LOOP_ALIGN): Take TARGET_HARWARD into account.
(FIRST_XD_REG, LAST_XD_REG, FPSCR_REG): Define.
(FIRST_PSEUDO_REGISTER: Now 49.
(FIXED_REGISTERS, CALL_USED_REGISTERS): Include values for registers.
(HARD_REGNO_NREGS): Special treatment of FIRST_XD_REG .. LAST_XD_REG.
(HARD_REGNO_MODE_OK): Update.
(enum reg_class): Add DF_REGS and FPSCR_REGS.
(REG_CLASS_NAMES, REG_CLASS_CONTENTS, REG_ALLOC_ORDER): Likewise.
(SECONDARY_OUTPUT_RELOAD_CLASS, SECONDARY_INPUT_RELOAD_CLASS): Update.
(CLASS_CANNOT_CHANGE_SIZE, DEBUG_REGISTER_NAMES): Define.
(NPARM_REGS): Eight floating point parameter registers on SH4.
(BASE_RETURN_VALUE_REG): SH4 also passes double values
in floating point registers.
(GET_SH_ARG_CLASS) Likewise.
Complex float types are also returned in float registers.
(BASE_ARG_REG): Complex float types are also passes in float registers.
(FUNCTION_VALUE): Change mode like PROMOTE_MODE does.
(LIBCALL_VALUE): Remove trailing semicolon.
(ROUND_REG): Round when double precision value is passed in floating
point register(s).
(FUNCTION_ARG_ADVANCE): No change wanted for SH4 when things are
passed on the stack.
(FUNCTION_ARG): Little endian adjustment for SH4 SFmode.
(FUNCTION_ARG_PARTIAL_NREGS): Zero for SH4.
(TRAMPOLINE_ALIGNMENT): Take TARGET_HARWARD into account.
(INITIALIZE_TRAMPOLINE): Emit ic_invalidate_line for TARGET_HARWARD.
(MODE_DISP_OK_8): Not for SH4 DFmode.
(GO_IF_LEGITIMATE_ADDRESS): No base reg + index reg for SH4 DFmode.
Allow indexed addressing for PSImode after reload.
(LEGITIMIZE_ADDRESS): Not for SH4 DFmode.
(LEGITIMIZE_RELOAD_ADDRESS): Handle SH3E SFmode.
Don't change SH4 DFmode nor PSImode RELOAD_FOR_INPUT_ADDRESS.
(DOUBLE_TYPE_SIZE): 64 for SH4.
(RTX_COSTS): Add PLUS case.
Increae cost of ASHIFT, ASHIFTRT, LSHIFTRT case.
(REGISTER_MOVE_COST): Add handling of R0_REGS, FPUL_REGS, T_REGS,
MAC_REGS, PR_REGS, DF_REGS.
(REGISTER_NAMES): Use fp_reg_names.
(enum processor_type): Add PROCESSOR_SH4.
(sh_flag_remove_dead_before_cse): Declare.
(rtx_equal_function_value_matters, fpscr_rtx, get_fpscr_rtx): Declare.
(PREDICATE_CODES): Add binary_float_operator,
commutative_float_operator, fp_arith_reg_operand, fp_extended_operand,
fpscr_operand, noncommutative_float_operator.
(ADJUST_COST): Use different scale for TARGET_SUPERSCALAR.
(SH_DYNAMIC_SHIFT_COST): Cheaper for SH4.
* sh.md (attribute cpu): Add value sh4.
(attrbutes fmovd, issues): Define.
(attribute type): Add values dfp_arith, dfp_cmp, dfp_conv, dfdiv.
(function units memory, int, mpy, fp): Make dependent on issue rate.
(function units issue, single_issue, load_si, load): Define.
(function units load_store, fdiv, gp_fpul): Define.
(attribute hit_stack): Provide proper default.
(use_sfunc_addr+1, udivsi3): Predicated on ! TARGET_SH4.
(udivsi3_i4, udivsi3_i4_single, divsi3_i4, divsi3_i4_single): New insns.
(udivsi3, divsi3): Emit special patterns for SH4 hardware,
(mulsi3_call): Now uses match_operand for function address.
(mulsi3): Also emit code for SH1 case. Wrap result in REG_LIBCALL /
REG_RETVAL notes.
(push, pop, push_e, pop_e): Now define_expands.
(push_fpul, push_4, pop_fpul, pop_4, ic_invalidate_line): New expanders.
(movsi_ie): Added y/i alternative.
(ic_invalidate_line_i, movdf_i4): New insns.
(movdf_i4+[123], reload_outdf+[12345], movsi_y+[12]): New splitters.
(reload_indf, reload_outdf, reload_outsf, reload_insi): New expanders.
(movdf): Add special code for SH4.
(movsf_ie, movsf_ie+1, reload_insf, calli): Make use of fpscr visible.
(call_valuei, calli, call_value): Likewise.
(movsf): Emit no-op move.
(mov_nop, movsi_y): New insns.
(blt, sge): generalize to handle DFmode.
(return predicate): Call emit_fpscr_use and remove_dead_before_cse.
(block_move_real, block_lump_real): Predicate on ! TARGET_HARD_SH4.
(block_move_real_i4, block_lump_real_i4, fpu_switch): New insns.
(fpu_switch0, fpu_switch1, movpsi): New expanders.
(fpu_switch+[12], fix_truncsfsi2_i4_2+1): New splitters.
(toggle_sz): New insn.
(addsf3, subsf3, mulsf3, divsf3): Now define_expands.
(addsf3_i, subsf3_i, mulsf3_i4, mulsf3_ie, divsf3_i): New insns.
(macsf3): Make use of fpscr visible. Disable for SH4.
(floatsisf2): Make use of fpscr visible.
(floatsisf2_i4): New insn.
(floatsisf2_ie, fixsfsi, cmpgtsf_t, cmpeqsf_t): Disable for SH4.
(ieee_ccmpeqsf_t): Likewise.
(fix_truncsfsi2): Emit different code for SH4.
(fix_truncsfsi2_i4, fix_truncsfsi2_i4_2, cmpgtsf_t_i4): New insns.
(cmpeqsf_t_i4, ieee_ccmpeqsf_t_4): New insns.
(negsf2, sqrtsf2, abssf2): Now expanders.
(adddf3, subdf3i, muldf2, divdf3, floatsidf2): New expanders.
(negsf2_i, sqrtsf2_i, abssf2_i, adddf3_i, subdf3_i): New insns.
(muldf3_i, divdf3_i, floatsidf2_i, fix_truncdfsi2_i): New insns.
(fix_truncdfsi2, cmpdf, negdf2, sqrtdf2, absdf2): New expanders.
(fix_truncdfsi2_i4, cmpgtdf_t, cmpeqdf_t, ieee_ccmpeqdf_t): New insns.
(fix_truncdfsi2_i4_2+1): New splitters.
(negdf2_i, sqrtdf2_i, absdf2_i, extendsfdf2_i4): New insns.
(extendsfdf2, truncdfsf2): New expanders.
(truncdfsf2_i4): New insn.
* t-sh (LIB1ASMFUNCS): Add _movstr_i4, _sdivsi3_i4, _udivsi3_i4.
(MULTILIB_OPTIONS): Add m4-single-only/m4-single/m4.
* float-sh.h: When testing for __SH3E__, also test for
__SH4_SINGLE_ONLY__ .
* va-sh.h (__va_freg): Define to float.
(__va_greg, __fa_freg, __gnuc_va_list, va_start):
Define for __SH4_SINGLE_ONLY__ like for __SH3E__ .
(__PASS_AS_FLOAT, __TARGET_SH4_P): Likewise.
(__PASS_AS_FLOAT): Use different definition for __SH4__ and
__SH4_SINGLE__.
(TARGET_SH4_P): Define.
(va_arg): Use it.
* sh.md (movdf_k, movsf_i): Tweak the condition so that
init_expr_once is satisfied about the existence of load / store insns.
* sh.md (movsi_i, movsi_ie, movsi_i_lowpart, movsf_i, movsf_ie):
change m constraint in source operand to mr / mf .
* va-sh.h (__va_arg_sh1): Use __asm instead of asm.
* (__VA_REEF): Define.
(__va_arg_sh1): Use it.
* va-sh.h (va_start, va_arg, va_copy): Add parenteses.
Sun Nov 22 21:34:02 1998 Jeffrey A Law (law@cygnus.com)
* i386/dgux.c (struct option): Add new "description field".

View File

@ -37,7 +37,7 @@
#undef FLT_MAX_10_EXP
#define FLT_MAX_10_EXP 38
#ifdef __SH3E__
#if defined (__SH3E__) || defined (__SH4_SINGLE_ONLY__)
/* Number of base-FLT_RADIX digits in the significand of a double */
#undef DBL_MANT_DIG

View File

@ -770,6 +770,64 @@ ___movstr:
add #64,r4
#endif
#ifdef L_movstr_i4
#if defined(__SH4__) || defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__)
.text
.global ___movstr_i4_even
.global ___movstr_i4_odd
.global ___movstrSI12_i4
.p2align 5
L_movstr_2mod4_end:
mov.l r0,@(16,r4)
rts
mov.l r1,@(20,r4)
.p2align 2
___movstr_i4_odd:
mov.l @r5+,r1
add #-4,r4
mov.l @r5+,r2
mov.l @r5+,r3
mov.l r1,@(4,r4)
mov.l r2,@(8,r4)
L_movstr_loop:
mov.l r3,@(12,r4)
dt r6
mov.l @r5+,r0
bt/s L_movstr_2mod4_end
mov.l @r5+,r1
add #16,r4
L_movstr_start_even:
mov.l @r5+,r2
mov.l @r5+,r3
mov.l r0,@r4
dt r6
mov.l r1,@(4,r4)
bf/s L_movstr_loop
mov.l r2,@(8,r4)
rts
mov.l r3,@(12,r4)
___movstr_i4_even:
mov.l @r5+,r0
bra L_movstr_start_even
mov.l @r5+,r1
.p2align 4
___movstrSI12_i4:
mov.l @r5,r0
mov.l @(4,r5),r1
mov.l @(8,r5),r2
mov.l r0,@r4
mov.l r1,@(4,r4)
rts
mov.l r2,@(8,r4)
#endif /* ! __SH4__ */
#endif
#ifdef L_mulsi3
@ -808,9 +866,47 @@ hiset: sts macl,r0 ! r0 = bb*dd
#endif
#ifdef L_sdivsi3
#ifdef L_sdivsi3_i4
.title "SH DIVIDE"
!! 4 byte integer Divide code for the Hitachi SH
#ifdef __SH4__
!! args in r4 and r5, result in fpul, clobber dr0, dr2
.global ___sdivsi3_i4
___sdivsi3_i4:
lds r4,fpul
float fpul,dr0
lds r5,fpul
float fpul,dr2
fdiv dr2,dr0
rts
ftrc dr0,fpul
#elif defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__)
!! args in r4 and r5, result in fpul, clobber r2, dr0, dr2
.global ___sdivsi3_i4
___sdivsi3_i4:
sts.l fpscr,@-r15
mov #8,r2
swap.w r2,r2
lds r2,fpscr
lds r4,fpul
float fpul,dr0
lds r5,fpul
float fpul,dr2
fdiv dr2,dr0
ftrc dr0,fpul
rts
lds.l @r15+,fpscr
#endif /* ! __SH4__ */
#endif
#ifdef L_sdivsi3
/* __SH4_SINGLE_ONLY__ keeps this part for link compatibility with
sh3e code. */
#if ! defined(__SH4__) && ! defined (__SH4_SINGLE__)
!!
!! Steve Chamberlain
!! sac@cygnus.com
@ -904,11 +1000,109 @@ ___sdivsi3:
div0: rts
mov #0,r0
#endif /* ! __SH4__ */
#endif
#ifdef L_udivsi3
#ifdef L_udivsi3_i4
.title "SH DIVIDE"
!! 4 byte integer Divide code for the Hitachi SH
#ifdef __SH4__
!! args in r4 and r5, result in fpul, clobber r0, r1, r4, r5, dr0, dr2, dr4
.global ___udivsi3_i4
___udivsi3_i4:
mov #1,r1
cmp/hi r1,r5
bf trivial
rotr r1
xor r1,r4
lds r4,fpul
mova L1,r0
#ifdef FMOVD_WORKS
fmov.d @r0+,dr4
#else
#ifdef __LITTLE_ENDIAN__
fmov.s @r0+,fr5
fmov.s @r0,fr4
#else
fmov.s @r0+,fr4
fmov.s @r0,fr5
#endif
#endif
float fpul,dr0
xor r1,r5
lds r5,fpul
float fpul,dr2
fadd dr4,dr0
fadd dr4,dr2
fdiv dr2,dr0
rts
ftrc dr0,fpul
trivial:
rts
lds r4,fpul
.align 2
L1:
.double 2147483648
#elif defined(__SH4_SINGLE__) || defined(__SH4_SINGLE_ONLY__)
!! args in r4 and r5, result in fpul, clobber r0, r1, r4, r5, dr0, dr2, dr4
.global ___udivsi3_i4
___udivsi3_i4:
mov #1,r1
cmp/hi r1,r5
bf trivial
sts.l fpscr,@-r15
mova L1,r0
lds.l @r0+,fpscr
rotr r1
xor r1,r4
lds r4,fpul
#ifdef FMOVD_WORKS
fmov.d @r0+,dr4
#else
#ifdef __LITTLE_ENDIAN__
fmov.s @r0+,fr5
fmov.s @r0,fr4
#else
fmov.s @r0+,fr4
fmov.s @r0,fr5
#endif
#endif
float fpul,dr0
xor r1,r5
lds r5,fpul
float fpul,dr2
fadd dr4,dr0
fadd dr4,dr2
fdiv dr2,dr0
ftrc dr0,fpul
rts
lds.l @r15+,fpscr
trivial:
rts
lds r4,fpul
.align 2
L1:
#ifdef __LITTLE_ENDIAN__
.long 0x80000
#else
.long 0x180000
#endif
.double 2147483648
#endif /* ! __SH4__ */
#endif
#ifdef L_udivsi3
/* __SH4_SINGLE_ONLY__ keeps this part for link compatibility with
sh3e code. */
#if ! defined(__SH4__) && ! defined (__SH4_SINGLE__)
!!
!! Steve Chamberlain
!! sac@cygnus.com
@ -966,22 +1160,40 @@ vshortway:
ret: rts
mov r4,r0
#endif /* __SH4__ */
#endif
#ifdef L_set_fpscr
#if defined (__SH3E__)
#if defined (__SH3E__) || defined(__SH4_SINGLE__) || defined(__SH4__) || defined(__SH4_SINGLE_ONLY__)
.global ___set_fpscr
___set_fpscr:
lds r4,fpscr
mov.l ___set_fpscr_L1,r1
swap.w r4,r0
or #24,r0
#ifndef FMOVD_WORKS
xor #16,r0
#endif
#if defined(__SH4__)
swap.w r0,r3
mov.l r3,@(4,r1)
#else /* defined(__SH3E__) || defined(__SH4_SINGLE*__) */
swap.w r0,r2
mov.l r2,@r1
#endif
#ifndef FMOVD_WORKS
xor #8,r0
#else
xor #24,r0
#endif
#if defined(__SH4__)
swap.w r0,r2
rts
mov.l r2,@r1
#else /* defined(__SH3E__) || defined(__SH4_SINGLE*__) */
swap.w r0,r3
rts
mov.l r3,@(4,r1)
#endif
.align 2
___set_fpscr_L1:
.long ___fpscr_values
@ -990,5 +1202,5 @@ ___set_fpscr_L1:
#else
.comm ___fpscr_values,8
#endif /* ELF */
#endif /* SH3E */
#endif /* SH3E / SH4 */
#endif /* L_set_fpscr */

View File

@ -1,5 +1,5 @@
/* Output routines for GCC for Hitachi Super-H.
Copyright (C) 1993, 1994, 1995, 1996, 1997 Free Software Foundation, Inc.
Copyright (C) 1993-1998 Free Software Foundation, Inc.
This file is part of GNU CC.
@ -103,6 +103,17 @@ int regno_reg_class[FIRST_PSEUDO_REGISTER] =
FP_REGS, FP_REGS, FP_REGS, FP_REGS,
FP_REGS, FP_REGS, FP_REGS, FP_REGS,
FP_REGS, FP_REGS, FP_REGS, FP_REGS,
DF_REGS, DF_REGS, DF_REGS, DF_REGS,
DF_REGS, DF_REGS, DF_REGS, DF_REGS,
FPSCR_REGS,
};
char fp_reg_names[][5] =
{
"fr0", "fr1", "fr2", "fr3", "fr4", "fr5", "fr6", "fr7",
"fr8", "fr9", "fr10", "fr11", "fr12", "fr13", "fr14", "fr15",
"fpul",
"xd0","xd2","xd4", "xd6", "xd8", "xd10", "xd12", "xd14",
};
/* Provide reg_class from a letter such as appears in the machine
@ -110,7 +121,7 @@ int regno_reg_class[FIRST_PSEUDO_REGISTER] =
enum reg_class reg_class_from_letter[] =
{
/* a */ NO_REGS, /* b */ NO_REGS, /* c */ NO_REGS, /* d */ NO_REGS,
/* a */ ALL_REGS, /* b */ NO_REGS, /* c */ FPSCR_REGS, /* d */ DF_REGS,
/* e */ NO_REGS, /* f */ FP_REGS, /* g */ NO_REGS, /* h */ NO_REGS,
/* i */ NO_REGS, /* j */ NO_REGS, /* k */ NO_REGS, /* l */ PR_REGS,
/* m */ NO_REGS, /* n */ NO_REGS, /* o */ NO_REGS, /* p */ NO_REGS,
@ -119,6 +130,12 @@ enum reg_class reg_class_from_letter[] =
/* y */ FPUL_REGS, /* z */ R0_REGS
};
int assembler_dialect;
rtx get_fpscr_rtx ();
void emit_sf_insn ();
void emit_df_insn ();
static void split_branches PROTO ((rtx));
/* Print the operand address in x to the stream. */
@ -131,7 +148,8 @@ print_operand_address (stream, x)
switch (GET_CODE (x))
{
case REG:
fprintf (stream, "@%s", reg_names[REGNO (x)]);
case SUBREG:
fprintf (stream, "@%s", reg_names[true_regnum (x)]);
break;
case PLUS:
@ -143,13 +161,19 @@ print_operand_address (stream, x)
{
case CONST_INT:
fprintf (stream, "@(%d,%s)", INTVAL (index),
reg_names[REGNO (base)]);
reg_names[true_regnum (base)]);
break;
case REG:
fprintf (stream, "@(r0,%s)",
reg_names[MAX (REGNO (base), REGNO (index))]);
break;
case SUBREG:
{
int base_num = true_regnum (base);
int index_num = true_regnum (index);
fprintf (stream, "@(r0,%s)",
reg_names[MAX (base_num, index_num)]);
break;
}
default:
debug_rtx (x);
@ -159,11 +183,11 @@ print_operand_address (stream, x)
break;
case PRE_DEC:
fprintf (stream, "@-%s", reg_names[REGNO (XEXP (x, 0))]);
fprintf (stream, "@-%s", reg_names[true_regnum (XEXP (x, 0))]);
break;
case POST_INC:
fprintf (stream, "@%s+", reg_names[REGNO (XEXP (x, 0))]);
fprintf (stream, "@%s+", reg_names[true_regnum (XEXP (x, 0))]);
break;
default:
@ -182,7 +206,8 @@ print_operand_address (stream, x)
'O' print a constant without the #
'R' print the LSW of a dp value - changes if in little endian
'S' print the MSW of a dp value - changes if in little endian
'T' print the next word of a dp value - same as 'R' in big endian mode. */
'T' print the next word of a dp value - same as 'R' in big endian mode.
'o' output an operator. */
void
print_operand (stream, x, code)
@ -230,16 +255,31 @@ print_operand (stream, x, code)
fputs (reg_names[REGNO (x) + 1], (stream));
break;
case MEM:
print_operand_address (stream,
XEXP (adj_offsettable_operand (x, 4), 0));
if (GET_CODE (XEXP (x, 0)) != PRE_DEC
&& GET_CODE (XEXP (x, 0)) != POST_INC)
x = adj_offsettable_operand (x, 4);
print_operand_address (stream, XEXP (x, 0));
break;
}
break;
case 'o':
switch (GET_CODE (x))
{
case PLUS: fputs ("add", stream); break;
case MINUS: fputs ("sub", stream); break;
case MULT: fputs ("mul", stream); break;
case DIV: fputs ("div", stream); break;
}
break;
default:
switch (GET_CODE (x))
{
case REG:
fputs (reg_names[REGNO (x)], (stream));
if (REGNO (x) >= FIRST_FP_REG && REGNO (x) <= LAST_FP_REG
&& GET_MODE_SIZE (GET_MODE (x)) > 4)
fprintf ((stream), "d%s", reg_names[REGNO (x)]+1);
else
fputs (reg_names[REGNO (x)], (stream));
break;
case MEM:
output_address (XEXP (x, 0));
@ -273,6 +313,55 @@ expand_block_move (operands)
if (! constp || align < 4 || (bytes % 4 != 0))
return 0;
if (TARGET_HARD_SH4)
{
if (bytes < 12)
return 0;
else if (bytes == 12)
{
tree entry_name;
rtx func_addr_rtx;
rtx r4 = gen_rtx (REG, SImode, 4);
rtx r5 = gen_rtx (REG, SImode, 5);
entry_name = get_identifier ("__movstrSI12_i4");
func_addr_rtx
= copy_to_mode_reg (Pmode,
gen_rtx_SYMBOL_REF (Pmode,
IDENTIFIER_POINTER (entry_name)));
emit_insn (gen_move_insn (r4, XEXP (operands[0], 0)));
emit_insn (gen_move_insn (r5, XEXP (operands[1], 0)));
emit_insn (gen_block_move_real_i4 (func_addr_rtx));
return 1;
}
else if (! TARGET_SMALLCODE)
{
tree entry_name;
rtx func_addr_rtx;
int dwords;
rtx r4 = gen_rtx (REG, SImode, 4);
rtx r5 = gen_rtx (REG, SImode, 5);
rtx r6 = gen_rtx (REG, SImode, 6);
entry_name = get_identifier (bytes & 4
? "__movstr_i4_odd"
: "__movstr_i4_even");
func_addr_rtx
= copy_to_mode_reg (Pmode,
gen_rtx_SYMBOL_REF (Pmode,
IDENTIFIER_POINTER (entry_name)));
emit_insn (gen_move_insn (r4, XEXP (operands[0], 0)));
emit_insn (gen_move_insn (r5, XEXP (operands[1], 0)));
dwords = bytes >> 3;
emit_insn (gen_move_insn (r6, GEN_INT (dwords - 1)));
emit_insn (gen_block_lump_real_i4 (func_addr_rtx));
return 1;
}
else
return 0;
}
if (bytes < 64)
{
char entry[30];
@ -405,9 +494,17 @@ prepare_scc_operands (code)
|| TARGET_SH3E && GET_MODE_CLASS (mode) == MODE_FLOAT)
sh_compare_op1 = force_reg (mode, sh_compare_op1);
emit_insn (gen_rtx (SET, VOIDmode, t_reg,
gen_rtx (code, SImode, sh_compare_op0,
sh_compare_op1)));
if (TARGET_SH4 && GET_MODE_CLASS (mode) == MODE_FLOAT)
(mode == SFmode ? emit_sf_insn : emit_df_insn)
(gen_rtx (PARALLEL, VOIDmode, gen_rtvec (2,
gen_rtx (SET, VOIDmode, t_reg,
gen_rtx (code, SImode,
sh_compare_op0, sh_compare_op1)),
gen_rtx (USE, VOIDmode, get_fpscr_rtx ()))));
else
emit_insn (gen_rtx (SET, VOIDmode, t_reg,
gen_rtx (code, SImode, sh_compare_op0,
sh_compare_op1)));
return t_reg;
}
@ -443,7 +540,15 @@ from_compare (operands, code)
insn = gen_rtx (SET, VOIDmode,
gen_rtx (REG, SImode, 18),
gen_rtx (code, SImode, sh_compare_op0, sh_compare_op1));
emit_insn (insn);
if (TARGET_SH4 && GET_MODE_CLASS (mode) == MODE_FLOAT)
{
insn = gen_rtx (PARALLEL, VOIDmode,
gen_rtvec (2, insn,
gen_rtx (USE, VOIDmode, get_fpscr_rtx ())));
(mode == SFmode ? emit_sf_insn : emit_df_insn) (insn);
}
else
emit_insn (insn);
}
/* Functions to output assembly code. */
@ -1722,7 +1827,8 @@ static int pool_size;
/* Add a constant to the pool and return its label. */
static rtx
add_constant (x, mode)
add_constant (x, mode, last_value)
rtx last_value;
rtx x;
enum machine_mode mode;
{
@ -1741,13 +1847,27 @@ add_constant (x, mode)
continue;
}
if (rtx_equal_p (x, pool_vector[i].value))
return pool_vector[i].label;
{
lab = 0;
if (! last_value
|| ! i
|| ! rtx_equal_p (last_value, pool_vector[i-1].value))
{
lab = pool_vector[i].label;
if (! lab)
pool_vector[i].label = lab = gen_label_rtx ();
}
return lab;
}
}
}
/* Need a new one. */
pool_vector[pool_size].value = x;
lab = gen_label_rtx ();
if (last_value && rtx_equal_p (last_value, pool_vector[pool_size - 1].value))
lab = 0;
else
lab = gen_label_rtx ();
pool_vector[pool_size].mode = mode;
pool_vector[pool_size].label = lab;
pool_size++;
@ -1965,7 +2085,8 @@ find_barrier (num_mova, mova, from)
/* We must explicitly check the mode, because sometimes the
front end will generate code to load unsigned constants into
HImode targets without properly sign extending them. */
if (mode == HImode || (mode == SImode && hi_const (src)))
if (mode == HImode
|| (mode == SImode && hi_const (src) && REGNO (dst) != FPUL_REG))
{
found_hi += 2;
/* We put the short constants before the long constants, so
@ -2130,7 +2251,7 @@ sfunc_uses_reg (insn)
for (i = XVECLEN (pattern, 0) - 1; i >= 0; i--)
{
part = XVECEXP (pattern, 0, i);
if (part == reg_part)
if (part == reg_part || GET_CODE (part) == CLOBBER)
continue;
if (reg_mentioned_p (reg, ((GET_CODE (part) == SET
&& GET_CODE (SET_DEST (part)) == REG)
@ -2470,6 +2591,13 @@ gen_far_branch (bp)
}
else
jump = emit_jump_insn_after (gen_return (), insn);
/* Emit a barrier so that reorg knows that any following instructions
are not reachable via a fall-through path.
But don't do this when not optimizing, since we wouldn't supress the
alignment for the barrier then, and could end up with out-of-range
pc-relative loads. */
if (optimize)
emit_barrier_after (jump);
emit_label_after (bp->near_label, insn);
JUMP_LABEL (jump) = bp->far_label;
if (! invert_jump (insn, label))
@ -2556,36 +2684,42 @@ barrier_align (barrier_or_label)
if (! TARGET_SH3 || ! optimize)
return CACHE_LOG;
/* Check if there is an immediately preceding branch to the insn beyond
the barrier. We must weight the cost of discarding useful information
from the current cache line when executing this branch and there is
an alignment, against that of fetching unneeded insn in front of the
branch target when there is no alignment. */
/* PREV is presumed to be the JUMP_INSN for the barrier under
investigation. Skip to the insn before it. */
prev = prev_real_insn (prev);
for (slot = 2, credit = 1 << (CACHE_LOG - 2) + 2;
credit >= 0 && prev && GET_CODE (prev) == INSN;
prev = prev_real_insn (prev))
/* When fixing up pcloads, a constant table might be inserted just before
the basic block that ends with the barrier. Thus, we can't trust the
instruction lengths before that. */
if (mdep_reorg_phase > SH_FIXUP_PCLOAD)
{
if (GET_CODE (PATTERN (prev)) == USE
|| GET_CODE (PATTERN (prev)) == CLOBBER)
continue;
if (GET_CODE (PATTERN (prev)) == SEQUENCE)
prev = XVECEXP (PATTERN (prev), 0, 1);
if (slot &&
get_attr_in_delay_slot (prev) == IN_DELAY_SLOT_YES)
slot = 0;
credit -= get_attr_length (prev);
/* Check if there is an immediately preceding branch to the insn beyond
the barrier. We must weight the cost of discarding useful information
from the current cache line when executing this branch and there is
an alignment, against that of fetching unneeded insn in front of the
branch target when there is no alignment. */
/* PREV is presumed to be the JUMP_INSN for the barrier under
investigation. Skip to the insn before it. */
prev = prev_real_insn (prev);
for (slot = 2, credit = 1 << (CACHE_LOG - 2) + 2;
credit >= 0 && prev && GET_CODE (prev) == INSN;
prev = prev_real_insn (prev))
{
if (GET_CODE (PATTERN (prev)) == USE
|| GET_CODE (PATTERN (prev)) == CLOBBER)
continue;
if (GET_CODE (PATTERN (prev)) == SEQUENCE)
prev = XVECEXP (PATTERN (prev), 0, 1);
if (slot &&
get_attr_in_delay_slot (prev) == IN_DELAY_SLOT_YES)
slot = 0;
credit -= get_attr_length (prev);
}
if (prev
&& GET_CODE (prev) == JUMP_INSN
&& JUMP_LABEL (prev)
&& next_real_insn (JUMP_LABEL (prev)) == next_real_insn (barrier_or_label)
&& (credit - slot >= (GET_CODE (SET_SRC (PATTERN (prev))) == PC ? 2 : 0)))
return 0;
}
if (prev
&& GET_CODE (prev) == JUMP_INSN
&& JUMP_LABEL (prev)
&& next_real_insn (JUMP_LABEL (prev)) == next_real_insn (barrier_or_label)
&& (credit - slot >= (GET_CODE (SET_SRC (PATTERN (prev))) == PC ? 2 : 0)))
return 0;
return CACHE_LOG;
}
@ -2914,7 +3048,8 @@ machine_dependent_reorg (first)
dst = SET_DEST (pat);
mode = GET_MODE (dst);
if (mode == SImode && hi_const (src))
if (mode == SImode && hi_const (src)
&& REGNO (dst) != FPUL_REG)
{
int offset = 0;
@ -2929,7 +3064,7 @@ machine_dependent_reorg (first)
if (GET_CODE (dst) == REG
&& ((REGNO (dst) >= FIRST_FP_REG
&& REGNO (dst) <= LAST_FP_REG)
&& REGNO (dst) <= LAST_XD_REG)
|| REGNO (dst) == FPUL_REG))
{
if (last_float
@ -2943,7 +3078,8 @@ machine_dependent_reorg (first)
last_float_move = scan;
last_float = src;
newsrc = gen_rtx (MEM, mode,
(REGNO (dst) == FPUL_REG
((TARGET_SH4 && ! TARGET_FMOVD
|| REGNO (dst) == FPUL_REG)
? r0_inc_rtx
: r0_rtx));
last_float_addr = &XEXP (newsrc, 0);
@ -2983,6 +3119,16 @@ machine_dependent_reorg (first)
emit_insn_before (gen_use_sfunc_addr (reg), insn);
}
}
#if 0
/* fpscr is not actually a user variable, but we pretend it is for the
sake of the previous optimization passes, since we want it handled like
one. However, we don't have eny debugging information for it, so turn
it into a non-user variable now. */
if (TARGET_SH4)
REG_USERVAR_P (get_fpscr_rtx ()) = 0;
#endif
if (optimize)
sh_flag_remove_dead_before_cse = 1;
mdep_reorg_phase = SH_AFTER_MDEP_REORG;
}
@ -3386,8 +3532,16 @@ push (rn)
int rn;
{
rtx x;
if ((rn >= FIRST_FP_REG && rn <= LAST_FP_REG)
|| rn == FPUL_REG)
if (rn == FPUL_REG)
x = gen_push_fpul ();
else if (TARGET_SH4 && TARGET_FMOVD && ! TARGET_FPU_SINGLE
&& rn >= FIRST_FP_REG && rn <= LAST_XD_REG)
{
if ((rn - FIRST_FP_REG) & 1 && rn <= LAST_FP_REG)
return;
x = gen_push_4 (gen_rtx (REG, DFmode, rn));
}
else if (TARGET_SH3E && rn >= FIRST_FP_REG && rn <= LAST_FP_REG)
x = gen_push_e (gen_rtx (REG, SFmode, rn));
else
x = gen_push (gen_rtx (REG, SImode, rn));
@ -3404,8 +3558,16 @@ pop (rn)
int rn;
{
rtx x;
if ((rn >= FIRST_FP_REG && rn <= LAST_FP_REG)
|| rn == FPUL_REG)
if (rn == FPUL_REG)
x = gen_pop_fpul ();
else if (TARGET_SH4 && TARGET_FMOVD && ! TARGET_FPU_SINGLE
&& rn >= FIRST_FP_REG && rn <= LAST_XD_REG)
{
if ((rn - FIRST_FP_REG) & 1 && rn <= LAST_FP_REG)
return;
x = gen_pop_4 (gen_rtx (REG, DFmode, rn));
}
else if (TARGET_SH3E && rn >= FIRST_FP_REG && rn <= LAST_FP_REG)
x = gen_pop_e (gen_rtx (REG, SFmode, rn));
else
x = gen_pop (gen_rtx (REG, SImode, rn));
@ -3453,6 +3615,16 @@ calc_live_regs (count_ptr, live_regs_mask2)
int count;
*live_regs_mask2 = 0;
/* If we can save a lot of saves by switching to double mode, do that. */
if (TARGET_SH4 && TARGET_FMOVD && TARGET_FPU_SINGLE)
for (count = 0, reg = FIRST_FP_REG; reg <= LAST_FP_REG; reg += 2)
if (regs_ever_live[reg] && regs_ever_live[reg+1]
&& (! call_used_regs[reg] || (pragma_interrupt && ! pragma_trapa))
&& ++count > 2)
{
target_flags &= ~FPU_SINGLE_BIT;
break;
}
for (count = 0, reg = FIRST_PSEUDO_REGISTER - 1; reg >= 0; reg--)
{
if ((pragma_interrupt && ! pragma_trapa)
@ -3463,7 +3635,7 @@ calc_live_regs (count_ptr, live_regs_mask2)
&& regs_ever_live[PR_REG]))
&& reg != STACK_POINTER_REGNUM && reg != ARG_POINTER_REGNUM
&& reg != RETURN_ADDRESS_POINTER_REGNUM
&& reg != T_REG && reg != GBR_REG)
&& reg != T_REG && reg != GBR_REG && reg != FPSCR_REG)
: (/* Only push those regs which are used and need to be saved. */
regs_ever_live[reg] && ! call_used_regs[reg]))
{
@ -3472,6 +3644,24 @@ calc_live_regs (count_ptr, live_regs_mask2)
else
live_regs_mask |= 1 << reg;
count++;
if (TARGET_SH4 && TARGET_FMOVD && reg >= FIRST_FP_REG)
if (reg <= LAST_FP_REG)
{
if (! TARGET_FPU_SINGLE && ! regs_ever_live[reg ^ 1])
{
if (reg >= 32)
*live_regs_mask2 |= 1 << ((reg ^ 1) - 32);
else
live_regs_mask |= 1 << (reg ^ 1);
count++;
}
}
else if (reg <= LAST_XD_REG)
{
/* Must switch to double mode to access these registers. */
target_flags &= ~FPU_SINGLE_BIT;
count++;
}
}
}
@ -3487,6 +3677,7 @@ sh_expand_prologue ()
int live_regs_mask;
int d, i;
int live_regs_mask2;
int save_flags = target_flags;
int double_align = 0;
/* We have pretend args if we had an object sent partially in registers
@ -3524,11 +3715,19 @@ sh_expand_prologue ()
emit_insn (gen_sp_switch_1 ());
live_regs_mask = calc_live_regs (&d, &live_regs_mask2);
/* ??? Maybe we could save some switching if we can move a mode switch
that already happens to be at the function start into the prologue. */
if (target_flags != save_flags)
emit_insn (gen_toggle_sz ());
push_regs (live_regs_mask, live_regs_mask2);
if (target_flags != save_flags)
emit_insn (gen_toggle_sz ());
if (TARGET_ALIGN_DOUBLE && d & 1)
double_align = 4;
target_flags = save_flags;
output_stack_adjust (-get_frame_size () - double_align,
stack_pointer_rtx, 3);
@ -3543,6 +3742,7 @@ sh_expand_epilogue ()
int d, i;
int live_regs_mask2;
int save_flags = target_flags;
int frame_size = get_frame_size ();
live_regs_mask = calc_live_regs (&d, &live_regs_mask2);
@ -3573,7 +3773,8 @@ sh_expand_epilogue ()
/* Pop all the registers. */
live_regs_mask = calc_live_regs (&d, &live_regs_mask2);
if (target_flags != save_flags)
emit_insn (gen_toggle_sz ());
if (live_regs_mask & (1 << PR_REG))
pop (PR_REG);
for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
@ -3584,6 +3785,9 @@ sh_expand_epilogue ()
else if (j >= 32 && (live_regs_mask2 & (1 << (j - 32))))
pop (j);
}
if (target_flags != save_flags)
emit_insn (gen_toggle_sz ());
target_flags = save_flags;
output_stack_adjust (extra_push + current_function_pretend_args_size,
stack_pointer_rtx, 7);
@ -3651,6 +3855,25 @@ sh_builtin_saveregs (arglist)
emit_move_insn (fpregs, XEXP (regbuf, 0));
emit_insn (gen_addsi3 (fpregs, fpregs,
GEN_INT (n_floatregs * UNITS_PER_WORD)));
if (TARGET_SH4)
{
for (regno = NPARM_REGS (DFmode) - 2; regno >= first_floatreg; regno -= 2)
{
emit_insn (gen_addsi3 (fpregs, fpregs,
GEN_INT (-2 * UNITS_PER_WORD)));
emit_move_insn (gen_rtx (MEM, DFmode, fpregs),
gen_rtx (REG, DFmode, BASE_ARG_REG (DFmode) + regno));
}
regno = first_floatreg;
if (regno & 1)
{
emit_insn (gen_addsi3 (fpregs, fpregs, GEN_INT (- UNITS_PER_WORD)));
emit_move_insn (gen_rtx (MEM, SFmode, fpregs),
gen_rtx (REG, SFmode, BASE_ARG_REG (SFmode) + regno
- (TARGET_LITTLE_ENDIAN != 0)));
}
}
else
for (regno = NPARM_REGS (SFmode) - 1; regno >= first_floatreg; regno--)
{
emit_insn (gen_addsi3 (fpregs, fpregs, GEN_INT (- UNITS_PER_WORD)));
@ -3677,6 +3900,8 @@ initial_elimination_offset (from, to)
int live_regs_mask, live_regs_mask2;
live_regs_mask = calc_live_regs (&regs_saved, &live_regs_mask2);
if (TARGET_ALIGN_DOUBLE && regs_saved & 1)
total_auto_space += 4;
target_flags = save_flags;
total_saved_regs_space = (regs_saved) * 4;
@ -3885,12 +4110,48 @@ arith_reg_operand (op, mode)
else
return 1;
return (regno != T_REG && regno != PR_REG && regno != FPUL_REG
return (regno != T_REG && regno != PR_REG
&& (regno != FPUL_REG || TARGET_SH4)
&& regno != MACH_REG && regno != MACL_REG);
}
return 0;
}
int
fp_arith_reg_operand (op, mode)
rtx op;
enum machine_mode mode;
{
if (register_operand (op, mode))
{
int regno;
if (GET_CODE (op) == REG)
regno = REGNO (op);
else if (GET_CODE (op) == SUBREG && GET_CODE (SUBREG_REG (op)) == REG)
regno = REGNO (SUBREG_REG (op));
else
return 1;
return (regno != T_REG && regno != PR_REG && regno > 15
&& regno != MACH_REG && regno != MACL_REG);
}
return 0;
}
int
fp_extended_operand (op, mode)
rtx op;
enum machine_mode mode;
{
if (GET_CODE (op) == FLOAT_EXTEND && GET_MODE (op) == mode)
{
op = XEXP (op, 0);
mode = GET_MODE (op);
}
return fp_arith_reg_operand (op, mode);
}
/* Returns 1 if OP is a valid source operand for an arithmetic insn. */
int
@ -3991,6 +4252,73 @@ braf_label_ref_operand(op, mode)
if (GET_CODE (prev) != PLUS || XEXP (prev, 1) != op)
return 0;
}
int
tertiary_reload_operand (op, mode)
rtx op;
enum machine_mode mode;
{
enum rtx_code code = GET_CODE (op);
return code == MEM || (TARGET_SH4 && code == CONST_DOUBLE);
}
int
fpscr_operand (op)
rtx op;
{
return (GET_CODE (op) == REG && REGNO (op) == FPSCR_REG
&& GET_MODE (op) == PSImode);
}
int
commutative_float_operator (op, mode)
rtx op;
enum machine_mode mode;
{
if (GET_MODE (op) != mode)
return 0;
switch (GET_CODE (op))
{
case PLUS:
case MULT:
return 1;
}
return 0;
}
int
noncommutative_float_operator (op, mode)
rtx op;
enum machine_mode mode;
{
if (GET_MODE (op) != mode)
return 0;
switch (GET_CODE (op))
{
case MINUS:
case DIV:
return 1;
}
return 0;
}
int
binary_float_operator (op, mode)
rtx op;
enum machine_mode mode;
{
if (GET_MODE (op) != mode)
return 0;
switch (GET_CODE (op))
{
case PLUS:
case MINUS:
case MULT:
case DIV:
return 1;
}
return 0;
}
/* Return the destination address of a branch. */
@ -4102,3 +4430,304 @@ reg_unused_after (reg, insn)
}
return 1;
}
extern struct obstack permanent_obstack;
rtx
get_fpscr_rtx ()
{
static rtx fpscr_rtx;
if (! fpscr_rtx)
{
push_obstacks (&permanent_obstack, &permanent_obstack);
fpscr_rtx = gen_rtx (REG, PSImode, 48);
REG_USERVAR_P (fpscr_rtx) = 1;
pop_obstacks ();
mark_user_reg (fpscr_rtx);
}
if (! reload_completed || mdep_reorg_phase != SH_AFTER_MDEP_REORG)
mark_user_reg (fpscr_rtx);
return fpscr_rtx;
}
void
emit_sf_insn (pat)
rtx pat;
{
rtx addr;
/* When generating reload insns, we must not create new registers. FPSCR
should already have the correct value, so do nothing to change it. */
if (! TARGET_FPU_SINGLE && ! reload_in_progress)
{
addr = gen_reg_rtx (SImode);
emit_insn (gen_fpu_switch0 (addr));
}
emit_insn (pat);
if (! TARGET_FPU_SINGLE && ! reload_in_progress)
{
addr = gen_reg_rtx (SImode);
emit_insn (gen_fpu_switch1 (addr));
}
}
void
emit_df_insn (pat)
rtx pat;
{
rtx addr;
if (TARGET_FPU_SINGLE && ! reload_in_progress)
{
addr = gen_reg_rtx (SImode);
emit_insn (gen_fpu_switch0 (addr));
}
emit_insn (pat);
if (TARGET_FPU_SINGLE && ! reload_in_progress)
{
addr = gen_reg_rtx (SImode);
emit_insn (gen_fpu_switch1 (addr));
}
}
void
expand_sf_unop (fun, operands)
rtx (*fun)();
rtx *operands;
{
emit_sf_insn ((*fun) (operands[0], operands[1], get_fpscr_rtx ()));
}
void
expand_sf_binop (fun, operands)
rtx (*fun)();
rtx *operands;
{
emit_sf_insn ((*fun) (operands[0], operands[1], operands[2],
get_fpscr_rtx ()));
}
void
expand_df_unop (fun, operands)
rtx (*fun)();
rtx *operands;
{
emit_df_insn ((*fun) (operands[0], operands[1], get_fpscr_rtx ()));
}
void
expand_df_binop (fun, operands)
rtx (*fun)();
rtx *operands;
{
emit_df_insn ((*fun) (operands[0], operands[1], operands[2],
get_fpscr_rtx ()));
}
void
expand_fp_branch (compare, branch)
rtx (*compare) (), (*branch) ();
{
(GET_MODE (sh_compare_op0) == SFmode ? emit_sf_insn : emit_df_insn)
((*compare) ());
emit_jump_insn ((*branch) ());
}
/* We don't want to make fpscr call-saved, because that would prevent
channging it, and it would also cost an exstra instruction to save it.
We don't want it to be known as a global register either, because
that disables all flow analysis. But it has to be live at the function
return. Thus, we need to insert a USE at the end of the function. */
/* This should best be called at about the time FINALIZE_PIC is called,
but not dependent on flag_pic. Alas, there is no suitable hook there,
so this gets called from HAVE_RETURN. */
int
emit_fpscr_use ()
{
static int fpscr_uses = 0;
if (rtx_equal_function_value_matters)
{
emit_insn (gen_rtx (USE, VOIDmode, get_fpscr_rtx ()));
fpscr_uses++;
}
else
{
if (fpscr_uses > 1)
{
/* Due to he crude way we emit the USEs, we might end up with
some extra ones. Delete all but the last one. */
rtx insn;
for (insn = get_last_insn(); insn; insn = PREV_INSN (insn))
if (GET_CODE (insn) == INSN
&& GET_CODE (PATTERN (insn)) == USE
&& GET_CODE (XEXP (PATTERN (insn), 0)) == REG
&& REGNO (XEXP (PATTERN (insn), 0)) == FPSCR_REG)
{
insn = PREV_INSN (insn);
break;
}
for (; insn; insn = PREV_INSN (insn))
if (GET_CODE (insn) == INSN
&& GET_CODE (PATTERN (insn)) == USE
&& GET_CODE (XEXP (PATTERN (insn), 0)) == REG
&& REGNO (XEXP (PATTERN (insn), 0)) == FPSCR_REG)
{
PUT_CODE (insn, NOTE);
NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
NOTE_SOURCE_FILE (insn) = 0;
}
}
fpscr_uses = 0;
}
}
/* ??? gcc does flow analysis strictly after common subexpression
elimination. As a result, common subespression elimination fails
when there are some intervening statements setting the same register.
If we did nothing about this, this would hurt the precision switching
for SH4 badly. There is some cse after reload, but it is unable to
undo the extra register pressure from the unused instructions, and
it cannot remove auto-increment loads.
A C code example that shows this flow/cse weakness for (at least) SH
and sparc (as of gcc ss-970706) is this:
double
f(double a)
{
double d;
d = 0.1;
a += d;
d = 1.1;
d = 0.1;
a *= d;
return a;
}
So we add another pass before common subexpression elimination, to
remove assignments that are dead due to a following assignment in the
same basic block. */
int sh_flag_remove_dead_before_cse;
static void
mark_use (x, reg_set_block)
rtx x, *reg_set_block;
{
enum rtx_code code;
if (! x)
return;
code = GET_CODE (x);
switch (code)
{
case REG:
{
int regno = REGNO (x);
int nregs = (regno < FIRST_PSEUDO_REGISTER
? HARD_REGNO_NREGS (regno, GET_MODE (x))
: 1);
do
{
reg_set_block[regno + nregs - 1] = 0;
}
while (--nregs);
break;
}
case SET:
{
rtx dest = SET_DEST (x);
if (GET_CODE (dest) == SUBREG)
dest = SUBREG_REG (dest);
if (GET_CODE (dest) != REG)
mark_use (dest, reg_set_block);
mark_use (SET_SRC (x), reg_set_block);
break;
}
case CLOBBER:
break;
default:
{
char *fmt = GET_RTX_FORMAT (code);
int i, j;
for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
{
if (fmt[i] == 'e')
mark_use (XEXP (x, i), reg_set_block);
else if (fmt[i] == 'E')
for (j = XVECLEN (x, i) - 1; j >= 0; j--)
mark_use (XVECEXP (x, i, j), reg_set_block);
}
break;
}
}
}
int
remove_dead_before_cse ()
{
rtx *reg_set_block, last, last_call, insn, set;
int in_libcall = 0;
/* This pass should run just once, after rtl generation. */
if (! sh_flag_remove_dead_before_cse
|| rtx_equal_function_value_matters
|| reload_completed)
return;
sh_flag_remove_dead_before_cse = 0;
reg_set_block = (rtx *)alloca (max_reg_num () * sizeof (rtx));
bzero ((char *)reg_set_block, max_reg_num () * sizeof (rtx));
last_call = last = get_last_insn ();
for (insn = last; insn; insn = PREV_INSN (insn))
{
if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
continue;
if (GET_CODE (insn) == JUMP_INSN)
{
last_call = last = insn;
continue;
}
set = single_set (insn);
/* Don't delete parts of libcalls, since that would confuse cse, loop
and flow. */
if (find_reg_note (insn, REG_RETVAL, NULL_RTX))
in_libcall = 1;
else if (in_libcall)
{
if (find_reg_note (insn, REG_LIBCALL, NULL_RTX))
in_libcall = 0;
}
else if (set && GET_CODE (SET_DEST (set)) == REG)
{
int regno = REGNO (SET_DEST (set));
rtx ref_insn = (regno < FIRST_PSEUDO_REGISTER && call_used_regs[regno]
? last_call
: last);
if (reg_set_block[regno] == ref_insn
&& (regno >= FIRST_PSEUDO_REGISTER
|| HARD_REGNO_NREGS (regno, GET_MODE (SET_DEST (set))) == 1)
&& (GET_CODE (insn) != CALL_INSN || CONST_CALL_P (insn)))
{
PUT_CODE (insn, NOTE);
NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
NOTE_SOURCE_FILE (insn) = 0;
continue;
}
else
reg_set_block[REGNO (SET_DEST (set))] = ref_insn;
}
if (GET_CODE (insn) == CALL_INSN)
{
last_call = insn;
mark_use (CALL_INSN_FUNCTION_USAGE (insn), reg_set_block);
}
mark_use (PATTERN (insn), reg_set_block);
}
return 0;
}

View File

@ -1,5 +1,5 @@
/* Definitions of target machine for GNU compiler for Hitachi Super-H.
Copyright (C) 1993, 1994, 1995, 1996, 1997 Free Software Foundation, Inc.
Copyright (C) 1993-1998 Free Software Foundation, Inc.
Contributed by Steve Chamberlain (sac@cygnus.com).
Improved by Jim Wilson (wilson@cygnus.com).
@ -43,7 +43,10 @@ extern int code_for_indirect_jump_scratch;
%{m2:-D__sh2__} \
%{m3:-D__sh3__} \
%{m3e:-D__SH3E__} \
%{!m1:%{!m2:%{!m3:%{!m3e:-D__sh1__}}}}"
%{m4-single-only:-D__SH4_SINGLE_ONLY__} \
%{m4-single:-D__SH4_SINGLE__} \
%{m4:-D__SH4__} \
%{!m1:%{!m2:%{!m3:%{!m3e:%{!m4:%{!m4-single:%{!m4-single-only:-D__sh1__}}}}}}}"
#define CPP_PREDEFINES "-D__sh__ -Acpu(sh) -Amachine(sh)"
@ -54,19 +57,28 @@ extern int code_for_indirect_jump_scratch;
/* We can not debug without a frame pointer. */
/* #define CAN_DEBUG_WITHOUT_FP */
#define CONDITIONAL_REGISTER_USAGE \
if (! TARGET_SH3E) \
{ \
int regno; \
for (regno = FIRST_FP_REG; regno <= LAST_FP_REG; regno++) \
fixed_regs[regno] = call_used_regs[regno] = 1; \
fixed_regs[FPUL_REG] = call_used_regs[FPUL_REG] = 1; \
} \
/* Hitachi saves and restores mac registers on call. */ \
if (TARGET_HITACHI) \
{ \
call_used_regs[MACH_REG] = 0; \
call_used_regs[MACL_REG] = 0; \
#define CONDITIONAL_REGISTER_USAGE \
if (! TARGET_SH4 || ! TARGET_FMOVD) \
{ \
int regno; \
for (regno = FIRST_XD_REG; regno <= LAST_XD_REG; regno++) \
fixed_regs[regno] = call_used_regs[regno] = 1; \
if (! TARGET_SH4) \
{ \
if (! TARGET_SH3E) \
{ \
int regno; \
for (regno = FIRST_FP_REG; regno <= LAST_FP_REG; regno++) \
fixed_regs[regno] = call_used_regs[regno] = 1; \
fixed_regs[FPUL_REG] = call_used_regs[FPUL_REG] = 1; \
} \
} \
} \
/* Hitachi saves and restores mac registers on call. */ \
if (TARGET_HITACHI) \
{ \
call_used_regs[MACH_REG] = 0; \
call_used_regs[MACL_REG] = 0; \
}
/* ??? Need to write documentation for all SH options and add it to the
@ -81,6 +93,10 @@ extern int target_flags;
#define SH2_BIT (1<<9)
#define SH3_BIT (1<<10)
#define SH3E_BIT (1<<11)
#define HARD_SH4_BIT (1<<5)
#define FPU_SINGLE_BIT (1<<7)
#define SH4_BIT (1<<12)
#define FMOVD_BIT (1<<4)
#define SPACE_BIT (1<<13)
#define BIGTABLE_BIT (1<<14)
#define RELAX_BIT (1<<15)
@ -107,6 +123,27 @@ extern int target_flags;
/* Nonzero if we should generate code using type 3E insns. */
#define TARGET_SH3E (target_flags & SH3E_BIT)
/* Nonzero if the cache line size is 32. */
#define TARGET_CACHE32 (target_flags & HARD_SH4_BIT)
/* Nonzero if we schedule for a superscalar implementation. */
#define TARGET_SUPERSCALAR (target_flags & HARD_SH4_BIT)
/* Nonzero if the target has separate instruction and data caches. */
#define TARGET_HARWARD (target_flags & HARD_SH4_BIT)
/* Nonzero if compiling for SH4 hardware (to be used for insn costs etc.) */
#define TARGET_HARD_SH4 (target_flags & HARD_SH4_BIT)
/* Nonzero if the default precision of th FPU is single */
#define TARGET_FPU_SINGLE (target_flags & FPU_SINGLE_BIT)
/* Nonzero if we should generate code using type 4 insns. */
#define TARGET_SH4 (target_flags & SH4_BIT)
/* Nonzero if we should generate fmovd. */
#define TARGET_FMOVD (target_flags & FMOVD_BIT)
/* Nonzero if we respect NANs. */
#define TARGET_IEEE (target_flags & IEEE_BIT)
@ -137,10 +174,14 @@ extern int target_flags;
{ {"1", SH1_BIT}, \
{"2", SH2_BIT}, \
{"3", SH3_BIT|SH2_BIT}, \
{"3e", SH3E_BIT|SH3_BIT|SH2_BIT}, \
{"3e", SH3E_BIT|SH3_BIT|SH2_BIT|FPU_SINGLE_BIT}, \
{"4-single-only", SH3E_BIT|SH3_BIT|SH2_BIT|SH3E_BIT|HARD_SH4_BIT|FPU_SINGLE_BIT}, \
{"4-single", SH4_BIT|SH3E_BIT|SH3_BIT|SH2_BIT|HARD_SH4_BIT|FPU_SINGLE_BIT},\
{"4", SH4_BIT|SH3E_BIT|SH3_BIT|SH2_BIT|HARD_SH4_BIT}, \
{"b", -LITTLE_ENDIAN_BIT}, \
{"bigtable", BIGTABLE_BIT}, \
{"dalign", DALIGN_BIT}, \
{"fmovd", FMOVD_BIT}, \
{"hitachi", HITACHI_BIT}, \
{"ieee", IEEE_BIT}, \
{"isize", ISIZE_BIT}, \
@ -160,26 +201,58 @@ extern int target_flags;
#define OPTIMIZATION_OPTIONS(LEVEL,SIZE) \
do { \
if (LEVEL) \
flag_omit_frame_pointer = -1; \
if (LEVEL) \
sh_flag_remove_dead_before_cse = 1; \
if (SIZE) \
target_flags |= SPACE_BIT; \
} while (0)
#define ASSEMBLER_DIALECT 0 /* will allow to distinguish b[tf].s and b[tf]/s . */
#define OVERRIDE_OPTIONS \
do { \
sh_cpu = CPU_SH1; \
if (TARGET_SH2) \
sh_cpu = CPU_SH2; \
if (TARGET_SH3) \
sh_cpu = CPU_SH3; \
if (TARGET_SH3E) \
sh_cpu = CPU_SH3E; \
\
/* Never run scheduling before reload, since that can \
break global alloc, and generates slower code anyway due \
to the pressure on R0. */ \
flag_schedule_insns = 0; \
sh_addr_diff_vec_mode = TARGET_BIGTABLE ? SImode : HImode; \
#define ASSEMBLER_DIALECT assembler_dialect
extern int assembler_dialect;
#define OVERRIDE_OPTIONS \
do { \
sh_cpu = CPU_SH1; \
assembler_dialect = 0; \
if (TARGET_SH2) \
sh_cpu = CPU_SH2; \
if (TARGET_SH3) \
sh_cpu = CPU_SH3; \
if (TARGET_SH3E) \
sh_cpu = CPU_SH3E; \
if (TARGET_SH4) \
{ \
assembler_dialect = 1; \
sh_cpu = CPU_SH4; \
} \
if (! TARGET_SH4 || ! TARGET_FMOVD) \
{ \
/* Prevent usage of explicit register names for variables \
for registers not present / not addressable in the \
target architecture. */ \
int regno; \
for (regno = (TARGET_SH3E) ? 17 : 0; \
regno <= 24; regno++) \
fp_reg_names[regno][0] = 0; \
} \
if (flag_omit_frame_pointer < 0) \
/* The debugging information is sufficient, \
but gdb doesn't implement this yet */ \
if (0) \
flag_omit_frame_pointer \
= (PREFERRED_DEBUGGING_TYPE == DWARF_DEBUG \
|| PREFERRED_DEBUGGING_TYPE == DWARF2_DEBUG); \
else \
flag_omit_frame_pointer = 0; \
\
/* Never run scheduling before reload, since that can \
break global alloc, and generates slower code anyway due \
to the pressure on R0. */ \
flag_schedule_insns = 0; \
sh_addr_diff_vec_mode = TARGET_BIGTABLE ? SImode : HImode; \
} while (0)
/* Target machine storage layout. */
@ -233,7 +306,7 @@ do { \
/* The log (base 2) of the cache line size, in bytes. Processors prior to
SH3 have no actual cache, but they fetch code in chunks of 4 bytes. */
#define CACHE_LOG (TARGET_SH3 ? 4 : 2)
#define CACHE_LOG (TARGET_CACHE32 ? 5 : TARGET_SH3 ? 4 : 2)
/* Allocation boundary (in *bits*) for the code of a function.
32 bit alignment is faster, because instructions are always fetched as a
@ -279,7 +352,7 @@ do { \
barrier_align (LABEL_AFTER_BARRIER)
#define LOOP_ALIGN(A_LABEL) \
((! optimize || TARGET_SMALLCODE) ? 0 : 2)
((! optimize || TARGET_HARWARD || TARGET_SMALLCODE) ? 0 : 2)
#define LABEL_ALIGN(A_LABEL) \
( \
@ -341,8 +414,11 @@ do { \
#define RAP_REG 23
#define FIRST_FP_REG 24
#define LAST_FP_REG 39
#define FIRST_XD_REG 40
#define LAST_XD_REG 47
#define FPSCR_REG 48
#define FIRST_PSEUDO_REGISTER 40
#define FIRST_PSEUDO_REGISTER 49
/* 1 for registers that have pervasive standard uses
and are not available for the register allocator.
@ -361,6 +437,9 @@ do { \
0, 0, 0, 0, \
0, 0, 0, 0, \
0, 0, 0, 0, \
0, 0, 0, 0, \
0, 0, 0, 0, \
1, \
}
/* 1 for registers not available across function calls.
@ -381,6 +460,9 @@ do { \
1, 1, 1, 1, \
1, 1, 1, 1, \
0, 0, 0, 0, \
1, 1, 1, 1, \
1, 1, 0, 0, \
1, \
}
/* Return number of consecutive hard regs needed starting at reg REGNO
@ -388,20 +470,39 @@ do { \
This is ordinarily the length in words of a value of mode MODE
but can be less for certain modes in special long registers.
On the SH regs are UNITS_PER_WORD bits wide. */
On the SH all but the XD regs are UNITS_PER_WORD bits wide. */
#define HARD_REGNO_NREGS(REGNO, MODE) \
(((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD))
((REGNO) >= FIRST_XD_REG && (REGNO) <= LAST_XD_REG \
? (GET_MODE_SIZE (MODE) / (2 * UNITS_PER_WORD)) \
: ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)) \
/* Value is 1 if hard register REGNO can hold a value of machine-mode MODE.
We can allow any mode in any general register. The special registers
only allow SImode. Don't allow any mode in the PR. */
/* We cannot hold DCmode values in the XD registers because alter_reg
handles subregs of them incorrectly. We could work around this by
spacing the XD registers like the DR registers, but this would require
additional memory in every compilation to hold larger register vectors.
We could hold SFmode / SCmode values in XD registers, but that
would require a tertiary reload when reloading from / to memory,
and a secondary reload to reload from / to general regs; that
seems to be a loosing proposition. */
#define HARD_REGNO_MODE_OK(REGNO, MODE) \
(SPECIAL_REG (REGNO) ? (MODE) == SImode \
: (REGNO) == FPUL_REG ? (MODE) == SImode || (MODE) == SFmode \
: (REGNO) >= FIRST_FP_REG && (REGNO) <= LAST_FP_REG ? (MODE) == SFmode \
: (REGNO) >= FIRST_FP_REG && (REGNO) <= LAST_FP_REG && (MODE) == SFmode \
? 1 \
: (REGNO) >= FIRST_FP_REG && (REGNO) <= LAST_FP_REG \
? ((MODE) == SFmode \
|| (TARGET_SH3E && (MODE) == SCmode) \
|| (((TARGET_SH4 && (MODE) == DFmode) || (MODE) == DCmode) \
&& (((REGNO) - FIRST_FP_REG) & 1) == 0)) \
: (REGNO) >= FIRST_XD_REG && (REGNO) <= LAST_XD_REG \
? (MODE) == DFmode \
: (REGNO) == PR_REG ? 0 \
: (REGNO) == FPSCR_REG ? (MODE) == PSImode \
: 1)
/* Value is 1 if it is a good idea to tie two pseudo registers
@ -541,6 +642,8 @@ enum reg_class
GENERAL_REGS,
FP0_REGS,
FP_REGS,
DF_REGS,
FPSCR_REGS,
GENERAL_FP_REGS,
ALL_REGS,
LIM_REG_CLASSES
@ -560,6 +663,8 @@ enum reg_class
"GENERAL_REGS", \
"FP0_REGS", \
"FP_REGS", \
"DF_REGS", \
"FPSCR_REGS", \
"GENERAL_FP_REGS", \
"ALL_REGS", \
}
@ -579,8 +684,10 @@ enum reg_class
{ 0x0081FFFF, 0x00000000 }, /* GENERAL_REGS */ \
{ 0x01000000, 0x00000000 }, /* FP0_REGS */ \
{ 0xFF000000, 0x000000FF }, /* FP_REGS */ \
{ 0xFF81FFFF, 0x000000FF }, /* GENERAL_FP_REGS */ \
{ 0xFFFFFFFF, 0x000000FF }, /* ALL_REGS */ \
{ 0xFF000000, 0x0000FFFF }, /* DF_REGS */ \
{ 0x00000000, 0x00010000 }, /* FPSCR_REGS */ \
{ 0xFF81FFFF, 0x0000FFFF }, /* GENERAL_FP_REGS */ \
{ 0xFFFFFFFF, 0x0001FFFF }, /* ALL_REGS */ \
}
/* The same information, inverted:
@ -603,6 +710,7 @@ extern int regno_reg_class[];
spilled or used otherwise, we better have the FP_REGS allocated first. */
#define REG_ALLOC_ORDER \
{ 25,26,27,28,29,30,31,24,32,33,34,35,36,37,38,39, \
40,41,42,43,44,45,46,47,48, \
1,2,3,7,6,5,4,0,8,9,10,11,12,13,14, \
22,15,16,17,18,19,20,21,23 }
@ -657,7 +765,8 @@ extern enum reg_class reg_class_from_letter[];
#define PREFERRED_RELOAD_CLASS(X, CLASS) (CLASS)
#define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS,MODE,X) \
((((((CLASS) == FP_REGS || (CLASS) == FP0_REGS) \
((((((CLASS) == FP_REGS || (CLASS) == FP0_REGS \
|| (CLASS) == DF_REGS) \
&& (GET_CODE (X) == REG && REGNO (X) <= AP_REG)) \
|| (((CLASS) == GENERAL_REGS || (CLASS) == R0_REGS) \
&& GET_CODE (X) == REG \
@ -666,7 +775,7 @@ extern enum reg_class reg_class_from_letter[];
? FPUL_REGS \
: ((CLASS) == FPUL_REGS \
&& (GET_CODE (X) == MEM \
|| GET_CODE (X) == REG && REGNO (X) >= FIRST_PSEUDO_REGISTER))\
|| (GET_CODE (X) == REG && REGNO (X) >= FIRST_PSEUDO_REGISTER)))\
? GENERAL_REGS \
: (((CLASS) == MAC_REGS || (CLASS) == PR_REGS) \
&& GET_CODE (X) == REG && REGNO (X) > 15 \
@ -674,10 +783,19 @@ extern enum reg_class reg_class_from_letter[];
? GENERAL_REGS : NO_REGS)
#define SECONDARY_INPUT_RELOAD_CLASS(CLASS,MODE,X) \
((((CLASS) == FP_REGS || (CLASS) == FP0_REGS) \
((((CLASS) == FP_REGS || (CLASS) == FP0_REGS || (CLASS) == DF_REGS) \
&& immediate_operand ((X), (MODE)) \
&& ! (fp_zero_operand (X) || fp_one_operand (X))) \
? R0_REGS : SECONDARY_OUTPUT_RELOAD_CLASS((CLASS),(MODE),(X)))
&& ! ((fp_zero_operand (X) || fp_one_operand (X)) && (MODE) == SFmode))\
? R0_REGS \
: CLASS == FPUL_REGS && immediate_operand ((X), (MODE)) \
? (GET_CODE (X) == CONST_INT && CONST_OK_FOR_I (INTVAL (X)) \
? GENERAL_REGS \
: R0_REGS) \
: (CLASS == FPSCR_REGS \
&& ((GET_CODE (X) == REG && REGNO (X) >= FIRST_PSEUDO_REGISTER) \
|| GET_CODE (X) == MEM && GET_CODE (XEXP ((X), 0)) == PLUS)) \
? GENERAL_REGS \
: SECONDARY_OUTPUT_RELOAD_CLASS((CLASS),(MODE),(X)))
/* Return the maximum number of consecutive registers
needed to represent mode MODE in a register of class CLASS.
@ -685,6 +803,11 @@ extern enum reg_class reg_class_from_letter[];
On SH this is the size of MODE in words. */
#define CLASS_MAX_NREGS(CLASS, MODE) \
((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
/* If defined, gives a class of registers that cannot be used as the
operand of a SUBREG that changes the size of the object. */
#define CLASS_CANNOT_CHANGE_SIZE DF_REGS
/* Stack layout; function entry, exit and calling. */
@ -694,6 +817,9 @@ extern enum reg_class reg_class_from_letter[];
#define NPARM_REGS(MODE) \
(TARGET_SH3E && (MODE) == SFmode \
? 8 \
: TARGET_SH4 && (GET_MODE_CLASS (MODE) == MODE_FLOAT \
|| GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT) \
? 8 \
: 4)
#define FIRST_PARM_REG 4
@ -752,25 +878,48 @@ extern enum reg_class reg_class_from_letter[];
#define BASE_RETURN_VALUE_REG(MODE) \
((TARGET_SH3E && ((MODE) == SFmode)) \
? FIRST_FP_RET_REG \
: TARGET_SH3E && (MODE) == SCmode \
? FIRST_FP_RET_REG \
: (TARGET_SH4 \
&& ((MODE) == DFmode || (MODE) == SFmode \
|| (MODE) == DCmode || (MODE) == SCmode )) \
? FIRST_FP_RET_REG \
: FIRST_RET_REG)
#define BASE_ARG_REG(MODE) \
((TARGET_SH3E && ((MODE) == SFmode)) \
? FIRST_FP_PARM_REG \
: TARGET_SH4 && (GET_MODE_CLASS (MODE) == MODE_FLOAT \
|| GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)\
? FIRST_FP_PARM_REG \
: FIRST_PARM_REG)
/* Define how to find the value returned by a function.
VALTYPE is the data type of the value (as a tree).
If the precise function being called is known, FUNC is its FUNCTION_DECL;
otherwise, FUNC is 0. */
otherwise, FUNC is 0.
For the SH, this is like LIBCALL_VALUE, except that we must change the
mode like PROMOTE_MODE does.
??? PROMOTE_MODE is ignored for non-scalar types. The set of types
tested here has to be kept in sync with the one in explow.c:promote_mode. */
#define FUNCTION_VALUE(VALTYPE, FUNC) \
LIBCALL_VALUE (TYPE_MODE (VALTYPE))
#define FUNCTION_VALUE(VALTYPE, FUNC) \
gen_rtx (REG, \
((GET_MODE_CLASS (TYPE_MODE (VALTYPE)) == MODE_INT \
&& GET_MODE_SIZE (TYPE_MODE (VALTYPE)) < UNITS_PER_WORD \
&& (TREE_CODE (VALTYPE) == INTEGER_TYPE \
|| TREE_CODE (VALTYPE) == ENUMERAL_TYPE \
|| TREE_CODE (VALTYPE) == BOOLEAN_TYPE \
|| TREE_CODE (VALTYPE) == CHAR_TYPE \
|| TREE_CODE (VALTYPE) == REAL_TYPE \
|| TREE_CODE (VALTYPE) == OFFSET_TYPE)) \
? SImode : TYPE_MODE (VALTYPE)), \
BASE_RETURN_VALUE_REG (TYPE_MODE (VALTYPE)))
/* Define how to find the value returned by a library function
assuming the value has mode MODE. */
#define LIBCALL_VALUE(MODE) \
gen_rtx (REG, (MODE), BASE_RETURN_VALUE_REG (MODE));
gen_rtx (REG, (MODE), BASE_RETURN_VALUE_REG (MODE))
/* 1 if N is a possible register number for a function value. */
#define FUNCTION_VALUE_REGNO_P(REGNO) \
@ -801,7 +950,11 @@ struct sh_args {
#define CUMULATIVE_ARGS struct sh_args
#define GET_SH_ARG_CLASS(MODE) \
((TARGET_SH3E && ((MODE) == SFmode)) ? SH_ARG_FLOAT : SH_ARG_INT)
((TARGET_SH3E && (MODE) == SFmode) \
? SH_ARG_FLOAT \
: TARGET_SH4 && (GET_MODE_CLASS (MODE) == MODE_FLOAT \
|| GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT) \
? SH_ARG_FLOAT : SH_ARG_INT)
#define ROUND_ADVANCE(SIZE) \
(((SIZE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
@ -813,7 +966,9 @@ struct sh_args {
round doubles to even regs when asked to explicitly. */
#define ROUND_REG(CUM, MODE) \
((TARGET_ALIGN_DOUBLE \
(((TARGET_ALIGN_DOUBLE \
|| (TARGET_SH4 && ((MODE) == DFmode || (MODE) == DCmode) \
&& (CUM).arg_count[(int) SH_ARG_FLOAT] < NPARM_REGS (MODE)))\
&& GET_MODE_UNIT_SIZE ((MODE)) > UNITS_PER_WORD) \
? ((CUM).arg_count[(int) GET_SH_ARG_CLASS (MODE)] \
+ ((CUM).arg_count[(int) GET_SH_ARG_CLASS (MODE)] & 1)) \
@ -838,11 +993,12 @@ struct sh_args {
available.) */
#define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED) \
((CUM).arg_count[(int) GET_SH_ARG_CLASS (MODE)] = \
(ROUND_REG ((CUM), (MODE)) \
+ ((MODE) != BLKmode \
? ROUND_ADVANCE (GET_MODE_SIZE (MODE)) \
: ROUND_ADVANCE (int_size_in_bytes (TYPE)))))
if (! TARGET_SH4 || PASS_IN_REG_P ((CUM), (MODE), (TYPE))) \
((CUM).arg_count[(int) GET_SH_ARG_CLASS (MODE)] \
= (ROUND_REG ((CUM), (MODE)) \
+ ((MODE) == BLKmode \
? ROUND_ADVANCE (int_size_in_bytes (TYPE)) \
: ROUND_ADVANCE (GET_MODE_SIZE (MODE)))))
/* Return boolean indicating arg of mode MODE will be passed in a reg.
This macro is only used in this file. */
@ -883,7 +1039,9 @@ extern int current_function_varargs;
((PASS_IN_REG_P ((CUM), (MODE), (TYPE)) \
&& ((NAMED) || TARGET_SH3E || ! current_function_varargs)) \
? gen_rtx (REG, (MODE), \
(BASE_ARG_REG (MODE) + ROUND_REG ((CUM), (MODE)))) \
((BASE_ARG_REG (MODE) + ROUND_REG ((CUM), (MODE))) \
^ ((MODE) == SFmode && TARGET_SH4 \
&& TARGET_LITTLE_ENDIAN != 0))) \
: 0)
/* For an arg passed partly in registers and partly in memory,
@ -894,8 +1052,9 @@ extern int current_function_varargs;
#define FUNCTION_ARG_PARTIAL_NREGS(CUM, MODE, TYPE, NAMED) \
((PASS_IN_REG_P ((CUM), (MODE), (TYPE)) \
&& ! TARGET_SH4 \
&& (ROUND_REG ((CUM), (MODE)) \
+ (MODE != BLKmode \
+ ((MODE) != BLKmode \
? ROUND_ADVANCE (GET_MODE_SIZE (MODE)) \
: ROUND_ADVANCE (int_size_in_bytes (TYPE))) \
- NPARM_REGS (MODE) > 0)) \
@ -955,7 +1114,7 @@ extern int current_function_anonymous_args;
/* Alignment required for a trampoline in bits . */
#define TRAMPOLINE_ALIGNMENT \
((CACHE_LOG < 3 || TARGET_SMALLCODE) ? 32 : 64) \
((CACHE_LOG < 3 || TARGET_SMALLCODE && ! TARGET_HARWARD) ? 32 : 64)
/* Emit RTL insns to initialize the variable parts of a trampoline.
FNADDR is an RTX for the address of the function's pure code.
@ -971,6 +1130,8 @@ extern int current_function_anonymous_args;
(CXT)); \
emit_move_insn (gen_rtx (MEM, SImode, plus_constant ((TRAMP), 12)), \
(FNADDR)); \
if (TARGET_HARWARD) \
emit_insn (gen_ic_invalidate_line (TRAMP)); \
}
/* A C expression whose value is RTL representing the value of the return
@ -1086,7 +1247,10 @@ extern struct rtx_def *sh_builtin_saveregs ();
#define MODE_DISP_OK_4(X,MODE) \
(GET_MODE_SIZE (MODE) == 4 && (unsigned) INTVAL (X) < 64 \
&& ! (INTVAL (X) & 3) && ! (TARGET_SH3E && (MODE) == SFmode))
#define MODE_DISP_OK_8(X,MODE) ((GET_MODE_SIZE(MODE)==8) && ((unsigned)INTVAL(X)<60) && (!(INTVAL(X) &3)))
#define MODE_DISP_OK_8(X,MODE) \
((GET_MODE_SIZE(MODE)==8) && ((unsigned)INTVAL(X)<60) \
&& ! (INTVAL(X) & 3) && ! (TARGET_SH4 && (MODE) == DFmode))
#define BASE_REGISTER_RTX_P(X) \
((GET_CODE (X) == REG && REG_OK_FOR_BASE_P (X)) \
@ -1141,13 +1305,15 @@ extern struct rtx_def *sh_builtin_saveregs ();
else if ((GET_CODE (X) == POST_INC || GET_CODE (X) == PRE_DEC) \
&& BASE_REGISTER_RTX_P (XEXP ((X), 0))) \
goto LABEL; \
else if (GET_CODE (X) == PLUS && MODE != PSImode) \
else if (GET_CODE (X) == PLUS \
&& ((MODE) != PSImode || reload_completed)) \
{ \
rtx xop0 = XEXP ((X), 0); \
rtx xop1 = XEXP ((X), 1); \
if (GET_MODE_SIZE (MODE) <= 8 && BASE_REGISTER_RTX_P (xop0)) \
GO_IF_LEGITIMATE_INDEX ((MODE), xop1, LABEL); \
if (GET_MODE_SIZE (MODE) <= 4) \
if (GET_MODE_SIZE (MODE) <= 4 \
|| TARGET_SH4 && TARGET_FMOVD && MODE == DFmode) \
{ \
if (BASE_REGISTER_RTX_P (xop1) && INDEX_REGISTER_RTX_P (xop0))\
goto LABEL; \
@ -1181,6 +1347,7 @@ extern struct rtx_def *sh_builtin_saveregs ();
|| GET_MODE_SIZE (MODE) == 8) \
&& GET_CODE (XEXP ((X), 1)) == CONST_INT \
&& BASE_REGISTER_RTX_P (XEXP ((X), 0)) \
&& ! (TARGET_SH4 && (MODE) == DFmode) \
&& ! (TARGET_SH3E && (MODE) == SFmode)) \
{ \
rtx index_rtx = XEXP ((X), 1); \
@ -1228,12 +1395,21 @@ extern struct rtx_def *sh_builtin_saveregs ();
&& (GET_MODE_SIZE (MODE) == 4 || GET_MODE_SIZE (MODE) == 8) \
&& GET_CODE (XEXP (X, 1)) == CONST_INT \
&& BASE_REGISTER_RTX_P (XEXP (X, 0)) \
&& ! (TARGET_SH3E && MODE == SFmode)) \
&& ! (TARGET_SH4 && (MODE) == DFmode) \
&& ! ((MODE) == PSImode && (TYPE) == RELOAD_FOR_INPUT_ADDRESS)) \
{ \
rtx index_rtx = XEXP (X, 1); \
HOST_WIDE_INT offset = INTVAL (index_rtx), offset_base; \
rtx sum; \
\
if (TARGET_SH3E && MODE == SFmode) \
{ \
X = copy_rtx (X); \
push_reload (index_rtx, NULL_RTX, &XEXP (X, 1), NULL_PTR, \
INDEX_REG_CLASS, Pmode, VOIDmode, 0, 0, (OPNUM), \
(TYPE)); \
goto WIN; \
} \
/* Instead of offset_base 128..131 use 124..127, so that \
simple add suffices. */ \
if (offset > 127) \
@ -1315,7 +1491,7 @@ extern struct rtx_def *sh_builtin_saveregs ();
/* Since the SH3e has only `float' support, it is desirable to make all
floating point types equivalent to `float'. */
#define DOUBLE_TYPE_SIZE (TARGET_SH3E ? 32 : 64)
#define DOUBLE_TYPE_SIZE ((TARGET_SH3E && ! TARGET_SH4) ? 32 : 64)
/* 'char' is signed by default. */
#define DEFAULT_SIGNED_CHAR 1
@ -1407,6 +1583,11 @@ extern struct rtx_def *sh_builtin_saveregs ();
return 10;
#define RTX_COSTS(X, CODE, OUTER_CODE) \
case PLUS: \
return (COSTS_N_INSNS (1) \
+ rtx_cost (XEXP ((X), 0), PLUS) \
+ (rtx_equal_p (XEXP ((X), 0), XEXP ((X), 1))\
? 0 : rtx_cost (XEXP ((X), 1), PLUS)));\
case AND: \
return COSTS_N_INSNS (andcosts (X)); \
case MULT: \
@ -1414,7 +1595,13 @@ extern struct rtx_def *sh_builtin_saveregs ();
case ASHIFT: \
case ASHIFTRT: \
case LSHIFTRT: \
return COSTS_N_INSNS (shiftcosts (X)) ; \
/* Add one extra unit for the matching constraint. \
Otherwise loop strength reduction would think that\
a shift with different sourc and destination is \
as cheap as adding a constant to a register. */ \
return (COSTS_N_INSNS (shiftcosts (X)) \
+ rtx_cost (XEXP ((X), 0), (CODE)) \
+ 1); \
case DIV: \
case UDIV: \
case MOD: \
@ -1462,11 +1649,29 @@ extern struct rtx_def *sh_builtin_saveregs ();
/* Compute extra cost of moving data between one register class
and another. */
/* Regclass always uses 2 for moves in the same register class;
If SECONDARY*_RELOAD_CLASS says something about the src/dst pair,
it uses this information. Hence, the general register <-> floating point
register information here is not used for SFmode. */
#define REGISTER_MOVE_COST(SRCCLASS, DSTCLASS) \
((DSTCLASS) == PR_REG ? 10 \
: (((DSTCLASS) == FP_REGS && (SRCCLASS) == GENERAL_REGS) \
|| ((DSTCLASS) == GENERAL_REGS && (SRCCLASS) == FP_REGS)) ? 4 \
: 1)
((((DSTCLASS) == T_REGS) || ((DSTCLASS) == PR_REG)) ? 10 \
: ((((DSTCLASS) == FP0_REGS || (DSTCLASS) == FP_REGS || (DSTCLASS) == DF_REGS) \
&& ((SRCCLASS) == GENERAL_REGS || (SRCCLASS) == R0_REGS)) \
|| (((DSTCLASS) == GENERAL_REGS || (DSTCLASS) == R0_REGS) \
&& ((SRCCLASS) == FP0_REGS || (SRCCLASS) == FP_REGS \
|| (SRCCLASS) == DF_REGS))) \
? TARGET_FMOVD ? 8 : 12 \
: (((DSTCLASS) == FPUL_REGS \
&& ((SRCCLASS) == GENERAL_REGS || (SRCCLASS) == R0_REGS)) \
|| (SRCCLASS == FPUL_REGS \
&& ((DSTCLASS) == GENERAL_REGS || (DSTCLASS) == R0_REGS))) \
? 5 \
: (((DSTCLASS) == FPUL_REGS \
&& ((SRCCLASS) == PR_REGS || (SRCCLASS) == MAC_REGS)) \
|| ((SRCCLASS) == FPUL_REGS \
&& ((DSTCLASS) == PR_REGS || (DSTCLASS) == MAC_REGS))) \
? 7 \
: 2)
/* ??? Perhaps make MEMORY_MOVE_COST depend on compiler option? This
would be so that people would slow memory systems could generate
@ -1573,13 +1778,32 @@ dtors_section() \
the Real framepointer; it can also be used as a normal general register.
Note that the name `fp' is horribly misleading since `fp' is in fact only
the argument-and-return-context pointer. */
extern char fp_reg_names[][5];
#define REGISTER_NAMES \
{ \
"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", \
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", \
"ap", "pr", "t", "gbr", "mach","macl", fp_reg_names[16], "rap", \
fp_reg_names[0], fp_reg_names[1] , fp_reg_names[2], fp_reg_names[3], \
fp_reg_names[4], fp_reg_names[5], fp_reg_names[6], fp_reg_names[7], \
fp_reg_names[8], fp_reg_names[9], fp_reg_names[10], fp_reg_names[11], \
fp_reg_names[12], fp_reg_names[13], fp_reg_names[14], fp_reg_names[15], \
fp_reg_names[17], fp_reg_names[18], fp_reg_names[19], fp_reg_names[20], \
fp_reg_names[21], fp_reg_names[22], fp_reg_names[23], fp_reg_names[24], \
"fpscr", \
}
#define DEBUG_REGISTER_NAMES \
{ \
"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", \
"r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", \
"ap", "pr", "t", "gbr", "mach","macl", "fpul","rap", \
"fr0","fr1","fr2", "fr3", "fr4", "fr5", "fr6", "fr7", \
"fr8","fr9","fr10","fr11","fr12","fr13","fr14","fr15",\
"xd0","xd2","xd4", "xd6", "xd8", "xd10","xd12","xd14", \
"fpscr", \
}
/* DBX register number for a given compiler register number. */
@ -1773,7 +1997,8 @@ enum processor_type {
PROCESSOR_SH1,
PROCESSOR_SH2,
PROCESSOR_SH3,
PROCESSOR_SH3E
PROCESSOR_SH3E,
PROCESSOR_SH4
};
#define sh_cpu_attr ((enum attr_cpu)sh_cpu)
@ -1837,6 +2062,11 @@ extern int sh_valid_machine_decl_attribute ();
#define VALID_MACHINE_DECL_ATTRIBUTE(DECL, ATTRIBUTES, IDENTIFIER, ARGS) \
sh_valid_machine_decl_attribute (DECL, ATTRIBUTES, IDENTIFIER, ARGS)
extern int sh_flag_remove_dead_before_cse;
extern int rtx_equal_function_value_matters;
extern struct rtx_def *fpscr_rtx;
extern struct rtx_def *get_fpscr_rtx ();
#define MOVE_RATIO (TARGET_SMALLCODE ? 2 : 16)
@ -1860,10 +2090,16 @@ sh_valid_machine_decl_attribute (DECL, ATTRIBUTES, IDENTIFIER, ARGS)
{"arith_operand", {SUBREG, REG, CONST_INT}}, \
{"arith_reg_operand", {SUBREG, REG}}, \
{"arith_reg_or_0_operand", {SUBREG, REG, CONST_INT}}, \
{"binary_float_operator", {PLUS, MULT}}, \
{"braf_label_ref_operand", {LABEL_REF}}, \
{"commutative_float_operator", {PLUS, MULT}}, \
{"fp_arith_reg_operand", {SUBREG, REG}}, \
{"fp_extended_operand", {SUBREG, REG, FLOAT_EXTEND}}, \
{"fpscr_operand", {REG}}, \
{"general_movsrc_operand", {SUBREG, REG, CONST_INT, MEM}}, \
{"general_movdst_operand", {SUBREG, REG, CONST_INT, MEM}}, \
{"logical_operand", {SUBREG, REG, CONST_INT}}, \
{"noncommutative_float_operator", {MINUS, DIV}}, \
{"register_operand", {SUBREG, REG}},
/* Define this macro if it is advisable to hold scalars in registers
@ -1929,7 +2165,7 @@ do { \
using their arguments pretty quickly. \
Assume a four cycle delay before they are needed. */ \
if (! reg_set_p (reg, dep_insn)) \
cost -= 4; \
cost -= TARGET_SUPERSCALAR ? 40 : 4; \
} \
/* Adjust load_si / pcload_si type insns latency. Use the known \
nominal latency and form of the insn to speed up the check. */ \
@ -1939,9 +2175,14 @@ do { \
it's actually a move insn. */ \
&& general_movsrc_operand (SET_SRC (PATTERN (dep_insn)), SImode))\
cost = 2; \
else if (cost == 30 \
&& GET_CODE (PATTERN (dep_insn)) == SET \
&& GET_MODE (SET_SRC (PATTERN (dep_insn))) == SImode) \
cost = 20; \
} while (0) \
/* For the sake of libgcc2.c, indicate target supports atexit. */
#define HAVE_ATEXIT
#define SH_DYNAMIC_SHIFT_COST (TARGET_SH3 ? (TARGET_SMALLCODE ? 1 : 2) : 20)
#define SH_DYNAMIC_SHIFT_COST \
(TARGET_HARD_SH4 ? 1 : TARGET_SH3 ? (TARGET_SMALLCODE ? 1 : 2) : 20)

File diff suppressed because it is too large Load Diff

View File

@ -1,7 +1,7 @@
CROSS_LIBGCC1 = libgcc1-asm.a
LIB1ASMSRC = sh/lib1funcs.asm
LIB1ASMFUNCS = _ashiftrt _ashiftrt_n _ashiftlt _lshiftrt _movstr \
_mulsi3 _sdivsi3 _udivsi3 _set_fpscr
_movstr_i4 _mulsi3 _sdivsi3 _sdivsi3_i4 _udivsi3 _udivsi3_i4 _set_fpscr
# These are really part of libgcc1, but this will cause them to be
# built correctly, so...
@ -21,7 +21,7 @@ fp-bit.c: $(srcdir)/config/fp-bit.c
echo '#endif' >> fp-bit.c
cat $(srcdir)/config/fp-bit.c >> fp-bit.c
MULTILIB_OPTIONS= ml m2/m3e
MULTILIB_OPTIONS= ml m2/m3e/m4-single-only/m4-single/m4
MULTILIB_DIRNAMES=
MULTILIB_MATCHES = m2=m3

View File

@ -6,10 +6,10 @@
#ifndef __GNUC_VA_LIST
#define __GNUC_VA_LIST
#ifdef __SH3E__
#if defined (__SH3E__) || defined (__SH4_SINGLE__) || defined (__SH4__) || defined (__SH4_SINGLE_ONLY__)
typedef long __va_greg;
typedef double __va_freg;
typedef float __va_freg;
typedef struct {
__va_greg * __va_next_o; /* next available register */
@ -33,24 +33,24 @@ typedef void *__gnuc_va_list;
#ifdef _STDARG_H
#ifdef __SH3E__
#if defined (__SH3E__) || defined (__SH4_SINGLE__) || defined (__SH4__) || defined (__SH4_SINGLE_ONLY__)
#define va_start(AP, LASTARG) \
__extension__ \
({ \
AP.__va_next_fp = (__va_freg *) __builtin_saveregs (); \
AP.__va_next_fp_limit = (AP.__va_next_fp + \
(AP).__va_next_fp = (__va_freg *) __builtin_saveregs (); \
(AP).__va_next_fp_limit = ((AP).__va_next_fp + \
(__builtin_args_info (1) < 8 ? 8 - __builtin_args_info (1) : 0)); \
AP.__va_next_o = (__va_greg *) AP.__va_next_fp_limit; \
AP.__va_next_o_limit = (AP.__va_next_o + \
(AP).__va_next_o = (__va_greg *) (AP).__va_next_fp_limit; \
(AP).__va_next_o_limit = ((AP).__va_next_o + \
(__builtin_args_info (0) < 4 ? 4 - __builtin_args_info (0) : 0)); \
AP.__va_next_stack = (__va_greg *) __builtin_next_arg (LASTARG); \
(AP).__va_next_stack = (__va_greg *) __builtin_next_arg (LASTARG); \
})
#else /* ! SH3E */
#define va_start(AP, LASTARG) \
(AP = ((__gnuc_va_list) __builtin_next_arg (LASTARG)))
((AP) = ((__gnuc_va_list) __builtin_next_arg (LASTARG)))
#endif /* ! SH3E */
@ -59,24 +59,26 @@ __extension__ \
#define va_alist __builtin_va_alist
#define va_dcl int __builtin_va_alist;...
#ifdef __SH3E__
#if defined (__SH3E__) || defined (__SH4_SINGLE__) || defined (__SH4__) || defined (__SH4_SINGLE_ONLY__)
#define va_start(AP) \
__extension__ \
({ \
AP.__va_next_fp = (__va_freg *) __builtin_saveregs (); \
AP.__va_next_fp_limit = (AP.__va_next_fp + \
(AP).__va_next_fp = (__va_freg *) __builtin_saveregs (); \
(AP).__va_next_fp_limit = ((AP).__va_next_fp + \
(__builtin_args_info (1) < 8 ? 8 - __builtin_args_info (1) : 0)); \
AP.__va_next_o = (__va_greg *) AP.__va_next_fp_limit; \
AP.__va_next_o_limit = (AP.__va_next_o + \
(AP).__va_next_o = (__va_greg *) (AP).__va_next_fp_limit; \
(AP).__va_next_o_limit = ((AP).__va_next_o + \
(__builtin_args_info (0) < 4 ? 4 - __builtin_args_info (0) : 0)); \
AP.__va_next_stack = (__va_greg *) __builtin_next_arg (__builtin_va_alist) \
- (__builtin_args_info (0) >= 4 || __builtin_args_info (1) >= 8 ? 1 : 0); \
(AP).__va_next_stack \
= ((__va_greg *) __builtin_next_arg (__builtin_va_alist) \
- (__builtin_args_info (0) >= 4 || __builtin_args_info (1) >= 8 \
? 1 : 0)); \
})
#else /* ! SH3E */
#define va_start(AP) AP=(char *) &__builtin_va_alist
#define va_start(AP) ((AP) = (char *) &__builtin_va_alist)
#endif /* ! SH3E */
@ -136,53 +138,78 @@ enum __va_type_classes {
We want the MEM_IN_STRUCT_P bit set in the emitted RTL, therefore we
use unions even when it would otherwise be unnecessary. */
/* gcc has an extension that allows to use a casted lvalue as an lvalue,
But it doesn't work in C++ with -pedantic - even in the presence of
__extension__ . We work around this problem by using a reference type. */
#ifdef __cplusplus
#define __VA_REF &
#else
#define __VA_REF
#endif
#define __va_arg_sh1(AP, TYPE) __extension__ \
__extension__ \
({(sizeof (TYPE) == 1 \
? ({union {TYPE t; char c;} __t; \
asm("" \
: "=r" (__t.c) \
: "0" ((((union { int i, j; } *) (AP))++)->i)); \
__asm("" \
: "=r" (__t.c) \
: "0" ((((union { int i, j; } *__VA_REF) (AP))++)->i)); \
__t.t;}) \
: sizeof (TYPE) == 2 \
? ({union {TYPE t; short s;} __t; \
asm("" \
: "=r" (__t.s) \
: "0" ((((union { int i, j; } *) (AP))++)->i)); \
__asm("" \
: "=r" (__t.s) \
: "0" ((((union { int i, j; } *__VA_REF) (AP))++)->i)); \
__t.t;}) \
: sizeof (TYPE) >= 4 || __LITTLE_ENDIAN_P \
? (((union { TYPE t; int i;} *) (AP))++)->t \
: ((union {TYPE t;TYPE u;}*) ((char *)++(int *)(AP) - sizeof (TYPE)))->t);})
? (((union { TYPE t; int i;} *__VA_REF) (AP))++)->t \
: ((union {TYPE t;TYPE u;}*) ((char *)++(int *__VA_REF)(AP) - sizeof (TYPE)))->t);})
#ifdef __SH3E__
#if defined (__SH3E__) || defined (__SH4_SINGLE__) || defined (__SH4__) || defined (__SH4_SINGLE_ONLY__)
#define __PASS_AS_FLOAT(TYPE_CLASS,SIZE) \
(TYPE_CLASS == __real_type_class && SIZE == 4)
#define __TARGET_SH4_P 0
#if defined(__SH4__) || defined(__SH4_SINGLE__)
#undef __PASS_AS_FLOAT
#define __PASS_AS_FLOAT(TYPE_CLASS,SIZE) \
(TYPE_CLASS == __real_type_class && SIZE <= 8 \
|| TYPE_CLASS == __complex_type_class && SIZE <= 16)
#undef __TARGET_SH4_P
#define __TARGET_SH4_P 1
#endif
#define va_arg(pvar,TYPE) \
__extension__ \
({int __type = __builtin_classify_type (* (TYPE *) 0); \
void * __result_p; \
if (__PASS_AS_FLOAT (__type, sizeof(TYPE))) \
{ \
if (pvar.__va_next_fp < pvar.__va_next_fp_limit) \
if ((pvar).__va_next_fp < (pvar).__va_next_fp_limit) \
{ \
__result_p = &pvar.__va_next_fp; \
if (((__type == __real_type_class && sizeof (TYPE) > 4)\
|| sizeof (TYPE) > 8) \
&& (((int) (pvar).__va_next_fp ^ (int) (pvar).__va_next_fp_limit)\
& 4)) \
(pvar).__va_next_fp++; \
__result_p = &(pvar).__va_next_fp; \
} \
else \
__result_p = &pvar.__va_next_stack; \
__result_p = &(pvar).__va_next_stack; \
} \
else \
{ \
if (pvar.__va_next_o + ((sizeof (TYPE) + 3) / 4) \
<= pvar.__va_next_o_limit) \
__result_p = &pvar.__va_next_o; \
if ((pvar).__va_next_o + ((sizeof (TYPE) + 3) / 4) \
<= (pvar).__va_next_o_limit) \
__result_p = &(pvar).__va_next_o; \
else \
{ \
if (sizeof (TYPE) > 4) \
pvar.__va_next_o = pvar.__va_next_o_limit; \
if (! __TARGET_SH4_P) \
(pvar).__va_next_o = (pvar).__va_next_o_limit; \
\
__result_p = &pvar.__va_next_stack; \
__result_p = &(pvar).__va_next_stack; \
} \
} \
__va_arg_sh1(*(void **)__result_p, TYPE);})
@ -194,6 +221,6 @@ __extension__ \
#endif /* SH3E */
/* Copy __gnuc_va_list into another variable of this type. */
#define __va_copy(dest, src) (dest) = (src)
#define __va_copy(dest, src) ((dest) = (src))
#endif /* defined (_STDARG_H) || defined (_VARARGS_H) */