Support for the SPARC M8 cpu.

This patch series adds support for the SPARC M8 processor to GCC.
The SPARC M8 processor implements the Oracle SPARC Architecture 2017.

- bmask* instructions are put in their own instruction type.  It makes
  little sense to have them in the same category as array
  instructions.

- Similarly, VIS compare instructions are put in their own instruction
  type.  This is to better accommodate subtypes, which are not quite
  the same as the subtypes of `visl' instructions.

- The introduction of a new `subtype' insn attribute in sparc.md
  avoids the need for adjusting the instruction scheduler DFAs for
  previous cpu models every time a new cpu is introduced.

- The full set of SPARC instructions used in sparc.md, and their
  position in the type/subtype hierarchy, is documented in a comment.
  This eases the modification of the DFA schedulers, and the addition
  of new cpus.

- The M7 DFA scheduler is reworked:

  + To use the new type/subtype hierarchy.
  + The v3pipe insn attribute is no longer needed.
  + More accurate latencies for instructions.
  + The S4 core pipeline is documented in a comment in niagara7.md.

- Support for -mcpu=m8 (we thus suggest abandoning the niagaraN
  naming for M8 and later processors).

- Support for a new VIS level, VIS4B, covering the new VIS
  instructions introduced in OSA2017 and implemented in the M8, along
  with the corresponding built-ins.

- An M8 DFA scheduler:

  + Also based on the new type/subtype hierarchy.
  + The functional units in the S5 core are explicitly documented in a
    comment in m8.md.

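As an illustration of the type/subtype hierarchy described above, the
following C sketch models how a scheduler description can select a
latency from a (type, subtype) pair.  It is a toy model only: the enum
and function names are hypothetical and do not exist in GCC.  The
latencies are the ones used by the M8 description in this patch (3
cycles for regular loads, 1 for prefetches, 5 for alignaddr, 70 for
GSR register accesses).

```c
/* Hypothetical, simplified stand-ins for a few values of the `type'
   and `subtype' insn attributes.  The real attributes are defined in
   sparc.md and have many more values.  */
enum insn_type { TYPE_LOAD, TYPE_GSR };
enum insn_subtype
{
  SUBTYPE_REGULAR,
  SUBTYPE_PREFETCH,
  SUBTYPE_ALIGNADDR,
  SUBTYPE_REG
};

/* Return the latency, in cycles, that the M8 scheduling description
   assigns to the given type/subtype pair, or -1 if the pair is not
   covered by this toy model.  */
int
m8_latency (enum insn_type type, enum insn_subtype subtype)
{
  switch (type)
    {
    case TYPE_LOAD:
      if (subtype == SUBTYPE_REGULAR)
        return 3;   /* m8_load */
      if (subtype == SUBTYPE_PREFETCH)
        return 1;   /* m8_prefetch */
      break;
    case TYPE_GSR:
      if (subtype == SUBTYPE_ALIGNADDR)
        return 5;   /* m8_gsr */
      if (subtype == SUBTYPE_REG)
        return 70;  /* m8_gsr_reg */
      break;
    }
  return -1;
}
```

The point of the sketch is that instructions sharing a type can still
be scheduled differently through their subtype, which is what lets a
new cpu description refine latencies without touching the DFAs of
older cpu models.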

gcc/ChangeLog:

	* config/sparc/m8.md: New file.
	* config/sparc/sparc.md: Include m8.md.

	* config/sparc/sparc.opt: New option -mvis4b.
	* config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_VIS4B.
	(sparc_option_override): Handle VIS4B.
	(enum sparc_builtins): Define
	SPARC_BUILTIN_DICTUNPACK{8,16,32},
	SPARC_BUILTIN_FPCMP{LE,GT,EQ,NE}{8,16,32}SHL,
	SPARC_BUILTIN_FPCMPU{LE,GT}{8,16,32}SHL,
	SPARC_BUILTIN_FPCMPDE{8,16,32}SHL and
	SPARC_BUILTIN_FPCMPUR{8,16,32}SHL.
	(check_constant_argument): New function.
	(sparc_vis_init_builtins): Define builtins
	__builtin_vis_dictunpack{8,16,32},
	__builtin_vis_fpcmp{le,gt,eq,ne}{8,16,32}shl,
	__builtin_vis_fpcmpu{le,gt}{8,16,32}shl,
	__builtin_vis_fpcmpde{8,16,32}shl and
	__builtin_vis_fpcmpur{8,16,32}shl.
	(sparc_expand_builtin): Check that the constant operands to
	__builtin_vis_fpcmp*shl and __builtin_vis_dictunpack* are indeed
	constant and in range.
	* config/sparc/sparc-c.c (sparc_target_macros): Handle
	TARGET_VIS4B.
	* config/sparc/sparc.h (SPARC_IMM2_P): Define.
	(SPARC_IMM5_P): Likewise.
	* config/sparc/sparc.md (cpu_feature): Add new feature "vis4b".
	(enabled): Handle vis4b.
	(UNSPEC_DICTUNPACK): New unspec.
	(UNSPEC_FPCMPSHL): Likewise.
	(UNSPEC_FPUCMPSHL): Likewise.
	(UNSPEC_FPCMPDESHL): Likewise.
	(UNSPEC_FPCMPURSHL): Likewise.
	(cpu_feature): New CPU feature `vis4b'.
	(dictunpack{8,16,32}): New insns.
	(FPCSMODE): New mode iterator.
	(fpcscond): New code iterator.
	(fpcsucond): Likewise.
	(fpcmp{le,gt,eq,ne}{8,16,32}{si,di}shl): New insns.
	(fpcmpu{le,gt}{8,16,32}{si,di}shl): Likewise.
	(fpcmpde{8,16,32}{si,di}shl): Likewise.
	(fpcmpur{8,16,32}{si,di}shl): Likewise.
	* config/sparc/constraints.md: Define constraints `q' for unsigned
	2-bit integer constants and `t' for unsigned 5-bit integer
	constants.
	* config/sparc/predicates.md (imm5_operand_dictunpack8): New
	predicate.
	(imm5_operand_dictunpack16): Likewise.
	(imm5_operand_dictunpack32): Likewise.
	(imm2_operand): Likewise.
	* doc/invoke.texi (SPARC Options): Document -mvis4b.
	* doc/extend.texi (SPARC VIS Built-in Functions): Document the
	dictunpack* and fpcmp*shl builtins.

	* config.gcc: Handle m8 in --with-{cpu,tune} options.
	* config.in: Add HAVE_AS_SPARC6 define.
	* config/sparc/driver-sparc.c (cpu_names): Add entry for the SPARC
	M8.
	* config/sparc/sol2.h (CPP_CPU64_DEFAULT_SPEC): Define for
	TARGET_CPU_m8.
	(ASM_CPU32_DEFAULT_SPEC): Likewise.
	(CPP_CPU_SPEC): Handle m8.
	(ASM_CPU_SPEC): Likewise.
	* config/sparc/sparc-opts.h (enum processor_type): Add
	PROCESSOR_M8.
	* config/sparc/sparc.c (m8_costs): New struct.
	(sparc_option_override): Handle TARGET_CPU_m8.
	(sparc32_initialize_trampoline): Likewise.
	(sparc64_initialize_trampoline): Likewise.
	(sparc_issue_rate): Likewise.
	(sparc_register_move_cost): Likewise.
	* config/sparc/sparc.h (TARGET_CPU_m8): Define.
	(CPP_CPU64_DEFAULT_SPEC): Define for M8.
	(ASM_CPU64_DEFAULT_SPEC): Likewise.
	(CPP_CPU_SPEC): Handle M8.
	(ASM_CPU_SPEC): Likewise.
	(AS_M8_FLAG): Define.
	* config/sparc/sparc.md: Add m8 to the cpu attribute.
	* config/sparc/sparc.opt: New option -mcpu=m8 for sparc targets.
	* configure.ac (HAVE_AS_SPARC6): Check for assembler support for
	M8 instructions.
	* configure: Regenerate.
	* doc/invoke.texi (SPARC Options): Document -mcpu=m8 and
	-mtune=m8.

	* config/sparc/niagara7.md: Rework the DFA scheduler to use insn
	subtypes.
	* config/sparc/sparc.md: Remove the `v3pipe' insn attribute.
	("*movdi_insn_sp32"): Do not set v3pipe.
	("*movsi_insn"): Likewise.
	("*movdi_insn_sp64"): Likewise.
	("*movsf_insn"): Likewise.
	("*movdf_insn_sp32"): Likewise.
	("*movdf_insn_sp64"): Likewise.
	("*zero_extendsidi2_insn_sp64"): Likewise.
	("*sign_extendsidi2_insn"): Likewise.
	("*mov<VM32:mode>_insn"): Likewise.
	("*mov<VM64:mode>_insn_sp64"): Likewise.
	("*mov<VM64:mode>_insn_sp32"): Likewise.
	("<plusminus_insn><VADDSUB:mode>3"): Likewise.
	("<vlop:code><VL:mode>3"): Likewise.
	("*not_<vlop:code><VL:mode>3"): Likewise.
	("*nand<VL:mode>_vis"): Likewise.
	("*<vlnotop:code>_not1<VL:mode>_vis"): Likewise.
	("*<vlnotop:code>_not2<VL:mode>_vis"): Likewise.
	("one_cmpl<VL:mode>2"): Likewise.
	("faligndata<VM64:mode>_vis"): Likewise.
	("alignaddrsi_vis"): Likewise.
	("alignaddrdi_vis"): Likewise.
	("alignaddrlsi_vis"): Likewise.
	("alignaddrldi_vis"): Likewise.
	("fcmp<gcond:code><GCM:gcm_name><P:mode>_vis"): Likewise.
	("bmaskdi_vis"): Likewise.
	("bmasksi_vis"): Likewise.
	("bshuffle<VM64:mode>_vis"): Likewise.
	("cmask8<P:mode>_vis"): Likewise.
	("cmask16<P:mode>_vis"): Likewise.
	("cmask32<P:mode>_vis"): Likewise.
	("pdistn<P:mode>_vis"): Likewise.
	("<vis3_addsub_ss_patname><VASS:mode>3"): Likewise.

	* config/sparc/sparc.md ("subtype"): New insn attribute.
	("*wrgsr_sp64"): Set insn subtype.
	("*rdgsr_sp64"): Likewise.
	("alignaddrsi_vis"): Likewise.
	("alignaddrdi_vis"): Likewise.
	("alignaddrlsi_vis"): Likewise.
	("alignaddrldi_vis"): Likewise.
	("<plusminus_insn><VADDSUB:mode>3"): Likewise.
	("fexpand_vis"): Likewise.
	("fpmerge_vis"): Likewise.
	("faligndata<VM64:mode>_vis"): Likewise.
	("bshuffle<VM64:mode>_vis"): Likewise.
	("cmask8<P:mode>_vis"): Likewise.
	("cmask16<P:mode>_vis"): Likewise.
	("cmask32<P:mode>_vis"): Likewise.
	("fchksm16_vis"): Likewise.
	("v<vis3_shift_patname><GCM:mode>3"): Likewise.
	("fmean16_vis"): Likewise.
	("fp<plusminus_insn>64_vis"): Likewise.
	("<plusminus_insn>v8qi3"): Likewise.
	("<vis3_addsub_ss_patname><VASS:mode>3"): Likewise.
	("<vis4_minmax_patname><VMMAX:mode>3"): Likewise.
	("<vis4_uminmax_patname><VMMAX:mode>3"): Likewise.
	("<vis3_addsub_ss_patname>v8qi3"): Likewise.
	("<vis4_addsub_us_patname><VAUS:mode>3"): Likewise.
	("*movqi_insn"): Likewise.
	("*movhi_insn"): Likewise.
	("*movsi_insn"): Likewise.
	("movsi_pic_gotdata_op"): Likewise.
	("*movdi_insn_sp32"): Likewise.
	("*movdi_insn_sp64"): Likewise.
	("movdi_pic_gotdata_op"): Likewise.
	("*movsf_insn"): Likewise.
	("*movdf_insn_sp32"): Likewise.
	("*movdf_insn_sp64"): Likewise.
	("*zero_extendhisi2_insn"): Likewise.
	("*zero_extendqihi2_insn"): Likewise.
	("*zero_extendqisi2_insn"): Likewise.
	("*zero_extendqidi2_insn"): Likewise.
	("*zero_extendhidi2_insn"): Likewise.
	("*zero_extendsidi2_insn_sp64"): Likewise.
	("ldfsr"): Likewise.
	("prefetch_64"): Likewise.
	("prefetch_32"): Likewise.
	("tie_ld32"): Likewise.
	("tie_ld64"): Likewise.
	("*tldo_ldub_sp32"): Likewise.
	("*tldo_ldub1_sp32"): Likewise.
	("*tldo_ldub2_sp32"): Likewise.
	("*tldo_ldub_sp64"): Likewise.
	("*tldo_ldub1_sp64"): Likewise.
	("*tldo_ldub2_sp64"): Likewise.
	("*tldo_ldub3_sp64"): Likewise.
	("*tldo_lduh_sp32"): Likewise.
	("*tldo_lduh1_sp32"): Likewise.
	("*tldo_lduh_sp64"): Likewise.
	("*tldo_lduh1_sp64"): Likewise.
	("*tldo_lduh2_sp64"): Likewise.
	("*tldo_lduw_sp32"): Likewise.
	("*tldo_lduw_sp64"): Likewise.
	("*tldo_lduw1_sp64"): Likewise.
	("*tldo_ldx_sp64"): Likewise.
	("*mov<VM32:mode>_insn"): Likewise.
	("*mov<VM64:mode>_insn_sp64"): Likewise.
	("*mov<VM64:mode>_insn_sp32"): Likewise.

	* config/sparc/sparc.md ("type"): New insn type viscmp.
	("fcmp<gcond:code><GCM:gcm_name><P:mode>_vis"): Set insn type to
	viscmp.
	("fpcmp<gcond:code>8<P:mode>_vis"): Likewise.
	("fucmp<gcond:code>8<P:mode>_vis"): Likewise.
	("fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis"): Likewise.
	* config/sparc/niagara7.md ("n7_vis_logical_v3pipe"): Handle
	viscmp.
	("n7_vis_logical_11cycle"): Likewise.
	* config/sparc/niagara4.md ("n4_vis_logical"): Likewise.
	* config/sparc/niagara2.md ("niag3_vis"): Likewise.
	* config/sparc/niagara.md ("niag_vis"): Likewise.
	* config/sparc/ultra3.md ("us3_fga"): Likewise.
	* config/sparc/ultra1_2.md ("us1_fga_double"): Likewise.

	* config/sparc/sparc.md: New instruction type `bmask'.
	(bmaskdi_vis): Use the `bmask' type.
	(bmasksi_vis): Likewise.
	* config/sparc/ultra3.md (us3_array): Likewise.
	* config/sparc/niagara7.md (n7_array): Likewise.
	* config/sparc/niagara4.md (n4_array): Likewise.
	* config/sparc/niagara2.md (niag2_vis): Likewise.
	(niag3_vis): Likewise.
	* config/sparc/niagara.md (niag_vis): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/sparc/dictunpack.c: New file.
	* gcc.target/sparc/fpcmpdeshl.c: Likewise.
	* gcc.target/sparc/fpcmpshl.c: Likewise.
	* gcc.target/sparc/fpcmpurshl.c: Likewise.
	* gcc.target/sparc/fpcmpushl.c: Likewise.
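
The `q' and `t' constraints and the SPARC_IMM2_P and SPARC_IMM5_P
macros listed above accept unsigned 2-bit and unsigned 5-bit integer
constants respectively.  A minimal sketch of that range check in plain
C, using hypothetical helper names rather than the actual GCC macros,
could look like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: return true if VALUE can be
   encoded as an unsigned immediate that is BITS bits wide.  */
bool
fits_unsigned_bits (long long value, unsigned bits)
{
  assert (bits > 0 && bits < 63);
  return value >= 0 && value < (1LL << bits);
}

/* Range accepted by an unsigned 2-bit immediate: 0..3.  */
bool
is_imm2 (long long value)
{
  return fits_unsigned_bits (value, 2);
}

/* Range accepted by an unsigned 5-bit immediate: 0..31.  */
bool
is_imm5 (long long value)
{
  return fits_unsigned_bits (value, 5);
}
```

Under these assumptions, 3 is a valid 2-bit immediate while 4 is not,
and 31 is a valid 5-bit immediate while 32 and any negative value are
rejected.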

From-SVN: r250050
Jose E. Marchesi 2017-07-07 17:42:43 +02:00 committed by Jose E. Marchesi
parent e604883c8f
commit 0316d24f7a
30 changed files with 1583 additions and 187 deletions


@ -1,3 +1,229 @@
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/m8.md: New file.
* config/sparc/sparc.md: Include m8.md.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/sparc.opt: New option -mvis4b.
* config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_VIS4B.
(sparc_option_override): Handle VIS4B.
(enum sparc_builtins): Define
SPARC_BUILTIN_DICTUNPACK{8,16,32},
SPARC_BUILTIN_FPCMP{LE,GT,EQ,NE}{8,16,32}SHL,
SPARC_BUILTIN_FPCMPU{LE,GT}{8,16,32}SHL,
SPARC_BUILTIN_FPCMPDE{8,16,32}SHL and
SPARC_BUILTIN_FPCMPUR{8,16,32}SHL.
(check_constant_argument): New function.
(sparc_vis_init_builtins): Define builtins
__builtin_vis_dictunpack{8,16,32},
__builtin_vis_fpcmp{le,gt,eq,ne}{8,16,32}shl,
__builtin_vis_fpcmpu{le,gt}{8,16,32}shl,
__builtin_vis_fpcmpde{8,16,32}shl and
__builtin_vis_fpcmpur{8,16,32}shl.
(sparc_expand_builtin): Check that the constant operands to
__builtin_vis_fpcmp*shl and __builtin_vis_dictunpack* are indeed
constant and in range.
* config/sparc/sparc-c.c (sparc_target_macros): Handle
TARGET_VIS4B.
* config/sparc/sparc.h (SPARC_IMM2_P): Define.
(SPARC_IMM5_P): Likewise.
* config/sparc/sparc.md (cpu_feature): Add new feature "vis4b".
(enabled): Handle vis4b.
(UNSPEC_DICTUNPACK): New unspec.
(UNSPEC_FPCMPSHL): Likewise.
(UNSPEC_FPUCMPSHL): Likewise.
(UNSPEC_FPCMPDESHL): Likewise.
(UNSPEC_FPCMPURSHL): Likewise.
(cpu_feature): New CPU feature `vis4b'.
(dictunpack{8,16,32}): New insns.
(FPCSMODE): New mode iterator.
(fpcscond): New code iterator.
(fpcsucond): Likewise.
(fpcmp{le,gt,eq,ne}{8,16,32}{si,di}shl): New insns.
(fpcmpu{le,gt}{8,16,32}{si,di}shl): Likewise.
(fpcmpde{8,16,32}{si,di}shl): Likewise.
(fpcmpur{8,16,32}{si,di}shl): Likewise.
* config/sparc/constraints.md: Define constraints `q' for unsigned
2-bit integer constants and `t' for unsigned 5-bit integer
constants.
* config/sparc/predicates.md (imm5_operand_dictunpack8): New
predicate.
(imm5_operand_dictunpack16): Likewise.
(imm5_operand_dictunpack32): Likewise.
(imm2_operand): Likewise.
* doc/invoke.texi (SPARC Options): Document -mvis4b.
* doc/extend.texi (SPARC VIS Built-in Functions): Document the
dictunpack* and fpcmp*shl builtins.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config.gcc: Handle m8 in --with-{cpu,tune} options.
* config.in: Add HAVE_AS_SPARC6 define.
* config/sparc/driver-sparc.c (cpu_names): Add entry for the SPARC
M8.
* config/sparc/sol2.h (CPP_CPU64_DEFAULT_SPEC): Define for
TARGET_CPU_m8.
(ASM_CPU32_DEFAULT_SPEC): Likewise.
(CPP_CPU_SPEC): Handle m8.
(ASM_CPU_SPEC): Likewise.
* config/sparc/sparc-opts.h (enum processor_type): Add
PROCESSOR_M8.
* config/sparc/sparc.c (m8_costs): New struct.
(sparc_option_override): Handle TARGET_CPU_m8.
(sparc32_initialize_trampoline): Likewise.
(sparc64_initialize_trampoline): Likewise.
(sparc_issue_rate): Likewise.
(sparc_register_move_cost): Likewise.
* config/sparc/sparc.h (TARGET_CPU_m8): Define.
(CPP_CPU64_DEFAULT_SPEC): Define for M8.
(ASM_CPU64_DEFAULT_SPEC): Likewise.
(CPP_CPU_SPEC): Handle M8.
(ASM_CPU_SPEC): Likewise.
(AS_M8_FLAG): Define.
* config/sparc/sparc.md: Add m8 to the cpu attribute.
* config/sparc/sparc.opt: New option -mcpu=m8 for sparc targets.
* configure.ac (HAVE_AS_SPARC6): Check for assembler support for
M8 instructions.
* configure: Regenerate.
* doc/invoke.texi (SPARC Options): Document -mcpu=m8 and
-mtune=m8.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/niagara7.md: Rework the DFA scheduler to use insn
subtypes.
* config/sparc/sparc.md: Remove the `v3pipe' insn attribute.
("*movdi_insn_sp32"): Do not set v3pipe.
("*movsi_insn"): Likewise.
("*movdi_insn_sp64"): Likewise.
("*movsf_insn"): Likewise.
("*movdf_insn_sp32"): Likewise.
("*movdf_insn_sp64"): Likewise.
("*zero_extendsidi2_insn_sp64"): Likewise.
("*sign_extendsidi2_insn"): Likewise.
("*mov<VM32:mode>_insn"): Likewise.
("*mov<VM64:mode>_insn_sp64"): Likewise.
("*mov<VM64:mode>_insn_sp32"): Likewise.
("<plusminus_insn><VADDSUB:mode>3"): Likewise.
("<vlop:code><VL:mode>3"): Likewise.
("*not_<vlop:code><VL:mode>3"): Likewise.
("*nand<VL:mode>_vis"): Likewise.
("*<vlnotop:code>_not1<VL:mode>_vis"): Likewise.
("*<vlnotop:code>_not2<VL:mode>_vis"): Likewise.
("one_cmpl<VL:mode>2"): Likewise.
("faligndata<VM64:mode>_vis"): Likewise.
("alignaddrsi_vis"): Likewise.
("alignaddrdi_vis"): Likewise.
("alignaddrlsi_vis"): Likewise.
("alignaddrldi_vis"): Likewise.
("fcmp<gcond:code><GCM:gcm_name><P:mode>_vis"): Likewise.
("bmaskdi_vis"): Likewise.
("bmasksi_vis"): Likewise.
("bshuffle<VM64:mode>_vis"): Likewise.
("cmask8<P:mode>_vis"): Likewise.
("cmask16<P:mode>_vis"): Likewise.
("cmask32<P:mode>_vis"): Likewise.
("pdistn<P:mode>_vis"): Likewise.
("<vis3_addsub_ss_patname><VASS:mode>3"): Likewise.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/sparc.md ("subtype"): New insn attribute.
("*wrgsr_sp64"): Set insn subtype.
("*rdgsr_sp64"): Likewise.
("alignaddrsi_vis"): Likewise.
("alignaddrdi_vis"): Likewise.
("alignaddrlsi_vis"): Likewise.
("alignaddrldi_vis"): Likewise.
("<plusminus_insn><VADDSUB:mode>3"): Likewise.
("fexpand_vis"): Likewise.
("fpmerge_vis"): Likewise.
("faligndata<VM64:mode>_vis"): Likewise.
("bshuffle<VM64:mode>_vis"): Likewise.
("cmask8<P:mode>_vis"): Likewise.
("cmask16<P:mode>_vis"): Likewise.
("cmask32<P:mode>_vis"): Likewise.
("fchksm16_vis"): Likewise.
("v<vis3_shift_patname><GCM:mode>3"): Likewise.
("fmean16_vis"): Likewise.
("fp<plusminus_insn>64_vis"): Likewise.
("<plusminus_insn>v8qi3"): Likewise.
("<vis3_addsub_ss_patname><VASS:mode>3"): Likewise.
("<vis4_minmax_patname><VMMAX:mode>3"): Likewise.
("<vis4_uminmax_patname><VMMAX:mode>3"): Likewise.
("<vis3_addsub_ss_patname>v8qi3"): Likewise.
("<vis4_addsub_us_patname><VAUS:mode>3"): Likewise.
("*movqi_insn"): Likewise.
("*movhi_insn"): Likewise.
("*movsi_insn"): Likewise.
("movsi_pic_gotdata_op"): Likewise.
("*movdi_insn_sp32"): Likewise.
("*movdi_insn_sp64"): Likewise.
("movdi_pic_gotdata_op"): Likewise.
("*movsf_insn"): Likewise.
("*movdf_insn_sp32"): Likewise.
("*movdf_insn_sp64"): Likewise.
("*zero_extendhisi2_insn"): Likewise.
("*zero_extendqihi2_insn"): Likewise.
("*zero_extendqisi2_insn"): Likewise.
("*zero_extendqidi2_insn"): Likewise.
("*zero_extendhidi2_insn"): Likewise.
("*zero_extendsidi2_insn_sp64"): Likewise.
("ldfsr"): Likewise.
("prefetch_64"): Likewise.
("prefetch_32"): Likewise.
("tie_ld32"): Likewise.
("tie_ld64"): Likewise.
("*tldo_ldub_sp32"): Likewise.
("*tldo_ldub1_sp32"): Likewise.
("*tldo_ldub2_sp32"): Likewise.
("*tldo_ldub_sp64"): Likewise.
("*tldo_ldub1_sp64"): Likewise.
("*tldo_ldub2_sp64"): Likewise.
("*tldo_ldub3_sp64"): Likewise.
("*tldo_lduh_sp32"): Likewise.
("*tldo_lduh1_sp32"): Likewise.
("*tldo_lduh_sp64"): Likewise.
("*tldo_lduh1_sp64"): Likewise.
("*tldo_lduh2_sp64"): Likewise.
("*tldo_lduw_sp32"): Likewise.
("*tldo_lduw_sp64"): Likewise.
("*tldo_lduw1_sp64"): Likewise.
("*tldo_ldx_sp64"): Likewise.
("*mov<VM32:mode>_insn"): Likewise.
("*mov<VM64:mode>_insn_sp64"): Likewise.
("*mov<VM64:mode>_insn_sp32"): Likewise.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/sparc.md ("type"): New insn type viscmp.
("fcmp<gcond:code><GCM:gcm_name><P:mode>_vis"): Set insn type to
viscmp.
("fpcmp<gcond:code>8<P:mode>_vis"): Likewise.
("fucmp<gcond:code>8<P:mode>_vis"): Likewise.
("fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis"): Likewise.
* config/sparc/niagara7.md ("n7_vis_logical_v3pipe"): Handle
viscmp.
("n7_vis_logical_11cycle"): Likewise.
* config/sparc/niagara4.md ("n4_vis_logical"): Likewise.
* config/sparc/niagara2.md ("niag3_vis"): Likewise.
* config/sparc/niagara.md ("niag_vis"): Likewise.
* config/sparc/ultra3.md ("us3_fga"): Likewise.
* config/sparc/ultra1_2.md ("us1_fga_double"): Likewise.
2017-07-07 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/sparc/sparc.md: New instruction type `bmask'.
(bmaskdi_vis): Use the `bmask' type.
(bmasksi_vis): Likewise.
* config/sparc/ultra3.md (us3_array): Likewise.
* config/sparc/niagara7.md (n7_array): Likewise.
* config/sparc/niagara4.md (n4_array): Likewise.
* config/sparc/niagara2.md (niag2_vis): Likewise.
(niag3_vis): Likewise.
* config/sparc/niagara.md (niag_vis): Likewise.
2017-07-05 Georg-Johann Lay <avr@gjlay.de>
Backport from 2017-07-05 trunk r249995.


@ -4383,7 +4383,7 @@ case "${target}" in
| sparclite | f930 | f934 | sparclite86x \
| sparclet | tsc701 \
| v9 | ultrasparc | ultrasparc3 | niagara | niagara2 \
| niagara3 | niagara4 | niagara7)
| niagara3 | niagara4 | niagara7 | m8)
# OK
;;
*)


@ -660,6 +660,10 @@
#undef HAVE_AS_SPARC5_VIS4
#endif
/* Define if your assembler supports SPARC6 instructions. */
#ifndef USED_FOR_TARGET
#undef HAVE_AS_SPARC6
#endif
/* Define if your assembler and linker support GOTDATA_OP relocs. */
#ifndef USED_FOR_TARGET


@ -19,7 +19,7 @@
;;; Unused letters:
;;; B
;;; a jkl q tuv xyz
;;; a jkl uv xyz
;; Register constraints
@ -58,6 +58,16 @@
;; Integer constant constraints
(define_constraint "q"
"Unsigned 2-bit integer constant"
(and (match_code "const_int")
(match_test "SPARC_IMM2_P (ival)")))
(define_constraint "t"
"Unsigned 5-bit integer constant"
(and (match_code "const_int")
(match_test "SPARC_IMM5_P (ival)")))
(define_constraint "A"
"Signed 5-bit integer constant"
(and (match_code "const_int")


@ -79,6 +79,7 @@ static const struct cpu_names {
#endif
{ "SPARC-M7", "niagara7" },
{ "SPARC-S7", "niagara7" },
{ "SPARC-M8", "m8" },
{ NULL, NULL }
};

gcc/config/sparc/m8.md (new file)

@ -0,0 +1,242 @@
;; Scheduling description for the SPARC M8.
;; Copyright (C) 2017 Free Software Foundation, Inc.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; Things to improve:
;;
;; - Store instructions are implemented by micro-ops, one of which
;; generates the store address and is executed in the store address
;; generation unit in the slot0. We need to model that.
;;
;; - There are two V3 pipes connected to different slots. The current
;; implementation assumes that all the instructions executing in a
;; V3 pipe are issued to the unit in slot3.
;;
;; - Single-issue ALU operations incur an additional cycle of latency to
;; slot 0 and slot 1 instructions. This is not currently reflected
;; in the DFA.
(define_automaton "m8_0")
;; The S5 core has two dual-issue queues, PQLS and PQEX. Each queue
;; is divided into two slots: PQLS corresponds to slots 0 and 1, and
;; PQEX corresponds to slots 2 and 3. The core can issue 4
;; instructions per-cycle, and up to 4 instructions are committed each
;; cycle.
;;
;;
;; m8_slot0 - Load Unit.
;; - Store address gen. Unit.
;;
;;
;; === PQLS ==> m8_slot1 - Store data unit.
;; - Branch unit.
;;
;;
;; === PQEX ==> m8_slot2 - Integer Unit (EXU2).
;; - 3-cycles Crypto Unit (SPU2).
;;
;; m8_slot3 - Integer Unit (EXU3).
;; - 3-cycles Crypto Unit (SPU3).
;; - Floating-point and graphics unit (FPG).
;; - Long-latency Crypto Unit.
;; - Oracle Numbers Unit (ONU).
(define_cpu_unit "m8_slot0,m8_slot1,m8_slot2,m8_slot3" "m8_0")
;; Some instructions stall the pipeline and prevent any other
;; instruction from being issued in the same cycle. We assume the same
;; for multi-instruction insns.
(define_reservation "m8_single_issue" "m8_slot0 + m8_slot1 + m8_slot2 + m8_slot3")
(define_insn_reservation "m8_single" 1
(and (eq_attr "cpu" "m8")
(eq_attr "type" "multi,savew,flushw,trap,bmask"))
"m8_single_issue")
;; Most of the instructions executing in the integer units have a
;; latency of 1.
(define_insn_reservation "m8_integer" 1
(and (eq_attr "cpu" "m8")
(eq_attr "type" "ialu,ialuX,shift,cmove,compare,bmask"))
"(m8_slot2 | m8_slot3)")
;; Flushing the instruction memory takes 27 cycles.
(define_insn_reservation "m8_iflush" 27
(and (eq_attr "cpu" "m8")
(eq_attr "type" "iflush"))
"(m8_slot2 | m8_slot3), nothing*26")
;; The integer multiplication instructions have a latency of 10 cycles
;; and execute in integer units.
;;
;; Likewise for array*, edge* and pdistn instructions.
;;
;; However, the latency is only 9 cycles if the consumer of the
;; operation is also capable of 9 cycles latency. We model this with
;; a bypass.
(define_insn_reservation "m8_imul" 10
(and (eq_attr "cpu" "m8")
(eq_attr "type" "imul,array,edge,edgen,pdistn"))
"(m8_slot2 | m8_slot3), nothing*12")
(define_bypass 9 "m8_imul" "m8_imul")
;; The integer division instructions `sdiv' and `udivx' have a latency
;; of 30 cycles and execute in integer units.
(define_insn_reservation "m8_idiv" 30
(and (eq_attr "cpu" "m8")
(eq_attr "type" "idiv"))
"(m8_slot2 | m8_slot3), nothing*29")
;; Both integer and floating-point load instructions have a latency of
;; only 3 cycles, and execute in slot0.
;;
;; Misaligned load instructions feature a latency of 11 cycles.
;;
;; The prefetch instruction also executes in the load unit, but its
;; latency is only 1 cycle.
(define_insn_reservation "m8_load" 3
(and (eq_attr "cpu" "m8")
(ior (eq_attr "type" "fpload,sload")
(and (eq_attr "type" "load")
(eq_attr "subtype" "regular"))))
"m8_slot0, nothing*2")
;; (define_insn_reservation "m8_load_misalign" 11
;; (and (eq_attr "cpu" "m8")
;; (eq_attr "type" "load_mis,fpload_mis"))
;; "m8_slot0, nothing*10")
(define_insn_reservation "m8_prefetch" 1
(and (eq_attr "cpu" "m8")
(eq_attr "type" "load")
(eq_attr "subtype" "prefetch"))
"m8_slot0")
;; Both integer and floating-point store instructions have a latency
;; of 1 cycle, and execute in the store data unit in slot1.
;;
;; However, misaligned store instructions feature a latency of 3
;; cycles.
(define_insn_reservation "m8_store" 1
(and (eq_attr "cpu" "m8")
(eq_attr "type" "store,fpstore"))
"m8_slot1")
;; (define_insn_reservation "m8_store_misalign" 3
;; (and (eq_attr "cpu" "m8")
;; (eq_attr "type" "store_mis,fpstore_mis"))
;; "m8_slot1, nothing*2")
;; Control-transfer instructions execute in the Branch Unit in the
;; slot1.
(define_insn_reservation "m8_cti" 1
(and (eq_attr "cpu" "m8")
(eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return"))
"m8_slot1")
;; Many instructions executing in the Floating-point and Graphics Unit
;; (FGU) serving slot3 feature a default latency of 9 cycles.
(define_insn_reservation "m8_fp" 9
(and (eq_attr "cpu" "m8")
(ior (eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul,fgm_pack,fgm_mul,pdist")
(and (eq_attr "type" "fga")
(eq_attr "subtype" "fpu"))))
"m8_slot3, nothing*8")
;; Floating-point division and floating-point square-root instructions
;; have high latencies. They execute in the FGU.
(define_insn_reservation "m8_fpdivs" 26
(and (eq_attr "cpu" "m8")
(eq_attr "type" "fpdivs"))
"m8_slot3, nothing*25")
(define_insn_reservation "m8_fpsqrts" 33
(and (eq_attr "cpu" "m8")
(eq_attr "type" "fpsqrts"))
"m8_slot3, nothing*32")
(define_insn_reservation "m8_fpdivd" 30
(and (eq_attr "cpu" "m8")
(eq_attr "type" "fpdivd"))
"m8_slot3, nothing*29")
(define_insn_reservation "m8_fpsqrtd" 41
(and (eq_attr "cpu" "m8")
(eq_attr "type" "fpsqrtd"))
"m8_slot3, nothing*40")
;; SIMD VIS instructions executing in the Floating-point and graphics
;; unit (FPG) in slot3 usually have a latency of 5 cycles.
;;
;; However, the latency for many instructions is only 3 cycles if the
;; consumer can also be executed in 3 cycles. We model this with a
;; bypass. In these cases the instructions are executed in one of the
;; two 3-cycle crypto units (SPU, also known as "v3-pipes") in slots 2
;; and 3.
(define_insn_reservation "m8_vis" 5
(and (eq_attr "cpu" "m8")
(ior (eq_attr "type" "viscmp,lzd")
(and (eq_attr "type" "fga")
(eq_attr "subtype" "maxmin,cmask,other"))
(and (eq_attr "type" "vismv")
(eq_attr "subtype" "single,movstouw"))
(and (eq_attr "type" "visl")
(eq_attr "subtype" "single"))))
"m8_slot3, nothing*4")
(define_bypass 3 "m8_vis" "m8_vis")
(define_insn_reservation "m8_gsr" 5
(and (eq_attr "cpu" "m8")
(eq_attr "type" "gsr")
(eq_attr "subtype" "alignaddr"))
"m8_slot3, nothing*4")
;; A few VIS instructions have a latency of 1.
(define_insn_reservation "m8_vis_1cycle" 1
(and (eq_attr "cpu" "m8")
(ior (and (eq_attr "type" "vismv")
(eq_attr "subtype" "double,movxtod,movdtox"))
(and (eq_attr "type" "visl")
(eq_attr "subtype" "double"))
(and (eq_attr "type" "fga")
(eq_attr "subtype" "addsub64"))))
"m8_slot3")
;; Reading and writing to the gsr register takes more than 70 cycles.
(define_insn_reservation "m8_gsr_reg" 70
(and (eq_attr "cpu" "m8")
(eq_attr "type" "gsr")
(eq_attr "subtype" "reg"))
"m8_slot3, nothing*69")


@ -114,5 +114,5 @@
*/
(define_insn_reservation "niag_vis" 8
(and (eq_attr "cpu" "niagara")
(eq_attr "type" "fga,visl,vismv,fgm_pack,fgm_mul,pdist,edge,edgen,gsr,array"))
(eq_attr "type" "fga,visl,viscmp,vismv,fgm_pack,fgm_mul,pdist,edge,edgen,gsr,array,bmask"))
"niag_pipe*8")


@ -111,10 +111,10 @@
(define_insn_reservation "niag2_vis" 6
(and (eq_attr "cpu" "niagara2")
(eq_attr "type" "fga,vismv,visl,fgm_pack,fgm_mul,pdist,edge,edgen,array,gsr"))
(eq_attr "type" "fga,vismv,visl,viscmp,fgm_pack,fgm_mul,pdist,edge,edgen,array,bmask,gsr"))
"niag2_pipe*6")
(define_insn_reservation "niag3_vis" 9
(and (eq_attr "cpu" "niagara3")
(eq_attr "type" "fga,vismv,visl,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,array,gsr"))
(eq_attr "type" "fga,vismv,visl,viscmp,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,array,bmask,gsr"))
"niag2_pipe*9")


@ -66,7 +66,7 @@
(define_insn_reservation "n4_array" 12
(and (eq_attr "cpu" "niagara4")
(eq_attr "type" "array,edge,edgen"))
(eq_attr "type" "array,bmask,edge,edgen"))
"n4_slot1, nothing*11")
(define_insn_reservation "n4_vis_move_1cycle" 1
@ -90,8 +90,9 @@
(define_insn_reservation "n4_vis_logical" 3
(and (eq_attr "cpu" "niagara4")
(and (eq_attr "type" "visl,pdistn")
(eq_attr "fptype" "double")))
(ior (and (eq_attr "type" "visl,pdistn")
(eq_attr "fptype" "double"))
(eq_attr "type" "viscmp")))
"n4_slot1, nothing*2")
(define_insn_reservation "n4_vis_logical_11cycle" 11


@ -19,64 +19,120 @@
(define_automaton "niagara7_0")
(define_cpu_unit "n7_slot0,n7_slot1,n7_slot2" "niagara7_0")
(define_reservation "n7_single_issue" "n7_slot0 + n7_slot1 + n7_slot2")
;; The S4 core has a dual-issue queue. This queue is divided into two
;; slots. One instruction can be issued each cycle to each slot, and
;; up to 2 instructions are committed each cycle. Each slot serves
;; several execution units, as depicted below:
;;
;;
;; m7_slot0 - Integer unit.
;; - Load/Store unit.
;; === QUEUE ==>
;;
;; m7_slot1 - Integer unit.
;; - Branch unit.
;; - Floating-point and graphics unit.
;; - 3-cycles crypto unit.
(define_cpu_unit "n7_load_store" "niagara7_0")
(define_cpu_unit "n7_slot0,n7_slot1" "niagara7_0")
;; Some instructions stall the pipeline and prevent any other
;; instruction from being issued in the same cycle. We assume the same
;; for multi-instruction insns.
(define_reservation "n7_single_issue" "n7_slot0 + n7_slot1")
(define_insn_reservation "n7_single" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "multi,savew,flushw,trap"))
"n7_single_issue")
(define_insn_reservation "n7_iflush" 27
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "iflush"))
"(n7_slot0 | n7_slot1), nothing*26")
;; Most of the instructions executing in the integer unit have a
;; latency of 1.
(define_insn_reservation "n7_integer" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "ialu,ialuX,shift,cmove,compare"))
"(n7_slot0 | n7_slot1)")
;; Flushing the instruction memory takes 27 cycles.
(define_insn_reservation "n7_iflush" 27
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "iflush"))
"(n7_slot0 | n7_slot1), nothing*26")
;; The integer multiplication instructions have a latency of 12 cycles
;; and execute in the integer unit.
;;
;; Likewise for array*, edge* and pdistn instructions.
(define_insn_reservation "n7_imul" 12
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "imul"))
"n7_slot1, nothing*11")
(eq_attr "type" "imul,array,edge,edgen,pdistn"))
"(n7_slot0 | n7_slot1), nothing*11")
;; The integer division instructions have a latency of 35 cycles and
;; execute in the integer unit.
(define_insn_reservation "n7_idiv" 35
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "idiv"))
"n7_slot1, nothing*34")
"(n7_slot0 | n7_slot1), nothing*34")
;; Both integer and floating-point load instructions have a latency of
;; 5 cycles, and execute in the slot0.
;;
;; The prefetch instruction also executes in the load/store unit, but
;; its latency is only 1 cycle.
(define_insn_reservation "n7_load" 5
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "load,fpload,sload"))
"(n7_slot0 + n7_load_store), nothing*4")
(ior (eq_attr "type" "fpload,sload")
(and (eq_attr "type" "load")
(eq_attr "subtype" "regular"))))
"n7_slot0, nothing*4")
(define_insn_reservation "n7_prefetch" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "load")
(eq_attr "subtype" "prefetch"))
"n7_slot0")
;; Both integer and floating-point store instructions have a latency
;; of 1 cycle, and execute in the load/store unit in slot0.
(define_insn_reservation "n7_store" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "store,fpstore"))
"(n7_slot0 | n7_slot2) + n7_load_store")
"n7_slot0")
;; Control-transfer instructions execute in the Branch Unit in the
;; slot1.
(define_insn_reservation "n7_cti" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return"))
"n7_slot1")
;; Many instructions executing in the Floating-point and Graphics unit
;; in the slot1 feature a latency of 11 cycles.
(define_insn_reservation "n7_fp" 11
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul"))
(ior (eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul,fgm_pack,fgm_mul,pdist")
(and (eq_attr "type" "fga")
(eq_attr "subtype" "fpu,maxmin"))))
"n7_slot1, nothing*10")
(define_insn_reservation "n7_array" 12
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "array,edge,edgen"))
"n7_slot1, nothing*11")
;; Floating-point division and floating-point square-root instructions
;; have high latencies. They execute in the floating-point and
;; graphics unit in the slot1.
(define_insn_reservation "n7_fpdivs" 24
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "fpdivs,fpsqrts"))
(eq_attr "type" "fpdivs,fpsqrts"))
"n7_slot1, nothing*23")
(define_insn_reservation "n7_fpdivd" 37
@ -84,53 +140,66 @@
(eq_attr "type" "fpdivd,fpsqrtd"))
"n7_slot1, nothing*36")
;; SIMD VIS instructions executing in the Floating-point and graphics
;; unit (FPG) in slot1 usually have a latency of either 11 or 12
;; cycles.
;;
;; However, the latency for many instructions is only 3 cycles if the
;; consumer can also be executed in 3 cycles. We model this with a
;; bypass. In these cases the instructions are executed in the
;; 3-cycle crypto unit which also serves slot1.
(define_insn_reservation "n7_vis_11cycles" 11
(and (eq_attr "cpu" "niagara7")
(ior (and (eq_attr "type" "fga")
(eq_attr "subtype" "addsub64,other"))
(and (eq_attr "type" "vismv")
(eq_attr "subtype" "double,single"))
(and (eq_attr "type" "visl")
(eq_attr "subtype" "double,single"))))
"n7_slot1, nothing*10")
(define_insn_reservation "n7_vis_12cycles" 12
(and (eq_attr "cpu" "niagara7")
(ior (eq_attr "type" "bmask,viscmp")
(and (eq_attr "type" "fga")
(eq_attr "subtype" "cmask"))
(and (eq_attr "type" "vismv")
(eq_attr "subtype" "movstouw"))))
"n7_slot1, nothing*11")
(define_bypass 3 "n7_vis_*" "n7_vis_*")
;; Some other VIS instructions have a latency of 12 cycles, and won't
;; be executed in the 3-cycle crypto pipe.
(define_insn_reservation "n7_lzd" 12
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "lzd"))
"(n7_slot0 | n7_slot1), nothing*11")
(ior (eq_attr "type" "lzd,")
(and (eq_attr "type" "gsr")
(eq_attr "subtype" "alignaddr"))))
"n7_slot1, nothing*11")
;; There is an internal unit called the "V3 pipe", that was originally
;; intended to process some of the short cryptographic instructions.
;; However, as soon as in the T4 several of the VIS instructions
;; (notably non-FP instructions) have been moved to the V3 pipe.
;; Consequently, these instructions feature a latency of 3 instead of
;; 11 or 12 cycles, provided their consumers also execute in the V3
;; pipe.
;;
;; This is modelled here with a bypass.
;; A couple of VIS instructions feature very low latencies in the M7.
(define_insn_reservation "n7_vis_fga" 11
(define_insn_reservation "n7_single_vis" 1
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "fga,gsr"))
"n7_slot1, nothing*10")
(define_insn_reservation "n7_vis_fgm" 11
(and (eq_attr "cpu" "niagara7")
(eq_attr "type" "fgm_pack,fgm_mul,pdist"))
"n7_slot1, nothing*10")
(define_insn_reservation "n7_vis_move_v3pipe" 11
(and (eq_attr "cpu" "niagara7")
(and (eq_attr "type" "vismv")
(eq_attr "v3pipe" "true")))
(eq_attr "type" "vismv")
(eq_attr "subtype" "movxtod"))
"n7_slot1")
(define_insn_reservation "n7_vis_move_11cycle" 11
(define_insn_reservation "n7_double_vis" 2
(and (eq_attr "cpu" "niagara7")
(and (eq_attr "type" "vismv")
(eq_attr "v3pipe" "false")))
"n7_slot1, nothing*10")
(eq_attr "type" "vismv")
(eq_attr "subtype" "movdtox"))
"n7_slot1, nothing")
(define_insn_reservation "n7_vis_logical_v3pipe" 11
;; Reading and writing to the gsr register takes a high number of
;; cycles that is not documented in the PRM. Let's use the same value
;; as for the M8.
(define_insn_reservation "n7_gsr_reg" 70
(and (eq_attr "cpu" "niagara7")
(and (eq_attr "type" "visl,pdistn")
(eq_attr "v3pipe" "true")))
"n7_slot1, nothing*2")
(define_insn_reservation "n7_vis_logical_11cycle" 11
(and (eq_attr "cpu" "niagara7")
(and (eq_attr "type" "visl")
(eq_attr "v3pipe" "false")))
"n7_slot1, nothing*10")
(define_bypass 3 "*_v3pipe" "*_v3pipe")
(eq_attr "type" "gsr")
(eq_attr "subtype" "reg"))
"n7_slot1, nothing*70")


@ -328,6 +328,33 @@
(and (match_code "const_int")
(match_test "SPARC_SIMM5_P (INTVAL (op))"))))
;; Return true if OP is a constant in the range 0..7. This is an
;; acceptable second operand for dictunpack instructions setting a
;; V8QI mode in the destination register.
(define_predicate "imm5_operand_dictunpack8"
(and (match_code "const_int")
(match_test "(INTVAL (op) >= 0 && INTVAL (op) < 8)")))
;; Return true if OP is a constant in the range 8..15. This is an
;; acceptable second operand for dictunpack instructions setting a
;; V4HI mode in the destination register.
(define_predicate "imm5_operand_dictunpack16"
(and (match_code "const_int")
(match_test "(INTVAL (op) >= 8 && INTVAL (op) < 16)")))
;; Return true if OP is a constant in the range 16..31. This is an
;; acceptable second operand for dictunpack instructions setting a
;; V2SI mode in the destination register.
(define_predicate "imm5_operand_dictunpack32"
(and (match_code "const_int")
(match_test "(INTVAL (op) >= 16 && INTVAL (op) < 32)")))
;; Return true if OP is a constant that is representable by a 2-bit
;; unsigned field. This is an acceptable third operand for
;; fpcmp*shl instructions.
(define_predicate "imm2_operand"
(and (match_code "const_int")
(match_test "SPARC_IMM2_P (INTVAL (op))")))
;; Predicates for miscellaneous instructions.
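
As a rough illustration of the three dictunpack predicates above: the value of the immediate selects the element width of the result. The C sketch below only mirrors the `match_test` conditions. The names `dictunpack_elem_bits` and `dictunpack_selfcheck` are hypothetical and are not part of the patch.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: map a dictunpack immediate to
   the element width (in bits) implied by the predicate that accepts
   it, or 0 if none of the three predicates accepts it.  */
int
dictunpack_elem_bits (long imm)
{
  if (imm >= 0 && imm < 8)
    return 8;     /* imm5_operand_dictunpack8: V8QI destination */
  if (imm >= 8 && imm < 16)
    return 16;    /* imm5_operand_dictunpack16: V4HI destination */
  if (imm >= 16 && imm < 32)
    return 32;    /* imm5_operand_dictunpack32: V2SI destination */
  return 0;       /* outside the 5-bit unsigned range */
}

/* Boundary values of each range.  */
void
dictunpack_selfcheck (void)
{
  assert (dictunpack_elem_bits (7) == 8);
  assert (dictunpack_elem_bits (8) == 16);
  assert (dictunpack_elem_bits (16) == 32);
  assert (dictunpack_elem_bits (32) == 0);
}
```

The three ranges are disjoint and together cover 0..31, so any valid 5-bit immediate matches exactly one predicate.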


@ -174,13 +174,22 @@ along with GCC; see the file COPYING3. If not see
#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_NIAGARA7_FLAG
#endif
#if TARGET_CPU_DEFAULT == TARGET_CPU_m8
#undef CPP_CPU64_DEFAULT_SPEC
#define CPP_CPU64_DEFAULT_SPEC ""
#undef ASM_CPU32_DEFAULT_SPEC
#define ASM_CPU32_DEFAULT_SPEC AS_SPARC32_FLAG AS_M8_FLAG
#undef ASM_CPU64_DEFAULT_SPEC
#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_M8_FLAG
#endif
#undef CPP_CPU_SPEC
#define CPP_CPU_SPEC "\
%{mcpu=sparclet|mcpu=tsc701:-D__sparclet__} \
%{mcpu=sparclite|mcpu-f930|mcpu=f934:-D__sparclite__} \
%{mcpu=v8:" DEF_ARCH32_SPEC("-D__sparcv8") "} \
%{mcpu=supersparc:-D__supersparc__ " DEF_ARCH32_SPEC("-D__sparcv8") "} \
%{mcpu=v9|mcpu=ultrasparc|mcpu=ultrasparc3|mcpu=niagara|mcpu=niagara2|mcpu=niagara3|mcpu=niagara4|mcpu=niagara7:" DEF_ARCH32_SPEC("-D__sparcv8") "} \
%{mcpu=v9|mcpu=ultrasparc|mcpu=ultrasparc3|mcpu=niagara|mcpu=niagara2|mcpu=niagara3|mcpu=niagara4|mcpu=niagara7|mcpu=m8:" DEF_ARCH32_SPEC("-D__sparcv8") "} \
%{!mcpu*:%(cpp_cpu_default)} \
"
@ -290,7 +299,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
%{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC("-xarch=v9" AS_NIAGARA3_FLAG) "} \
%{mcpu=niagara4:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA4_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA4_FLAG) "} \
%{mcpu=niagara7:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA7_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA7_FLAG) "} \
%{!mcpu=niagara7:%{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC("-xarch=v9") "}}}}}}}}} \
%{mcpu=m8:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_M8_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_M8_FLAG) "} \
%{!mcpu=m8:%{!mcpu=niagara7:%{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC("-xarch=v9") "}}}}}}}}}} \
%{!mcpu*:%(asm_cpu_default)} \
"


@ -40,7 +40,12 @@ sparc_target_macros (void)
cpp_assert (parse_in, "machine=sparc");
}
if (TARGET_VIS4)
if (TARGET_VIS4B)
{
cpp_define (parse_in, "__VIS__=0x410");
cpp_define (parse_in, "__VIS=0x410");
}
else if (TARGET_VIS4)
{
cpp_define (parse_in, "__VIS__=0x400");
cpp_define (parse_in, "__VIS=0x400");
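
User code can test the `__VIS__` value defined in the hunk above to detect VIS 4.0B support at compile time. The following is a minimal sketch. Only the values 0x400 and 0x410 come from the patch; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical classification of a __VIS__ value.  */
enum vis_level { VIS_BELOW_4, VIS_4, VIS_4B };

enum vis_level
classify_vis (unsigned vis)
{
  if (vis >= 0x410)
    return VIS_4B;      /* -mvis4b: __VIS__ is 0x410 */
  if (vis >= 0x400)
    return VIS_4;       /* -mvis4: __VIS__ is 0x400 */
  return VIS_BELOW_4;   /* older VIS level, or no VIS at all */
}

/* VIS level this translation unit was compiled for.  */
enum vis_level
compiled_vis_level (void)
{
#ifdef __VIS__
  return classify_vis (__VIS__);
#else
  return VIS_BELOW_4;
#endif
}

void
vis_selfcheck (void)
{
  assert (classify_vis (0x410) == VIS_4B);
  assert (classify_vis (0x400) == VIS_4);
}
```

Because `TARGET_VIS4B` is tested before `TARGET_VIS4`, a compilation with -mvis4b sees 0x410 only, so a `>=` comparison is the natural way to ask for "VIS 4.0 or later".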


@ -46,6 +46,7 @@ enum processor_type {
PROCESSOR_NIAGARA3,
PROCESSOR_NIAGARA4,
PROCESSOR_NIAGARA7,
PROCESSOR_M8,
PROCESSOR_NATIVE
};


@ -448,6 +448,30 @@ struct processor_costs niagara7_costs = {
0, /* shift penalty */
};
static const
struct processor_costs m8_costs = {
COSTS_N_INSNS (3), /* int load */
COSTS_N_INSNS (3), /* int signed load */
COSTS_N_INSNS (3), /* int zeroed load */
COSTS_N_INSNS (3), /* float load */
COSTS_N_INSNS (9), /* fmov, fneg, fabs */
COSTS_N_INSNS (9), /* fadd, fsub */
COSTS_N_INSNS (9), /* fcmp */
COSTS_N_INSNS (9), /* fmov, fmovr */
COSTS_N_INSNS (9), /* fmul */
COSTS_N_INSNS (26), /* fdivs */
COSTS_N_INSNS (30), /* fdivd */
COSTS_N_INSNS (33), /* fsqrts */
COSTS_N_INSNS (41), /* fsqrtd */
COSTS_N_INSNS (12), /* imul */
COSTS_N_INSNS (10), /* imulX */
0, /* imul bit factor */
COSTS_N_INSNS (57), /* udiv/sdiv */
COSTS_N_INSNS (30), /* udivx/sdivx */
COSTS_N_INSNS (1), /* movcc/movr */
0, /* shift penalty */
};
static const struct processor_costs *sparc_costs = &cypress_costs;
#ifdef HAVE_AS_RELAX_OPTION
@ -1222,6 +1246,8 @@ dump_target_flag_bits (const int flags)
fprintf (stderr, "VIS3 ");
if (flags & MASK_VIS4)
fprintf (stderr, "VIS4 ");
if (flags & MASK_VIS4B)
fprintf (stderr, "VIS4B ");
if (flags & MASK_CBCOND)
fprintf (stderr, "CBCOND ");
if (flags & MASK_DEPRECATED_V8_INSNS)
@ -1286,6 +1312,7 @@ sparc_option_override (void)
{ TARGET_CPU_niagara3, PROCESSOR_NIAGARA3 },
{ TARGET_CPU_niagara4, PROCESSOR_NIAGARA4 },
{ TARGET_CPU_niagara7, PROCESSOR_NIAGARA7 },
{ TARGET_CPU_m8, PROCESSOR_M8 },
{ -1, PROCESSOR_V7 }
};
const struct cpu_default *def;
@ -1337,7 +1364,11 @@ sparc_option_override (void)
MASK_V9|MASK_POPC|MASK_VIS3|MASK_FMAF|MASK_CBCOND },
/* UltraSPARC M7 */
{ "niagara7", MASK_ISA,
MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC }
MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC },
/* UltraSPARC M8 */
{ "m8", MASK_ISA,
MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC
|MASK_VIS4B }
};
const struct cpu_table *cpu;
unsigned int i;
@ -1467,6 +1498,9 @@ sparc_option_override (void)
#ifndef HAVE_AS_SPARC5_VIS4
& ~(MASK_VIS4 | MASK_SUBXC)
#endif
#ifndef HAVE_AS_SPARC6
& ~(MASK_VIS4B)
#endif
#ifndef HAVE_AS_LEON
& ~(MASK_LEON | MASK_LEON3)
#endif
@ -1485,11 +1519,15 @@ sparc_option_override (void)
if (TARGET_VIS4)
target_flags |= MASK_VIS3 | MASK_VIS2 | MASK_VIS;
/* Don't allow -mvis, -mvis2, -mvis3, -mvis4 or -mfmaf if FPU is
disabled. */
/* -mvis4b implies -mvis4, -mvis3, -mvis2 and -mvis */
if (TARGET_VIS4B)
target_flags |= MASK_VIS4 | MASK_VIS3 | MASK_VIS2 | MASK_VIS;
/* Don't allow -mvis, -mvis2, -mvis3, -mvis4, -mvis4b and -mfmaf if
FPU is disabled. */
if (! TARGET_FPU)
target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_VIS4
| MASK_FMAF);
| MASK_VIS4B | MASK_FMAF);
/* -mvis assumes UltraSPARC+, so we are sure v9 instructions
are available; -m64 also implies v9. */
@ -1529,7 +1567,8 @@ sparc_option_override (void)
|| sparc_cpu == PROCESSOR_NIAGARA3
|| sparc_cpu == PROCESSOR_NIAGARA4)
align_functions = 32;
else if (sparc_cpu == PROCESSOR_NIAGARA7)
else if (sparc_cpu == PROCESSOR_NIAGARA7
|| sparc_cpu == PROCESSOR_M8)
align_functions = 64;
}
@ -1597,6 +1636,9 @@ sparc_option_override (void)
case PROCESSOR_NIAGARA7:
sparc_costs = &niagara7_costs;
break;
case PROCESSOR_M8:
sparc_costs = &m8_costs;
break;
case PROCESSOR_NATIVE:
gcc_unreachable ();
};
@ -1659,13 +1701,14 @@ sparc_option_override (void)
|| sparc_cpu == PROCESSOR_NIAGARA4)
? 2
: (sparc_cpu == PROCESSOR_ULTRASPARC3
? 8 : (sparc_cpu == PROCESSOR_NIAGARA7
? 8 : ((sparc_cpu == PROCESSOR_NIAGARA7
|| sparc_cpu == PROCESSOR_M8)
? 32 : 3))),
global_options.x_param_values,
global_options_set.x_param_values);
/* For PARAM_L1_CACHE_LINE_SIZE we use the default 32 bytes (see
params.def), so no maybe_set_param_value is needed.
/* PARAM_L1_CACHE_LINE_SIZE is the size of the L1 cache line, in
bytes.
The Oracle SPARC Architecture (previously the UltraSPARC
Architecture) specification states that when a PREFETCH[A]
@ -1681,6 +1724,11 @@ sparc_option_override (void)
L2 and L3, but only 32B are brought into the L1D$. (Assuming it
is a read_n prefetch, which is the only type which allocates to
the L1.) */
maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE,
(sparc_cpu == PROCESSOR_M8
? 64 : 32),
global_options.x_param_values,
global_options_set.x_param_values);
/* PARAM_L1_CACHE_SIZE is the size of the L1D$ (most SPARC chips use
Harvard level-1 caches) in kilobytes. Both UltraSPARC and
@ -1692,7 +1740,8 @@ sparc_option_override (void)
|| sparc_cpu == PROCESSOR_NIAGARA2
|| sparc_cpu == PROCESSOR_NIAGARA3
|| sparc_cpu == PROCESSOR_NIAGARA4
|| sparc_cpu == PROCESSOR_NIAGARA7)
|| sparc_cpu == PROCESSOR_NIAGARA7
|| sparc_cpu == PROCESSOR_M8)
? 16 : 64),
global_options.x_param_values,
global_options_set.x_param_values);
@ -1701,7 +1750,8 @@ sparc_option_override (void)
/* PARAM_L2_CACHE_SIZE is the size of the L2 in kilobytes. Note
that 512 is the default in params.def. */
maybe_set_param_value (PARAM_L2_CACHE_SIZE,
(sparc_cpu == PROCESSOR_NIAGARA4
((sparc_cpu == PROCESSOR_NIAGARA4
|| sparc_cpu == PROCESSOR_M8)
? 128 : (sparc_cpu == PROCESSOR_NIAGARA7
? 256 : 512)),
global_options.x_param_values,
@ -9478,7 +9528,8 @@ sparc32_initialize_trampoline (rtx m_tramp, rtx fnaddr, rtx cxt)
&& sparc_cpu != PROCESSOR_NIAGARA2
&& sparc_cpu != PROCESSOR_NIAGARA3
&& sparc_cpu != PROCESSOR_NIAGARA4
&& sparc_cpu != PROCESSOR_NIAGARA7)
&& sparc_cpu != PROCESSOR_NIAGARA7
&& sparc_cpu != PROCESSOR_M8)
emit_insn (gen_flushsi (validize_mem (adjust_address (m_tramp, SImode, 8))));
/* Call __enable_execute_stack after writing onto the stack to make sure
@ -9524,7 +9575,8 @@ sparc64_initialize_trampoline (rtx m_tramp, rtx fnaddr, rtx cxt)
&& sparc_cpu != PROCESSOR_NIAGARA2
&& sparc_cpu != PROCESSOR_NIAGARA3
&& sparc_cpu != PROCESSOR_NIAGARA4
&& sparc_cpu != PROCESSOR_NIAGARA7)
&& sparc_cpu != PROCESSOR_NIAGARA7
&& sparc_cpu != PROCESSOR_M8)
emit_insn (gen_flushdi (validize_mem (adjust_address (m_tramp, DImode, 8))));
/* Call __enable_execute_stack after writing onto the stack to make sure
@ -9724,7 +9776,8 @@ sparc_use_sched_lookahead (void)
|| sparc_cpu == PROCESSOR_NIAGARA3)
return 0;
if (sparc_cpu == PROCESSOR_NIAGARA4
|| sparc_cpu == PROCESSOR_NIAGARA7)
|| sparc_cpu == PROCESSOR_NIAGARA7
|| sparc_cpu == PROCESSOR_M8)
return 2;
if (sparc_cpu == PROCESSOR_ULTRASPARC
|| sparc_cpu == PROCESSOR_ULTRASPARC3)
@ -9758,6 +9811,7 @@ sparc_issue_rate (void)
return 2;
case PROCESSOR_ULTRASPARC:
case PROCESSOR_ULTRASPARC3:
case PROCESSOR_M8:
return 4;
}
}
@ -10340,6 +10394,45 @@ enum sparc_builtins
SPARC_BUILTIN_FPSUBS8,
SPARC_BUILTIN_FPSUBUS8,
SPARC_BUILTIN_FPSUBUS16,
/* VIS 4.0B builtins. */
/* Note that all the DICTUNPACK* entries should be kept
contiguous. */
SPARC_BUILTIN_FIRST_DICTUNPACK,
SPARC_BUILTIN_DICTUNPACK8 = SPARC_BUILTIN_FIRST_DICTUNPACK,
SPARC_BUILTIN_DICTUNPACK16,
SPARC_BUILTIN_DICTUNPACK32,
SPARC_BUILTIN_LAST_DICTUNPACK = SPARC_BUILTIN_DICTUNPACK32,
/* Note that all the FPCMP*SHL entries should be kept
contiguous. */
SPARC_BUILTIN_FIRST_FPCMPSHL,
SPARC_BUILTIN_FPCMPLE8SHL = SPARC_BUILTIN_FIRST_FPCMPSHL,
SPARC_BUILTIN_FPCMPGT8SHL,
SPARC_BUILTIN_FPCMPEQ8SHL,
SPARC_BUILTIN_FPCMPNE8SHL,
SPARC_BUILTIN_FPCMPLE16SHL,
SPARC_BUILTIN_FPCMPGT16SHL,
SPARC_BUILTIN_FPCMPEQ16SHL,
SPARC_BUILTIN_FPCMPNE16SHL,
SPARC_BUILTIN_FPCMPLE32SHL,
SPARC_BUILTIN_FPCMPGT32SHL,
SPARC_BUILTIN_FPCMPEQ32SHL,
SPARC_BUILTIN_FPCMPNE32SHL,
SPARC_BUILTIN_FPCMPULE8SHL,
SPARC_BUILTIN_FPCMPUGT8SHL,
SPARC_BUILTIN_FPCMPULE16SHL,
SPARC_BUILTIN_FPCMPUGT16SHL,
SPARC_BUILTIN_FPCMPULE32SHL,
SPARC_BUILTIN_FPCMPUGT32SHL,
SPARC_BUILTIN_FPCMPDE8SHL,
SPARC_BUILTIN_FPCMPDE16SHL,
SPARC_BUILTIN_FPCMPDE32SHL,
SPARC_BUILTIN_FPCMPUR8SHL,
SPARC_BUILTIN_FPCMPUR16SHL,
SPARC_BUILTIN_FPCMPUR32SHL,
SPARC_BUILTIN_LAST_FPCMPSHL = SPARC_BUILTIN_FPCMPUR32SHL,
SPARC_BUILTIN_MAX
};
@ -10347,6 +10440,27 @@ enum sparc_builtins
static GTY (()) tree sparc_builtins[(int) SPARC_BUILTIN_MAX];
static enum insn_code sparc_builtins_icode[(int) SPARC_BUILTIN_MAX];
/* Return true if OPVAL can be used for operand OPNUM of instruction ICODE.
The instruction should require a constant operand of some sort. The
function prints an error if OPVAL is not valid. */
static int
check_constant_argument (enum insn_code icode, int opnum, rtx opval)
{
if (GET_CODE (opval) != CONST_INT)
{
error ("%qs expects a constant argument", insn_data[icode].name);
return false;
}
if (!(*insn_data[icode].operand[opnum].predicate) (opval, VOIDmode))
{
error ("constant argument out of range for %qs", insn_data[icode].name);
return false;
}
return true;
}
/* Add a SPARC builtin function with NAME, ICODE, CODE and TYPE. Return the
function decl or NULL_TREE if the builtin was not added. */
@ -10440,6 +10554,12 @@ sparc_vis_init_builtins (void)
v8qi, v8qi, 0);
tree si_ftype_v8qi_v8qi = build_function_type_list (intSI_type_node,
v8qi, v8qi, 0);
tree v8qi_ftype_df_si = build_function_type_list (v8qi, double_type_node,
intSI_type_node, 0);
tree v4hi_ftype_df_si = build_function_type_list (v4hi, double_type_node,
intSI_type_node, 0);
tree v2si_ftype_df_si = build_function_type_list (v2si, double_type_node,
intDI_type_node, 0);
tree di_ftype_di_di = build_function_type_list (intDI_type_node,
intDI_type_node,
intDI_type_node, 0);
@ -10894,6 +11014,156 @@ sparc_vis_init_builtins (void)
def_builtin_const ("__builtin_vis_fpsubus16", CODE_FOR_ussubv4hi3,
SPARC_BUILTIN_FPSUBUS16, v4hi_ftype_v4hi_v4hi);
}
if (TARGET_VIS4B)
{
def_builtin_const ("__builtin_vis_dictunpack8", CODE_FOR_dictunpack8,
SPARC_BUILTIN_DICTUNPACK8, v8qi_ftype_df_si);
def_builtin_const ("__builtin_vis_dictunpack16", CODE_FOR_dictunpack16,
SPARC_BUILTIN_DICTUNPACK16, v4hi_ftype_df_si);
def_builtin_const ("__builtin_vis_dictunpack32", CODE_FOR_dictunpack32,
SPARC_BUILTIN_DICTUNPACK32, v2si_ftype_df_si);
if (TARGET_ARCH64)
{
tree di_ftype_v8qi_v8qi_si = build_function_type_list (intDI_type_node,
v8qi, v8qi,
intSI_type_node, 0);
tree di_ftype_v4hi_v4hi_si = build_function_type_list (intDI_type_node,
v4hi, v4hi,
intSI_type_node, 0);
tree di_ftype_v2si_v2si_si = build_function_type_list (intDI_type_node,
v2si, v2si,
intSI_type_node, 0);
def_builtin_const ("__builtin_vis_fpcmple8shl", CODE_FOR_fpcmple8dishl,
SPARC_BUILTIN_FPCMPLE8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpgt8shl", CODE_FOR_fpcmpgt8dishl,
SPARC_BUILTIN_FPCMPGT8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpeq8shl", CODE_FOR_fpcmpeq8dishl,
SPARC_BUILTIN_FPCMPEQ8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpne8shl", CODE_FOR_fpcmpne8dishl,
SPARC_BUILTIN_FPCMPNE8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmple16shl", CODE_FOR_fpcmple16dishl,
SPARC_BUILTIN_FPCMPLE16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpgt16shl", CODE_FOR_fpcmpgt16dishl,
SPARC_BUILTIN_FPCMPGT16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpeq16shl", CODE_FOR_fpcmpeq16dishl,
SPARC_BUILTIN_FPCMPEQ16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpne16shl", CODE_FOR_fpcmpne16dishl,
SPARC_BUILTIN_FPCMPNE16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmple32shl", CODE_FOR_fpcmple32dishl,
SPARC_BUILTIN_FPCMPLE32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpgt32shl", CODE_FOR_fpcmpgt32dishl,
SPARC_BUILTIN_FPCMPGT32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpeq32shl", CODE_FOR_fpcmpeq32dishl,
SPARC_BUILTIN_FPCMPEQ32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpne32shl", CODE_FOR_fpcmpne32dishl,
SPARC_BUILTIN_FPCMPNE32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpule8shl", CODE_FOR_fpcmpule8dishl,
SPARC_BUILTIN_FPCMPULE8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpugt8shl", CODE_FOR_fpcmpugt8dishl,
SPARC_BUILTIN_FPCMPUGT8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpule16shl", CODE_FOR_fpcmpule16dishl,
SPARC_BUILTIN_FPCMPULE16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpugt16shl", CODE_FOR_fpcmpugt16dishl,
SPARC_BUILTIN_FPCMPUGT16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpule32shl", CODE_FOR_fpcmpule32dishl,
SPARC_BUILTIN_FPCMPULE32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpugt32shl", CODE_FOR_fpcmpugt32dishl,
SPARC_BUILTIN_FPCMPUGT32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpde8shl", CODE_FOR_fpcmpde8dishl,
SPARC_BUILTIN_FPCMPDE8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpde16shl", CODE_FOR_fpcmpde16dishl,
SPARC_BUILTIN_FPCMPDE16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpde32shl", CODE_FOR_fpcmpde32dishl,
SPARC_BUILTIN_FPCMPDE32SHL, di_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpur8shl", CODE_FOR_fpcmpur8dishl,
SPARC_BUILTIN_FPCMPUR8SHL, di_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpur16shl", CODE_FOR_fpcmpur16dishl,
SPARC_BUILTIN_FPCMPUR16SHL, di_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpur32shl", CODE_FOR_fpcmpur32dishl,
SPARC_BUILTIN_FPCMPUR32SHL, di_ftype_v2si_v2si_si);
}
else
{
tree si_ftype_v8qi_v8qi_si = build_function_type_list (intSI_type_node,
v8qi, v8qi,
intSI_type_node, 0);
tree si_ftype_v4hi_v4hi_si = build_function_type_list (intSI_type_node,
v4hi, v4hi,
intSI_type_node, 0);
tree si_ftype_v2si_v2si_si = build_function_type_list (intSI_type_node,
v2si, v2si,
intSI_type_node, 0);
def_builtin_const ("__builtin_vis_fpcmple8shl", CODE_FOR_fpcmple8sishl,
SPARC_BUILTIN_FPCMPLE8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpgt8shl", CODE_FOR_fpcmpgt8sishl,
SPARC_BUILTIN_FPCMPGT8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpeq8shl", CODE_FOR_fpcmpeq8sishl,
SPARC_BUILTIN_FPCMPEQ8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpne8shl", CODE_FOR_fpcmpne8sishl,
SPARC_BUILTIN_FPCMPNE8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmple16shl", CODE_FOR_fpcmple16sishl,
SPARC_BUILTIN_FPCMPLE16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpgt16shl", CODE_FOR_fpcmpgt16sishl,
SPARC_BUILTIN_FPCMPGT16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpeq16shl", CODE_FOR_fpcmpeq16sishl,
SPARC_BUILTIN_FPCMPEQ16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpne16shl", CODE_FOR_fpcmpne16sishl,
SPARC_BUILTIN_FPCMPNE16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmple32shl", CODE_FOR_fpcmple32sishl,
SPARC_BUILTIN_FPCMPLE32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpgt32shl", CODE_FOR_fpcmpgt32sishl,
SPARC_BUILTIN_FPCMPGT32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpeq32shl", CODE_FOR_fpcmpeq32sishl,
SPARC_BUILTIN_FPCMPEQ32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpne32shl", CODE_FOR_fpcmpne32sishl,
SPARC_BUILTIN_FPCMPNE32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpule8shl", CODE_FOR_fpcmpule8sishl,
SPARC_BUILTIN_FPCMPULE8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpugt8shl", CODE_FOR_fpcmpugt8sishl,
SPARC_BUILTIN_FPCMPUGT8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpule16shl", CODE_FOR_fpcmpule16sishl,
SPARC_BUILTIN_FPCMPULE16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpugt16shl", CODE_FOR_fpcmpugt16sishl,
SPARC_BUILTIN_FPCMPUGT16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpule32shl", CODE_FOR_fpcmpule32sishl,
SPARC_BUILTIN_FPCMPULE32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpugt32shl", CODE_FOR_fpcmpugt32sishl,
SPARC_BUILTIN_FPCMPUGT32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpde8shl", CODE_FOR_fpcmpde8sishl,
SPARC_BUILTIN_FPCMPDE8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpde16shl", CODE_FOR_fpcmpde16sishl,
SPARC_BUILTIN_FPCMPDE16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpde32shl", CODE_FOR_fpcmpde32sishl,
SPARC_BUILTIN_FPCMPDE32SHL, si_ftype_v2si_v2si_si);
def_builtin_const ("__builtin_vis_fpcmpur8shl", CODE_FOR_fpcmpur8sishl,
SPARC_BUILTIN_FPCMPUR8SHL, si_ftype_v8qi_v8qi_si);
def_builtin_const ("__builtin_vis_fpcmpur16shl", CODE_FOR_fpcmpur16sishl,
SPARC_BUILTIN_FPCMPUR16SHL, si_ftype_v4hi_v4hi_si);
def_builtin_const ("__builtin_vis_fpcmpur32shl", CODE_FOR_fpcmpur32sishl,
SPARC_BUILTIN_FPCMPUR32SHL, si_ftype_v2si_v2si_si);
}
}
}
/* Implement TARGET_BUILTIN_DECL hook. */
@ -10948,6 +11218,19 @@ sparc_expand_builtin (tree exp, rtx target,
insn_op = &insn_data[icode].operand[idx];
op[arg_count] = expand_normal (arg);
/* Some of the builtins require constant arguments. We check
for this here. */
if ((code >= SPARC_BUILTIN_FIRST_FPCMPSHL
&& code <= SPARC_BUILTIN_LAST_FPCMPSHL
&& arg_count == 3)
|| (code >= SPARC_BUILTIN_FIRST_DICTUNPACK
&& code <= SPARC_BUILTIN_LAST_DICTUNPACK
&& arg_count == 2))
{
if (!check_constant_argument (icode, idx, op[arg_count]))
return const0_rtx;
}
if (code == SPARC_BUILTIN_LDFSR || code == SPARC_BUILTIN_STFSR)
{
if (!address_operand (op[arg_count], SImode))
@ -11458,7 +11741,8 @@ sparc_register_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
|| sparc_cpu == PROCESSOR_NIAGARA2
|| sparc_cpu == PROCESSOR_NIAGARA3
|| sparc_cpu == PROCESSOR_NIAGARA4
|| sparc_cpu == PROCESSOR_NIAGARA7)
|| sparc_cpu == PROCESSOR_NIAGARA7
|| sparc_cpu == PROCESSOR_M8)
return 12;
return 6;
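
The VIS flag handling added to `sparc_option_override` in the hunks above can be summarised by the sketch below: -mvis4b pulls in every lower VIS level, and all VIS levels plus FMAF are dropped when the FPU is disabled. The bit values and the function names are hypothetical. GCC generates the real `MASK_*` constants from sparc.opt.

```c
#include <assert.h>

/* Hypothetical bit assignments, for illustration only.  */
enum
{
  M_VIS   = 1 << 0,
  M_VIS2  = 1 << 1,
  M_VIS3  = 1 << 2,
  M_VIS4  = 1 << 3,
  M_VIS4B = 1 << 4,
  M_FMAF  = 1 << 5
};

/* Apply the implication rules shown in the patch to FLAGS.  */
unsigned
normalize_vis_flags (unsigned flags, int have_fpu)
{
  /* -mvis4 implies -mvis3, -mvis2 and -mvis.  */
  if (flags & M_VIS4)
    flags |= M_VIS3 | M_VIS2 | M_VIS;

  /* -mvis4b implies -mvis4, -mvis3, -mvis2 and -mvis.  */
  if (flags & M_VIS4B)
    flags |= M_VIS4 | M_VIS3 | M_VIS2 | M_VIS;

  /* None of these are allowed if the FPU is disabled.  */
  if (!have_fpu)
    flags &= ~(unsigned) (M_VIS | M_VIS2 | M_VIS3 | M_VIS4
                          | M_VIS4B | M_FMAF);
  return flags;
}

void
vis_flags_selfcheck (void)
{
  assert (normalize_vis_flags (M_VIS4B, 1)
          == (M_VIS | M_VIS2 | M_VIS3 | M_VIS4 | M_VIS4B));
  assert (normalize_vis_flags (M_VIS4B | M_FMAF, 0) == 0);
}
```

The sketch covers only the rules visible in this diff. The real function also handles the lower VIS levels and assembler capability checks such as `HAVE_AS_SPARC6`.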


@ -143,6 +143,7 @@ extern enum cmodel sparc_cmodel;
#define TARGET_CPU_niagara3 15
#define TARGET_CPU_niagara4 16
#define TARGET_CPU_niagara7 19
#define TARGET_CPU_m8 20
#if TARGET_CPU_DEFAULT == TARGET_CPU_v9 \
|| TARGET_CPU_DEFAULT == TARGET_CPU_ultrasparc \
@ -151,7 +152,8 @@ extern enum cmodel sparc_cmodel;
|| TARGET_CPU_DEFAULT == TARGET_CPU_niagara2 \
|| TARGET_CPU_DEFAULT == TARGET_CPU_niagara3 \
|| TARGET_CPU_DEFAULT == TARGET_CPU_niagara4 \
|| TARGET_CPU_DEFAULT == TARGET_CPU_niagara7
|| TARGET_CPU_DEFAULT == TARGET_CPU_niagara7 \
|| TARGET_CPU_DEFAULT == TARGET_CPU_m8
#define CPP_CPU32_DEFAULT_SPEC ""
#define ASM_CPU32_DEFAULT_SPEC ""
@ -192,6 +194,10 @@ extern enum cmodel sparc_cmodel;
#define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__"
#define ASM_CPU64_DEFAULT_SPEC AS_NIAGARA7_FLAG
#endif
#if TARGET_CPU_DEFAULT == TARGET_CPU_m8
#define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__"
#define ASM_CPU64_DEFAULT_SPEC AS_M8_FLAG
#endif
#else
@ -295,6 +301,7 @@ extern enum cmodel sparc_cmodel;
%{mcpu=niagara3:-D__sparc_v9__} \
%{mcpu=niagara4:-D__sparc_v9__} \
%{mcpu=niagara7:-D__sparc_v9__} \
%{mcpu=m8:-D__sparc_v9__} \
%{!mcpu*:%(cpp_cpu_default)} \
"
#define CPP_ARCH32_SPEC ""
@ -347,6 +354,7 @@ extern enum cmodel sparc_cmodel;
%{mcpu=niagara3:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \
%{mcpu=niagara4:%{!mv8plus:" AS_NIAGARA4_FLAG "}} \
%{mcpu=niagara7:%{!mv8plus:" AS_NIAGARA7_FLAG "}} \
%{mcpu=m8:%{!mv8plus:" AS_M8_FLAG "}} \
%{!mcpu*:%(asm_cpu_default)} \
"
@ -1039,6 +1047,10 @@ extern char leaf_reg_remap[];
/* Local macro to handle the two v9 classes of FP regs. */
#define FP_REG_CLASS_P(CLASS) ((CLASS) == FP_REGS || (CLASS) == EXTRA_FP_REGS)
/* Predicate for 2-bit and 5-bit unsigned constants. */
#define SPARC_IMM2_P(X) (((unsigned HOST_WIDE_INT) (X) & ~0x3) == 0)
#define SPARC_IMM5_P(X) (((unsigned HOST_WIDE_INT) (X) & ~0x1F) == 0)
/* Predicates for 5-bit, 10-bit, 11-bit and 13-bit signed constants. */
#define SPARC_SIMM5_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x10 < 0x20)
#define SPARC_SIMM10_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x200 < 0x400)
@ -1799,6 +1811,12 @@ extern int sparc_indent_opcode;
#define AS_NIAGARA7_FLAG AS_NIAGARA4_FLAG
#endif
#ifdef HAVE_AS_SPARC6
#define AS_M8_FLAG "-xarch=sparc6"
#else
#define AS_M8_FLAG AS_NIAGARA7_FLAG
#endif
#ifdef HAVE_AS_LEON
#define AS_LEON_FLAG "-Aleon"
#define AS_LEONV7_FLAG "-Aleon"
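
To make the behaviour of the `SPARC_IMM2_P` and `SPARC_IMM5_P` macros added to this file easier to check, here is a standalone sketch next to the existing signed variant `SPARC_SIMM5_P`. It assumes `unsigned long long` as a stand-in for `unsigned HOST_WIDE_INT`, and the macro names are shortened to mark them as local copies.

```c
#include <assert.h>

/* Local copies of the predicates; unsigned long long stands in for
   unsigned HOST_WIDE_INT (an assumption of this sketch).  */
#define IMM2_P(X)  (((unsigned long long) (X) & ~0x3ULL) == 0)
#define IMM5_P(X)  (((unsigned long long) (X) & ~0x1FULL) == 0)
#define SIMM5_P(X) ((unsigned long long) (X) + 0x10 < 0x20)

/* Unsigned fields accept 0..3 and 0..31 respectively.  */
static_assert (IMM2_P (0) && IMM2_P (3) && !IMM2_P (4), "imm2 range");
static_assert (IMM5_P (31) && !IMM5_P (32), "imm5 range");

/* Negative values have high bits set after the cast, so the unsigned
   predicates reject them.  */
static_assert (!IMM2_P (-1) && !IMM5_P (-1), "no negatives");

/* The signed 5-bit field accepts -16..15.  */
static_assert (SIMM5_P (-16) && SIMM5_P (15), "simm5 inside");
static_assert (!SIMM5_P (-17) && !SIMM5_P (16), "simm5 outside");
```

The unsigned predicates mask off the low bits and require the rest to be zero, while the signed one biases the value by 0x10 so that a single unsigned comparison covers both ends of the range.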


@ -94,6 +94,12 @@
UNSPEC_ADDV
UNSPEC_SUBV
UNSPEC_NEGV
UNSPEC_DICTUNPACK
UNSPEC_FPCMPSHL
UNSPEC_FPUCMPSHL
UNSPEC_FPCMPDESHL
UNSPEC_FPCMPURSHL
])
(define_c_enum "unspecv" [
@ -238,7 +244,8 @@
niagara2,
niagara3,
niagara4,
niagara7"
niagara7,
m8"
(const (symbol_ref "sparc_cpu_attr")))
;; Attribute for the instruction set.
@ -251,7 +258,7 @@
(symbol_ref "TARGET_SPARCLET") (const_string "sparclet")]
(const_string "v7"))))
(define_attr "cpu_feature" "none,fpu,fpunotv9,v9,vis,vis3,vis4"
(define_attr "cpu_feature" "none,fpu,fpunotv9,v9,vis,vis3,vis4,vis4b"
(const_string "none"))
(define_attr "lra" "disabled,enabled"
@ -265,10 +272,92 @@
(eq_attr "cpu_feature" "v9") (symbol_ref "TARGET_V9")
(eq_attr "cpu_feature" "vis") (symbol_ref "TARGET_VIS")
(eq_attr "cpu_feature" "vis3") (symbol_ref "TARGET_VIS3")
(eq_attr "cpu_feature" "vis4") (symbol_ref "TARGET_VIS4")]
(eq_attr "cpu_feature" "vis4") (symbol_ref "TARGET_VIS4")
(eq_attr "cpu_feature" "vis4b") (symbol_ref "TARGET_VIS4B")]
(const_int 0)))
;; Insn type.
;; The SPARC instructions used by the backend are organized into a
;; hierarchy using the insn attributes "type" and "subtype".
;;
;; The mnemonics used in the list below are the architectural names
;; used in the Oracle SPARC Architecture specs. A / character
;; separates the type from the subtype where appropriate. For
;; brevity, text enclosed in {} denotes alternatives, while text
;; enclosed in [] is optional.
;;
;; Please keep this list updated. It is of great help for keeping the
;; correctness and coherence of the DFA schedulers.
;;
;; ialu: <empty>
;; ialuX: ADD[X]C SUB[X]C
;; shift: SLL[X] SRL[X] SRA[X]
;; cmove: MOV{A,N,NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS}
;; MOVF{A,N,U,G,UG,L,UL,LG,NE,E,UE,GE,UGE,LE,ULE,O}
;; MOVR{Z,LEZ,LZ,NZ,GZ,GEZ}
;; compare: ADDcc ADDCcc ANDcc ORcc SUBcc SUBCcc XORcc XNORcc
;; imul: MULX SMUL[cc] UMUL UMULXHI XMULX XMULXHI
;; idiv: UDIVX SDIVX
;; flush: FLUSH
;; load/regular: LD{UB,UH,UW} LDFSR
;; load/prefetch: PREFETCH
;; fpload: LDF LDDF LDQF
;; sload: LD{SB,SH,SW}
;; store: ST{B,H,W,X} STFSR
;; fpstore: STF STDF STQF
;; cbcond: CWB{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS}
;; CXB{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS}
;; uncond_branch: BA BPA JMPL
;; branch: B{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS}
;; BP{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS}
;; FB{U,G,UG,L,UL,LG,NE,BE,UE,GE,UGE,LE,ULE,O}
;; call: CALL
;; return: RESTORE RETURN
;; fpmove: FABS{s,d,q} FMOV{s,d,q} FNEG{s,d,q}
;; fpcmove: FMOV{S,D,Q}{icc,xcc,fcc}
;; fpcrmove: FMOVR{s,d,q}{Z,LEZ,LZ,NZ,GZ,GEZ}
;; fp: FADD{s,d,q} FSUB{s,d,q} FHSUB{s,d} FNHADD{s,d} FNADD{s,d}
;; FiTO{s,d,q} FsTO{i,x,d,q} FdTO{i,x,s,q} FxTO{d,s,q} FqTO{i,x,s,d}
;; fpcmp: FCMP{s,d,q} FCMPE{s,d,q}
;; fpmul: FMADD{s,d} FMSUB{s,d} FMUL{s,d,q} FNMADD{s,d}
;; FNMSUB{s,d} FNMUL{s,d} FNsMULd FsMULd
;; FdMULq
;; array: ARRAY{8,16,32}
;; bmask: BMASK
;; edge: EDGE{8,16,32}[L]cc
;; edgen: EDGE{8,16,32}[L]n
;; fpdivs: FDIV{s,q}
;; fpsqrts: FSQRT{s,q}
;; fpdivd: FDIVd
;; fpsqrtd: FSQRTd
;; lzd: LZCNT
;; fga/addsub64: FP{ADD,SUB}64
;; fga/fpu: FCHKSM16 FEXPANd FMEAN16 FPMERGE
;; FS{LL,RA,RL}{16,32}
;; fga/maxmin: FP{MAX,MIN}[U]{8,16,32}
;; fga/cmask: CMASK{8,16,32}
;; fga/other: BSHUFFLE FALIGNDATAg FP{ADD,SUB}[S]{8,16,32}
;; FP{ADD,SUB}US{8,16} DICTUNPACK
;; gsr/reg: RDGSR WRGSR
;; gsr/alignaddr: ALIGNADDRESS[_LITTLE]
;; vismv/double: FSRC2d
;; vismv/single: MOVwTOs FSRC2s
;; vismv/movstouw: MOVsTOuw
;; vismv/movxtod: MOVxTOd
;; vismv/movdtox: MOVdTOx
;; visl/single: F{AND,NAND,NOR,OR,NOT1}s
;; F{AND,OR}NOT{1,2}s
;; FONEs F{ZERO,XNOR,XOR}s FNOT2s
;; visl/double: FONEd FZEROd FNOT1d F{OR,AND,XOR}d F{NOR,NAND,XNOR}d
;; F{OR,AND}NOT1d F{OR,AND}NOT2d
;; viscmp: FPCMP{LE,GT,NE,EQ}{8,16,32} FPCMPU{LE,GT,NE,EQ}{8,16,32}
;; FPCMP{LE,GT,EQ,NE}{8,16,32}SHL FPCMPU{LE,GT,EQ,NE}{8,16,32}SHL
;; FPCMPDE{8,16,32}SHL FPCMPUR{8,16,32}SHL
;; fgm_pack: FPACKFIX FPACK{8,16,32}
;; fgm_mul: FMUL8SUx16 FMUL8ULx16 FMUL8x16 FMUL8x16AL
;; FMUL8x16AU FMULD8SUx16 FMULD8ULx16
;; pdist: PDIST
;; pdistn: PDISTN
(define_attr "type"
"ialu,compare,shift,
load,sload,store,
@ -281,12 +370,20 @@
fpcmp,
fpmul,fpdivs,fpdivd,
fpsqrts,fpsqrtd,
fga,visl,vismv,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,gsr,array,
fga,visl,vismv,viscmp,
fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,gsr,array,bmask,
cmove,
ialuX,
multi,savew,flushw,iflush,trap,lzd"
(const_string "ialu"))
(define_attr "subtype"
"single,double,movstouw,movxtod,movdtox,
addsub64,cmask,fpu,maxmin,other,
reg,alignaddr,
prefetch,regular"
(const_string "single"))
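;; Illustrative sketch only, not part of this patch: one way a DFA
;; scheduler description could select on both "type" and "subtype".
;; The automaton, unit and reservation names and both latencies are
;; hypothetical, and the sketch assumes the "cpu" attribute has an
;; "m8" value.
;;
;; (define_automaton "example_m8")
;; (define_cpu_unit "example_m8_lsu" "example_m8")
;;
;; (define_insn_reservation "example_m8_load" 4
;;   (and (eq_attr "cpu" "m8")
;;        (and (eq_attr "type" "load")
;;             (eq_attr "subtype" "regular")))
;;   "example_m8_lsu")
;;
;; (define_insn_reservation "example_m8_prefetch" 1
;;   (and (eq_attr "cpu" "m8")
;;        (and (eq_attr "type" "load")
;;             (eq_attr "subtype" "prefetch")))
;;   "example_m8_lsu")
;;
;; Because the reservations test "subtype" rather than a per-cpu
;; attribute such as "v3pipe", existing insn patterns would not need
;; new attributes when a later cpu is added.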
;; True if branch/call has empty delay slot and will emit a nop in it
(define_attr "empty_delay_slot" "false,true"
(symbol_ref "(empty_delay_slot (insn)
@ -487,9 +584,6 @@
(const_string "true")
] (const_string "false")))
;; True if the instruction executes in the V3 pipeline, in M7 and later processors.
(define_attr "v3pipe" "false,true" (const_string "false"))
(define_delay (eq_attr "type" "call")
[(eq_attr "in_call_delay" "true") (nil) (nil)])
@ -519,6 +613,7 @@
(include "niagara2.md")
(include "niagara4.md")
(include "niagara7.md")
(include "m8.md")
;; Operand and operator predicates and constraints
@ -1507,6 +1602,7 @@
ldub\t%1, %0
stb\t%r1, %0"
[(set_attr "type" "*,load,store")
(set_attr "subtype" "*,regular,*")
(set_attr "us3load_type" "*,3cycle,*")])
(define_expand "movhi"
@ -1529,6 +1625,7 @@
lduh\t%1, %0
sth\t%r1, %0"
[(set_attr "type" "*,*,load,store")
(set_attr "subtype" "*,*,regular,*")
(set_attr "us3load_type" "*,*,3cycle,*")])
;; We always work with constants here.
@ -1566,8 +1663,8 @@
fzeros\t%0
fones\t%0"
[(set_attr "type" "*,*,load,store,vismv,vismv,fpmove,fpload,fpstore,visl,visl")
(set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")
(set_attr "v3pipe" "*,*,*,*,true,true,*,*,*,true,true")])
(set_attr "subtype" "*,*,regular,*,movstouw,single,*,*,*,single,single")
(set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")])
(define_insn "*movsi_lo_sum"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -1624,7 +1721,8 @@
return "ld\t[%1 + %2], %0";
#endif
}
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_expand "movsi_pic_label_ref"
[(set (match_dup 3) (high:SI
@ -1733,11 +1831,12 @@
std\t%1, %0
fzero\t%0
fone\t%0"
[(set_attr "type" "store,*,load,store,load,store,*,*,fpload,fpstore,*,*,fpmove,*,*,*,fpload,fpstore,visl,visl")
[(set_attr "type" "store,*,load,store,load,store,*,*,fpload,fpstore,*,*,fpmove,*,*,*,fpload,fpstore,visl,
visl")
(set_attr "subtype" "*,*,regular,*,regular,*,*,*,*,*,*,*,*,*,*,*,*,*,double,double")
(set_attr "length" "*,2,*,*,*,*,2,2,*,*,2,2,*,2,2,2,*,*,*,*")
(set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*,*,*,*,double,double")
(set_attr "cpu_feature" "v9,*,*,*,*,*,*,*,fpu,fpu,fpu,fpu,v9,fpunotv9,vis3,vis3,fpu,fpu,vis,vis")
(set_attr "v3pipe" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,true,true")
(set_attr "lra" "*,*,disabled,disabled,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")])
(define_insn "*movdi_insn_sp64"
@ -1759,9 +1858,9 @@
fzero\t%0
fone\t%0"
[(set_attr "type" "*,*,load,store,vismv,vismv,fpmove,fpload,fpstore,visl,visl")
(set_attr "subtype" "*,*,regular,*,movdtox,movxtod,*,*,*,double,double")
(set_attr "fptype" "*,*,*,*,*,*,double,*,*,double,double")
(set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")
(set_attr "v3pipe" "*,*,*,*,*,*,*,*,*,true,true")])
(set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")])
(define_expand "movdi_pic_label_ref"
[(set (match_dup 3) (high:DI
@ -1847,7 +1946,8 @@
return "ldx\t[%1 + %2], %0";
#endif
}
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "*sethi_di_medlow_embmedany_pic"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -2289,8 +2389,8 @@
}
}
[(set_attr "type" "visl,visl,fpmove,*,*,*,vismv,vismv,fpload,load,fpstore,store")
(set_attr "cpu_feature" "vis,vis,fpu,*,*,*,vis3,vis3,fpu,*,fpu,*")
(set_attr "v3pipe" "true,true,*,*,*,*,true,true,*,*,*,*")])
(set_attr "subtype" "single,single,*,*,*,*,movstouw,single,*,regular,*,*")
(set_attr "cpu_feature" "vis,vis,fpu,*,*,*,vis3,vis3,fpu,*,fpu,*")])
;; The following 3 patterns build SFmode constants in integer registers.
@ -2362,10 +2462,10 @@
ldd\t%1, %0
std\t%1, %0"
[(set_attr "type" "store,*,visl,visl,fpmove,*,*,*,fpload,fpstore,load,store,*,*,*,load,store")
(set_attr "subtype" "*,*,double,double,*,*,*,*,*,*,regular,*,*,*,*,regular,*")
(set_attr "length" "*,2,*,*,*,2,2,2,*,*,*,*,2,2,2,*,*")
(set_attr "fptype" "*,*,double,double,double,*,*,*,*,*,*,*,*,*,*,*,*")
(set_attr "cpu_feature" "v9,*,vis,vis,v9,fpunotv9,vis3,vis3,fpu,fpu,*,*,fpu,fpu,*,*,*")
(set_attr "v3pipe" "*,*,true,true,*,*,*,*,*,*,*,*,*,*,*,*,*")
(set_attr "lra" "*,*,*,*,*,*,*,*,*,*,disabled,disabled,*,*,*,*,*")])
(define_insn "*movdf_insn_sp64"
@ -2387,10 +2487,10 @@
stx\t%r1, %0
#"
[(set_attr "type" "visl,visl,fpmove,vismv,vismv,load,store,*,load,store,*")
(set_attr "subtype" "double,double,*,movdtox,movxtod,regular,*,*,regular,*,*")
(set_attr "length" "*,*,*,*,*,*,*,*,*,*,2")
(set_attr "fptype" "double,double,double,double,double,*,*,*,*,*,*")
(set_attr "cpu_feature" "vis,vis,fpu,vis3,vis3,fpu,fpu,*,*,*,*")
(set_attr "v3pipe" "true,true,*,*,*,*,*,*,*,*,*")])
(set_attr "cpu_feature" "vis,vis,fpu,vis3,vis3,fpu,fpu,*,*,*,*")])
;; This pattern builds DFmode constants in integer registers.
(define_split
@ -2916,6 +3016,7 @@
""
"lduh\t%1, %0"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_expand "zero_extendqihi2"
@ -2932,6 +3033,7 @@
and\t%1, 0xff, %0
ldub\t%1, %0"
[(set_attr "type" "*,load")
(set_attr "subtype" "*,regular")
(set_attr "us3load_type" "*,3cycle")])
(define_expand "zero_extendqisi2"
@ -2948,6 +3050,7 @@
and\t%1, 0xff, %0
ldub\t%1, %0"
[(set_attr "type" "*,load")
(set_attr "subtype" "*,regular")
(set_attr "us3load_type" "*,3cycle")])
(define_expand "zero_extendqidi2"
@ -2964,6 +3067,7 @@
and\t%1, 0xff, %0
ldub\t%1, %0"
[(set_attr "type" "*,load")
(set_attr "subtype" "*,regular")
(set_attr "us3load_type" "*,3cycle")])
(define_expand "zero_extendhidi2"
@ -2995,6 +3099,7 @@
"TARGET_ARCH64"
"lduh\t%1, %0"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
;; ??? Write truncdisi pattern using sra?
@ -3015,8 +3120,9 @@
lduw\t%1, %0
movstouw\t%1, %0"
[(set_attr "type" "shift,load,*")
(set_attr "cpu_feature" "*,*,vis3")
(set_attr "v3pipe" "*,*,true")])
(set_attr "subtype" "*,regular,movstouw")
(set_attr "cpu_feature" "*,*,vis3")])
(define_insn_and_split "*zero_extendsidi2_insn_sp32"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -3331,8 +3437,7 @@
movstosw\t%1, %0"
[(set_attr "type" "shift,sload,*")
(set_attr "us3load_type" "*,3cycle,*")
(set_attr "cpu_feature" "*,*,vis3")
(set_attr "v3pipe" "*,*,true")])
(set_attr "cpu_feature" "*,*,vis3")])
;; Special pattern for optimizing bit-field compares. This is needed
@ -7356,7 +7461,8 @@
[(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")] UNSPECV_LDFSR)]
"TARGET_FPU"
"ld\t%0, %%fsr"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "stfsr"
[(set (match_operand:SI 0 "memory_operand" "=m")
@ -7720,7 +7826,8 @@
gcc_assert (locality >= 0 && locality < 4);
return prefetch_instr [read_or_write][locality == 0 ? 0 : 1];
}
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "prefetch")])
(define_insn "prefetch_32"
[(prefetch (match_operand:SI 0 "address_operand" "p")
@ -7745,7 +7852,8 @@
gcc_assert (locality >= 0 && locality < 4);
return prefetch_instr [read_or_write][locality == 0 ? 0 : 1];
}
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "prefetch")])
;; Trap instructions.
@ -7966,7 +8074,8 @@
UNSPEC_TLSIE))]
"TARGET_TLS && TARGET_ARCH32"
"ld\\t[%1 + %2], %0, %%tie_ld(%a3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "tie_ld64"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -7976,7 +8085,8 @@
UNSPEC_TLSIE))]
"TARGET_TLS && TARGET_ARCH64"
"ldx\\t[%1 + %2], %0, %%tie_ldx(%a3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "tie_add32"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -8036,6 +8146,7 @@
"TARGET_TLS && TARGET_ARCH32"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldub1_sp32"
@ -8048,6 +8159,7 @@
"TARGET_TLS && TARGET_ARCH32"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldub2_sp32"
@ -8060,6 +8172,7 @@
"TARGET_TLS && TARGET_ARCH32"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldsb1_sp32"
@ -8095,6 +8208,7 @@
"TARGET_TLS && TARGET_ARCH64"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldub1_sp64"
@ -8107,6 +8221,7 @@
"TARGET_TLS && TARGET_ARCH64"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldub2_sp64"
@ -8119,6 +8234,7 @@
"TARGET_TLS && TARGET_ARCH64"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldub3_sp64"
@ -8131,6 +8247,7 @@
"TARGET_TLS && TARGET_ARCH64"
"ldub\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldsb1_sp64"
@ -8178,6 +8295,7 @@
"TARGET_TLS && TARGET_ARCH32"
"lduh\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_lduh1_sp32"
@ -8190,6 +8308,7 @@
"TARGET_TLS && TARGET_ARCH32"
"lduh\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldsh1_sp32"
@ -8213,6 +8332,7 @@
"TARGET_TLS && TARGET_ARCH64"
"lduh\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_lduh1_sp64"
@ -8225,6 +8345,7 @@
"TARGET_TLS && TARGET_ARCH64"
"lduh\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_lduh2_sp64"
@ -8237,6 +8358,7 @@
"TARGET_TLS && TARGET_ARCH64"
"lduh\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")
(set_attr "subtype" "regular")
(set_attr "us3load_type" "3cycle")])
(define_insn "*tldo_ldsh1_sp64"
@ -8271,7 +8393,8 @@
(match_operand:SI 1 "register_operand" "r"))))]
"TARGET_TLS && TARGET_ARCH32"
"ld\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "*tldo_lduw_sp64"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -8281,7 +8404,8 @@
(match_operand:DI 1 "register_operand" "r"))))]
"TARGET_TLS && TARGET_ARCH64"
"lduw\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "*tldo_lduw1_sp64"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -8292,7 +8416,8 @@
(match_operand:DI 1 "register_operand" "r")))))]
"TARGET_TLS && TARGET_ARCH64"
"lduw\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "*tldo_ldsw1_sp64"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -8314,7 +8439,8 @@
(match_operand:DI 1 "register_operand" "r"))))]
"TARGET_TLS && TARGET_ARCH64"
"ldx\t[%1 + %2], %0, %%tldo_add(%3)"
[(set_attr "type" "load")])
[(set_attr "type" "load")
(set_attr "subtype" "regular")])
(define_insn "*tldo_stb_sp32"
[(set (mem:QI (plus:SI (unspec:SI [(match_operand:SI 2 "register_operand" "r")
@ -8519,8 +8645,8 @@
movstouw\t%1, %0
movwtos\t%1, %0"
[(set_attr "type" "visl,visl,vismv,fpload,fpstore,store,load,store,*,vismv,vismv")
(set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,*,vis3,vis3")
(set_attr "v3pipe" "true,true,true,*,*,*,*,*,*,true,true")])
(set_attr "subtype" "single,single,single,*,*,*,regular,*,*,movstouw,single")
(set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,*,vis3,vis3")])
(define_insn "*mov<VM64:mode>_insn_sp64"
[(set (match_operand:VM64 0 "nonimmediate_operand" "=e,e,e,e,W,m,*r, m,*r, e,*r")
@ -8542,8 +8668,8 @@
movxtod\t%1, %0
mov\t%1, %0"
[(set_attr "type" "visl,visl,vismv,fpload,fpstore,store,load,store,vismv,vismv,*")
(set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,vis3,vis3,*")
(set_attr "v3pipe" "true,true,true,*,*,*,*,*,*,*,*")])
(set_attr "subtype" "double,double,double,*,*,*,regular,*,movdtox,movxtod,*")
(set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,vis3,vis3,*")])
(define_insn "*mov<VM64:mode>_insn_sp32"
[(set (match_operand:VM64 0 "nonimmediate_operand"
@ -8572,9 +8698,9 @@
ldd\t%1, %0
std\t%1, %0"
[(set_attr "type" "store,*,visl,visl,vismv,*,*,fpload,fpstore,load,store,*,*,*,load,store")
(set_attr "subtype" "*,*,double,double,double,*,*,*,*,regular,*,*,*,*,regular,*")
(set_attr "length" "*,2,*,*,*,2,2,*,*,*,*,2,2,2,*,*")
(set_attr "cpu_feature" "*,*,vis,vis,vis,vis3,vis3,*,*,*,*,*,*,*,*,*")
(set_attr "v3pipe" "*,*,true,true,true,*,*,*,*,*,*,*,*,*,*,*")
(set_attr "lra" "*,*,*,*,*,*,*,*,*,disabled,disabled,*,*,*,*,*")])
(define_split
@ -8652,8 +8778,8 @@
"TARGET_VIS"
"fp<plusminus_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "other")
(set_attr "fptype" "<vfptype>")])
(define_mode_iterator VL [V1SI V2HI V4QI V1DI V2SI V4HI V8QI])
(define_mode_attr vlsuf [(V1SI "s") (V2HI "s") (V4QI "s")
@ -8669,8 +8795,7 @@
"TARGET_VIS"
"f<vlinsn><vlsuf>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
(define_insn "*not_<vlop:code><VL:mode>3"
[(set (match_operand:VL 0 "register_operand" "=<vconstr>")
@ -8679,8 +8804,7 @@
"TARGET_VIS"
"f<vlninsn><vlsuf>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
;; (ior (not (op1)) (not (op2))) is the canonical form of NAND.
(define_insn "*nand<VL:mode>_vis"
@ -8690,8 +8814,7 @@
"TARGET_VIS"
"fnand<vlsuf>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
(define_code_iterator vlnotop [ior and])
@ -8702,8 +8825,7 @@
"TARGET_VIS"
"f<vlinsn>not1<vlsuf>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
(define_insn "*<vlnotop:code>_not2<VL:mode>_vis"
[(set (match_operand:VL 0 "register_operand" "=<vconstr>")
@ -8712,8 +8834,7 @@
"TARGET_VIS"
"f<vlinsn>not2<vlsuf>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
(define_insn "one_cmpl<VL:mode>2"
[(set (match_operand:VL 0 "register_operand" "=<vconstr>")
@ -8721,8 +8842,7 @@
"TARGET_VIS"
"fnot1<vlsuf>\t%1, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "<vfptype>")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "<vfptype>")])
;; Hard to generate VIS instructions. We have builtins for these.
@ -8764,6 +8884,7 @@
"TARGET_VIS"
"fexpand\t%1, %0"
[(set_attr "type" "fga")
(set_attr "subtype" "fpu")
(set_attr "fptype" "double")])
(define_insn "fpmerge_vis"
@ -8778,6 +8899,7 @@
"TARGET_VIS"
"fpmerge\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "subtype" "fpu")
(set_attr "fptype" "double")])
;; Partitioned multiply instructions
@ -8866,7 +8988,8 @@
[(set (reg:DI GSR_REG) (match_operand:DI 0 "arith_operand" "rI"))]
"TARGET_VIS && TARGET_ARCH64"
"wr\t%%g0, %0, %%gsr"
[(set_attr "type" "gsr")])
[(set_attr "type" "gsr")
(set_attr "subtype" "reg")])
(define_insn "wrgsr_v8plus"
[(set (reg:DI GSR_REG) (match_operand:DI 0 "arith_operand" "I,r"))
@ -8897,7 +9020,8 @@
[(set (match_operand:DI 0 "register_operand" "=r") (reg:DI GSR_REG))]
"TARGET_VIS && TARGET_ARCH64"
"rd\t%%gsr, %0"
[(set_attr "type" "gsr")])
[(set_attr "type" "gsr")
(set_attr "subtype" "reg")])
(define_insn "rdgsr_v8plus"
[(set (match_operand:DI 0 "register_operand" "=r") (reg:DI GSR_REG))
@ -8920,8 +9044,8 @@
"TARGET_VIS"
"faligndata\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "fptype" "double")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "other")
(set_attr "fptype" "double")])
(define_insn "alignaddrsi_vis"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -8932,7 +9056,7 @@
"TARGET_VIS"
"alignaddr\t%r1, %r2, %0"
[(set_attr "type" "gsr")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "alignaddr")])
(define_insn "alignaddrdi_vis"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -8943,7 +9067,7 @@
"TARGET_VIS"
"alignaddr\t%r1, %r2, %0"
[(set_attr "type" "gsr")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "alignaddr")])
(define_insn "alignaddrlsi_vis"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -8955,7 +9079,7 @@
"TARGET_VIS"
"alignaddrl\t%r1, %r2, %0"
[(set_attr "type" "gsr")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "alignaddr")])
(define_insn "alignaddrldi_vis"
[(set (match_operand:DI 0 "register_operand" "=r")
@ -8967,7 +9091,7 @@
"TARGET_VIS"
"alignaddrl\t%r1, %r2, %0"
[(set_attr "type" "gsr")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "alignaddr")])
(define_insn "pdist_vis"
[(set (match_operand:DI 0 "register_operand" "=e")
@ -9059,9 +9183,7 @@
UNSPEC_FCMP))]
"TARGET_VIS"
"fcmp<gcond:code><GCM:gcm_name>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "double")
(set_attr "v3pipe" "true")])
[(set_attr "type" "viscmp")])
(define_insn "fpcmp<gcond:code>8<P:mode>_vis"
[(set (match_operand:P 0 "register_operand" "=r")
@ -9070,8 +9192,7 @@
UNSPEC_FCMP))]
"TARGET_VIS4"
"fpcmp<gcond:code>8\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "double")])
[(set_attr "type" "viscmp")])
(define_expand "vcond<GCM:mode><GCM:mode>"
[(match_operand:GCM 0 "register_operand" "")
@ -9134,8 +9255,7 @@
(plus:DI (match_dup 1) (match_dup 2)))]
"TARGET_VIS2 && TARGET_ARCH64"
"bmask\t%r1, %r2, %0"
[(set_attr "type" "array")
(set_attr "v3pipe" "true")])
[(set_attr "type" "bmask")])
(define_insn "bmasksi_vis"
[(set (match_operand:SI 0 "register_operand" "=r")
@ -9145,8 +9265,7 @@
(zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))]
"TARGET_VIS2"
"bmask\t%r1, %r2, %0"
[(set_attr "type" "array")
(set_attr "v3pipe" "true")])
[(set_attr "type" "bmask")])
(define_insn "bshuffle<VM64:mode>_vis"
[(set (match_operand:VM64 0 "register_operand" "=e")
@ -9157,8 +9276,8 @@
"TARGET_VIS2"
"bshuffle\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "fptype" "double")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "other")
(set_attr "fptype" "double")])
;; The rtl expanders will happily convert constant permutations on other
;; modes down to V8QI. Rely on this to avoid the complexity of the byte
@ -9261,7 +9380,7 @@
"TARGET_VIS3"
"cmask8\t%r0"
[(set_attr "type" "fga")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "cmask")])
(define_insn "cmask16<P:mode>_vis"
[(set (reg:DI GSR_REG)
@ -9271,7 +9390,7 @@
"TARGET_VIS3"
"cmask16\t%r0"
[(set_attr "type" "fga")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "cmask")])
(define_insn "cmask32<P:mode>_vis"
[(set (reg:DI GSR_REG)
@ -9281,7 +9400,7 @@
"TARGET_VIS3"
"cmask32\t%r0"
[(set_attr "type" "fga")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "cmask")])
(define_insn "fchksm16_vis"
[(set (match_operand:V4HI 0 "register_operand" "=e")
@ -9290,7 +9409,8 @@
UNSPEC_FCHKSM16))]
"TARGET_VIS3"
"fchksm16\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "fpu")])
(define_code_iterator vis3_shift [ashift ss_ashift lshiftrt ashiftrt])
(define_code_attr vis3_shift_insn
@ -9304,7 +9424,8 @@
(match_operand:GCM 2 "register_operand" "<vconstr>")))]
"TARGET_VIS3"
"<vis3_shift_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "fpu")])
(define_insn "pdistn<P:mode>_vis"
[(set (match_operand:P 0 "register_operand" "=r")
@ -9314,8 +9435,7 @@
"TARGET_VIS3"
"pdistn\t%1, %2, %0"
[(set_attr "type" "pdistn")
(set_attr "fptype" "double")
(set_attr "v3pipe" "true")])
(set_attr "fptype" "double")])
(define_insn "fmean16_vis"
[(set (match_operand:V4HI 0 "register_operand" "=e")
@ -9332,7 +9452,8 @@
(const_int 1))))]
"TARGET_VIS3"
"fmean16\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "fpu")])
(define_insn "fp<plusminus_insn>64_vis"
[(set (match_operand:V1DI 0 "register_operand" "=e")
@ -9340,7 +9461,8 @@
(match_operand:V1DI 2 "register_operand" "e")))]
"TARGET_VIS3"
"fp<plusminus_insn>64\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "addsub64")])
(define_insn "<plusminus_insn>v8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=e")
@ -9348,7 +9470,8 @@
(match_operand:V8QI 2 "register_operand" "e")))]
"TARGET_VIS4"
"fp<plusminus_insn>8\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "other")])
(define_mode_iterator VASS [V4HI V2SI V2HI V1SI])
(define_code_iterator vis3_addsub_ss [ss_plus ss_minus])
@ -9364,7 +9487,7 @@
"TARGET_VIS3"
"<vis3_addsub_ss_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "v3pipe" "true")])
(set_attr "subtype" "other")])
(define_mode_iterator VMMAX [V8QI V4HI V2SI])
(define_code_iterator vis4_minmax [smin smax])
@ -9379,7 +9502,8 @@
(match_operand:VMMAX 2 "register_operand" "<vconstr>")))]
"TARGET_VIS4"
"<vis4_minmax_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "maxmin")])
(define_code_iterator vis4_uminmax [umin umax])
(define_code_attr vis4_uminmax_insn
@ -9393,7 +9517,8 @@
(match_operand:VMMAX 2 "register_operand" "<vconstr>")))]
"TARGET_VIS4"
"<vis4_uminmax_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "maxmin")])
;; The use of vis3_addsub_ss_patname in the VIS4 instruction below is
;; intended.
@ -9403,7 +9528,8 @@
(match_operand:V8QI 2 "register_operand" "e")))]
"TARGET_VIS4"
"<vis3_addsub_ss_insn>8\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "other")])
(define_mode_iterator VAUS [V4HI V8QI])
(define_code_iterator vis4_addsub_us [us_plus us_minus])
@ -9418,7 +9544,8 @@
(match_operand:VAUS 2 "register_operand" "<vconstr>")))]
"TARGET_VIS4"
"<vis4_addsub_us_insn><vbits>\t%1, %2, %0"
[(set_attr "type" "fga")])
[(set_attr "type" "fga")
(set_attr "subtype" "other")])
(define_insn "fucmp<gcond:code>8<P:mode>_vis"
[(set (match_operand:P 0 "register_operand" "=r")
@ -9427,8 +9554,7 @@
UNSPEC_FUCMP))]
"TARGET_VIS3"
"fucmp<gcond:code>8\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "v3pipe" "true")])
[(set_attr "type" "viscmp")])
(define_insn "fpcmpu<gcond:code><GCM:gcm_name><P:mode>_vis"
[(set (match_operand:P 0 "register_operand" "=r")
@ -9437,8 +9563,7 @@
UNSPEC_FUCMP))]
"TARGET_VIS4"
"fpcmpu<gcond:code><GCM:gcm_name>\t%1, %2, %0"
[(set_attr "type" "visl")
(set_attr "fptype" "double")])
[(set_attr "type" "viscmp")])
(define_insn "*naddsf3"
[(set (match_operand:SF 0 "register_operand" "=f")
@ -9542,4 +9667,62 @@
[(set_attr "type" "fp")
(set_attr "fptype" "double")])
;; VIS4B instructions.
(define_mode_iterator DUMODE [V2SI V4HI V8QI])
(define_insn "dictunpack<DUMODE:vbits>"
[(set (match_operand:DUMODE 0 "register_operand" "=e")
(unspec:DUMODE [(match_operand:DF 1 "register_operand" "e")
(match_operand:SI 2 "imm5_operand_dictunpack<DUMODE:vbits>" "t")]
UNSPEC_DICTUNPACK))]
"TARGET_VIS4B"
"dictunpack\t%1, %2, %0"
[(set_attr "type" "fga")
(set_attr "subtype" "other")])
(define_mode_iterator FPCSMODE [V2SI V4HI V8QI])
(define_code_iterator fpcscond [le gt eq ne])
(define_code_iterator fpcsucond [le gt])
(define_insn "fpcmp<fpcscond:code><FPCSMODE:vbits><P:mode>shl"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(fpcscond:FPCSMODE (match_operand:FPCSMODE 1 "register_operand" "e")
(match_operand:FPCSMODE 2 "register_operand" "e"))
(match_operand:SI 3 "imm2_operand" "q")]
UNSPEC_FPCMPSHL))]
"TARGET_VIS4B"
"fpcmp<fpcscond:code><FPCSMODE:vbits>shl\t%1, %2, %3, %0"
[(set_attr "type" "viscmp")])
(define_insn "fpcmpu<fpcsucond:code><FPCSMODE:vbits><P:mode>shl"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(fpcsucond:FPCSMODE (match_operand:FPCSMODE 1 "register_operand" "e")
(match_operand:FPCSMODE 2 "register_operand" "e"))
(match_operand:SI 3 "imm2_operand" "q")]
UNSPEC_FPUCMPSHL))]
"TARGET_VIS4B"
"fpcmpu<fpcsucond:code><FPCSMODE:vbits>shl\t%1, %2, %3, %0"
[(set_attr "type" "viscmp")])
(define_insn "fpcmpde<FPCSMODE:vbits><P:mode>shl"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(match_operand:FPCSMODE 1 "register_operand" "e")
(match_operand:FPCSMODE 2 "register_operand" "e")
(match_operand:SI 3 "imm2_operand" "q")]
UNSPEC_FPCMPDESHL))]
"TARGET_VIS4B"
"fpcmpde<FPCSMODE:vbits>shl\t%1, %2, %3, %0"
[(set_attr "type" "viscmp")])
(define_insn "fpcmpur<FPCSMODE:vbits><P:mode>shl"
[(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(match_operand:FPCSMODE 1 "register_operand" "e")
(match_operand:FPCSMODE 2 "register_operand" "e")
(match_operand:SI 3 "imm2_operand" "q")]
UNSPEC_FPCMPURSHL))]
"TARGET_VIS4B"
"fpcmpur<FPCSMODE:vbits>shl\t%1, %2, %3, %0"
[(set_attr "type" "viscmp")])
(include "sync.md")

@ -81,6 +81,10 @@ mvis4
Target Report Mask(VIS4)
Use UltraSPARC Visual Instruction Set version 4.0 extensions.
mvis4b
Target Report Mask(VIS4B)
Use additional VIS instructions introduced in OSA2017.
mcbcond
Target Report Mask(CBCOND)
Use UltraSPARC Compare-and-Branch extensions.
@ -209,6 +213,9 @@ Enum(sparc_processor_type) String(niagara4) Value(PROCESSOR_NIAGARA4)
EnumValue
Enum(sparc_processor_type) String(niagara7) Value(PROCESSOR_NIAGARA7)
EnumValue
Enum(sparc_processor_type) String(m8) Value(PROCESSOR_M8)
mcmodel=
Target RejectNegative Joined Var(sparc_cmodel_string)
Use given SPARC-V9 code model.

@ -263,10 +263,10 @@
(define_insn_reservation "us1_fga_double"
2
(and (and
(eq_attr "cpu" "ultrasparc")
(eq_attr "type" "fga,visl,vismv"))
(eq_attr "fptype" "double"))
(and (eq_attr "cpu" "ultrasparc")
(ior (and (eq_attr "type" "fga,visl,vismv")
(eq_attr "fptype" "double"))
(eq_attr "type" "viscmp")))
"us1_fpa + us1_fp_double + us1_slotany, nothing")
(define_bypass 1 "us1_fga_double" "us1_fga_double")

@ -56,7 +56,7 @@
(define_insn_reservation "us3_array" 2
(and (eq_attr "cpu" "ultrasparc3")
(eq_attr "type" "array,edgen"))
(eq_attr "type" "array,edgen,bmask"))
"us3_ms + us3_slotany, nothing")
;; ??? Not entirely accurate.
@ -176,7 +176,7 @@
(define_insn_reservation "us3_fga"
3
(and (eq_attr "cpu" "ultrasparc3")
(eq_attr "type" "fga,visl,vismv"))
(eq_attr "type" "fga,visl,viscmp,vismv"))
"us3_fpa + us3_slotany, nothing*2")
(define_insn_reservation "us3_fgm"

gcc/configure vendored
@ -25217,6 +25217,41 @@ $as_echo "#define HAVE_AS_SPARC5_VIS4 1" >>confdefs.h
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for SPARC6 instructions" >&5
$as_echo_n "checking assembler for SPARC6 instructions... " >&6; }
if test "${gcc_cv_as_sparc_sparc6+set}" = set; then :
$as_echo_n "(cached) " >&6
else
gcc_cv_as_sparc_sparc6=no
if test x$gcc_cv_as != x; then
$as_echo '.text
.register %g2, #scratch
.register %g3, #scratch
.align 4
rd %entropy, %g1
fpsll64x %f0, %f2, %f4' > conftest.s
if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xarch=sparc6 -o conftest.o conftest.s >&5'
{ { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
(eval $ac_try) 2>&5
ac_status=$?
$as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
test $ac_status = 0; }; }
then
gcc_cv_as_sparc_sparc6=yes
else
echo "configure: failed program was" >&5
cat conftest.s >&5
fi
rm -f conftest.o conftest.s
fi
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_sparc_sparc6" >&5
$as_echo "$gcc_cv_as_sparc_sparc6" >&6; }
if test $gcc_cv_as_sparc_sparc6 = yes; then
$as_echo "#define HAVE_AS_SPARC6 1" >>confdefs.h
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for LEON instructions" >&5
$as_echo_n "checking assembler for LEON instructions... " >&6; }

@ -3969,6 +3969,18 @@ foo:
[AC_DEFINE(HAVE_AS_SPARC5_VIS4, 1,
[Define if your assembler supports SPARC5 and VIS 4.0 instructions.])])
gcc_GAS_CHECK_FEATURE([SPARC6 instructions],
gcc_cv_as_sparc_sparc6,,
[-xarch=sparc6],
[.text
.register %g2, #scratch
.register %g3, #scratch
.align 4
rd %entropy, %g1
fpsll64x %f0, %f2, %f4],,
[AC_DEFINE(HAVE_AS_SPARC6, 1,
[Define if your assembler supports SPARC6 instructions.])])
gcc_GAS_CHECK_FEATURE([LEON instructions],
gcc_cv_as_sparc_leon,,
[-Aleon],

@ -19074,6 +19074,45 @@ v4hi __builtin_vis_fpminu16 (v4hi, v4hi);
v2si __builtin_vis_fpminu32 (v2si, v2si);
@end smallexample
When you use the @option{-mvis4b} switch, the VIS version 4.0B
built-in functions also become available:
@smallexample
v8qi __builtin_vis_dictunpack8 (double, int);
v4hi __builtin_vis_dictunpack16 (double, int);
v2si __builtin_vis_dictunpack32 (double, int);
long __builtin_vis_fpcmple8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpgt8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpeq8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpne8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmple16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpgt16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpeq16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpne16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmple32shl (v2si, v2si, int);
long __builtin_vis_fpcmpgt32shl (v2si, v2si, int);
long __builtin_vis_fpcmpeq32shl (v2si, v2si, int);
long __builtin_vis_fpcmpne32shl (v2si, v2si, int);
long __builtin_vis_fpcmpule8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpugt8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpule16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpugt16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpule32shl (v2si, v2si, int);
long __builtin_vis_fpcmpugt32shl (v2si, v2si, int);
long __builtin_vis_fpcmpde8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpde16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpde32shl (v2si, v2si, int);
long __builtin_vis_fpcmpur8shl (v8qi, v8qi, int);
long __builtin_vis_fpcmpur16shl (v4hi, v4hi, int);
long __builtin_vis_fpcmpur32shl (v2si, v2si, int);
@end smallexample
@node SPU Built-in Functions
@subsection SPU Built-in Functions

@ -1117,6 +1117,7 @@ See RS/6000 and PowerPC Options.
-muser-mode -mno-user-mode @gol
-mv8plus -mno-v8plus -mvis -mno-vis @gol
-mvis2 -mno-vis2 -mvis3 -mno-vis3 @gol
-mvis4 -mno-vis4 -mvis4b -mno-vis4b @gol
-mcbcond -mno-cbcond -mfmaf -mno-fmaf @gol
-mpopc -mno-popc -msubxc -mno-subxc@gol
-mfix-at697f -mfix-ut699 @gol
@ -23395,7 +23396,7 @@ for machine type @var{cpu_type}. Supported values for @var{cpu_type} are
@samp{leon}, @samp{leon3}, @samp{leon3v7}, @samp{sparclite}, @samp{f930},
@samp{f934}, @samp{sparclite86x}, @samp{sparclet}, @samp{tsc701}, @samp{v9},
@samp{ultrasparc}, @samp{ultrasparc3}, @samp{niagara}, @samp{niagara2},
@samp{niagara3}, @samp{niagara4} and @samp{niagara7}.
@samp{niagara3}, @samp{niagara4}, @samp{niagara7} and @samp{m8}.
Native Solaris and GNU/Linux toolchains also support the value @samp{native},
which selects the best architecture option for the host processor.
@ -23423,7 +23424,8 @@ f930, f934, sparclite86x
tsc701
@item v9
ultrasparc, ultrasparc3, niagara, niagara2, niagara3, niagara4, niagara7
ultrasparc, ultrasparc3, niagara, niagara2, niagara3, niagara4,
niagara7, m8
@end table
By default (unless configured otherwise), GCC generates code for the V7
@ -23467,7 +23469,8 @@ additionally optimizes it for Sun UltraSPARC T2 chips. With
UltraSPARC T3 chips. With @option{-mcpu=niagara4}, the compiler
additionally optimizes it for Sun UltraSPARC T4 chips. With
@option{-mcpu=niagara7}, the compiler additionally optimizes it for
Oracle SPARC M7 chips.
Oracle SPARC M7 chips. With @option{-mcpu=m8}, the compiler
additionally optimizes it for Oracle SPARC M8 chips.
@item -mtune=@var{cpu_type}
@opindex mtune
@@ -23482,8 +23485,8 @@ that select a particular CPU implementation. Those are
@samp{leon3}, @samp{leon3v7}, @samp{f930}, @samp{f934},
@samp{sparclite86x}, @samp{tsc701}, @samp{ultrasparc},
@samp{ultrasparc3}, @samp{niagara}, @samp{niagara2}, @samp{niagara3},
@samp{niagara4} and @samp{niagara7}. With native Solaris and
GNU/Linux toolchains, @samp{native} can also be used.
@samp{niagara4}, @samp{niagara7} and @samp{m8}. With native Solaris
and GNU/Linux toolchains, @samp{native} can also be used.
@item -mv8plus
@itemx -mno-v8plus
@@ -23531,6 +23534,18 @@ default is @option{-mvis4} when targeting a cpu that supports such
instructions, such as niagara-7 and later. Setting @option{-mvis4}
also sets @option{-mvis3}, @option{-mvis2} and @option{-mvis}.
@item -mvis4b
@itemx -mno-vis4b
@opindex mvis4b
@opindex mno-vis4b
With @option{-mvis4b}, GCC generates code that takes advantage of
version 4.0 of the UltraSPARC Visual Instruction Set extensions, plus
the additional VIS instructions introduced in the Oracle SPARC
Architecture 2017. The default is @option{-mvis4b} when targeting a
cpu that supports such instructions, such as m8 and later. Setting
@option{-mvis4b} also sets @option{-mvis4}, @option{-mvis3},
@option{-mvis2} and @option{-mvis}.
@item -mcbcond
@itemx -mno-cbcond
@opindex mcbcond

@@ -1,3 +1,11 @@
2017-07-07  Jose E. Marchesi  <jose.marchesi@oracle.com>

	* gcc.target/sparc/dictunpack.c: New file.
	* gcc.target/sparc/fpcmpdeshl.c: Likewise.
	* gcc.target/sparc/fpcmpshl.c: Likewise.
	* gcc.target/sparc/fpcmpurshl.c: Likewise.
	* gcc.target/sparc/fpcmpushl.c: Likewise.

2017-07-05  Georg-Johann Lay  <avr@gjlay.de>

	Backport from 2017-07-05 trunk r249995, r249996.

@@ -0,0 +1,25 @@
/* { dg-do compile } */
/* { dg-options "-mvis4b" } */

typedef unsigned char vec8 __attribute__((vector_size(8)));
typedef short vec16 __attribute__((vector_size(8)));
typedef int vec32 __attribute__((vector_size(8)));

vec8 test_dictunpack8 (double a)
{
  return __builtin_vis_dictunpack8 (a, 6);
}

vec16 test_dictunpack16 (double a)
{
  return __builtin_vis_dictunpack16 (a, 14);
}

vec32 test_dictunpack32 (double a)
{
  return __builtin_vis_dictunpack32 (a, 30);
}

/* { dg-final { scan-assembler "dictunpack\t%" } } */
/* { dg-final { scan-assembler "dictunpack\t%" } } */
/* { dg-final { scan-assembler "dictunpack\t%" } } */

@@ -0,0 +1,25 @@
/* { dg-do compile } */
/* { dg-options "-mvis4b" } */

typedef unsigned char vec8 __attribute__((vector_size(8)));
typedef short vec16 __attribute__((vector_size(8)));
typedef int vec32 __attribute__((vector_size(8)));

long test_fpcmpde8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpde8shl (a, b, 2);
}

long test_fpcmpde16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpde16shl (a, b, 2);
}

long test_fpcmpde32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpde32shl (a, b, 2);
}

/* { dg-final { scan-assembler "fpcmpde8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpde16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpde32shl\t%" } } */

@@ -0,0 +1,81 @@
/* { dg-do compile } */
/* { dg-options "-mvis4b" } */

typedef unsigned char vec8 __attribute__((vector_size(8)));
typedef short vec16 __attribute__((vector_size(8)));
typedef int vec32 __attribute__((vector_size(8)));

long test_fpcmple8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmple8shl (a, b, 2);
}

long test_fpcmpgt8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpgt8shl (a, b, 2);
}

long test_fpcmpeq8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpeq8shl (a, b, 2);
}

long test_fpcmpne8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpne8shl (a, b, 2);
}

long test_fpcmple16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmple16shl (a, b, 2);
}

long test_fpcmpgt16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpgt16shl (a, b, 2);
}

long test_fpcmpeq16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpeq16shl (a, b, 2);
}

long test_fpcmpne16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpne16shl (a, b, 2);
}

long test_fpcmple32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmple32shl (a, b, 2);
}

long test_fpcmpgt32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpgt32shl (a, b, 2);
}

long test_fpcmpeq32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpeq32shl (a, b, 2);
}

long test_fpcmpne32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpne32shl (a, b, 2);
}

/* { dg-final { scan-assembler "fpcmple8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpgt8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpeq8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpne8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmple16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpgt16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpeq16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpne16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmple32shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpgt32shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpeq32shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpne32shl\t%" } } */

@@ -0,0 +1,25 @@
/* { dg-do compile } */
/* { dg-options "-mvis4b" } */

typedef unsigned char vec8 __attribute__((vector_size(8)));
typedef short vec16 __attribute__((vector_size(8)));
typedef int vec32 __attribute__((vector_size(8)));

long test_fpcmpur8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpur8shl (a, b, 2);
}

long test_fpcmpur16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpur16shl (a, b, 2);
}

long test_fpcmpur32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpur32shl (a, b, 2);
}

/* { dg-final { scan-assembler "fpcmpur8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpur16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpur32shl\t%" } } */

@@ -0,0 +1,43 @@
/* { dg-do compile } */
/* { dg-options "-mvis4b" } */

typedef unsigned char vec8 __attribute__((vector_size(8)));
typedef short vec16 __attribute__((vector_size(8)));
typedef int vec32 __attribute__((vector_size(8)));

long test_fpcmpule8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpule8shl (a, b, 2);
}

long test_fpcmpugt8shl (vec8 a, vec8 b)
{
  return __builtin_vis_fpcmpugt8shl (a, b, 2);
}

long test_fpcmpule16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpule16shl (a, b, 2);
}

long test_fpcmpugt16shl (vec16 a, vec16 b)
{
  return __builtin_vis_fpcmpugt16shl (a, b, 2);
}

long test_fpcmpule32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpule32shl (a, b, 2);
}

long test_fpcmpugt32shl (vec32 a, vec32 b)
{
  return __builtin_vis_fpcmpugt32shl (a, b, 2);
}

/* { dg-final { scan-assembler "fpcmpule8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpugt8shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpule16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpugt16shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpule32shl\t%" } } */
/* { dg-final { scan-assembler "fpcmpugt32shl\t%" } } */