i386.c (ix86_avx256_split_vector_move_misalign): Use new names.

* config/i386/i386.c (ix86_avx256_split_vector_move_misalign):
	Use new names.
	(ix86_expand_vector_move_misalign): Support new unaligned loads and
	stores and use new names.
	(CODE_FOR_sse2_storedqu): Rename to ...
	(CODE_FOR_sse2_storedquv16qi): ... this.
	(CODE_FOR_sse2_loaddqu): Rename to ...
	(CODE_FOR_sse2_loaddquv16qi): ... this.
	(CODE_FOR_avx_loaddqu256): Rename to ...
	(CODE_FOR_avx_loaddquv32qi): ... this.
	(CODE_FOR_avx_storedqu256): Rename to ...
	(CODE_FOR_avx_storedquv32qi): ... this.
	* config/i386/i386.md (fpint_logic): New.
	* config/i386/sse.md (VMOVE): Extend for AVX512.
	(VF): Ditto.
	(VF_128_256): New.
	(VF_512): Ditto.
	(VI_UNALIGNED_LOADSTORE): Ditto.
	(sse2_avx_avx512f): Ditto.
	(sse2_avx2): Extend for AVX512.
	(sse4_1_avx2): Ditto.
	(avx2_avx512f): New.
	(sse): Extend for AVX512.
	(sse2): Ditto.
	(sse4_1): Ditto.
	(avxsizesuffix): Ditto.
	(sseintvecmode): Ditto.
	(ssePSmode): Ditto.
	(<sse>_loadu<ssemodesuffix><avxsizesuffix>): Ditto.
	(<sse>_storeu<ssemodesuffix><avxsizesuffix>): Ditto.
	(<sse2>_loaddqu<avxsizesuffix>): Extend for AVX512 and rename to ...
	(<sse2_avx_avx512f>_loaddqu<mode>): ... this.
	(<sse2>_storedqu<avxsizesuffix>): Extend for AVX512 and rename to ...
	(<sse2_avx_avx512f>_storedqu<mode>): ... this.
	(<sse>_movnt<mode>): Replace constraint "x" with "v".
	(STORENT_MODE): Extend for AVX512.
	(*absneg<mode>2): Replace constraint "x" with "v".
	(*mul<mode>3): Ditto.
	(*ieee_smin<mode>3): Ditto.
	(*ieee_smax<mode>3): Ditto.
	(avx_cmp<mode>3): Replace VF with VF_128_256.
	(*<sse>_maskcmp<mode>3_comm): Ditto.
	(<sse>_maskcmp<mode>3): Ditto.
	(<sse>_andnot<mode>3): Extend for AVX512.
	(<code><mode>3, anylogic): Replace VF with VF_128_256.
	(<code><mode>3, fpint_logic): New.
	(*<code><mode>3): Extend for AVX512.
	(avx512flogicsuff): New.
	(avx512f_<logic><mode>): Ditto.
	(<sse>_movmsk<ssemodesuffix><avxsizesuffix>): Replace VF with
	VF_128_256.
	(<sse4_1>_blend<ssemodesuffix><avxsizesuffix>): Ditto.
	(<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>): Ditto.
	(<sse4_1>_dp<ssemodesuffix><avxsizesuffix>): Ditto.
	(avx_vtest<ssemodesuffix><avxsizesuffix>): Ditto.
	(<sse4_1>_round<ssemodesuffix><avxsizesuffix>): Ditto.
	(xop_vpermil2<mode>3): Ditto.
	(*avx_vpermilp<mode>): Extend for AVX512 and rename to ...
	(*<sse2_avx_avx512f>_vpermilp<mode>): ... this.
	(avx_vpermilvar<mode>3): Extend for AVX512 and rename to ...
	(<sse2_avx_avx512f>_vpermilvar<mode>3): ... this.


Co-Authored-By: Andrey Turetskiy <andrey.turetskiy@intel.com>
Co-Authored-By: Anna Tikhonova <anna.tikhonova@intel.com>
Co-Authored-By: Ilya Tocar <ilya.tocar@intel.com>
Co-Authored-By: Ilya Verbin <ilya.verbin@intel.com>
Co-Authored-By: Kirill Yukhin <kirill.yukhin@intel.com>
Co-Authored-By: Maxim Kuznetsov <maxim.kuznetsov@intel.com>
Co-Authored-By: Michael Zolotukhin <michael.v.zolotukhin@intel.com>
Co-Authored-By: Sergey Lega <sergey.s.lega@intel.com>

From-SVN: r202913
Alexander Ivchenko 2013-09-25 18:01:43 +00:00 committed by Kirill Yukhin
parent 4d44d03c72
commit b86f6e9e46
4 changed files with 313 additions and 105 deletions
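
As a rough illustration only (not part of the commit): after this change, a misaligned vector copy such as the one below can be expanded through the new 512-bit unaligned load/store patterns (vmovups/vmovupd/vmovdqu32/vmovdqu64) when built with options along the lines of -O3 -mavx512f. The function name and flags here are assumptions for the example, not taken from the commit.

/* Hypothetical test case, not from this commit: with -O3 -mavx512f the
   vectorizer may emit misaligned 512-bit vector moves, which
   ix86_expand_vector_move_misalign now expands through the new
   512-bit unaligned load/store insns.  */
void
copy_floats (float *dst, const float *src, int n)
{
  int i;
  for (i = 0; i < n; i++)
    dst[i] = src[i];
}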

gcc/ChangeLog

@@ -1,3 +1,75 @@
+2013-09-25  Alexander Ivchenko  <alexander.ivchenko@intel.com>
+	    Maxim Kuznetsov  <maxim.kuznetsov@intel.com>
+	    Sergey Lega  <sergey.s.lega@intel.com>
+	    Anna Tikhonova  <anna.tikhonova@intel.com>
+	    Ilya Tocar  <ilya.tocar@intel.com>
+	    Andrey Turetskiy  <andrey.turetskiy@intel.com>
+	    Ilya Verbin  <ilya.verbin@intel.com>
+	    Kirill Yukhin  <kirill.yukhin@intel.com>
+	    Michael Zolotukhin  <michael.v.zolotukhin@intel.com>
+
+	* config/i386/i386.c (ix86_avx256_split_vector_move_misalign):
+	Use new names.
+	(ix86_expand_vector_move_misalign): Support new unaligned loads and
+	stores and use new names.
+	(CODE_FOR_sse2_storedqu): Rename to ...
+	(CODE_FOR_sse2_storedquv16qi): ... this.
+	(CODE_FOR_sse2_loaddqu): Rename to ...
+	(CODE_FOR_sse2_loaddquv16qi): ... this.
+	(CODE_FOR_avx_loaddqu256): Rename to ...
+	(CODE_FOR_avx_loaddquv32qi): ... this.
+	(CODE_FOR_avx_storedqu256): Rename to ...
+	(CODE_FOR_avx_storedquv32qi): ... this.
+	* config/i386/i386.md (fpint_logic): New.
+	* config/i386/sse.md (VMOVE): Extend for AVX512.
+	(VF): Ditto.
+	(VF_128_256): New.
+	(VF_512): Ditto.
+	(VI_UNALIGNED_LOADSTORE): Ditto.
+	(sse2_avx_avx512f): Ditto.
+	(sse2_avx2): Extend for AVX512.
+	(sse4_1_avx2): Ditto.
+	(avx2_avx512f): New.
+	(sse): Extend for AVX512.
+	(sse2): Ditto.
+	(sse4_1): Ditto.
+	(avxsizesuffix): Ditto.
+	(sseintvecmode): Ditto.
+	(ssePSmode): Ditto.
+	(<sse>_loadu<ssemodesuffix><avxsizesuffix>): Ditto.
+	(<sse>_storeu<ssemodesuffix><avxsizesuffix>): Ditto.
+	(<sse2>_loaddqu<avxsizesuffix>): Extend for AVX512 and rename to ...
+	(<sse2_avx_avx512f>_loaddqu<mode>): ... this.
+	(<sse2>_storedqu<avxsizesuffix>): Extend for AVX512 and rename to ...
+	(<sse2_avx_avx512f>_storedqu<mode>): ... this.
+	(<sse>_movnt<mode>): Replace constraint "x" with "v".
+	(STORENT_MODE): Extend for AVX512.
+	(*absneg<mode>2): Replace constraint "x" with "v".
+	(*mul<mode>3): Ditto.
+	(*ieee_smin<mode>3): Ditto.
+	(*ieee_smax<mode>3): Ditto.
+	(avx_cmp<mode>3): Replace VF with VF_128_256.
+	(*<sse>_maskcmp<mode>3_comm): Ditto.
+	(<sse>_maskcmp<mode>3): Ditto.
+	(<sse>_andnot<mode>3): Extend for AVX512.
+	(<code><mode>3, anylogic): Replace VF with VF_128_256.
+	(<code><mode>3, fpint_logic): New.
+	(*<code><mode>3): Extend for AVX512.
+	(avx512flogicsuff): New.
+	(avx512f_<logic><mode>): Ditto.
+	(<sse>_movmsk<ssemodesuffix><avxsizesuffix>): Replace VF with
+	VF_128_256.
+	(<sse4_1>_blend<ssemodesuffix><avxsizesuffix>): Ditto.
+	(<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>): Ditto.
+	(<sse4_1>_dp<ssemodesuffix><avxsizesuffix>): Ditto.
+	(avx_vtest<ssemodesuffix><avxsizesuffix>): Ditto.
+	(<sse4_1>_round<ssemodesuffix><avxsizesuffix>): Ditto.
+	(xop_vpermil2<mode>3): Ditto.
+	(*avx_vpermilp<mode>): Extend for AVX512 and rename to ...
+	(*<sse2_avx_avx512f>_vpermilp<mode>): ... this.
+	(avx_vpermilvar<mode>3): Extend for AVX512 and rename to ...
+	(<sse2_avx_avx512f>_vpermilvar<mode>3): ... this.
+
 2013-09-25  Tom Tromey  <tromey@redhat.com>
 
 	* Makefile.in (PARTITION_H, LTO_SYMTAB_H, COMMON_TARGET_DEF_H)

gcc/config/i386/i386.c

@@ -16457,8 +16457,8 @@ ix86_avx256_split_vector_move_misalign (rtx op0, rtx op1)
       gcc_unreachable ();
     case V32QImode:
       extract = gen_avx_vextractf128v32qi;
-      load_unaligned = gen_avx_loaddqu256;
-      store_unaligned = gen_avx_storedqu256;
+      load_unaligned = gen_avx_loaddquv32qi;
+      store_unaligned = gen_avx_storedquv32qi;
       mode = V16QImode;
       break;
     case V8SFmode:
@@ -16561,10 +16561,56 @@ void
 ix86_expand_vector_move_misalign (enum machine_mode mode, rtx operands[])
 {
   rtx op0, op1, m;
+  rtx (*load_unaligned) (rtx, rtx);
+  rtx (*store_unaligned) (rtx, rtx);
 
   op0 = operands[0];
   op1 = operands[1];
 
+  if (GET_MODE_SIZE (mode) == 64)
+    {
+      switch (GET_MODE_CLASS (mode))
+	{
+	case MODE_VECTOR_INT:
+	case MODE_INT:
+	  op0 = gen_lowpart (V16SImode, op0);
+	  op1 = gen_lowpart (V16SImode, op1);
+	  /* FALLTHRU */
+
+	case MODE_VECTOR_FLOAT:
+	  switch (GET_MODE (op0))
+	    {
+	    default:
+	      gcc_unreachable ();
+	    case V16SImode:
+	      load_unaligned = gen_avx512f_loaddquv16si;
+	      store_unaligned = gen_avx512f_storedquv16si;
+	      break;
+	    case V16SFmode:
+	      load_unaligned = gen_avx512f_loadups512;
+	      store_unaligned = gen_avx512f_storeups512;
+	      break;
+	    case V8DFmode:
+	      load_unaligned = gen_avx512f_loadupd512;
+	      store_unaligned = gen_avx512f_storeupd512;
+	      break;
+	    }
+
+	  if (MEM_P (op1))
+	    emit_insn (load_unaligned (op0, op1));
+	  else if (MEM_P (op0))
+	    emit_insn (store_unaligned (op0, op1));
+	  else
+	    gcc_unreachable ();
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+
+      return;
+    }
+
   if (TARGET_AVX
       && GET_MODE_SIZE (mode) == 32)
     {
@@ -16597,7 +16643,7 @@ ix86_expand_vector_move_misalign (enum machine_mode mode, rtx operands[])
 	  op0 = gen_lowpart (V16QImode, op0);
 	  op1 = gen_lowpart (V16QImode, op1);
 	  /* We will eventually emit movups based on insn attributes. */
-	  emit_insn (gen_sse2_loaddqu (op0, op1));
+	  emit_insn (gen_sse2_loaddquv16qi (op0, op1));
 	}
       else if (TARGET_SSE2 && mode == V2DFmode)
 	{
@@ -16672,7 +16718,7 @@ ix86_expand_vector_move_misalign (enum machine_mode mode, rtx operands[])
 	  op0 = gen_lowpart (V16QImode, op0);
 	  op1 = gen_lowpart (V16QImode, op1);
 	  /* We will eventually emit movups based on insn attributes. */
-	  emit_insn (gen_sse2_storedqu (op0, op1));
+	  emit_insn (gen_sse2_storedquv16qi (op0, op1));
 	}
       else if (TARGET_SSE2 && mode == V2DFmode)
 	{
@@ -27400,13 +27446,13 @@ static const struct builtin_description bdesc_special_args[] =
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_lfence, "__builtin_ia32_lfence", IX86_BUILTIN_LFENCE, UNKNOWN, (int) VOID_FTYPE_VOID },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_mfence, 0, IX86_BUILTIN_MFENCE, UNKNOWN, (int) VOID_FTYPE_VOID },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_storeupd, "__builtin_ia32_storeupd", IX86_BUILTIN_STOREUPD, UNKNOWN, (int) VOID_FTYPE_PDOUBLE_V2DF },
-  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_storedqu, "__builtin_ia32_storedqu", IX86_BUILTIN_STOREDQU, UNKNOWN, (int) VOID_FTYPE_PCHAR_V16QI },
+  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_storedquv16qi, "__builtin_ia32_storedqu", IX86_BUILTIN_STOREDQU, UNKNOWN, (int) VOID_FTYPE_PCHAR_V16QI },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_movntv2df, "__builtin_ia32_movntpd", IX86_BUILTIN_MOVNTPD, UNKNOWN, (int) VOID_FTYPE_PDOUBLE_V2DF },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_movntv2di, "__builtin_ia32_movntdq", IX86_BUILTIN_MOVNTDQ, UNKNOWN, (int) VOID_FTYPE_PV2DI_V2DI },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_movntisi, "__builtin_ia32_movnti", IX86_BUILTIN_MOVNTI, UNKNOWN, (int) VOID_FTYPE_PINT_INT },
   { OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_64BIT, CODE_FOR_sse2_movntidi, "__builtin_ia32_movnti64", IX86_BUILTIN_MOVNTI64, UNKNOWN, (int) VOID_FTYPE_PLONGLONG_LONGLONG },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_loadupd, "__builtin_ia32_loadupd", IX86_BUILTIN_LOADUPD, UNKNOWN, (int) V2DF_FTYPE_PCDOUBLE },
-  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_loaddqu, "__builtin_ia32_loaddqu", IX86_BUILTIN_LOADDQU, UNKNOWN, (int) V16QI_FTYPE_PCCHAR },
+  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_loaddquv16qi, "__builtin_ia32_loaddqu", IX86_BUILTIN_LOADDQU, UNKNOWN, (int) V16QI_FTYPE_PCCHAR },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_loadhpd_exp, "__builtin_ia32_loadhpd", IX86_BUILTIN_LOADHPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_PCDOUBLE },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_loadlpd_exp, "__builtin_ia32_loadlpd", IX86_BUILTIN_LOADLPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_PCDOUBLE },
@@ -27435,8 +27481,8 @@ static const struct builtin_description bdesc_special_args[] =
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_loadups256, "__builtin_ia32_loadups256", IX86_BUILTIN_LOADUPS256, UNKNOWN, (int) V8SF_FTYPE_PCFLOAT },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_storeupd256, "__builtin_ia32_storeupd256", IX86_BUILTIN_STOREUPD256, UNKNOWN, (int) VOID_FTYPE_PDOUBLE_V4DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_storeups256, "__builtin_ia32_storeups256", IX86_BUILTIN_STOREUPS256, UNKNOWN, (int) VOID_FTYPE_PFLOAT_V8SF },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_loaddqu256, "__builtin_ia32_loaddqu256", IX86_BUILTIN_LOADDQU256, UNKNOWN, (int) V32QI_FTYPE_PCCHAR },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_storedqu256, "__builtin_ia32_storedqu256", IX86_BUILTIN_STOREDQU256, UNKNOWN, (int) VOID_FTYPE_PCHAR_V32QI },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_loaddquv32qi, "__builtin_ia32_loaddqu256", IX86_BUILTIN_LOADDQU256, UNKNOWN, (int) V32QI_FTYPE_PCCHAR },
+  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_storedquv32qi, "__builtin_ia32_storedqu256", IX86_BUILTIN_STOREDQU256, UNKNOWN, (int) VOID_FTYPE_PCHAR_V32QI },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_lddqu256, "__builtin_ia32_lddqu256", IX86_BUILTIN_LDDQU256, UNKNOWN, (int) V32QI_FTYPE_PCCHAR },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_movntv4di, "__builtin_ia32_movntdq256", IX86_BUILTIN_MOVNTDQ256, UNKNOWN, (int) VOID_FTYPE_PV4DI_V4DI },

gcc/config/i386/i386.md

@@ -779,6 +779,7 @@
 ;; Mapping of logic operators
 (define_code_iterator any_logic [and ior xor])
 (define_code_iterator any_or [ior xor])
+(define_code_iterator fpint_logic [and xor])
 
 ;; Base name for insn mnemonic.
 (define_code_attr logic [(and "and") (ior "or") (xor "xor")])

gcc/config/i386/sse.md

@@ -97,13 +97,13 @@
 ;; All vector modes including V?TImode, used in move patterns.
 (define_mode_iterator VMOVE
-  [(V32QI "TARGET_AVX") V16QI
-   (V16HI "TARGET_AVX") V8HI
-   (V8SI "TARGET_AVX") V4SI
-   (V4DI "TARGET_AVX") V2DI
+  [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
+   (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
+   (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
+   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX") V2DI
    (V2TI "TARGET_AVX") V1TI
-   (V8SF "TARGET_AVX") V4SF
-   (V4DF "TARGET_AVX") V2DF])
+   (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
+   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF])
 
 ;; All vector modes
 (define_mode_iterator V
@@ -124,6 +124,11 @@
 ;; All vector float modes
 (define_mode_iterator VF
+  [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
+   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")])
+
+;; 128- and 256-bit float vector modes
+(define_mode_iterator VF_128_256
   [(V8SF "TARGET_AVX") V4SF
    (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")])
@@ -143,6 +148,10 @@
 (define_mode_iterator VF_256
   [V8SF V4DF])
 
+;; All 512bit vector float modes
+(define_mode_iterator VF_512
+  [V16SF V8DF])
+
 ;; All vector integer modes
 (define_mode_iterator VI
   [(V32QI "TARGET_AVX") V16QI
@@ -160,6 +169,10 @@
 (define_mode_iterator VI1
   [(V32QI "TARGET_AVX") V16QI])
 
+(define_mode_iterator VI_UNALIGNED_LOADSTORE
+  [(V32QI "TARGET_AVX") V16QI
+   (V16SI "TARGET_AVX512F") (V8DI "TARGET_AVX512F")])
+
 ;; All DImode vector integer modes
 (define_mode_iterator VI8
   [(V4DI "TARGET_AVX") V2DI])
@@ -212,11 +225,18 @@
    (V4SI "TARGET_AVX2") (V2DI "TARGET_AVX2")
    (V8SI "TARGET_AVX2") (V4DI "TARGET_AVX2")])
 
+(define_mode_attr sse2_avx_avx512f
+  [(V16QI "sse2") (V32QI "avx") (V64QI "avx512f")
+   (V4SI "sse2") (V8SI "avx") (V16SI "avx512f")
+   (V8DI "avx512f")
+   (V16SF "avx512f") (V8SF "avx") (V4SF "avx")
+   (V8DF "avx512f") (V4DF "avx") (V2DF "avx")])
+
 (define_mode_attr sse2_avx2
   [(V16QI "sse2") (V32QI "avx2")
    (V8HI "sse2") (V16HI "avx2")
-   (V4SI "sse2") (V8SI "avx2")
-   (V2DI "sse2") (V4DI "avx2")
+   (V4SI "sse2") (V8SI "avx2") (V16SI "avx512f")
+   (V2DI "sse2") (V4DI "avx2") (V8DI "avx512f")
    (V1TI "sse2") (V2TI "avx2")])
 
 (define_mode_attr ssse3_avx2
@@ -229,7 +249,7 @@
 (define_mode_attr sse4_1_avx2
   [(V16QI "sse4_1") (V32QI "avx2")
    (V8HI "sse4_1") (V16HI "avx2")
-   (V4SI "sse4_1") (V8SI "avx2")
+   (V4SI "sse4_1") (V8SI "avx2") (V16SI "avx512f")
    (V2DI "sse4_1") (V4DI "avx2")])
 
 (define_mode_attr avx_avx2
@@ -244,6 +264,12 @@
    (V4SI "vec") (V8SI "avx2")
    (V2DI "vec") (V4DI "avx2")])
 
+(define_mode_attr avx2_avx512f
+  [(V4SI "avx2") (V8SI "avx2") (V16SI "avx512f")
+   (V2DI "avx2") (V4DI "avx2") (V8DI "avx512f")
+   (V8SF "avx2") (V16SF "avx512f")
+   (V4DF "avx2") (V8DF "avx512f")])
+
 (define_mode_attr shuffletype
   [(V16SF "f") (V16SI "i") (V8DF "f") (V8DI "i")
    (V8SF "f") (V8SI "i") (V4DF "f") (V4DI "i")
@@ -287,22 +313,26 @@
 (define_mode_attr sse
   [(SF "sse") (DF "sse2")
    (V4SF "sse") (V2DF "sse2")
-   (V8SF "avx") (V4DF "avx")])
+   (V16SF "avx512f") (V8SF "avx")
+   (V8DF "avx512f") (V4DF "avx")])
 
 (define_mode_attr sse2
-  [(V16QI "sse2") (V32QI "avx")
-   (V2DI "sse2") (V4DI "avx")])
+  [(V16QI "sse2") (V32QI "avx") (V64QI "avx512f")
+   (V2DI "sse2") (V4DI "avx") (V8DI "avx512f")])
 
 (define_mode_attr sse3
   [(V16QI "sse3") (V32QI "avx")])
 
 (define_mode_attr sse4_1
   [(V4SF "sse4_1") (V2DF "sse4_1")
-   (V8SF "avx") (V4DF "avx")])
+   (V8SF "avx") (V4DF "avx")
+   (V8DF "avx512f")])
 
 (define_mode_attr avxsizesuffix
-  [(V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256")
+  [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512")
+   (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256")
    (V16QI "") (V8HI "") (V4SI "") (V2DI "")
+   (V16SF "512") (V8DF "512")
    (V8SF "256") (V4DF "256")
    (V4SF "") (V2DF "")])
@@ -318,8 +348,10 @@
 ;; Mapping of vector float modes to an integer mode of the same size
 (define_mode_attr sseintvecmode
-  [(V8SF "V8SI") (V4DF "V4DI")
+  [(V16SF "V16SI") (V8DF "V8DI")
+   (V8SF "V8SI") (V4DF "V4DI")
    (V4SF "V4SI") (V2DF "V2DI")
+   (V16SI "V16SI") (V8DI "V8DI")
    (V8SI "V8SI") (V4DI "V4DI")
    (V4SI "V4SI") (V2DI "V2DI")
    (V16HI "V16HI") (V8HI "V8HI")
@@ -349,8 +381,10 @@
 ;; Mapping of vector modes ti packed single mode of the same size
 (define_mode_attr ssePSmode
-  [(V32QI "V8SF") (V16QI "V4SF")
-   (V16HI "V8SF") (V8HI "V4SF")
+  [(V16SI "V16SF") (V8DF "V16SF")
+   (V16SF "V16SF") (V8DI "V16SF")
+   (V64QI "V16SF") (V32QI "V8SF") (V16QI "V4SF")
+   (V32HI "V16SF") (V16HI "V8SF") (V8HI "V4SF")
    (V8SI "V8SF") (V4SI "V4SF")
    (V4DI "V8SF") (V2DI "V4SF")
    (V2TI "V8SF") (V1TI "V4SF")
@@ -665,12 +699,13 @@
 (define_insn "<sse>_loadu<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:VF 0 "register_operand" "=v")
 	(unspec:VF
-	  [(match_operand:VF 1 "memory_operand" "m")]
+	  [(match_operand:VF 1 "nonimmediate_operand" "vm")]
 	  UNSPEC_LOADU))]
   "TARGET_SSE"
 {
   switch (get_attr_mode (insn))
     {
+    case MODE_V16SF:
     case MODE_V8SF:
     case MODE_V4SF:
       return "%vmovups\t{%1, %0|%0, %1}";
@@ -694,12 +729,13 @@
 (define_insn "<sse>_storeu<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:VF 0 "memory_operand" "=m")
 	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "x")]
+	  [(match_operand:VF 1 "register_operand" "v")]
 	  UNSPEC_STOREU))]
   "TARGET_SSE"
 {
   switch (get_attr_mode (insn))
     {
+    case MODE_V16SF:
     case MODE_V8SF:
     case MODE_V4SF:
       return "%vmovups\t{%1, %0|%0, %1}";
@@ -721,9 +757,10 @@
 	  ]
 	  (const_string "<MODE>")))])
 
-(define_insn "<sse2>_loaddqu<avxsizesuffix>"
-  [(set (match_operand:VI1 0 "register_operand" "=v")
-	(unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
+(define_insn "<sse2_avx_avx512f>_loaddqu<mode>"
+  [(set (match_operand:VI_UNALIGNED_LOADSTORE 0 "register_operand" "=v")
+	(unspec:VI_UNALIGNED_LOADSTORE
+	  [(match_operand:VI_UNALIGNED_LOADSTORE 1 "nonimmediate_operand" "vm")]
 	  UNSPEC_LOADU))]
   "TARGET_SSE2"
 {
@@ -732,6 +769,11 @@
     case MODE_V8SF:
     case MODE_V4SF:
       return "%vmovups\t{%1, %0|%0, %1}";
+    case MODE_XI:
+      if (<MODE>mode == V8DImode)
+	return "vmovdqu64\t{%1, %0|%0, %1}";
+      else
+	return "vmovdqu32\t{%1, %0|%0, %1}";
     default:
       return "%vmovdqu\t{%1, %0|%0, %1}";
     }
@@ -754,9 +796,10 @@
 	  ]
 	  (const_string "<sseinsnmode>")))])
 
-(define_insn "<sse2>_storedqu<avxsizesuffix>"
-  [(set (match_operand:VI1 0 "memory_operand" "=m")
-	(unspec:VI1 [(match_operand:VI1 1 "register_operand" "v")]
+(define_insn "<sse2_avx_avx512f>_storedqu<mode>"
+  [(set (match_operand:VI_UNALIGNED_LOADSTORE 0 "memory_operand" "=m")
+	(unspec:VI_UNALIGNED_LOADSTORE
+	  [(match_operand:VI_UNALIGNED_LOADSTORE 1 "register_operand" "v")]
 	  UNSPEC_STOREU))]
   "TARGET_SSE2"
 {
@@ -765,6 +808,11 @@
     case MODE_V8SF:
     case MODE_V4SF:
       return "%vmovups\t{%1, %0|%0, %1}";
+    case MODE_XI:
+      if (<MODE>mode == V8DImode)
+	return "vmovdqu64\t{%1, %0|%0, %1}";
+      else
+	return "vmovdqu32\t{%1, %0|%0, %1}";
     default:
       return "%vmovdqu\t{%1, %0|%0, %1}";
     }
@@ -821,7 +869,8 @@
 (define_insn "<sse>_movnt<mode>"
   [(set (match_operand:VF 0 "memory_operand" "=m")
-	(unspec:VF [(match_operand:VF 1 "register_operand" "x")]
+	(unspec:VF
+	  [(match_operand:VF 1 "register_operand" "v")]
 	  UNSPEC_MOVNT))]
   "TARGET_SSE"
   "%vmovnt<ssemodesuffix>\t{%1, %0|%0, %1}"
@@ -852,9 +901,9 @@
 (define_mode_iterator STORENT_MODE
   [(DI "TARGET_SSE2 && TARGET_64BIT") (SI "TARGET_SSE2")
    (SF "TARGET_SSE4A") (DF "TARGET_SSE4A")
-   (V4DI "TARGET_AVX") (V2DI "TARGET_SSE2")
-   (V8SF "TARGET_AVX") V4SF
-   (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")])
+   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX") (V2DI "TARGET_SSE2")
+   (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
+   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")])
 
 (define_expand "storent<mode>"
   [(set (match_operand:STORENT_MODE 0 "memory_operand")
@@ -877,10 +926,10 @@
   "ix86_expand_fp_absneg_operator (<CODE>, <MODE>mode, operands); DONE;")
 
 (define_insn_and_split "*absneg<mode>2"
-  [(set (match_operand:VF 0 "register_operand" "=x,x,x,x")
+  [(set (match_operand:VF 0 "register_operand" "=x,x,v,v")
 	(match_operator:VF 3 "absneg_operator"
-	  [(match_operand:VF 1 "nonimmediate_operand" "0, xm,x, m")]))
-   (use (match_operand:VF 2 "nonimmediate_operand" "xm,0, xm,x"))]
+	  [(match_operand:VF 1 "nonimmediate_operand" "0, xm, v, m")]))
+   (use (match_operand:VF 2 "nonimmediate_operand" "xm, 0, vm,v"))]
   "TARGET_SSE"
   "#"
   "&& reload_completed"
@@ -962,10 +1011,10 @@
   "ix86_fixup_binary_operands_no_copy (MULT, <MODE>mode, operands);")
 
 (define_insn "*mul<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
+  [(set (match_operand:VF 0 "register_operand" "=x,v")
 	(mult:VF
-	  (match_operand:VF 1 "nonimmediate_operand" "%0,x")
-	  (match_operand:VF 2 "nonimmediate_operand" "xm,xm")))]
+	  (match_operand:VF 1 "nonimmediate_operand" "%0,v")
+	  (match_operand:VF 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_SSE && ix86_binary_operator_ok (MULT, <MODE>mode, operands)"
   "@
    mul<ssemodesuffix>\t{%2, %0|%0, %2}
@@ -1239,10 +1288,10 @@
 ;; presence of -0.0 and NaN.
 
 (define_insn "*ieee_smin<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
+  [(set (match_operand:VF 0 "register_operand" "=v,v")
 	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")]
+	  [(match_operand:VF 1 "register_operand" "0,v")
+	   (match_operand:VF 2 "nonimmediate_operand" "vm,vm")]
 	  UNSPEC_IEEE_MIN))]
   "TARGET_SSE"
   "@
@@ -1254,10 +1303,10 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "*ieee_smax<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
+  [(set (match_operand:VF 0 "register_operand" "=v,v")
 	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")]
+	  [(match_operand:VF 1 "register_operand" "0,v")
+	   (match_operand:VF 2 "nonimmediate_operand" "vm,vm")]
 	  UNSPEC_IEEE_MAX))]
   "TARGET_SSE"
   "@
@@ -1632,10 +1681,10 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 (define_insn "avx_cmp<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x")
-	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm")
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x")
+	(unspec:VF_128_256
+	  [(match_operand:VF_128_256 1 "register_operand" "x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "xm")
 	   (match_operand:SI 3 "const_0_to_31_operand" "n")]
 	  UNSPEC_PCMP))]
   "TARGET_AVX"
@@ -1663,10 +1712,10 @@
    (set_attr "mode" "<ssescalarmode>")])
 
 (define_insn "*<sse>_maskcmp<mode>3_comm"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
-	(match_operator:VF 3 "sse_comparison_operator"
-	  [(match_operand:VF 1 "register_operand" "%0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")]))]
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
+	(match_operator:VF_128_256 3 "sse_comparison_operator"
+	  [(match_operand:VF_128_256 1 "register_operand" "%0,x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,xm")]))]
   "TARGET_SSE
    && GET_RTX_CLASS (GET_CODE (operands[3])) == RTX_COMM_COMPARE"
   "@
@@ -1679,10 +1728,10 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "<sse>_maskcmp<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
-	(match_operator:VF 3 "sse_comparison_operator"
-	  [(match_operand:VF 1 "register_operand" "0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")]))]
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
+	(match_operator:VF_128_256 3 "sse_comparison_operator"
+	  [(match_operand:VF_128_256 1 "register_operand" "0,x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,xm")]))]
   "TARGET_SSE"
   "@
    cmp%D3<ssemodesuffix>\t{%2, %0|%0, %2}
@@ -1792,11 +1841,11 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 (define_insn "<sse>_andnot<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
+  [(set (match_operand:VF 0 "register_operand" "=x,v")
 	(and:VF
 	  (not:VF
-	    (match_operand:VF 1 "register_operand" "0,x"))
-	  (match_operand:VF 2 "nonimmediate_operand" "xm,xm")))]
+	    (match_operand:VF 1 "register_operand" "0,v"))
+	  (match_operand:VF 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_SSE"
 {
   static char buf[32];
@@ -1825,12 +1874,19 @@
       gcc_unreachable ();
     }
 
+  /* There is no vandnp[sd].  Use vpandnq.  */
+  if (GET_MODE_SIZE (<MODE>mode) == 64)
+    {
+      suffix = "q";
+      ops = "vpandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
+    }
+
   snprintf (buf, sizeof (buf), ops, suffix);
   return buf;
 }
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sselog")
-   (set_attr "prefix" "orig,vex")
+   (set_attr "prefix" "orig,maybe_evex")
    (set (attr "mode")
 	(cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 		 (const_string "<ssePSmode>")
@@ -1842,13 +1898,21 @@
 	  (const_string "<MODE>")))])
 
 (define_expand "<code><mode>3"
-  [(set (match_operand:VF 0 "register_operand")
-	(any_logic:VF
-	  (match_operand:VF 1 "nonimmediate_operand")
-	  (match_operand:VF 2 "nonimmediate_operand")))]
+  [(set (match_operand:VF_128_256 0 "register_operand")
+	(any_logic:VF_128_256
+	  (match_operand:VF_128_256 1 "nonimmediate_operand")
+	  (match_operand:VF_128_256 2 "nonimmediate_operand")))]
   "TARGET_SSE"
   "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
 
+(define_expand "<code><mode>3"
+  [(set (match_operand:VF_512 0 "register_operand")
+	(fpint_logic:VF_512
+	  (match_operand:VF_512 1 "nonimmediate_operand")
+	  (match_operand:VF_512 2 "nonimmediate_operand")))]
+  "TARGET_AVX512F"
+  "ix86_fixup_binary_operands_no_copy (<CODE>, <MODE>mode, operands);")
+
 (define_insn "*<code><mode>3"
   [(set (match_operand:VF 0 "register_operand" "=x,v")
 	(any_logic:VF
@@ -1882,12 +1946,19 @@
       gcc_unreachable ();
     }
 
+  /* There is no v<logic>p[sd].  Use vp<logic>q.  */
+  if (GET_MODE_SIZE (<MODE>mode) == 64)
+    {
+      suffix = "q";
+      ops = "vp<logic>%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
+    }
+
   snprintf (buf, sizeof (buf), ops, suffix);
   return buf;
 }
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sselog")
-   (set_attr "prefix" "orig,vex")
+   (set_attr "prefix" "orig,maybe_evex")
    (set (attr "mode")
 	(cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 		 (const_string "<ssePSmode>")
@@ -2105,6 +2176,23 @@
 	  ]
 	  (const_string "TI")))])
 
+;; There are no floating point xor for V16SF and V8DF in avx512f
+;; but we need them for negation.  Instead we use int versions of
+;; xor.  Maybe there could be a better way to do that.
+
+(define_mode_attr avx512flogicsuff
+  [(V16SF "d") (V8DF "q")])
+
+(define_insn "avx512f_<logic><mode>"
+  [(set (match_operand:VF_512 0 "register_operand" "=v")
+	(fpint_logic:VF_512
+	  (match_operand:VF_512 1 "register_operand" "v")
+	  (match_operand:VF_512 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512F"
+  "vp<logic><avx512flogicsuff>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; FMA floating point multiply/accumulate instructions.  These include
@@ -7747,7 +7835,7 @@
 (define_insn "<sse>_movmsk<ssemodesuffix><avxsizesuffix>"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(unspec:SI
-	  [(match_operand:VF 1 "register_operand" "x")]
+	  [(match_operand:VF_128_256 1 "register_operand" "x")]
 	  UNSPEC_MOVMSK))]
   "TARGET_SSE"
   "%vmovmsk<ssemodesuffix>\t{%1, %0|%0, %1}"
@@ -8537,10 +8625,10 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 (define_insn "<sse4_1>_blend<ssemodesuffix><avxsizesuffix>"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
-	(vec_merge:VF
-	  (match_operand:VF 2 "nonimmediate_operand" "xm,xm")
-	  (match_operand:VF 1 "register_operand" "0,x")
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
+	(vec_merge:VF_128_256
+	  (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,xm")
+	  (match_operand:VF_128_256 1 "register_operand" "0,x")
 	  (match_operand:SI 3 "const_0_to_<blendbits>_operand")))]
   "TARGET_SSE4_1"
   "@
@@ -8555,11 +8643,11 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
-	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")
-	   (match_operand:VF 3 "register_operand" "Yz,x")]
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
+	(unspec:VF_128_256
+	  [(match_operand:VF_128_256 1 "register_operand" "0,x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,xm")
+	   (match_operand:VF_128_256 3 "register_operand" "Yz,x")]
 	  UNSPEC_BLENDV))]
   "TARGET_SSE4_1"
   "@
@@ -8575,10 +8663,10 @@
    (set_attr "mode" "<MODE>")])
 
 (define_insn "<sse4_1>_dp<ssemodesuffix><avxsizesuffix>"
-  [(set (match_operand:VF 0 "register_operand" "=x,x")
-	(unspec:VF
-	  [(match_operand:VF 1 "nonimmediate_operand" "%0,x")
-	   (match_operand:VF 2 "nonimmediate_operand" "xm,xm")
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
+	(unspec:VF_128_256
+	  [(match_operand:VF_128_256 1 "nonimmediate_operand" "%0,x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "xm,xm")
 	   (match_operand:SI 3 "const_0_to_255_operand" "n,n")]
 	  UNSPEC_DP))]
   "TARGET_SSE4_1"
@@ -8909,8 +8997,8 @@
 ;; setting FLAGS_REG. But it is not a really compare instruction.
 (define_insn "avx_vtest<ssemodesuffix><avxsizesuffix>"
   [(set (reg:CC FLAGS_REG)
-	(unspec:CC [(match_operand:VF 0 "register_operand" "x")
-		    (match_operand:VF 1 "nonimmediate_operand" "xm")]
+	(unspec:CC [(match_operand:VF_128_256 0 "register_operand" "x")
+		    (match_operand:VF_128_256 1 "nonimmediate_operand" "xm")]
 		   UNSPEC_VTESTP))]
   "TARGET_AVX"
   "vtest<ssemodesuffix>\t{%1, %0|%0, %1}"
@@ -8947,9 +9035,9 @@
    (set_attr "mode" "TI")])
 
 (define_insn "<sse4_1>_round<ssemodesuffix><avxsizesuffix>"
-  [(set (match_operand:VF 0 "register_operand" "=x")
-	(unspec:VF
-	  [(match_operand:VF 1 "nonimmediate_operand" "xm")
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x")
+	(unspec:VF_128_256
+	  [(match_operand:VF_128_256 1 "nonimmediate_operand" "xm")
 	   (match_operand:SI 2 "const_0_to_15_operand" "n")]
 	  UNSPEC_ROUND))]
   "TARGET_ROUND"
@@ -10341,10 +10429,10 @@
    (set_attr "mode" "TI")])
 
 (define_insn "xop_vpermil2<mode>3"
-  [(set (match_operand:VF 0 "register_operand" "=x")
-	(unspec:VF
-	  [(match_operand:VF 1 "register_operand" "x")
-	   (match_operand:VF 2 "nonimmediate_operand" "%x")
+  [(set (match_operand:VF_128_256 0 "register_operand" "=x")
+	(unspec:VF_128_256
+	  [(match_operand:VF_128_256 1 "register_operand" "x")
+	   (match_operand:VF_128_256 2 "nonimmediate_operand" "%x")
 	   (match_operand:<sseintvecmode> 3 "nonimmediate_operand" "xm")
 	   (match_operand:SI 4 "const_0_to_3_operand" "n")]
 	  UNSPEC_VPERMIL2))]
@@ -10794,7 +10882,7 @@
     = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (<ssescalarnum>, perm));
 })
 
-(define_insn "*avx_vpermilp<mode>"
+(define_insn "*<sse2_avx_avx512f>_vpermilp<mode>"
   [(set (match_operand:VF 0 "register_operand" "=v")
 	(vec_select:VF
 	  (match_operand:VF 1 "nonimmediate_operand" "vm")
@@ -10811,9 +10899,9 @@
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
-   (set_attr "mode" "<MODE>")])
+   (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "avx_vpermilvar<mode>3"
+(define_insn "<sse2_avx_avx512f>_vpermilvar<mode>3"
   [(set (match_operand:VF 0 "register_operand" "=v")
 	(unspec:VF
 	  [(match_operand:VF 1 "register_operand" "v")
@@ -10823,9 +10911,10 @@
   "vpermil<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
-   (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
-   (set_attr "mode" "<MODE>")])
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "<sseinsnmode>")])
 
 (define_expand "avx_vperm2f128<mode>3"
   [(set (match_operand:AVX256MODE2P 0 "register_operand")