Commit Graph

188224 Commits

Author SHA1 Message Date
H.J. Lu
48b3caffca x86: Add TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEPENDENCY
1. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY in SSE FP to FP splitters.
2. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY in SSE INT to FP splitters.
3. Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY when handling avx_partial_xmm_update
attribute.  Don't convert AVX partial XMM register update if there is no
partial SSE register dependency for SSE conversion.
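
For orientation, the scalar conversions these tuning flags govern come from ordinary source code such as the sketch below. The function names are illustrative and not part of the patch. Which instructions GCC emits, and whether a register-clearing instruction such as vxorps precedes them, depends on the target and tuning options, so nothing about the generated code is asserted here.

```cpp
#include <cassert>

// Scalar INT to FP conversion. On x86 with SSE this is typically one
// conversion instruction that writes only the low element of the
// destination XMM register (a partial register update).
float int_to_float(int x) { return static_cast<float>(x); }

// Scalar FP to FP conversion (double to float), likewise a partial update
// of the destination register on SSE targets.
float double_to_float(double x) { return static_cast<float>(x); }

// A loop of conversions: the kind of code where a false dependency on the
// previous contents of the destination register could serialize iterations.
float sum_as_float(const int* values, int count) {
  float total = 0.0f;
  for (int i = 0; i < count; ++i) total += int_to_float(values[i]);
  return total;
}
```

Comparing the assembly from `-S` under different `-mtune=` values is one way to see whether the conversion is preceded by a register-clearing instruction on a given compiler build.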

gcc/

	* config/i386/i386-features.c (remove_partial_avx_dependency):
	Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
	TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY before generating
	vxorps.
	* config/i386/i386.h (TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
	New.
	(TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
	* config/i386/i386.md (SSE FP to FP splitters): Replace
	TARGET_SSE_PARTIAL_REG_DEPENDENCY with
	TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY.
	(SSE INT to FP splitter): Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY
	with TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY.
	* config/i386/x86-tune.def
	(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): New.
	(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.

gcc/testsuite/

	* gcc.target/i386/avx-covert-1.c: New file.
	* gcc.target/i386/avx-fp-covert-1.c: Likewise.
	* gcc.target/i386/avx-int-covert-1.c: Likewise.
	* gcc.target/i386/sse-covert-1.c: Likewise.
	* gcc.target/i386/sse-fp-covert-1.c: Likewise.
	* gcc.target/i386/sse-int-covert-1.c: Likewise.
2021-09-17 16:18:15 +08:00
H.J. Lu
16cca1806d x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS
Check TARGET_USE_VECTOR_FP_CONVERTS or TARGET_USE_VECTOR_CONVERTS when
handling avx_partial_xmm_update attribute.  Don't convert AVX partial
XMM register update if vector packed SSE conversion should be used.

gcc/

	PR target/101900
	* config/i386/i386-features.c (remove_partial_avx_dependency):
	Check TARGET_USE_VECTOR_FP_CONVERTS and TARGET_USE_VECTOR_CONVERTS
	before generating vxorps.

gcc/testsuite/

	PR target/101900
	* gcc.target/i386/pr101900-1.c: New test.
	* gcc.target/i386/pr101900-2.c: Likewise.
	* gcc.target/i386/pr101900-3.c: Likewise.
2021-09-17 16:17:57 +08:00
H.J. Lu
c3a2437fec x86: Update memcpy/memset inline strategies for -mtune=tremont
Simplify memcpy and memset inline strategies to avoid branches for
-mtune=tremont:

1. Create Tremont cost model from generic cost model.
2. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
   load and store for up to 16 * 16 (256) bytes when the data size is
   fixed and known.
3. Inline only if data size is known to be <= 256.
   a. Use "rep movsb/stosb" with simple code sequence if the data size
      is a constant.
   b. Use loop if data size is not a constant.
4. Use memcpy/memset library function if data size is unknown or > 256.
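
As a rough illustration of the size categories in the list above, the following sketch shows three call shapes: a compile-time constant size, a variable size with a known small upper bound, and a size the compiler cannot bound. The type and function names are hypothetical, and the mapping to list items is an assumption; the code GCC actually generates depends on the compiler version and options.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Packet { unsigned char bytes[64]; };

// Size is a compile-time constant no larger than 256 bytes: a candidate
// for inline expansion under the strategy described above.
void copy_fixed(Packet& dst, const Packet& src) {
  std::memcpy(&dst, &src, sizeof(Packet));
}

// Size is not a constant, but its type limits it to at most 255 bytes,
// so an upper bound below 256 is visible to the compiler.
void clear_bounded(unsigned char* dst, unsigned char count) {
  std::memset(dst, 0, count);
}

// Size is unknown at compile time: the case where the list says the
// memcpy library function is used.
void copy_unknown(void* dst, const void* src, std::size_t count) {
  std::memcpy(dst, src, count);
}
```

All three functions behave identically to plain memcpy/memset calls; only the expansion chosen by the compiler is expected to differ.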

	* config/i386/i386-options.c (processor_cost_table): Use
	tremont_cost for Tremont.
	* config/i386/x86-tune-costs.h (tremont_memcpy): New.
	(tremont_memset): Likewise.
	(tremont_cost): Likewise.
	* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
	Enable for Tremont.
2021-09-17 16:17:00 +08:00
H.J. Lu
61b03ade93 x86: Update -mtune=tremont
Initial -mtune=tremont update

1. Use Haswell scheduling model.
2. Assume that the stack engine allows push and pop instructions to
execute in parallel.
3. Prepare for scheduling pass as -mtune=generic.
4. Use the same issue rate as -mtune=generic.
5. Enable partial_reg_dependency.
6. Disable accumulate_outgoing_args
7. Enable use_leave
8. Enable push_memory
9. Disable four_jump_limit
10. Disable opt_agu
11. Disable avoid_lea_for_addr
12. Disable avoid_mem_opnd_for_cmove
13. Enable misaligned_move_string_pro_epilogues
14. Enable use_cltd
16. Enable avoid_false_dep_for_bmi
17. Enable avoid_mfence
18. Disable expand_abs
19. Enable sse_typeless_stores
20. Enable sse_load0_by_pxor
21. Disable split_mem_opnd_for_fp_converts
22. Disable slow_pshufb
23. Enable partial_reg_dependency

This is the first patch to tune for Tremont.  With all patches applied,
performance impacts on SPEC CPU 2017 are:

500.perlbench_r         1.81%
502.gcc_r               0.57%
505.mcf_r               1.16%
520.omnetpp_r           0.00%
523.xalancbmk_r         0.00%
525.x264_r              4.55%
531.deepsjeng_r         0.00%
541.leela_r             0.39%
548.exchange2_r         1.13%
557.xz_r                0.00%
geomean for intrate     0.95%
503.bwaves_r            0.00%
507.cactuBSSN_r         6.94%
508.namd_r              12.37%
510.parest_r            1.01%
511.povray_r            3.70%
519.lbm_r               36.61%
521.wrf_r               8.79%
526.blender_r           2.91%
527.cam4_r              6.23%
538.imagick_r           0.28%
544.nab_r               21.99%
549.fotonik3d_r         3.63%
554.roms_r              -1.20%
geomean for fprate      7.50%
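
The two geomean lines can be cross-checked from the per-benchmark figures. The sketch below assumes each percentage is a relative change and that the geomean is the geometric mean of the corresponding ratios (1 + p/100); the commit message does not state the exact method, and the helper name is illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of relative changes given in percent, returned in percent.
// Computed in log space: exp(mean(log(1 + p/100))) - 1.
double geomean_percent(const std::vector<double>& percent_changes) {
  double log_sum = 0.0;
  for (double p : percent_changes) log_sum += std::log(1.0 + p / 100.0);
  const double count = static_cast<double>(percent_changes.size());
  return (std::exp(log_sum / count) - 1.0) * 100.0;
}

// Per-benchmark figures copied from the table above.
const std::vector<double> intrate = {1.81, 0.57, 1.16, 0.00, 0.00,
                                     4.55, 0.00, 0.39, 1.13, 0.00};
const std::vector<double> fprate = {0.00, 6.94, 12.37, 1.01, 3.70,
                                    36.61, 8.79, 2.91, 6.23, 0.28,
                                    21.99, 3.63, -1.20};
```

Under that assumption, `geomean_percent(intrate)` and `geomean_percent(fprate)` come out close to the listed 0.95% and 7.50%.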

gcc/ChangeLog

	* common/config/i386/i386-common.c: Use Haswell scheduling model
	for Tremont.
	* config/i386/i386.c (ix86_sched_init_global): Prepare for Tremont
	scheduling pass.
	* config/i386/x86-tune-sched.c (ix86_issue_rate): Change Tremont
	issue rate to 4.
	(ix86_adjust_cost): Handle Tremont.
	* config/i386/x86-tune.def (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY):
	Enable for Tremont.
	(X86_TUNE_USE_LEAVE): Likewise.
	(X86_TUNE_PUSH_MEMORY): Likewise.
	(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
	(X86_TUNE_USE_CLTD): Likewise.
	(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
	(X86_TUNE_AVOID_MFENCE): Likewise.
	(X86_TUNE_SSE_TYPELESS_STORES): Likewise.
	(X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
	(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Disable for Tremont.
	(X86_TUNE_FOUR_JUMP_LIMIT): Likewise.
	(X86_TUNE_OPT_AGU): Likewise.
	(X86_TUNE_AVOID_LEA_FOR_ADDR): Likewise.
	(X86_TUNE_AVOID_MEM_OPND_FOR_CMOVE): Likewise.
	(X86_TUNE_EXPAND_ABS): Likewise.
	(X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS): Likewise.
	(X86_TUNE_SLOW_PSHUFB): Likewise.
2021-09-17 16:17:00 +08:00
Eric Botcazou
687e30d9d7 Fix PR rtl-optimization/102306
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner.  There is already substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here.  So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
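
To make the constraint concrete, here is a minimal sketch of source code with a single volatile read whose value is used twice. It is not the test case from the PR; the variable and function names are hypothetical. The language requires that the compiled code performs the volatile access exactly once, so a transformation that duplicates the load into both uses would be invalid.

```cpp
#include <cassert>

// Hypothetical stand-in for a memory-mapped status register.
volatile unsigned status_reg = 0x5u;

// Performs exactly one volatile read, then uses the loaded value twice.
// An optimizer may not turn this into two separate reads of *reg.
unsigned split_status(volatile unsigned* reg) {
  const unsigned value = *reg;     // the only volatile access
  const unsigned flag = value & 1u;
  const unsigned rest = value >> 1;
  return flag + rest;
}
```

Counting the loads from `status_reg` in the generated assembly is the usual way to check that the access was not duplicated.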

gcc/
	PR rtl-optimization/102306
	* combine.c (try_combine): Abort the combination if we are about to
	duplicate volatile references.

gcc/testsuite/
	* gcc.target/sparc/20210917-1.c: New test.
2021-09-17 10:15:38 +02:00
liuhongt
a5873aadb6 AVX512FP16: Add intrinsics for casting between vector float16 and vector float32/float64/integer.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_undefined_ph):
	New intrinsic.
	(_mm256_undefined_ph): Likewise.
	(_mm512_undefined_ph): Likewise.
	(_mm_cvtsh_h): Likewise.
	(_mm256_cvtsh_h): Likewise.
	(_mm512_cvtsh_h): Likewise.
	(_mm512_castph_ps): Likewise.
	(_mm512_castph_pd): Likewise.
	(_mm512_castph_si512): Likewise.
	(_mm512_castph512_ph128): Likewise.
	(_mm512_castph512_ph256): Likewise.
	(_mm512_castph128_ph512): Likewise.
	(_mm512_castph256_ph512): Likewise.
	(_mm512_zextph128_ph512): Likewise.
	(_mm512_zextph256_ph512): Likewise.
	(_mm512_castps_ph): Likewise.
	(_mm512_castpd_ph): Likewise.
	(_mm512_castsi512_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_castph_ps):
	New intrinsic.
	(_mm256_castph_ps): Likewise.
	(_mm_castph_pd): Likewise.
	(_mm256_castph_pd): Likewise.
	(_mm_castph_si128): Likewise.
	(_mm256_castph_si256): Likewise.
	(_mm_castps_ph): Likewise.
	(_mm256_castps_ph): Likewise.
	(_mm_castpd_ph): Likewise.
	(_mm256_castpd_ph): Likewise.
	(_mm_castsi128_ph): Likewise.
	(_mm256_castsi256_ph): Likewise.
	(_mm256_castph256_ph128): Likewise.
	(_mm256_castph128_ph256): Likewise.
	(_mm256_zextph128_ph256): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-typecast-1.c: New test.
	* gcc.target/i386/avx512fp16-typecast-2.c: Ditto.
	* gcc.target/i386/avx512fp16vl-typecast-1.c: Ditto.
	* gcc.target/i386/avx512fp16vl-typecast-2.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
1ef291e68f AVX512FP16: Add testcase for vcvtsh2sd/vcvtsh2ss/vcvtsd2sh/vcvtss2sh.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vcvtsd2sh-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvtsd2sh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2sd-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2sd-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2ss-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2ss-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtss2sh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtss2sh-1b.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
90429b962e AVX512FP16: Add vcvtsh2ss/vcvtsh2sd/vcvtss2sh/vcvtsd2sh.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_cvtsh_ss):
	New intrinsic.
	(_mm_mask_cvtsh_ss): Likewise.
	(_mm_maskz_cvtsh_ss): Likewise.
	(_mm_cvtsh_sd): Likewise.
	(_mm_mask_cvtsh_sd): Likewise.
	(_mm_maskz_cvtsh_sd): Likewise.
	(_mm_cvt_roundsh_ss): Likewise.
	(_mm_mask_cvt_roundsh_ss): Likewise.
	(_mm_maskz_cvt_roundsh_ss): Likewise.
	(_mm_cvt_roundsh_sd): Likewise.
	(_mm_mask_cvt_roundsh_sd): Likewise.
	(_mm_maskz_cvt_roundsh_sd): Likewise.
	(_mm_cvtss_sh): Likewise.
	(_mm_mask_cvtss_sh): Likewise.
	(_mm_maskz_cvtss_sh): Likewise.
	(_mm_cvtsd_sh): Likewise.
	(_mm_mask_cvtsd_sh): Likewise.
	(_mm_maskz_cvtsd_sh): Likewise.
	(_mm_cvt_roundss_sh): Likewise.
	(_mm_mask_cvt_roundss_sh): Likewise.
	(_mm_maskz_cvt_roundss_sh): Likewise.
	(_mm_cvt_roundsd_sh): Likewise.
	(_mm_mask_cvt_roundsd_sh): Likewise.
	(_mm_maskz_cvt_roundsd_sh): Likewise.
	* config/i386/i386-builtin-types.def
	(V8HF_FTYPE_V2DF_V8HF_V8HF_UQI_INT,
	V8HF_FTYPE_V4SF_V8HF_V8HF_UQI_INT,
	V2DF_FTYPE_V8HF_V2DF_V2DF_UQI_INT,
	V4SF_FTYPE_V8HF_V4SF_V4SF_UQI_INT): Add new builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c: Handle new builtin types.
	* config/i386/sse.md (VF48_128): New mode iterator.
	(avx512fp16_vcvtsh2<ssescalarmodesuffix><mask_scalar_name><round_saeonly_scalar_name>):
	New.
	(avx512fp16_vcvt<ssescalarmodesuffix>2sh<mask_scalar_name><round_scalar_name>):
	Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
23fe603b4b AVX512FP16: Add testcase for vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h (V512): Add DF contents.
	(src3f): New.
	* gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvtpd2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2pd-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2psx-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtps2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtpd2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2pd-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2pd-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2psx-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtps2ph-1b.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
5a744e5056 AVX512FP16: Add vcvtph2pd/vcvtph2psx/vcvtpd2ph/vcvtps2phx.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_cvtph_pd):
	New intrinsic.
	(_mm512_mask_cvtph_pd): Likewise.
	(_mm512_maskz_cvtph_pd): Likewise.
	(_mm512_cvt_roundph_pd): Likewise.
	(_mm512_mask_cvt_roundph_pd): Likewise.
	(_mm512_maskz_cvt_roundph_pd): Likewise.
	(_mm512_cvtxph_ps): Likewise.
	(_mm512_mask_cvtxph_ps): Likewise.
	(_mm512_maskz_cvtxph_ps): Likewise.
	(_mm512_cvtx_roundph_ps): Likewise.
	(_mm512_mask_cvtx_roundph_ps): Likewise.
	(_mm512_maskz_cvtx_roundph_ps): Likewise.
	(_mm512_cvtxps_ph): Likewise.
	(_mm512_mask_cvtxps_ph): Likewise.
	(_mm512_maskz_cvtxps_ph): Likewise.
	(_mm512_cvtx_roundps_ph): Likewise.
	(_mm512_mask_cvtx_roundps_ph): Likewise.
	(_mm512_maskz_cvtx_roundps_ph): Likewise.
	(_mm512_cvtpd_ph): Likewise.
	(_mm512_mask_cvtpd_ph): Likewise.
	(_mm512_maskz_cvtpd_ph): Likewise.
	(_mm512_cvt_roundpd_ph): Likewise.
	(_mm512_mask_cvt_roundpd_ph): Likewise.
	(_mm512_maskz_cvt_roundpd_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_cvtph_pd):
	New intrinsic.
	(_mm_mask_cvtph_pd): Likewise.
	(_mm_maskz_cvtph_pd): Likewise.
	(_mm256_cvtph_pd): Likewise.
	(_mm256_mask_cvtph_pd): Likewise.
	(_mm256_maskz_cvtph_pd): Likewise.
	(_mm_cvtxph_ps): Likewise.
	(_mm_mask_cvtxph_ps): Likewise.
	(_mm_maskz_cvtxph_ps): Likewise.
	(_mm256_cvtxph_ps): Likewise.
	(_mm256_mask_cvtxph_ps): Likewise.
	(_mm256_maskz_cvtxph_ps): Likewise.
	(_mm_cvtxps_ph): Likewise.
	(_mm_mask_cvtxps_ph): Likewise.
	(_mm_maskz_cvtxps_ph): Likewise.
	(_mm256_cvtxps_ph): Likewise.
	(_mm256_mask_cvtxps_ph): Likewise.
	(_mm256_maskz_cvtxps_ph): Likewise.
	(_mm_cvtpd_ph): Likewise.
	(_mm_mask_cvtpd_ph): Likewise.
	(_mm_maskz_cvtpd_ph): Likewise.
	(_mm256_cvtpd_ph): Likewise.
	(_mm256_mask_cvtpd_ph): Likewise.
	(_mm256_maskz_cvtpd_ph): Likewise.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-expand.c: Handle new builtin types.
	* config/i386/sse.md
	(VF4_128_8_256): New.
	(VF48H_AVX512VL): Ditto.
	(ssePHmode): Add HF vector modes.
	(castmode): Add new convertible modes.
	(qq2phsuff): Ditto.
	(ph2pssuffix): New.
	(avx512fp16_vcvt<castmode>2ph_<mode><mask_name><round_name>): Ditto.
	(avx512fp16_vcvt<castmode>2ph_<mode>): Ditto.
	(*avx512fp16_vcvt<castmode>2ph_<mode>): Ditto.
	(avx512fp16_vcvt<castmode>2ph_<mode>_mask): Ditto.
	(*avx512fp16_vcvt<castmode>2ph_<mode>_mask): Ditto.
	(*avx512fp16_vcvt<castmode>2ph_<mode>_mask_1): Ditto.
	(avx512fp16_float_extend_ph<mode>2<mask_name><round_saeonly_name>):
	Ditto.
	(avx512fp16_float_extend_ph<mode>2<mask_name>): Ditto.
	(*avx512fp16_float_extend_ph<mode>2_load<mask_name>): Ditto.
	(avx512fp16_float_extend_phv2df2<mask_name>): Ditto.
	(*avx512fp16_float_extend_phv2df2_load<mask_name>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
6babedbbae AVX512FP16: Add vcvttsh2si/vcvttsh2usi.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_cvttsh_i32):
	New intrinsic.
	(_mm_cvttsh_u32): Likewise.
	(_mm_cvtt_roundsh_i32): Likewise.
	(_mm_cvtt_roundsh_u32): Likewise.
	(_mm_cvttsh_i64): Likewise.
	(_mm_cvttsh_u64): Likewise.
	(_mm_cvtt_roundsh_i64): Likewise.
	(_mm_cvtt_roundsh_u64): Likewise.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/sse.md
	(avx512fp16_fix<fixunssuffix>_trunc<mode>2<round_saeonly_name>):
	New.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vcvttsh2si-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvttsh2si-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2si64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2si64-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2usi-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2usi-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2usi64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttsh2usi64-1b.c: Ditto.
	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-17 16:04:29 +08:00
liuhongt
8691efe400 AVX512FP16: Add testcase for vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2udq/vcvttph2qq/vcvttph2uqq.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvttph2dq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2qq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2udq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2uqq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2uw-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvttph2w-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2dq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2qq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2qq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2udq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2uqq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2uqq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2uw-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvttph2w-1b.c: Ditto.
2021-09-17 16:04:28 +08:00
liuhongt
c027accb42 AVX512FP16: Add vcvttph2w/vcvttph2uw/vcvttph2dq/vcvttph2qq/vcvttph2udq/vcvttph2uqq
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_cvttph_epi32):
	New intrinsic.
	(_mm512_mask_cvttph_epi32): Likewise.
	(_mm512_maskz_cvttph_epi32): Likewise.
	(_mm512_cvtt_roundph_epi32): Likewise.
	(_mm512_mask_cvtt_roundph_epi32): Likewise.
	(_mm512_maskz_cvtt_roundph_epi32): Likewise.
	(_mm512_cvttph_epu32): Likewise.
	(_mm512_mask_cvttph_epu32): Likewise.
	(_mm512_maskz_cvttph_epu32): Likewise.
	(_mm512_cvtt_roundph_epu32): Likewise.
	(_mm512_mask_cvtt_roundph_epu32): Likewise.
	(_mm512_maskz_cvtt_roundph_epu32): Likewise.
	(_mm512_cvttph_epi64): Likewise.
	(_mm512_mask_cvttph_epi64): Likewise.
	(_mm512_maskz_cvttph_epi64): Likewise.
	(_mm512_cvtt_roundph_epi64): Likewise.
	(_mm512_mask_cvtt_roundph_epi64): Likewise.
	(_mm512_maskz_cvtt_roundph_epi64): Likewise.
	(_mm512_cvttph_epu64): Likewise.
	(_mm512_mask_cvttph_epu64): Likewise.
	(_mm512_maskz_cvttph_epu64): Likewise.
	(_mm512_cvtt_roundph_epu64): Likewise.
	(_mm512_mask_cvtt_roundph_epu64): Likewise.
	(_mm512_maskz_cvtt_roundph_epu64): Likewise.
	(_mm512_cvttph_epi16): Likewise.
	(_mm512_mask_cvttph_epi16): Likewise.
	(_mm512_maskz_cvttph_epi16): Likewise.
	(_mm512_cvtt_roundph_epi16): Likewise.
	(_mm512_mask_cvtt_roundph_epi16): Likewise.
	(_mm512_maskz_cvtt_roundph_epi16): Likewise.
	(_mm512_cvttph_epu16): Likewise.
	(_mm512_mask_cvttph_epu16): Likewise.
	(_mm512_maskz_cvttph_epu16): Likewise.
	(_mm512_cvtt_roundph_epu16): Likewise.
	(_mm512_mask_cvtt_roundph_epu16): Likewise.
	(_mm512_maskz_cvtt_roundph_epu16): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_cvttph_epi32):
	New intrinsic.
	(_mm_mask_cvttph_epi32): Likewise.
	(_mm_maskz_cvttph_epi32): Likewise.
	(_mm256_cvttph_epi32): Likewise.
	(_mm256_mask_cvttph_epi32): Likewise.
	(_mm256_maskz_cvttph_epi32): Likewise.
	(_mm_cvttph_epu32): Likewise.
	(_mm_mask_cvttph_epu32): Likewise.
	(_mm_maskz_cvttph_epu32): Likewise.
	(_mm256_cvttph_epu32): Likewise.
	(_mm256_mask_cvttph_epu32): Likewise.
	(_mm256_maskz_cvttph_epu32): Likewise.
	(_mm_cvttph_epi64): Likewise.
	(_mm_mask_cvttph_epi64): Likewise.
	(_mm_maskz_cvttph_epi64): Likewise.
	(_mm256_cvttph_epi64): Likewise.
	(_mm256_mask_cvttph_epi64): Likewise.
	(_mm256_maskz_cvttph_epi64): Likewise.
	(_mm_cvttph_epu64): Likewise.
	(_mm_mask_cvttph_epu64): Likewise.
	(_mm_maskz_cvttph_epu64): Likewise.
	(_mm256_cvttph_epu64): Likewise.
	(_mm256_mask_cvttph_epu64): Likewise.
	(_mm256_maskz_cvttph_epu64): Likewise.
	(_mm_cvttph_epi16): Likewise.
	(_mm_mask_cvttph_epi16): Likewise.
	(_mm_maskz_cvttph_epi16): Likewise.
	(_mm256_cvttph_epi16): Likewise.
	(_mm256_mask_cvttph_epi16): Likewise.
	(_mm256_maskz_cvttph_epi16): Likewise.
	(_mm_cvttph_epu16): Likewise.
	(_mm_mask_cvttph_epu16): Likewise.
	(_mm_maskz_cvttph_epu16): Likewise.
	(_mm256_cvttph_epu16): Likewise.
	(_mm256_mask_cvttph_epu16): Likewise.
	(_mm256_maskz_cvttph_epu16): Likewise.
	* config/i386/i386-builtin.def: Add new builtins.
	* config/i386/sse.md
	(avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name><round_saeonly_name>):
	New.
	(avx512fp16_fix<fixunssuffix>_trunc<mode>2<mask_name>): Ditto.
	(*avx512fp16_fix<fixunssuffix>_trunc<mode>2_load<mask_name>): Ditto.
	(avx512fp16_fix<fixunssuffix>_truncv2di2<mask_name>): Ditto.
	(avx512fp16_fix<fixunssuffix>_truncv2di2_load<mask_name>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-17 16:04:28 +08:00
liuhongt
babaa0e521 AVX512FP16: Add testcase for vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h (V512): Add int32
	component.
	* gcc.target/i386/avx512fp16-vcvtsh2si-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvtsh2si-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2si64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2si64-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2usi-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2usi-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2usi64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsh2usi64-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsi2sh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsi2sh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsi2sh64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtsi2sh64-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtusi2sh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtusi2sh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtusi2sh64-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtusi2sh64-1b.c: Ditto.
2021-09-17 16:04:28 +08:00
liuhongt
3069a2e599 AVX512FP16: Add vcvtsh2si/vcvtsh2usi/vcvtsi2sh/vcvtusi2sh.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_cvtsh_i32): New intrinsic.
	(_mm_cvtsh_u32): Likewise.
	(_mm_cvt_roundsh_i32): Likewise.
	(_mm_cvt_roundsh_u32): Likewise.
	(_mm_cvtsh_i64): Likewise.
	(_mm_cvtsh_u64): Likewise.
	(_mm_cvt_roundsh_i64): Likewise.
	(_mm_cvt_roundsh_u64): Likewise.
	(_mm_cvti32_sh): Likewise.
	(_mm_cvtu32_sh): Likewise.
	(_mm_cvt_roundi32_sh): Likewise.
	(_mm_cvt_roundu32_sh): Likewise.
	(_mm_cvti64_sh): Likewise.
	(_mm_cvtu64_sh): Likewise.
	(_mm_cvt_roundi64_sh): Likewise.
	(_mm_cvt_roundu64_sh): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c (ix86_expand_round_builtin):
	Handle new builtin types.
	* config/i386/sse.md
	(avx512fp16_vcvtsh2<sseintconvertsignprefix>si<rex64namesuffix><round_name>):
	New define_insn.
	(avx512fp16_vcvtsh2<sseintconvertsignprefix>si<rex64namesuffix>_2): Likewise.
	(avx512fp16_vcvt<floatsuffix>si2sh<rex64namesuffix><round_name>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-17 16:04:28 +08:00
GCC Administrator
e19570d38f Daily bump.
2021-09-17 00:16:25 +00:00
Ian Lance Taylor
54866f7a81 libgo: update to go1.17.1 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/350414
2021-09-16 16:48:19 -07:00
Maxim Blinov
745781d24c analyzer: Fix bootstrap with clang
gcc/analyzer/ChangeLog:
	PR bootstrap/102242
	* engine.cc (INCLUDE_UNIQUE_PTR): Define.
2021-09-17 00:36:25 +02:00
Jonathan Wakely
fce4e12f8e libstdc++: Regenerate the src/debug Makefiles as needed
When the build configuration changes and Makefiles are recreated, the
src/debug/Makefile and src/debug/*/Makefile files are not recreated,
because they're not managed in the usual way by automake. This can lead
to build failures or surprising inconsistencies between the main and
debug versions of the library when doing incremental builds.

This causes them to be regenerated if any of the corresponding non-debug
makefiles is newer.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* src/Makefile.am (stamp-debug): Add all Makefiles as
	prerequisites.
	* src/Makefile.in: Regenerate.
2021-09-16 23:06:38 +01:00
Jonathan Wakely
4337893306 libstdc++: Increase timeout factor for slow pb_ds tests
Compiling these tests still times out too often when running the
testsuite with more parallel jobs than there are available cores.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* testsuite/ext/pb_ds/regression/tree_map_rand.cc: Increase
	timeout factor to 3.
	* testsuite/ext/pb_ds/regression/tree_set_rand.cc: Likewise.
2021-09-16 23:06:38 +01:00
Jonathan Wakely
bd0df30a7b libstdc++: Update documentation that only refers to c++98 and c++11
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* doc/xml/manual/using.xml: Generalize to apply to more than
	just -std=c++11.
	* doc/html/manual/using_macros.html: Regenerate.
2021-09-16 23:06:38 +01:00
Jonathan Wakely
cbe705a2f7 libstdc++: Add noexcept to std::nullopt_t constructor
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/std/optional (nullopt_t): Make constructor noexcept.
2021-09-16 23:06:38 +01:00
Jonathan Wakely
21c760510d libstdc++: Remove non-deducible parameter for std::advance overload
This was just a copy and paste error.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/fs_path.h (advance): Remove non-deducible
	template parameter.
2021-09-16 23:06:37 +01:00
Jonathan Wakely
734b2c2eed libstdc++: Add missing 'constexpr' to std::tuple [PR102270]
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	PR libstdc++/102270
	* include/std/tuple (_Head_base, _Tuple_impl): Add
	_GLIBCXX20_CONSTEXPR to allocator-extended constructors.
	(tuple<>::swap(tuple&)): Add _GLIBCXX20_CONSTEXPR.
	* testsuite/20_util/tuple/cons/102270.C: New test.
2021-09-16 23:06:31 +01:00
Jonathan Wakely
e67917f5df libstdc++: Add missing constraint to std::span deduction guide [PR102280]
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	PR libstdc++/102280
	* include/std/span (span(Range&&)): Add constraint to deduction
	guide.
2021-09-16 22:59:47 +01:00
Jonathan Wakely
2c351dafcb libstdc++: Fix recipes for C++11-compiled files in src/c++98
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* src/c++98/Makefile.am: Use CXXCOMPILE not LTCXXCOMPILE.
	* src/c++98/Makefile.in: Regenerate.
2021-09-16 22:59:47 +01:00
Jonathan Wakely
9d813ddd97 libstdc++: Add noexcept to std::to_string overloads that don't allocate
When the value is guaranteed to fit in the SSO buffer we know the
string won't allocate, so the function can be noexcept. For 32-bit
integers, we know they need no more than 10 bytes (or 11 with a minus
sign) and the SSO buffer is 15 bytes.
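
A quick check of the digit count, as a sketch: the largest 32-bit values print as 10 decimal digits, so a signed value needs at most 11 characters including the minus sign, which fits in a 15-byte buffer. The helper name and the `kSsoCapacity` constant are illustrative; the 15-byte figure is taken from the text above rather than queried from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Upper bound on the decimal text length of an integer type:
// digits10 + 1 digits, plus one character for a sign if the type is signed.
template <typename T>
constexpr std::size_t max_decimal_chars() {
  return static_cast<std::size_t>(std::numeric_limits<T>::digits10) + 1 +
         (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// SSO capacity in bytes as stated in the commit message (an assumption here).
constexpr std::size_t kSsoCapacity = 15;

static_assert(max_decimal_chars<std::int32_t>() == 11, "sign plus 10 digits");
static_assert(max_decimal_chars<std::uint32_t>() == 10, "10 digits");
static_assert(max_decimal_chars<std::int32_t>() <= kSsoCapacity,
              "32-bit values fit in the small-string buffer");
```

The same bound for 64-bit types is 20 characters, which exceeds 15 bytes and is consistent with the change being limited to types of 32 bits or less.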

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
	(to_string): Add noexcept if the type width is 32 bits or less.
2021-09-16 22:59:47 +01:00
Jonathan Wakely
869107c9c9 libstdc++: Add noexcept to unique_ptr accessors
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/unique_ptr.h (__uniq_ptr_impl::_M_ptr)
	(__uniq_ptr_impl::_M_deleter): Add noexcept.
2021-09-16 22:59:46 +01:00
Thomas Rodgers
f9f1a6efaa libstdc++: Fix UB in atomic_ref/wait_notify.cc [PR101761]
Remove UB in atomic_ref/wait_notify test.

Signed-off-by: Thomas Rodgers <trodgers@redhat.com>

libstdc++-v3/ChangeLog:

	PR libstdc++/101761
	* testsuite/29_atomics/atomic_ref/wait_notify.cc (test): Use
	va and vb as arguments to wait/notify, remove unused bb local.
2021-09-16 14:48:17 -07:00
Bill Schmidt
93b5a66710 rs6000: Handle overloads during program parsing
Although this patch looks quite large, the changes are fairly minimal.
Most of it is duplicating the large function that does the overload
resolution using the automatically generated data structures instead of
the old hand-generated ones.  This doesn't make the patch terribly easy to
review, unfortunately.  Just be aware that generally we aren't changing
the logic and functionality of overload handling.

2021-09-16  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
	(altivec_resolve_new_overloaded_builtin): New forward decl.
	(rs6000_new_builtin_type_compatible): New function.
	(altivec_resolve_overloaded_builtin): Call
	altivec_resolve_new_overloaded_builtin.
	(altivec_build_new_resolved_builtin): New function.
	(altivec_resolve_new_overloaded_builtin): Likewise.
	* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported):
	Likewise.
	* config/rs6000/rs6000-gen-builtins.c (write_decls): Remove _p from
	name of rs6000_new_builtin_is_supported.
2021-09-16 15:36:00 -05:00
Patrick Palka
2e2e65a46d c++: constrained variable template issues [PR98486]
This fixes some issues with constrained variable templates:

  - Constraints aren't checked when explicitly specializing a variable
    template.
  - Constraints aren't attached to a static data member template at
    parse time.
  - Constraints don't get propagated when (partially) instantiating a
    static data member template, so we need to make sure to look up
    constraints using the most general template during satisfaction.

	PR c++/98486

gcc/cp/ChangeLog:

	* constraint.cc (get_normalized_constraints_from_decl): Always
	look up constraints using the most general template.
	* decl.c (grokdeclarator): Set constraints on a static data
	member template.
	* pt.c (determine_specialization): Check constraints on a
	variable template.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-var-templ1.C: New test.
	* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
	* g++.dg/cpp2a/concepts-var-templ1b.C: New test.
2021-09-16 15:03:55 -04:00
Harald Anlauf
cfea7b86f2 Fortran - fix handling of optional allocatable DT arguments with INTENT(OUT)
gcc/fortran/ChangeLog:

	PR fortran/102287
	* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
	allocatable components of optional allocatable derived type
	procedure arguments with INTENT(OUT) into a presence check.

gcc/testsuite/ChangeLog:

	PR fortran/102287
	* gfortran.dg/intent_out_14.f90: New test.
2021-09-16 20:12:21 +02:00
Andrew Pinski
db1a65d936 Fix PR 67102: Add libstdc++ dependency to libffi
The cause of the error is obvious: -funconfigured-libstdc++-v3 is used
on the g++ command line.  So we just add the dependency.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

ChangeLog:

	PR bootstrap/67102
	* Makefile.def: Have configure-target-libffi depend on
	all-target-libstdc++-v3.
	* Makefile.in: Regenerate.
2021-09-16 17:53:38 +00:00
Uros Bizjak
d7071e4982 [i386] Change ix86_decompose_address return type to bool.
After a recent change only a boolean value is returned.

2021-09-16  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	* config/i386/i386-protos.h (ix86_decompose_address):
	Change return type to bool.
	* config/i386/i386.c (ix86_decompose_address): Ditto.
2021-09-16 19:06:12 +02:00
Tobias Burnus
acd7e7b33f PowerPC: Fix rs6000-gen-builtins with build != host [PR102353]
This mimics what the main Makefile.in does: compile the generator
files under build (with Makefile.in's 'build/%.o' rule for compilation).
It also adds $(RUN_GEN) to optionally run it with valgrind and
the $(build_exeext) suffix.

Before, the .o files were compiled with $(COMPILE), causing a link
error with $(LINKER_FOR_BUILD) for build != host.

gcc/
	PR target/102353
	* config/rs6000/t-rs6000 (build/rs6000-gen-builtins.o, build/rbtree.o):
	Added 'build/' to target, use build/%.o rule.
	(build/rs6000-gen-builtins$(build_exeext)): Add 'build/' and
	'$(build_exeext)' to target and 'build/' for the *.o files.
	(rs6000-builtins.c): Update for those changes; run rs6000-gen-builtins
	with $(RUN_GEN).
2021-09-16 18:35:34 +02:00
Martin Jambor
371848a7ed cgraph: Do not warn about caller count mismatches of removed functions
To verify other changes in the patch series, I have been searching for
"Invalid sum of caller counts" string in symtab dump but found that
there are false warnings about functions which have their body removed
because they are now unreachable.  Those are of course invalid and so
this patch avoids checking such cgraph_nodes.
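
As a rough illustration of the check described above, here is a toy model
rather than GCC's actual code.  ToyNode, body_removed and
caller_count_mismatch are hypothetical names invented for this sketch; it
assumes the warning fires when the sum of incoming edge counts differs from
the node's own count, and it skips nodes whose body has been removed.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Toy stand-in for a call-graph node; this is not GCC's cgraph_node.
struct ToyNode {
    std::int64_t count;                       // profile count of the function
    std::vector<std::int64_t> caller_counts;  // counts on incoming call edges
    bool body_removed;                        // true once the body is gone
};

// Returns true if a "sum of caller counts" warning should be printed.
// Nodes whose body was removed are skipped, since a mismatch there
// would be a false warning.
bool caller_count_mismatch(const ToyNode &n) {
    if (n.body_removed)
        return false;
    std::int64_t sum = std::accumulate(n.caller_counts.begin(),
                                       n.caller_counts.end(),
                                       std::int64_t{0});
    return sum != n.count;
}
```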

gcc/ChangeLog:

2021-08-20  Martin Jambor  <mjambor@suse.cz>

	* cgraph.c (cgraph_node::dump): Do not check caller count sums if
	the body has been removed.  Remove trailing whitespace.
2021-09-16 17:06:47 +02:00
Iain Sandoe
ab08859e37 coroutines: Small cleanups to await_statement_walker [NFC].
There is no need to make a MODIFY_EXPR for any of the condition
vars that we synthesize.

Expansion of co_return can be carried out independently of any
co_awaits that might be contained which simplifies this.

Where we are rewriting statements to handle await expression
logic, there is no need to carry out any analysis - we just need
to detect the presence of any co_await.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/cp/ChangeLog:

	* coroutines.cc (await_statement_walker): Code cleanups.
2021-09-16 12:54:56 +01:00
Richard Biener
8d6b12b223 middle-end/102360 - adjust .DEFERRED_INIT expansion
This avoids using native_interpret_type when we cannot do it with
the original type of the variable.  Instead, use an integer type
for the initialization and side-step the size limitation of
native_interpret_int.

2021-09-16  Richard Biener  <rguenther@suse.de>

	PR middle-end/102360
	* internal-fn.c (expand_DEFERRED_INIT): Make pattern-init
	of non-memory more robust.

	* g++.dg/pr102360.C: New testcase.
2021-09-16 13:22:50 +02:00
Daniel Cederman
275a076f76 sparc: Add scheduling information for LEON5
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.

gcc/ChangeLog:

	* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
	* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
	(leon5_adjust_cost): Increase cost of store with data dependency
	on ALU instruction and FPU anti-dependencies.
	(sparc_option_override): Add LEON5 costs
	(sparc_adjust_cost): Add LEON5 cost adjustments
	* config/sparc/sparc.h: Add LEON5
	* config/sparc/sparc.md: Include LEON5 scheduling information
	* config/sparc/sparc.opt: Add LEON5
	* doc/invoke.texi: Add LEON5
	* config/sparc/leon5.md: New file.
2021-09-16 13:05:53 +02:00
Daniel Cederman
a053dab90e sparc: Add NOP in stack_protect_set32 if sparc_fix_b2bst enabled
This is needed to prevent the Store -> (Non-store or load) -> Store
sequence.
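
The pattern named above can be sketched as a small detector.  This is a toy
model under stated assumptions, not the code in sparc.md: Kind and
is_sensitive_at are hypothetical names, and the sketch only matches the
three-instruction window literally as written in this message, without
modelling any further conditions of the erratum.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction classes; real insns carry far more information.
enum class Kind { Store, Load, Other };

// True if positions i, i+1 and i+2 form Store -> non-store -> Store.
// A load counts as a non-store, matching "(Non-store or load)".
bool is_sensitive_at(const std::vector<Kind> &seq, std::size_t i) {
    return i + 2 < seq.size()
        && seq[i] == Kind::Store
        && seq[i + 1] != Kind::Store
        && seq[i + 2] == Kind::Store;
}
```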

gcc/ChangeLog:

	* config/sparc/sparc.md (stack_protect_set32): Add NOP to prevent
	sensitive sequence for B2BST errata workaround.
2021-09-16 13:05:51 +02:00
Daniel Cederman
d4aa16699d sparc: Prevent atomic instructions in beginning of functions for UT700
A call to the function might have a load instruction in the delay slot
and a load followed by an atomic instruction could cause a deadlock.

gcc/ChangeLog:

	* config/sparc/sparc.c (sparc_do_work_around_errata): Do not begin
	functions with atomic instruction in the UT700 errata workaround.
2021-09-16 13:05:50 +02:00
Daniel Cederman
6d0c97b19a sparc: Skip all empty assembly statements
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
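
The skipping behaviour can be sketched with a toy instruction list.  This is
only an illustration, not the function added to sparc.c: ToyInsn,
is_empty_asm and next_non_empty are hypothetical names, and the sketch
assumes an "empty" statement is an asm with an empty template.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: either an inline asm statement or a regular insn.
struct ToyInsn {
    bool is_asm;
    std::string text;  // asm template or mnemonic
};

// True for asm statements with an empty template, such as __asm__("").
bool is_empty_asm(const ToyInsn &i) { return i.is_asm && i.text.empty(); }

// Index of the first insn after `pos` that is not an empty asm,
// or insns.size() if there is none.  Consecutive empty asm
// statements are all skipped.
std::size_t next_non_empty(const std::vector<ToyInsn> &insns, std::size_t pos) {
    std::size_t i = pos + 1;
    while (i < insns.size() && is_empty_asm(insns[i]))
        ++i;
    return i < insns.size() ? i : insns.size();
}
```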

gcc/ChangeLog:

	* config/sparc/sparc.c (next_active_non_empty_insn): New function
	that returns next active non empty assembly instruction.
	(sparc_do_work_around_errata): Use new function.
2021-09-16 13:05:48 +02:00
Daniel Cederman
b4bbb373df sparc: Treat more instructions as load or store in errata workarounds
Check the attribute of the instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.

gcc/ChangeLog:

	* config/sparc/sparc.c (store_insn_p): Add predicate for store
	attributes.
	(load_insn_p): Add predicate for load attributes.
	(sparc_do_work_around_errata): Use new predicates.
2021-09-16 13:05:47 +02:00
Andreas Larsson
b7e0dd61e4 sparc: Print out bit names for LEON and LEON3 with -mdebug
gcc/ChangeLog:

	* config/sparc/sparc.c (dump_target_flag_bits): Print bit names for
	LEON and LEON3.
2021-09-16 13:05:45 +02:00
Christophe Lyon
8e2c293f02 testsuite: Support single-precision in g++.dg/eh/arm-vfp-unwind.C
g++.dg/eh/arm-vfp-unwind.C uses an asm statement relying on
double-precision FPU support. This patch extends it to support
single-precision, useful for targets without double-precision.

2021-09-16  Richard Earnshaw  <rearnsha@arm.com>

	gcc/testsuite/
	* g++.dg/eh/arm-vfp-unwind.C: Support single-precision.
2021-09-16 09:33:52 +00:00
Martin Liska
8137be3958 mips: Fix macro typo
gcc/ChangeLog:

	* config/mips/netbsd.h: Fix typo in name of a macro.
2021-09-16 11:19:12 +02:00
liuhongt
a26ff83ed0 Check mask type when doing cond_op related gimple simplification.
gcc/ChangeLog:

	PR middle-end/102080
	* match.pd: Check mask type when doing cond_op related gimple
	simplification.
	* tree.c (is_truth_type_for): New function.
	* tree.h (is_truth_type_for): New declaration.

gcc/testsuite/ChangeLog:

	PR middle-end/102080
	* gcc.target/i386/pr102080.c: New test.
2021-09-16 16:35:29 +08:00
liuhongt
a73d59089a AVX512FP16: Add testcase for vcvtw2ph/vcvtuw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vcvtdq2ph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvtdq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtqq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtudq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtudq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtuqq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtuw2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtuw2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtw2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtw2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtdq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtdq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtqq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtudq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtudq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtuw2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtuw2ph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtw2ph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtw2ph-1b.c: Ditto.
2021-09-16 13:09:30 +08:00
liuhongt
be0e4c32bf AVX512FP16: Add vcvtuw2ph/vcvtw2ph/vcvtdq2ph/vcvtudq2ph/vcvtqq2ph/vcvtuqq2ph
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_cvtepi32_ph): New
	intrinsic.
	(_mm512_mask_cvtepi32_ph): Likewise.
	(_mm512_maskz_cvtepi32_ph): Likewise.
	(_mm512_cvt_roundepi32_ph): Likewise.
	(_mm512_mask_cvt_roundepi32_ph): Likewise.
	(_mm512_maskz_cvt_roundepi32_ph): Likewise.
	(_mm512_cvtepu32_ph): Likewise.
	(_mm512_mask_cvtepu32_ph): Likewise.
	(_mm512_maskz_cvtepu32_ph): Likewise.
	(_mm512_cvt_roundepu32_ph): Likewise.
	(_mm512_mask_cvt_roundepu32_ph): Likewise.
	(_mm512_maskz_cvt_roundepu32_ph): Likewise.
	(_mm512_cvtepi64_ph): Likewise.
	(_mm512_mask_cvtepi64_ph): Likewise.
	(_mm512_maskz_cvtepi64_ph): Likewise.
	(_mm512_cvt_roundepi64_ph): Likewise.
	(_mm512_mask_cvt_roundepi64_ph): Likewise.
	(_mm512_maskz_cvt_roundepi64_ph): Likewise.
	(_mm512_cvtepu64_ph): Likewise.
	(_mm512_mask_cvtepu64_ph): Likewise.
	(_mm512_maskz_cvtepu64_ph): Likewise.
	(_mm512_cvt_roundepu64_ph): Likewise.
	(_mm512_mask_cvt_roundepu64_ph): Likewise.
	(_mm512_maskz_cvt_roundepu64_ph): Likewise.
	(_mm512_cvtepi16_ph): Likewise.
	(_mm512_mask_cvtepi16_ph): Likewise.
	(_mm512_maskz_cvtepi16_ph): Likewise.
	(_mm512_cvt_roundepi16_ph): Likewise.
	(_mm512_mask_cvt_roundepi16_ph): Likewise.
	(_mm512_maskz_cvt_roundepi16_ph): Likewise.
	(_mm512_cvtepu16_ph): Likewise.
	(_mm512_mask_cvtepu16_ph): Likewise.
	(_mm512_maskz_cvtepu16_ph): Likewise.
	(_mm512_cvt_roundepu16_ph): Likewise.
	(_mm512_mask_cvt_roundepu16_ph): Likewise.
	(_mm512_maskz_cvt_roundepu16_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_cvtepi32_ph): New
	intrinsic.
	(_mm_mask_cvtepi32_ph): Likewise.
	(_mm_maskz_cvtepi32_ph): Likewise.
	(_mm256_cvtepi32_ph): Likewise.
	(_mm256_mask_cvtepi32_ph): Likewise.
	(_mm256_maskz_cvtepi32_ph): Likewise.
	(_mm_cvtepu32_ph): Likewise.
	(_mm_mask_cvtepu32_ph): Likewise.
	(_mm_maskz_cvtepu32_ph): Likewise.
	(_mm256_cvtepu32_ph): Likewise.
	(_mm256_mask_cvtepu32_ph): Likewise.
	(_mm256_maskz_cvtepu32_ph): Likewise.
	(_mm_cvtepi64_ph): Likewise.
	(_mm_mask_cvtepi64_ph): Likewise.
	(_mm_maskz_cvtepi64_ph): Likewise.
	(_mm256_cvtepi64_ph): Likewise.
	(_mm256_mask_cvtepi64_ph): Likewise.
	(_mm256_maskz_cvtepi64_ph): Likewise.
	(_mm_cvtepu64_ph): Likewise.
	(_mm_mask_cvtepu64_ph): Likewise.
	(_mm_maskz_cvtepu64_ph): Likewise.
	(_mm256_cvtepu64_ph): Likewise.
	(_mm256_mask_cvtepu64_ph): Likewise.
	(_mm256_maskz_cvtepu64_ph): Likewise.
	(_mm_cvtepi16_ph): Likewise.
	(_mm_mask_cvtepi16_ph): Likewise.
	(_mm_maskz_cvtepi16_ph): Likewise.
	(_mm256_cvtepi16_ph): Likewise.
	(_mm256_mask_cvtepi16_ph): Likewise.
	(_mm256_maskz_cvtepi16_ph): Likewise.
	(_mm_cvtepu16_ph): Likewise.
	(_mm_mask_cvtepu16_ph): Likewise.
	(_mm_maskz_cvtepu16_ph): Likewise.
	(_mm256_cvtepu16_ph): Likewise.
	(_mm256_mask_cvtepu16_ph): Likewise.
	(_mm256_maskz_cvtepu16_ph): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c
	(ix86_expand_args_builtin): Handle new builtin types.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/i386-modes.def: Declare V2HF and V6HF.
	* config/i386/sse.md (VI2H_AVX512VL): New.
	(qq2phsuff): Ditto.
	(sseintvecmode): Add HF vector modes.
	(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode><mask_name><round_name>):
	New.
	(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>): Ditto.
	(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>): Ditto.
	(avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask): Ditto.
	(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask): Ditto.
	(*avx512fp16_vcvt<floatsuffix><sseintconvert>2ph_<mode>_mask_1): Ditto.
	(avx512fp16_vcvt<floatsuffix>qq2ph_v2di): Ditto.
	(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di): Ditto.
	(avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask): Ditto.
	(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask): Ditto.
	(*avx512fp16_vcvt<floatsuffix>qq2ph_v2di_mask_1): Ditto.
	* config/i386/subst.md (round_qq2phsuff): New subst_attr.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-16 13:09:30 +08:00
liuhongt
038afce92d AVX512FP16: Add testcase for vcvtph2w/vcvtph2uw/vcvtph2dq/vcvtph2udq/vcvtph2qq/vcvtph2uqq.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h (V512): Add QI
	components.
	* gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: New test.
	* gcc.target/i386/avx512fp16-vcvtph2dq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2qq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2udq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2uqq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2uw-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vcvtph2w-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2dq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2qq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2udq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2uqq-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2uw-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vcvtph2w-1b.c: Ditto.
2021-09-16 13:09:30 +08:00