OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
liuhongt	b96cb2caa9	AVX512FP16: Add vmaxph/vminph/vmaxsh/vminsh. gcc/ChangeLog: * config/i386/avx512fp16intrin.h: (_mm512_max_ph): New intrinsic. (_mm512_mask_max_ph): Likewise. (_mm512_maskz_max_ph): Likewise. (_mm512_min_ph): Likewise. (_mm512_mask_min_ph): Likewise. (_mm512_maskz_min_ph): Likewise. (_mm512_max_round_ph): Likewise. (_mm512_mask_max_round_ph): Likewise. (_mm512_maskz_max_round_ph): Likewise. (_mm512_min_round_ph): Likewise. (_mm512_mask_min_round_ph): Likewise. (_mm512_maskz_min_round_ph): Likewise. (_mm_max_sh): Likewise. (_mm_mask_max_sh): Likewise. (_mm_maskz_max_sh): Likewise. (_mm_min_sh): Likewise. (_mm_mask_min_sh): Likewise. (_mm_maskz_min_sh): Likewise. (_mm_max_round_sh): Likewise. (_mm_mask_max_round_sh): Likewise. (_mm_maskz_max_round_sh): Likewise. (_mm_min_round_sh): Likewise. (_mm_mask_min_round_sh): Likewise. (_mm_maskz_min_round_sh): Likewise. * config/i386/avx512fp16vlintrin.h (_mm_max_ph): New intrinsic. (_mm256_max_ph): Likewise. (_mm_mask_max_ph): Likewise. (_mm256_mask_max_ph): Likewise. (_mm_maskz_max_ph): Likewise. (_mm256_maskz_max_ph): Likewise. (_mm_min_ph): Likewise. (_mm256_min_ph): Likewise. (_mm_mask_min_ph): Likewise. (_mm256_mask_min_ph): Likewise. (_mm_maskz_min_ph): Likewise. (_mm256_maskz_min_ph): Likewise. * config/i386/i386-builtin-types.def: Add corresponding builtin types. * config/i386/i386-builtin.def: Add corresponding new builtins. * config/i386/i386-expand.c (ix86_expand_args_builtin): Handle new builtin types. * config/i386/sse.md (<code><mode>3<mask_name><round_saeonly_name>): Adjust to support HF vector modes. (<code><mode>3<mask_name><round_saeonly_name>): Likewise. (ieee_<ieee_maxmin><mode>3<mask_name><round_saeonly_name>): Likewise. (<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>): Likewise. config/i386/subst.md (round_saeonly_mode512bit_condition): Adjust for HF vector modes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add test for new builtins. 
* gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.	2021-09-10 14:59:30 +08:00
liuhongt	63d7c9dd66	AVX512FP16: Add testcase for vaddsh/vsubsh/vmulsh/vdivsh. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vaddsh-1a.c: New test. * gcc.target/i386/avx512fp16-vaddsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vdivsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vdivsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vmulsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vmulsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vsubsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vsubsh-1b.c: Ditto. * gcc.target/i386/pr54855-11.c: Ditto.	2021-09-10 14:59:30 +08:00
Liu, Hongtao	71838266e7	AVX512FP16: Add vaddsh/vsubsh/vmulsh/vdivsh. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_add_sh): New intrinsic. (_mm_mask_add_sh): Likewise. (_mm_maskz_add_sh): Likewise. (_mm_sub_sh): Likewise. (_mm_mask_sub_sh): Likewise. (_mm_maskz_sub_sh): Likewise. (_mm_mul_sh): Likewise. (_mm_mask_mul_sh): Likewise. (_mm_maskz_mul_sh): Likewise. (_mm_div_sh): Likewise. (_mm_mask_div_sh): Likewise. (_mm_maskz_div_sh): Likewise. (_mm_add_round_sh): Likewise. (_mm_mask_add_round_sh): Likewise. (_mm_maskz_add_round_sh): Likewise. (_mm_sub_round_sh): Likewise. (_mm_mask_sub_round_sh): Likewise. (_mm_maskz_sub_round_sh): Likewise. (_mm_mul_round_sh): Likewise. (_mm_mask_mul_round_sh): Likewise. (_mm_maskz_mul_round_sh): Likewise. (_mm_div_round_sh): Likewise. (_mm_mask_div_round_sh): Likewise. (_mm_maskz_div_round_sh): Likewise. * config/i386/i386-builtin-types.def: Add corresponding builtin types. * config/i386/i386-builtin.def: Add corresponding new builtins. * config/i386/i386-expand.c (ix86_expand_round_builtin): Handle new builtins. * config/i386/sse.md (VF_128): Change description. (<sse>_vm<plusminus_insn><mode>3<mask_scalar_name><round_scalar_name>): Adjust to support HF vector modes. (<sse>_vm<multdiv_mnemonic><mode>3<mask_scalar_name><round_scalar_name>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add test for new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.	2021-09-10 14:59:30 +08:00
H.J. Lu	d959312b42	AVX512FP16: Enable _Float16 autovectorization gcc/ChangeLog: * config/i386/i386-expand.c (ix86_avx256_split_vector_move_misalign): Handle V16HF mode. * config/i386/i386.c (ix86_preferred_simd_mode): Handle HF mode. * config/i386/sse.md (V_256H): New mode iterator. (avx_vextractf128<mode>): Use it. (VEC_INIT_MODE): Align vector HFmode condition to vector HImodes since no real HF instructions are used. (VEC_INIT_HALF_MODE): Ditto. (VIHF): Ditto. (VIHF_AVX512BW): Ditto. (vec_extracthf): Ditto. (VEC_EXTRACT_MODE): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/vect-float16-1.c: New test. * gcc.target/i386/vect-float16-10.c: Ditto. * gcc.target/i386/vect-float16-11.c: Ditto. * gcc.target/i386/vect-float16-12.c: Ditto. * gcc.target/i386/vect-float16-2.c: Ditto. * gcc.target/i386/vect-float16-3.c: Ditto. * gcc.target/i386/vect-float16-4.c: Ditto. * gcc.target/i386/vect-float16-5.c: Ditto. * gcc.target/i386/vect-float16-6.c: Ditto. * gcc.target/i386/vect-float16-7.c: Ditto. * gcc.target/i386/vect-float16-8.c: Ditto. * gcc.target/i386/vect-float16-9.c: Ditto.	2021-09-10 14:59:30 +08:00
Richard Biener	0458154caa	Remove dbx.h, do not set PREFERRED_DEBUGGING_TYPE from dbxcoff.h, lynx.h The following removes the unused config/dbx.h file and removes the setting of PREFERRED_DEBUGGING_TYPE from dbxcoff.h which is overridden by all users (djgpp/mingw/cygwin) via either including config/i386/djgpp.h or config/i386/cygming.h There are still circumstances where mingw and cygwin default to STABS, namely when HAVE_GAS_PE_SECREL32_RELOC is not defined and the target defaults to 32bit code generation. The new style handling DBX_DEBUGGING_INFO is in line with dbxelf.h which does not define PREFERRED_DEBUGGING_TYPE either. The patch also removes the PREFERRED_DEBUGGING_TYPE define from lynx.h which always follows elfos.h already defaulting to DWARF, so the comment about STABS being the default is misleading and outdated. 2021-09-09 Richard Biener <rguenther@suse.de> PR target/102255 * config/dbx.h: Remove. * config/dbxcoff.h: Do not define PREFERRED_DEBUGGING_TYPE. * config/lynx.h: Likewise.	2021-09-10 07:59:15 +02:00
liuhongt	60efb1fee9	Remove copysign post_reload splitter for scalar modes. It can generate better code, as avx512dq-abs-copysign-1.c shows. gcc/ChangeLog: * config/i386/i386-expand.c (ix86_expand_copysign): Expand right into ANDNOT + AND + IOR, using paradoxical subregs. (ix86_split_copysign_const): Remove. (ix86_split_copysign_var): Ditto. * config/i386/i386-protos.h (ix86_split_copysign_const): Ditto. (ix86_split_copysign_var): Ditto. * config/i386/i386.md (@copysign<mode>3_const): Ditto. (@copysign<mode>3_var): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512dq-abs-copysign-1.c: Adjust testcase. * gcc.target/i386/avx512vl-abs-copysign-1.c: Adjust testcase.	2021-09-10 12:29:28 +08:00
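The ANDNOT + AND + IOR expansion named in the copysign commit above can be illustrated at the bit level in plain C. This is only a sketch of the arithmetic, not the code GCC emits: it assumes `float` is IEEE-754 binary32, and the names `SIGN_MASK`, `float_bits`, `bits_float` and `copysign_sketch` are hypothetical helpers introduced for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE-754 binary32, so bit 31 is the sign bit. */
#define SIGN_MASK 0x80000000u

/* Reinterpret a float as its 32-bit pattern (memcpy avoids aliasing issues). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float. */
float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* copysign(x, y) in three bitwise steps:
   ANDNOT keeps the magnitude of x, AND keeps the sign of y, IOR merges them. */
float copysign_sketch(float x, float y)
{
    uint32_t magnitude = float_bits(x) & ~SIGN_MASK; /* ANDNOT */
    uint32_t sign = float_bits(y) & SIGN_MASK;       /* AND */
    return bits_float(magnitude | sign);             /* IOR */
}
```

For instance, `copysign_sketch(3.0f, -1.0f)` yields `-3.0f`, and a zero magnitude still takes the sign of the second operand.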
GCC Administrator	f84e2f0b7b	Daily bump.	2021-09-10 00:16:31 +00:00
qing zhao	a25e0b5e6a	Add -ftrivial-auto-var-init option and uninitialized variable attribute. Initialize automatic variables with either a pattern or with zeroes to increase the security and predictability of a program by preventing uninitialized memory disclosure and use. GCC still considers an automatic variable that doesn't have an explicit initializer as uninitialized, -Wuninitialized will still report warning messages on such automatic variables. With this option, GCC will also initialize any padding of automatic variables that have structure or union types to zeroes. You can control this behavior for a specific variable by using the variable attribute "uninitialized" to control runtime overhead. gcc/ChangeLog: 2021-09-09 qing zhao <qing.zhao@oracle.com> * builtins.c (expand_builtin_memset): Make external visible. * builtins.h (expand_builtin_memset): Declare extern. * common.opt (ftrivial-auto-var-init=): New option. * doc/extend.texi: Document the uninitialized attribute. * doc/invoke.texi: Document -ftrivial-auto-var-init. * flag-types.h (enum auto_init_type): New enumerated type auto_init_type. * gimple-fold.c (clear_padding_type): Add one new parameter. (clear_padding_union): Likewise. (clear_padding_emit_loop): Likewise. (clear_type_padding_in_mask): Likewise. (gimple_fold_builtin_clear_padding): Handle this new parameter. * gimplify.c (gimple_add_init_for_auto_var): New function. (gimple_add_padding_init_for_auto_var): New function. (is_var_need_auto_init): New function. (gimplify_decl_expr): Add initialization to automatic variables per users' requests. (gimplify_call_expr): Add one new parameter for call to __builtin_clear_padding. (gimplify_init_constructor): Add padding initialization in the end. * internal-fn.c (INIT_PATTERN_VALUE): New macro. (expand_DEFERRED_INIT): New function. * internal-fn.def (DEFERRED_INIT): New internal function. * tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT. 
* tree-sra.c (generate_subtree_deferred_init): New function. (scan_function): Avoid setting cannot_scalarize_away_bitmap for calls to .DEFERRED_INIT. (sra_modify_deferred_init): New function. (sra_modify_function_body): Handle calls to DEFERRED_INIT specially. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. * tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT specially. (check_defs): Likewise. (warn_uninitialized_vars): Likewise. * tree-ssa.c (ssa_undefined_value_p): Likewise. * tree.c (build_common_builtin_nodes): Build tree node for BUILT_IN_CLEAR_PADDING when needed. gcc/c-family/ChangeLog: 2021-09-09 qing zhao <qing.zhao@oracle.com> * c-attribs.c (handle_uninitialized_attribute): New function. (c_common_attribute_table): Add "uninitialized" attribute. gcc/testsuite/ChangeLog: 2021-09-09 qing zhao <qing.zhao@oracle.com> * c-c++-common/auto-init-1.c: New test. * c-c++-common/auto-init-10.c: New test. * c-c++-common/auto-init-11.c: New test. * c-c++-common/auto-init-12.c: New test. * c-c++-common/auto-init-13.c: New test. * c-c++-common/auto-init-14.c: New test. * c-c++-common/auto-init-15.c: New test. * c-c++-common/auto-init-16.c: New test. * c-c++-common/auto-init-2.c: New test. * c-c++-common/auto-init-3.c: New test. * c-c++-common/auto-init-4.c: New test. * c-c++-common/auto-init-5.c: New test. * c-c++-common/auto-init-6.c: New test. * c-c++-common/auto-init-7.c: New test. * c-c++-common/auto-init-8.c: New test. * c-c++-common/auto-init-9.c: New test. * c-c++-common/auto-init-esra.c: New test. * c-c++-common/auto-init-padding-1.c: New test. * c-c++-common/auto-init-padding-2.c: New test. * c-c++-common/auto-init-padding-3.c: New test. * g++.dg/auto-init-uninit-pred-1_a.C: New test. * g++.dg/auto-init-uninit-pred-2_a.C: New test. * g++.dg/auto-init-uninit-pred-3_a.C: New test. * g++.dg/auto-init-uninit-pred-4.C: New test. * gcc.dg/auto-init-sra-1.c: New test. * gcc.dg/auto-init-sra-2.c: New test. 
* gcc.dg/auto-init-uninit-1.c: New test. * gcc.dg/auto-init-uninit-12.c: New test. * gcc.dg/auto-init-uninit-13.c: New test. * gcc.dg/auto-init-uninit-14.c: New test. * gcc.dg/auto-init-uninit-15.c: New test. * gcc.dg/auto-init-uninit-16.c: New test. * gcc.dg/auto-init-uninit-17.c: New test. * gcc.dg/auto-init-uninit-18.c: New test. * gcc.dg/auto-init-uninit-19.c: New test. * gcc.dg/auto-init-uninit-2.c: New test. * gcc.dg/auto-init-uninit-20.c: New test. * gcc.dg/auto-init-uninit-21.c: New test. * gcc.dg/auto-init-uninit-22.c: New test. * gcc.dg/auto-init-uninit-23.c: New test. * gcc.dg/auto-init-uninit-24.c: New test. * gcc.dg/auto-init-uninit-25.c: New test. * gcc.dg/auto-init-uninit-26.c: New test. * gcc.dg/auto-init-uninit-3.c: New test. * gcc.dg/auto-init-uninit-34.c: New test. * gcc.dg/auto-init-uninit-36.c: New test. * gcc.dg/auto-init-uninit-37.c: New test. * gcc.dg/auto-init-uninit-4.c: New test. * gcc.dg/auto-init-uninit-5.c: New test. * gcc.dg/auto-init-uninit-6.c: New test. * gcc.dg/auto-init-uninit-8.c: New test. * gcc.dg/auto-init-uninit-9.c: New test. * gcc.dg/auto-init-uninit-A.c: New test. * gcc.dg/auto-init-uninit-B.c: New test. * gcc.dg/auto-init-uninit-C.c: New test. * gcc.dg/auto-init-uninit-H.c: New test. * gcc.dg/auto-init-uninit-I.c: New test. * gcc.target/aarch64/auto-init-1.c: New test. * gcc.target/aarch64/auto-init-2.c: New test. * gcc.target/aarch64/auto-init-3.c: New test. * gcc.target/aarch64/auto-init-4.c: New test. * gcc.target/aarch64/auto-init-5.c: New test. * gcc.target/aarch64/auto-init-6.c: New test. * gcc.target/aarch64/auto-init-7.c: New test. * gcc.target/aarch64/auto-init-8.c: New test. * gcc.target/aarch64/auto-init-padding-1.c: New test. * gcc.target/aarch64/auto-init-padding-10.c: New test. * gcc.target/aarch64/auto-init-padding-11.c: New test. * gcc.target/aarch64/auto-init-padding-12.c: New test. * gcc.target/aarch64/auto-init-padding-2.c: New test. * gcc.target/aarch64/auto-init-padding-3.c: New test. 
* gcc.target/aarch64/auto-init-padding-4.c: New test. * gcc.target/aarch64/auto-init-padding-5.c: New test. * gcc.target/aarch64/auto-init-padding-6.c: New test. * gcc.target/aarch64/auto-init-padding-7.c: New test. * gcc.target/aarch64/auto-init-padding-8.c: New test. * gcc.target/aarch64/auto-init-padding-9.c: New test. * gcc.target/i386/auto-init-1.c: New test. * gcc.target/i386/auto-init-2.c: New test. * gcc.target/i386/auto-init-21.c: New test. * gcc.target/i386/auto-init-22.c: New test. * gcc.target/i386/auto-init-23.c: New test. * gcc.target/i386/auto-init-24.c: New test. * gcc.target/i386/auto-init-3.c: New test. * gcc.target/i386/auto-init-4.c: New test. * gcc.target/i386/auto-init-5.c: New test. * gcc.target/i386/auto-init-6.c: New test. * gcc.target/i386/auto-init-7.c: New test. * gcc.target/i386/auto-init-8.c: New test. * gcc.target/i386/auto-init-padding-1.c: New test. * gcc.target/i386/auto-init-padding-10.c: New test. * gcc.target/i386/auto-init-padding-11.c: New test. * gcc.target/i386/auto-init-padding-12.c: New test. * gcc.target/i386/auto-init-padding-2.c: New test. * gcc.target/i386/auto-init-padding-3.c: New test. * gcc.target/i386/auto-init-padding-4.c: New test. * gcc.target/i386/auto-init-padding-5.c: New test. * gcc.target/i386/auto-init-padding-6.c: New test. * gcc.target/i386/auto-init-padding-7.c: New test. * gcc.target/i386/auto-init-padding-8.c: New test. * gcc.target/i386/auto-init-padding-9.c: New test.	2021-09-09 15:44:49 -07:00
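A minimal sketch of how the "uninitialized" variable attribute from the commit above might be used to opt one local out of automatic initialization. The macro `NO_AUTO_INIT` and the function `sum_of_squares` are hypothetical names for illustration; the `__has_attribute` guard is an assumption added so the file still compiles on compilers that predate the attribute.

```c
#include <stddef.h>

/* Hypothetical wrapper: expands to the attribute only where the compiler
   reports support for it, and to nothing elsewhere. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define NO_AUTO_INIT __attribute__((uninitialized))
#  endif
#endif
#ifndef NO_AUTO_INIT
#  define NO_AUTO_INIT
#endif

/* Sum of the squares 0*0 + 1*1 + ... + (n-1)*(n-1), with n capped at 64. */
unsigned sum_of_squares(size_t n)
{
    /* Every element read below is written first, so automatic
       initialization of this buffer would be redundant work. */
    unsigned scratch[64] NO_AUTO_INIT;
    unsigned total = 0;

    if (n > 64)
        n = 64;
    for (size_t i = 0; i < n; i++)
        scratch[i] = (unsigned)(i * i);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

Built with `-ftrivial-auto-var-init=zero` or `-ftrivial-auto-var-init=pattern`, the other automatic variables would still be initialized, while `scratch` would be left alone on compilers that support the attribute.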
Harald Anlauf	5fe0865ab7	Fortran - out of bounds in array constructor with implied do loop gcc/fortran/ChangeLog: PR fortran/98490 * trans-expr.c (gfc_conv_substring): Do not generate substring bounds check for implied do loop index variable before it actually becomes defined. gcc/testsuite/ChangeLog: PR fortran/98490 * gfortran.dg/bounds_check_23.f90: New test.	2021-09-09 21:34:01 +02:00
H.J. Lu	de515ce0b2	x86-64: Update AVX512FP16 ABI tests for x32 On x32, long is the same as int and pointer is 32 bits. Update AVX512FP16 ABI tests: 1. Replace long with long long for 64-bit integers. 2. Update type and alignment for long and pointer. 3. Skip tests for long on x32. * gcc.target/x86_64/abi/avx512fp16/args.h: Replace long with long long. (XMM_T): Rename _long to _longlong and _ulong to _ulonglong. (X87_T): Rename _ulong to _ulonglong. * gcc.target/x86_64/abi/avx512fp16/defines.h (TYPE_SIZE_LONG): Define to 4 if __ILP32__ is defined. (TYPE_SIZE_POINTER): Likewise. (TYPE_ALIGN_LONG): Likewise. (TYPE_ALIGN_POINTER): Likewise. * gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c (main): Skip test for long if __ILP32__ is defined. * gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c (do_test): Replace _long with _longlong. * gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c: (check_300): Replace _ulong with _ulonglong. * gcc.target/x86_64/abi/avx512fp16/m256h/args.h: Replace long with long long. (YMM_T): Rename _long to _longlong and _ulong to _ulonglong. (X87_T): Rename _ulong to _ulonglong. * gcc.target/x86_64/abi/avx512fp16/m512h/args.h: Replace long with long long. (ZMM_T): Rename _long to _longlong and _ulong to _ulonglong. (X87_T): Rename _ulong to _ulonglong.	2021-09-09 08:42:35 -07:00
Richard Biener	013cfc6484	Improve LIM fill_always_executed_in computation Currently the DOM walk over a loop body does not walk into not always executed subloops to avoid scalability issues since doing so makes the walk quadratic in the loop depth. It turns out this is not an issue in practice and even with a loop depth of 1800 this function is way off the radar. So the following patch removes the limitation, replacing it with a comment. 2021-09-09 Richard Biener <rguenther@suse.de> * tree-ssa-loop-im.c (fill_always_executed_in_1): Walk into all subloops. * gcc.dg/tree-ssa/ssa-lim-17.c: New testcase.	2021-09-09 11:50:20 +02:00
Richard Biener	6e27bc2b88	Avoid full DOM walk in LIM fill_always_executed_in This avoids a full DOM walk via get_loop_body_in_dom_order in the loop body walk of fill_always_executed_in which is often terminating the walk of a loop body early by integrating the DOM walk of get_loop_body_in_dom_order with the actual processing done by fill_always_executed_in. This trades the fully populated loop body array with a worklist allocation of the same size and thus should be a strict improvement over the recursive approach of get_loop_body_in_dom_order. 2021-09-09 Richard Biener <rguenther@suse.de> * tree-ssa-loop-im.c (fill_always_executed_in_1): Integrate DOM walk from get_loop_body_in_dom_order using a worklist approach.	2021-09-09 11:16:58 +02:00
liuhongt	f77f3adebd	AVX512FP16: Add testcase for vaddph/vsubph/vmulph/vdivph. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-helper.h: New header file for FP16 runtime test. * gcc.target/i386/avx512fp16-vaddph-1a.c: New test. * gcc.target/i386/avx512fp16-vaddph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vdivph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vdivph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vmulph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vmulph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vsubph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vsubph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vaddph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vdivph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vdivph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vmulph-1b.c: Ditto. * gcc.target/i386/avx512fp16vl-vsubph-1a.c: Ditto. * gcc.target/i386/avx512fp16vl-vsubph-1b.c: Ditto.	2021-09-09 16:09:05 +08:00
liuhongt	bd7a34ef55	AVX512FP16: Add vaddph/vsubph/vdivph/vmulph. gcc/ChangeLog: * config.gcc: Add avx512fp16vlintrin.h. * config/i386/avx512fp16intrin.h: (_mm512_add_ph): New intrinsic. (_mm512_mask_add_ph): Likewise. (_mm512_maskz_add_ph): Likewise. (_mm512_sub_ph): Likewise. (_mm512_mask_sub_ph): Likewise. (_mm512_maskz_sub_ph): Likewise. (_mm512_mul_ph): Likewise. (_mm512_mask_mul_ph): Likewise. (_mm512_maskz_mul_ph): Likewise. (_mm512_div_ph): Likewise. (_mm512_mask_div_ph): Likewise. (_mm512_maskz_div_ph): Likewise. (_mm512_add_round_ph): Likewise. (_mm512_mask_add_round_ph): Likewise. (_mm512_maskz_add_round_ph): Likewise. (_mm512_sub_round_ph): Likewise. (_mm512_mask_sub_round_ph): Likewise. (_mm512_maskz_sub_round_ph): Likewise. (_mm512_mul_round_ph): Likewise. (_mm512_mask_mul_round_ph): Likewise. (_mm512_maskz_mul_round_ph): Likewise. (_mm512_div_round_ph): Likewise. (_mm512_mask_div_round_ph): Likewise. (_mm512_maskz_div_round_ph): Likewise. * config/i386/avx512fp16vlintrin.h: New header. * config/i386/i386-builtin-types.def (V16HF, V8HF, V32HF): Add new builtin types. * config/i386/i386-builtin.def: Add corresponding builtins. * config/i386/i386-expand.c (ix86_expand_args_builtin): Handle new builtin types. (ix86_expand_round_builtin): Likewise. * config/i386/immintrin.h: Include avx512fp16vlintrin.h * config/i386/sse.md (VFH): New mode_iterator. (VF2H): Likewise. (avx512fmaskmode): Add HF vector modes. (avx512fmaskhalfmode): Likewise. (<plusminus_insn><mode>3<mask_name><round_name>): Adjust to for HF vector modes. (<plusminus_insn><mode>3<mask_name><round_name>): Likewise. (mul<mode>3<mask_name><round_name>): Likewise. (mul<mode>3<mask_name><round_name>): Likewise. (div<mode>3): Likewise. (<sse>_div<mode>3<mask_name><round_name>): Likewise. * config/i386/subst.md (SUBST_V): Add HF vector modes. (SUBST_A): Likewise. (round_mode512bit_condition): Adjust for V32HFmode. 
gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx512vl and test for new intrinsics. * gcc.target/i386/avx-2.c: Add -mavx512vl. * gcc.target/i386/avx512fp16-11a.c: New test. * gcc.target/i386/avx512fp16-11b.c: Ditto. * gcc.target/i386/avx512vlfp16-11a.c: Ditto. * gcc.target/i386/avx512vlfp16-11b.c: Ditto. * gcc.target/i386/sse-13.c: Add test for new builtins. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c: Add test for new intrinsics. * gcc.target/i386/sse-22.c: Ditto.	2021-09-09 16:08:56 +08:00
liuhongt	8f323c712e	Optimize v4sf reduction. gcc/ChangeLog: PR target/101059 * config/i386/sse.md (reduc_plus_scal_<mode>): Split to .. (reduc_plus_scal_v4sf): .. this, New define_expand. (reduc_plus_scal_v2df): .. and this, New define_expand. gcc/testsuite/ChangeLog: PR target/101059 * gcc.target/i386/sse2-pr101059.c: New test. * gcc.target/i386/sse3-pr101059.c: New test.	2021-09-09 09:34:15 +08:00
liuhongt	60eec23b5e	Optimize vec_extract for 256/512-bit vector when index exceeds the lower 128 bits. - vextracti32x8 $0x1, %zmm0, %ymm0 - vmovd %xmm0, %eax + valignd $8, %zmm0, %zmm0, %zmm1 + vmovd %xmm1, %eax - vextracti32x8 $0x1, %zmm0, %ymm0 - vextracti128 $0x1, %ymm0, %xmm0 - vpextrd $3, %xmm0, %eax + valignd $15, %zmm0, %zmm0, %zmm1 + vmovd %xmm1, %eax - vextractf64x2 $0x1, %ymm0, %xmm0 + valignq $2, %ymm0, %ymm0, %ymm0 - vextractf64x4 $0x1, %zmm0, %ymm0 - vextractf64x2 $0x1, %ymm0, %xmm0 - vunpckhpd %xmm0, %xmm0, %xmm0 + valignq $7, %zmm0, %zmm0, %zmm0 gcc/ChangeLog: PR target/91103 * config/i386/sse.md (vec_extract<mode><ssescalarmodelower>_valign): New define_insn. gcc/testsuite/ChangeLog: PR target/91103 * gcc.target/i386/pr91103-1.c: New test. * gcc.target/i386/pr91103-2.c: New test.	2021-09-09 09:33:40 +08:00
GCC Administrator	b6db7cd41c	Daily bump.	2021-09-09 00:16:32 +00:00
Jonathan Wakely	3c64582372	c++: Fix docs on assignment of virtual bases [PR60318] The description of behaviour is incorrect, the virtual base gets assigned before entering the bodies of A::operator= and B::operator=, not after. The example is also ill-formed (passing a string literal to char) and undefined (missing return from Base::operator=). Signed-off-by: Jonathan Wakely <jwakely@redhat.com> gcc/ChangeLog: PR c++/60318 doc/trouble.texi (Copy Assignment): Fix description of behaviour and fix code in example.	2021-09-08 22:34:16 +01:00
David Malcolm	e66b9f6779	analyzer: fix ICE when discarding result of realloc [PR102225] gcc/analyzer/ChangeLog: PR analyzer/102225 * analyzer.h (compat_types_p): New decl. * constraint-manager.cc (constraint_manager::get_or_add_equiv_class): Guard against NULL type when checking for pointer types. * region-model-impl-calls.cc (region_model::impl_call_realloc): Guard against NULL lhs type/region. Guard against the size value not being of a compatible type for dynamic extents. * region-model.cc (compat_types_p): Make non-static. gcc/testsuite/ChangeLog: PR analyzer/102225 * gcc.dg/analyzer/realloc-1.c (test_10): New. * gcc.dg/analyzer/torture/pr102225.c: New test.	2021-09-08 14:37:19 -04:00
Richard Biener	716a583692	c++/102228 - make lookup_anon_field O(1) For the testcase in PR101555 lookup_anon_field takes the majority of parsing time followed by get_class_binding_direct/fields_linear_search which is PR83309. The situation with anon aggregates is particularly dire when we need to build accesses to their members and the anon aggregates are nested. There for each such access we recursively build sub-accesses to the anon aggregate FIELD_DECLs bottom-up, DFS searching for them. That's inefficient since as I believe there's a 1:1 relationship between anon aggregate types and the FIELD_DECL used to place them. The patch below does away with the search in lookup_anon_field and instead records the single FIELD_DECL in the anon aggregate types lang-specific data, re-using the RTTI typeinfo_var field. That speeds up the compile of the testcase with -fsyntax-only from about 4.5s to slightly less than 1s. I tried to poke holes into the 1:1 relationship idea with my C++ knowledge but failed (which might not say much). It also leaves a hole for the case when the C++ FE itself duplicates such type and places it at a semantically different position. I've tried to poke holes into it with the duplication mechanism I understand (templates) but failed. 2021-09-08 Richard Biener <rguenther@suse.de> PR c++/102228 gcc/cp/ * cp-tree.h (ANON_AGGR_TYPE_FIELD): New define. * decl.c (fixup_anonymous_aggr): Wipe RTTI info put in place on invalid code. * decl2.c (reset_type_linkage): Guard CLASSTYPE_TYPEINFO_VAR access. * module.cc (trees_in::read_class_def): Likewise. Reconstruct ANON_AGGR_TYPE_FIELD. * semantics.c (finish_member_declaration): Populate ANON_AGGR_TYPE_FIELD for anon aggregate typed members. * typeck.c (lookup_anon_field): Remove DFS search and return ANON_AGGR_TYPE_FIELD directly.	2021-09-08 17:43:40 +02:00
Joseph Myers	d27d694151	testsuite: Allow .sdata in more cases in gcc.dg/array-quals-1.c When testing for Nios II (gcc-testresults shows this for MIPS as well), failures of gcc.dg/array-quals-1.c appear where a symbol was found in .sdata rather than one of the expected sections. FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?a$ (found a) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?b$ (found b) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?c$ (found c) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?d$ (found d) has section ^\\.(const\|rodata\|srodata)\|\\[RO\\] (found .sdata) Jakub's commit `0b34dbc0a2` allowed .sdata for many variables in that test where use of .sdata caused a failure on powerpc-linux. I'm presuming the choice of which variables had .sdata allowed was based only on the code generated for powerpc-linux, not on any reason it would be wrong to allow it for the other variables; thus, this patch adjusts the test to allow .sdata for some more variables where that is needed on Nios II (and in one case where it's not needed on Nios II, but the test results on gcc-testresults suggest that it is needed on MIPS). Tested with no regressions with cross to nios2-elf. * gcc.dg/array-quals-1.c: Allow .sdata section in more cases.	2021-09-08 15:38:18 +00:00
Joseph Myers	d081516ae1	testsuite: Use explicit -ftree-cselim in tests using -fdump-tree-cselim-details When testing for Nios II (gcc-testresults shows this for various other targets as well), tests scanning cselim dumps produce an UNRESOLVED result because those dumps do not exist. cselim is enabled conditionally by code in toplev.c: if (flag_tree_cselim == AUTODETECT_VALUE) { if (HAVE_conditional_move) flag_tree_cselim = 1; else flag_tree_cselim = 0; } Add explicit -ftree-cselim to dg-options in the affected tests (as already used by some other tests of cselim dumps) so that this dump exists on all architectures. Tested with no regressions with cross to nios2-elf, where this causes the tests in question to PASS instead of being UNRESOLVED. * gcc.dg/tree-ssa/pr89430-1.c, gcc.dg/tree-ssa/pr89430-2.c, gcc.dg/tree-ssa/pr89430-3.c, gcc.dg/tree-ssa/pr89430-4.c, gcc.dg/tree-ssa/pr89430-5.c, gcc.dg/tree-ssa/pr89430-6.c, gcc.dg/tree-ssa/pr89430-7-comp-ref.c, gcc.dg/tree-ssa/pr89430-8-mem-ref-size.c, gcc.dg/tree-ssa/pr99473-1.c: Use -ftree-cselim.	2021-09-08 14:57:20 +00:00
Segher Boessenkool	86e6268cff	rs6000: Fix ELFv2 r12 use in epilogue We cannot use r12 here, it is already in use as the GEP (for sibling calls). 2021-09-08 Segher Boessenkool <segher@kernel.crashing.org> PR target/102107 * config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use r11 instead of r12 for restoring CR.	2021-09-08 13:27:56 +00:00
Jakub Jelinek	7485a52551	i386: Fix up xorsign for AVX [PR89984] Thinking about it more this morning, while this patch fixes the problems revealed in the testcase, the recent PR89984 change was buggy too, but perhaps that can be fixed incrementally. Because for AVX the new code destructively modifies op1. If that is different from dest, say on: float foo (float x, float y) { return x * __builtin_copysignf (1.0f, y) + y; } then we get after RA: (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (unspec:SF [ (reg:SF 20 xmm0 [88]) (reg:SF 21 xmm1 [89]) (mem/u/c:V4SF (symbol_ref/u:DI (".LC0") [flags 0x2]) [0 S16 A128]) ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1} (nil)) (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87]) (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {fop_sf_comm} (nil)) but split the xorsign into: vandps .LC0(%rip), %xmm1, %xmm1 vxorps %xmm0, %xmm1, %xmm0 and then the addition: vaddss %xmm1, %xmm0, %xmm0 which means we miscompile it - instead of adding y in the end we add __builtin_copysignf (0.0f, y). So, wonder if we don't want instead in addition to the &Yv <- Yv, 0 alternative (enabled for both pre-AVX and AVX as in this patch) the &Yv <- Yv, Yv where destination must be different from inputs and another Yv <- Yv, Yv where it can be the same but then need a match_scratch (with X for the other alternatives and =Yv for the last one). That way we'd always have a safe register we can store the op1 & mask value into, either the destination (in the first alternative known to be equal to op1 which is needed for non-AVX but ok for AVX too), in the second alternative known to be different from both inputs and in the third which could be used for those float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); } cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use some other register like xmm2. 
On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote: > I'm curious why we need the post_reload splitter @xorsign<mode>3_1 > for scalar mode, can't we just expand them into and/xor operations in > the expander, just like vector modes did. Following seems to work for all the testcases I've tried (and in some generates better code than the post-reload splitter). 2021-09-08 Jakub Jelinek <jakub@redhat.com> liuhongt <hongtao.liu@intel.com> PR target/89984 * config/i386/i386.md (@xorsign<mode>3_1): Remove. * config/i386/i386-expand.c (ix86_expand_xorsign): Expand right away into AND with mask and XOR, using paradoxical subregs. (ix86_split_xorsign): Remove. * config/i386/i386-protos.h (ix86_split_xorsign): Remove. * gcc.target/i386/avx-pr102224.c: Fix up PR number. * gcc.dg/pr89984.c: New test. * gcc.target/i386/avx-pr89984.c: New test.	2021-09-08 14:06:10 +02:00
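The "AND with mask and XOR" expansion described in the xorsign commit above, and the `foo` function from its message, can be sketched in portable C. This is a bit-level illustration under the assumption that `float` is IEEE-754 binary32; it is not the RTL GCC generates, and `XS_SIGN_MASK`, `xs_bits`, `xs_float`, `xorsign_sketch` and `foo_sketch` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE-754 binary32, so bit 31 is the sign bit. */
#define XS_SIGN_MASK 0x80000000u

/* Reinterpret a float as its 32-bit pattern. */
uint32_t xs_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float. */
float xs_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* xorsign(x, y) == x * copysign(1.0f, y):
   AND y with the sign mask, then XOR the result into x. */
float xorsign_sketch(float x, float y)
{
    uint32_t sign_of_y = xs_bits(y) & XS_SIGN_MASK; /* AND with mask */
    return xs_float(xs_bits(x) ^ sign_of_y);        /* XOR */
}

/* Same shape as foo in the commit message. The masked sign goes into a
   temporary, so the original y is still intact for the final addition;
   overwriting y with (y & mask) is the miscompilation described there. */
float foo_sketch(float x, float y)
{
    return xorsign_sketch(x, y) + y;
}
```

With `x = 2.0f` and `y = -5.0f`, `foo_sketch` returns `-7.0f`; the faulty split would in effect have added `-0.0f` instead of `y`. When both arguments are the same value, `xorsign_sketch` reduces to the absolute value.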
liuhongt	6576ad5add	Compile __{mul,div}hc3 into libgcc_s.so.1. libgcc/ChangeLog: * config/i386/t-softfp: Compile __{mul,div}hc3 into libgcc_s.so.1.	2021-09-08 19:18:15 +08:00
Di Zhao	7285f39455	tree-optimization/102183 - sccvn: fix result compare in vn_nary_op_insert_into If the first predicate value is different and copied, the comparison will then be between val->result and the copied one. That can cause inserting extra vn_pvals. gcc/ChangeLog: * tree-ssa-sccvn.c (vn_nary_op_insert_into): fix result compare	2021-09-08 18:47:18 +08:00
Jakub Jelinek	87d55da7d7	libgcc, i386: Export hf and hc from libgcc_s.so.1 The following patch exports it for Linux from config/i386/.ver where it IMNSHO belongs, aarch64 already exports some of those at GCC_11 and other targets might add them at completely different gcc versions. 2021-09-08 Jakub Jelinek <jakub@redhat.com> Iain Sandoe <iain@sandoe.co.uk> * config/i386/libgcc-glibc.ver: Add %inherit GCC_12.0.0 GCC_7.0.0 and export hf and hc functions at GCC_12.0.0.	2021-09-08 11:34:45 +02:00
Jakub Jelinek	a7b626d98a	i386: Fix up @xorsign<mode>3_1 [PR102224] As the testcase shows, we miscompile @xorsign<mode>3_1 if both input operands are in the same register, because the splitter overwrites op1 with op1 & mask before using op0. For dest = xorsign op0, op0 we can actually simplify it from dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs). The expander change is an optimization improvement: if we know at expansion time that it is xorsign op0, op0, we can emit abs right away and get better code through that. The @xorsign<mode>3_1 change is a fix for the case where xorsign isn't known to have the same operands during expansion, but they become the same during RTL optimizations. For non-AVX we need to use earlyclobber; we require dest and op1 to be the same but op0 must be different because we overwrite op1 first. For AVX the constraints ensure that at most 2 of the 3 operands may be the same register, and if both inputs are the same, the splitter handles that case. This case can be easily tested with the xorsign<mode>3 expander change reverted. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Thinking about it more this morning, while this patch fixes the problems revealed in the testcase, the recent PR89984 change was buggy too, but perhaps that can be fixed incrementally. Because for AVX the new code destructively modifies op1.
If that is different from dest, say on: float foo (float x, float y) { return x * __builtin_copysignf (1.0f, y) + y; } then we get after RA: (insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (unspec:SF [ (reg:SF 20 xmm0 [88]) (reg:SF 21 xmm1 [89]) (mem/u/c:V4SF (symbol_ref/u:DI (".LC0") [flags 0x2]) [0 S16 A128]) ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1} (nil)) (insn 9 8 15 2 (set (reg:SF 20 xmm0 [87]) (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82]) (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {fop_sf_comm} (nil)) but split the xorsign into: vandps .LC0(%rip), %xmm1, %xmm1 vxorps %xmm0, %xmm1, %xmm0 and then the addition: vaddss %xmm1, %xmm0, %xmm0 which means we miscompile it - instead of adding y in the end we add __builtin_copysignf (0.0f, y). So, wonder if we don't want instead in addition to the &Yv <- Yv, 0 alternative (enabled for both pre-AVX and AVX as in this patch) the &Yv <- Yv, Yv where destination must be different from inputs and another Yv <- Yv, Yv where it can be the same but then need a match_scratch (with X for the other alternatives and =Yv for the last one). That way we'd always have a safe register we can store the op1 & mask value into, either the destination (in the first alternative known to be equal to op1 which is needed for non-AVX but ok for AVX too), in the second alternative known to be different from both inputs and in the third which could be used for those float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); } cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use some other register like xmm2. 2021-09-08 Jakub Jelinek <jakub@redhat.com> PR target/102224 * config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to operands[2], emit abs<mode>2 instead. (@xorsign<mode>3_1): Add early-clobbers for output operand, enable first alternative even for avx, add another alternative with =&Yv <- 0, Yv, Yvm constraints. 
* config/i386/i386-expand.c (ix86_split_xorsign): If op0 is equal to op1, emit vpandn instead. * gcc.dg/pr102224.c: New test. * gcc.target/i386/avx-pr102224.c: New test.	2021-09-08 11:25:31 +02:00
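The same-register hazard and the abs simplification from this entry can be modelled with plain integers standing in for registers. This is a hedged sketch, not GCC's splitter: split_naive and split_fixed are hypothetical names, and pointers are used only so that two operands can alias the way two operands can share one hard register.

```c
#include <assert.h>
#include <stdint.h>

#define SIGN_MASK UINT32_C(0x80000000) /* sign bit of an IEEE binary32 value */

/* Model of the buggy split: op1 is overwritten with op1 & mask first.
   When op0 and op1 are the same register, op0 is clobbered too and the
   XOR yields zero.  */
static uint32_t split_naive(uint32_t *op0, uint32_t *op1)
{
    *op1 &= SIGN_MASK;
    return *op0 ^ *op1;
}

/* Model of the fix: if both inputs are the same register, use the
   simplified form (op0 & mask) ^ op0 == op0 & ~mask, i.e. abs.  */
static uint32_t split_fixed(uint32_t *op0, uint32_t *op1)
{
    if (op0 == op1)
        return *op0 & ~SIGN_MASK;
    *op1 &= SIGN_MASK;
    return *op0 ^ *op1;
}
```

With a register holding 0xC0000000 (the bits of -2.0f) passed as both operands, split_naive returns 0 while split_fixed returns 0x40000000, the bits of 2.0f.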
liuhongt	4a61bcaca0	AVX512FP16: Add abi test for zmm gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/m512h/abi-avx512fp16-zmm.exp: New file. * gcc.target/x86_64/abi/avx512fp16/m512h/args.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/asm-support.S: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/avx512fp16-zmm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_m512_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m512h/test_varargs-m512.c: Likewise.	2021-09-08 12:44:50 +08:00
liuhongt	07308cdb0c	AVX512FP16: Add ABI test for ymm. gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/m256h/abi-avx512fp16-ymm.exp: New exp file. * gcc.target/x86_64/abi/avx512fp16/m256h/args.h: New header. * gcc.target/x86_64/abi/avx512fp16/m256h/avx512fp16-ymm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S: New. * gcc.target/x86_64/abi/avx512fp16/m256h/test_m256_returning.c: New test. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/m256h/test_varargs-m256.c: Likewise.	2021-09-08 12:44:50 +08:00
H.J. Lu	22ce16ffa4	AVX512FP16: Add ABI tests for xmm. Copied from regular XMM ABI tests. Only run AVX512FP16 ABI tests for ELF targets. gcc/testsuite/ChangeLog: * gcc.target/x86_64/abi/avx512fp16/abi-avx512fp16-xmm.exp: New exp file for abi test. * gcc.target/x86_64/abi/avx512fp16/args.h: New header file for abi test. * gcc.target/x86_64/abi/avx512fp16/avx512fp16-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/avx512fp16-xmm-check.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/defines.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/macros.h: Likewise. * gcc.target/x86_64/abi/avx512fp16/asm-support.S: New asm for abi check. * gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c: New test. * gcc.target/x86_64/abi/avx512fp16/test_basic_alignment.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_array_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_sizes.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_struct_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_basic_union_size_and_align.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_complex_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_floats.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_m64m128.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_structs.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_passing_unions.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c: Likewise. * gcc.target/x86_64/abi/avx512fp16/test_varargs-m128.c: Likewise.	2021-09-08 12:44:50 +08:00
H.J. Lu	5bbd88bb1e	AVX512FP16: Add tests for vector passing in variable arguments. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vararg-1.c: New test. * gcc.target/i386/avx512fp16-vararg-2.c: Ditto. * gcc.target/i386/avx512fp16-vararg-3.c: Ditto. * gcc.target/i386/avx512fp16-vararg-4.c: Ditto.	2021-09-08 12:44:50 +08:00
liuhongt	2f3318dbcf	AVX512FP16: Add testcase for vector init and broadcast intrinsics. gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Add union128h, union256h, union512h. * gcc.target/i386/avx512fp16-10a.c: New test. * gcc.target/i386/avx512fp16-10b.c: Ditto. * gcc.target/i386/avx512fp16-1a.c: Ditto. * gcc.target/i386/avx512fp16-1b.c: Ditto. * gcc.target/i386/avx512fp16-1c.c: Ditto. * gcc.target/i386/avx512fp16-1d.c: Ditto. * gcc.target/i386/avx512fp16-1e.c: Ditto. * gcc.target/i386/avx512fp16-2a.c: Ditto. * gcc.target/i386/avx512fp16-2b.c: Ditto. * gcc.target/i386/avx512fp16-2c.c: Ditto. * gcc.target/i386/avx512fp16-3a.c: Ditto. * gcc.target/i386/avx512fp16-3b.c: Ditto. * gcc.target/i386/avx512fp16-3c.c: Ditto. * gcc.target/i386/avx512fp16-4.c: Ditto. * gcc.target/i386/avx512fp16-5.c: Ditto. * gcc.target/i386/avx512fp16-6.c: Ditto. * gcc.target/i386/avx512fp16-7.c: Ditto. * gcc.target/i386/avx512fp16-8.c: Ditto. * gcc.target/i386/avx512fp16-9a.c: Ditto. * gcc.target/i386/avx512fp16-9b.c: Ditto. * gcc.target/i386/pr54855-13.c: Ditto. * gcc.target/i386/avx512fp16-vec_set_var.c: Ditto.	2021-09-08 12:44:50 +08:00
liuhongt	9e2a82e1f9	AVX512FP16: Support vector init/broadcast/set/extract for FP16. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_set_ph): New intrinsic. (_mm256_set_ph): Likewise. (_mm512_set_ph): Likewise. (_mm_setr_ph): Likewise. (_mm256_setr_ph): Likewise. (_mm512_setr_ph): Likewise. (_mm_set1_ph): Likewise. (_mm256_set1_ph): Likewise. (_mm512_set1_ph): Likewise. (_mm_setzero_ph): Likewise. (_mm256_setzero_ph): Likewise. (_mm512_setzero_ph): Likewise. (_mm_set_sh): Likewise. (_mm_load_sh): Likewise. (_mm_store_sh): Likewise. * config/i386/i386-builtin-types.def (V8HF): New type. (DEF_FUNCTION_TYPE (V8HF, V8HI)): New builtin function type * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): Support vector HFmodes. (ix86_expand_vector_init_one_nonzero): Likewise. (ix86_expand_vector_init_one_var): Likewise. (ix86_expand_vector_init_interleave): Likewise. (ix86_expand_vector_init_general): Likewise. (ix86_expand_vector_set): Likewise. (ix86_expand_vector_extract): Likewise. (ix86_expand_vector_init_concat): Likewise. (ix86_expand_sse_movcc): Handle vector HFmodes. (ix86_expand_vector_set_var): Ditto. * config/i386/i386-modes.def: Add HF vector modes in comment. * config/i386/i386.c (classify_argument): Add HF vector modes. (ix86_hard_regno_mode_ok): Allow HF vector modes for AVX512FP16. (ix86_vector_mode_supported_p): Likewise. (ix86_set_reg_reg_cost): Handle vector HFmode. (ix86_get_ssemov): Handle vector HFmode. (function_arg_advance_64): Pass unnamed V16HFmode and V32HFmode by stack. (function_arg_advance_32): Pass V8HF/V16HF/V32HF by sse reg for 32bit mode. (function_arg_advance_32): Ditto. * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New. (VALID_AVX256_REG_OR_OI_MODE): Rename to .. (VALID_AVX256_REG_OR_OI_VHF_MODE): .. this, and add V16HF. (VALID_SSE2_REG_VHF_MODE): New. (VALID_AVX512VL_128_REG_MODE): Add V8HF and TImode. (SSE_REG_MODE_P): Add vector HFmode. * config/i386/i386.md (mode): Add HF vector modes. (MODE_SIZE): Likewise.
(ssemodesuffix): Add ph suffix for HF vector modes. * config/i386/sse.md (VFH_128): New mode iterator. (VMOVE): Adjust for HF vector modes. (V): Likewise. (V_256_512): Likewise. (avx512): Likewise. (avx512fmaskmode): Likewise. (shuffletype): Likewise. (sseinsnmode): Likewise. (ssedoublevecmode): Likewise. (ssehalfvecmode): Likewise. (ssehalfvecmodelower): Likewise. (ssePScmode): Likewise. (ssescalarmode): Likewise. (ssescalarmodelower): Likewise. (sseintprefix): Likewise. (i128): Likewise. (bcstscalarsuff): Likewise. (xtg_mode): Likewise. (VI12HF_AVX512VL): New mode_iterator. (VF_AVX512FP16): Likewise. (VIHF): Likewise. (VIHF_256): Likewise. (VIHF_AVX512BW): Likewise. (V16_256): Likewise. (V32_512): Likewise. (sseintmodesuffix): New mode_attr. (sse): Add scalar and vector HFmodes. (ssescalarmode): Add vector HFmode mapping. (ssescalarmodesuffix): Add sh suffix for HFmode. (<sse>_vm<insn><mode>3): Use VFH_128. (<sse>_vm<multdiv_mnemonic><mode>3): Likewise. (ieee_<ieee_maxmin><mode>3): Likewise. (<avx512>_blendm<mode>): New define_insn. (vec_setv8hf): New define_expand. (vec_set<mode>_0): New define_insn for HF vector set. (avx512fp16_movsh): Likewise. (avx512fp16_movsh): Likewise. (vec_extract_lo_v32hi): Rename to ... (vec_extract_lo_<mode>): ... this, and adjust to allow HF vector modes. (vec_extract_hi_v32hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_extract_lo_v16hi): Likewise. (vec_extract_lo_<mode>): Likewise. (vec_extract_hi_v16hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_set_hi_v16hi): Likewise. (vec_set_hi_<mode>): Likewise. (vec_set_lo_v16hi): Likewise. (vec_set_lo_<mode>): Likewise. (vec_extract<mode>_0): New define_insn_and_split for HF vector extract. (vec_extracthf): New define_insn. (VEC_EXTRACT_MODE): Add HF vector modes. (PINSR_MODE): Add V8HF. (sse2p4_1): Likewise. (pinsr_evex_isa): Likewise. (<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support insert for V8HFmode. (pbroadcast_evex_isa): Add HF vector modes. 
(AVX2_VEC_DUP_MODE): Likewise. (VEC_INIT_MODE): Likewise. (VEC_INIT_HALF_MODE): Likewise. (avx2_pbroadcast<mode>): Adjust to support HF vector mode broadcast. (avx2_pbroadcast<mode>_1): Likewise. (<avx512>_vec_dup<mode>_1): Likewise. (<avx512>_vec_dup<mode><mask_name>): Likewise. (<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>): Likewise.	2021-09-08 12:44:50 +08:00
Guo, Xuepeng	a68412117f	AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect FEATURE_AVX512FP16. * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512FP16_SET, OPTION_MASK_ISA_AVX512FP16_UNSET, OPTION_MASK_ISA2_AVX512FP16_SET, OPTION_MASK_ISA2_AVX512FP16_UNSET): New. (OPTION_MASK_ISA2_AVX512BW_UNSET, OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16. (ix86_handle_option): Handle -mavx512fp16. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AVX512FP16. * common/config/i386/i386-isas.h: Add entry for AVX512FP16. * config.gcc: Add avx512fp16intrin.h. * config/i386/avx512fp16intrin.h: New intrinsic header. * config/i386/cpuid.h: Add bit_AVX512FP16. * config/i386/i386-builtin-types.def: (FLOAT16): New primitive type. * config/i386/i386-builtins.c: Support _Float16 type for i386 backend. (ix86_register_float16_builtin_type): New function. (ix86_float16_type_node): New. * config/i386/i386-c.c (ix86_target_macros_internal): Define __AVX512FP16__. * config/i386/i386-expand.c (ix86_expand_branch): Support HFmode. (ix86_prepare_fp_compare_args): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_expand_fp_movcc): Ditto. * config/i386/i386-isa.def: Add PTA define for AVX512FP16. * config/i386/i386-options.c (isa2_opts): Add -mavx512fp16. (ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute. * config/i386/i386.c (ix86_get_ssemov): Use vmovdqu16/vmovw/vmovsh for HFmode/HImode scalar or vector. (ix86_get_excess_precision): Use FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_AVX512FP16 existed. (sse_store_index): Use SFmode cost for HFmode cost. (inline_memory_move_cost): Add HFmode, and prefer SSE cost over GPR cost for HFmode. (ix86_hard_regno_mode_ok): Allow HImode in sse register. (ix86_mangle_type): Add mangling for _Float16 type.
(inline_secondary_memory_needed): No memory is needed for 16bit movement between gpr and sse reg under TARGET_AVX512FP16. (ix86_multiplication_cost): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_division_cost): Ditto. (ix86_rtx_costs): Ditto. (ix86_add_stmt_cost): Ditto. (ix86_optab_supported_p): Ditto. * config/i386/i386.h (VALID_AVX512F_SCALAR_MODE): Add HFmode. (SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Add HFmode. (PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16. * config/i386/i386.md (mode): Add HFmode. (MODE_SIZE): Add HFmode. (isa): Add avx512fp16. (enabled): Handle avx512fp16. (ssemodesuffix): Add sh suffix for HFmode. (comm): Add mult, div. (plusminusmultdiv): New code iterator. (insn): Add mult, div. (movhf_internal): Adjust for avx512fp16 instruction. (movhi_internal): Ditto. (cmpi<unord>hf): New define_insn for HFmode. (ieee_s<ieee_maxmin>hf3): Likewise. (extendhf<mode>2): Likewise. (trunc<mode>hf2): Likewise. (float<floatunssuffix><mode>hf2): Likewise. (<insn>hf): Likewise. (cbranchhf4): New expander. (movhfcc): Likewise. (<insn>hf3): Likewise. (mulhf3): Likewise. (divhf3): Likewise. config/i386/i386.opt: Add mavx512fp16. * config/i386/immintrin.h: Include avx512fp16intrin.h. * doc/invoke.texi: Add mavx512fp16. * doc/extend.texi: Add avx512fp16 Usage Notes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16. * gcc.target/i386/funcspec-56.inc: Add new target attribute check. * gcc.target/i386/sse-13.c: Add -mavx512fp16. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * lib/target-supports.exp: (check_effective_target_avx512fp16): New. * g++.target/i386/float16-1.C: New test. * g++.target/i386/float16-2.C: Ditto. * g++.target/i386/float16-3.C: Ditto. * gcc.target/i386/avx512fp16-12a.c: Ditto. * gcc.target/i386/avx512fp16-12b.c: Ditto. 
* gcc.target/i386/float16-3a.c: Ditto. * gcc.target/i386/float16-3b.c: Ditto. * gcc.target/i386/float16-4a.c: Ditto. * gcc.target/i386/float16-4b.c: Ditto. * gcc.target/i386/pr54855-12.c: Ditto. * g++.dg/other/i386-2.C: Ditto. * g++.dg/other/i386-3.C: Ditto. Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com> Co-Authored-By: Liu Hongtao <hongtao.liu@intel.com> Co-Authored-By: Wang Hongyu <hongyu.wang@intel.com> Co-Authored-By: Xu Dianhong <dianhong.xu@intel.com>	2021-09-08 12:44:50 +08:00
liuhongt	f19a327077	Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16. gcc/ada/ChangeLog: * gcc-interface/misc.c (gnat_post_options): Issue an error for -fexcess-precision=16. gcc/c-family/ChangeLog: * c-common.c (excess_precision_mode_join): Update below comments. (c_ts18661_flt_eval_method): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. * c-cppbuiltin.c (cpp_atomic_builtins): Update below comments. (c_cpp_flt_eval_method_iec_559): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. gcc/ChangeLog: * common.opt: Support -fexcess-precision=16. * config/aarch64/aarch64.c (aarch64_excess_precision): Return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when EXCESS_PRECISION_TYPE_FLOAT16. * config/arm/arm.c (arm_excess_precision): Ditto. * config/i386/i386.c (ix86_get_excess_precision): Ditto. * config/m68k/m68k.c (m68k_excess_precision): Issue an error when EXCESS_PRECISION_TYPE_FLOAT16. * config/s390/s390.c (s390_excess_precision): Ditto. * coretypes.h (enum excess_precision_type): Add EXCESS_PRECISION_TYPE_FLOAT16. * doc/tm.texi (TARGET_C_EXCESS_PRECISION): Update documents. * doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): Ditto. * doc/extend.texi (Half-Precision): Document -fexcess-precision=16. * flag-types.h (enum excess_precision): Add EXCESS_PRECISION_FLOAT16. * target.def (excess_precision): Update document. * tree.c (excess_precision_type): Set excess_precision_type to EXCESS_PRECISION_FLOAT16 when -fexcess-precision=16. gcc/fortran/ChangeLog: * options.c (gfc_post_options): Issue an error for -fexcess-precision=16. gcc/testsuite/ChangeLog: * gcc.target/i386/float16-6.c: New test. * gcc.target/i386/float16-7.c: New test.	2021-09-08 12:44:49 +08:00
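A program can observe the evaluation method this option affects through the standard FLT_EVAL_METHOD macro from <float.h>. The sketch below is illustrative only. The function names are hypothetical, and the mapping of the value 16 to _Float16 evaluation is an assumption based on the FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 name in this entry and the TS 18661-3 convention. Whether <float.h> reports that value directly, or only when __STDC_WANT_IEC_60559_TYPES_EXT__ is defined before the include, should be checked against the GCC documentation.

```c
#include <float.h>
#include <stdio.h>

/* Map an FLT_EVAL_METHOD value to a short description.  Values -1, 0, 1
   and 2 come from the C standard; 16 is assumed to mean evaluation in
   _Float16, following the TS 18661-3 convention.  */
static const char *flt_eval_method_name(int method)
{
    switch (method) {
    case -1: return "indeterminable";
    case 0:  return "evaluate in the range and precision of the type";
    case 1:  return "evaluate float and double as double";
    case 2:  return "evaluate everything as long double";
    case 16: return "evaluate _Float16 operations in _Float16 (assumed)";
    default: return "implementation-defined";
    }
}

/* Print the method the current translation unit was compiled with.  */
static void print_flt_eval_method(void)
{
    int method = (int) FLT_EVAL_METHOD;
    printf("FLT_EVAL_METHOD = %d: %s\n", method, flt_eval_method_name(method));
}
```

Calling print_flt_eval_method() from a build with and without -fexcess-precision=16 is one way to see whether the option changed the reported method on a given target.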
liuhongt	a549a9a39a	Adjust the wording for x86 _Float16 type. gcc/ChangeLog: * doc/extend.texi: (@node Floating Types): Adjust the wording. (@node Half-Precision): Ditto.	2021-09-08 09:10:45 +08:00
GCC Administrator	b2748138c0	Daily bump.	2021-09-08 00:16:23 +00:00
Max Filippov	b552c4e601	gcc: xtensa: fix PR target/102115 2021-09-07 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> gcc/ PR target/102115 * config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add 'CONST_INT_P (src)' to the condition of the block that tries to eliminate literal when loading integer constant.	2021-09-07 15:40:26 -07:00
Ian Lance Taylor	21b046bade	runtime: use hash32, not hash64, for amd64p32, mips64p32, mips64p32le Fixes PR go/102102 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/348015	2021-09-07 15:05:11 -07:00
David Faust	d9996ccb94	doc: BPF CO-RE documentation Document the new command line options (-mco-re and -mno-co-re), the new BPF target builtin (__builtin_preserve_access_index), and the new BPF target attribute (preserve_access_index) introduced with BPF CO-RE. gcc/ChangeLog: * doc/extend.texi (BPF Type Attributes) New node. Document new preserve_access_index attribute. Document new preserve_access_index builtin. * doc/invoke.texi: Document -mco-re and -mno-co-re options.	2021-09-07 13:48:58 -07:00
David Faust	f4cdfd4856	bpf testsuite: Add BPF CO-RE tests This commit adds several tests for the new BPF CO-RE functionality to the BPF target testsuite. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-attr-1.c: New test. * gcc.target/bpf/core-attr-2.c: Likewise. * gcc.target/bpf/core-attr-3.c: Likewise. * gcc.target/bpf/core-attr-4.c: Likewise * gcc.target/bpf/core-builtin-1.c: Likewise * gcc.target/bpf/core-builtin-2.c: Likewise. * gcc.target/bpf/core-builtin-3.c: Likewise. * gcc.target/bpf/core-section-1.c: Likewise.	2021-09-07 13:48:58 -07:00
David Faust	8bdabb3754	bpf: BPF CO-RE support This commit introduces support for BPF Compile Once - Run Everywhere (CO-RE) in GCC. gcc/ChangeLog: * config/bpf/bpf.c: Adjust includes. (bpf_handle_preserve_access_index_attribute): New function. (bpf_attribute_table): Use it here. (bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX. (bpf_option_override): Handle "-mco-re" option. (bpf_asm_init_sections): New. (TARGET_ASM_INIT_SECTIONS): Redefine. (bpf_file_end): New. (TARGET_ASM_FILE_END): Redefine. (bpf_init_builtins): Add "__builtin_preserve_access_index". (bpf_core_compute, bpf_core_get_index): New. (is_attr_preserve_access): New. (bpf_expand_builtin): Handle new builtins. (bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New. (bpf_core_walk): New. (bpf_resolve_overloaded_builtin): New. (TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine. (handle_attr): New. (pass_bpf_core_attr): New RTL pass. * config/bpf/bpf-passes.def: New file. * config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New. * config/bpf/coreout.c: New file. * config/bpf/coreout.h: Likewise. * config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h. (coreout.o): New rule. (PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def. * config.gcc (bpf): Add coreout.h to extra_headers. Add coreout.o to extra_objs. Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.	2021-09-07 13:48:58 -07:00
David Faust	0a2bd52f1a	btf: expose get_btf_id Expose the function get_btf_id, so that it may be used by the BPF backend. This enables the BPF CO-RE machinery in the BPF backend to lookup BTF type IDs, in order to create CO-RE relocation records. A prototype is added in ctfc.h gcc/ChangeLog: * btfout.c (get_btf_id): Function is no longer static. * ctfc.h: Expose it here.	2021-09-07 13:48:58 -07:00
David Faust	5b723401b3	ctfc: add function to lookup CTF ID of a TREE type Add a new function, ctf_lookup_tree_type, to return the CTF type ID associated with a type via its TREE node. The function is exposed via a prototype in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_lookup_tree_type): New function. * ctfc.h: Likewise.	2021-09-07 13:48:58 -07:00
David Faust	44e4ed6a3c	ctfc: externalize ctf_dtd_lookup Expose the function ctf_dtd_lookup, so that it can be used by the BPF CO-RE machinery. The function is no longer static, and an extern prototype is added in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_dtd_lookup): Function is no longer static. * ctfc.h: Analogous change.	2021-09-07 13:48:58 -07:00
David Faust	81eced213c	dwarf: externalize lookup_type_die Expose the function lookup_type_die in dwarf2out, so that it can be used by CTF/BTF when adding BPF CO-RE information. The function is now non-static, and an extern prototype is added in dwarf2out.h. gcc/ChangeLog: * dwarf2out.c (lookup_type_die): Function is no longer static. * dwarf2out.h: Expose it here.	2021-09-07 13:48:57 -07:00
Hans-Peter Nilsson	578cd82af7	Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.c Dejagnu is unfortunately brittle: a syntax error in a directive can abort the test-run for the current "tool" (gcc, g++, gfortran), and if you don't check for this condition or actually read the stdout log yourself, your tools may make you believe the test was successful without regressions. At the very least, always grep for ^ERROR: in the stdout log! With r12-3379, the testsuite got such a fatal syntax error, causing the gcc test-run to abort at (e.g.): ... FAIL: gcc.dg/memchr.c (test for excess errors) FAIL: gcc.dg/memcmp-3.c (test for excess errors) ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {"} optimized" does not exist. The error code is TCL LOOKUP COMMAND scan-tree-dump-not\" The info on the error is: invalid command name "scan-tree-dump-not"" while executing "::tcl_unknown scan-tree-dump-not\" = foo {"} optimized" ("uplevel" body line 1) invoked from within "uplevel 1 ::tcl_unknown $args" === gcc Summary === # of expected passes 63740 # of unexpected failures 38 # of unexpected successes 2 # of expected failures 351 # of unresolved testcases 3 # of unsupported tests 662 x/cris-elf/gccobj/gcc/xgcc version 12.0.0 20210907 (experimental)\ [master r12-3391-g849d5f5929fc] (GCC) testsuite: * gcc.dg/no_profile_instrument_function-attr-2.c: Fix typo in last change.	2021-09-07 22:36:59 +02:00
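The advice in this entry, to always look for lines starting with ERROR: in the test-run log, amounts to a line-anchored prefix match. A minimal sketch of that check follows. The function name count_error_lines is hypothetical, and the log is assumed to be already loaded into a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the lines of `log` that begin with "ERROR:", i.e. what
   grep -c '^ERROR:' would report.  Indented occurrences do not count.  */
static size_t count_error_lines(const char *log)
{
    static const char tag[] = "ERROR:";
    size_t count = 0;
    const char *line = log;

    while (*line != '\0') {
        if (strncmp(line, tag, sizeof tag - 1) == 0)
            count++;
        const char *newline = strchr(line, '\n');
        if (newline == NULL)
            break;              /* last line has no trailing newline */
        line = newline + 1;
    }
    return count;
}
```

A nonzero result means the DejaGnu run for that tool may have aborted early, so its summary counts should not be trusted as a complete regression check.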
Harald Anlauf	2a1537a19c	Fortran - improve error recovery determining array element from constructor gcc/fortran/ChangeLog: PR fortran/101327 * expr.c (find_array_element): When bounds cannot be determined as constant, return error instead of aborting. gcc/testsuite/ChangeLog: PR fortran/101327 * gfortran.dg/pr101327.f90: New test.	2021-09-07 20:51:49 +02:00
Indu Bhagat	849d5f5929	dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecase DWARF generation is split between early and late phases when LTO is in effect. This poses challenges for CTF/BTF generation especially if late debug info generation is desirable, as turns out to be the case for BPF CO-RE. The approach taken here in this patch is: 1. LTO is disabled for BPF CO-RE The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE relocations need to be generated in the LTO link phase _after_ the optimizations are done. This means we need to devise a way to combine early and late BTF. At this time, in the absence of linker support for BTF sections, it makes sense to steer clear of LTO for BPF CO-RE and bypass the issue. 2. The BPF backend updates the write_symbols with BPF_WITH_CORE_DEBUG to convey the case that BTF with CO-RE support needs to be generated. This information is used by the debug info emission routines to defer the emission of BTF/CO-RE until dwarf2out_finish. So, in other words, dwarf2out_early_finish - Always emit CTF here. - if (BTF && !BTF_WITH_CORE), emit BTF now. dwarf2out_finish - if (BTF_WITH_CORE) emit BTF now. gcc/ChangeLog: * dwarf2ctf.c (ctf_debug_finalize): Make it static. (ctf_debug_early_finish): New definition. (ctf_debug_finish): Likewise. * dwarf2ctf.h (ctf_debug_finalize): Remove declaration. (ctf_debug_early_finish): New declaration. (ctf_debug_finish): Likewise. * dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish. (dwarf2out_early_finish): Invoke ctf_debug_early_finish.	2021-09-07 11:18:54 -07:00
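The early/late emission rule stated in this entry can be written down as two small predicates. This is a sketch of the decision table only. The struct and function names are hypothetical stand-ins and do not correspond to GCC's write_symbols flags or to the real ctf_debug_early_finish and ctf_debug_finish implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the requested debug formats.  */
struct debug_request {
    bool btf;            /* BTF output requested */
    bool btf_with_core;  /* BTF with CO-RE support requested */
};

/* What a given phase emits.  */
struct emitted {
    bool ctf;
    bool btf;
};

/* Early phase: CTF is always emitted here; BTF only when CO-RE is not
   in use.  */
static struct emitted model_early_finish(struct debug_request req)
{
    struct emitted out = { true, req.btf && !req.btf_with_core };
    return out;
}

/* Late phase: BTF is emitted here only in the CO-RE case, after the
   backend has had the chance to record its relocations.  */
static struct emitted model_finish(struct debug_request req)
{
    struct emitted out = { false, req.btf_with_core };
    return out;
}
```

In this model BTF is emitted exactly once in each configuration: early for plain BTF, late for BTF with CO-RE.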

187849 Commits