Commit Graph

187943 Commits

Author SHA1 Message Date
GCC Administrator f84e2f0b7b Daily bump. 2021-09-10 00:16:31 +00:00
qing zhao a25e0b5e6a Add -ftrivial-auto-var-init option and uninitialized variable attribute.
Initialize automatic variables with either a pattern or with zeroes to increase
the security and predictability of a program by preventing uninitialized memory
disclosure and use.
GCC still considers an automatic variable that doesn't have an explicit
initializer as uninitialized, -Wuninitialized will still report warning messages
on such automatic variables.
With this option, GCC will also initialize any padding of automatic variables
that have structure or union types to zeroes.
You can control this behavior for a specific variable by using the variable
attribute "uninitialized" to control runtime overhead.

gcc/ChangeLog:

2021-09-09  qing zhao  <qing.zhao@oracle.com>

	* builtins.c (expand_builtin_memset): Make external visible.
	* builtins.h (expand_builtin_memset): Declare extern.
	* common.opt (ftrivial-auto-var-init=): New option.
	* doc/extend.texi: Document the uninitialized attribute.
	* doc/invoke.texi: Document -ftrivial-auto-var-init.
	* flag-types.h (enum auto_init_type): New enumerated type
	auto_init_type.
	* gimple-fold.c (clear_padding_type): Add one new parameter.
	(clear_padding_union): Likewise.
	(clear_padding_emit_loop): Likewise.
	(clear_type_padding_in_mask): Likewise.
	(gimple_fold_builtin_clear_padding): Handle this new parameter.
	* gimplify.c (gimple_add_init_for_auto_var): New function.
	(gimple_add_padding_init_for_auto_var): New function.
	(is_var_need_auto_init): New function.
	(gimplify_decl_expr): Add initialization to automatic variables per
	users' requests.
	(gimplify_call_expr): Add one new parameter for call to
	__builtin_clear_padding.
	(gimplify_init_constructor): Add padding initialization in the end.
	* internal-fn.c (INIT_PATTERN_VALUE): New macro.
	(expand_DEFERRED_INIT): New function.
	* internal-fn.def (DEFERRED_INIT): New internal function.
	* tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT.
	* tree-sra.c (generate_subtree_deferred_init): New function.
	(scan_function): Avoid setting cannot_scalarize_away_bitmap for
	calls to .DEFERRED_INIT.
	(sra_modify_deferred_init): New function.
	(sra_modify_function_body): Handle calls to DEFERRED_INIT specially.
	* tree-ssa-structalias.c (find_func_aliases_for_call): Likewise.
	* tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT
	specially.
	(check_defs): Likewise.
	(warn_uninitialized_vars): Likewise.
	* tree-ssa.c (ssa_undefined_value_p): Likewise.
	* tree.c (build_common_builtin_nodes): Build tree node for
	BUILT_IN_CLEAR_PADDING when needed.

gcc/c-family/ChangeLog:

2021-09-09  qing zhao  <qing.zhao@oracle.com>

	* c-attribs.c (handle_uninitialized_attribute): New function.
	(c_common_attribute_table): Add "uninitialized" attribute.

gcc/testsuite/ChangeLog:

2021-09-09  qing zhao  <qing.zhao@oracle.com>

	* c-c++-common/auto-init-1.c: New test.
	* c-c++-common/auto-init-10.c: New test.
	* c-c++-common/auto-init-11.c: New test.
	* c-c++-common/auto-init-12.c: New test.
	* c-c++-common/auto-init-13.c: New test.
	* c-c++-common/auto-init-14.c: New test.
	* c-c++-common/auto-init-15.c: New test.
	* c-c++-common/auto-init-16.c: New test.
	* c-c++-common/auto-init-2.c: New test.
	* c-c++-common/auto-init-3.c: New test.
	* c-c++-common/auto-init-4.c: New test.
	* c-c++-common/auto-init-5.c: New test.
	* c-c++-common/auto-init-6.c: New test.
	* c-c++-common/auto-init-7.c: New test.
	* c-c++-common/auto-init-8.c: New test.
	* c-c++-common/auto-init-9.c: New test.
	* c-c++-common/auto-init-esra.c: New test.
	* c-c++-common/auto-init-padding-1.c: New test.
	* c-c++-common/auto-init-padding-2.c: New test.
	* c-c++-common/auto-init-padding-3.c: New test.
	* g++.dg/auto-init-uninit-pred-1_a.C: New test.
	* g++.dg/auto-init-uninit-pred-2_a.C: New test.
	* g++.dg/auto-init-uninit-pred-3_a.C: New test.
	* g++.dg/auto-init-uninit-pred-4.C: New test.
	* gcc.dg/auto-init-sra-1.c: New test.
	* gcc.dg/auto-init-sra-2.c: New test.
	* gcc.dg/auto-init-uninit-1.c: New test.
	* gcc.dg/auto-init-uninit-12.c: New test.
	* gcc.dg/auto-init-uninit-13.c: New test.
	* gcc.dg/auto-init-uninit-14.c: New test.
	* gcc.dg/auto-init-uninit-15.c: New test.
	* gcc.dg/auto-init-uninit-16.c: New test.
	* gcc.dg/auto-init-uninit-17.c: New test.
	* gcc.dg/auto-init-uninit-18.c: New test.
	* gcc.dg/auto-init-uninit-19.c: New test.
	* gcc.dg/auto-init-uninit-2.c: New test.
	* gcc.dg/auto-init-uninit-20.c: New test.
	* gcc.dg/auto-init-uninit-21.c: New test.
	* gcc.dg/auto-init-uninit-22.c: New test.
	* gcc.dg/auto-init-uninit-23.c: New test.
	* gcc.dg/auto-init-uninit-24.c: New test.
	* gcc.dg/auto-init-uninit-25.c: New test.
	* gcc.dg/auto-init-uninit-26.c: New test.
	* gcc.dg/auto-init-uninit-3.c: New test.
	* gcc.dg/auto-init-uninit-34.c: New test.
	* gcc.dg/auto-init-uninit-36.c: New test.
	* gcc.dg/auto-init-uninit-37.c: New test.
	* gcc.dg/auto-init-uninit-4.c: New test.
	* gcc.dg/auto-init-uninit-5.c: New test.
	* gcc.dg/auto-init-uninit-6.c: New test.
	* gcc.dg/auto-init-uninit-8.c: New test.
	* gcc.dg/auto-init-uninit-9.c: New test.
	* gcc.dg/auto-init-uninit-A.c: New test.
	* gcc.dg/auto-init-uninit-B.c: New test.
	* gcc.dg/auto-init-uninit-C.c: New test.
	* gcc.dg/auto-init-uninit-H.c: New test.
	* gcc.dg/auto-init-uninit-I.c: New test.
	* gcc.target/aarch64/auto-init-1.c: New test.
	* gcc.target/aarch64/auto-init-2.c: New test.
	* gcc.target/aarch64/auto-init-3.c: New test.
	* gcc.target/aarch64/auto-init-4.c: New test.
	* gcc.target/aarch64/auto-init-5.c: New test.
	* gcc.target/aarch64/auto-init-6.c: New test.
	* gcc.target/aarch64/auto-init-7.c: New test.
	* gcc.target/aarch64/auto-init-8.c: New test.
	* gcc.target/aarch64/auto-init-padding-1.c: New test.
	* gcc.target/aarch64/auto-init-padding-10.c: New test.
	* gcc.target/aarch64/auto-init-padding-11.c: New test.
	* gcc.target/aarch64/auto-init-padding-12.c: New test.
	* gcc.target/aarch64/auto-init-padding-2.c: New test.
	* gcc.target/aarch64/auto-init-padding-3.c: New test.
	* gcc.target/aarch64/auto-init-padding-4.c: New test.
	* gcc.target/aarch64/auto-init-padding-5.c: New test.
	* gcc.target/aarch64/auto-init-padding-6.c: New test.
	* gcc.target/aarch64/auto-init-padding-7.c: New test.
	* gcc.target/aarch64/auto-init-padding-8.c: New test.
	* gcc.target/aarch64/auto-init-padding-9.c: New test.
	* gcc.target/i386/auto-init-1.c: New test.
	* gcc.target/i386/auto-init-2.c: New test.
	* gcc.target/i386/auto-init-21.c: New test.
	* gcc.target/i386/auto-init-22.c: New test.
	* gcc.target/i386/auto-init-23.c: New test.
	* gcc.target/i386/auto-init-24.c: New test.
	* gcc.target/i386/auto-init-3.c: New test.
	* gcc.target/i386/auto-init-4.c: New test.
	* gcc.target/i386/auto-init-5.c: New test.
	* gcc.target/i386/auto-init-6.c: New test.
	* gcc.target/i386/auto-init-7.c: New test.
	* gcc.target/i386/auto-init-8.c: New test.
	* gcc.target/i386/auto-init-padding-1.c: New test.
	* gcc.target/i386/auto-init-padding-10.c: New test.
	* gcc.target/i386/auto-init-padding-11.c: New test.
	* gcc.target/i386/auto-init-padding-12.c: New test.
	* gcc.target/i386/auto-init-padding-2.c: New test.
	* gcc.target/i386/auto-init-padding-3.c: New test.
	* gcc.target/i386/auto-init-padding-4.c: New test.
	* gcc.target/i386/auto-init-padding-5.c: New test.
	* gcc.target/i386/auto-init-padding-6.c: New test.
	* gcc.target/i386/auto-init-padding-7.c: New test.
	* gcc.target/i386/auto-init-padding-8.c: New test.
	* gcc.target/i386/auto-init-padding-9.c: New test.
2021-09-09 15:44:49 -07:00
Harald Anlauf 5fe0865ab7 Fortran - out of bounds in array constructor with implied do loop
gcc/fortran/ChangeLog:

	PR fortran/98490
	* trans-expr.c (gfc_conv_substring): Do not generate substring
	bounds check for implied do loop index variable before it actually
	becomes defined.

gcc/testsuite/ChangeLog:

	PR fortran/98490
	* gfortran.dg/bounds_check_23.f90: New test.
2021-09-09 21:34:01 +02:00
H.J. Lu de515ce0b2 x86-64: Update AVX512FP16 ABI tests for x32
On x32, long is the same as int and pointer is 32 bits.  Update AVX512FP16
ABI tests:

1. Replace long with long long for 64-bit integers.
2. Update type and alignment for long and pointer.
3. Skip tests for long on x32.

	* gcc.target/x86_64/abi/avx512fp16/args.h: Replace long with
	long long.
	(XMM_T): Rename _long to _longlong and _ulong to _ulonglong.
	(X87_T): Rename _ulong to _ulonglong.
	* gcc.target/x86_64/abi/avx512fp16/defines.h (TYPE_SIZE_LONG):
	Define to 4 if __ILP32__ is defined.
	(TYPE_SIZE_POINTER): Likewise.
	(TYPE_ALIGN_LONG): Likewise.
	(TYPE_ALIGN_POINTER): Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c
	(main): Skip test for long if __ILP32__ is defined.
	* gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c
	(do_test): Replace _long with _longlong.
	* gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c:
	(check_300): Replace _ulong with _ulonglong.
	* gcc.target/x86_64/abi/avx512fp16/m256h/args.h: Replace long
	with long long.
	(YMM_T): Rename _long to _longlong and _ulong to _ulonglong.
	(X87_T): Rename _ulong to _ulonglong.
	* gcc.target/x86_64/abi/avx512fp16/m512h/args.h: Replace long
	with long long.
	(ZMM_T): Rename _long to _longlong and _ulong to _ulonglong.
	(X87_T): Rename _ulong to _ulonglong.
2021-09-09 08:42:35 -07:00
Richard Biener 013cfc6484 Improve LIM fill_always_executed_in computation
Currently the DOM walk over a loop body does not walk into not
always executed subloops to avoid scalability issues since doing
so makes the walk quadratic in the loop depth.  It turns out this
is not an issue in practice and even with a loop depth of 1800
this function is way off the radar.

So the following patch removes the limitation, replacing it with
a comment.

2021-09-09  Richard Biener  <rguenther@suse.de>

	* tree-ssa-loop-im.c (fill_always_executed_in_1): Walk
	into all subloops.

	* gcc.dg/tree-ssa/ssa-lim-17.c: New testcase.
2021-09-09 11:50:20 +02:00
Richard Biener 6e27bc2b88 Avoid full DOM walk in LIM fill_always_executed_in
This avoids a full DOM walk via get_loop_body_in_dom_order in the
loop body walk of fill_always_executed_in which is often terminating
the walk of a loop body early by integrating the DOM walk of
get_loop_body_in_dom_order with the actual processing done by
fill_always_executed_in.  This trades the fully populated loop
body array with a worklist allocation of the same size and thus
should be a strict improvement over the recursive approach of
get_loop_body_in_dom_order.

2021-09-09  Richard Biener  <rguenther@suse.de>

	* tree-ssa-loop-im.c (fill_always_executed_in_1): Integrate
	DOM walk from get_loop_body_in_dom_order using a worklist
	approach.
2021-09-09 11:16:58 +02:00
liuhongt f77f3adebd AVX512FP16: Add testcase for vaddph/vsubph/vmulph/vdivph.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h: New header file for
	FP16 runtime test.
	* gcc.target/i386/avx512fp16-vaddph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vaddph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vdivph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vdivph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vmulph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vmulph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vsubph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vsubph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vaddph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vaddph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vdivph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vdivph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vmulph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vmulph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vsubph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vsubph-1b.c: Ditto.
2021-09-09 16:09:05 +08:00
liuhongt bd7a34ef55 AVX512FP16: Add vaddph/vsubph/vdivph/vmulph.
gcc/ChangeLog:

	* config.gcc: Add avx512fp16vlintrin.h.
	* config/i386/avx512fp16intrin.h: (_mm512_add_ph): New intrinsic.
	(_mm512_mask_add_ph): Likewise.
	(_mm512_maskz_add_ph): Likewise.
	(_mm512_sub_ph): Likewise.
	(_mm512_mask_sub_ph): Likewise.
	(_mm512_maskz_sub_ph): Likewise.
	(_mm512_mul_ph): Likewise.
	(_mm512_mask_mul_ph): Likewise.
	(_mm512_maskz_mul_ph): Likewise.
	(_mm512_div_ph): Likewise.
	(_mm512_mask_div_ph): Likewise.
	(_mm512_maskz_div_ph): Likewise.
	(_mm512_add_round_ph): Likewise.
	(_mm512_mask_add_round_ph): Likewise.
	(_mm512_maskz_add_round_ph): Likewise.
	(_mm512_sub_round_ph): Likewise.
	(_mm512_mask_sub_round_ph): Likewise.
	(_mm512_maskz_sub_round_ph): Likewise.
	(_mm512_mul_round_ph): Likewise.
	(_mm512_mask_mul_round_ph): Likewise.
	(_mm512_maskz_mul_round_ph): Likewise.
	(_mm512_div_round_ph): Likewise.
	(_mm512_mask_div_round_ph): Likewise.
	(_mm512_maskz_div_round_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h: New header.
	* config/i386/i386-builtin-types.def (V16HF, V8HF, V32HF):
	Add new builtin types.
	* config/i386/i386-builtin.def: Add corresponding builtins.
	* config/i386/i386-expand.c
	(ix86_expand_args_builtin): Handle new builtin types.
	(ix86_expand_round_builtin): Likewise.
	* config/i386/immintrin.h: Include avx512fp16vlintrin.h
	* config/i386/sse.md (VFH): New mode_iterator.
	(VF2H): Likewise.
	(avx512fmaskmode): Add HF vector modes.
	(avx512fmaskhalfmode): Likewise.
	(<plusminus_insn><mode>3<mask_name><round_name>): Adjust to for
	HF vector modes.
	(*<plusminus_insn><mode>3<mask_name><round_name>): Likewise.
	(mul<mode>3<mask_name><round_name>): Likewise.
	(*mul<mode>3<mask_name><round_name>): Likewise.
	(div<mode>3): Likewise.
	(<sse>_div<mode>3<mask_name><round_name>): Likewise.
	* config/i386/subst.md (SUBST_V): Add HF vector modes.
	(SUBST_A): Likewise.
	(round_mode512bit_condition): Adjust for V32HFmode.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add -mavx512vl and test for new intrinsics.
	* gcc.target/i386/avx-2.c: Add -mavx512vl.
	* gcc.target/i386/avx512fp16-11a.c: New test.
	* gcc.target/i386/avx512fp16-11b.c: Ditto.
	* gcc.target/i386/avx512vlfp16-11a.c: Ditto.
	* gcc.target/i386/avx512vlfp16-11b.c: Ditto.
	* gcc.target/i386/sse-13.c: Add test for new builtins.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-09 16:08:56 +08:00
liuhongt 8f323c712e Optimize v4sf reduction.
gcc/ChangeLog:

	PR target/101059
	* config/i386/sse.md (reduc_plus_scal_<mode>): Split to ..
	(reduc_plus_scal_v4sf): .. this, New define_expand.
	(reduc_plus_scal_v2df): .. and this, New define_expand.

gcc/testsuite/ChangeLog:

	PR target/101059
	* gcc.target/i386/sse2-pr101059.c: New test.
	* gcc.target/i386/sse3-pr101059.c: New test.
2021-09-09 09:34:15 +08:00
liuhongt 60eec23b5e Optimize vec_extract for 256/512-bit vector when index exceeds the lower 128 bits.
-	vextracti32x8	$0x1, %zmm0, %ymm0
-	vmovd	%xmm0, %eax
+	valignd	$8, %zmm0, %zmm0, %zmm1
+	vmovd	%xmm1, %eax

-	vextracti32x8	$0x1, %zmm0, %ymm0
-	vextracti128	$0x1, %ymm0, %xmm0
-	vpextrd	$3, %xmm0, %eax
+	valignd	$15, %zmm0, %zmm0, %zmm1
+	vmovd	%xmm1, %eax

-	vextractf64x2	$0x1, %ymm0, %xmm0
+	valignq	$2, %ymm0, %ymm0, %ymm0

-	vextractf64x4	$0x1, %zmm0, %ymm0
-	vextractf64x2	$0x1, %ymm0, %xmm0
-	vunpckhpd	%xmm0, %xmm0, %xmm0
+	valignq	$7, %zmm0, %zmm0, %zmm0

gcc/ChangeLog:

	PR target/91103
	* config/i386/sse.md (*vec_extract<mode><ssescalarmodelower>_valign):
	New define_insn.

gcc/testsuite/ChangeLog:

	PR target/91103
	* gcc.target/i386/pr91103-1.c: New test.
	* gcc.target/i386/pr91103-2.c: New test.
2021-09-09 09:33:40 +08:00
GCC Administrator b6db7cd41c Daily bump. 2021-09-09 00:16:32 +00:00
Jonathan Wakely 3c64582372 c++: Fix docs on assignment of virtual bases [PR60318]
The description of behaviour is incorrect, the virtual base gets
assigned before entering the bodies of A::operator= and B::operator=,
not after.

The example is also ill-formed (passing a string literal to char*) and
undefined (missing return from Base::operator=).

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

gcc/ChangeLog:

	PR c++/60318
	* doc/trouble.texi (Copy Assignment): Fix description of
	behaviour and fix code in example.
2021-09-08 22:34:16 +01:00
David Malcolm e66b9f6779 analyzer: fix ICE when discarding result of realloc [PR102225]
gcc/analyzer/ChangeLog:
	PR analyzer/102225
	* analyzer.h (compat_types_p): New decl.
	* constraint-manager.cc
	(constraint_manager::get_or_add_equiv_class): Guard against NULL
	type when checking for pointer types.
	* region-model-impl-calls.cc (region_model::impl_call_realloc):
	Guard against NULL lhs type/region.  Guard against the size value
	not being of a compatible type for dynamic extents.
	* region-model.cc (compat_types_p): Make non-static.

gcc/testsuite/ChangeLog:
	PR analyzer/102225
	* gcc.dg/analyzer/realloc-1.c (test_10): New.
	* gcc.dg/analyzer/torture/pr102225.c: New test.
2021-09-08 14:37:19 -04:00
Richard Biener 716a583692 c++/102228 - make lookup_anon_field O(1)
For the testcase in PR101555 lookup_anon_field takes the majority
of parsing time followed by get_class_binding_direct/fields_linear_search
which is PR83309.  The situation with anon aggregates is particularly
dire when we need to build accesses to their members and the anon
aggregates are nested.  There for each such access we recursively
build sub-accesses to the anon aggregate FIELD_DECLs bottom-up,
DFS searching for them.  That's inefficient since as I believe
there's a 1:1 relationship between anon aggregate types and the
FIELD_DECL used to place them.

The patch below does away with the search in lookup_anon_field and
instead records the single FIELD_DECL in the anon aggregate types
lang-specific data, re-using the RTTI typeinfo_var field.  That
speeds up the compile of the testcase with -fsyntax-only from
about 4.5s to slightly less than 1s.

I tried to poke holes into the 1:1 relationship idea with my C++
knowledge but failed (which might not say much).  It also leaves
a hole for the case when the C++ FE itself duplicates such type
and places it at a semantically different position.  I've tried
to poke holes into it with the duplication mechanism I understand
(templates) but failed.

2021-09-08  Richard Biener  <rguenther@suse.de>

	PR c++/102228
gcc/cp/
	* cp-tree.h (ANON_AGGR_TYPE_FIELD): New define.
	* decl.c (fixup_anonymous_aggr): Wipe RTTI info put in
	place on invalid code.
	* decl2.c (reset_type_linkage): Guard CLASSTYPE_TYPEINFO_VAR
	access.
	* module.cc (trees_in::read_class_def): Likewise.  Reconstruct
	ANON_AGGR_TYPE_FIELD.
	* semantics.c (finish_member_declaration): Populate
	ANON_AGGR_TYPE_FIELD for anon aggregate typed members.
	* typeck.c (lookup_anon_field): Remove DFS search and return
	ANON_AGGR_TYPE_FIELD directly.
2021-09-08 17:43:40 +02:00
Joseph Myers d27d694151 testsuite: Allow .sdata in more cases in gcc.dg/array-quals-1.c
When testing for Nios II (gcc-testresults shows this for MIPS as
well), failures of gcc.dg/array-quals-1.c appear where a symbol was
found in .sdata rather than one of the expected sections.

FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?a$ (found a) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?b$ (found b) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?c$ (found c) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?d$ (found d) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)

Jakub's commit 0b34dbc0a2 allowed .sdata
for many variables in that test where use of .sdata caused a failure
on powerpc-linux.  I'm presuming the choice of which variables had
.sdata allowed was based only on the code generated for powerpc-linux,
not on any reason it would be wrong to allow it for the other
variables; thus, this patch adjusts the test to allow .sdata for some
more variables where that is needed on Nios II (and in one case where
it's not needed on Nios II, but the test results on gcc-testresults
suggest that it is needed on MIPS).

Tested with no regressions with cross to nios2-elf.

	* gcc.dg/array-quals-1.c: Allow .sdata section in more cases.
2021-09-08 15:38:18 +00:00
Joseph Myers d081516ae1 testsuite: Use explicit -ftree-cselim in tests using -fdump-tree-cselim-details
When testing for Nios II (gcc-testresults shows this for various other
targets as well), tests scanning cselim dumps produce an UNRESOLVED
result because those dumps do not exist.

cselim is enabled conditionally by code in toplev.c:

  if (flag_tree_cselim == AUTODETECT_VALUE)
    {
      if (HAVE_conditional_move)
	flag_tree_cselim = 1;
      else
	flag_tree_cselim = 0;
    }

Add explicit -ftree-cselim to dg-options in the affected tests (as
already used by some other tests of cselim dumps) so that this dump
exists on all architectures.

Tested with no regressions with cross to nios2-elf, where this causes
the tests in question to PASS instead of being UNRESOLVED.

	* gcc.dg/tree-ssa/pr89430-1.c, gcc.dg/tree-ssa/pr89430-2.c,
	gcc.dg/tree-ssa/pr89430-3.c, gcc.dg/tree-ssa/pr89430-4.c,
	gcc.dg/tree-ssa/pr89430-5.c, gcc.dg/tree-ssa/pr89430-6.c,
	gcc.dg/tree-ssa/pr89430-7-comp-ref.c,
	gcc.dg/tree-ssa/pr89430-8-mem-ref-size.c,
	gcc.dg/tree-ssa/pr99473-1.c: Use -ftree-cselim.
2021-09-08 14:57:20 +00:00
Segher Boessenkool 86e6268cff rs6000: Fix ELFv2 r12 use in epilogue
We cannot use r12 here, it is already in use as the GEP (for sibling
calls).

2021-09-08  Segher Boessenkool  <segher@kernel.crashing.org>
	PR target/102107
	* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use
	r11 instead of r12 for restoring CR.
2021-09-08 13:27:56 +00:00
Jakub Jelinek 7485a52551 i386: Fix up xorsign for AVX [PR89984]
Thinking about it more this morning, while this patch fixes the problems
revealed in the testcase, the recent PR89984 change was buggy too, but
perhaps that can be fixed incrementally.  Because for AVX the new code
destructively modifies op1.  If that is different from dest, say on:
float
foo (float x, float y)
{
  return x * __builtin_copysignf (1.0f, y) + y;
}
then we get after RA:
(insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82])
        (unspec:SF [
                (reg:SF 20 xmm0 [88])
                (reg:SF 21 xmm1 [89])
                (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128])
            ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1}
     (nil))
(insn 9 8 15 2 (set (reg:SF 20 xmm0 [87])
        (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82])
            (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {*fop_sf_comm}
     (nil))
but split the xorsign into:
        vandps  .LC0(%rip), %xmm1, %xmm1
        vxorps  %xmm0, %xmm1, %xmm0
and then the addition:
        vaddss  %xmm1, %xmm0, %xmm0
which means we miscompile it - instead of adding y in the end we add
__builtin_copysignf (0.0f, y).
So, wonder if we don't want instead in addition to the &Yv <- Yv, 0
alternative (enabled for both pre-AVX and AVX as in this patch) the
&Yv <- Yv, Yv where destination must be different from inputs and another
Yv <- Yv, Yv where it can be the same but then need a match_scratch
(with X for the other alternatives and =Yv for the last one).
That way we'd always have a safe register we can store the op1 & mask
value into, either the destination (in the first alternative known to
be equal to op1 which is needed for non-AVX but ok for AVX too), in the
second alternative known to be different from both inputs and in the third
which could be used for those
float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); }
cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use
some other register like xmm2.

On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote:
> I'm curious why we need the  post_reload splitter @xorsign<mode>3_1
> for scalar mode, can't we just expand them into and/xor operations in
> the expander, just like vector modes did.

Following seems to work for all the testcases I've tried (and in some
generates better code than the post-reload splitter).

2021-09-08  Jakub Jelinek  <jakub@redhat.com>
	    liuhongt  <hongtao.liu@intel.com>

	PR target/89984
	* config/i386/i386.md (@xorsign<mode>3_1): Remove.
	* config/i386/i386-expand.c (ix86_expand_xorsign): Expand right away
	into AND with mask and XOR, using paradoxical subregs.
	(ix86_split_xorsign): Remove.
	* config/i386/i386-protos.h (ix86_split_xorsign): Remove.

	* gcc.target/i386/avx-pr102224.c: Fix up PR number.
	* gcc.dg/pr89984.c: New test.
	* gcc.target/i386/avx-pr89984.c: New test.
2021-09-08 14:06:10 +02:00
liuhongt 6576ad5add Compile __{mul,div}hc3 into libgcc_s.so.1.
libgcc/ChangeLog:

	* config/i386/t-softfp: Compile __{mul,div}hc3 into
	libgcc_s.so.1.
2021-09-08 19:18:15 +08:00
Di Zhao 7285f39455 tree-optimization/102183 - sccvn: fix result compare in vn_nary_op_insert_into
If the first predicate value is different and copied, the comparison will then
be between val->result and the copied one. That can cause inserting extra
vn_pvals.

gcc/ChangeLog:

	* tree-ssa-sccvn.c (vn_nary_op_insert_into): fix result compare
2021-09-08 18:47:18 +08:00
Jakub Jelinek 87d55da7d7 libgcc, i386: Export *hf* and *hc* from libgcc_s.so.1
The following patch exports it for Linux from config/i386/*.ver where it
IMNSHO belongs, aarch64 already exports some of those at GCC_11* and other
targets might add them at completely different gcc versions.

2021-09-08  Jakub Jelinek  <jakub@redhat.com>
	    Iain Sandoe  <iain@sandoe.co.uk>

	* config/i386/libgcc-glibc.ver: Add %inherit GCC_12.0.0 GCC_7.0.0
	and export *hf* and *hc* functions at GCC_12.0.0.
2021-09-08 11:34:45 +02:00
Jakub Jelinek a7b626d98a i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
before with op1 & mask before using op0.

For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).

The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.

The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear.  For non-AVX we need to use earlyclobber, we require
dest and op1 to be the same but op0 must be different because we overwrite
op1 first.  For AVX the constraints ensure that at most 2 of the 3 operands
may be the same register and if both inputs are the same, handles that case.
This case can be easily tested with the xorsign<mode>3 expander change
reverted.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Thinking about it more this morning, while this patch fixes the problems
revealed in the testcase, the recent PR89984 change was buggy too, but
perhaps that can be fixed incrementally.  Because for AVX the new code
destructively modifies op1.  If that is different from dest, say on:
float
foo (float x, float y)
{
  return x * __builtin_copysignf (1.0f, y) + y;
}
then we get after RA:
(insn 8 7 9 2 (set (reg:SF 20 xmm0 [orig:82 _2 ] [82])
        (unspec:SF [
                (reg:SF 20 xmm0 [88])
                (reg:SF 21 xmm1 [89])
                (mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128])
            ] UNSPEC_XORSIGN)) "hohoho.c":4:12 649 {xorsignsf3_1}
     (nil))
(insn 9 8 15 2 (set (reg:SF 20 xmm0 [87])
        (plus:SF (reg:SF 20 xmm0 [orig:82 _2 ] [82])
            (reg:SF 21 xmm1 [89]))) "hohoho.c":4:44 1021 {*fop_sf_comm}
     (nil))
but split the xorsign into:
        vandps  .LC0(%rip), %xmm1, %xmm1
        vxorps  %xmm0, %xmm1, %xmm0
and then the addition:
        vaddss  %xmm1, %xmm0, %xmm0
which means we miscompile it - instead of adding y in the end we add
__builtin_copysignf (0.0f, y).
So, wonder if we don't want instead in addition to the &Yv <- Yv, 0
alternative (enabled for both pre-AVX and AVX as in this patch) the
&Yv <- Yv, Yv where destination must be different from inputs and another
Yv <- Yv, Yv where it can be the same but then need a match_scratch
(with X for the other alternatives and =Yv for the last one).
That way we'd always have a safe register we can store the op1 & mask
value into, either the destination (in the first alternative known to
be equal to op1 which is needed for non-AVX but ok for AVX too), in the
second alternative known to be different from both inputs and in the third
which could be used for those
float bar (float x, float y) { return x * __builtin_copysignf (1.0f, y); }
cases where op1 is naturally xmm1 and dest == op0 naturally xmm0 we'd use
some other register like xmm2.

2021-09-08  Jakub Jelinek  <jakub@redhat.com>

	PR target/102224
	* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
	operands[2], emit abs<mode>2 instead.
	(@xorsign<mode>3_1): Add early-clobbers for output operand, enable
	first alternative even for avx, add another alternative with
	=&Yv <- 0, Yv, Yvm constraints.
	* config/i386/i386-expand.c (ix86_split_xorsign): If op0 is equal
	to op1, emit vpandn instead.

	* gcc.dg/pr102224.c: New test.
	* gcc.target/i386/avx-pr102224.c: New test.
2021-09-08 11:25:31 +02:00
liuhongt 4a61bcaca0 AVX512FP16: Add abi test for zmm
gcc/testsuite/ChangeLog:

	* gcc.target/x86_64/abi/avx512fp16/m512h/abi-avx512fp16-zmm.exp:
	New file.
	* gcc.target/x86_64/abi/avx512fp16/m512h/args.h: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/asm-support.S: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/avx512fp16-zmm-check.h:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/test_m512_returning.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_structs.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_unions.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m512h/test_varargs-m512.c:
	Likewise.
2021-09-08 12:44:50 +08:00
liuhongt 07308cdb0c AVX512FP16: Add ABI test for ymm.
gcc/testsuite/ChangeLog:

	* gcc.target/x86_64/abi/avx512fp16/m256h/abi-avx512fp16-ymm.exp:
	New exp file.
	* gcc.target/x86_64/abi/avx512fp16/m256h/args.h: New header.
	* gcc.target/x86_64/abi/avx512fp16/m256h/avx512fp16-ymm-check.h:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S: New.
	* gcc.target/x86_64/abi/avx512fp16/m256h/test_m256_returning.c:
	New test.
	* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_structs.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_unions.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/m256h/test_varargs-m256.c: Likewise.
2021-09-08 12:44:50 +08:00
H.J. Lu 22ce16ffa4 AVX512FP16: Add ABI tests for xmm.
Copied from regular XMM ABI tests. Only run AVX512FP16 ABI tests for ELF
targets.

gcc/testsuite/ChangeLog:

	* gcc.target/x86_64/abi/avx512fp16/abi-avx512fp16-xmm.exp: New exp
	file for abi test.
	* gcc.target/x86_64/abi/avx512fp16/args.h: New header file for abi test.
	* gcc.target/x86_64/abi/avx512fp16/avx512fp16-check.h: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/avx512fp16-xmm-check.h: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/defines.h: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/macros.h: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/asm-support.S: New asm for abi check.
	* gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c:
	New test.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_alignment.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_array_size_and_align.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_returning.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_sizes.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_struct_size_and_align.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_basic_union_size_and_align.c:
	Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_complex_returning.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_passing_floats.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_passing_m64m128.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_passing_structs.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_passing_unions.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c: Likewise.
	* gcc.target/x86_64/abi/avx512fp16/test_varargs-m128.c: Likewise.
2021-09-08 12:44:50 +08:00
H.J. Lu 5bbd88bb1e AVX512FP16: Add tests for vector passing in variable arguments.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vararg-1.c: New test.
	* gcc.target/i386/avx512fp16-vararg-2.c: Ditto.
	* gcc.target/i386/avx512fp16-vararg-3.c: Ditto.
	* gcc.target/i386/avx512fp16-vararg-4.c: Ditto.
2021-09-08 12:44:50 +08:00
liuhongt 2f3318dbcf AVX512FP16: Add testcase for vector init and broadcast intrinsics.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/m512-check.h: Add union128h, union256h, union512h.
	* gcc.target/i386/avx512fp16-10a.c: New test.
	* gcc.target/i386/avx512fp16-10b.c: Ditto.
	* gcc.target/i386/avx512fp16-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-1c.c: Ditto.
	* gcc.target/i386/avx512fp16-1d.c: Ditto.
	* gcc.target/i386/avx512fp16-1e.c: Ditto.
	* gcc.target/i386/avx512fp16-2a.c: Ditto.
	* gcc.target/i386/avx512fp16-2b.c: Ditto.
	* gcc.target/i386/avx512fp16-2c.c: Ditto.
	* gcc.target/i386/avx512fp16-3a.c: Ditto.
	* gcc.target/i386/avx512fp16-3b.c: Ditto.
	* gcc.target/i386/avx512fp16-3c.c: Ditto.
	* gcc.target/i386/avx512fp16-4.c: Ditto.
	* gcc.target/i386/avx512fp16-5.c: Ditto.
	* gcc.target/i386/avx512fp16-6.c: Ditto.
	* gcc.target/i386/avx512fp16-7.c: Ditto.
	* gcc.target/i386/avx512fp16-8.c: Ditto.
	* gcc.target/i386/avx512fp16-9a.c: Ditto.
	* gcc.target/i386/avx512fp16-9b.c: Ditto.
	* gcc.target/i386/pr54855-13.c: Ditto.
	* gcc.target/i386/avx512fp16-vec_set_var.c: Ditto.
2021-09-08 12:44:50 +08:00
liuhongt 9e2a82e1f9 AVX512FP16: Support vector init/broadcast/set/extract for FP16.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_set_ph): New intrinsic.
	(_mm256_set_ph): Likewise.
	(_mm512_set_ph): Likewise.
	(_mm_setr_ph): Likewise.
	(_mm256_setr_ph): Likewise.
	(_mm512_setr_ph): Likewise.
	(_mm_set1_ph): Likewise.
	(_mm256_set1_ph): Likewise.
	(_mm512_set1_ph): Likewise.
	(_mm_setzero_ph): Likewise.
	(_mm256_setzero_ph): Likewise.
	(_mm512_setzero_ph): Likewise.
	(_mm_set_sh): Likewise.
	(_mm_load_sh): Likewise.
	(_mm_store_sh): Likewise.
	* config/i386/i386-builtin-types.def (V8HF): New type.
	(DEF_FUNCTION_TYPE (V8HF, V8HI)): New builtin function type
	* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
	Support vector HFmodes.
	(ix86_expand_vector_init_one_nonzero): Likewise.
	(ix86_expand_vector_init_one_var): Likewise.
	(ix86_expand_vector_init_interleave): Likewise.
	(ix86_expand_vector_init_general): Likewise.
	(ix86_expand_vector_set): Likewise.
	(ix86_expand_vector_extract): Likewise.
	(ix86_expand_vector_init_concat): Likewise.
	(ix86_expand_sse_movcc): Handle vector HFmodes.
	(ix86_expand_vector_set_var): Ditto.
	* config/i386/i386-modes.def: Add HF vector modes in comment.
	* config/i386/i386.c (classify_argument): Add HF vector modes.
	(ix86_hard_regno_mode_ok): Allow HF vector modes for AVX512FP16.
	(ix86_vector_mode_supported_p): Likewise.
	(ix86_set_reg_reg_cost): Handle vector HFmode.
	(ix86_get_ssemov): Handle vector HFmode.
	(function_arg_advance_64): Pass unamed V16HFmode and V32HFmode
	by stack.
	(function_arg_advance_32): Pass V8HF/V16HF/V32HF by sse reg for 32bit
	mode.
	(function_arg_advance_32): Ditto.
	* config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New.
	(VALID_AVX256_REG_OR_OI_MODE): Rename to ..
	(VALID_AVX256_REG_OR_OI_VHF_MODE): .. this, and add V16HF.
	(VALID_SSE2_REG_VHF_MODE): New.
	(VALID_AVX512VL_128_REG_MODE): Add V8HF and TImode.
	(SSE_REG_MODE_P): Add vector HFmode.
	* config/i386/i386.md (mode): Add HF vector modes.
	(MODE_SIZE): Likewise.
	(ssemodesuffix): Add ph suffix for HF vector modes.
	* config/i386/sse.md (VFH_128): New mode iterator.
	(VMOVE): Adjust for HF vector modes.
	(V): Likewise.
	(V_256_512): Likewise.
	(avx512): Likewise.
	(avx512fmaskmode): Likewise.
	(shuffletype): Likewise.
	(sseinsnmode): Likewise.
	(ssedoublevecmode): Likewise.
	(ssehalfvecmode): Likewise.
	(ssehalfvecmodelower): Likewise.
	(ssePScmode): Likewise.
	(ssescalarmode): Likewise.
	(ssescalarmodelower): Likewise.
	(sseintprefix): Likewise.
	(i128): Likewise.
	(bcstscalarsuff): Likewise.
	(xtg_mode): Likewise.
	(VI12HF_AVX512VL): New mode_iterator.
	(VF_AVX512FP16): Likewise.
	(VIHF): Likewise.
	(VIHF_256): Likewise.
	(VIHF_AVX512BW): Likewise.
	(V16_256): Likewise.
	(V32_512): Likewise.
	(sseintmodesuffix): New mode_attr.
	(sse): Add scalar and vector HFmodes.
	(ssescalarmode): Add vector HFmode mapping.
	(ssescalarmodesuffix): Add sh suffix for HFmode.
	(*<sse>_vm<insn><mode>3): Use VFH_128.
	(*<sse>_vm<multdiv_mnemonic><mode>3): Likewise.
	(*ieee_<ieee_maxmin><mode>3): Likewise.
	(<avx512>_blendm<mode>): New define_insn.
	(vec_setv8hf): New define_expand.
	(vec_set<mode>_0): New define_insn for HF vector set.
	(*avx512fp16_movsh): Likewise.
	(avx512fp16_movsh): Likewise.
	(vec_extract_lo_v32hi): Rename to ...
	(vec_extract_lo_<mode>): ... this, and adjust to allow HF
	vector modes.
	(vec_extract_hi_v32hi): Likewise.
	(vec_extract_hi_<mode>): Likewise.
	(vec_extract_lo_v16hi): Likewise.
	(vec_extract_lo_<mode>): Likewise.
	(vec_extract_hi_v16hi): Likewise.
	(vec_extract_hi_<mode>): Likewise.
	(vec_set_hi_v16hi): Likewise.
	(vec_set_hi_<mode>): Likewise.
	(vec_set_lo_v16hi): Likewise.
	(vec_set_lo_<mode>): Likewise.
	(*vec_extract<mode>_0): New define_insn_and_split for HF
	vector extract.
	(*vec_extracthf): New define_insn.
	(VEC_EXTRACT_MODE): Add HF vector modes.
	(PINSR_MODE): Add V8HF.
	(sse2p4_1): Likewise.
	(pinsr_evex_isa): Likewise.
	(<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support
	insert for V8HFmode.
	(pbroadcast_evex_isa): Add HF vector modes.
	(AVX2_VEC_DUP_MODE): Likewise.
	(VEC_INIT_MODE): Likewise.
	(VEC_INIT_HALF_MODE): Likewise.
	(avx2_pbroadcast<mode>): Adjust to support HF vector mode
	broadcast.
	(avx2_pbroadcast<mode>_1): Likewise.
	(<avx512>_vec_dup<mode>_1): Likewise.
	(<avx512>_vec_dup<mode><mask_name>): Likewise.
	(<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>):
	Likewise.
2021-09-08 12:44:50 +08:00
Guo, Xuepeng a68412117f AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 instructions.
gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features):
	Detect FEATURE_AVX512FP16.
	* common/config/i386/i386-common.c
	(OPTION_MASK_ISA_AVX512FP16_SET,
	OPTION_MASK_ISA_AVX512FP16_UNSET,
	OPTION_MASK_ISA2_AVX512FP16_SET,
	OPTION_MASK_ISA2_AVX512FP16_UNSET): New.
	(OPTION_MASK_ISA2_AVX512BW_UNSET,
	OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16.
	(ix86_handle_option): Handle -mavx512fp16.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_AVX512FP16.
	* common/config/i386/i386-isas.h: Add entry for AVX512FP16.
	* config.gcc: Add avx512fp16intrin.h.
	* config/i386/avx512fp16intrin.h: New intrinsic header.
	* config/i386/cpuid.h: Add bit_AVX512FP16.
	* config/i386/i386-builtin-types.def: (FLOAT16): New primitive type.
	* config/i386/i386-builtins.c: Support _Float16 type for i386
	backend.
	(ix86_register_float16_builtin_type): New function.
	(ix86_float16_type_node): New.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define
	__AVX512FP16__.
	* config/i386/i386-expand.c (ix86_expand_branch): Support
	HFmode.
	(ix86_prepare_fp_compare_args): Adjust TARGET_SSE_MATH &&
	SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P.
	(ix86_expand_fp_movcc): Ditto.
	* config/i386/i386-isa.def: Add PTA define for AVX512FP16.
	* config/i386/i386-options.c (isa2_opts): Add -mavx512fp16.
	(ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute.
	* config/i386/i386.c (ix86_get_ssemov): Use
	vmovdqu16/vmovw/vmovsh for HFmode/HImode scalar or vector.
	(ix86_get_excess_precision): Use
	FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_AVX512FP16
	existed.
	(sse_store_index): Use SFmode cost for HFmode cost.
	(inline_memory_move_cost): Add HFmode, and perfer SSE cost over
	GPR cost for HFmode.
	(ix86_hard_regno_mode_ok): Allow HImode in sse register.
	(ix86_mangle_type): Add manlging for _Float16 type.
	(inline_secondary_memory_needed): No memory is needed for
	16bit movement between gpr and sse reg under
	TARGET_AVX512FP16.
	(ix86_multiplication_cost): Adjust TARGET_SSE_MATH &&
	SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P.
	(ix86_division_cost): Ditto.
	(ix86_rtx_costs): Ditto.
	(ix86_add_stmt_cost): Ditto.
	(ix86_optab_supported_p): Ditto.
	* config/i386/i386.h (VALID_AVX512F_SCALAR_MODE): Add HFmode.
	(SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Add HFmode.
	(PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16.
	* config/i386/i386.md (mode): Add HFmode.
	(MODE_SIZE): Add HFmode.
	(isa): Add avx512fp16.
	(enabled): Handle avx512fp16.
	(ssemodesuffix): Add sh suffix for HFmode.
	(comm): Add mult, div.
	(plusminusmultdiv): New code iterator.
	(insn): Add mult, div.
	(*movhf_internal): Adjust for avx512fp16 instruction.
	(*movhi_internal): Ditto.
	(*cmpi<unord>hf): New define_insn for HFmode.
	(*ieee_s<ieee_maxmin>hf3): Likewise.
	(extendhf<mode>2): Likewise.
	(trunc<mode>hf2): Likewise.
	(float<floatunssuffix><mode>hf2): Likewise.
	(*<insn>hf): Likewise.
	(cbranchhf4): New expander.
	(movhfcc): Likewise.
	(<insn>hf3): Likewise.
	(mulhf3): Likewise.
	(divhf3): Likewise.
	* config/i386/i386.opt: Add mavx512fp16.
	* config/i386/immintrin.h: Include avx512fp16intrin.h.
	* doc/invoke.texi: Add mavx512fp16.
	* doc/extend.texi: Add avx512fp16 Usage Notes.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options.
	* gcc.target/i386/avx-2.c: Ditto.
	* gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16.
	* gcc.target/i386/funcspec-56.inc: Add new target attribute check.
	* gcc.target/i386/sse-13.c: Add -mavx512fp16.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* lib/target-supports.exp: (check_effective_target_avx512fp16): New.
	* g++.target/i386/float16-1.C: New test.
	* g++.target/i386/float16-2.C: Ditto.
	* g++.target/i386/float16-3.C: Ditto.
	* gcc.target/i386/avx512fp16-12a.c: Ditto.
	* gcc.target/i386/avx512fp16-12b.c: Ditto.
	* gcc.target/i386/float16-3a.c: Ditto.
	* gcc.target/i386/float16-3b.c: Ditto.
	* gcc.target/i386/float16-4a.c: Ditto.
	* gcc.target/i386/float16-4b.c: Ditto.
	* gcc.target/i386/pr54855-12.c: Ditto.
	* g++.dg/other/i386-2.C: Ditto.
	* g++.dg/other/i386-3.C: Ditto.

Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com>
Co-Authored-By: Liu Hongtao <hongtao.liu@intel.com>
Co-Authored-By: Wang Hongyu <hongyu.wang@intel.com>
Co-Authored-By: Xu Dianhong <dianhong.xu@intel.com>
2021-09-08 12:44:50 +08:00
liuhongt f19a327077 Support -fexcess-precision=16 which will enable FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16.
gcc/ada/ChangeLog:

	* gcc-interface/misc.c (gnat_post_options): Issue an error for
	-fexcess-precision=16.

gcc/c-family/ChangeLog:

	* c-common.c (excess_precision_mode_join): Update below comments.
	(c_ts18661_flt_eval_method): Set excess_precision_type to
	EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16.
	* c-cppbuiltin.c (cpp_atomic_builtins): Update below comments.
	(c_cpp_flt_eval_method_iec_559): Set excess_precision_type to
	EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16.

gcc/ChangeLog:

	* common.opt: Support -fexcess-precision=16.
	* config/aarch64/aarch64.c (aarch64_excess_precision): Return
	FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when
	EXCESS_PRECISION_TYPE_FLOAT16.
	* config/arm/arm.c (arm_excess_precision): Ditto.
	* config/i386/i386.c (ix86_get_excess_precision): Ditto.
	* config/m68k/m68k.c (m68k_excess_precision): Issue an error
	when EXCESS_PRECISION_TYPE_FLOAT16.
	* config/s390/s390.c (s390_excess_precision): Ditto.
	* coretypes.h (enum excess_precision_type): Add
	EXCESS_PRECISION_TYPE_FLOAT16.
	* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Update documents.
	* doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): Ditto.
	* doc/extend.texi (Half-Precision): Document
	-fexcess-precision=16.
	* flag-types.h (enum excess_precision): Add
	EXCESS_PRECISION_FLOAT16.
	* target.def (excess_precision): Update document.
	* tree.c (excess_precision_type): Set excess_precision_type to
	EXCESS_PRECISION_FLOAT16 when -fexcess-precision=16.

gcc/fortran/ChangeLog:

	* options.c (gfc_post_options): Issue an error for
	-fexcess-precision=16.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/float16-6.c: New test.
	* gcc.target/i386/float16-7.c: New test.
2021-09-08 12:44:49 +08:00
liuhongt a549a9a39a Adjust the wording for x86 _Float16 type.
gcc/ChangeLog:

	* doc/extend.texi: (@node Floating Types): Adjust the wording.
	(@node Half-Precision): Ditto.
2021-09-08 09:10:45 +08:00
GCC Administrator b2748138c0 Daily bump. 2021-09-08 00:16:23 +00:00
Max Filippov b552c4e601 gcc: xtensa: fix PR target/102115
2021-09-07  Takayuki 'January June' Suwa  <jjsuwa_sys3175@yahoo.co.jp>
gcc/
	PR target/102115
	* config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add
	'CONST_INT_P (src)' to the condition of the block that tries to
	eliminate literal when loading integer contant.
2021-09-07 15:40:26 -07:00
Ian Lance Taylor 21b046bade runtime: use hash32, not hash64, for amd64p32, mips64p32, mips64p32le
Fixes PR go/102102

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/348015
2021-09-07 15:05:11 -07:00
David Faust d9996ccb94 doc: BPF CO-RE documentation
Document the new command line options (-mco-re and -mno-co-re), the new
BPF target builtin (__builtin_preserve_access_index), and the new BPF
target attribute (preserve_access_index) introduced with BPF CO-RE.

gcc/ChangeLog:

	* doc/extend.texi (BPF Type Attributes) New node.
	Document new preserve_access_index attribute.
	Document new preserve_access_index builtin.
	* doc/invoke.texi: Document -mco-re and -mno-co-re options.
2021-09-07 13:48:58 -07:00
David Faust f4cdfd4856 bpf testsuite: Add BPF CO-RE tests
This commit adds several tests for the new BPF CO-RE functionality to
the BPF target testsuite.

gcc/testsuite/ChangeLog:

	* gcc.target/bpf/core-attr-1.c: New test.
	* gcc.target/bpf/core-attr-2.c: Likewise.
	* gcc.target/bpf/core-attr-3.c: Likewise.
	* gcc.target/bpf/core-attr-4.c: Likewise
	* gcc.target/bpf/core-builtin-1.c: Likewise
	* gcc.target/bpf/core-builtin-2.c: Likewise.
	* gcc.target/bpf/core-builtin-3.c: Likewise.
	* gcc.target/bpf/core-section-1.c: Likewise.
2021-09-07 13:48:58 -07:00
David Faust 8bdabb3754 bpf: BPF CO-RE support
This commit introduces support for BPF Compile Once - Run
Everywhere (CO-RE) in GCC.

gcc/ChangeLog:

	* config/bpf/bpf.c: Adjust includes.
	(bpf_handle_preserve_access_index_attribute): New function.
	(bpf_attribute_table): Use it here.
	(bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX.
	(bpf_option_override): Handle "-mco-re" option.
	(bpf_asm_init_sections): New.
	(TARGET_ASM_INIT_SECTIONS): Redefine.
	(bpf_file_end): New.
	(TARGET_ASM_FILE_END): Redefine.
	(bpf_init_builtins): Add "__builtin_preserve_access_index".
	(bpf_core_compute, bpf_core_get_index): New.
	(is_attr_preserve_access): New.
	(bpf_expand_builtin): Handle new builtins.
	(bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New.
	(bpf_core_walk): New.
	(bpf_resolve_overloaded_builtin): New.
	(TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine.
	(handle_attr): New.
	(pass_bpf_core_attr): New RTL pass.
	* config/bpf/bpf-passes.def: New file.
	* config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New.
	* config/bpf/coreout.c: New file.
	* config/bpf/coreout.h: Likewise.
	* config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h.
	(coreout.o): New rule.
	(PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def.
	* config.gcc (bpf): Add coreout.h to extra_headers.
	Add coreout.o to extra_objs.
	Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.
2021-09-07 13:48:58 -07:00
David Faust 0a2bd52f1a btf: expose get_btf_id
Expose the function get_btf_id, so that it may be used by the BPF
backend. This enables the BPF CO-RE machinery in the BPF backend to
lookup BTF type IDs, in order to create CO-RE relocation records.

A prototype is added in ctfc.h

gcc/ChangeLog:

	* btfout.c (get_btf_id): Function is no longer static.
	* ctfc.h: Expose it here.
2021-09-07 13:48:58 -07:00
David Faust 5b723401b3 ctfc: add function to lookup CTF ID of a TREE type
Add a new function, ctf_lookup_tree_type, to return the CTF type ID
associated with a type via its is TREE node. The function is exposed via
a prototype in ctfc.h.

gcc/ChangeLog:

	* ctfc.c (ctf_lookup_tree_type): New function.
	* ctfc.h: Likewise.
2021-09-07 13:48:58 -07:00
David Faust 44e4ed6a3c ctfc: externalize ctf_dtd_lookup
Expose the function ctf_dtd_lookup, so that it can be used by the BPF
CO-RE machinery. The function is no longer static, and an extern
prototype is added in ctfc.h.

gcc/ChangeLog:

	* ctfc.c (ctf_dtd_lookup): Function is no longer static.
	* ctfc.h: Analogous change.
2021-09-07 13:48:58 -07:00
David Faust 81eced213c dwarf: externalize lookup_type_die
Expose the function lookup_type_die in dwarf2out, so that it can be used
by CTF/BTF when adding BPF CO-RE information. The function is now
non-static, and an extern prototype is added in dwarf2out.h.

gcc/ChangeLog:

	* dwarf2out.c (lookup_type_die): Function is no longer static.
	* dwarf2out.h: Expose it here.
2021-09-07 13:48:57 -07:00
Hans-Peter Nilsson 578cd82af7 Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.c
Dejagnu is unfortunately brittle: a syntax error in a
directive can abort the test-run for the current "tool"
(gcc, g++, gfortran), and if you don't check for this
condition or actually read the stdout log yourself, your
tools may make you believe the test was successful without
regressions.  At the very least, always grep for ^ERROR: in
the stdout log!

With r12-3379, the testsuite got such a fatal syntax error,
causing the gcc test-run to abort at (e.g.):

...
FAIL: gcc.dg/memchr.c (test for excess errors)
FAIL: gcc.dg/memcmp-3.c (test for excess errors)
ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {\(\)"} optimized" does not exist.
The error code is TCL LOOKUP COMMAND scan-tree-dump-not\"
The info on the error is:
invalid command name "scan-tree-dump-not""
    while executing
"::tcl_unknown scan-tree-dump-not\" = foo {\(\)"} optimized"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"

		=== gcc Summary ===

# of expected passes		63740
# of unexpected failures	38
# of unexpected successes	2
# of expected failures		351
# of unresolved testcases	3
# of unsupported tests		662
x/cris-elf/gccobj/gcc/xgcc  version 12.0.0 20210907 (experimental)\
 [master r12-3391-g849d5f5929fc] (GCC)

testsuite:
	* gcc.dg/no_profile_instrument_function-attr-2.c: Fix
	typo in last change.
2021-09-07 22:36:59 +02:00
Harald Anlauf 2a1537a19c Fortran - improve error recovery determining array element from constructor
gcc/fortran/ChangeLog:

	PR fortran/101327
	* expr.c (find_array_element): When bounds cannot be determined as
	constant, return error instead of aborting.

gcc/testsuite/ChangeLog:

	PR fortran/101327
	* gfortran.dg/pr101327.f90: New test.
2021-09-07 20:51:49 +02:00
Indu Bhagat 849d5f5929 dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecase
DWARF generation is split between early and late phases when LTO is in effect.
This poses challenges for CTF/BTF generation especially if late debug info
generation is desirable, as turns out to be the case for BPF CO-RE.

The approach taken here in this patch is:

1. LTO is disabled for BPF CO-RE
The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE
relocations need to be generated in the LTO link phase _after_ the optimizations
are done. This means we need to devise way to combine early and late BTF. At
this time, in absence of linker support for BTF sections, it makes sense to
steer clear of LTO for BPF CO-RE and bypass the issue.

2. The BPF backend updates the write_symbols with BPF_WITH_CORE_DEBUG to convey
the case that BTF with CO-RE support needs to be generated.  This information
is used by the debug info emission routines to defer the emission of BTF/CO-RE
until dwarf2out_finish.

So, in other words,

dwarf2out_early_finish
  - Always emit CTF here.
  - if (BTF && !BTF_WITH_CORE), emit BTF now.

dwarf2out_finish
  - if (BTF_WITH_CORE) emit BTF now.

gcc/ChangeLog:

	* dwarf2ctf.c (ctf_debug_finalize): Make it static.
	(ctf_debug_early_finish): New definition.
	(ctf_debug_finish): Likewise.
	* dwarf2ctf.h (ctf_debug_finalize): Remove declaration.
	(ctf_debug_early_finish): New declaration.
	(ctf_debug_finish): Likewise.
	* dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish.
	(dwarf2out_early_finish): Invoke ctf_debug_early_finish.
2021-09-07 11:18:54 -07:00
Indu Bhagat e29a9607fa bpf: Add new -mco-re option for BPF CO-RE
-mco-re in the BPF backend enables code generation for the CO-RE usecase. LTO is
disabled for CO-RE compilations.

gcc/ChangeLog:

	* config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO
	support when compiling for CO-RE.
	* config/bpf/bpf.opt: Add new command line option -mco-re.

gcc/testsuite/ChangeLog:

	* gcc.target/bpf/core-lto-1.c: New test.
2021-09-07 11:17:55 -07:00
Indu Bhagat 053db9a49b debug: Add BTF_WITH_CORE_DEBUG debug format
To best handle BTF/CO-RE in GCC, a distinct BTF_WITH_CORE_DEBUG debug format is
being added.  This helps the compiler detect whether BTF with CO-RE relocations
needs to be emitted.

gcc/ChangeLog:

	* flag-types.h (enum debug_info_type): Add new enum
	DINFO_TYPE_BTF_WITH_CORE.
	(BTF_WITH_CORE_DEBUG): New bitmask.
	* flags.h (btf_with_core_debuginfo_p): New declaration.
	* opts.c (btf_with_core_debuginfo_p): New definition.
2021-09-07 11:16:53 -07:00
Jason Merrill c03db573b9 tree: Change error_operand_p to an inline function
I've thought for a while that many of the macros in tree.h and such should
become inline functions.  This one in particular was confusing Coverity; the
null check in the macro made it think that all code guarded by
error_operand_p would also need null checks.

gcc/ChangeLog:

	* tree.h (error_operand_p): Change to inline function.
2021-09-07 14:09:38 -04:00
Jakub Jelinek 81f9718139 c++: Fix up constexpr evaluation of deleting dtors [PR100495]
We do not save bodies of constexpr clones and instead evaluate the bodies
of the constexpr functions they were cloned from.
I believe that is just fine for constructors because complete vs. base
ctors differ only in classes that have virtual bases and such constructors
aren't constexpr, similarly complete/base destructors.
But as the testcase below shows, for deleting destructors it is not fine,
deleting dtors while marked as clones in fact are just artificial functions
with synthetized body which calls the user destructor and deallocation.

So, either we'd need to evaluate the destructor and afterwards synthetize
and evaluate the deallocation, or we can just save and use the deleting
dtors bodies.  The latter seems much easier to me.

2021-09-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/100495
	* constexpr.c (maybe_save_constexpr_fundef): Save body even for
	constexpr deleting dtors.
	(cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for
	deleting dtors.

	* g++.dg/cpp2a/constexpr-new21.C: New test.
2021-09-07 19:33:28 +02:00
Tobias Burnus ff7bc505b1 libgomp.texi: Extend OpenMP 5.0 Implementation Status
libgomp/
	* libgomp.texi (OpenMP Implementation Status): Extend
	OpenMP 5.0 section.
	(OpenACC Profiling Interface): Fix typo.
2021-09-07 18:30:25 +02:00
Aldy Hernandez 020e2db0a8 Rename forwarder_block_p in treading code to empty_block_with_phis_p.
gcc/ChangeLog:

	* tree-ssa-threadedge.c (forwarder_block_p): Rename to...
	(empty_block_with_phis_p): ...this.
	(potentially_threadable_block): Same.
	(jump_threader::thread_through_normal_block): Same.
2021-09-07 18:04:36 +02:00