187920 Commits

Author SHA1 Message Date
Eric Botcazou
f1f5b1fbbb Give more informative error message for by-reference types
Recent compilers enforce more strictly the RM C.6(18) clause, which says
that volatile record types are by-reference types.  This changes the typical
error message now given in these cases.

gcc/ada/
	* gcc-interface/decl.c (gnat_to_gnu_entity) <is_type>: Declare new
	constant.  Adjust error message issued by validate_size in the case
	of by-reference types.
	(validate_size): Always use the error strings passed by the caller.
2021-09-14 09:42:43 +02:00
liuhongt
ebcdd004ed AVX512FP16: Add testcase for fpclass/getmant/getexp instructions.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h (V512):
	Add xmm component.
	* gcc.target/i386/avx512fp16-vfpclassph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vfpclassph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vfpclasssh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetexpph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetexpph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetexpsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetexpsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetmantph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetmantph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetmantsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vgetmantsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vfpclassph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vgetexpph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vgetexpph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vgetmantph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vgetmantph-1b.c: Ditto.
2021-09-14 12:34:57 +08:00
liuhongt
8486e9f268 AVX512FP16: Add fpclass/getexp/getmant instructions.
Add vfpclassph/vfpclasssh/vgetexpph/vgetexpsh/vgetmantph/vgetmantsh.

gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_fpclass_sh_mask):
	New intrinsic.
	(_mm_mask_fpclass_sh_mask): Likewise.
	(_mm512_mask_fpclass_ph_mask): Likewise.
	(_mm512_fpclass_ph_mask): Likewise.
	(_mm_getexp_sh): Likewise.
	(_mm_mask_getexp_sh): Likewise.
	(_mm_maskz_getexp_sh): Likewise.
	(_mm512_getexp_ph): Likewise.
	(_mm512_mask_getexp_ph): Likewise.
	(_mm512_maskz_getexp_ph): Likewise.
	(_mm_getexp_round_sh): Likewise.
	(_mm_mask_getexp_round_sh): Likewise.
	(_mm_maskz_getexp_round_sh): Likewise.
	(_mm512_getexp_round_ph): Likewise.
	(_mm512_mask_getexp_round_ph): Likewise.
	(_mm512_maskz_getexp_round_ph): Likewise.
	(_mm_getmant_sh): Likewise.
	(_mm_mask_getmant_sh): Likewise.
	(_mm_maskz_getmant_sh): Likewise.
	(_mm512_getmant_ph): Likewise.
	(_mm512_mask_getmant_ph): Likewise.
	(_mm512_maskz_getmant_ph): Likewise.
	(_mm_getmant_round_sh): Likewise.
	(_mm_mask_getmant_round_sh): Likewise.
	(_mm_maskz_getmant_round_sh): Likewise.
	(_mm512_getmant_round_ph): Likewise.
	(_mm512_mask_getmant_round_ph): Likewise.
	(_mm512_maskz_getmant_round_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_mask_fpclass_ph_mask):
	New intrinsic.
	(_mm_fpclass_ph_mask): Likewise.
	(_mm256_mask_fpclass_ph_mask): Likewise.
	(_mm256_fpclass_ph_mask): Likewise.
	(_mm256_getexp_ph): Likewise.
	(_mm256_mask_getexp_ph): Likewise.
	(_mm256_maskz_getexp_ph): Likewise.
	(_mm_getexp_ph): Likewise.
	(_mm_mask_getexp_ph): Likewise.
	(_mm_maskz_getexp_ph): Likewise.
	(_mm256_getmant_ph): Likewise.
	(_mm256_mask_getmant_ph): Likewise.
	(_mm256_maskz_getmant_ph): Likewise.
	(_mm_getmant_ph): Likewise.
	(_mm_mask_getmant_ph): Likewise.
	(_mm_maskz_getmant_ph): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c
	(ix86_expand_args_builtin): Handle new builtin types.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/sse.md (vecmemsuffix): Add HF vector modes.
	(<avx512>_getexp<mode><mask_name><round_saeonly_name>): Adjust
	to support HF vector modes.
	(avx512f_sgetexp<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Ditto.
	(avx512dq_fpclass<mode><mask_scalar_merge_name>): Ditto.
	(avx512dq_vmfpclass<mode><mask_scalar_merge_name>): Ditto.
	(<avx512>_getmant<mode><mask_name><round_saeonly_name>): Ditto.
	(avx512f_vgetmant<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-14 12:34:57 +08:00
liuhongt
b6e944df4e AVX512FP16: Add testcase for vreduceph/vreducesh/vrndscaleph/vrndscalesh.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-helper.h (_ROUND_CUR): New macro.
	* gcc.target/i386/avx512fp16-vreduceph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vreduceph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vreducesh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vreducesh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vrndscaleph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vrndscaleph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vrndscalesh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vrndscalesh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vreduceph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vreduceph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrndscaleph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrndscaleph-1b.c: Ditto.
2021-09-14 12:34:57 +08:00
liuhongt
8bed761796 AVX512FP16: Add vreduceph/vreducesh/vrndscaleph/vrndscalesh.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_reduce_ph):
	New intrinsic.
	(_mm512_mask_reduce_ph): Likewise.
	(_mm512_maskz_reduce_ph): Likewise.
	(_mm512_reduce_round_ph): Likewise.
	(_mm512_mask_reduce_round_ph): Likewise.
	(_mm512_maskz_reduce_round_ph): Likewise.
	(_mm_reduce_sh): Likewise.
	(_mm_mask_reduce_sh): Likewise.
	(_mm_maskz_reduce_sh): Likewise.
	(_mm_reduce_round_sh): Likewise.
	(_mm_mask_reduce_round_sh): Likewise.
	(_mm_maskz_reduce_round_sh): Likewise.
	(_mm512_roundscale_ph): Likewise.
	(_mm512_mask_roundscale_ph): Likewise.
	(_mm512_maskz_roundscale_ph): Likewise.
	(_mm512_roundscale_round_ph): Likewise.
	(_mm512_mask_roundscale_round_ph): Likewise.
	(_mm512_maskz_roundscale_round_ph): Likewise.
	(_mm_roundscale_sh): Likewise.
	(_mm_mask_roundscale_sh): Likewise.
	(_mm_maskz_roundscale_sh): Likewise.
	(_mm_roundscale_round_sh): Likewise.
	(_mm_mask_roundscale_round_sh): Likewise.
	(_mm_maskz_roundscale_round_sh): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_reduce_ph):
	New intrinsic.
	(_mm_mask_reduce_ph): Likewise.
	(_mm_maskz_reduce_ph): Likewise.
	(_mm256_reduce_ph): Likewise.
	(_mm256_mask_reduce_ph): Likewise.
	(_mm256_maskz_reduce_ph): Likewise.
	(_mm_roundscale_ph): Likewise.
	(_mm_mask_roundscale_ph): Likewise.
	(_mm_maskz_roundscale_ph): Likewise.
	(_mm256_roundscale_ph): Likewise.
	(_mm256_mask_roundscale_ph): Likewise.
	(_mm256_maskz_roundscale_ph): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c
	(ix86_expand_args_builtin): Handle new builtin types.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/sse.md (<mask_codefor>reducep<mode><mask_name>):
	Renamed to ...
	(<mask_codefor>reducep<mode><mask_name><round_saeonly_name>):
	... this, and adjust for round operands.
	(reduces<mode><mask_scalar_name>): Likewise, with ...
	(reduces<mode><mask_scalar_name><round_saeonly_scalar_name>):
	... this.
	(<avx512>_rndscale<mode><mask_name><round_saeonly_name>):
	Adjust for HF vector modes.
	(avx512f_rndscale<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Ditto.
	(*avx512f_rndscale<mode><round_saeonly_name>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-14 12:34:57 +08:00
liuhongt
03f0cbccb6 AVX512FP16: Add testcase for vrcpph/vrcpsh/vscalefph/vscalefsh.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vrcpph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vrcpph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vrcpsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vrcpsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vscalefph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vscalefph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vscalefsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vscalefsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrcpph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vscalefph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vscalefph-1b.c: Ditto.
2021-09-14 12:34:56 +08:00
liuhongt
bf4c12404f AVX512FP16: Add vrcpph/vrcpsh/vscalefph/vscalefsh.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_rcp_ph):
	New intrinsic.
	(_mm512_mask_rcp_ph): Likewise.
	(_mm512_maskz_rcp_ph): Likewise.
	(_mm_rcp_sh): Likewise.
	(_mm_mask_rcp_sh): Likewise.
	(_mm_maskz_rcp_sh): Likewise.
	(_mm512_scalef_ph): Likewise.
	(_mm512_mask_scalef_ph): Likewise.
	(_mm512_maskz_scalef_ph): Likewise.
	(_mm512_scalef_round_ph): Likewise.
	(_mm512_mask_scalef_round_ph): Likewise.
	(_mm512_maskz_scalef_round_ph): Likewise.
	(_mm_scalef_sh): Likewise.
	(_mm_mask_scalef_sh): Likewise.
	(_mm_maskz_scalef_sh): Likewise.
	(_mm_scalef_round_sh): Likewise.
	(_mm_mask_scalef_round_sh): Likewise.
	(_mm_maskz_scalef_round_sh): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_rcp_ph):
	New intrinsic.
	(_mm256_rcp_ph): Likewise.
	(_mm_mask_rcp_ph): Likewise.
	(_mm256_mask_rcp_ph): Likewise.
	(_mm_maskz_rcp_ph): Likewise.
	(_mm256_maskz_rcp_ph): Likewise.
	(_mm_scalef_ph): Likewise.
	(_mm256_scalef_ph): Likewise.
	(_mm_mask_scalef_ph): Likewise.
	(_mm256_mask_scalef_ph): Likewise.
	(_mm_maskz_scalef_ph): Likewise.
	(_mm256_maskz_scalef_ph): Likewise.
	* config/i386/i386-builtin.def: Add new builtins.
	* config/i386/sse.md (VFH_AVX512VL): New.
	(avx512fp16_rcp<mode>2<mask_name>): Ditto.
	(avx512fp16_vmrcpv8hf2<mask_scalar_name>): Ditto.
	(avx512f_vmscalef<mode><mask_scalar_name><round_scalar_name>):
	Adjust to support HF vector modes.
	(<avx512>_scalef<mode><mask_name><round_name>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-14 12:34:56 +08:00
liuhongt
c63657291c AVX512FP16: Add testcase for vsqrtph/vsqrtsh/vrsqrtph/vrsqrtsh.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-vrsqrtph-1a.c: New test.
	* gcc.target/i386/avx512fp16-vrsqrtph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vrsqrtsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vrsqrtsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vsqrtph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vsqrtph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16-vsqrtsh-1a.c: Ditto.
	* gcc.target/i386/avx512fp16-vsqrtsh-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vrsqrtph-1b.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Ditto.
	* gcc.target/i386/avx512fp16vl-vsqrtph-1b.c: Ditto.
2021-09-14 12:34:56 +08:00
liuhongt
4204740f64 AVX512FP16: Add vsqrtph/vrsqrtph/vsqrtsh/vrsqrtsh.
gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm512_sqrt_ph):
	New intrinsic.
	(_mm512_mask_sqrt_ph): Likewise.
	(_mm512_maskz_sqrt_ph): Likewise.
	(_mm512_sqrt_round_ph): Likewise.
	(_mm512_mask_sqrt_round_ph): Likewise.
	(_mm512_maskz_sqrt_round_ph): Likewise.
	(_mm512_rsqrt_ph): Likewise.
	(_mm512_mask_rsqrt_ph): Likewise.
	(_mm512_maskz_rsqrt_ph): Likewise.
	(_mm_rsqrt_sh): Likewise.
	(_mm_mask_rsqrt_sh): Likewise.
	(_mm_maskz_rsqrt_sh): Likewise.
	(_mm_sqrt_sh): Likewise.
	(_mm_mask_sqrt_sh): Likewise.
	(_mm_maskz_sqrt_sh): Likewise.
	(_mm_sqrt_round_sh): Likewise.
	(_mm_mask_sqrt_round_sh): Likewise.
	(_mm_maskz_sqrt_round_sh): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_sqrt_ph): New intrinsic.
	(_mm256_sqrt_ph): Likewise.
	(_mm_mask_sqrt_ph): Likewise.
	(_mm256_mask_sqrt_ph): Likewise.
	(_mm_maskz_sqrt_ph): Likewise.
	(_mm256_maskz_sqrt_ph): Likewise.
	(_mm_rsqrt_ph): Likewise.
	(_mm256_rsqrt_ph): Likewise.
	(_mm_mask_rsqrt_ph): Likewise.
	(_mm256_mask_rsqrt_ph): Likewise.
	(_mm_maskz_rsqrt_ph): Likewise.
	(_mm256_maskz_rsqrt_ph): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c
	(ix86_expand_args_builtin): Handle new builtins.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/sse.md (VF_AVX512FP16VL): New.
	(sqrt<mode>2): Adjust for HF vector modes.
	(<sse>_sqrt<mode>2<mask_name><round_name>): Likewise.
	(<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>):
	Likewise.
	(<sse>_rsqrt<mode>2<mask_name>): New.
	(avx512fp16_vmrsqrtv8hf2<mask_scalar_name>): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
2021-09-14 12:34:56 +08:00
Jason Merrill
22abfa3166 c++: Fix warning on 32-bit x86
My C++17 hardware interference sizes patch caused a bogus warning on 32-bit
x86, where we have a default L1 cache line size of 0, and the front end
complained that the default constructive interference size of 64 was larger
than that.

gcc/cp/ChangeLog:

	* decl.c (cxx_init_decl_processing): Don't warn if L1 cache line
	size is smaller than maxalign.
2021-09-13 23:16:39 -04:00
GCC Administrator
07985c47dc Daily bump.
2021-09-14 00:16:23 +00:00
Harald Anlauf
104c05c528 Fortran - ensure simplification of bounds of array-valued named constants
gcc/fortran/ChangeLog:

	PR fortran/82314
	* decl.c (add_init_expr_to_sym): For proper initialization of
	array-valued named constants the array bounds need to be
	simplified before adding the initializer.

gcc/testsuite/ChangeLog:

	PR fortran/82314
	* gfortran.dg/pr82314.f90: New test.
2021-09-13 19:28:10 +02:00
Harald Anlauf
8d93ba93d3 Fortran - fix handling of substring start and end indices
gcc/fortran/ChangeLog:

	PR fortran/85130
	* expr.c (find_substring_ref): Handle given substring start and
	end indices as signed integers, not unsigned.

gcc/testsuite/ChangeLog:

	PR fortran/85130
	* gfortran.dg/substr_6.f90: Revert commit r8-7574, adding again
	test that was erroneously considered as illegal.
2021-09-13 19:26:35 +02:00
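The signed-versus-unsigned issue behind the substring fix above (8d93ba93d3) can be illustrated outside the compiler. The following is only a sketch in C++, not the code in expr.c: the names fortran_substring and looks_empty_unsigned are hypothetical, and clamping out-of-range bounds is a simplification chosen for the sketch. It relies on the Fortran rule that a substring whose start exceeds its end has length zero, so an end index of 0 or below can appear in valid code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: return text(start:end) using Fortran-style 1-based,
// inclusive bounds that are kept as signed integers.
std::string fortran_substring(const std::string& text, long start, long end) {
    if (start > end)
        return std::string();  // start > end means a zero-length substring

    // Simplification for this sketch: clamp the bounds to the string.
    const long length = static_cast<long>(text.size());
    if (start < 1) start = 1;
    if (end > length) end = length;
    if (start > end)
        return std::string();

    return text.substr(static_cast<std::size_t>(start - 1),
                       static_cast<std::size_t>(end - start + 1));
}

// The same emptiness test after converting the bounds to unsigned.
// A negative end wraps to a huge value, so the substring no longer
// looks empty.
bool looks_empty_unsigned(long start, long end) {
    return static_cast<unsigned long>(start) > static_cast<unsigned long>(end);
}
```

With signed bounds, fortran_substring("abcdef", 3, -1) is empty, while looks_empty_unsigned(3, -1) reports the range as non-empty.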
Thomas Schwinge
6c79057fae Don't maintain a warning spec for 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' [PR101574]
This resolves PR101574 "gcc/sparseset.h:215:20: error: suggest parentheses
around assignment used as truth value [-Werror=parentheses]", as (bogusly)
reported at commit a61f6afbee370785cf091fe46e2e022748528307:

    In file included from [...]/source-gcc/gcc/lra-lives.c:43:
    [...]/source-gcc/gcc/lra-lives.c: In function ‘void make_hard_regno_dead(int)’:
    [...]/source-gcc/gcc/sparseset.h:215:20: error: suggest parentheses around assignment used as truth value [-Werror=parentheses]
      215 |        && (((ITER) = sparseset_iter_elm (SPARSESET)) || 1);             \
          |            ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    [...]/source-gcc/gcc/lra-lives.c:304:3: note: in expansion of macro ‘EXECUTE_IF_SET_IN_SPARSESET’
      304 |   EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
          |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~

	gcc/
	PR bootstrap/101574
	* diagnostic-spec.c (warning_suppressed_at, copy_warning): Handle
	'RESERVED_LOCATION_P' locations.
	* warning-control.cc (get_nowarn_spec, suppress_warning)
	(copy_warning): Likewise.
2021-09-13 18:38:52 +02:00
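The diagnostic quoted in the commit above (6c79057fae) concerns an idiom in which an assignment is deliberately part of a loop condition. A minimal C++ sketch of that idiom follows; sum_all is a hypothetical function and not the sparseset.h macro. The extra parentheses around the assignment are the conventional way to tell -Wparentheses that the assignment is intended, and the "|| 1" keeps the loop running when the assigned value is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: fetch each element inside the loop condition,
// in the style of the iteration macro quoted in the commit message.
int sum_all(const std::vector<int>& values) {
    int sum = 0;
    int elem = 0;
    for (std::size_t i = 0;
         i < values.size() && ((elem = values[i]) || 1);  // assignment is intended
         ++i)
        sum += elem;
    return sum;
}
```

Zero elements do not end the loop early: sum_all({0, 1, 2, 0, 3}) visits all five elements.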
Thomas Schwinge
1985392242 Clarify 'key_type_t' to 'location_t' as used for 'gcc/diagnostic-spec.h:nowarn_map'
To make it obvious what exactly the key type is.  No change in behavior.

	gcc/
	* diagnostic-spec.h (typedef xint_hash_t): Use 'location_t' instead of...
	(typedef key_type_t): ... this.  Remove.
	(nowarn_map): Document.
	* diagnostic-spec.c (nowarn_map): Likewise.
	* warning-control.cc (convert_to_key): Evolve functions into...
	(get_location): ... these.  Adjust all users.
2021-09-13 18:38:51 +02:00
Thomas Schwinge
accf94329d Simplify 'gcc/diagnostic-spec.h:nowarn_map' setup
If we've just read something from the map, we can be sure that it exists.

	gcc/
	* warning-control.cc (copy_warning): Remove 'nowarn_map' setup.
2021-09-13 18:38:50 +02:00
Jason Merrill
76b75018b3 c++: implement C++17 hardware interference size
The last missing piece of the C++17 standard library is the hardware
interference size constants.  Much of the delay in implementing these has
been due to uncertainty about what the right values are, and even whether
there is a single constant value that is suitable; the destructive
interference size is intended to be used in structure layout, so program
ABIs will depend on it.

In principle, both of these values should be the same as the target's L1
cache line size.  When compiling for a generic target that is intended to
support a range of target CPUs with different cache line sizes, the
constructive size should probably be the minimum size, and the destructive
size the maximum, unless you are constrained by ABI compatibility with
previous code.

From discussion on gcc-patches, I've come to the conclusion that the
solution to the difficulty of choosing stable values is to give up on it,
and instead encourage only uses where ABI stability is unimportant: in
particular, uses where the ABI is shared at most between translation units
built at the same time with the same flags.

To that end, I've added a warning for any use of the constant value of
std::hardware_destructive_interference_size in a header or module export.
Appropriate uses within a project can disable the warning.

A previous iteration of this patch included an -finterference-tune flag to
make the value vary with -mtune; this iteration makes that the default
behavior, which should be appropriate for all reasonable uses of the
variable.  The previous default of "stable-ish" seems to me likely to have
been more of an attractive nuisance; since we can't promise actual
stability, we should instead make proper uses more convenient.

JF Bastien's implementation proposal is summarized at
https://github.com/itanium-cxx-abi/cxx-abi/issues/74

I implement this by adding new --params for the two sizes.  Targets can
override these values in targetm.target_option.override() to support a range
of values for the generic target; otherwise, both will default to the L1
cache line size.

64 bytes still seems correct for all x86.

I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the Cortex
A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.

He proposed 64/128 for generic AArch64, but since the A64FX now has a 256B
cache line, I've changed that to 64/256.

Other arch maintainers are invited to set ranges for their generic targets
if that seems better than using the default cache line size for both values.

With the above choice to reject stability as a goal, getting these values
"right" is now just a matter of what we want the default optimization to be,
and we can feel free to adjust them as CPUs with different cache lines
become more and less common.

gcc/ChangeLog:

	* params.opt: Add destructive-interference-size and
	constructive-interference-size.
	* doc/invoke.texi: Document them.
	* config/aarch64/aarch64.c (aarch64_override_options_internal):
	Set them.
	* config/arm/arm.c (arm_option_override): Set them.
	* config/i386/i386-options.c (ix86_option_override_internal):
	Set them.

gcc/c-family/ChangeLog:

	* c.opt: Add -Winterference-size.
	* c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
	and __GCC_CONSTRUCTIVE_SIZE.

gcc/cp/ChangeLog:

	* constexpr.c (maybe_warn_about_constant_value):
	Complain about std::hardware_destructive_interference_size.
	(cxx_eval_constant_expression): Call it.
	* decl.c (cxx_init_decl_processing): Check
	--param *-interference-size values.

libstdc++-v3/ChangeLog:

	* include/std/version: Define __cpp_lib_hardware_interference_size.
	* libsupc++/new: Define hardware interference size variables.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Winterference.H: New file.
	* g++.dg/warn/Winterference.C: New test.
	* g++.target/aarch64/interference.C: New test.
	* g++.target/arm/interference.C: New test.
	* g++.target/i386/interference.C: New test.
2021-09-13 12:28:06 -04:00
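The kind of use the commit above (76b75018b3) encourages, where the value stays inside one translation unit and is not part of a shared ABI, might look like the following C++ sketch. The names kDestructive, PaddedCounter, Counters and on_separate_lines are hypothetical, and the fallback value of 64 is an assumption for libraries that do not define __cpp_lib_hardware_interference_size.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t kDestructive =
    std::hardware_destructive_interference_size;
#else
inline constexpr std::size_t kDestructive = 64;  // assumed fallback value
#endif

// Each counter is aligned (and therefore padded) to the destructive
// interference size, so two counters should not share a cache line.
struct alignas(kDestructive) PaddedCounter {
    std::atomic<long> value{0};
};

struct Counters {
    PaddedCounter hits;
    PaddedCounter misses;
};

// Distance in bytes between the two counters of one Counters object.
bool on_separate_lines(const Counters& c) {
    const char* first = reinterpret_cast<const char*>(&c.hits);
    const char* second = reinterpret_cast<const char*>(&c.misses);
    return static_cast<std::size_t>(second - first) >= kDestructive;
}
```

Because the layout of Counters depends on the tuning in effect, a type like this belongs in a source file rather than in a header shared across separately built components, which is the situation the new warning is meant to flag.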
Martin Liska
8ea292591e i386: support micro-levels in target{,_clone} attrs [PR101696]
As mentioned in the PR, we lack support for the x86-64 micro-architecture
levels in the target and target_clones attributes. While the levels
x86-64, x86-64-v2, x86-64-v3 and x86-64-v4 are accepted by the -march
option, they are actually only aliases for the k8 CPU. That said, they are
closer to the __builtin_cpu_supports function, so we decided to implement
them there.

	PR target/101696

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
	for x86-64 micro levels for __builtin_cpu_supports.
	* common/config/i386/i386-cpuinfo.h (enum feature_priority):
	Add priorities for the micro-arch levels.
	(enum processor_features): Add new features.
	* common/config/i386/i386-isas.h: Add micro-arch features.
	* config/i386/i386-builtins.c (get_builtin_code_for_version):
	Support the micro-arch levels by calling
	__builtin_cpu_supports.
	* doc/extend.texi: Document that the levels are supported by
	__builtin_cpu_supports.

gcc/testsuite/ChangeLog:

	* g++.target/i386/mv30.C: New test.
	* gcc.target/i386/mvc16.c: New test.
	* gcc.target/i386/builtin_target.c (CHECK___builtin_cpu_supports):
	New.

Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
2021-09-13 17:24:48 +02:00
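A query for the micro-architecture levels through __builtin_cpu_supports, as added by the commit above (8ea292591e), could be sketched as follows. The function x86_64_micro_level is hypothetical, and the preprocessor guard is an assumption: it presumes the level names are accepted from GCC 12 on, the release series this commit belongs to, and falls back to 0 elsewhere.

```cpp
#include <cassert>

// Hypothetical helper: return the highest x86-64 micro-architecture
// level (1 to 4) reported for the running CPU, or 0 when the query is
// not available with this compiler or target.
int x86_64_micro_level() {
#if defined(__x86_64__) && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 12
    __builtin_cpu_init();  // harmless if the CPU data is already initialized
    if (__builtin_cpu_supports("x86-64-v4")) return 4;
    if (__builtin_cpu_supports("x86-64-v3")) return 3;
    if (__builtin_cpu_supports("x86-64-v2")) return 2;
    if (__builtin_cpu_supports("x86-64")) return 1;
    return 0;
#else
    return 0;  // assumption: treat other compilers and targets as unknown
#endif
}
```

The levels are checked from highest to lowest so that the first match is the best level the CPU offers.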
Andrew Pinski
03312cbd54 [aarch64] Fix target/95969: __builtin_aarch64_im_lane_boundsi interferes with gimple
This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
we are not going to error out. It fixes the problem by the removal
of the function from the IR.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

	PR target/95969
	* config/aarch64/aarch64-builtins.c (aarch64_fold_builtin_lane_check):
	New function.
	(aarch64_general_fold_builtin): Handle AARCH64_SIMD_BUILTIN_LANE_CHECK.
	(aarch64_general_gimple_fold_builtin): Likewise.

gcc/testsuite/ChangeLog:

	PR target/95969
	* gcc.target/aarch64/lane-bound-1.c: New test.
	* gcc.target/aarch64/lane-bound-2.c: New test.
2021-09-13 15:19:05 +00:00
Andrew Pinski
20f3c16820 Remove m32r{,le}-*-linux* support from GCC
m32r support never made it to glibc, and support in the Linux kernel
was removed with 4.18. This does not remove much, but there is no reason
to keep a port that never worked or whose support in other projects
is gone.

OK? Checked to make sure m32r-linux and m32rle-linux were rejected
when building.

contrib/ChangeLog:

	* config-list.mk: Remove m32r-linux and m32rle-linux
	from the list.

gcc/ChangeLog:

	* config.gcc: Add m32r-*-linux* and m32rle-*-linux*
	to the Unsupported targets list.
	Remove support for m32r-*-linux* and m32rle-*-linux*.
	* config/m32r/linux.h: Removed.
	* config/m32r/t-linux: Removed.

libgcc/ChangeLog:

	* config.host: Remove m32r-*-linux* and m32rle-*-linux*.
	* config/m32r/libgcc-glibc.ver: Removed.
	* config/m32r/t-linux: Removed.
2021-09-13 15:16:56 +00:00
Andrew Pinski
9e58de3ce0 Fix PR lto/49664: liblto_plugin.so exports too many symbols
So right now liblto_plugin.so exports many libiberty symbols and
simple_object file symbols but really it just needs to export onload.

This fixes the problem by using "-export-symbols-regex onload" on
the libtool link line.

lto-plugin/ChangeLog:

	PR lto/49664
	* Makefile.am: Export only onload.
	* Makefile.in: Regenerate.
2021-09-13 15:16:56 +00:00
Kyrylo Tkachov
512b383534 aarch64: PR target/102252 Invalid addressing mode for SVE load predicate
In the testcase we generate invalid assembly for an SVE load predicate instruction.
The RTL for the insn is:
(insn 9 8 10 (set (reg:VNx16BI 68 p0)
        (mem:VNx16BI (plus:DI (mult:DI (reg:DI 1 x1 [93])
                    (const_int 8 [0x8]))
                (reg/f:DI 0 x0 [92])) [2 work_3(D)->array[offset_4(D)]+0 S8 A16]))

That addressing mode is not valid for the instruction [1] as it only accepts the addressing mode:
[<Xn|SP>{, #<imm>, MUL VL}]

This patch rejects the register index form for SVE predicate modes.

Bootstrapped and tested on aarch64-none-linux-gnu.

[1] https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions/LDR--predicate---Load-predicate-register-

gcc/ChangeLog:

	PR target/102252
	* config/aarch64/aarch64.c (aarch64_classify_address): Don't allow
	register index for SVE predicate modes.

gcc/testsuite/ChangeLog:

	PR target/102252
	* g++.target/aarch64/sve/pr102252.C: New test.
2021-09-13 15:41:54 +01:00
Aldy Hernandez
c7a669af0a Remove references to FSM threads.
Now that the jump thread back registry has been split into the generic
copier and the custom (old) copier, it becomes trivial to remove the
FSM bits from the jump threaders.

First, there's no need for an EDGE_FSM_THREAD type.  The only reason
we were looking at the threading type was to determine what type of
copier to use, and now that the copier has been split, there's no need
to even look.  However, there is one check in register_jump_thread
where we verify that only the generic copier can thread through
back-edges.  I've removed that check in favor of a flag passed to the
constructor.

I've also removed all the FSM references from the code and tests.
Interestingly, some tests weren't even testing the right thing.  They
were testing for "FSM" which would catch jump thread paths as well as
the backward threader *failing* on registering a path.  *big eye roll*

The only remaining code that was actually checking for EDGE_FSM_THREAD
was adjust_paths_after_duplication, and the checks could be written
without looking at the edge type at all.  For the record, the code
there is horrible: it's convoluted, hard to read, and doesn't have any
tests.  I'd smack myself if I could go back in time.

All that remains are the FSM references in the --param's themselves.
I think we should s/fsm/threader/, since I envision a day when we can
share the cost basis code between the threaders.  However, I don't
know what the proper procedure is for renaming existing compiler
options.

By the way, param_fsm_maximum_phi_arguments is no longer relevant
after the rewrite.  We can nuke that one right away.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c
	(back_threader_profitability::profitable_path_p): Remove FSM
	references.
	(back_threader_registry::register_path): Same.
	* tree-ssa-threadedge.c
	(jump_threader::simplify_control_stmt_condition): Same.
	* tree-ssa-threadupdate.c (jt_path_registry::jt_path_registry):
	Add backedge_threads argument.
	(fwd_jt_path_registry::fwd_jt_path_registry): Pass
	backedge_threads argument.
	(back_jt_path_registry::back_jt_path_registry):  Same.
	(dump_jump_thread_path): Adjust for FSM removal.
	(back_jt_path_registry::rewire_first_differing_edge): Same.
	(back_jt_path_registry::adjust_paths_after_duplication): Same.
	(back_jt_path_registry::update_cfg): Same.
	(jt_path_registry::register_jump_thread): Same.
	* tree-ssa-threadupdate.h (enum jump_thread_edge_type): Remove
	EDGE_FSM_THREAD.
	(class back_jt_path_registry): Add backedge_threads to
	constructor.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr21417.c: Adjust for FSM removal.
	* gcc.dg/tree-ssa/pr66752-3.c: Same.
	* gcc.dg/tree-ssa/pr68198.c: Same.
	* gcc.dg/tree-ssa/pr69196-1.c: Same.
	* gcc.dg/tree-ssa/pr70232.c: Same.
	* gcc.dg/tree-ssa/pr77445.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-13.c: Same.
2021-09-13 16:34:48 +02:00
Patrick Palka
c8b2b89358 c++: parameter pack inside constexpr if [PR101764]
Here when partially instantiating the first pack expansion, substitution
into the condition of the constexpr if yields a still-dependent tree, so
tsubst_expr returns an IF_STMT with an unsubstituted IF_COND and with
IF_STMT_EXTRA_ARGS added.  Hence after partial instantiation the pack
expansion pattern still refers to the unlowered parameter pack 'ts' of
level 2, and it is thus recorded in the new PACK_EXPANSION_PARAMETER_PACKS.
During the subsequent final instantiation of the regenerated lambda we
crash in tsubst_pack_expansion because it can't find an argument pack
for this unlowered 'ts', due to the level mismatch.  (Likewise when the
constexpr if is replaced by a requires-expr, which also uses the extra
args mechanism for avoiding partial instantiation.)

So essentially, a pack expansion pattern that contains an "extra args"
tree doesn't play well with partial instantiation.  This patch fixes
this by forcing such pack expansions to use the extra args mechanism as
well.

	PR c++/101764

gcc/cp/ChangeLog:

	* cp-tree.h (PACK_EXPANSION_FORCE_EXTRA_ARGS_P): New accessor
	macro.
	* pt.c (has_extra_args_mechanism_p): New function.
	(find_parameter_pack_data::found_extra_args_tree_p): New data
	member.
	(find_parameter_packs_r): Set ppd->found_extra_args_tree_p
	appropriately.
	(make_pack_expansion): Set PACK_EXPANSION_FORCE_EXTRA_ARGS_P if
	ppd.found_extra_args_tree_p.
	(use_pack_expansion_extra_args_p): Return true if there were
	unsubstituted packs and PACK_EXPANSION_FORCE_EXTRA_ARGS_P.
	(tsubst_pack_expansion): Pass the pack expansion to
	use_pack_expansion_extra_args_p.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/constexpr-if35.C: New test.
2021-09-13 10:29:32 -04:00
Martin Liska
90ac6edc3c c++: fix -fsanitize-coverage=trace-pc ICE [PR101331]
PR c++/101331

gcc/ChangeLog:

	* asan.h (sanitize_coverage_p): Handle when fn == NULL.

gcc/testsuite/ChangeLog:

	* g++.dg/pr101331.C: New test.
2021-09-13 15:34:23 +02:00
Aldy Hernandez
a7f59856ea Adjust ssa-dom-thread-7.c on aarch64.
gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust for aarch64.
2021-09-13 14:25:48 +02:00
H.J. Lu
5b01bfeb87 x86: Add TARGET_AVX256_[MOVE|STORE]_BY_PIECES
1. Add TARGET_AVX256_MOVE_BY_PIECES to perform move by-pieces operation
with 256-bit AVX instructions.
2. Add TARGET_AVX256_STORE_BY_PIECES to perform move and store by-pieces
operations with 256-bit AVX instructions.

They are enabled only for Intel Alder Lake and Intel processors with
AVX512.

gcc/

	PR target/101935
	* config/i386/i386.h (TARGET_AVX256_MOVE_BY_PIECES): New.
	(TARGET_AVX256_STORE_BY_PIECES): Likewise.
	(MOVE_MAX): Check TARGET_AVX256_MOVE_BY_PIECES and
	TARGET_AVX256_STORE_BY_PIECES instead of
	TARGET_AVX256_SPLIT_UNALIGNED_LOAD and
	TARGET_AVX256_SPLIT_UNALIGNED_STORE.
	(STORE_MAX_PIECES): Check TARGET_AVX256_STORE_BY_PIECES instead
	of TARGET_AVX256_SPLIT_UNALIGNED_STORE.
	* config/i386/x86-tune.def (X86_TUNE_AVX256_MOVE_BY_PIECES): New.
	(X86_TUNE_AVX256_STORE_BY_PIECES): Likewise.

gcc/testsuite/

	PR target/101935
	* g++.target/i386/pr80566-1.C: Add
	-mtune-ctrl=avx256_store_by_pieces.
	* gcc.target/i386/pr100865-4a.c: Likewise.
	* gcc.target/i386/pr100865-10a.c: Likewise.
	* gcc.target/i386/pr90773-20.c: Likewise.
	* gcc.target/i386/pr90773-21.c: Likewise.
	* gcc.target/i386/pr90773-22.c: Likewise.
	* gcc.target/i386/pr90773-23.c: Likewise.
	* g++.target/i386/pr80566-2.C: Add
	-mtune-ctrl=avx256_move_by_pieces.
	* gcc.target/i386/eh_return-1.c: Likewise.
	* gcc.target/i386/pr90773-26.c: Likewise.
	* gcc.target/i386/pieces-memcpy-12.c: Replace -mtune=haswell
	with -mtune-ctrl=avx256_move_by_pieces.
	* gcc.target/i386/pieces-memcpy-15.c: Likewise.
	* gcc.target/i386/pieces-memset-2.c: Replace -mtune=haswell
	with -mtune-ctrl=avx256_store_by_pieces.
	* gcc.target/i386/pieces-memset-5.c: Likewise.
	* gcc.target/i386/pieces-memset-11.c: Likewise.
	* gcc.target/i386/pieces-memset-14.c: Likewise.
	* gcc.target/i386/pieces-memset-20.c: Likewise.
	* gcc.target/i386/pieces-memset-23.c: Likewise.
	* gcc.target/i386/pieces-memset-29.c: Likewise.
	* gcc.target/i386/pieces-memset-30.c: Likewise.
	* gcc.target/i386/pieces-memset-33.c: Likewise.
	* gcc.target/i386/pieces-memset-34.c: Likewise.
	* gcc.target/i386/pieces-memset-44.c: Likewise.
	* gcc.target/i386/pieces-memset-37.c: Replace -mtune=generic
	with -mtune-ctrl=avx256_store_by_pieces.
2021-09-13 19:55:29 +08:00
liuhongt
c8e4cb8adf Use gen_lowpart_if_possible instead of gen_lowpart to avoid ICE.
gcc/ChangeLog:

	PR bootstrap/102302
	* expmed.c (extract_bit_field_using_extv): Use
	gen_lowpart_if_possible instead of gen_lowpart to avoid ICE.
2021-09-13 19:52:23 +08:00
Aldy Hernandez
924326b3e0 Move pointer_equiv_analyzer to new file.
We need to use the pointer equivalence tracking from evrp in the jump
threader.  Instead of moving it to some *evrp.h header, it's cleaner for
it to live in its own file, since it's completely independent and not
evrp specific.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* Makefile.in (OBJS): Add value-pointer-equiv.o.
	* gimple-ssa-evrp.c (class ssa_equiv_stack): Move to
	value-pointer-equiv.*.
	(ssa_equiv_stack::ssa_equiv_stack): Same.
	(ssa_equiv_stack::enter): Same.
	(ssa_equiv_stack::leave): Same.
	(ssa_equiv_stack::push_replacement): Same.
	(ssa_equiv_stack::get_replacement): Same.
	(is_pointer_ssa): Same.
	(class pointer_equiv_analyzer): Same.
	(pointer_equiv_analyzer::pointer_equiv_analyzer): Same.
	(pointer_equiv_analyzer::~pointer_equiv_analyzer): Same.
	(pointer_equiv_analyzer::set_global_equiv): Same.
	(pointer_equiv_analyzer::set_cond_equiv): Same.
	(pointer_equiv_analyzer::get_equiv): Same.
	(pointer_equiv_analyzer::enter): Same.
	(pointer_equiv_analyzer::leave): Same.
	(pointer_equiv_analyzer::get_equiv_expr): Same.
	(pta_valueize): Same.
	(pointer_equiv_analyzer::visit_stmt): Same.
	(pointer_equiv_analyzer::visit_edge): Same.
	(hybrid_folder::value_of_expr): Same.
	(hybrid_folder::value_on_edge): Same.
	* value-pointer-equiv.cc: New file.
	* value-pointer-equiv.h: New file.
2021-09-13 13:31:35 +02:00
Richard Earnshaw
5f6a6c91d7 gimple: allow more folding of memcpy [PR102125]
The current restriction on folding memcpy to a single element of size
MOVE_MAX is excessively cautious on most machines and limits some
significant further optimizations.  So relax the restriction provided
the copy size does not exceed MOVE_MAX * MOVE_RATIO and that a SET
insn exists for moving the value into machine registers.

Note that there were already checks in place for having misaligned
move operations when one or more of the operands were unaligned.

On Arm this now permits optimizing

uint64_t bar64(const uint8_t *rData1)
{
    uint64_t buffer;
    memcpy(&buffer, rData1, sizeof(buffer));
    return buffer;
}

from
        ldr     r2, [r0]        @ unaligned
        sub     sp, sp, #8
        ldr     r3, [r0, #4]    @ unaligned
        strd    r2, [sp]
        ldrd    r0, [sp]
        add     sp, sp, #8

to
        mov     r3, r0
        ldr     r0, [r0]        @ unaligned
        ldr     r1, [r3, #4]    @ unaligned

PR target/102125 - (ARM Cortex-M3 and newer) missed optimization. memcpy not needed operations

gcc/ChangeLog:

	PR target/102125
	* gimple-fold.c (gimple_fold_builtin_memory_op): Allow folding
	memcpy if the size is not more than MOVE_MAX * MOVE_RATIO.
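
For readers who want to try the example above, here is a self-contained sketch of the same function with the headers it needs. The helper load_at_offset1 is hypothetical and only serves to pass a source pointer that is not guaranteed to be 8-byte aligned. Whether the memcpy is actually folded into plain loads depends on the target and the options used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same function as in the commit message: memcpy is the portable
   way to express a possibly unaligned 64-bit load.  */
static uint64_t bar64(const uint8_t *rData1)
{
    uint64_t buffer;
    memcpy(&buffer, rData1, sizeof(buffer));
    return buffer;
}

/* Hypothetical helper (not part of the patch): read from an odd
   offset so the source is not guaranteed to be 8-byte aligned.  */
static uint64_t load_at_offset1(const uint8_t *bytes)
{
    return bar64(bytes + 1);
}
```

Compiling this with -O2 -S for an Arm target before and after the change is one way to compare the generated code with the two listings above.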
2021-09-13 11:26:48 +01:00
Richard Earnshaw
f0cfd070b6 arm: expand handling of movmisalign for DImode [PR102125]
DImode is currently handled only for machines with vector modes
enabled, but this is unduly restrictive and is generally better done
in core registers.

gcc/ChangeLog:

	PR target/102125
	* config/arm/arm.md (movmisaligndi): New define_expand.
	* config/arm/vec-common.md (movmisalign<mode>): Iterate over VDQ mode.
2021-09-13 11:26:48 +01:00
Richard Earnshaw
408e8b9066 rtl: directly handle MEM in gen_highpart [PR102125]
gen_lowpart_general handles forming a lowpart of a MEM by using
adjust_address to rework and validate a new version of the MEM.
Do the same for gen_highpart rather than calling simplify_gen_subreg
for this case.

gcc/ChangeLog:

	PR target/102125
	* emit-rtl.c (gen_highpart): Use adjust_address to handle
	MEM rather than calling simplify_gen_subreg.
2021-09-13 11:26:47 +01:00
Jan-Benedict Glaw
c012297c9d cr16-elf is now obsoleted
As we are still building it for ./contrib/config-list.mk, let's add
--enable-obsolete so this has a chance to work.

contrib/ChangeLog:

	* config-list.mk (LIST): --enable-obsolete for cr16-elf.
2021-09-13 12:13:17 +02:00
Jan-Benedict Glaw
f42e95a830 Fix multi-statement macro
INIT_CUMULATIVE_ARGS() expands to multiple statements, which will break right
after an `if` statement. Wrap it into a block.

gcc/ChangeLog:

	* config/alpha/vms.h (INIT_CUMULATIVE_ARGS): Wrap multi-statement
	define into a block.
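
The hazard described in this commit can be sketched in plain C. The macro names INIT_PAIR_UNSAFE and INIT_PAIR_SAFE and the two helper functions are hypothetical and are not taken from vms.h. The sketch uses the common do/while (0) wrapper, which may differ from the exact form of block used in the patch.

```c
#include <assert.h>

/* Hypothetical two-statement macro: after an `if`, only the first
   statement is governed by the condition.  */
#define INIT_PAIR_UNSAFE(a, b) (a) = 1; (b) = 2

/* Wrapped variant: the whole body expands to a single statement.  */
#define INIT_PAIR_SAFE(a, b) do { (a) = 1; (b) = 2; } while (0)

/* Returns b after a conditional init with the unsafe macro.  */
static int unsafe_b(int cond)
{
    int a = 0, b = 0;
    if (cond)
        INIT_PAIR_UNSAFE(a, b);   /* `b = 2` runs unconditionally */
    (void) a;
    return b;
}

/* Returns b after a conditional init with the wrapped macro.  */
static int safe_b(int cond)
{
    int a = 0, b = 0;
    if (cond)
        INIT_PAIR_SAFE(a, b);     /* both assignments are guarded */
    (void) a;
    return b;
}
```

With a false condition, unsafe_b still returns 2 because the second assignment escapes the `if`, while safe_b returns 0.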
2021-09-13 12:08:25 +02:00
Richard Biener
c86de344f8 Remove DARWIN_PREFER_DWARF and dead code
This removes the always defined DARWIN_PREFER_DWARF and the code
guarded by it being not defined, removing the possibility to
default some i386 darwin configurations to STABS when it would
not be defined.

2021-09-10  Richard Biener  <rguenther@suse.de>

	* config/darwin.h (DARWIN_PREFER_DWARF): Do not define.
	* config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Do not
	change based on DARWIN_PREFER_DWARF not being defined.
2021-09-13 11:32:40 +02:00
Richard Biener
2071a0ed77 Fix i686-lynx build breakage
With the last adjustment I failed to remove a stray undef of
PREFERRED_DEBUGGING_TYPE from config/i386/lynx.h

2021-09-13  Richard Biener  <rguenther@suse.de>

	* config/i386/lynx.h: Remove undef of PREFERRED_DEBUGGING_TYPE
	to inherit from elfos.h
2021-09-13 11:32:40 +02:00
Richard Biener
a7348a1833 Add cr16-*-* to the list of obsoleted targets
This adds cr16-*-* to the list of obsoleted targets in config.gcc

2021-09-13  Richard Biener  <rguenther@suse.de>

	* config.gcc: Add cr16-*-* to the list of obsoleted targets.
2021-09-13 11:25:36 +02:00
Richard Biener
716e03f9f3 Default AVR to DWARF2 debug
This switches the AVR port to generate DWARF2 debugging info by
default since the support for STABS is going to be deprecated for
GCC 12.

2021-09-10  Richard Biener  <rguenther@suse.de>

	* config/avr/elf.h (PREFERRED_DEBUGGING_TYPE): Remove
	override, pick up DWARF2_DEBUG define from elfos.h
2021-09-13 11:17:34 +02:00
Richard Biener
d399e43a91 Always default to DWARF2 debugging for RX, even with -mas100-syntax
The RX port defaults to STABS when -mas100-syntax is used because
the AS100 assembler does not support some of the pseudo-ops used
by DWARF2 debug emission.  Since STABS is going to be deprecated
that has to change.  The following simply always uses DWARF2,
likely leaving -mas100-syntax broken when debug info is generated.

Can the RX port maintainer please sort out the situation?

2021-09-10  Richard Biener  <rguenther@suse.de>

	* config/rx/rx.h (PREFERRED_DEBUGGING_TYPE): Always define to
	DWARF2_DEBUG.
2021-09-13 11:17:34 +02:00
Richard Biener
113ff25217 Default Alpha/VMS to DWARF2 debugging only
This changes the default debug format for Alpha/VMS to DWARF2 only,
skipping emission of VMS debug info which is going to be deprecated
for GCC 12 alongside the support for STABS.

2021-09-10  Richard Biener  <rguenther@suse.de>

	* config/alpha/vms.h (PREFERRED_DEBUGGING_TYPE): Define to
	DWARF2_DEBUG.
2021-09-13 11:17:33 +02:00
Richard Biener
2ebb6f6e51 Always default to DWARF2 debug for cygwin and mingw
This removes the fallback to STABS as default for cygwin and mingw
when the assembler does not support .secrel32 and the default is
to emit 32bit code.  Support for .secrel32 was added to binutils 2.16
released in 2005, so instead document that as a requirement.

I left the now unused check for .secrel32 in configure around
in case somebody wants to turn that into an error or warning.

2021-09-10  Richard Biener  <rguenther@suse.de>

	* config/i386/cygming.h: Always default to DWARF2 debugging.
	Do not define DBX_DEBUGGING_INFO, that's done via dbxcoff.h
	already.
	* doc/install.texi: Document binutils 2.16 as minimum
	requirement for mingw.
2021-09-13 11:17:33 +02:00
Andreas Schwab
fc4a29c078 libgfortran: Handle m68k extended real format in ISO_Fortran_binding.h
libgfortran/
	* ISO_Fortran_binding.h (CFI_type_long_double)
	(CFI_type_long_double_Complex) [LDBL_MANT_DIG == 64 &&
	LDBL_MIN_EXP == -16382 && LDBL_MAX_EXP == 16384]: Define.
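
The preprocessor condition in this entry can be read as a predicate over the three <float.h> characteristics. The sketch below restates it as an ordinary function so it can be checked with explicit values. The function name is hypothetical and nothing here is taken from ISO_Fortran_binding.h itself.

```c
#include <assert.h>

/* Hypothetical predicate mirroring the ChangeLog condition:
   LDBL_MANT_DIG == 64 && LDBL_MIN_EXP == -16382 && LDBL_MAX_EXP == 16384.
   Returns 1 when the three values match that format, 0 otherwise.  */
static int is_m68k_extended_format(int mant_dig, int min_exp, int max_exp)
{
    return mant_dig == 64 && min_exp == -16382 && max_exp == 16384;
}
```

The x87 extended format differs only in its minimum exponent (LDBL_MIN_EXP is -16381 there), which is why the m68k case needs its own condition.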
2021-09-13 10:04:01 +02:00
Kewen Lin
fbeead55e0 rs6000: Add load density heuristic
We noticed that the SPEC2017 503.bwaves_r run time degrades by
about 8% on P8 and P9 if we enable vectorization at -O2
-ffast-math (with the cheap vect cost model).  Compared with -Ofast,
the compiler doesn't do the loop interchange on the innermost
loop, so it's not profitable to vectorize it.

Following Richi's comments [1], this uses a similar idea to
overpricing the vector construction fed by VMAT_ELEMENTWISE
or VMAT_STRIDED_SLP.  Instead of adding the extra cost
immediately when costing the vector construction, it first
records how many loads and vectorized statements are in the
given loop.  Later, in rs6000_density_test (called by
finish_cost), it computes the ratio of loads to all vectorized
statements, checks it against the corresponding thresholds
DENSITY_LOAD_NUM_THRESHOLD and DENSITY_LOAD_PCT_THRESHOLD,
and does the actual extra pricing if both thresholds are exceeded.

Note that this new load density heuristic check is based on
some fields in the target cost data which are updated as needed
when scanning each add_stmt_cost entry; it's independent of the
existing check in rs6000_density_test, which requires scanning
non_vect stmts.  Since it compares the load stmt count
against all vectorized stmts, it's a kind of density, so I put
it in function rs6000_density_test.  For the same reason, to
keep it independent, I didn't put it in an else arm of the
existing density threshold check hunk or before that
hunk.

While investigating a -1.04% degradation in 526.blender_r
on Power8, I noticed that the extra penalized cost of 320 on a
single vector construction for mode V16QI is greatly exaggerated,
which makes the final body cost unreliable, so this patch adds
a maximum bound for the extra penalized cost of each vector
construction statement.

Full SPEC2017 performance evaluation on Power8/Power9 with
option combinations:
  * -O2 -ftree-vectorize {,-fvect-cost-model=very-cheap}
    {,-ffast-math}
  * {-O3, -Ofast} {,-funroll-loops}
The bwaves_r degradations on P8/P9 have been fixed, and nothing
else remarkable was observed.  A Power10 -Ofast -funroll-loops run
shows it's neutral, while a -O2 -ftree-vectorize run shows the
bwaves_r degradation is fixed as expected.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570076.html

gcc/ChangeLog:

	* config/rs6000/rs6000.c (struct rs6000_cost_data): New members
	nstmts, nloads and extra_ctor_cost.
	(rs6000_density_test): Add load density related heuristics.  Do
	extra costing on vector construction statements if needed.
	(rs6000_init_cost): Init new members.
	(rs6000_update_target_cost_per_stmt): New function.
	(rs6000_add_stmt_cost): Factor vect_nonmem hunk out to function
	rs6000_update_target_cost_per_stmt and call it.
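
The two-threshold check and the penalty cap described in this commit can be sketched roughly as follows. This is not the code from rs6000.c: the struct, the function names, and the threshold values (20 loads, 45 percent) are assumptions chosen for illustration, and only the member names nstmts and nloads follow the ChangeLog.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed, illustrative values; the real DENSITY_LOAD_NUM_THRESHOLD and
   DENSITY_LOAD_PCT_THRESHOLD are defined in rs6000.c.  */
enum { LOAD_NUM_THRESHOLD = 20, LOAD_PCT_THRESHOLD = 45 };

/* Hypothetical stand-in for the counters kept while costing a loop.  */
struct cost_counts {
    unsigned nstmts;   /* vectorized statements recorded so far */
    unsigned nloads;   /* how many of those are loads */
};

/* True when both the absolute load count and the load percentage
   exceed their thresholds, i.e. when extra pricing would apply.  */
static bool load_density_exceeded(const struct cost_counts *c)
{
    if (c->nstmts == 0)
        return false;
    unsigned load_pct = (c->nloads * 100) / c->nstmts;
    return c->nloads > LOAD_NUM_THRESHOLD && load_pct > LOAD_PCT_THRESHOLD;
}

/* Bound the extra cost charged to one vector construction statement.  */
static unsigned capped_penalty(unsigned penalty, unsigned max_penalty)
{
    return penalty < max_penalty ? penalty : max_penalty;
}
```

Requiring both thresholds keeps small loops with a few loads from being penalized just because their load percentage happens to be high.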
2021-09-13 01:28:59 -05:00
Kewen Lin
b70e2541fe rs6000: Remove typedef for struct rs6000_cost_data
As Segher pointed out, to typedef struct _rs6000_cost_data as
rs6000_cost_data is useless, so rewrite it without typedef.

gcc/ChangeLog:

	* config/rs6000/rs6000.c (struct rs6000_cost_data): Remove typedef.
	(rs6000_init_cost): Adjust.
2021-09-13 01:28:59 -05:00
liuhongt
7f8ee89534 [i386] Remove UNSPEC_{COPYSIGN,XORSIGN}.
gcc/ChangeLog:

	* config/i386/i386.md: (UNSPEC_COPYSIGN): Remove.
	(UNSPEC_XORSIGN): Ditto.
2021-09-13 13:50:40 +08:00
GCC Administrator
e1ab9289be Daily bump.
2021-09-13 00:16:46 +00:00
Iain Buclaw
53a4def0dc d: Don't include terminating null pointer in string expression conversion (PR102185)
This gets re-added by the ExprVisitor when lowering StringExp back into a
STRING_CST during the code generator pass.

	PR d/102185

gcc/d/ChangeLog:

	* d-builtins.cc (d_eval_constant_expression): Don't include
	terminating null pointer in string expression conversion.

gcc/testsuite/ChangeLog:

	* gdc.dg/pr102185.d: New test.
2021-09-12 17:36:19 +02:00
Roger Sayle
b195fae7c1 Also preserve SUBREG_PROMOTED_VAR_P in expr.c's convert_move.
This patch catches another place in the middle-end where it's possible
to preserve the SUBREG_PROMOTED_VAR_P annotation on a subreg to the
benefit of later RTL optimizations.  This adds the same logic to
expr.c's convert_move as recently added to convert_modes.

On nvptx-none, the simple test program:

short foo (char c) { return c; }

currently generates three instructions:

mov.u32	%r23, %ar0;
cvt.u16.u32     %r24, %r23;
cvt.s32.s16     %value, %r24;

with this patch, we now generate just one:

mov.u32 %value, %ar0;

This patch should look familiar: it's almost identical to the recent patch
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578331.html but with
the fix https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578519.html

2021-09-12  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* expr.c (convert_move): Preserve SUBREG_PROMOTED_VAR_P when
	creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P
	subreg.
2021-09-12 15:18:57 +01:00
GCC Administrator
d71126eeea Daily bump.
2021-09-12 00:16:18 +00:00
Ian Lance Taylor
79513dc0b2 compiler: don't pad zero-sized trailing field in results struct
Nothing can take the address of that field anyhow.

Fixes PR go/101994

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343873
2021-09-11 14:20:19 -07:00