191442 Commits

Author SHA1 Message Date
Martin Sebor
3c9f762ad0 Constrain conservative string lengths to array sizes [PR104119].
Resolves:
PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration
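
As a rough illustration of the idea in the subject line (not the code in this patch), a conservative upper bound on a string's length can be capped by the size of the array that holds the string. The helper name constrain_max_len and its parameters are hypothetical:

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: cap a conservative upper
   bound on a string's length by the size of the array that holds it.
   A string stored in char a[N] has at most N - 1 characters before
   the terminating NUL.  */
size_t
constrain_max_len (size_t conservative_max, size_t array_size)
{
  if (array_size == 0)
    return 0;  /* No room even for the NUL: treat the length as zero.  */

  size_t array_max = array_size - 1;
  return conservative_max < array_max ? conservative_max : array_max;
}
```

With a bound this tight, a length that could otherwise look like "anything up to the largest object size" is limited to what the array can actually hold.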

gcc/ChangeLog:

	PR tree-optimization/104119
	* gimple-ssa-sprintf.cc (struct directive): Change argument type.
	(format_none): Same.
	(format_percent): Same.
	(format_integer): Same.
	(format_floating): Same.
	(get_string_length): Same.
	(format_character): Same.
	(format_string): Same.
	(format_plain): Same.
	(format_directive): Same.
	(compute_format_length): Same.
	(handle_printf_call): Same.
	* tree-ssa-strlen.cc (get_range_strlen_dynamic): Same.   Call
	get_maxbound.
	(get_range_strlen_phi): Same.
	(get_maxbound): New function.
	(strlen_pass::get_len_or_size): Adjust to parameter change.
	* tree-ssa-strlen.h (get_range_strlen_dynamic): Change argument type.

gcc/testsuite/ChangeLog:

	PR tree-optimization/104119
	* gcc.dg/tree-ssa/builtin-snprintf-13.c: New test.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-29.c: New test.
2022-02-03 13:27:16 -07:00
Harald Anlauf
4e4252db03 Fortran: reject simplifying TRANSFER for MOLD with storage size 0
gcc/fortran/ChangeLog:

	PR fortran/104311
	* check.cc (gfc_calculate_transfer_sizes): The check for the case
	where the storage size of SOURCE is greater than zero while the
	storage size of MOLD is zero and MOLD is an array shall not depend
	on SIZE.

gcc/testsuite/ChangeLog:

	PR fortran/104311
	* gfortran.dg/transfer_simplify_15.f90: New test.
2022-02-03 19:22:40 +01:00
Martin Liska
c7d0d03a6b Speed up fixincludes.
In my case:
$ rm ./stmp-fixinc ; time make -j16

takes 17 seconds; the suggested change easily reduces that to
11.2 seconds.

The script searches ~2500 folders in my case, with a total of 20K header
files.

fixincludes/ChangeLog:

	* fixinc.in: Use mkdir -p rather than a loop.
2022-02-03 18:45:06 +01:00
Bill Schmidt
48bd780ee3 rs6000: Remove -m[no-]fold-gimple flag [PR103686]
The -m[no-]fold-gimple flag was really intended primarily for internal
testing while implementing GIMPLE folding for rs6000 vector built-in
functions.  It ended up leaking into other places, causing problems such
as PR103686 identifies.  Let's remove it.

There are a number of tests in the testsuite that require adjustment.
Some specify -mfold-gimple directly, which is the default, so that is
handled by removing the option.  Others unnecessarily specify
-mno-fold-gimple, as the tests work fine without this.  Again that is
handled by removing the option.  There are a couple of extra variants of
tests specifically for -mno-fold-gimple; for those, we can just remove the
whole test.

gcc.target/powerpc/builtins-1.c was more problematic.  It was written in
such a way as to be extremely fragile.  For this one, I rewrote the whole
test in a different style, using individual functions to test each
built-in function.  These same tests are also largely covered by
builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to
explicitly make this test -mbig for simplicity, and use -O2 for clean code
generation.  I made some slight modifications to the expected instruction
counts as a result, and tested on both 32- and 64-bit.

2022-02-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	PR target/103686
	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Remove
	test for !rs6000_fold_gimple.
	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise.
	* config/rs6000/rs6000.opt (mfold-gimple): Remove.

gcc/testsuite/
	PR target/103686
	* gcc.target/powerpc/builtins-1-be-folded.c: Remove -mfold-gimple
	option.
	* gcc.target/powerpc/builtins-1-le-folded.c: Likewise.
	* gcc.target/powerpc/builtins-1.c: Rewrite to use small functions and
	restrict to -O2 -mbig for predictability.  Adjust instruction counts.
	* gcc.target/powerpc/builtins-5.c: Remove -mno-fold-gimple option.
	* gcc.target/powerpc/p8-vec-xl-xst.c: Likewise.
	* gcc.target/powerpc/pr83926.c: Likewise.
	* gcc.target/powerpc/pr86731-nogimplefold-longlong.c: Delete.
	* gcc.target/powerpc/pr86731-nogimplefold.c: Delete.
	* gcc.target/powerpc/swaps-p8-17.c: Remove -mno-fold-gimple option.
2022-02-03 11:17:36 -06:00
Bill Schmidt
3f30f2d1db rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]
These built-ins were misimplemented as always having big-endian semantics.
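
A scalar sketch of the endianness issue, assuming vec_cntlz_lsbb is meant to count from the lowest-numbered element and that the hardware counts work on register byte order. The function names below are hypothetical models, not the rs6000 expander code:

```c
#include <stdint.h>

/* Model of the hardware count: scan the 16 register bytes from the
   leftmost (most significant) end and count bytes whose least
   significant bit is 0.  */
int
hw_vclzlsbb (const uint8_t reg[16])
{
  int n = 0;
  while (n < 16 && (reg[n] & 1) == 0)
    n++;
  return n;
}

/* The same count, scanning from the rightmost (least significant) end.  */
int
hw_vctzlsbb (const uint8_t reg[16])
{
  int n = 0;
  while (n < 16 && (reg[15 - n] & 1) == 0)
    n++;
  return n;
}

/* Assumed built-in semantics: count starting at element 0.  Element 0
   is register byte 0 on big endian but register byte 15 on little
   endian, so the little-endian case needs the opposite hardware count.  */
int
model_vec_cntlz_lsbb (const uint8_t reg[16], int little_endian)
{
  return little_endian ? hw_vctzlsbb (reg) : hw_vclzlsbb (reg);
}
```

Always using the first model for vec_cntlz_lsbb would give big-endian semantics on every target, which is the kind of mistake described above.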

2022-01-18  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	PR target/95082
	* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle
	endianness for vclzlsbb and vctzlsbb.
	* config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change
	default pattern and indicate a different pattern will be used for
	big endian.
	(VCLZLSBB_V4SI): Likewise.
	(VCLZLSBB_V8HI): Likewise.
	(VCTZLSBB_V16QI): Likewise.
	(VCTZLSBB_V4SI): Likewise.
	(VCTZLSBB_V8HI): Likewise.

gcc/testsuite/
	PR target/95082
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
2022-02-03 11:17:18 -06:00
Bill Schmidt
eecee223f4 rs6000: Consolidate target built-ins code
Continuing with the refactoring effort, this patch moves as much of the
target-specific built-in support code as possible into a new file,
rs6000-builtin.cc.
However, we can't easily move the overloading support code out of
rs6000-c.cc, because the build machinery understands that as a special file
to be included with the C and C++ front ends.

This patch is just a straightforward move, with one exception.  I found
that the builtin_mode_to_type[] array is no longer used, so I also removed
all code having to do with it.

The code in rs6000-builtin.cc is organized in related sections:
 - General support functions
 - Initialization support
 - GIMPLE folding support
 - Expansion support

Overloading support remains in rs6000-c.cc.

2022-02-03  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config.gcc (powerpc*-*-*): Add rs6000-builtin.o to extra_objs.
	* config/rs6000/rs6000-builtin.cc: New file, containing code moved
	from other files.
	* config/rs6000/rs6000-call.cc (cpu_is_info): Move to
	rs6000-builtin.cc.
	(cpu_supports_info): Likewise.
	(rs6000_type_string): Likewise.
	(altivec_expand_predicate_builtin): Likewise.
	(rs6000_htm_spr_icode): Likewise.
	(altivec_expand_vec_init_builtin): Likewise.
	(get_element_number): Likewise.
	(altivec_expand_vec_set_builtin): Likewise.
	(altivec_expand_vec_ext_builtin): Likewise.
	(rs6000_invalid_builtin): Likewise.
	(rs6000_fold_builtin): Likewise.
	(fold_build_vec_cmp): Likewise.
	(fold_compare_helper): Likewise.
	(map_to_integral_tree_type): Likewise.
	(fold_mergehl_helper): Likewise.
	(fold_mergeeo_helper): Likewise.
	(rs6000_builtin_valid_without_lhs): Likewise.
	(rs6000_builtin_is_supported): Likewise.
	(rs6000_gimple_fold_mma_builtin): Likewise.
	(rs6000_gimple_fold_builtin): Likewise.
	(rs6000_expand_ldst_mask): Likewise.
	(cpu_expand_builtin): Likewise.
	(elemrev_icode): Likewise.
	(ldv_expand_builtin): Likewise.
	(lxvrse_expand_builtin): Likewise.
	(lxvrze_expand_builtin): Likewise.
	(stv_expand_builtin): Likewise.
	(mma_expand_builtin): Likewise.
	(htm_spr_num): Likewise.
	(htm_expand_builtin): Likewise.
	(rs6000_expand_builtin): Likewise.
	(rs6000_vector_type): Likewise.
	(rs6000_init_builtins): Likewise.  Remove initialization of
	builtin_mode_to_type entries.
	(rs6000_builtin_decl): Move to rs6000-builtin.cc.
	* config/rs6000/rs6000.cc (rs6000_builtin_mask_for_load): New
	external declaration.
	(rs6000_builtin_md_vectorized_function): Likewise.
	(rs6000_builtin_reciprocal): Likewise.
	(altivec_builtin_mask_for_load): Move to rs6000-builtin.cc.
	(rs6000_builtin_types): Likewise.
	(builtin_mode_to_type): Remove.
	(rs6000_builtin_mask_for_load): Move to rs6000-builtin.cc.  Remove
	static qualifier.
	(rs6000_builtin_md_vectorized_function): Likewise.
	(rs6000_builtin_reciprocal): Likewise.
	* config/rs6000/rs6000.h (builtin_mode_to_type): Remove.
	* config/rs6000/t-rs6000 (rs6000-builtin.o): New target.
2022-02-03 11:16:43 -06:00
David Seifert
45ba6bf28b make -Werror optional in libatomic/libbacktrace/libgomp/libitm/libsanitizer
* `-Werror` can cause issues when a more recent version of GCC compiles
  an older version:
  - https://bugs.gentoo.org/229059
  - https://bugs.gentoo.org/475350
  - https://bugs.gentoo.org/667104

libatomic/ChangeLog:

	* configure.ac: Support --disable-werror.
	* configure: Regenerate.

libbacktrace/ChangeLog:

	* configure.ac: Support --disable-werror.
	* configure: Regenerate.

libgomp/ChangeLog:

	* configure.ac: Support --disable-werror.
	* configure: Regenerate.

libitm/ChangeLog:

	* configure.ac: Support --disable-werror.
	* configure: Regenerate.

libsanitizer/ChangeLog:

	* configure.ac: Support --disable-werror.
	* aclocal.m4: Include also ../config/warnings.m4.
	* libbacktrace/Makefile.am (WARN_FLAGS): Remove.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* asan/Makefile.in: Regenerate.
	* hwasan/Makefile.in: Regenerate.
	* interception/Makefile.in: Regenerate.
	* libbacktrace/Makefile.in: Regenerate.
	* lsan/Makefile.in: Regenerate.
	* sanitizer_common/Makefile.in: Regenerate.
	* tsan/Makefile.in: Regenerate.
	* ubsan/Makefile.in: Regenerate.

Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
2022-02-03 16:10:18 +01:00
Richard Biener
1d5c7584fd debug/104337 - avoid messing with the abstract origin chain in NRV
The following stops NRV from massaging DECL_ABSTRACT_ORIGIN after
variable creation since NRV runs _after_ the function was inlined and thus
affects the inlined variable's copy indirectly.  We may adjust the abstract
origin of a variable only at the point we create it, not further along the
path since otherwise the (new) invariant that the abstract origin is always
the ultimate origin cannot be maintained.

The intent of what NRV does is OK I guess and it may improve the debug
experience.  But I also notice we do

  SET_DECL_VALUE_EXPR (found, result);
  DECL_HAS_VALUE_EXPR_P (found) = 1;

The code has been there since the merge from tree-ssa, which added tree-nrv.c.

Jakub added the DECL_VALUE_EXPR in g:938650d8fddb878f623e315f0b7fd94b217efa96
and Jason added the abstract origin setting conditional in g:7716876bbd3a

The following takes the radical approach and removes the attempt
to "optimize" the debug info.

The gdb testsuites show no regressions.

2022-02-03  Richard Biener  <rguenther@suse.de>

	PR debug/104337
	* tree-nrv.cc (pass_nrv::execute): Remove tying result and found
	together via DECL_ABSTRACT_ORIGIN.

	* gcc.dg/debug/pr104337.c: New testcase.
2022-02-03 16:02:12 +01:00
Bill Schmidt
a1b4d225d8 rs6000: Unify error messages for built-in constant restrictions
We currently give different error messages for built-in functions that
violate range restrictions on their arguments, depending on whether we
record them as requiring an n-bit literal or a literal between two values.
It's better to be consistent.  Change the error message for the n-bit
literal to look like the other one.
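
One way to picture the unification, assuming the n-bit restriction means an unsigned literal: an n-bit literal is just a literal in the range 0 to 2^n - 1, so both cases can be reported as a range.  The helper name bits_to_range is hypothetical and is not taken from rs6000-call.cc:

```c
/* Hypothetical helper, for illustration only: express "an n-bit
   unsigned literal" as the inclusive range [0, 2^n - 1] so that both
   kinds of restriction can share one message format.
   Assumes 0 < bits < 32.  */
void
bits_to_range (unsigned bits, unsigned *low, unsigned *high)
{
  *low = 0;
  *high = (1u << bits) - 1;
}
```

For example, a 5-bit literal becomes "a literal between 0 and 31".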

2022-02-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.cc (rs6000_expand_builtin): Revise error
	message for RES_BITS case.

gcc/testsuite/
	* gcc.target/powerpc/bfp/scalar-test-data-class-10.c: Adjust error
	messages.
	* gcc.target/powerpc/bfp/scalar-test-data-class-2.c: Likewise.
	* gcc.target/powerpc/bfp/scalar-test-data-class-3.c: Likewise.
	* gcc.target/powerpc/bfp/scalar-test-data-class-4.c: Likewise.
	* gcc.target/powerpc/bfp/scalar-test-data-class-5.c: Likewise.
	* gcc.target/powerpc/bfp/scalar-test-data-class-9.c: Likewise.
	* gcc.target/powerpc/bfp/vec-test-data-class-4.c: Likewise.
	* gcc.target/powerpc/bfp/vec-test-data-class-5.c: Likewise.
	* gcc.target/powerpc/bfp/vec-test-data-class-6.c: Likewise.
	* gcc.target/powerpc/bfp/vec-test-data-class-7.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-12.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-14.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-17.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-19.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-2.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-22.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-24.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-27.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-29.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-32.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-34.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-37.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-39.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-4.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-42.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-44.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-47.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-49.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-52.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-54.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-57.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-59.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-62.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-64.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-67.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-69.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-7.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-72.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-74.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-77.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-79.c: Likewise.
	* gcc.target/powerpc/dfp/dtstsfi-9.c: Likewise.
	* gcc.target/powerpc/pr80315-1.c: Likewise.
	* gcc.target/powerpc/pr80315-2.c: Likewise.
	* gcc.target/powerpc/pr80315-3.c: Likewise.
	* gcc.target/powerpc/pr80315-4.c: Likewise.
	* gcc.target/powerpc/pr82015.c: Likewise.
	* gcc.target/powerpc/pr91903.c: Likewise.
	* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
	* gcc.target/powerpc/vec-ternarylogic-10.c: Likewise.
2022-02-03 09:01:55 -06:00
Aldy Hernandez
f544e5efaf ranger: fix small thinko in fur_list constructor
The fur_list constructor for two ranges is leaving m_local[1] in an
undefined state.  The reason we haven't noticed is because after all the
shuffling in the last cycle there are no remaining users of it
(similarly for fur_list(unsigned, irange *)).

Since it's very late in the cycle, I would prefer to fix this, rather
than removing unused constructors altogether.  Besides, we have uses
of them queued up for the next release.

gcc/ChangeLog:

	* gimple-range-fold.cc (fur_list::fur_list): Set m_local[1] correctly.
2022-02-03 15:48:46 +01:00
Jakub Jelinek
8439e866a3 arm: Fix up help.exp regression
On Thu, Jan 20, 2022 at 11:27:20AM +0000, Richard Earnshaw via Gcc-patches wrote:
> gcc/ChangeLog:
>
>       * config/arm/arm.opt (mfix-cortex-a57-aes-1742098): New command-line
>       option.
>       (mfix-cortex-a72-aes-1655431): New option alias.

> --- a/gcc/config/arm/arm.opt
> +++ b/gcc/config/arm/arm.opt
> @@ -272,6 +272,16 @@ mfix-cmse-cve-2021-35465
>  Target Var(fix_vlldm) Init(2)
>  Mitigate issues with VLLDM on some M-profile devices (CVE-2021-35465).
>
> +mfix-cortex-a57-aes-1742098
> +Target Var(fix_aes_erratum_1742098) Init(2) Save
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1742098
> +
> +mfix-cortex-a72-aes-1655431
> +Target Alias(mfix-cortex-a57-aes-1742098)
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1655431
> +
>  munaligned-access
>  Target Var(unaligned_access) Init(2) Save
>  Enable unaligned word and halfword accesses to packed data.

This breaks:
Running /usr/src/gcc/gcc/testsuite/gcc.misc-tests/help.exp ...
FAIL: compiler driver --help=target option(s): "^ +-.*[^:.]$" absent from output: "  -mfix-cortex-a57-aes-1742098 Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72. Arm erratum #1742098"

help.exp with help of lib/options.exp tests whether all non-empty descriptions of
options are terminated with . or :.
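
The rule being tested can be sketched in C as follows.  This is only a model of the check described above, not the Tcl code in lib/options.exp, and the function name is made up:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if DESC is empty or ends with '.' or ':'.  Empty
   descriptions pass because the rule only covers non-empty ones.  */
bool
description_is_terminated (const char *desc)
{
  size_t len = strlen (desc);
  if (len == 0)
    return true;

  char last = desc[len - 1];
  return last == '.' || last == ':';
}
```

The description quoted in the FAIL line ends in "#1742098", so it fails this check until a full stop is appended.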

2022-02-03  Jakub Jelinek  <jakub@redhat.com>

	* config/arm/arm.opt (mfix-cortex-a57-aes-1742098,
	mfix-cortex-a72-aes-1655431): Ensure description ends with full stop.
2022-02-03 14:34:21 +01:00
Aldy Hernandez
83ad3a96eb Assert that backedges are available in path solver.
gcc/ChangeLog:

	* cfganal.cc (verify_marked_backedges): New.
	* cfganal.h (verify_marked_backedges): New.
	* gimple-range-path.cc (path_range_query::path_range_query):
	Verify freshness of back edges.
	* tree-ssa-loop-ch.cc (ch_base::copy_headers): Call
	mark_dfs_back_edges.
	* tree-ssa-threadbackward.cc (back_threader::back_threader): Move
	path_range_query construction after backedges have been
	updated.
2022-02-03 14:06:45 +01:00
Eric Botcazou
635504510a Skip gnat.dg/div_zero.adb on PowerPC
The hardware instruction does not trap on divide by zero there.

gcc/testsuite
	PR tree-optimization/104356
	* gnat.dg/div_zero.adb: Add dg-skip-if directive for PowerPC.
2022-02-03 13:20:19 +01:00
Richard Sandiford
67cd9cf5bf aarch64: Remove struct_vect_25.c XFAILs
At some point we started generating the intended code for
aarch64/sve/struct_vect_25.c.  This patch removes the xfails
and the scan-assembler-times that replaced the xfailed forms.

gcc/testsuite/
	* gcc.target/aarch64/sve/struct_vect_25.c: Remove XFAILs.
2022-02-03 10:44:01 +00:00
Richard Sandiford
2b4044d8c2 aarch64: Adjust tests after fix for PR102659
After the fix for PR102659, the vectoriser can no longer group
conditional accesses of the form:

  for (int i = 0; i < n; ++i)
    if (...)
      ...a[i * 2] + a[i * 2 + 1]...;

on LP64 targets.  It has to treat them as two independent
gathers instead.

This was causing failures in the sve mask_struct*.c tests.
The tests weren't really testing that int iterators could
be used, so this patch switches to pointer-sized iterators
instead.
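
A minimal sketch of the kind of loop the adjusted tests might use, with the iterator changed from int to intptr_t.  The function name, the condition array and the element type are placeholders rather than the contents of the mask_struct*.c tests:

```c
#include <stdint.h>

/* Conditional access to pairs of adjacent elements, written with a
   pointer-sized iterator so that i * 2 is computed at pointer width
   on LP64 targets rather than in 32-bit int arithmetic.  */
void
mask_pair_sum (int *restrict dest, const int *restrict a,
               const int *restrict cond, intptr_t n)
{
  for (intptr_t i = 0; i < n; ++i)
    if (cond[i])
      dest[i] = a[i * 2] + a[i * 2 + 1];
}
```

The loop body is the same shape as the one shown above; only the type of the iterator and the bound differs.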

gcc/testsuite/
	* gcc.target/aarch64/sve/mask_struct_load_1.c: Use intptr_t
	iterators instead of int iterators.
	* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_6.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_7.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_8.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
2022-02-03 10:44:01 +00:00
Richard Sandiford
7e4f89a23e aarch64: Add missing movmisalign patterns
The Advanced SIMD movmisalign patterns didn't handle 16-bit
FP modes, which meant that the vector loop for:

  void
  test (_Float16 *data)
  {
    _Pragma ("omp simd")
    for (int i = 0; i < 8; ++i)
      data[i] = 1.0;
  }

would be versioned for alignment.

This was causing some new failures in aarch64/sve/single_5.c:

FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-not \\tb
FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-not \\tcmp
FAIL: gcc.target/aarch64/sve/single_5.c scan-assembler-times \\tstr\\tq[0-9]+, 10

but I didn't look into what changed from earlier releases.
Adding the missing modes removes some existing xfails.

gcc/
	* config/aarch64/aarch64-simd.md (movmisalign<mode>): Extend from
	VALL to VALL_F16.

gcc/testsuite/
	* gcc.target/aarch64/sve/single_5.c: Remove some XFAILs.
2022-02-03 10:44:00 +00:00
Richard Sandiford
6a77052660 aarch64: Remove VALL_F16MOV iterator
The VALL_F16MOV iterator now has the same modes as VALL_F16,
in the same order.  This patch removes the former in favour
of the latter.

This doesn't fix a bug as such, but it's ultra-safe (no change in
object code) and it saves a follow-up patch from having to make
a false choice between the iterators.

gcc/
	* config/aarch64/iterators.md (VALL_F16MOV): Delete.
	* config/aarch64/aarch64-simd.md (mov<mode>): Use VALL_F16 instead
	of VALL_F16MOV.
2022-02-03 10:44:00 +00:00
Richard Sandiford
d41ba5a053 testsuite: Remove TSVC XFAILs for SVE
Many of the XFAILed TSVC tests pass for SVE.  This patch updates
the markup accordingly.

gcc/testsuite/
	* gcc.dg/vect/tsvc/vect-tsvc-s1115.c: Don't XFAIL for SVE.
	* gcc.dg/vect/tsvc/vect-tsvc-s114.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s1161.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s1232.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s124.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s1279.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s161.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s253.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s257.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s271.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s2711.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s2712.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s272.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s273.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s274.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s276.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s278.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s279.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s3111.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s4113.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s441.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s443.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-s491.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-vas.c: Likewise.
	* gcc.dg/vect/tsvc/vect-tsvc-vif.c: Likewise.
2022-02-03 10:43:59 +00:00
Richard Sandiford
9fb5e771ec testsuite: Update guality xfails for aarch64*-*-*
Following on from GCC 11 patch g:f31ddad8ac8, this one gives clean
guality.exp test results for aarch64-linux-gnu with modern gdb
(this time gdb 11.2).

The justification is the same as previously:

------
For people using older gdbs, it will trade one set of noisy results for
another set.  I still think it's better to have the xfails based on
one “clean” and “modern” run rather than have FAILs and XPASSes for
all runs.

It's hard to tell which of these results are aarch64-specific and
which aren't.  If other target maintainers want to do something similar,
and are prepared to assume the same gdb version, then it should become
clearer over time which ones are target-specific and which aren't.

There are no new skips here, so changes in test results will still
show up as XPASSes.

I've not analysed the failures or filed PRs for them.  In some
ways the guality directory itself seems like the best place to
start looking for xfails, if someone's interested in working
in this area.
------

gcc/testsuite/
	* gcc.dg/guality/ipa-sra-1.c: Update aarch64*-*-* xfails.
	* gcc.dg/guality/pr54519-1.c: Likewise.
	* gcc.dg/guality/pr54519-3.c: Likewise.
2022-02-03 10:43:59 +00:00
Martin Liska
9db03cd0ca Fix wording for: attribute ‘-xyz’ argument ‘target’ is unknown
gcc/ChangeLog:

	* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
	Change subject and object in the error message.
	* config/s390/s390.cc (s390_valid_target_attribute_inner_p):
	Likewise.
2022-02-03 10:19:33 +01:00
Martin Liska
0415470c8d s390x: Fix one more -Wformat-diag.
gcc/ChangeLog:

	* config/s390/s390.cc (s390_valid_target_attribute_inner_p):
	Use the error message for i386 target.
2022-02-03 09:56:33 +01:00
Jakub Jelinek
de67f943b8 ranger: Fix up wi_fold_in_parts for small precision types [PR104334]
The wide-int.h templates expect that when an int/long etc. operand is used
it will be sign-extended based on the type's precision.
wi_fold_in_parts passes 3 such non-zero constants (1, 3 and 4) to
wi::lt_p, wi::gt_p and wi::eq_p, which means it was doing weird things
either if some of 1, 3 or 4 weren't representable in the type, or if the
type was an unsigned 3-bit type, where 4 should be written as -4.
The following patch promotes the subtraction operands to widest_int and
uses that as the type for ?h_range variables and compares them as such.
We don't need the overflow handling because there is never an overflow.
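
To make the sign-extension point concrete, here is a small standalone sketch (plain C, not wide-int.h) of what a constant looks like once it is sign-extended from a given precision.  The helper name sext is made up for the example:

```c
#include <stdint.h>

/* Sign-extend the low PRECISION bits of VALUE, which is how a plain
   int constant has to be written for a type of that precision.
   Assumes 1 <= precision <= 63.  */
int64_t
sext (int64_t value, unsigned precision)
{
  uint64_t mask = (UINT64_C (1) << precision) - 1;
  uint64_t sign = UINT64_C (1) << (precision - 1);
  uint64_t low = (uint64_t) value & mask;

  /* Flip the sign bit, then subtract it again: values with the sign
     bit set come out negative, others are unchanged.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

With 3 bits of precision, sext (4, 3) is -4 while sext (3, 3) stays 3; with 2 bits, sext (3, 2) is -1.  Doing the arithmetic in a wider type avoids the problem.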

2022-02-02  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/104334
	* range-op.cc (range_operator::wi_fold_in_parts): Change lh_range
	and rh_range type to widest_int and subtract in widest_int.  Remove
	ov_rh, ov_lh and sign vars, always perform comparisons as signed
	and use >, < and == operators for it.

	* g++.dg/opt/pr104334.C: New test.
2022-02-03 09:45:16 +01:00
Jakub Jelinek
54d21dd5b5 openmp, fortran: Improve !$omp atomic checks [PR104328]
The testcase shows some cases that weren't verified and we ICE on
invalid because of that.
One problem is that unlike before, we weren't checking if some expression
is EXPR_VARIABLE with non-NULL symtree in the case where there was
a conversion around it.
The other two issues are that we check that in an IF ->block is non-NULL
and then immediately dereference ->block->next->op, but on invalid
code with no statements in the then clause ->block->next might be NULL.
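
The pattern of the fix, sketched with a made-up node type (struct code and first_then_op are stand-ins for illustration, not the gfortran data structures):

```c
#include <stddef.h>

/* Hypothetical stand-in for a front-end statement node.  */
struct code
{
  int op;
  struct code *block;
  struct code *next;
};

/* Return the op of the first statement in the then clause, or -1 if
   there is none.  Every link in the chain is checked before it is
   dereferenced, including ->block->next.  */
int
first_then_op (const struct code *if_stmt)
{
  if (if_stmt == NULL
      || if_stmt->block == NULL
      || if_stmt->block->next == NULL)
    return -1;

  return if_stmt->block->next->op;
}
```

Checking only ->block and then reading ->block->next->op is what crashes when the then clause is empty.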

2022-02-02  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/104328
	* openmp.cc (is_scalar_intrinsic_expr): If must_be_var && conv_ok
	and expr is conversion, verify it is a conversion from EXPR_VARIABLE
	with non-NULL symtree.  Check ->block->next before dereferencing it.

	* gfortran.dg/gomp/atomic-27.f90: New test.
2022-02-03 09:01:07 +01:00
Jason Merrill
501c4ee9fa c++: dependent array bounds completion [PR104302]
The patch for PR55227 changed the minimal init-list handling in
cp_complete_array_type to a call to reshape_init, which broke on the
dependent initializer.  It occurred to me that trying to deduce the array
size from a dependent init-list is wrong in general, so let's just not.  I
also limited the reshape_init call to the case of a char array, as before
the patch for 55227; that's the only case where we want to strip a level of
braces from an array.

	PR c++/104302

gcc/cp/ChangeLog:

	* decl.cc (maybe_deduce_size_from_array_init): Give up
	on type-dependent init.
	(cp_complete_array_type): Only call reshape_init for character
	array.

gcc/testsuite/ChangeLog:

	* g++.dg/template/array35.C: New test.
	* g++.dg/template/array36.C: New test.
2022-02-02 21:14:10 -05:00
Martin Sebor
dc898b2ba5 Correct typos in -Wuse-after-free description.
gcc/ChangeLog:
	* common.opt (-Wuse-after-free): Correct typos.
2022-02-02 17:47:52 -07:00
GCC Administrator
88944e1314 Daily bump.
2022-02-03 00:16:22 +00:00
David Malcolm
fb45d8e692 docs: mention analyzer interaction with -ftrivial-auto-var-init [PR104270]
gcc/ChangeLog:
	PR analyzer/104270
	* doc/invoke.texi (-ftrivial-auto-var-init=): Add reference to
	-Wanalyzer-use-of-uninitialized-value to paragraph documenting that
	-ftrivial-auto-var-init= doesn't suppress warnings.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02 16:50:39 -05:00
Segher Boessenkool
14d642df2b rs6000/testsuite: Return 0 for powerpc_altivec_ok on other targets
2022-02-02  Segher Boessenkool  <segher@kernel.crashing.org>

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_powerpc_altivec_ok):
	Return 0 if the target is not Power.  Restructure and add some comments.
2022-02-02 20:21:30 +00:00
Jonathan Wakely
2905e1af94 libstdc++: Fix -Wunused-variable warning for -fno-exceptions build
If _GLIBCXX_THROW_OR_ABORT expands to just __builtin_abort() then the
bool variable used in the filesystem_error constructor is unused. Mark
it as maybe_unused so there's no warning for -fno-exceptions builds.

libstdc++-v3/ChangeLog:

	* src/c++17/fs_dir.cc (fs::recursive_directory_iterator::pop):
	Add [[maybe_unused]] attribute.
	* src/filesystem/dir.cc (fs::recursive_directory_iterator::pop):
	Likewise.
2022-02-02 17:55:16 +00:00
Jonathan Wakely
c123096cf1 libstdc++: Fix invalid instantiations in tests
These tests instantiate std::multiset and std::set with a type that has
no operator< so they should use a custom comparison function.

libstdc++-v3/ChangeLog:

	* testsuite/23_containers/multiset/operators/cmp_c++20.cc: Use
	custom comparison function for multiset.
	* testsuite/23_containers/set/operators/cmp_c++20.cc: Use custom
	comparison function for set.
2022-02-02 17:08:54 +00:00
Jonathan Wakely
b229c51860 libstdc++: Fix link failure in _OutputIteratorConcept
The C++98-style concept check for output iterators causes a link
failure on mingw-w64, because the __val() member function isn't defined.
Change it to use a function pointer instead. That pointer is never set
to anything meaningful, but it doesn't matter as the __constraints()
function only has to be instantiated, it's never called.

We could refactor all of these to use unevaluated contexts (e.g. sizeof
or __decltype) so that we only check the expressions are well-formed,
without any codegen at all. Any improvements to these are very low
priority though.

libstdc++-v3/ChangeLog:

	* include/bits/boost_concept_check.h (_OutputIteratorConcept):
	Change member function to data member of function pointer type.
2022-02-02 16:30:51 +00:00
Martin Liska
9a92e46c0e lto: fix error handling for -Wl,-plugin-opt=debug
When one uses something like: -Wl,-plugin-opt=debug,
we end up with lto1 WPA invocation that has 'debug'
on command line. We interpret that as input filename.

The patch moves resolution checking later so that we end up with
a reasonable error message:

lto1: fatal error: open debug failed: No such file or directory
compilation terminated.

	PR lto/104333

gcc/lto/ChangeLog:

	* lto-common.cc (read_cgraph_and_symbols): Move resolution
	checking for number of files later and report a reasonable
	error message.
	* lto-object.cc (lto_obj_file_open): Make error fatal.
2022-02-02 16:05:39 +01:00
Martin Liska
302caa1fae Remove dead macro: TEXT_SECTION_NAME
gcc/ChangeLog:

	* dwarf2out.cc (TEXT_SECTION_NAME): Remove unused macro.
2022-02-02 16:05:24 +01:00
David Malcolm
13ad6d9f50 analyzer: fix missing check for uninit of return values
When moving the -fanalyzer tests for -ftrivial-auto-var-init to the
"torture" subdirectory of gcc.dg/analyzer I noticed that -fanalyzer
wasn't always properly checking for initialization of return values.

The issue was that some "return" handling was using
region_model::copy_region to copy to the RESULT_DECL, and copy_region
wasn't checking for poisoned svalues.

This patch eliminates region_model::copy_region in favor of simply
doing a get_rvalue/set_value pair, fixing the issue.

gcc/analyzer/ChangeLog:
	* region-model.cc (region_model::on_return): Replace usage of
	copy_region with get_rvalue/set_value pair.
	(region_model::pop_frame): Likewise.
	(selftest::test_compound_assignment): Likewise.
	* region-model.h (region_model::copy_region): Delete decl.
	* region.cc (region_model::copy_region): Delete.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/torture/ubsan-1.c: Add missing return stmts.
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: Move
	to...
	* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-pattern.c:
	...here.
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c:
	Move to...
	* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-uninitialized.c:
	...here.
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: Move to...
	* gcc.dg/analyzer/torture/uninit-trivial-auto-var-init-zero.c: ...here.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02 09:55:29 -05:00
David Malcolm
ea3e191595 analyzer: consolidate duplicate code in region::calc_offset
gcc/analyzer/ChangeLog:
	* region.cc (region::calc_offset): Consolidate effectively
	identical cases.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02 09:54:32 -05:00
David Malcolm
93e759fc18 analyzer: implement bit_range_region
GCC 12 has gained -Wanalyzer-use-of-uninitialized-value, and I'm
seeing various false positives from it due to region_model::get_lvalue
not properly handling BIT_FIELD_REF, and falling back to
using an UNKNOWN_REGION for them.

This patch fixes these false positives by implementing a new
bit_range_region region subclass for handling BIT_FIELD_REF.

gcc/analyzer/ChangeLog:
	* analyzer.h (class bit_range_region): New forward decl.
	* region-model-manager.cc (region_model_manager::get_bit_range):
	New.
	(region_model_manager::log_stats): Handle m_bit_range_regions.
	* region-model.cc (region_model::get_lvalue_1): Handle
	BIT_FIELD_REF.
	* region-model.h (region_model_manager::get_bit_range): New decl.
	(region_model_manager::m_bit_range_regions): New field.
	* region.cc (region::get_base_region): Handle RK_BIT_RANGE.
	(region::base_region_p): Likewise.
	(region::calc_offset): Likewise.
	(bit_range_region::dump_to_pp): New.
	(bit_range_region::get_byte_size): New.
	(bit_range_region::get_bit_size): New.
	(bit_range_region::get_byte_size_sval): New.
	(bit_range_region::get_relative_concrete_offset): New.
	* region.h (enum region_kind): Add RK_BIT_RANGE.
	(region::dyn_cast_bit_range_region): New vfunc.
	(class bit_range_region): New.
	(is_a_helper <const bit_range_region *>::test): New.
	(default_hash_traits<bit_range_region::key_t>): New.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/torture/uninit-bit-field-ref.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02 09:52:58 -05:00
David Malcolm
9b4eee5fd1 analyzer: stop -ftrivial-auto-var-init from suppressing uninit warnings [PR104270]
GCC 12 has gained two features for dealing with uninitialized variables:

(a) a new -Wanalyzer-use-of-uninitialized-value warning within -fanalyzer
for interprocedural path-sensitive detection of uninit uses, and

(b) a new -ftrivial-auto-var-init option for mitigating some uses of
uninit variables

It turns out that using (b) was thwarting (a), as it led to -fanalyzer
seeing calls to IFN_DEFERRED_INIT, which -fanalyzer wasn't
special-casing, thus treating it as initializing the variables in
question, and thus silencing -Wanalyzer-use-of-uninitialized-value on
them.

invoke.texi says:

"GCC still considers an automatic variable that doesn't have an explicit
initializer as uninitialized, @option{-Wuninitialized} will still report
warning messages on such automatic variables."

and thus -Wanalyzer-use-of-uninitialized-value ought to as well.

This patch adds special-case handling to -fanalyzer for
IFN_DEFERRED_INIT, so that -fanalyzer will warn on uninit uses of
variables that are mitigated by -ftrivial-auto-var-init.
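As an illustration of the kind of code this concerns, here is a minimal sketch. The function name `pick` is hypothetical and is not taken from the new tests. The variable has no explicit initializer, so one path returns it uninitialized; the expectation stated above is that building with both -fanalyzer and -ftrivial-auto-var-init=zero should still diagnose that path.

```c
/* Hypothetical example, not one of the tests added by this commit.
   Suggested build line: gcc -fanalyzer -ftrivial-auto-var-init=zero -c pick.c  */
int pick(int flag)
{
    int value;          /* automatic variable with no explicit initializer */

    if (flag)
        value = 42;     /* initialized only on this path */

    return value;       /* uninitialized when flag == 0 */
}
```

Only the `flag != 0` path is well defined at run time; the other path is what the analyzer is meant to report.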

gcc/analyzer/ChangeLog:
	PR analyzer/104270
	* region-model.cc (region_model::on_call_pre): Handle
	IFN_DEFERRED_INIT.

gcc/testsuite/ChangeLog:
	PR analyzer/104270
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-pattern.c: New
	test.
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-uninitialized.c:
	New test.
	* gcc.dg/analyzer/uninit-trivial-auto-var-init-zero.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02 09:51:07 -05:00
Bernd Kuhls
cac2f69cda gcc: define _REENTRANT for OpenRISC when -pthread is passed
The detection of pthread support fails on OpenRISC unless _REENTRANT
is defined. Added the CPP_SPEC definition to correct this.
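A small probe such as the following can show whether the macro is visible in a given translation unit. This is a sketch: the function name `reentrant_defined` is hypothetical, and the result depends on the target and on whether -pthread was passed.

```c
/* Hypothetical probe: reports whether _REENTRANT was defined by the
   preprocessor for this translation unit.  */
int reentrant_defined(void)
{
#ifdef _REENTRANT
    return 1;
#else
    return 0;
#endif
}
```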

gcc/ChangeLog:

	PR target/94372
	* config/or1k/linux.h (CPP_SPEC): Define.

Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
2022-02-02 20:02:59 +09:00
Tamar Christina
9f6f411f63 AArch32: use canonical ordering for complex mul, fma and fms
After the first patch in the series this updates the optabs to expect the
canonical sequence.

gcc/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* config/arm/vec-common.md (cml<fcmac1><conj_op><mode>4): Use
	canonical order.
2022-02-02 10:52:17 +00:00
Tamar Christina
ab95fe61fe AArch64: use canonical ordering for complex mul, fma and fms
After the first patch in the series this updates the optabs to expect the
canonical sequence.

gcc/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* config/aarch64/aarch64-simd.md (cml<fcmac1><conj_op><mode>4): Use
	canonical order.
	* config/aarch64/aarch64-sve.md (cml<fcmac1><conj_op><mode>4): Likewise.
2022-02-02 10:51:38 +00:00
Tamar Christina
55d83cdf23 vect: Simplify and extend the complex numbers validation routines.
This patch boosts the analysis for complex mul, fma and fms in order to ensure
that it doesn't create an incorrect output.

Essentially it adds an extra verification to check that the two nodes it's going
to combine do the same operations on compatible values.  The reason it needs to
do this is that if one computation differs from the other then with the current
implementation we have no way to deal with it since we have to remove the
permute.

When we can keep the permute around we can probably handle these by unrolling.

While implementing this since I have to do the traversal anyway I took advantage
of it by simplifying the code a bit.  Previously we would determine whether
something is a conjugate and then try to figure out which conjugate it is and
then try to see if the permutes match what we expect.

Now the code that does the traversal will detect this in one go and return to us
whether the operation is something that can be combined and whether a conjugate
is present.

Secondly because it does this I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.

The patterns represent the order operations should appear in. For instance a
complex MUL operation combines:

  Left 1 + Right 1
  Left 2 + Right 2

with a permute on the nodes consisting of:

  { Even, Even } + { Odd, Odd  }
  { Even, Odd  } + { Odd, Even }

By abstracting over these patterns the checking code becomes quite simple.
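For readers unfamiliar with the lane terminology, the following scalar sketch shows where the Even/Odd pairs come from. It assumes complex numbers stored interleaved as {re, im, re, im, ...}, so even lanes hold real parts and odd lanes hold imaginary parts. The function name is hypothetical and this is ordinary complex multiplication, not the vectorizer's matching code; the signs follow standard complex arithmetic.

```c
#include <stddef.h>

/* Multiply n complex numbers stored interleaved as {re, im, re, im, ...}.
   Hypothetical helper for illustration only.  */
void complex_mul_interleaved(const float *a, const float *b,
                             float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float ar = a[2 * i], ai = a[2 * i + 1];   /* even lane, odd lane */
        float br = b[2 * i], bi = b[2 * i + 1];

        /* Real part: pairs {Even, Even} with {Odd, Odd}.  */
        out[2 * i] = ar * br - ai * bi;
        /* Imaginary part: pairs {Even, Odd} with {Odd, Even}.  */
        out[2 * i + 1] = ar * bi + ai * br;
    }
}
```

Each output lane pairs one product of same-parity lanes or one product of mixed-parity lanes, which is the shape the fixed patterns describe.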

As part of this I was checking the order of the operands which was left in
"slp" order, i.e. the same order they showed up in during SLP, which means
that the accumulator is first.  However it looks like I didn't document this
and the x86 optab was implemented assuming the same order as FMA, i.e. that
the accumulator is last.

I have thus changed the order to match that of FMA and FMS which corrects the
x86 codegen and will update the Arm targets.  This has now also been
documented.

gcc/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* doc/md.texi: Update docs for cfms, cfma.
	* tree-data-ref.h (same_data_refs): Accept optional offset.
	* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
	patterns.
	(vect_normalize_conj_loc): Remove.
	(is_eq_or_top): Change to take two nodes.
	(enum _conj_status, compatible_complex_nodes_p,
	vect_validate_multiplication): New.
	(class complex_add_pattern, complex_add_pattern::matches,
	complex_add_pattern::recognize, class complex_mul_pattern,
	complex_mul_pattern::recognize, class complex_fms_pattern,
	complex_fms_pattern::recognize, class complex_operations_pattern,
	complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
	new cache.
	(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
	cache and use new validation code.
	* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
	vect_analyze_slp): Pass along cache.
	(compatible_calls_p): Expose.
	* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
	slp_compat_nodes_map_t): New.
	(class vect_pattern): Update signatures to include new cache.

gcc/testsuite/ChangeLog:

	PR tree-optimization/102819
	PR tree-optimization/103169
	* g++.dg/vect/pr99149.cc: xfail for now.
	* gcc.dg/vect/complex/pr102819-1.c: New test.
	* gcc.dg/vect/complex/pr102819-2.c: New test.
	* gcc.dg/vect/complex/pr102819-3.c: New test.
	* gcc.dg/vect/complex/pr102819-4.c: New test.
	* gcc.dg/vect/complex/pr102819-5.c: New test.
	* gcc.dg/vect/complex/pr102819-6.c: New test.
	* gcc.dg/vect/complex/pr102819-7.c: New test.
	* gcc.dg/vect/complex/pr102819-8.c: New test.
	* gcc.dg/vect/complex/pr102819-9.c: New test.
	* gcc.dg/vect/complex/pr103169.c: New test.
2022-02-02 10:39:03 +00:00
Martin Sebor
756eabacfc Declare std::array members with attribute const [PR101831].
Resolves:
PR libstdc++/101831 - Spurious maybe-uninitialized warning on std::array::size

libstdc++-v3/ChangeLog:

	PR libstdc++/101831
	* include/std/array (begin): Declare const member function attribute
	const.
	(end, rbegin, rend, size, max_size, empty, data): Same.
	* testsuite/23_containers/array/capacity/empty.cc: Add test cases.
	* testsuite/23_containers/array/capacity/max_size.cc: Same.
	* testsuite/23_containers/array/capacity/size.cc: Same.
	* testsuite/23_containers/array/iterators/begin_end.cc: New test.
2022-02-01 17:21:49 -07:00
Hans-Peter Nilsson
07a6c52c4c cris: Reload using special-regs before general-regs
On code where reload has an effect (i.e. quite rarely, just enough to be
noticeable), this change gets code quality back to the situation prior
to "Remove CRIS v32 ACR artefacts".  We had from IRA a pseudoregister
marked to be reloaded from a union of all allocatable registers (here:
SPEC_GENNONACR_REGS) but where the register-class corresponding to the
constraint for the register-type alternative (here: GENERAL_REGS) was
*not* a subset of that class: SPEC_GENNONACR_REGS (and GENNONACR_REGS)
had a one-register "hole" for the ACR register, a register present in
GENERAL_REGS.

Code in reload.cc:find_reloads adds 4 to the cost of a register-type
alternative that is neither a subset of the preferred register class nor
vice versa and thus reload thinks it can't use.  It would be preferable
to look for a non-empty intersection of the two, and use that
intersection for that alternative, something that can't be expressed
because a register class can't be formed from a random register set.
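The subset test described above can be modelled with bit sets. This is a sketch under stated assumptions, not the code in reload.cc: the type `regset`, both function names and the one-bit-per-register encoding are hypothetical; only the penalty of 4 comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t regset;   /* hypothetical: one bit per hard register */

/* True if every register in A is also in B.  */
bool regset_is_subset(regset a, regset b)
{
    return (a & ~b) == 0;
}

/* Extra cost for a register-type alternative whose class is neither a
   subset of the preferred class nor a superset of it.  */
int alternative_penalty(regset alt_class, regset preferred)
{
    if (regset_is_subset(alt_class, preferred)
        || regset_is_subset(preferred, alt_class))
        return 0;
    return 4;
}
```

With a one-register "hole" in the preferred class, a GENERAL_REGS-like set overlaps it without either containing the other, so the penalty applies; once the hole is gone the general set becomes a plain subset and the penalty disappears.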

The effect was here that the GENERAL_REGS to/from memory alternatives
("r") had their cost raised such that the SPECIAL_REGS alternatives
("x") looked better.  This happened to improve code quality just a
little bit compared to GENERAL_REGS being chosen.

Anyway, with the improved CRIS register-class topology, the
subset-checking code no longer has the GENERAL_REGS-demoting effect.
To get the same quality, we have to adjust the port such that
SPECIAL_REGS are specifically preferred when possible and advisable,
i.e. when there are at least two of those registers as for the CPU variant
with multiplication (which happens to be the variant maintained for
performance).

For the move-pattern, the obvious method may seem to simply "curse" the
constraints of some alternatives (by prepending one of the "?!^$"
characters) but that method can't be used, because we want the effect to
be conditional on the CPU variant.  It'd also be a shame to split the
"*movsi_internal<setcc><setnz><setnzvc>" into two CPU-variants (with
different cursing).  Iterators would help, but it still seems unwieldy.
Instead, add copies of the GENERAL_REGS variants (to the SPECIAL_REGS
alternatives) on the "other" side, and make use of the "enabled"
attribute to activate just the desired order of alternatives.

gcc:

	* config/cris/cris.cc (cris_preferred_reload_class): Reject
	"eliminated" registers and small-enough constants unless
	reloaded into a class that is a subset of GENERAL_REGS.
	* config/cris/cris.md (attribute "cpu_variant"): New.
	(attribute "enabled"): Conditionalize on a matching attribute
	cpu_variant, if specified.
	("*movsi_internal<setcc><setnz><setnzvc>"): For moves to and from
	memory, add cpu-variant-enabled variants for "r" alternatives on
	the far side of the "x" alternatives, preferring the "x" ones
	only for variants where MOF is present (in addition to SRP).
2022-02-02 01:20:06 +01:00
Hans-Peter Nilsson
9a7f14ef9b cris: Don't discriminate against ALL_REGS in TARGET_REGISTER_MOVE_COST
When the tightest class including both SPECIAL_REGS and GENERAL_REGS
is ALL_REGS, artificially special-casing for *either* to or from, hits
artificially hard.  This gets the port back to the code quality before
the previous patch ("cris: Remove CRIS v32 ACR artefacts") - except
for _vfprintf_r and _vfiprintf_r in newlib (still .8 and .4% larger).

gcc:
	* config/cris/cris.cc (cris_register_move_cost): Remove special pre-ira
	extra cost for ALL_REGS.
2022-02-02 01:20:05 +01:00
Hans-Peter Nilsson
27e35bc491 cris: Remove CRIS v32 ACR artefacts
This is the change to which I alluded in r11-220 /
d0780379c1 as "causes extra register moves in libgcc".  It has
unfortunate side-effects due to the change in register-class topology.
There's a slight improvement in coremark numbers (< 0.07%) though also an
increase in total code size (< 0.7%) but looking at the individual
changes in functions, it's all-over (-7..+7%).  Looking specifically
at functions that improved in speed, it's also both plus and minus in
code sizes.  It's unworkable to separate improvements from regressions
for this case.  I'll follow up with patches to restore the previous
code quality, in both size and speed.

gcc:
	* config/cris/constraints.md (define_register_constraint "b"): Now
	GENERAL_REGS.
	* config/cris/cris.md (CRIS_ACR_REGNUM): Remove.
	* config/cris/cris.h (reg_class, REG_CLASS_NAMES)
	(REG_CLASS_CONTENTS): Remove ACR_REGS, SPEC_ACR_REGS, GENNONACR_REGS,
	and SPEC_GENNONACR_REGS.
	* config/cris/cris.cc (cris_preferred_reload_class): Don't mention
	ACR_REGS and return GENERAL_REGS instead of GENNONACR_REGS.
2022-02-02 01:20:04 +01:00
Hans-Peter Nilsson
a58401d2e6 cris: For expanded movsi, don't match operands we know will be reloaded
In a session investigating unexpected fallout from a change, I
noticed reload needs one operand being a register to make an
informed decision.  It can happen that there's just a constant
and a memory operand, as in:

(insn 668 667 42 104 (parallel [
            (set (mem:SI (plus:SI (reg/v/f:SI 347 [ fs ])
                        (const_int 168 [0xa8])) \
 [1 fs_126(D)->regs.cfa_how+0 S4 A8])
                (const_int 2 [0x2]))
            (clobber (reg:CC 19 dccr))
        ]) "<...>/gcc/libgcc/unwind-dw2.c":1121:21 22 {*movsi_internal}
     (expr_list:REG_UNUSED (reg:CC 19 dccr)
        (nil)))

This was helpfully created by combine.  When this happens,
reload can't check for costs and preferred register classes,
(both operands will start with NO_REGS as the preferred class)
and will default to the constraints order in the insn in reload.
(Which also does its own temporary merge in find_reloads, but
that's a different story.)  Better not to match the simple cases.
Beware that subregs have to be matched.

I'm doing this just for word_mode (SI) for now, but may repeat
this for the other valid modes as well.  In particular, that
goes for DImode as I see the expanded movdi does *almost* this,
but uses register_operand instead of REG_S_P (from cris.h).
Using REG_S_P is the right choice here because register_operand
also matches (subreg (mem ...)  ...) *until* reload is done.
By itself it's just a sub-0.1% performance win (coremark).

Also removing a stale comment.

gcc:
	* config/cris/cris.md ("*movsi_internal<setcc><setnz><setnzvc>"):
	Conditionalize on (sub-)register operands or operand 1 being 0.
2022-02-02 01:20:03 +01:00
Hans-Peter Nilsson
4c4d0af4c9 cris: Don't default to -mmul-bug-workaround
This flips the default for the errata handling for an old version
(TL;DR: workaround: no multiply instruction last on a cache-line).
Newer versions of the CRIS cpu don't have that bug.  While the impact
of the workaround is very marginal (coremark: less than .05% larger,
less than .0005% slower) it's an irritating pseudorandom factor when
assessing the impact of other changes.

Also, fix a wart requiring changes to more than TARGET_DEFAULT to flip
the default.

People building old kernels or operating systems to run on
ETRAX 100 LX are advised to pass "-mmul-bug-workaround".

gcc:
	* config/cris/cris.h (TARGET_DEFAULT): Don't include MASK_MUL_BUG.
	(MUL_BUG_ASM_DEFAULT): New macro.
	(MAYBE_AS_NO_MUL_BUG_ABORT): Define in terms of MUL_BUG_ASM_DEFAULT.
	* doc/invoke.texi (CRIS Options, -mmul-bug-workaround): Adjust
	accordingly.
2022-02-02 01:20:02 +01:00
GCC Administrator
ae7e4af964 Daily bump. 2022-02-02 00:17:16 +00:00
Jonathan Wakely
d98668eb06 libstdc++: Do not use dirent::d_type unconditionally
These new tests should not use the d_type member unless it's actually
present on the OS.

libstdc++-v3/ChangeLog:

	* testsuite/27_io/filesystem/iterators/error_reporting.cc: Use
	autoconf macro to check whether d_type is present.
	* testsuite/experimental/filesystem/iterators/error_reporting.cc:
	Likewise.
2022-02-02 00:01:43 +00:00
Eugene Rozenfeld
c17975d81a AutoFDO: don't set param_early_inliner_max_iterations to 10.
param_early_inliner_max_iterations specifies the maximum number
of nested indirect inlining iterations performed by early inliner.
Normally, the default value is 1.

For AutoFDO this parameter was also used as the number of iterations for
its indirect call promotion loop and the default value was set to 10.
While it makes sense to have 10 in the indirect call promotion loop
(we want to make the IR match the profiled binary before actual annotation)
there is no reason to have a special default value for the
regular early inliner.

This change removes the special AutoFDO default value setting for
param_early_inliner_max_iterations while keeping 10 as the number of
iterations for the AutoFDO indirect call promotion loop.
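The shape of such a loop, with its bound decoupled from the inliner parameter, can be sketched as below. This is an illustration only and not the code in auto-profile.cc: `PROMOTION_ROUNDS`, `run_bounded` and `countdown_step` are hypothetical names, and only the value 10 comes from the text.

```c
/* Hypothetical fixed bound; the value 10 is the one named above.  */
enum { PROMOTION_ROUNDS = 10 };

/* Calls step(ctx) until it reports no further change or the bound is
   reached.  Returns the number of rounds that made a change.  */
int run_bounded(int (*step)(void *), void *ctx)
{
    int rounds = 0;

    for (int i = 0; i < PROMOTION_ROUNDS; i++) {
        if (!step(ctx))
            break;
        rounds++;
    }
    return rounds;
}

/* Example step: decrements a counter and reports whether it changed.  */
int countdown_step(void *ctx)
{
    int *remaining = ctx;

    if (*remaining <= 0)
        return 0;
    (*remaining)--;
    return 1;
}
```

Because the bound is a local constant, changing the early inliner's default no longer affects how many rounds this loop may run.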

This change improves a simple Fibonacci benchmark in AutoFDO mode
by 15% on x86_64-pc-linux-gnu.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
	* auto-profile.cc (auto_profile): Hard-code the number of iterations (10).

gcc/ChangeLog:
	* opts.cc (common_handle_option): Don't set param_early_inliner_max_iterations
	to 10 for AutoFDO.
2022-02-01 15:20:11 -08:00