Commit Graph

192096 Commits

Author SHA1 Message Date
Richard Biener 3a7ba8fd0c tree-optimization/104960 - unsplit edges after late sinking
Something went wrong when testing the earlier patch to move the
late sinking to before the late phiopt for PR102008.  The following
makes sure to unsplit edges after the late sinking since the split
edges confuse the following phiopt leading to missed optimizations.

I've gone for a new pass parameter for this to avoid changing the
CFG after the early sinking pass at this point.

2022-03-17  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104960
	* passes.def: Add pass parameter to pass_sink_code, mark
	last one to unsplit edges.
	* tree-ssa-sink.cc (pass_sink_code::set_pass_param): New.
	(pass_sink_code::execute): Always execute TODO_cleanup_cfg
	when we need to unsplit edges.

	* gcc.dg/gimplefe-37.c: Adjust to allow either the true
	or false edge to have a forwarder.
2022-03-17 09:51:09 +01:00
Jakub Jelinek 7276a18aba gimplify: Emit clobbers for TARGET_EXPR_SLOT vars later [PR103984]
As mentioned in the PR, we emit a bogus uninitialized warning but
easily could emit wrong-code for it or similar testcases too.
The bug is that we emit clobber for a TARGET_EXPR_SLOT too early:
          D.2499.e = B::qux (&h); [return slot optimization]
          D.2516 = 1;
          try
            {
              B::B (&D.2498, &h);
              try
                {
                  _2 = baz (&D.2498);
                  D.2499.f = _2;
                  D.2516 = 0;
                  try
                    {
                      try
                        {
                          bar (&D.2499);
                        }
                      finally
                        {
                          C::~C (&D.2499);
                        }
                    }
                  finally
                    {
                      D.2499 = {CLOBBER(eol)};
                    }
                }
              finally
                {
                  D.2498 = {CLOBBER(eol)};
                }
            }
          catch
            {
              if (D.2516 != 0) goto <D.2517>; else goto <D.2518>;
              <D.2517>:
              A::~A (&D.2499.e);
              goto <D.2519>;
              <D.2518>:
              <D.2519>:
            }
The CLOBBER for D.2499 is essentially only emitted on the non-exceptional
path, if B::B or baz throws, then there is no CLOBBER for it but there
is a conditional destructor A::~A (&D.2499.e).  Now, ehcleanup1
sink_clobbers optimization assumes that clobbers in the EH cases are
emitted after last use and so sinks the D.2499 = {CLOBBER(eol)}; later,
so we then have
  # _3 = PHI <1(3), 0(9)>
<L2>:
  D.2499 ={v} {CLOBBER(eol)};
  D.2498 ={v} {CLOBBER(eol)};
  if (_3 != 0)
    goto <bb 11>; [INV]
  else
    goto <bb 15>; [INV]

  <bb 11> :
  _35 = D.2499.a;
  if (&D.2499.b != _35)
where that _35 = D.2499.a comes from inline expansion of the A::~A dtor,
and that is a load from a clobbered memory.

Now, what the gimplifier sees in this case is a CLEANUP_POINT_EXPR with
somewhere inside of it a TARGET_EXPR for D.2499 (with the C::~C (&D.2499)
cleanup) which in its TARGET_EXPR_INITIAL has another TARGET_EXPR for
D.2516 bool flag which has CLEANUP_EH_ONLY which performs that conditional
A::~A (&D.2499.e) call.
The following patch ensures that CLOBBERs (and asan poisoning) are emitted
after even those gimple_push_cleanup pushed cleanups from within the
TARGET_EXPR_INITIAL gimplification (i.e. the last point where the slot could
be in theory used).  In my first version of the patch I've done it by just
moving the
      /* Add a clobber for the temporary going out of scope, like
         gimplify_bind_expr.  */
      if (gimplify_ctxp->in_cleanup_point_expr
          && needs_to_live_in_memory (temp))
        {
...
        }
block earlier in gimplify_target_expr, but that regressed a couple of tests
where temp is marked TREE_ADDRESSABLE only during (well, very early during
that) the gimplification of TARGET_EXPR_INITIAL, so we didn't emit e.g. on
pr80032.C or stack2.C tests any clobbers for the slots and thus stack slot
reuse wasn't performed.
So that we don't regress those tests, this patch gimplifies
TARGET_EXPR_INITIAL as before, but emits it into a temporary sequence
rather than directly into pre_p.  It then emits the CLOBBER cleanup
into pre_p, then asan poisoning if needed, then appends the
TARGET_EXPR_INITIAL temporary sequence and finally adds the
TARGET_EXPR_CLEANUP gimple_push_cleanup.  The earlier a GIMPLE_WCE appears
in the sequence, the more outer its try/finally or try/catch is.
So, with this patch the part of the testcase in gimple dump cited above
looks instead like:
          try
            {
              D.2499.e = B::qux (&h); [return slot optimization]
              D.2516 = 1;
              try
                {
                  try
                    {
                      B::B (&D.2498, &h);
                      _2 = baz (&D.2498);
                      D.2499.f = _2;
                      D.2516 = 0;
                      try
                        {
                          bar (&D.2499);
                        }
                      finally
                        {
                          C::~C (&D.2499);
                        }
                    }
                  finally
                    {
                      D.2498 = {CLOBBER(eol)};
                    }
                }
              catch
                {
                  if (D.2516 != 0) goto <D.2517>; else goto <D.2518>;
                  <D.2517>:
                  A::~A (&D.2499.e);
                  goto <D.2519>;
                  <D.2518>:
                  <D.2519>:
                }
            }
          finally
            {
              D.2499 = {CLOBBER(eol)};
            }

2022-03-17  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/103984
	* gimplify.cc (gimplify_target_expr): Gimplify type sizes and
	TARGET_EXPR_INITIAL into a temporary sequence, then push clobbers
	and asan unpoisoning, then append the temporary sequence and
	finally the TARGET_EXPR_CLEANUP clobbers.

	* g++.dg/opt/pr103984.C: New test.
2022-03-17 09:23:45 +01:00
Thomas Schwinge c43cb355f2 Enhance further testcases to verify OpenACC 'kernels' decomposition
gcc/testsuite/
	* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
	* c-c++-common/goacc/kernels-loop-g.c: Likewise.
	* c-c++-common/goacc/nesting-1.c: Likewise.
	* gcc.dg/goacc/nested-function-1.c: Likewise.
	* gfortran.dg/goacc/common-block-3.f90: Likewise.
	* gfortran.dg/goacc/nested-function-1.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
	Enhance.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
2022-03-17 08:51:32 +01:00
Thomas Schwinge 004fc4f2fc Enhance further testcases to verify handling of OpenACC privatization level [PR90115]
As originally introduced in commit 11b8286a83
"[OpenACC privatization] Largely extend diagnostics and corresponding testsuite
coverage [PR90115]".

	PR middle-end/90115
	gcc/testsuite/
	* c-c++-common/goacc-gomp/nesting-1.c: Enhance.
	* gfortran.dg/goacc/common-block-3.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Enhance.
	* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
2022-03-17 08:47:09 +01:00
GCC Administrator 9fc8f278eb Daily bump. 2022-03-17 00:17:00 +00:00
Roger Sayle 3ef2343927 Fix strange binary corruption with last commit.
2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/sse.md: Delete corrupt character/typo.
2022-03-16 23:28:21 +00:00
Roger Sayle 4565a07a64 PR c/98198: ICE-on-invalid-code error recovery.
This is Christophe Lyon's fix to PR c/98198, an ICE-on-invalid-code
regression affecting mainline, and a suitable testcase.
Tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
with no new failures.  Ok for mainline?

2022-03-16  Christophe Lyon  <christophe.lyon@arm.com>
	    Roger Sayle  <roger@nextmovesoftware.com>

gcc/c-family/ChangeLog
	PR c/98198
	* c-attribs.cc (decl_or_type_attrs): Add error_mark_node check.

gcc/testsuite/ChangeLog
	PR c/98198
	* gcc.dg/pr98198.c: New test case.
2022-03-16 23:20:34 +00:00
Roger Sayle 732e4a75fe PR target/94680: Clear upper bits of V2DF using movq (like V2DI).
This simple i386 patch unblocks a more significant change.  The testcase
gcc.target/i386/sse2-pr94680.c isn't quite testing what's intended, and
alas the fix for PR target/94680 doesn't (yet) handle V2DF mode.

For the first test from sse2-pr94680.c, below

v2df foo_v2df (v2df x) {
  return __builtin_shuffle (x, (v2df) { 0, 0 }, (v2di) { 0, 2 });
}

GCC on x86_64-pc-linux-gnu with -O2 currently generates:

        movhpd  .LC0(%rip), %xmm0
        ret
.LC0:
        .long   0
        .long   0

which passes the test as it contains a mov insn and no xor.
Alas reading a zero from the constant pool isn't quite the
desired implementation.  With this patch we now generate:

        movq    %xmm0, %xmm0
        ret

This is the same code as we generate for V2DI.  The patch also adds a
stricter test case.  The implementation generalizes sse2_movq128
to V2DI and V2DF modes using a VI8F_128 mode iterator and
renames it *sse2_movq128_<mode>.  A new define_expand is
introduced for sse2_movq128 so that the existing builtin
interface (CODE_FOR_sse2_movq128) remains the same.

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/94680
	* config/i386/sse.md (sse2_movq128): New define_expand to
	preserve previous named instruction.
	(*sse2_movq128_<mode>): Renamed from sse2_movq128, and
	generalized to VI8F_128 (both V2DI and V2DF).

gcc/testsuite/ChangeLog
	PR target/94680
	* gcc.target/i386/sse2-pr94680-2.c: New stricter V2DF test case.
2022-03-16 23:15:20 +00:00
Jonathan Wakely 2f26b26721 libstdc++: Fix symbol versioning for Solaris 11.3 [PR103407]
The new std::from_chars implementation means that those symbols are now
defined on Solaris 11.3, which lacks uselocale. They were not present in
gcc-11, but the linker script gives them the GLIBCXX_3.4.29 symbol
version because that is the version where they appeared for systems with
uselocale.

This makes the version for those symbols depend on whether uselocale is
available or not, so that they get version GLIBCXX_3.4.30 on targets
where they weren't defined in gcc-11.

In order to avoid needing separate ABI baseline files for Solaris 11.3
and 11.4, the ABI checker program now treats the floating-point
std::from_chars overloads as undesignated if they are not found in the
baseline symbols file. This means they can be left out of the Solaris
baseline without causing the check-abi target to fail.

libstdc++-v3/ChangeLog:

	PR libstdc++/103407
	* config/abi/pre/gnu.ver: Make version for std::from_chars
	depend on HAVE_USELOCALE macro.
	* testsuite/util/testsuite_abi.cc (compare_symbols): Treat
	std::from_chars for floating-point types as undesignated if
	not found in the baseline symbols file.
2022-03-16 21:16:53 +00:00
Ian Lance Taylor 69921f4a7e libgo: update to final Go 1.18 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/393377
2022-03-16 13:52:32 -07:00
David Malcolm 7fd6e36ea9 analyzer: early rejection of disabled warnings [PR104955]
Avoid generating execution paths for warnings that are ultimately
rejected due to -Wno-analyzer-* flags.

This improves the test case from taking at least several minutes
(before I killed it) to taking under a second.

This doesn't fix the slowdown seen in PR analyzer/104955 with large
numbers of warnings when the warnings are still enabled.
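
The early rejection described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: Diagnostic,
DiagnosticQueue and their members are hypothetical names, not the
analyzer's real classes, and the real code determines the controlling
option per diagnostic kind rather than storing a flag.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a pending diagnostic.
struct Diagnostic {
  std::string message;
  bool option_enabled;  // whether the controlling warning option is on
};

// Hypothetical queue that rejects disabled diagnostics when they are added,
// so no expensive work (such as building an execution path) is done for them.
class DiagnosticQueue {
 public:
  // Returns true if the diagnostic was saved for later processing.
  bool add(const Diagnostic& d) {
    if (!d.option_enabled) {
      ++num_disabled_;  // counted only so the total can be logged later
      return false;
    }
    saved_.push_back(d);
    return true;
  }

  std::size_t num_saved() const { return saved_.size(); }
  std::size_t num_disabled() const { return num_disabled_; }

 private:
  std::vector<Diagnostic> saved_;
  std::size_t num_disabled_ = 0;
};
```

The point of the design is that the check happens at insertion time, so
the cost of a disabled warning is a single branch and a counter bump.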

gcc/analyzer/ChangeLog:
	PR analyzer/104955
	* diagnostic-manager.cc (get_emission_location): New.
	(diagnostic_manager::diagnostic_manager): Initialize
	m_num_disabled_diagnostics.
	(diagnostic_manager::add_diagnostic): Reject diagnostics that
	will eventually be rejected due to being disabled.
	(diagnostic_manager::emit_saved_diagnostics): Log the number
	of disabled diagnostics.
	(diagnostic_manager::emit_saved_diagnostic): Split out logic for
	determining emission location to get_emission_location.
	* diagnostic-manager.h
	(diagnostic_manager::m_num_disabled_diagnostics): New field.
	* engine.cc (stale_jmp_buf::get_controlling_option): New.
	(stale_jmp_buf::emit): Use it.
	* pending-diagnostic.h
	(pending_diagnostic::get_controlling_option): New vfunc.
	* region-model.cc
	(poisoned_value_diagnostic::get_controlling_option): New.
	(poisoned_value_diagnostic::emit): Use it.
	(shift_count_negative_diagnostic::get_controlling_option): New.
	(shift_count_negative_diagnostic::emit): Use it.
	(shift_count_overflow_diagnostic::get_controlling_option): New.
	(shift_count_overflow_diagnostic::emit): Use it.
	(dump_path_diagnostic::get_controlling_option): New.
	(dump_path_diagnostic::emit): Use it.
	(write_to_const_diagnostic::get_controlling_option): New.
	(write_to_const_diagnostic::emit): Use it.
	(write_to_string_literal_diagnostic::get_controlling_option): New.
	(write_to_string_literal_diagnostic::emit): Use it.
	* sm-file.cc (double_fclose::get_controlling_option): New.
	(double_fclose::emit): Use it.
	(file_leak::get_controlling_option): New.
	(file_leak::emit): Use it.
	* sm-malloc.cc (mismatching_deallocation::get_controlling_option):
	New.
	(mismatching_deallocation::emit): Use it.
	(double_free::get_controlling_option): New.
	(double_free::emit): Use it.
	(possible_null_deref::get_controlling_option): New.
	(possible_null_deref::emit): Use it.
	(possible_null_arg::get_controlling_option): New.
	(possible_null_arg::emit): Use it.
	(null_deref::get_controlling_option): New.
	(null_deref::emit): Use it.
	(null_arg::get_controlling_option): New.
	(null_arg::emit): Use it.
	(use_after_free::get_controlling_option): New.
	(use_after_free::emit): Use it.
	(malloc_leak::get_controlling_option): New.
	(malloc_leak::emit): Use it.
	(free_of_non_heap::get_controlling_option): New.
	(free_of_non_heap::emit): Use it.
	* sm-pattern-test.cc (pattern_match::get_controlling_option): New.
	(pattern_match::emit): Use it.
	* sm-sensitive.cc
	(exposure_through_output_file::get_controlling_option): New.
	(exposure_through_output_file::emit): Use it.
	* sm-signal.cc (signal_unsafe_call::get_controlling_option): New.
	(signal_unsafe_call::emit): Use it.
	* sm-taint.cc (tainted_array_index::get_controlling_option): New.
	(tainted_array_index::emit): Use it.
	(tainted_offset::get_controlling_option): New.
	(tainted_offset::emit): Use it.
	(tainted_size::get_controlling_option): New.
	(tainted_size::emit): Use it.
	(tainted_divisor::get_controlling_option): New.
	(tainted_divisor::emit): Use it.
	(tainted_allocation_size::get_controlling_option): New.
	(tainted_allocation_size::emit): Use it.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/many-disabled-diagnostics.c: New test.
	* gcc.dg/plugin/analyzer_gil_plugin.c
	(gil_diagnostic::get_controlling_option): New.
	(double_save_thread::emit): Use it.
	(fncall_without_gil::emit): Likewise.
	(pyobject_usage_without_gil::emit): Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-03-16 14:01:19 -04:00
Jonathan Wakely 5a4e208022 libstdc++: Ensure that std::from_chars is declared when supported
This adjusts the declarations in <charconv> to match when the definition
is present. This solves the issue that std::from_chars is present on
Solaris 11.3 (using fast_float) but was not declared in the header
(because the declarations were guarded by _GLIBCXX_HAVE_USELOCALE).

Additionally, do not define __cpp_lib_to_chars unless both from_chars
and to_chars are supported (which is only true for IEEE float and
double). We might still provide from_chars (via strtold) but if to_chars
isn't provided, we shouldn't define the feature test macro.
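
As a rough illustration of how user code might rely on the feature test
macro, the sketch below uses std::from_chars for double only when
__cpp_lib_to_chars is defined, and otherwise falls back to strtod.  The
helper name parse_double is an assumption made for this example, and the
two branches are not exactly equivalent (strtod is locale-dependent and
accepts leading whitespace).

```cpp
#include <charconv>
#include <cstdlib>
#include <string>
#include <system_error>

// Hypothetical helper: parse the whole of TEXT as a double.
// Returns true on success and stores the value in OUT.
bool parse_double(const std::string& text, double& out) {
#if defined(__cpp_lib_to_chars)
  // The library advertises floating-point <charconv> support.
  const char* first = text.data();
  const char* last = first + text.size();
  auto result = std::from_chars(first, last, out);
  return result.ec == std::errc() && result.ptr == last;
#else
  // Fallback when the macro is not defined.
  char* end = nullptr;
  out = std::strtod(text.c_str(), &end);
  return end != text.c_str() && *end == '\0';
#endif
}
```

For example, parse_double("3.5", v) succeeds and stores 3.5 in v, while
parse_double("abc", v) fails in both branches.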

Finally, this simplifies some of the preprocessor checks in the bodies
of std::from_chars in src/c++17/floating_from_chars.cc and hoists the
repeated code for the strtod version into a new function template.

N.B. the long double overload of std::from_chars will always be defined
if the float and double overloads are defined. We can always use one of
strtold or fast_float's binary64 routines (although the latter might
produce errors for some long double values if they are not representable
as binary64).

libstdc++-v3/ChangeLog:

	* include/std/charconv (__cpp_lib_to_chars): Only define when
	both from_chars and to_chars are supported for floating-point
	types.
	(from_chars, to_chars): Adjust preprocessor conditions guarding
	declarations.
	* include/std/version (__cpp_lib_to_chars): Adjust condition to
	match <charconv> definition.
	* src/c++17/floating_from_chars.cc (from_chars_strtod): New
	function template.
	(from_chars): Simplify preprocessor checks and use
	from_chars_strtod when appropriate.
2022-03-16 16:06:29 +00:00
Siddhesh Poyarekar beb12c62ea tree-optimization/104941: Actually assign the conversion result
Assign the result of fold_convert to offset.  Also make the useless
conversion check lighter since the two way check is not needed here.

gcc/ChangeLog:

	PR tree-optimization/104941
	* tree-object-size.cc (size_for_offset): Make useless conversion
	check lighter and assign result of fold_convert to OFFSET.

gcc/testsuite/ChangeLog:

	PR tree-optimization/104941
	* gcc.dg/builtin-dynamic-object-size-0.c (S1, S2): New structs.
	(test_alloc_nested_structs, g): New functions.
	(main): Call test_alloc_nested_structs.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2022-03-16 20:45:48 +05:30
Marcel Vollweiler be093b8dcc OpenMP, Fortran: Bugfix for omp_set_num_teams.
This patch fixes a small bug in the omp_set_num_teams implementation.

libgomp/ChangeLog:

	* fortran.c (omp_set_num_teams_8_): Call omp_set_num_teams instead of
	omp_set_max_active_levels.
	* testsuite/libgomp.fortran/icv-8.f90: New test.
2022-03-16 07:38:54 -07:00
H.J. Lu 3117ffce4c x86: Also check _SOFT_FLOAT in <x86gprintrin.h>
Push target("general-regs-only") in <x86gprintrin.h> if x87 is enabled.

gcc/

	PR target/104890
	* config/i386/x86gprintrin.h: Also check _SOFT_FLOAT before
	pushing target("general-regs-only").

gcc/testsuite/

	PR target/104890
	* gcc.target/i386/pr104890.c: New test.
2022-03-16 06:30:53 -07:00
Kito Cheng 2a5fabeb2f RISC-V: Add version info for zk, zkn and zks
We previously just expanded `zk`, `zkn` and `zks`, but version info is
needed to combine them back.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
	Add version info for zk, zks and zkn.
2022-03-16 21:11:46 +08:00
LiaoShihua eb4f83d1f1 RISC-V: Handle combine extension in canonical ordering.
The crypto extension has several shorthand extensions that don't consist of any extra instructions.
Take zk for example: the extension implies zkn, zkr and zkt.
These three extensions should also combine back into zk to maintain the canonical order in ISA strings.
This patch addresses the above.
If other extensions have the same situation, they can be added to riscv_combine_info[].
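
The combining step can be sketched as below.  This is a simplified
illustration, not the code in riscv-common.cc: CombineInfo,
combine_table and combine_extensions are hypothetical names, and only
the zk entry stated above is included.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical table entry: a shorthand extension and what it implies.
struct CombineInfo {
  std::string shorthand;
  std::vector<std::string> implied;
};

// Only the zk relationship from the commit message is listed here.
const std::vector<CombineInfo> combine_table = {
    {"zk", {"zkn", "zkr", "zkt"}},
};

// Add each shorthand whose implied extensions are all present in EXTS.
void combine_extensions(std::set<std::string>& exts) {
  for (const CombineInfo& info : combine_table) {
    bool all_present = true;
    for (const std::string& e : info.implied) {
      if (exts.count(e) == 0) {
        all_present = false;
        break;
      }
    }
    if (all_present)
      exts.insert(info.shorthand);
  }
}
```

With {zkn, zkr, zkt} as input the set gains zk; with only {zkn, zkr} it
is left unchanged.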

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc
	(riscv_combine_info): New.
	(riscv_subset_list::handle_combine_ext): Combine back into zk to
	maintain the canonical order in isa strings.
	(riscv_subset_list::parse): Ditto.
	* config/riscv/riscv-subset.h (handle_combine_ext): New.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/predef-17.c: New test.
2022-03-16 21:11:45 +08:00
Richard Biener f6fb661ea8 tree-optimization/102008 - restore if-conversion of adjacent loads
The following re-orders the newly added code sinking pass before
the last phiopt pass which performs hoisting of adjacent loads
with the intent to enable if-conversion on those.

I've added the aarch64 specific testcase from the PR.

2022-03-16  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102008
	* passes.def: Move the added code sinking pass before the
	preceding phiopt pass.

	* gcc.target/aarch64/pr102008.c: New testcase.
2022-03-16 14:00:35 +01:00
Patrick Palka 5809bb4f78 c++: further lookup_member simplification
As a minor followup to r12-7656-gffe9c0a0d3564a, this condenses the
handling of ambiguity and access w.r.t. the value of 'protect' so that
the logic is more clear.

gcc/cp/ChangeLog:

	* search.cc (lookup_member): Simplify by handling all values
	of protect together in the ambiguous case.  Don't modify protect.
2022-03-16 08:26:11 -04:00
Patrick Palka e55c5e24b9 c++: fold calls to std::move/forward [PR96780]
A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means the compiler generates debug info,
which persists even after the call gets inlined, for an operation that's
never interesting to debug.

This patch addresses this problem by folding calls to std::move/forward
and other cast-like functions into simple casts as part of the frontend's
general expression folding routine.  This behavior is controlled by a
new flag -ffold-simple-inlines, and otherwise by -fno-inline, so that
users can enable this folding with -O0 (which implies -fno-inline).
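
To make the "equivalent to a cast" statement concrete, here is a small
sketch showing that std::move on an lvalue has the same type as the
corresponding static_cast.  The function names are made up for this
example; it shows the equivalence only, not the folding the frontend
performs.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// std::move on a std::string lvalue yields std::string&&, exactly like
// the explicit cast; this is checked at compile time.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string&>())),
                 decltype(static_cast<std::string&&>(
                     std::declval<std::string&>()))>::value,
    "std::move is equivalent to a cast to an rvalue reference");

// Move-construct a new string from S via the library function call.
std::string take_with_move(std::string& s) {
  return std::string(std::move(s));
}

// The same operation written as the plain cast.
std::string take_with_cast(std::string& s) {
  return std::string(static_cast<std::string&&>(s));
}
```

Both functions return the original contents of s; the only difference
is that the first form is a function call until it is inlined or folded.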

After this patch with -O2 and a non-checking compiler, debug info size
for some testcases from range-v3 and cmcstl2 decreases by as much as ~10%
and overall compile time and memory usage decreases by ~2%.

	PR c++/96780

gcc/ChangeLog:

	* doc/invoke.texi (C++ Dialect Options): Document
	-ffold-simple-inlines.

gcc/c-family/ChangeLog:

	* c.opt: Add -ffold-simple-inlines.

gcc/cp/ChangeLog:

	* cp-gimplify.cc (cp_fold) <case CALL_EXPR>: Fold calls to
	std::move/forward and other cast-like functions into simple
	casts.

gcc/testsuite/ChangeLog:

	* g++.dg/opt/pr96780.C: New test.
2022-03-16 08:25:54 -04:00
Siddhesh Poyarekar 818e305ea6 tree-optimization/104942: Retain sizetype conversions till the end
Retain the sizetype alloc_object_size to guarantee the assertion in
size_for_offset and to avoid adding a conversion there.  nop conversions
are eliminated at the end anyway in dynamic object size computation.

gcc/ChangeLog:

	PR tree-optimization/104942
	* tree-object-size.cc (alloc_object_size): Remove STRIP_NOPS.

gcc/testsuite/ChangeLog:

	PR tree-optimization/104942
	* gcc.dg/builtin-dynamic-object-size-0.c (alloc_func_long,
	test_builtin_malloc_long): New functions.
	(main): Use it.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2022-03-16 16:10:51 +05:30
Jakub Jelinek 952155629c aarch64: Fix up RTL sharing bug in aarch64_load_symref_appropriately [PR104910]
We unshare all RTL created during expansion, but when
aarch64_load_symref_appropriately is called after expansion like in the
following testcases, we use imm in both HIGH and LO_SUM operands.
If imm is some RTL that shouldn't be shared like a non-sharable CONST,
we get at least with --enable-checking=rtl a checking ICE, otherwise might
just get silently wrong code.

The following patch fixes that by copying it if it can't be shared.

2022-03-16  Jakub Jelinek  <jakub@redhat.com>

	PR target/104910
	* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately): Copy
	imm rtx.

	* gcc.dg/pr104910.c: New test.
2022-03-16 11:04:16 +01:00
Roger Sayle 6aef670e48 Performance/size improvement to single_use when matching GIMPLE.
This patch improves the implementation of single_use as used in code
generated from match.pd for patterns using :s.  The current implementation
contains the logic "has_zero_uses (t) || has_single_use (t)" which
performs a loop over the uses to first check if there are zero non-debug
uses [which is rare], then another loop over these uses to check if there
is exactly one non-debug use.  This can be better implemented using a
single loop.
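
The difference between the two forms can be sketched as follows.  Use is
a hypothetical stand-in for a use of an SSA name (GCC walks its own
immediate-use lists); both functions answer "are there zero or one
non-debug uses?".

```cpp
#include <vector>

// Hypothetical stand-in for one use of a value; not GCC's data structure.
struct Use {
  bool is_debug;  // true when the use is in a debug statement
};

// Two-loop form, mirroring "has_zero_uses (t) || has_single_use (t)".
bool single_use_two_loops(const std::vector<Use>& uses) {
  bool zero = true;
  for (const Use& u : uses)
    if (!u.is_debug) { zero = false; break; }
  if (zero)
    return true;
  int count = 0;
  for (const Use& u : uses)
    if (!u.is_debug && ++count > 1)
      return false;
  return count == 1;
}

// Single-loop form: stop as soon as a second non-debug use is seen.
bool single_use_one_loop(const std::vector<Use>& uses) {
  bool seen = false;
  for (const Use& u : uses) {
    if (u.is_debug)
      continue;
    if (seen)
      return false;  // second non-debug use
    seen = true;
  }
  return true;  // zero or one non-debug use
}
```

The single-loop form visits each use at most once and exits early in
the common case of multiple uses.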

This function is currently inlined over 800 times in gimple-match.cc,
whose .o on x86_64-pc-linux-gnu is now up to 30 Mbytes, so speeding up
and shrinking this function should help offset the growth in match.pd
for GCC 12.

I've also done an analysis of the stage3 sizes of gimple-match.o on
x86_64-pc-linux-gnu, which I believe is dominated by debug information,
the .o file is 30MB in stage3, but only 4.8M in stage2.  Before my
proposed patch gimple-match.o is 31385160 bytes.  The patch as proposed
yesterday (using a single loop in single_use) reduces that to 31105040
bytes, saving 280120 bytes.  The suggestion to remove the "inline"
keyword saves only 56 more bytes, but annotating ATTRIBUTE_PURE on a
function prototype was curiously effective, saving 1888 bytes.

before:   31385160
after:    31105040	saved 280120
-inline:  31104984	saved 56
+pure:    31103096	saved 1888

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>
	    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
	* gimple-match-head.cc (single_use): Implement inline using a
	single loop.
2022-03-16 09:27:33 +00:00
Roger Sayle 7690bee9f3 Some minor HONOR_NANS improvements to match.pd
Tweak the constant folding of X CMP X when X can't be a NaN.
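
A short sketch of why NaN matters for folding a comparison of a value
with itself.  The function names are illustrative only, and the code
assumes IEEE semantics (it is not meaningful under -ffast-math).

```cpp
#include <cmath>

// X == X is true for every value except NaN.
bool equals_itself(double x) { return x == x; }

// X <= X is likewise true unless X is NaN.
bool less_equal_itself(double x) { return x <= x; }

// X != X is true only for NaN.
bool differs_from_itself(double x) { return x != x; }
```

For x = 1.0 the three functions return true, true and false; for
x = std::nan("") they return false, false and true.  A compiler can
therefore fold such comparisons to a constant only when it knows the
operand cannot be a NaN.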

2022-03-16  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* match.pd (X CMP X -> true): Test tree_expr_maybe_nan_p
	instead of HONOR_NANS.
	(X LTGT X -> false): Enable if X is not tree_expr_maybe_nan_p, as
	this can't trap/signal.
2022-03-16 09:25:34 +00:00
Thomas Schwinge ab46fc7c3b OpenACC privatization diagnostics vs. 'assert' [PR102841]
It's an orthogonal concern why these diagnostics do appear at all for
non-offloaded OpenACC constructs (where they're not relevant at all); PR90115.

Depending on how 'assert' is implemented, it may cause temporaries to be
created, and/or may lower into 'COND_EXPR's, and
'gcc/gimplify.cc:gimplify_cond_expr' uses 'create_tmp_var (type, "iftmp")'.

Fix-up for commit 11b8286a83
"[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]".

	PR testsuite/102841
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Adjust.
2022-03-16 10:12:09 +01:00
liuhongt 570d5bff9a Don't fold __builtin_ia32_blendvpd w/o sse4.2.
__builtin_ia32_blendvpd is defined under sse4.1 and gimple folded
to ((v2di) c) < 0 ? b : a where vec_cmpv2di is under sse4.2 w/o which
it's veclowered to scalar operations and not combined back in rtl.

gcc/ChangeLog:

	PR target/104946
	* config/i386/i386-builtin.def (BDESC): Add
	CODE_FOR_sse4_1_blendvpd for IX86_BUILTIN_BLENDVPD.
	* config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
	__builtin_ia32_blendvpd w/o sse4.2

gcc/testsuite/ChangeLog:

	* gcc.target/i386/sse4_1-blendvpd-1.c: New test.
2022-03-16 16:50:29 +08:00
Chung-Ju Wu 088a51a0ab MAINTAINERS: Add myself to DCO section
ChangeLog:

	* MAINTAINERS: Add myself to DCO section.
2022-03-16 03:20:00 +00:00
GCC Administrator 14d2ac82ee Daily bump. 2022-03-16 00:16:44 +00:00
David Malcolm d1d95846e3 analyzer: add test coverage for PR 95000
PR analyzer/95000 isn't fixed yet; add test coverage with XFAILs.

gcc/testsuite/ChangeLog:
	PR analyzer/95000
	* gcc.dg/analyzer/pr95000-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-03-15 17:56:29 -04:00
David Malcolm a58e342d88 analyzer: presize m_cluster_map in store copy ctor
Testing cc1 on pr93032-mztools-unsigned-char.c

Benchmark #1: (without patch)
  Time (mean ± σ):     338.8 ms ±  13.6 ms    [User: 323.2 ms, System: 14.2 ms]
  Range (min … max):   326.7 ms … 363.1 ms    10 runs

Benchmark #2: (with patch)
  Time (mean ± σ):     332.3 ms ±  12.8 ms    [User: 316.6 ms, System: 14.3 ms]
  Range (min … max):   322.5 ms … 357.4 ms    10 runs

Summary
  ./cc1.new ran 1.02 ± 0.06 times faster than ./cc1.old

gcc/analyzer/ChangeLog:
	* store.cc (store::store): Presize m_cluster_map.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-03-15 17:55:14 -04:00
Peter Bergner b5baf569f7 rs6000: Fix invalid address passed to __builtin_mma_disassemble_acc [PR104923]
The mma_disassemble_output_operand predicate is too lenient on the types
of addresses it will accept, leading to combine creating invalid addresses
that eventually lead to ICEs in LRA.  The solution is to restrict the
addresses to indirect, indexed or those valid for quad memory accesses.

2022-03-15  Peter Bergner  <bergner@linux.ibm.com>

gcc/
	PR target/104923
	* config/rs6000/predicates.md (mma_disassemble_output_operand): Restrict
	acceptable MEM addresses.

gcc/testsuite/
	PR target/104923
	* gcc.target/powerpc/pr104923.c: New test.
2022-03-15 08:49:47 -05:00
Patrick Palka ffe9c0a0d3 c++: extraneous access error with ambiguous lookup [PR103177]
When a lookup is ambiguous, lookup_member still attempts to check
access of the first member found before diagnosing the ambiguity and
propagating the error, and this may cause us to issue an extraneous
access error as in the testcase below (for B1::foo).

This patch fixes this by swapping the order of the ambiguity and access
checks within lookup_member.  In passing, since the only thing that could
go wrong during lookup_field_r is ambiguity, we might as well hardcode
that in lookup_member and get rid of lookup_field_info::errstr.

	PR c++/103177

gcc/cp/ChangeLog:

	* search.cc (lookup_field_info::errstr): Remove this data
	member.
	(lookup_field_r): Don't set errstr.
	(lookup_member): Check ambiguity before checking access.
	Simplify accordingly after errstr removal.  Exit early upon
	error or empty result.

gcc/testsuite/ChangeLog:

	* g++.dg/lookup/ambig6.C: New test.
2022-03-15 08:50:24 -04:00
Jakub Jelinek 98afdc3e2b riscv: Allow -Wno-psabi to turn off ABI warnings [PR91229]
While checking if all targets honor -Wno-psabi for ABI related warnings
or messages, I found that almost all do, except for riscv.
In the testsuite when we want to ignore ABI related messages we
typically use -Wno-psabi -w, but it would be nice to get rid of those
-w uses eventually.

The following allows silencing those warnings with -Wno-psabi rather than
just -w even on riscv.

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/91229
	* config/riscv/riscv.cc (riscv_pass_aggregate_in_fpr_pair_p,
	riscv_pass_aggregate_in_fpr_and_gpr_p): Pass OPT_Wpsabi instead of 0
	to warning calls.
2022-03-15 13:34:33 +01:00
Jakub Jelinek da24fce323 i386: Use no-mmx,no-sse for LIBGCC2_UNWIND_ATTRIBUTE [PR104890]
Regardless of the outcome of the general-regs-only stuff in x86gprintrin.h,
apparently general-regs-only is much bigger hammer than no-sse, and e.g.
using 387 instructions in the unwinder isn't a big deal, it never needs
to realign the stack because of it.

So, the following patch uses no-sse (and adds no-mmx to it, even when not
strictly needed).

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/104890
	* config/i386/i386.h (LIBGCC2_UNWIND_ATTRIBUTE): Use no-mmx,no-sse
	instead of general-regs-only.
2022-03-15 10:24:22 +01:00
Roger Sayle 49fb0af9bf PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.
This patch resolves PR tree-optimization/101895, a missed optimization
regression, by adding a constant folding simplification to match.pd to
simplify the transform "mult; vec_perm; plus" into "vec_perm; mult; plus"
with the aim that keeping the multiplication and addition next to each
other allows them to be recognized as fused-multiply-add on suitable
targets.  This transformation requires a tweak to match.pd's
vec_same_elem_p predicate to handle CONSTRUCTOR_EXPRs using the same
SSA_NAME_DEF_STMT idiom used for constructors elsewhere in match.pd.
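
The reordering is valid because one multiplication operand has the same
value in every lane, so permuting before or after the multiplication
selects the same products.  The sketch below models this with plain
arrays; Vec4, Perm4 and the function names are assumptions for this
example and do not correspond to anything in match.pd.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Perm4 = std::array<std::size_t, 4>;

// Lane i of the result is lane idx[i] of V.
Vec4 permute(const Vec4& v, const Perm4& idx) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) {
    assert(idx[i] < 4);
    r[i] = v[idx[i]];
  }
  return r;
}

// "mult; vec_perm; plus": multiply by the broadcast scalar B, then permute.
Vec4 mult_perm_plus(const Vec4& c, float b, const Vec4& a, const Perm4& idx) {
  Vec4 prod{};
  for (std::size_t i = 0; i < 4; ++i)
    prod[i] = c[i] * b;
  Vec4 p = permute(prod, idx);
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = p[i] + a[i];
  return r;
}

// "vec_perm; mult; plus": permute first, so that the multiply and add are
// adjacent and can be recognized as a fused multiply-add.
Vec4 perm_mult_plus(const Vec4& c, float b, const Vec4& a, const Perm4& idx) {
  Vec4 p = permute(c, idx);
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = p[i] * b + a[i];
  return r;
}
```

With c = {1, 2, 3, 4}, b = 2, a = {10, 20, 30, 40} and the permutation
{0, 2, 1, 3}, both orderings compute {12, 26, 34, 48}.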

The net effect is that the following code example:

void foo(float * __restrict__ a, float b, float *c) {
  a[0] = c[0]*b + a[0];
  a[1] = c[2]*b + a[1];
  a[2] = c[1]*b + a[2];
  a[3] = c[3]*b + a[3];
}

when compiled on x86_64-pc-linux-gnu with -O2 -march=cascadelake
currently generates:

        vbroadcastss    %xmm0, %xmm0
        vmulps  (%rsi), %xmm0, %xmm0
        vpermilps       $216, %xmm0, %xmm0
        vaddps  (%rdi), %xmm0, %xmm0
        vmovups %xmm0, (%rdi)
        ret

but with this patch now generates the improved:

        vpermilps       $216, (%rsi), %xmm1
        vbroadcastss    %xmm0, %xmm0
        vfmadd213ps     (%rdi), %xmm0, %xmm1
        vmovups %xmm1, (%rdi)
        ret

2022-03-15  Roger Sayle  <roger@nextmovesoftware.com>
	    Marc Glisse  <marc.glisse@inria.fr>
	    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
	PR tree-optimization/101895
	* match.pd (vec_same_elem_p): Handle CONSTRUCTOR_EXPR def.
	(plus (vec_perm (mult ...) ...) ...): New reordering simplification.

gcc/testsuite/ChangeLog
	PR tree-optimization/101895
	* gcc.target/i386/pr101895.c: New test case.
2022-03-15 09:05:28 +00:00
Jakub Jelinek efd1582926 c++: Fix up cp_parser_skip_to_pragma_eol [PR104623]
We ICE on the following testcase, because we tentatively parse it multiple
times and the erroneous attribute syntax results in
cp_parser_skip_to_end_of_statement, which, when seeing CPP_PRAGMA (can be
any deferred one, OpenMP/OpenACC/ivdep etc.), calls
cp_parser_skip_to_pragma_eol, which calls cp_lexer_purge_tokens_after.
That call purges all the tokens from CPP_PRAGMA until CPP_PRAGMA_EOL,
excluding the initial CPP_PRAGMA though (but including the final
CPP_PRAGMA_EOL).  This means the second time we parse this, we see
CPP_PRAGMA with no tokens after it from the pragma, most importantly
not the CPP_PRAGMA_EOL, so either if it is the last pragma in the TU,
we ICE, or if there are other pragmas we treat everything in between
as a pragma.
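
The failure mode described above can be sketched with a toy token buffer.
This is a simplified model, not the C++ front end's lexer: the enum, the
array representation and both functions are hypothetical and only borrow
names from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; only the names echo the commit message.  */
enum tok { T_PRAGMA, T_NAME, T_PRAGMA_EOL, T_SEMI, T_EOF, T_PURGED };

/* Starting at a T_PRAGMA token, return the index just past its
   T_PRAGMA_EOL, or -1 if T_EOF is reached first.  Purged slots never
   match, as if the tokens had been removed.  */
int skip_to_pragma_eol(const enum tok *toks, size_t start)
{
    assert(toks[start] == T_PRAGMA);
    for (size_t i = start + 1; toks[i] != T_EOF; i++)
        if (toks[i] == T_PRAGMA_EOL)
            return (int)(i + 1);
    return -1;
}

/* Model of the old behavior: purge everything after the T_PRAGMA up to
   and including the T_PRAGMA_EOL, but keep the T_PRAGMA itself.  */
void purge_after_pragma(enum tok *toks, size_t start)
{
    size_t i = start + 1;
    while (toks[i] != T_EOF && toks[i] != T_PRAGMA_EOL)
        toks[i++] = T_PURGED;
    if (toks[i] == T_PRAGMA_EOL)
        toks[i] = T_PURGED;
}
```

In this model the first scan finds the end of the pragma, while a second
scan after purging runs to the end of the buffer.  If a later pragma
existed, the second scan would stop at that pragma's end-of-line token
instead, swallowing everything in between.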

I've tried various things, including making the CPP_PRAGMA token
itself also purged, or changing the cp_parser_skip_to_end_of_statement
(and cp_parser_skip_to_end_of_block_or_statement) to call it with
NULL instead of token, so that this purging isn't done there,
but each patch resulted in lots of regressions.
But removing the purging altogether surprisingly doesn't regress anything,
and I think it is the right thing: if we e.g. parse tentatively, why can't
we parse the pragma multiple times or at least skip over it?

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

	PR c++/104623
	* parser.cc (cp_parser_skip_to_pragma_eol): Don't purge any tokens.

	* g++.dg/gomp/pr104623.C: New test.
2022-03-15 09:15:27 +01:00
Jakub Jelinek a2645cd8fb ifcvt: Punt if not onlyjump_p for find_if_case_{1,2} [PR104814]
find_if_case_{1,2} implicitly assumes conditional jumps and rewrites them,
so if they have extra side-effects or are say asm goto, things don't work
well, either the side-effects are lost or we could ICE.
In particular, the testcase below has, on s390x, a doloop instruction
that decrements a register in addition to testing it for non-zero and
conditionally jumping based on that.

The following patch fixes that by punting for !onlyjump_p case, i.e.
if there are side-effects in the jump instruction or it isn't a plain PC
setter.

Also, it assumes BB_END (test_bb) will be always non-NULL, because basic
blocks with 2 non-abnormal successor edges should always have some instruction
at the end that determines which edge to take.

2022-03-15  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/104814
	* ifcvt.cc (find_if_case_1, find_if_case_2): Punt if test_bb doesn't
	end with onlyjump_p.  Assume BB_END (test_bb) is always non-NULL.

	* gcc.c-torture/execute/pr104814.c: New test.
2022-03-15 09:12:03 +01:00
Martin Sebor 373a2dc2be Avoid -Wdangling-pointer for by-transparent-reference arguments [PR104436].
This change avoids -Wdangling-pointer for by-value arguments transformed
into by-transparent-reference.

Resolves:
PR middle-end/104436 - spurious -Wdangling-pointer assigning local address to a class passed by value

gcc/ChangeLog:

	PR middle-end/104436
	* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
	Check for warning suppression.  Avoid by-value arguments transformed
	into by-transparent-reference.

gcc/testsuite/ChangeLog:

	PR middle-end/104436
	* c-c++-common/Wdangling-pointer-8.c: New test.
	* g++.dg/warn/Wdangling-pointer-5.C: New test.
2022-03-14 18:26:05 -06:00
GCC Administrator 510613e76c Daily bump. 2022-03-15 00:16:49 +00:00
Joseph Myers c6f7a9fcbf Update gcc de.po, fr.po, sv.po
* de.po, fr.po, sv.po: Update.
2022-03-14 22:28:33 +00:00
Roger Sayle 6abc4e46f8 Fix libitm.c/memset-1.c test fails with new peephole2s.
My sincere apologies for the breakage, but alas handling SImode in the
recently added "xorl;movb -> movzbl" peephole2 turns out to be slightly
more complicated than just using SWI48 as a mode iterator.  I'd failed
to check the machine description carefully, but the *zero_extend<mode>si2
define_insn is conditionally defined, based on x86 target tuning using
TARGET_ZERO_EXTEND_WITH_AND, and therefore unavailable on 486 and pentium
unless optimizing the code for size.  It turns out that the libitm testsuite
specifies -m486 with make check RUNTESTFLAGS="--target_board='unix{-m32}'"
and therefore encounters/catches the oversight.

Fixed by adding the appropriate conditions to the new peephole2 patterns.

2022-03-14  Roger Sayle  <roger@nextmovesoftware.com>
	    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
	* config/i386/i386.md (peephole2 xorl;movb -> movzbl): Disable
	transformation when *zero_extend<mode>si2 is not available.

gcc/testsuite/ChangeLog
	* gcc.target/i386/pr98335.c: Skip this test if tuning for i486
	or pentium, and not optimizing for size.
2022-03-14 18:12:55 +00:00
Xi Ruoyao 344e6f9f2a Enable libsanitizer build on mips64
Bootstrapped and regtested on mips64-linux-gnuabi64.

bootstrap-ubsan revealed 3 bugs (PR 104842, 104843, 104851).
bootstrap-asan did not reveal any new bug.

gcc/

	* config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Define.
	* config/mips/mips.cc (mips_option_override): Make
	-fsanitize=address imply -fasynchronous-unwind-tables.  This is
	needed by libasan for stack backtrace on MIPS.
	(mips_asan_shadow_offset): Return SUBTARGET_SHADOW_OFFSET.

gcc/testsuite:

	* c-c++-common/asan/global-overflow-1.c: Skip for MIPS with some
	optimization levels because inaccurate debug info is causing
	dg-output mismatch on line numbers.
	* g++.dg/asan/large-func-test-1.C: Likewise.

libsanitizer/

	* configure.tgt: Enable build on mips*64*-*-linux*.
2022-03-15 00:39:47 +08:00
Xi Ruoyao a60a3a95d0 libsanitizer: cherry-pick db7bca28638e from upstream
libsanitizer/

	* sanitizer_common/sanitizer_atomic_clang.h: Ensure that
	sanitizer_atomic_clang_mips.h is only included for O32.
2022-03-15 00:34:12 +08:00
Jakub Jelinek 77eb0461ab lra: Fix up debug_p handling in lra_substitute_pseudo [PR104778]
The following testcase ICEs on powerpc-linux, because lra_substitute_pseudo
substitutes (const_int 1) into a subreg operand.  First a subreg of subreg
of a reg appears in a debug insn (which surely is invalid outside of
debug insns, but in debug insns we allow even what is normally invalid in
RTL, like subregs which the target doesn't like, because either dwarf2out
is able to handle it, or we just throw away the location expression,
making some var <optimized out>).

lra_substitute_pseudo already has some code to deal with specifically
SUBREG of REG with the REG being substituted for VOIDmode constant,
but that doesn't cover this case, so the following patch extends
lra_substitute_pseudo for debug_p mode to treat such operands the way e.g.
the combiner's subst function does, to ensure we don't lose the mode, which
is essential for the IL.

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

	PR debug/104778
	* lra.cc (lra_substitute_pseudo): For debug_p mode, simplify
	SUBREG, ZERO_EXTEND, SIGN_EXTEND, FLOAT or UNSIGNED_FLOAT if recursive
	call simplified the first operand into VOIDmode constant.

	* gcc.target/powerpc/pr104778.c: New test.
2022-03-14 14:49:09 +01:00
Jonathan Wakely 8f7b7c1495 libstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875]
The current code in read_utf8_code_point assumes that integer promotion
will create a 32-bit int, but that's not true for 16-bit targets like
msp430 and avr. This changes the intermediate variables used for each
octet from unsigned char to char32_t, so that (c << N) works correctly
when N > 8.
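
A minimal sketch of the idea in C, assuming a well-formed three-byte
sequence; decode_utf8_3 is a hypothetical helper and not the libstdc++
function, and uint32_t stands in for char32_t.  It skips the overlong and
surrogate checks a real decoder needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 3-byte UTF-8 sequence.  Returns the code point, or
   UINT32_MAX if the lead or continuation bytes are malformed.  */
uint32_t decode_utf8_3(const unsigned char *p)
{
    assert(p != NULL);

    /* Widen each octet to 32 bits before shifting.  With unsigned char
       operands the shift would be done in int, which is only 16 bits
       wide on targets such as msp430 and avr.  */
    uint32_t c1 = p[0];
    uint32_t c2 = p[1];
    uint32_t c3 = p[2];

    if ((c1 & 0xF0) != 0xE0 || (c2 & 0xC0) != 0x80 || (c3 & 0xC0) != 0x80)
        return UINT32_MAX;

    return ((c1 & 0x0F) << 12) | ((c2 & 0x3F) << 6) | (c3 & 0x3F);
}
```

Because the operands are already 32 bits wide, the shift by 12 cannot
overflow regardless of the width of int on the target.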

libstdc++-v3/ChangeLog:

	PR libstdc++/104875
	* src/c++11/codecvt.cc (read_utf8_code_point): Use char32_t to
	hold octets that will be left-shifted.
2022-03-14 13:08:02 +00:00
Jonathan Wakely 67a1cb2ad1 top-level: Fix comment about --enable-libstdcxx in configure
The custom option for enabling/disabling libstdc++ is not spelled the
same as the directory name:

AC_ARG_ENABLE(libstdcxx,
AS_HELP_STRING([--disable-libstdcxx],
  [do not build libstdc++-v3 directory])

The comment referring to it later uses the wrong name.

ChangeLog:

	* configure.ac: Fix incorrect option in comment.
	* configure: Regenerate.
2022-03-14 13:08:02 +00:00
Jakub Jelinek c879b92c30 c++: Reject __builtin_clear_padding on non-trivially-copyable types with one exception [PR102586]
As mentioned by Jason in the PR, non-trivially-copyable types (or non-POD
for purposes of layout?) can be base classes of derived classes in
which the padding in those non-trivially-copyable types can be reused for
some real data members or even the layout can change and data members can
be moved to other positions.
__builtin_clear_padding is right now used for multiple purposes:
in <atomic>, where it isn't used yet but was planned as the main spot,
it can be used for trivially copyable types only; ditto for std::bit_cast,
where we also use it.  It is used for OpenMP long double atomics too, but
long double is trivially copyable; and lastly for -ftrivial-auto-var-init=.

The following patch restricts the builtin to pointers to trivially-copyable
types, with the exception when it is called directly on an address of a
variable, in that case already the FE can verify it is the complete object
type and so it is safe to clear all the paddings in it.
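
For illustration, here is the always-accepted form, a call directly on the
address of a variable, written in C (where the trivially-copyable
restriction does not arise).  The struct and function names are
hypothetical, and the #else branch is a hand-written fallback for compilers
without the builtin that assumes the only padding lies between the two
members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct padded {
    char c;   /* typically followed by padding bytes */
    int  i;
};

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Fill an object with 0xFF, store its members, clear the padding, and
   return 1 if every byte between c and i is then zero.  */
int padding_cleared_demo(void)
{
    struct padded s;
    memset(&s, 0xFF, sizeof s);
    s.c = 'x';
    s.i = 42;

#if __has_builtin(__builtin_clear_padding)
    /* Address of a variable: the complete object type is known.  */
    __builtin_clear_padding(&s);
#else
    /* Fallback (assumption): zero the bytes between c and i by hand.  */
    memset((unsigned char *)&s + sizeof s.c, 0,
           offsetof(struct padded, i) - sizeof s.c);
#endif

    const unsigned char *bytes = (const unsigned char *)&s;
    for (size_t k = sizeof s.c; k < offsetof(struct padded, i); k++)
        if (bytes[k] != 0)
            return 0;

    /* The member values must be left untouched.  */
    assert(s.c == 'x' && s.i == 42);
    return 1;
}
```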

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/102586
gcc/
	* doc/extend.texi (__builtin_clear_padding): Clarify that for C++
	argument type should be pointer to trivially-copyable type unless it
	is address of a variable or parameter.
gcc/cp/
	* call.cc (build_cxx_call): Diagnose __builtin_clear_padding where
	first argument's type is pointer to non-trivially-copyable type unless
	it is address of a variable or parameter.
gcc/testsuite/
	* g++.dg/cpp2a/builtin-clear-padding1.C: New test.
2022-03-14 10:47:38 +01:00
Jakub Jelinek a010954cc1 i386: Fix up _mm_loadu_si{16,32} [PR99754]
These intrinsics are supposed to do an unaligned may_alias load
of a 16-bit or 32-bit value and store it as the first element of
a 128-bit integer vector, with all other elements cleared.

The current _mm_storeu_* implementation implements that correctly, uses
__*_u types to do the store and extracts the first element of a vector into
it.
But _mm_loadu_si{16,32} gets it all wrong.  It performs an aligned
non-may_alias load and because _mm_set_epi{16,32} has the args reversed,
it also inserts it into the last vector element instead of first.

The following patch fixes that.
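
The element-order part of the bug can be modeled portably.  This is a
sketch, not the emmintrin.h code: vec4i is a hypothetical stand-in for
__m128i viewed as four 32-bit lanes, and set_epi32 only mirrors the
highest-element-first argument order of _mm_set_epi32.  Both models load
via memcpy to stay well defined, so the alignment and aliasing part of the
old bug is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer vector seen as four
   32-bit lanes; e[0] is the first (lowest) element.  */
typedef struct { int32_t e[4]; } vec4i;

/* Mirrors the argument order of _mm_set_epi32: highest element first.  */
static vec4i set_epi32(int32_t e3, int32_t e2, int32_t e1, int32_t e0)
{
    vec4i v = { { e0, e1, e2, e3 } };
    return v;
}

/* Model of the old behavior: the loaded value is passed as the first
   argument and therefore lands in the last element.  */
vec4i loadu_si32_old_model(const void *p)
{
    int32_t x;
    assert(p != NULL);
    memcpy(&x, p, sizeof x);
    return set_epi32(x, 0, 0, 0);
}

/* Model of the fixed behavior: an unaligned, alias-safe load whose
   value lands in the first element, with all other elements cleared.  */
vec4i loadu_si32_fixed_model(const void *p)
{
    int32_t x;
    assert(p != NULL);
    memcpy(&x, p, sizeof x);
    return set_epi32(0, 0, 0, x);
}
```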

Note, while the Intrinsics guide for _mm_loadu_si32 says SSE2,
for _mm_loadu_si16 it says strangely SSE.  But the intrinsic
returns __m128i, which is only defined in emmintrin.h, and
_mm_set_epi16 is also only SSE2 and later in emmintrin.h.
Even clang defines it in emmintrin.h and ends up with inlining
failure when calling _mm_loadu_si16 from sse,no-sse2 function.
So, isn't that a bug in the intrinsic guide instead?

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

	PR target/99754
	* config/i386/emmintrin.h (_mm_loadu_si32): Put loaded value into
	first rather than last element of the vector, use __m32_u to do
	a really unaligned load, use just 0 instead of (int)0.
	(_mm_loadu_si16): Put loaded value into first rather than last
	element of the vector, use __m16_u to do a really unaligned load,
	use just 0 instead of (short)0.

	* gcc.target/i386/pr99754-1.c: New test.
	* gcc.target/i386/pr99754-2.c: New test.
2022-03-14 10:44:38 +01:00
Jakub Jelinek b424467166 Spelling fix - cannott -> cannot [PR104899]
This fixes typos and while changing that, also uses %< %> around attribute
names and fixes up formatting.

2022-03-14  Jakub Jelinek  <jakub@redhat.com>

	PR other/104899
	* config/bfin/bfin.cc (bfin_handle_longcall_attribute): Fix a typo
	in diagnostic message - cannott -> cannot.  Use %< and %> around
	names of attribute.  Avoid too long line.
	* range-op.cc (operator_logical_and::op1_range): Fix up a typo
	in comment - cannott -> cannot.  Use 2 spaces after . instead of one.
2022-03-14 10:40:47 +01:00
liuhongt 823b3b79cd Don't fold builtin into gimple when isa mismatches.
The patch fixes ICE in ix86_gimple_fold_builtin.

gcc/ChangeLog:

	PR target/104666
	* config/i386/i386-expand.cc
	(ix86_check_builtin_isa_match): New func.
	(ix86_expand_builtin): Move code to
	ix86_check_builtin_isa_match and call it.
	* config/i386/i386-protos.h
	(ix86_check_builtin_isa_match): Declare.
	* config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
	builtin into gimple when isa mismatches.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104666.c: New test.
2022-03-14 09:22:19 +08:00