Commit Graph

187781 Commits

Author SHA1 Message Date
Roger Sayle 74cb45e67d Correct implementation of wi::clz
As diagnosed with Jakub and Richard in the analysis of PR 102134, the
current implementation of wi::clz has incorrect/inconsistent behaviour.
As mentioned by Richard in comment #7, clz should (always) return zero
for negative values, but the current implementation can only return 0
when precision is a multiple of HOST_BITS_PER_WIDE_INT.  The fix is
simply to reorder/shuffle the existing tests.

2021-09-06  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* wide-int.cc (wi::clz): Reorder tests to ensure the result
	is zero for all negative values.
2021-09-06 22:50:45 +01:00
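The ordering issue described in the wi::clz commit above can be illustrated with a simplified single-word model. This is only a sketch, not the code in wide-int.cc: the name clz_model and the assumption that the value is held sign-extended in one 64-bit host word are hypothetical, and __builtin_clzll is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-word model: `val` holds a `precision`-bit value that
// is sign-extended to 64 bits, with precision in the range 1..64.
int clz_model(std::int64_t val, unsigned precision)
{
  assert(precision >= 1 && precision <= 64);

  // Test the sign first.  A negative value has its top bit set, so it has
  // no leading zeros, whatever the precision is.
  if (val < 0)
    return 0;

  // Zero has no set bit at all: every bit of the precision is a leading zero.
  if (val == 0)
    return static_cast<int>(precision);

  // For a positive value, count the leading zeros of the 64-bit host word
  // and drop the (64 - precision) bits that lie above the value's precision.
  int host_clz = __builtin_clzll(static_cast<unsigned long long>(val));
  return host_clz - static_cast<int>(64 - precision);
}
```

Checking the sign before any precision-dependent arithmetic is what makes the result zero for every negative input, including precisions such as 8 or 13 that are not a multiple of the host word size.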
Tobias Burnus 1bc66017c1 invoke.texi: Fix @opindex for -foffload-options
gcc/
	* doc/invoke.texi (-foffload-options): Fix @opindex.
2021-09-06 18:49:08 +02:00
Serge Belyshev 78b34cd8a8 gcc_update: use human readable name for revision string in gcc/REVISION
contrib/Changelog:

	* gcc_update: Derive human readable name for HEAD using git describe
	like "git gcc-descr" with short commit hash.  Drop "revision" from
	gcc/REVISION.
2021-09-06 15:34:39 +03:00
H.J. Lu 652bef70d3 x86: Add non-destructive source to @xorsign<mode>3_1
Add non-destructive source alternative to @xorsign<mode>3_1 for AVX.

gcc/

	PR target/89984
	* config/i386/i386-expand.c (ix86_split_xorsign): Use operands[2].
	* config/i386/i386.md (@xorsign<mode>3_1): Add non-destructive
	source alternative for AVX.

gcc/testsuite/

	PR target/89984
	* gcc.target/i386/pr89984-1.c: New test.
	* gcc.target/i386/pr89984-2.c: Likewise.
	* gcc.target/i386/xorsign-avx.c: Likewise.
2021-09-06 05:13:47 -07:00
liuhongt 93e6809459 Avoid FROM being overwritten in expand_fix.
For the conversion from _Float16 to int, if the corresponding optab
does not exist, the compiler will try the wider mode (SFmode here),
but when floatsfsi exists but FAIL, FROM will be rewritten, which
leads to a PR runtime error.

gcc/ChangeLog:

	PR middle-end/102182
	* optabs.c (expand_fix): Add from1 to avoid from being
	overwritten.

gcc/testsuite/ChangeLog:

	PR middle-end/102182
	* gcc.target/i386/pr101282.c: New test.
2021-09-06 18:57:46 +08:00
Thomas Schwinge 086bb917d6 'libgomp.c/target-43.c': '-latomic' for nvptx offloading
... to avoid a regression with recent
commit 090f0d78f1
"openmp: Improve expand_omp_atomic_pipeline":

    unresolved symbol __atomic_compare_exchange_1
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

	libgomp/
	* testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.
2021-09-06 11:51:13 +02:00
Eric Botcazou c0b03afeab Fix debug info for packed array types in Ada
Packed array types are sometimes represented with integer types under the
hood in Ada, but we nevertheless need to emit them as array types in the
debug info so we have the types.get_array_descr_info langhook for this
purpose; but it is not invoked from modified_type_die, which causes:

FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all
FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all(3)

in the GDB testsuite.

gcc/
	* dwarf2out.c (modified_type_die): Deal with all array types earlier
	and use local variable consistently throughout the function.
2021-09-06 11:18:06 +02:00
Jakub Jelinek 8a4602c2e0 match.pd: Fix up __builtin_*_overflow arg demotion [PR102207]
My earlier patch to demote arguments of __builtin_*_overflow unfortunately
caused a wrong-code regression.  The builtins operate on infinite precision
arguments, outer_prec > inner_prec signed -> signed, unsigned -> unsigned
promotions there are just repeating the sign or 0s and can be demoted,
similarly unsigned -> signed which also is repeating 0s, but as the
testcase shows, signed -> unsigned promotions need to be preserved (unless
we'd know the inner arguments can't be negative), because for negative
numbers such promotion sets the outer_prec -> inner_prec bits to 1 but the
bits above that to 0 in the infinite precision.

So, the following patch avoids the demotions for the signed -> unsigned
promotions.

2021-09-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/102207
	* match.pd: Don't demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW if they
	were promoted from signed to wider unsigned type.

	* gcc.dg/pr102207.c: New test.
2021-09-06 10:08:16 +02:00
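The signed -> unsigned case from the match.pd commit above can be reproduced at the source level. The sketch below is not the pr102207.c testcase; the function names are hypothetical, and it assumes unsigned long long is wider than int. It only shows that the builtin gives different answers for promoted and demoted operands.

```cpp
#include <cassert>

// Operands kept as written: signed int promoted to a wider unsigned type.
// For a negative int the promoted value is a huge positive number.
bool add_overflows_promoted(int a, int b)
{
  unsigned long long res;
  return __builtin_add_overflow(static_cast<unsigned long long>(a),
                                static_cast<unsigned long long>(b), &res);
}

// Operands demoted back to int, which is the transformation that must not
// be applied to signed -> unsigned promotions.
bool add_overflows_demoted(int a, int b)
{
  unsigned long long res;
  return __builtin_add_overflow(a, b, &res);
}

// Small self-check showing that the two forms disagree for a negative input.
void demo_overflow_difference()
{
  // 0xffff...ffff + 1 does not fit in unsigned long long.
  assert(add_overflows_promoted(-1, 1));
  // -1 + 1 == 0 fits, so the demoted form reports no overflow.
  assert(!add_overflows_demoted(-1, 1));
}
```

For non-negative inputs the two forms agree, which is why the demotion would remain valid if the inner arguments were known not to be negative.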
Andrew Pinski 564efbf400 Fix PR tree-optimization/63184: add simplification of (& + A) != (& + B)
These two testcases have been failing since GCC 5 but things
have improved such that adding a simplification to match.pd
for this case is easier than before.
In the end we have the following IR:
....
  _5 = &a[1] + _4;
  _7 = &a + _13;
  if (_5 != _7)

So we can fold the _5 != _7 into:
(&a[1] - &a) + _4 != _13

The subtraction is folded into constant by ptr_difference_const.
In this case, the full expression gets folded into a constant
and we are able to remove the if statement.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

	PR tree-optimization/63184
	* match.pd: Add simplification of pointer_diff of two pointer_plus
	with addr_expr in the first operand of each pointer_plus.
	Add simplification of ne/eq of two pointer_plus with addr_expr
	in the first operand of each pointer_plus.

gcc/testsuite/ChangeLog:

	PR tree-optimization/63184
	* c-c++-common/pr19807-2.c: Enable for all targets and remove the xfail.
	* c-c++-common/pr19807-3.c: Likewise.
2021-09-06 07:45:06 +00:00
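The fold described in the PR 63184 commit above can be checked at the source level. This is a sketch under assumptions: the array a and both function names are hypothetical, and the offsets are byte offsets as in the POINTER_PLUS_EXPR form shown in the IR.

```cpp
#include <cassert>
#include <cstddef>

static int a[16];

// Original form: compare two pointers built from byte offsets.
bool ne_pointers(std::size_t off1, std::size_t off2)
{
  // Keep both pointers inside (or one past the end of) the array.
  assert(off1 <= sizeof(a) - sizeof(int) && off2 <= sizeof(a));
  const char *p5 = reinterpret_cast<const char *>(&a[1]) + off1;  // _5 = &a[1] + _4
  const char *p7 = reinterpret_cast<const char *>(&a) + off2;     // _7 = &a + _13
  return p5 != p7;
}

// Folded form: (&a[1] - &a) + _4 != _13, where the subtraction of the two
// addresses is a compile-time constant (sizeof(int) here).
bool ne_offsets(std::size_t off1, std::size_t off2)
{
  const std::ptrdiff_t base_diff =
    reinterpret_cast<const char *>(&a[1]) - reinterpret_cast<const char *>(&a);
  return base_diff + static_cast<std::ptrdiff_t>(off1)
         != static_cast<std::ptrdiff_t>(off2);
}
```

Once both offsets are also constants, the second form contains no pointers at all, so the whole comparison can be evaluated at compile time.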
liuhongt 637dfcf43c Explicitly add -msse2 to compile HF related libgcc source file.
For 32-bit libgcc configure w/o sse2, there would be an error since
GCC only supports _Float16 under sse2. Explicitly add -msse2 for those
HF related libgcc functions, so users can still link them w/ the
upper configuration.

libgcc/ChangeLog:

	* Makefile.in: Adjust to support specific CFLAGS for each
	libgcc source file.
	* config/i386/64/t-softfp: Explicitly add -msse2 for HF
	related libgcc source files.
	* config/i386/t-softfp: Ditto.
	* config/i386/_divhc3.c: New file.
	* config/i386/_mulhc3.c: New file.
2021-09-06 15:13:14 +08:00
Richard Biener a3fb781d4b tree-optimization/102176 - locally compute participating SLP stmts
This performs local re-computation of participating scalar stmts
in BB vectorization subgraphs to allow precise computation of
liveness of scalar stmts after vectorization and thus precise
costing.  This treats all extern defs as live but continues
to optimistically handle scalar defs that we think we can handle
by lane-extraction even though that can still fail late during
code-generation.

2021-09-02  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102176
	* tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts):
	New function.
	(vect_bb_slp_scalar_cost): Use the computed set of
	vectorized scalar stmts instead of relying on the out-of-date
	and not accurate PURE_SLP_STMT.
	(vect_bb_vectorization_profitable_p): Compute the set
	of vectorized scalar stmts.
2021-09-06 08:03:49 +02:00
GCC Administrator 66bba4dc26 Daily bump. 2021-09-06 00:16:18 +00:00
Ian Lance Taylor 74df79ec3e libgo: update to final Go 1.17 release
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343729
2021-09-05 16:10:51 -07:00
Aldy Hernandez a827909537 Make the path solver's range_of_stmt() handle all statements.
The path solver's range_of_stmt() was handcuffed to only fold
GIMPLE_COND statements, since those were the only statements the
backward threader needed to resolve.  However, there is no need for this
restriction, as the folding code is perfectly capable of folding any
statement.

This can be the case when trying to fold other statements in the final
block of a path (for instance, in the forward threader as it tries to
fold candidate statements along a path).

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::range_of_stmt): Remove
	GIMPLE_COND special casing.
	(path_range_query::range_defined_in_block): Use range_of_stmt
	instead of calling fold_range directly.
2021-09-05 17:27:42 +02:00
Aldy Hernandez 90ef153527 Add an unreachable_path_p method to path_range_query.
Keeping track of unreachable calculations while traversing a path is
useful to determine edge reachability, among other things.  We've been
doing this ad-hoc in the backwards threader, so this provides a cleaner
way of accessing the information.

This patch also makes it easier to compare different threading
implementations, in some upcoming work.  For example, it's currently
difficult to gauge how well we're doing compared to the forward threader,
because it can thread paths that are obviously unreachable.  This
provides a way of discarding those paths.

Note that I've opted to keep unreachable_path_p() out-of-line, because I
have local changes that will enhance this method.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::range_of_expr): Set
	m_undefined_path when appropriate.
	(path_range_query::internal_range_of_expr): Copy from range_of_expr.
	(path_range_query::unreachable_path_p): New.
	(path_range_query::precompute_ranges): Set m_undefined_path.
	* gimple-range-path.h (path_range_query::unreachable_path_p): New.
	(path_range_query::internal_range_of_expr): New.
	* tree-ssa-threadbackward.c (back_threader::find_taken_edge_cond):
	Use unreachable_path_p.
2021-09-05 13:43:51 +02:00
Aldy Hernandez cbeeadff4c Clean up registering of paths in backwards threader.
All callers to maybe_register_path() call find_taken_edge() beforehand
and pass the edge as an argument.  There's no reason to repeat this
at each call site.

This is a clean-up in preparation for some other enhancements to the
backwards threader.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c (back_threader::maybe_register_path):
	Remove argument and call find_taken_edge.
	(back_threader::resolve_phi): Do not calculate taken edge before
	calling maybe_register_path.
	(back_threader::find_paths_to_names): Same.
2021-09-05 11:46:22 +02:00
Jeff Law b27416a7a9 Improve handling of C bit for setcc insns
gcc/
	* config/h8300/h8300.md (QHSI2 mode iterator): New mode iterator.
	* config/h8300/testcompare.md (store_c): Update name, use new
	QHSI2 iterator.
	(store_neg_c, store_shifted_c): New patterns.
2021-09-05 00:08:34 -04:00
GCC Administrator 617c9ce232 Daily bump. 2021-09-05 00:16:17 +00:00
GCC Administrator 7b7395409c Daily bump. 2021-09-04 00:16:38 +00:00
Segher Boessenkool 2484f7a4b0 rs6000: Don't use r12 for CR save on ELFv2 (PR102107)
CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.

It is not completely clear to me that we can always use r11 here, but
it does seem safe, there is checking code (to detect conflicts here),
and it is stage 1.  So here goes.

2021-09-03  Segher Boessenkool <segher@kernel.crashing.org>

	PR target/102107
	* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): On ELFv2 use r11
	instead of r12 for CR save, in all cases.
2021-09-03 21:04:23 +00:00
Iain Sandoe addf167a23 coroutines: Support for debugging implementation state.
Some of the state that is associated with the implementation
is of interest to a user debugging a coroutine.  In particular
items such as the suspend point, promise object, and current
suspend point.

These variables live in the coroutine frame, but we can inject
proxies for them into the outermost bind expression of the
coroutine.  Such variables are automatically moved into the
coroutine frame (if they need to persist across a suspend
expression).  Placing the proxies thus allows the user to
inspect them by name in the debugger.

To implement this, we ensure that (at the outermost scope) the
frame entries are not mangled (coroutine frame variables are
usually mangled with scope nesting information so that they do
not clash).  We can safely avoid doing this for the outermost
scope so that we can map frame entries directly to the variables.

This is a partial contribution to debug support (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/cp/ChangeLog:

	* coroutines.cc (register_local_var_uses): Do not mangle
	frame entries for the outermost scope.  Record the outer
	scope as nesting depth 0.
2021-09-03 19:42:43 +01:00
Iain Sandoe a45a7ecdf3 coroutines: Add a helper for creating local vars.
This is primarily code factoring, but we take this opportunity
to rename some of the implementation variables (which we intend
to expose to debugging) so that they are in the implementation
namespace.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/cp/ChangeLog:

	* coroutines.cc (coro_build_artificial_var): New.
	(build_actor_fn): Use var builder, rename vars to use
	implementation namespace.
	(coro_rewrite_function_body): Likewise.
	(morph_fn_to_coro): Likewise.
2021-09-03 19:42:31 +01:00
Iain Sandoe 88974974d8 coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.
Variables that need to persist over suspension expressions
must be preserved by being copied into the coroutine frame.

The initial implementations do this manually in the transform
code.  However, that has various disadvantages - including
that the debug connections are lost between the original var
and the frame copy.

The revised implementation makes use of DECL_VALUE_EXPRs to
contain the frame offset expressions, so that the original
var names are preserved in the code.

This process is also applied to the function parms which are
always copied to the frame.  In this case the decls need to be
copied since they are used in two different contexts during
the re-write (in the building of the ramp function, and in
the actor function itself).

This will assist in improvement of debugging (PR 99215).

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/cp/ChangeLog:

	* coroutines.cc (transform_local_var_uses): Record
	frame offset expressions as DECL_VALUE_EXPRs instead of
	rewriting them.
2021-09-03 19:42:09 +01:00
Andrew Pinski 6b69bf5729 Fix target/102173 ICE after error recovery
After the recent r12-3278-823685221de986a change, the testcase
gcc.target/aarch64/sve/acle/general-c/type_redef_1.c started
to ICE as the code was not ready for error_mark_node in the
type.  This fixes that and the testcase now passes.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins.cc (register_vector_type):
	Handle error_mark_node as the type of the type_decl.
2021-09-03 16:56:33 +00:00
Andrew Pinski 98f1dd0212 Fix some GC issues in the aarch64 back-end.
I got some ICEs in my latest testing while running the libstdc++ testsuite.
I had noticed the problem was connected to types and had just touched the
builtins code but nothing which could have caused this and I looked for
some types/variables that were not being marked with GTY.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.c (struct aarch64_simd_type_info):
	Mark with GTY.
	(aarch64_simd_types): Likewise.
	(aarch64_simd_intOI_type_node): Likewise.
	(aarch64_simd_intCI_type_node): Likewise.
	(aarch64_simd_intXI_type_node): Likewise.
	* config/aarch64/aarch64.h (aarch64_fp16_type_node): Likewise.
	(aarch64_fp16_ptr_type_node): Likewise.
	(aarch64_bf16_type_node): Likewise.
	(aarch64_bf16_ptr_type_node): Likewise.
2021-09-03 16:56:33 +00:00
Aldy Hernandez 8af8abfbba Implement POINTER_DIFF_EXPR entry in range-op.
I've seen cases in the upcoming jump threader enhancements where we see
a difference of two pointers that are known to be equivalent, and yet we
fail to return 0 for the range.  This is because we have no working
range-op entry for POINTER_DIFF_EXPR.  The entry we currently have is
a mere placeholder to avoid ignoring POINTER_DIFF_EXPR's so
adjust_pointer_diff_expr() could get a whack at it here:

//	def = __builtin_memchr (arg, 0, sz)
//	n = def - arg
//
// The range for N can be narrowed to [0, PTRDIFF_MAX - 1].

This patch adds the relational magic to range-op, which we can just
steal from the minus_expr code.

gcc/ChangeLog:

	* range-op.cc (operator_minus::op1_op2_relation_effect): Abstract
	out to...
	(minus_op1_op2_relation_effect): ...here.
	(class operator_pointer_diff): New.
	(operator_pointer_diff::op1_op2_relation_effect): Call
	minus_op1_op2_relation_effect.
	(integral_table::integral_table): Add entry for POINTER_DIFF_EXPR.
2021-09-03 18:40:02 +02:00
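The memchr idiom quoted in the POINTER_DIFF_EXPR commit above looks like this at the source level. This is a sketch only; offset_of_nul is a hypothetical name, and the -1 return for "not found" is a choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns n = def - arg when a zero byte is found within the first sz
// bytes of arg, or -1 when there is none.
std::ptrdiff_t offset_of_nul(const char *arg, std::size_t sz)
{
  assert(arg != nullptr);
  const void *def = std::memchr(arg, 0, sz);
  if (def == nullptr)
    return -1;
  // def points into [arg, arg + sz), so the difference cannot be negative
  // and is always smaller than sz.
  return static_cast<const char *>(def) - arg;
}
```

When def and arg are known to be the same pointer, the difference is exactly 0, which is the kind of relation-based result the new range-op entry is meant to provide.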
Patrick Palka 47543e5f9d c++: shortcut bad convs during overload resolution [PR101904]
In the context of overload resolution we have the notion of a "bad"
argument conversion, which is a conversion that "would be permitted
with a bending of the language standards", and we handle such bad
conversions specially.  In particular, we rank a bad conversion as
better than no conversion but worse than a good conversion, and a bad
conversion doesn't necessarily make a candidate unviable.  With the
flag -fpermissive, we permit the situation where overload resolution
selects a candidate that contains a bad conversion (which we call a
non-strictly viable candidate).  And without the flag, the caller
of overload resolution usually issues a distinct permerror in this
situation instead.

One consequence of this de facto behavior is that in order to distinguish
a non-strictly viable candidate from an unviable candidate, if we
encounter a bad argument conversion during overload resolution we must
keep converting subsequent arguments because a subsequent conversion
could render the candidate unviable instead of just non-strictly viable.
But checking subsequent arguments can force template instantiations and
result in otherwise avoidable hard errors.  And in particular, all
'this' conversions are at worst bad, so this means the const/ref-qualifiers
of a member function can't be used to prune a candidate quickly, which
is the subject of the mentioned PR.

This patch tries to improve the situation without changing the de facto
output of add_candidates.  Specifically, when considering a candidate
during overload resolution this patch makes us shortcut argument
conversion checking upon encountering the first bad conversion
(tentatively marking the candidate as non-strictly viable, though it
could ultimately be unviable) under the assumption that we'll eventually
find a strictly viable candidate anyway (which renders moot the
distinction between non-strictly viable and unviable, since both are
worse than a strictly viable candidate).  If this assumption turns out
to be false, we'll fully reconsider the candidate under the de facto
behavior (without the shortcutting) so that all its conversions are
computed.

So in the best case (there's a strictly viable candidate), we avoid
some argument conversions and/or template argument deduction that may
cause a hard error.  In the worst case (there's no such candidate), we
have to redundantly consider some candidates twice.  (In a previous
version of the patch, to avoid this redundant checking I created a new
"deferred" conversion type that represents a conversion that is yet to
be computed, and instead of reconsidering a candidate I just realized
its deferred conversions.  But it doesn't seem this redundancy is a
significant performance issue to justify the added complexity of this
other approach.)

	PR c++/101904

gcc/cp/ChangeLog:

	* call.c (build_this_conversion): New function, split out from
	add_function_candidate.
	(add_function_candidate): New parameter shortcut_bad_convs.
	Document it.  Use build_this_conversion.  Stop at the first bad
	argument conversion when shortcut_bad_convs is true.
	(add_template_candidate_real): New parameter shortcut_bad_convs.
	Use build_this_conversion to check the 'this' conversion before
	attempting deduction.  When the rejection reason code is
	rr_bad_arg_conversion, pass -1 instead of 0 as the viable
	parameter to add_candidate.  Pass 'convs' to add_candidate.
	(add_template_candidate): New parameter shortcut_bad_convs.
	(add_template_conv_candidate): Pass false as shortcut_bad_convs
	to add_template_candidate_real.
	(add_candidates): Prefer to shortcut bad conversions during
	overload resolution under the assumption that we'll eventually
	see a strictly viable candidate.  If this assumption turns out
	to be false, re-process the non-strictly viable candidates
	without shortcutting those bad conversions.

gcc/testsuite/ChangeLog:

	* g++.dg/template/conv17.C: New test.
2021-09-03 11:33:41 -04:00
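The two-pass strategy described in the overload resolution commit above can be modelled in a few lines. This is a toy sketch, not the code in call.c: the Conv enum, classify and resolve are hypothetical, the viability codes (1 strictly viable, -1 non-strictly viable, 0 unviable) mirror the wording of the ChangeLog, and picking the first candidate of the best class stands in for the real ranking rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result of converting one argument.
enum class Conv { Good, Bad, None };

// Returns 1 (strictly viable), -1 (non-strictly viable) or 0 (unviable).
// `work` counts how many argument conversions were examined.
int classify(const std::vector<Conv> &args, bool shortcut_bad_convs, int &work)
{
  int viable = 1;
  for (Conv c : args)
    {
      ++work;
      if (c == Conv::None)
        return 0;
      if (c == Conv::Bad)
        {
          viable = -1;
          if (shortcut_bad_convs)
            return viable;  // tentative: later arguments are not examined
        }
    }
  return viable;
}

// Returns the index of the chosen candidate, or -1 if none is viable.
int resolve(const std::vector<std::vector<Conv>> &cands, int &work)
{
  std::vector<int> viable(cands.size());
  bool any_strict = false;

  // First pass: stop at the first bad conversion of each candidate.
  for (std::size_t i = 0; i < cands.size(); ++i)
    {
      viable[i] = classify(cands[i], true, work);
      any_strict = any_strict || viable[i] == 1;
    }

  // Second pass, only if the optimistic assumption failed: fully
  // reconsider the tentatively non-strictly viable candidates.
  if (!any_strict)
    for (std::size_t i = 0; i < cands.size(); ++i)
      if (viable[i] == -1)
        {
          viable[i] = classify(cands[i], false, work);
          assert(viable[i] != 1);  // a bad conversion never becomes good
        }

  int best = -1;
  for (std::size_t i = 0; i < cands.size(); ++i)
    {
      if (viable[i] == 1)
        return static_cast<int>(i);
      if (viable[i] == -1 && best < 0)
        best = static_cast<int>(i);
    }
  return best;
}
```

In this model a candidate set with one strictly viable member is resolved without looking past the first bad conversion of the others, while a set with none pays for the first pass plus a full second pass, matching the best and worst cases described in the commit message.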
Iain Sandoe 3ccb523bdd libgcc, soft-float: Fix strong_alias macro use for Darwin.
Darwin does not support strong symbol aliases and a work-
around is provided in sfp-machine.h where a second function
is created that simply calls the original.  However this
needs the arguments to the synthesized function to track
the mode of the original function.

So the fix here is to match known floating point modes from
the incoming function and apply the one found to the new
function args.

The matching is highly specific to the current set of modes
and will need adjusting should more cases be added.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

libgcc/ChangeLog:

	* config/i386/sfp-machine.h (alias_HFtype, alias_SFtype
	alias_DFtype, alias_TFtype): New.
	(ALIAS_SELECTOR): New.
	(strong_alias): Use __typeof and a _Generic selector to
	provide the type to the synthesized function.
2021-09-03 16:25:40 +01:00
Aldy Hernandez 0100555037 Do not assume loop header threading in backward threader.
The registry's thread_through_all_blocks() has a may_peel_loop_headers
argument.  When refactoring the backward threader code, I removed this
argument for the local passthru method because it was always TRUE.  This
may not necessarily be true in the future, if the backward threader is
called from another context.  This patch removes the default definition,
in favor of an argument that is exactly the same as the identically
named function in tree-ssa-threadupdate.c.  I think this also makes it
less confusing when looking at both methods across the source base.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c (back_threader::thread_through_all_blocks):
	Add may_peel_loop_headers.
	(back_threader_registry::thread_through_all_blocks): Same.
	(try_thread_blocks): Pass may_peel_loop_headers argument.
	(pass_early_thread_jumps::execute): Same.
2021-09-03 17:22:04 +02:00
Aldy Hernandez 62099645c2 Abstract PHI and forwarder block checks in jump threader.
This patch abstracts out a couple common idioms in the forward
threader that I found useful while navigating the code base.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadedge.c (has_phis_p): New.
	(forwarder_block_p): New.
	(potentially_threadable_block): Call forwarder_block_p.
	(jump_threader::thread_around_empty_blocks): Call has_phis_p.
	(jump_threader::thread_through_normal_block): Call
	forwarder_block_p.
2021-09-03 17:19:54 +02:00
Aldy Hernandez 779275c083 Improve backwards threader debugging dumps.
This patch adds debugging helpers to the backwards threader.  I have
also noticed that profitable_path_p() can bail early on paths that
cross loops and leave the dump of blocks incomplete.  Fixed as
well.

Unfortunately the new methods cannot be marked const, because we call
the solver's dump which is not const.  I believe this was because the
ranger dump calls m_cache.block_range().  This could probably use a
cleanup at a later time.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadbackward.c (back_threader::dump): New.
	(back_threader::debug): New.
	(back_threader_profitability::profitable_path_p): Dump blocks
	even if we are bailing early.
2021-09-03 17:19:54 +02:00
Aldy Hernandez a3ff15afb4 Dump reason why threads are being cancelled and abstract code.
We are inconsistent on dumping out reasons why a thread was canceled.
This makes debugging jump threading problems harder because paths can be
canceled with no reason given.  This patch abstracts out the thread
canceling code and adds a reason for every cancellation.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadupdate.c (cancel_thread): New.
	(jump_thread_path_registry::thread_block_1): Use cancel_thread.
	(jump_thread_path_registry::mark_threaded_blocks): Same.
	(jump_thread_path_registry::register_jump_thread): Same.
2021-09-03 17:19:53 +02:00
Jason Merrill 5ec4990bc7 c++: Avoid bogus -Wunused with recent change
My change to make limit_bad_template_recursion avoid instantiating members
of erroneous classes produced a bogus "used but not defined" warning for
23_containers/unordered_set/instantiation_neg.cc; it's not defined because
we decided not to instantiate it.  So we need to suppress that warning.

gcc/cp/ChangeLog:

	* pt.c (limit_bad_template_recursion): Suppress -Wunused for decls
	we decide not to instantiate.
2021-09-03 11:12:09 -04:00
Tobias Burnus 943c65c449 Fortran: Fix Bind(C) char-len check, add ptr-contiguous check
Add F2018, 18.3.6 (5), pointer + contiguous is not permitted
check for dummies in BIND(C) procs.

Fix misreading of F2018, 18.3.4/18.3.5 + 18.3.6 (5) regarding
character dummies passed as byte stream to a bind(C) dummy arg:
Per F2018, 18.3.1 only len=1 is interoperable (since F2003).
F2008 added 'constant expression' for vars (F2018, 18.3.4/18.3.5),
applicable to dummy args per F2018, C1554. I misread this such
that len > 1 is permitted if len is a constant expr.

While the latter would work as character len=1 a(10) and len=2 a(5)
have the same storage sequence and len is fixed, it is still invalid.
Hence, it is now rejected again.

gcc/fortran/ChangeLog:

	* decl.c (gfc_verify_c_interop_param): Reject pointer with
	CONTIGUOUS attributes as dummy arg. Reject character len > 1
	when passed as byte stream.

gcc/testsuite/ChangeLog:

	* gfortran.dg/bind_c_char_6.f90: Update dg-error.
	* gfortran.dg/bind_c_char_7.f90: Likewise.
	* gfortran.dg/bind_c_char_8.f90: Likewise.
	* gfortran.dg/iso_c_binding_char_1.f90: Likewise.
	* gfortran.dg/pr32599.f03: Likewise.
	* gfortran.dg/bind_c_char_9.f90: Comment testcase bits which are
	implementable but not valid F2018.
	* gfortran.dg/bind_c_contiguous.f90: New test.
2021-09-03 16:28:04 +02:00
Aldy Hernandez 2fcfc03459 Avoid using unavailable objects in jt_state.
The jump threading state is about to get more interesting, and it may
get with a ranger or with the const_copies/etc helpers.  This patch
makes sure we have an object before we attempt to call push_marker or
pop_to_marker.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadedge.c (jt_state::push): Only call methods for
	which objects are available.
	(jt_state::pop): Same.
	(jt_state::register_equiv): Same.
	(jt_state::register_equivs_on_edge): Same.
2021-09-03 15:47:34 +02:00
Aldy Hernandez b237eb9dfd Do not release state location until after path registry.
We are popping state and then calling the registry code.  This causes
the registry to have incorrect information.  This isn't visible in
current trunk, but will be an issue when I submit further enhancements
to the threading code.  However, it is a cleanup on its own so I am
pushing it now.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadedge.c (jump_threader::thread_across_edge):
	Move pop until after a thread is registered.
2021-09-03 15:42:22 +02:00
Aldy Hernandez 9fa5ba4c56 Add debug helper for jump thread paths.
Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-threadupdate.c (debug): New.
2021-09-03 15:35:46 +02:00
Aldy Hernandez 7200a4424c RAII class to change dump_file.
The function dump_ranger() shows everything the ranger knows at the
current time.  To do this, we tickle all the statements to force ranger
to provide as much information as possible.  During this process, the
relation code will dump status out to the dump_file, whereas in
dump_ranger, we want to dump it out to a specific file (most likely
stderr).  This patch changes the dump_file through the life of
dump_ranger() and resets it when it is done.

This patch only affects dump/debugging code.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-trace.cc (push_dump_file::push_dump_file): New.
	(push_dump_file::~push_dump_file): New.
	(dump_ranger): Change dump_file temporarily while dumping
	ranger.
	* gimple-range-trace.h (class push_dump_file): New.
2021-09-03 15:30:57 +02:00
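The save-and-restore pattern described in the RAII commit above can be sketched as follows. This is not the push_dump_file class from gimple-range-trace.cc: g_dump_file is a stand-in for GCC's global dump_file, and scoped_dump_file and dump_goes_to are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>

// Stand-in for the global stream that dumping code writes to.
static std::FILE *g_dump_file = nullptr;

// Redirects g_dump_file for the lifetime of the object and restores the
// previous value on destruction, including on early returns.
class scoped_dump_file
{
public:
  explicit scoped_dump_file(std::FILE *f) : m_saved(g_dump_file)
  {
    g_dump_file = f;
  }
  ~scoped_dump_file() { g_dump_file = m_saved; }

  scoped_dump_file(const scoped_dump_file &) = delete;
  scoped_dump_file &operator=(const scoped_dump_file &) = delete;

private:
  std::FILE *m_saved;
};

// Hypothetical dumper: everything that consults g_dump_file inside this
// function sees `out`.  Returns whether the redirection was in effect.
bool dump_goes_to(std::FILE *out)
{
  assert(out != nullptr);
  scoped_dump_file guard(out);
  std::fprintf(g_dump_file, "dumping with redirected dump file\n");
  return g_dump_file == out;
}
```

Because the restore happens in the destructor, callers do not need a matching "reset" call on every exit path of the dumping function.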
Aldy Hernandez 4db10cbf21 Add function name when dumping ranger contents.
These are minor cleanups to the dumping code.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-trace.cc (debug_seed_ranger): Remove static.
	(dump_ranger): Dump function name.
2021-09-03 15:30:57 +02:00
Aldy Hernandez 410e874263 Use non-null knowledge in path_range_query.
This patch improves ranges for pointers we are interested in along a path, by
using the non-null class from the ranger.  This allows us to thread more
paths with minimal effort.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::range_defined_in_block):
	Adjust for non-null.
	(path_range_query::adjust_for_non_null_uses): New.
	(path_range_query::precompute_ranges): Call
	adjust_for_non_null_uses.
	* gimple-range-path.h: Add m_non_null and
	adjust_for_non_null_uses.
2021-09-03 15:30:57 +02:00
Aldy Hernandez 1342891464 Improve path_range_query dumps.
Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::dump): Dump path
	length.
	(path_range_query::precompute_ranges): Dump entire path.
2021-09-03 15:30:57 +02:00
Aldy Hernandez abcd237363 Implement relation_oracle::debug.
Tested on x86-64 Linux.

gcc/ChangeLog:

	* value-relation.cc (relation_oracle::debug): New.
	* value-relation.h (relation_oracle::debug): New.
2021-09-03 15:30:56 +02:00
Aldy Hernandez d2e278e26a Remove unnecessary include from tree-ssa-loop-ch.c
Tested on x86-64 Linux.

gcc/ChangeLog:

	* tree-ssa-loop-ch.c: Remove unnecessary include file.
2021-09-03 15:30:56 +02:00
Aldy Hernandez 5db93cd083 Skip statements with no BB in ranger.
The function postfold_gcond_edges() registers relations coming out of a
GIMPLE_COND.  With upcoming changes, we may be called with statements
not in the IL (for example, dummy statements created by the
forward threader).  This patch avoids breakage by exiting if the
statement does not have a defining basic block.  There is a similar
change to the path solver.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-fold.cc (fold_using_range::postfold_gcond_edges):
	Skip statements with no defining BB.
	* gimple-range-path.cc (path_range_query::range_defined_in_block):
	Do not get confused by statements with no defining BB.
2021-09-03 15:30:56 +02:00
Aldy Hernandez bccf4b88e1 Improve support for IMAGPART_EXPR and REALPART_EXPR in ranger.
Currently we adjust statements containing an IMAGPART_EXPR if the
defining statement was one of a few built-ins known to return boolean
types.  We can also adjust statements for both IMAGPART_EXPR and
REALPART_EXPR where the defining statement is a constant.

This patch adds such support, and cleans up the code a bit.

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-fold.cc (adjust_imagpart_expr): Move from
	gimple_range_adjustment.  Add support for constants.
	(adjust_realpart_expr): New.
	(gimple_range_adjustment): Move IMAGPART_EXPR code to
	adjust_imagpart_expr.
	* range-op.cc (integral_table::integral_table): Add entry for
	REALPART_CST.
2021-09-03 15:30:56 +02:00
Tobias Burnus 4ce90454c2 libgomp.*/error-1.{c,f90}: Fix dg-output newline pattern
libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/error-1.c: Use \r\n not \n\r in
	dg-output.
	* testsuite/libgomp.fortran/error-1.f90: Likewise.
2021-09-03 15:27:00 +02:00
Eric Botcazou 8d34ffb4e8 Improve compatibility of -fdump-ada-spec with warnings
This makes sure that the style and warning settings used in the
C/C++ bindings generated by -fdump-ada-spec do not leak into the
units that use them.

gcc/c-family/
	* c-ada-spec.c (dump_ads): Generate pragmas to disable style checks
	and -gnatwu warning for the package specification.
2021-09-03 11:19:23 +02:00
Jakub Jelinek 090f0d78f1 openmp: Improve expand_omp_atomic_pipeline
When the __atomic_* builtins were introduced, omp-expand.c (omp-low.c
at that point) was adjusted in several spots to use the atomic builtins
instead of the sync builtins.  expand_omp_atomic_pipeline was not
adjusted, because the __atomic_compare_exchange_* APIs take the address
of the expected argument, so it kept using __sync_val_compare_and_swap_*.
That means it always uses seq_cst, though.
This patch changes it to use the ATOMIC_COMPARE_EXCHANGE ifn, into which
gimple-fold folds __atomic_compare_exchange_*; that ifn also passes
expected directly.
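
As a user-level sketch of the difference (not the code the patch generates), the two compare-and-swap loops below use the documented GCC builtins.  The function names fetch_add_sync and fetch_add_atomic are hypothetical, and the relaxed memory order is chosen only to show that the order can be specified.

```c
#include <stdbool.h>

/* Sync style: the expected value is passed by value and the operation
   is always a full barrier.  Returns the value seen before the add.  */
static int
fetch_add_sync (int *p, int delta)
{
  int old = __atomic_load_n (p, __ATOMIC_RELAXED);
  int seen;
  while ((seen = __sync_val_compare_and_swap (p, old, old + delta)) != old)
    old = seen;  /* Retry with the value another thread stored.  */
  return old;
}

/* Atomic style: expected is passed by address and updated on failure;
   the memory orders are explicit arguments.  */
static int
fetch_add_atomic (int *p, int delta)
{
  int old = __atomic_load_n (p, __ATOMIC_RELAXED);
  while (!__atomic_compare_exchange_n (p, &old, old + delta, true,
				       __ATOMIC_RELAXED, __ATOMIC_RELAXED))
    ;  /* On failure, old already holds the current value.  */
  return old;
}
```

Both helpers return the previous value, so with x == 5, fetch_add_sync (&x, 3) returns 5 and leaves x == 8.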

2021-09-03  Jakub Jelinek  <jakub@redhat.com>

	* omp-expand.c (expand_omp_atomic_pipeline): Use
	IFN_ATOMIC_COMPARE_EXCHANGE instead of
	BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_? so that memory order
	can be provided.
2021-09-03 09:54:58 +02:00
Jakub Jelinek e902136b31 c++, abi: Set DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on C++ zero width bitfields [PR102024]
The removal of the remove_zero_width_bitfields function and its call from
the C++ FE layout_class_type (which I did in the P0466R5
layout-compatible helper intrinsics patch, so that the FE can actually
determine what is and isn't layout-compatible according to the spec)
unfortunately changed the ABI on various platforms.
The C FE has been keeping zero-width bitfields in the types, while
the C++ FE has been removing them after structure layout, so in various
cases when passing such structures in registers we had different ABI
between C and C++.
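
A minimal sketch of the kind of type involved, with hypothetical struct and function names.  Whether the unnamed zero-width member affects how the aggregate is passed in registers is target-specific, so the helper only reports the layout and asserts nothing about the calling convention.

```c
#include <stddef.h>
#include <stdio.h>

/* Two floats with no bit-field between them.  */
struct two_floats { float a; float b; };

/* The same members separated by a zero-width bit-field.  The C FE keeps
   this FIELD_DECL in the type; the C++ FE used to remove it after
   layout.  */
struct two_floats_zw { float a; int : 0; float b; };

/* Hypothetical helper: print size and member offset for both types.  */
static void
print_layouts (void)
{
  printf ("two_floats:    size %zu, b at %zu\n",
	  sizeof (struct two_floats), offsetof (struct two_floats, b));
  printf ("two_floats_zw: size %zu, b at %zu\n",
	  sizeof (struct two_floats_zw), offsetof (struct two_floats_zw, b));
}
```

The in-memory layout is often identical for both structs; the difference discussed here is in argument passing and returning, which a backend decides by walking the fields.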

While both the C and C++ FE had some code to remove zero width bitfields
after structure layout, in both FEs it was buggy and didn't really remove
any.  In the C FE that code was removed later on, while in the C++ FE
it was actually fixed for GCC 4.5 in PR42217, so the C++ FE started
to remove those bitfields.

The following patch doesn't change anything ABI-wise, but allows the
targets to decide what to do, emit -Wpsabi warnings etc.
Non-C++ zero width bitfields will be seen by the backends as normal
zero width bitfields; C++ zero width bitfields that were previously
removed will have the DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD flag set.
I've reused the DECL_FIELD_ABI_IGNORED flag, which is only used on non-bitfield
FIELD_DECLs right now, but the macros now check the DECL_BIT_FIELD flag.

Each backend can then decide what it wants: whether to keep a
different ABI between C and C++ as in GCC 11 and older (i.e. incompatible
with G++ <= 4.4, compatible with G++ 4.5 .. 11; for that it would
ignore all DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD FIELD_DECLs in the
aggregate passing/returning decisions), whether to never
ignore zero width bitfields (no changes needed for that case, except that
perhaps a -Wpsabi warning should be added, for which
DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD can be tested), or whether to always
ignore zero width bitfields (I think e.g. riscv in GCC 10+ does that).

All this patch does is set the flag which the backends can then use.

2021-09-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/102024
gcc/
	* tree.h (DECL_FIELD_ABI_IGNORED): Changed into rvalue only macro
	that is false if DECL_BIT_FIELD.
	(SET_DECL_FIELD_ABI_IGNORED, DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD,
	SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD): Define.
	* tree-streamer-out.c (pack_ts_decl_common_value_fields): For
	DECL_BIT_FIELD stream DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead
	of DECL_FIELD_ABI_IGNORED.
	* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
	SET_DECL_FIELD_ABI_IGNORED instead of writing to
	DECL_FIELD_ABI_IGNORED and for DECL_BIT_FIELD use
	SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead.
	* lto-streamer-out.c (hash_tree): For DECL_BIT_FIELD hash
	DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD instead of DECL_FIELD_ABI_IGNORED.
gcc/cp/
	* class.c (build_base_field): Use SET_DECL_FIELD_ABI_IGNORED
	instead of writing to DECL_FIELD_ABI_IGNORED.
	(layout_class_type): Likewise.  In the place where zero-width
	bitfields used to be removed, use
	SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on those fields instead.
gcc/lto/
	* lto-common.c (compare_tree_sccs_1): Also compare
	DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD values.
2021-09-03 09:46:32 +02:00
liuhongt de6795bbf5 Remove macro check for __AMX_BF16/INT8/TILE__ in header file.
gcc/ChangeLog:

	PR target/102166
	* config/i386/amxbf16intrin.h: Remove macro check for __AMX_BF16__.
	* config/i386/amxint8intrin.h: Remove macro check for __AMX_INT8__.
	* config/i386/amxtileintrin.h: Remove macro check for __AMX_TILE__.

gcc/testsuite/ChangeLog:

	PR target/102166
	* g++.target/i386/pr102166.C: New test.
2021-09-03 13:04:41 +08:00