Commit Graph

179327 Commits

Author SHA1 Message Date
Richard Biener
095d42feed code generate live lanes in basic-block vectorization
The following adds the capability to code-generate live lanes in
basic-block vectorization using lane extracts from vector stmts
rather than keeping the original scalar code around for those.
This eventually makes previously unprofitable vectorizations
profitable (the live scalar code was appropriately costed, and so
are the lane extracts now).  Leaving the cost model aside, this
patch doesn't add or remove any basic-block vectorization
capabilities.
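
As an illustration only, the kind of source pattern described above might
look like the following sketch.  The function name and the choice of lane
are hypothetical, and whether GCC actually vectorizes this block depends on
the target and the cost model.

```cpp
#include <cassert>

// Hypothetical example of a "live lane": t2 feeds the store group
// (a candidate for basic-block vectorization) and is also used by
// scalar code outside that group.
int sum_and_store(int *out, const int *a, const int *b)
{
    assert(out && a && b);

    int t0 = a[0] + b[0];
    int t1 = a[1] + b[1];
    int t2 = a[2] + b[2];
    int t3 = a[3] + b[3];

    // Four adjacent stores: these could become one vector add
    // followed by one vector store.
    out[0] = t0;
    out[1] = t1;
    out[2] = t2;
    out[3] = t3;

    // Use of lane 2 outside the store group.  With a lane extract the
    // vectorizer would not need to keep the scalar add for t2 around.
    return t2 * 3;
}
```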

The patch re/ab-uses STMT_VINFO_LIVE_P in basic-block vectorization
mode to tell whether a live lane is vectorized or whether it is
provided by means of keeping the scalar code live.

The patch is a first step towards vectorizing sequences of
stmts that do not end up in stores or vector constructors though.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2020-09-04  Richard Biener  <rguenther@suse.de>

	* tree-vectorizer.h (vectorizable_live_operation): Adjust.
	* tree-vect-loop.c (vectorizable_live_operation): Vectorize
	live lanes out of basic-block vectorization nodes.
	* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): New function.
	(vect_slp_analyze_operations): Analyze live lanes and their
	vectorization possibility after the whole SLP graph is final.
	(vect_bb_slp_scalar_cost): Adjust for vectorized live lanes.
	* tree-vect-stmts.c (can_vectorize_live_stmts): Adjust.
	(vect_transform_stmt): Call can_vectorize_live_stmts also for
	basic-block vectorization.

	* gcc.dg/vect/bb-slp-46.c: New testcase.
	* gcc.dg/vect/bb-slp-47.c: Likewise.
	* gcc.dg/vect/bb-slp-32.c: Adjust.
2020-09-07 09:47:36 +02:00
Francois-Xavier Coudert
d30869a8d4 fortran: Fix argument types in derived types procedures
gcc/fortran/ChangeLog

	* trans-types.c (gfc_get_derived_type): Fix argument types.
2020-09-07 09:38:25 +02:00
Francois-Xavier Coudert
a502683de1 fortran: Fix arg types of _gfortran_is_extension_of
gcc/fortran/ChangeLog

	* resolve.c (resolve_select_type): Provide a formal arg list.
2020-09-07 09:37:01 +02:00
liuhongt
995bb851ff Adjust testcase.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr92658-avx512bw-trunc.c: Add
	-mprefer-vector-width=512 to avoid impact of different default
	tune which gcc is built with.
2020-09-07 15:26:18 +08:00
GCC Administrator
0fd39e420e Daily bump. 2020-09-07 00:16:22 +00:00
Francois-Xavier Coudert
23f8b90c40 fortran: Add comment about previous commit
gcc/fortran/ChangeLog

	* trans-types.c (gfc_get_ppc_type): Add comment.
2020-09-06 18:37:05 +02:00
Francois-Xavier Coudert
7c72651a93 fortran: Fix function arg types for class objects
gcc/fortran/ChangeLog

	* trans-types.c (gfc_get_ppc_type): Fix function arg types.
2020-09-06 18:33:43 +02:00
Francois-Xavier Coudert
3489d80fee fortran: caf_fail_image expects no argument
gcc/fortran/ChangeLog

	PR fortran/96947
	* trans-stmt.c (gfc_trans_fail_image): caf_fail_image
	expects no argument.

gcc/testsuite/ChangeLog

	* gfortran.dg/coarray_fail_st.f90: Adjust test.
2020-09-06 18:29:09 +02:00
GCC Administrator
0dc8050556 Daily bump. 2020-09-06 00:16:20 +00:00
GCC Administrator
bec05c98b9 Daily bump. 2020-09-05 00:16:20 +00:00
Iain Buclaw
f8eabd47ac d: Fix ICE in create_tmp_var, at gimple-expr.c:482
Array concatenate expressions were creating more SAVE_EXPRs than what
was necessary.  The internal error itself was the result of a forced
temporary being made on a TREE_ADDRESSABLE type.

gcc/d/ChangeLog:

	PR d/96924
	* expr.cc (ExprVisitor::visit (CatAssignExp *)): Don't force
	temporaries needlessly.

gcc/testsuite/ChangeLog:

	PR d/96924
	* gdc.dg/simd13927b.d: Removed.
	* gdc.dg/pr96924.d: New test.
2020-09-04 23:01:46 +02:00
Jason Merrill
f923c40f9b c++: Use iloc_sentinel in mark_use.
gcc/cp/ChangeLog:

	* expr.c (mark_use): Use iloc_sentinel.
2020-09-04 13:56:32 -04:00
Richard Biener
46a58c779a tree-optimization/96920 - another ICE when vectorizing nested cycles
This refines the previous fix for PR96698 by re-doing how and where
we arrange for setting vectorized cycle PHI backedge values.

2020-09-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/96698
	PR tree-optimization/96920
	* tree-vectorizer.h (loop_vec_info::reduc_latch_defs): Remove.
	(loop_vec_info::reduc_latch_slp_defs): Likewise.
	* tree-vect-stmts.c (vect_transform_stmt): Remove vectorized
	cycle PHI latch code.
	* tree-vect-loop.c (maybe_set_vectorized_backedge_value): New
	helper to set vectorized cycle PHI latch values.
	(vect_transform_loop): Walk over all PHIs again after
	vectorizing them, calling maybe_set_vectorized_backedge_value.
	Call maybe_set_vectorized_backedge_value for each vectorized
	stmt.  Remove delayed update code.
	* tree-vect-slp.c (vect_analyze_slp_instance): Initialize
	SLP instance reduc_phis member.
	(vect_schedule_slp): Set vectorized cycle PHI latch values.

	* gfortran.dg/vect/pr96920.f90: New testcase.
	* gcc.dg/vect/pr96920.c: Likewise.
2020-09-04 15:42:43 +02:00
Andrea Corallo
09fa6acd8d vec: dead code removal in tree-vect-loop.c
gcc/ChangeLog

2020-09-04  Andrea Corallo  <andrea.corallo@arm.com>

	* tree-vect-loop.c (vect_estimate_min_profitable_iters): Remove
	dead code as LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) is
	always verified.
2020-09-04 14:16:52 +02:00
Christophe Lyon
2033a63cbd arm: Improve immediate generation for thumb-1 with -mpure-code [PR96769]
This patch moves the move-immediate splitter after the regular ones so
that it has lower precedence, and updates its constraints.

For
int f3 (void) { return 0x11000000; }
int f3_2 (void) { return 0x12345678; }

we now generate:
* with -O2 -mcpu=cortex-m0 -mpure-code:
f3:
	movs    r0, #136
	lsls    r0, r0, #21
	bx      lr
f3_2:
	movs    r0, #18
	lsls    r0, r0, #8
	adds    r0, r0, #52
	lsls    r0, r0, #8
	adds    r0, r0, #86
	lsls    r0, r0, #8
	adds    r0, r0, #120
	bx      lr

* with -O2 -mcpu=cortex-m23 -mpure-code:
f3:
	movs    r0, #136
	lsls    r0, r0, #21
	bx      lr
f3_2:
	movw    r0, #22136
	movt    r0, 4660
	bx      lr
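
The arithmetic behind these sequences can be checked with a small sketch.
The helper names below are hypothetical; they only emulate the register
arithmetic and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Emulates "movs r0, #imm; lsls r0, r0, #shift".
constexpr std::uint32_t movs_lsls(std::uint32_t imm8, unsigned shift)
{
    return imm8 << shift;
}

// Emulates the byte-wise construction used without movw/movt:
// movs with the top byte, then three rounds of "lsls #8; adds #byte".
constexpr std::uint32_t build_by_bytes(std::uint32_t value)
{
    std::uint32_t r0 = (value >> 24) & 0xff;       // movs r0, #byte3
    for (int shift = 16; shift >= 0; shift -= 8) {
        r0 <<= 8;                                  // lsls r0, r0, #8
        r0 += (value >> shift) & 0xff;             // adds r0, r0, #byteN
    }
    return r0;
}

// Emulates "movw r0, #low; movt r0, high".
constexpr std::uint32_t movw_movt(std::uint32_t low16, std::uint32_t high16)
{
    return (high16 << 16) | (low16 & 0xffff);
}

static_assert(movs_lsls(136, 21) == 0x11000000u, "f3");
static_assert(build_by_bytes(0x12345678u) == 0x12345678u, "f3_2, cortex-m0");
static_assert(movw_movt(22136, 4660) == 0x12345678u, "f3_2, cortex-m23");

// Runtime variant of the same checks.
inline void check_commit_examples()
{
    assert(movs_lsls(136, 21) == 0x11000000u);
    assert(build_by_bytes(0x12345678u) == movw_movt(22136, 4660));
}
```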

2020-09-04  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/96769
	gcc/
	* config/arm/thumb1.md: Move movsi splitter for
	arm_disable_literal_pool after the other movsi splitters.

	gcc/testsuite/
	* gcc.target/arm/pure-code/pr96769.c: New test.
2020-09-04 11:48:36 +00:00
Aldy Hernandez
c5a6c2237a rename widest_irange to int_range_max.
gcc/ChangeLog:

	* range-op.cc (range_operator::fold_range): Rename widest_irange
	to int_range_max.
	(operator_div::wi_fold): Same.
	(operator_lshift::op1_range): Same.
	(operator_rshift::op1_range): Same.
	(operator_cast::fold_range): Same.
	(operator_cast::op1_range): Same.
	(operator_bitwise_and::remove_impossible_ranges): Same.
	(operator_bitwise_and::op1_range): Same.
	(operator_abs::op1_range): Same.
	(range_cast): Same.
	(widest_irange_tests): Same.
	(range3_tests): Rename irange3 to int_range3.
	(int_range_max_tests): Rename from widest_irange_tests.
	Rename widest_irange to int_range_max.
	(operator_tests): Rename widest_irange to int_range_max.
	(range_tests): Same.
	* tree-vrp.c (find_case_label_range): Same.
	* value-range.cc (irange::irange_intersect): Same.
	(irange::invert): Same.
	* value-range.h: Same.
2020-09-04 12:26:14 +02:00
Richard Biener
fab7764484 tree-optimization/96931 - clear ctrl-altering flag more aggressively
The testcase shows that we fail to clear gimple_call_ctrl_altering_p
when the last abnormal edge goes away, causing an edge insert to
a loop header edge when we have preheaders to split the edge
unnecessarily.

The following addresses this by more aggressively clearing the
flag in cleanup_call_ctrl_altering_flag.

2020-09-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/96931
	* tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): If
	there's a fallthru edge and no abnormal edge the call is
	no longer control-altering.
	(cleanup_control_flow_bb): Pass down the BB to
	cleanup_call_ctrl_altering_flag.

	* gcc.dg/pr96931.c: New testcase.
2020-09-04 12:22:29 +02:00
Jakub Jelinek
b898878032 lto: Remove stream_input_location_now
As discussed yesterday, stream_input_location_now has been used in 3
remaining places.  For ERT_MUST_NOT_THROW, I believe the failure_loc
location is stable at least until the apply_cache after the bbs are all
read, and the locations do not include BLOCK, so we can use normal
stream_input_location, and the two input_struct_function_base also
shouldn't include BLOCK and are stable at least until that same apply_cache
after reading all bbs, so again we can use the location cache.

2020-09-04  Jakub Jelinek  <jakub@redhat.com>

	* lto-streamer.h (stream_input_location_now): Remove declaration.
	* lto-streamer-in.c (stream_input_location_now): Remove.
	(input_eh_region, input_struct_function_base): Use
	stream_input_location instead of stream_input_location_now.
2020-09-04 11:55:13 +02:00
Jakub Jelinek
70d8d9bd93 lto: Ensure we force a change for file/line/column after clear_line_info
As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored.  But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
  ob->current_file = NULL;
  ob->current_line = 0;
  ob->current_col = 0;
  ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info.  But then we do:
      bp_pack_value (bp, ob->current_file != xloc.file, 1);
      bp_pack_value (bp, ob->current_line != xloc.line, 1);
      bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true.  If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero?  Either by oring something into those
tests, or perhaps:
  if (ob->current_reset)
    {
      if (xloc.file == NULL)
        ob->current_file = "";
      if (xloc.line == 0)
        ob->current_line = 1;
      if (xloc.column == 0)
        ob->current_col = 1;
      ob->current_reset = false;
    }
before doing those bp_pack_value calls with a comment, effectively forcing
all 3 != comparisons to be true?
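
A toy model of the problem, reduced to the column field, might look like
this.  It is only a sketch: Writer, Reader and second_column_seen are
hypothetical names, and the real streamer packs bits rather than pushing
ints into a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writer side: emits a "changed" flag, then the value only if changed.
struct Writer {
    int current_col = 0;
    bool reset = true;       // set whenever the line info was just cleared
    std::vector<int> out;

    void clear_line_info() { current_col = 0; reset = true; }

    void write_col(int col, bool honor_reset) {
        if (honor_reset && reset) {
            // Make sure the comparison below is true right after a reset.
            if (col == 0)
                current_col = 1;
            reset = false;
        }
        bool changed = current_col != col;
        out.push_back(changed ? 1 : 0);
        if (changed) {
            out.push_back(col);
            current_col = col;
        }
    }
};

// Reader side: its state lives across the whole stream and is never reset.
struct Reader {
    int stream_col = 0;
    std::size_t pos = 0;

    int read_col(const std::vector<int> &in) {
        assert(pos < in.size());
        bool changed = in[pos++] != 0;
        if (changed)
            stream_col = in[pos++];
        return stream_col;
    }
};

// Writes column 7, clears the line info, writes column 0, and returns
// the column the reader reconstructs for the second location.
int second_column_seen(bool honor_reset)
{
    Writer w;
    w.write_col(7, honor_reset);
    w.clear_line_info();
    w.write_col(0, honor_reset);

    Reader r;
    r.read_col(w.out);
    return r.read_col(w.out);
}
```

In this model second_column_seen (false) yields the stale column 7, while
second_column_seen (true) yields the intended 0.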

2020-09-04  Jakub Jelinek  <jakub@redhat.com>

	* lto-streamer.h (struct output_block): Add reset_locus member.
	* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
	(lto_output_location_1): If reset_locus, clear it and ensure
	current_{file,line,col} is different from xloc members.
2020-09-04 11:53:28 +02:00
David Faust
c3a0f53739 bpf: generate indirect calls for xBPF
This patch updates the BPF back end to generate indirect calls via
the 'call %reg' instruction when targeting xBPF.

Additionally, the BPF ASM_SPEC is updated to pass along -mxbpf to
gas, where it is now supported.

2020-09-03  David Faust  <david.faust@oracle.com>

gcc/

	* config/bpf/bpf.h (ASM_SPEC): Pass -mxbpf to gas, if specified.
	* config/bpf/bpf.c (bpf_output_call): Support indirect calls in xBPF.

gcc/testsuite/

	* gcc.target/bpf/xbpf-indirect-call-1.c: New test.
2020-09-04 10:18:56 +02:00
Kewen Lin
e1336703f8 test/rs6000: Replace test targets p8 and p9+
This patch replaces the existing rs6000 test targets p8 and p9+
with the existing has_arch_pwr8 and has_arch_pwr9 targets, used
either in combination or individually.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr92398.p9+.c: Replace p9+ with has_arch_pwr9.
	* gcc.target/powerpc/pr92398.p9-.c: Replace p9+ with has_arch_pwr9,
	and replace p8 with has_arch_pwr8 && !has_arch_pwr9.
	* lib/target-supports.exp (check_effective_target_p8): Remove.
	(check_effective_target_p9+): Remove.
2020-09-03 22:01:56 -05:00
GCC Administrator
6e82b6cfcf Daily bump. 2020-09-04 00:16:32 +00:00
Martin Jambor
8ad3fc6ca4 sra: Avoid SRAing if there is an out-of-bounds access (PR 96820)
The testcase causes an ICE in the SRA verifier on x86_64 when
compiling with -m32 because build_user_friendly_ref_for_offset looks
at an out-of-bounds array_ref within an array_ref which accesses an
offset which does not fit into a signed 32bit integer and turns it
into an array-ref with a negative index.

The best thing is probably to bail out early when encountering an out
of bounds access to a local stack-allocated aggregate (and let the DSE
just delete such statements) which is what the patch does.

I also glanced over to the initial candidate vetting routine to make
sure the size would fit into HWI and noticed that it uses unsigned
variants whereas the rest of SRA operates on signed offsets and
sizes (because get_ref_and_extent does) and so changed that for the
sake of consistency.  These ancient checks operate on sizes of types
as opposed to DECLs but I hope that any issues potentially arising
from that are basically hypothetical.

gcc/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/96820
	* tree-sra.c (create_access): Disqualify candidates with accesses
	beyond the end of the original aggregate.
	(maybe_add_sra_candidate): Check that candidate type size fits
	signed uhwi for the sake of consistency.

gcc/testsuite/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/96820
	* gcc.dg/tree-ssa/pr96820.c: New test.
2020-09-03 22:43:49 +02:00
Will Schmidt
d8f3474ff8 [PATCH, rs6000] Fix vector long long subtype (PR96139)
Hi,
  This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.

When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.

PR target/96139

2020-09-03  Will Schmidt  <will_schmidt@vnet.ibm.com>

gcc/ChangeLog:
	* config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node
	and unsigned_V2DI_type_node definitions.

gcc/testsuite/ChangeLog:
	* gcc.target/powerpc/pr96139-a.c: New test.
	* gcc.target/powerpc/pr96139-b.c: New test.
	* gcc.target/powerpc/pr96139-c.c: New test.
2020-09-03 15:05:59 -05:00
Jakub Jelinek
ba6730bd18 c++: Fix another PCH hash_map issue [PR96901]
The recent libstdc++ changes caused lots of libstdc++-v3 tests FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hashing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed.  Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch.  But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as that the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.
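
The general idea of the fix can be sketched as follows, assuming each
declaration carries an identifier that stays the same when the object is
moved.  Decl, PointerHash and UidHash are hypothetical stand-ins for
illustration, not the types added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Stand-in for a declaration node with a stable identifier.
struct Decl {
    unsigned uid;
};

// Hashing the address: the value changes whenever the object is
// relocated, so a table built before relocation becomes inconsistent
// unless it is rehashed.
struct PointerHash {
    std::size_t operator()(const Decl *d) const {
        return std::hash<const Decl *>()(d);
    }
};

// Hashing the identifier: the value survives relocation of the object.
struct UidHash {
    std::size_t operator()(const Decl *d) const {
        assert(d != nullptr);
        return std::hash<unsigned>()(d->uid);
    }
};
```

With UidHash, a declaration and a relocated copy of it hash to the same
value, so the table does not need rehashing after the addresses change.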

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

	PR c++/96901
	* tree.h (struct decl_tree_traits): New type.
	(decl_tree_map): New typedef.

	* constexpr.c (fundef_copies_table): Change type from
	hash_map<tree, tree> * to decl_tree_map *.
2020-09-03 21:53:40 +02:00
Harald Anlauf
8eeeecbcc1 PR fortran/96890 - Wrong answer with intrinsic IALL
The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.
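
The reason the initial value matters can be seen in a minimal sketch of a
masked AND reduction.  masked_iall is a hypothetical helper written for
illustration; it is not the generated libgfortran code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// ANDs together the elements of `values` whose `mask` entry is true.
std::int32_t masked_iall(const std::int32_t *values, const bool *mask,
                         std::size_t n)
{
    assert(n == 0 || (values != nullptr && mask != nullptr));

    // The identity of bitwise AND is all-ones, i.e. -1.  Starting from 0
    // would make every result 0 regardless of the input.
    std::int32_t result = -1;
    for (std::size_t i = 0; i < n; ++i)
        if (mask[i])
            result &= values[i];
    return result;
}
```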

libgfortran/ChangeLog:

	* m4/iall.m4: Initial value for result should be -1.
	* generated/iall_i1.c (miall_i1): Generated.
	* generated/iall_i16.c (miall_i16): Likewise.
	* generated/iall_i2.c (miall_i2): Likewise.
	* generated/iall_i4.c (miall_i4): Likewise.
	* generated/iall_i8.c (miall_i8): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/iall_masked.f90: New test.
2020-09-03 20:33:14 +02:00
Marek Polacek
753b4679bc c++: Fix P0960 in member init list and array [PR92812]
This patch nails down the remaining P0960 case in PR92812:

  struct A {
    int ar[2];
    A(): ar(1, 2) {} // doesn't work without this patch
  };

Note that when the target object is not of array type, this already
works:

  struct S { int x, y; };
  struct A {
    S s;
    A(): s(1, 2) { } // OK in C++20
  };

because build_new_method_call_1 takes care of the P0960 magic.

It proved to be quite hairy.  When the ()-list has more than one
element, we can always create a CONSTRUCTOR, because the code was
previously invalid.  But when the ()-list has just one element, it
gets all kinds of difficult.  As usual, we have to handle a("foo")
so as not to wrap the STRING_CST in a CONSTRUCTOR.  Always turning
x(e) into x{e} would run into trouble as in c++/93790.  Another
issue was what to do about x({e}): previously, this would trigger
"list-initializer for non-class type must not be parenthesized".
I figured I'd make this work in C++20, so that given

  struct S { int x, y; };

you can do

   S a[2];
   [...]
   A(): a({1, 2}) // initialize a[0] with {1, 2} and a[1] with {}

It also turned out that, as an extension, we support compound literals:

  F (): m((S[1]) { 1, 2 })

so this has to keep working as before.  Moreover, make sure not to trigger
in compiler-generated code, like =default, where array assignment is allowed.

I've factored out a function that turns a TREE_LIST into a CONSTRUCTOR
to simplify handling of P0960.

paren-init35.C also tests this with vector types.

gcc/cp/ChangeLog:

	PR c++/92812
	* cp-tree.h (do_aggregate_paren_init): Declare.
	* decl.c (do_aggregate_paren_init): New.
	(grok_reference_init): Use it.
	(check_initializer): Likewise.
	* init.c (perform_member_init): Handle initializing an array from
	a ()-list.  Use do_aggregate_paren_init.

gcc/testsuite/ChangeLog:

	PR c++/92812
	* g++.dg/cpp0x/constexpr-array23.C: Adjust dg-error.
	* g++.dg/cpp0x/initlist69.C: Likewise.
	* g++.dg/diagnostic/mem-init1.C: Likewise.
	* g++.dg/init/array28.C: Likewise.
	* g++.dg/cpp2a/paren-init33.C: New test.
	* g++.dg/cpp2a/paren-init34.C: New test.
	* g++.dg/cpp2a/paren-init35.C: New test.
	* g++.old-deja/g++.brendan/crash60.C: Adjust dg-error.
	* g++.old-deja/g++.law/init10.C: Likewise.
	* g++.old-deja/g++.other/array3.C: Likewise.
2020-09-03 14:30:06 -04:00
Jakub Jelinek
6641d6d3fe c++: Disable -frounding-math during manifestly constant evaluation [PR96862]
As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned on.
      /* Don't constant fold this floating point operation if the
         result may dependent upon the run-time rounding mode and
         flag_rounding_math is set, or if GCC's software emulation
         is unable to accurately represent the result.  */
      if ((flag_rounding_math
           || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
          && (inexact || !real_identical (&result, &value)))
        return NULL_TREE;
Jonathan said that we should be evaluating them anyway, e.g. conceptually
as if they are done with the default rounding mode before user had a chance
to change that, and e.g. in C in initializers it is also ignored.
In fact, fold-const.c for C initializers turns off various other options:

/* Perform constant folding and related simplification of initializer
   expression EXPR.  These behave identically to "fold_buildN" but ignore
   potential run-time traps and exceptions that fold must preserve.  */

  int saved_signaling_nans = flag_signaling_nans;\
  int saved_trapping_math = flag_trapping_math;\
  int saved_rounding_math = flag_rounding_math;\
  int saved_trapv = flag_trapv;\
  int saved_folding_initializer = folding_initializer;\
  flag_signaling_nans = 0;\
  flag_trapping_math = 0;\
  flag_rounding_math = 0;\
  flag_trapv = 0;\
  folding_initializer = 1;

  flag_signaling_nans = saved_signaling_nans;\
  flag_trapping_math = saved_trapping_math;\
  flag_rounding_math = saved_rounding_math;\
  flag_trapv = saved_trapv;\
  folding_initializer = saved_folding_initializer;

So, shall cxx_eval_outermost_constant_expr instead turn off all those
options?  Then warning_sentinel wouldn't be the right thing to use, but
given the 8 or however many return stmts in
cxx_eval_outermost_constant_expr, we'd need a RAII class for this.
Not sure about the folding_initializer, that one is affecting complex
multiplication and division constant evaluation somehow.
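
A RAII helper of the kind mentioned above could be sketched like this.
scoped_flag_override and flag_rounding_math_demo are hypothetical names
used for illustration; this is not the code the patch adds.

```cpp
#include <cassert>

// Hypothetical stand-in for a global option flag.
static int flag_rounding_math_demo = 1;

// Sets a variable to a new value for the lifetime of the object and
// restores the saved value in the destructor, whichever return
// statement leaves the scope.
template <typename T>
class scoped_flag_override {
    T &ref_;
    T saved_;

public:
    scoped_flag_override(T &ref, T value) : ref_(ref), saved_(ref) {
        ref_ = value;
    }
    ~scoped_flag_override() { ref_ = saved_; }

    scoped_flag_override(const scoped_flag_override &) = delete;
    scoped_flag_override &operator=(const scoped_flag_override &) = delete;
};

// Returns the flag value seen while the override is active.
int evaluate_with_rounding_math_off(bool early_return)
{
    scoped_flag_override<int> guard(flag_rounding_math_demo, 0);
    assert(flag_rounding_math_demo == 0);
    if (early_return)
        return flag_rounding_math_demo;   // flag restored on this path
    return flag_rounding_math_demo;       // and on this one
}
```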

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

	PR c++/96862
	* constexpr.c (cxx_eval_outermost_constant_expr): Temporarily disable
	flag_rounding_math during manifestly constant evaluation.

	* g++.dg/cpp1z/constexpr-96862.C: New test.
2020-09-03 20:11:43 +02:00
Jonathan Wakely
032a4b42cc libstdc++: Add workaround for weird std::tuple error [PR 96592]
This "fix" makes no sense, but it avoids an error from G++ about
std::is_constructible being incomplete. The real problem is elsewhere,
but this "fixes" the regression for now.

libstdc++-v3/ChangeLog:

	PR libstdc++/96592
	* include/std/tuple (_TupleConstraints<true, T...>): Use
	alternative is_constructible instead of std::is_constructible.
	* testsuite/20_util/tuple/cons/96592.cc: New test.
2020-09-03 16:26:16 +01:00
Jonathan Wakely
3c21913415 libstdc++: Optimise GCD algorithms
The current std::gcd and std::chrono::duration::_S_gcd algorithms are
both recursive. This is potentially expensive to evaluate in constant
expressions, because each level of recursion makes a new copy of the
function to evaluate. The maximum number of steps is bounded
(proportional to the number of decimal digits in the smaller value) and
so unlikely to exceed the limit for constexpr nesting, but the memory
usage is still suboptimal. By using an iterative algorithm we avoid
that compile-time cost. Because looping in constexpr functions is not
allowed until C++14, we need to keep the recursive implementation in
duration::_S_gcd for C++11 mode.

For std::gcd we can also optimise runtime performance by using the
binary GCD algorithm.
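
For illustration, iterative versions of both algorithms might look like the
sketch below.  The function names are hypothetical and the code is written
for unsigned 64-bit values only; it is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <cstdint>

// Iterative Euclidean algorithm.  The loop requires C++14 constexpr.
constexpr std::uint64_t gcd_euclid(std::uint64_t m, std::uint64_t n)
{
    while (n != 0) {
        std::uint64_t r = m % n;
        m = n;
        n = r;
    }
    return m;
}

// Binary GCD: uses only shifts, comparisons and subtraction.
constexpr std::uint64_t gcd_binary(std::uint64_t m, std::uint64_t n)
{
    if (m == 0)
        return n;
    if (n == 0)
        return m;

    // Factor out the powers of two common to both values.
    unsigned shift = 0;
    while (((m | n) & 1) == 0) {
        m >>= 1;
        n >>= 1;
        ++shift;
    }
    // Make m odd; it stays odd from here on.
    while ((m & 1) == 0)
        m >>= 1;

    while (n != 0) {
        while ((n & 1) == 0)
            n >>= 1;
        if (m > n) {
            std::uint64_t t = m;
            m = n;
            n = t;
        }
        n -= m;   // both odd, so the difference is even
    }
    return m << shift;
}

static_assert(gcd_euclid(48, 18) == 6, "Euclid");
static_assert(gcd_binary(48, 18) == 6, "binary");

// Runtime cross-check that both implementations agree.
inline void check_gcd_agreement(std::uint64_t a, std::uint64_t b)
{
    assert(gcd_euclid(a, b) == gcd_binary(a, b));
}
```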

libstdc++-v3/ChangeLog:

	* include/std/chrono (duration::_S_gcd): Use iterative algorithm
	for C++14 and later.
	* include/std/numeric (__detail::__gcd): Replace recursive
	Euclidean algorithm with iterative version of binary GCD algorithm.
	* testsuite/26_numerics/gcd/1.cc: Test additional inputs.
	* testsuite/26_numerics/gcd/gcd_neg.cc: Adjust dg-error lines.
	* testsuite/26_numerics/lcm/lcm_neg.cc: Likewise.
	* testsuite/experimental/numeric/gcd.cc: Test additional inputs.
	* testsuite/26_numerics/gcd/2.cc: New test.
2020-09-03 12:46:13 +01:00
Jakub Jelinek
3536ff2de8 lto: Cache location_ts including BLOCKs in GIMPLE streaming [PR94311]
As mentioned in the PR, when compiling valgrind even on fairly small
testcase where in one larger function the location keeps oscillating
between a small line number and 8000-ish line number in the same file
we very quickly run out of all possible location_t numbers and because of
that emit non-sensical line numbers in .debug_line.
There are ways how to decrease speed of depleting location_t numbers
in libcpp, but the main reason of this is that we use
stream_input_location_now for streaming in location_t for gimple_location
and phi arg locations.  libcpp strongly prefers that the locations
it is given are sorted by the different files and by line numbers in
ascending order, otherwise it depletes quickly no matter what and is much
more costly (many extra file changes etc.).
The reason for not caching those were the BLOCKs that were streamed
immediately after the location and encoded into the locations (and for PHIs
we failed to stream the BLOCKs altogether).
This patch enhances the location cache to handle also BLOCKs (but not for
everything, only for the spots we care about the BLOCKs) and also optimizes
the size of the LTO stream by emitting a single bit into a pack whether the
BLOCK changed from last case and only streaming the BLOCK tree if it
changed.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

	PR lto/94311
	* gimple.h (gimple_location_ptr, gimple_phi_arg_location_ptr): New
	functions.
	* streamer-hooks.h (struct streamer_hooks): Add
	output_location_and_block callback.  Fix up formatting for
	output_location.
	(stream_output_location_and_block): Define.
	* lto-streamer.h (class lto_location_cache): Fix comment typo.  Add
	current_block member.
	(lto_location_cache::input_location_and_block): New method.
	(lto_location_cache::lto_location_cache): Initialize current_block.
	(lto_location_cache::cached_location): Add block member.
	(struct output_block): Add current_block member.
	(lto_output_location): Formatting fix.
	(lto_output_location_and_block): Declare.
	* lto-streamer.c (lto_streamer_hooks_init): Initialize
	streamer_hooks.output_location_and_block.
	* lto-streamer-in.c (lto_location_cache::cmp_loc): Also compare
	block members.
	(lto_location_cache::apply_location_cache): Handle blocks.
	(lto_location_cache::accept_location_cache,
	lto_location_cache::revert_location_cache): Fix up function comments.
	(lto_location_cache::input_location_and_block): New method.
	(lto_location_cache::input_location): Implement using
	input_location_and_block.
	(input_function): Invoke apply_location_cache after streaming in all
	bbs.
	* lto-streamer-out.c (clear_line_info): Set current_block.
	(lto_output_location_1): New function, moved from lto_output_location,
	added block handling.
	(lto_output_location): Implement using lto_output_location_1.
	(lto_output_location_and_block): New function.
	* gimple-streamer-in.c (input_phi): Use input_location_and_block
	to input and cache both location and block.
	(input_gimple_stmt): Likewise.
	* gimple-streamer-out.c (output_phi): Use
	stream_output_location_and_block.
	(output_gimple_stmt): Likewise.
2020-09-03 12:51:01 +02:00
Richard Biener
b246f5272e Improve constant folding of vector lowering with vector bools
This improves the situation somewhat when vector lowering tries
to access vector bools as seen in PR96814.

2020-09-03  Richard Biener  <rguenther@suse.de>

	* tree-vect-generic.c (tree_vec_extract): Remove odd
	special-casing of boolean vectors.
	* fold-const.c (fold_ternary_loc): Handle boolean vector
	type BIT_FIELD_REFs.
2020-09-03 12:47:59 +02:00
Arnaud Charlet
3cc3a373fe Preliminary work on support for 128bits integers
	* fe.h, opt.ads (Enable_128bit_Types): New.
	* stand.ads (Standard_Long_Long_Long_Integer,
	S_Long_Long_Long_Integer): New.
2020-09-03 04:34:48 -04:00
Arnaud Charlet
eb6ea9e54f Look at fullest view when checking for static types in unnesting
When seeing if any bound involved in a type is an uplevel reference,
we must look at the fullest view of a type, since that's what the
backends will do.  Similarly for private types. We introduce
Get_Fullest_View for that purpose.

	* sem_util.ads, sem_util.adb (Get_Fullest_View): New procedure.
	* exp_unst.adb (Check_Static_Type): Do all processing on fullest
	view of specified type.
2020-09-03 04:15:03 -04:00
liuhongt
4337341269 Optimize memory broadcast for constant vector under AVX512.
For a constant vector whose elements all have the same value, there's
no need to put the whole vector in the constant pool; use an embedded
broadcast instead.

2020-07-09  Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog:

	PR target/87767
	* config/i386/i386-features.c
	(replace_constant_pool_with_broadcast): New function.
	(constant_pool_broadcast): Ditto.
	(class pass_constant_pool_broadcast): New pass.
	(make_pass_constant_pool_broadcast): Ditto.
	(remove_partial_avx_dependency): Call
	replace_constant_pool_with_broadcast under TARGET_AVX512F, it
	would save compile time when both pass rpad and cpb are
	available.
	(remove_partial_avx_dependency_gate): New function.
	(class pass_remove_partial_avx_dependency::gate): Call
	remove_partial_avx_dependency_gate.
	* config/i386/i386-passes.def: Insert new pass after combine.
	* config/i386/i386-protos.h
	(make_pass_constant_pool_broadcast): Declare.
	* config/i386/sse.md (*avx512dq_mul<mode>3<mask_name>_bcst):
	New define_insn.
	(*avx512f_mul<mode>3<mask_name>_bcst): Ditto.
	* config/i386/avx512fintrin.h (_mm512_set1_ps,
	_mm512_set1_pd,_mm512_set1_epi32, _mm512_set1_epi64): Adjusted.

gcc/testsuite/ChangeLog:

	PR target/87767
	* gcc.target/i386/avx2-broadcast-pr87767-1.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-1.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-2.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-3.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-4.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-5.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-6.c: New test.
	* gcc.target/i386/avx512f-broadcast-pr87767-7.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-2.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-3.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-4.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: New test.
	* gcc.target/i386/avx512vl-broadcast-pr87767-6.c: New test.
2020-09-03 16:10:45 +08:00
liuhongt
8bd5530bfa Adjust testcase.
gcc/testsuite/ChangeLog:
	PR target/96246
	PR target/96855
	PR target/96856
	PR target/96857
	* g++.target/i386/avx512bw-pr96246-2.C: Add runtime check for
	AVX512BW.
	* g++.target/i386/avx512vl-pr96246-2.C: Add runtime check for
	AVX512BW and AVX512VL
	* g++.target/i386/avx512f-helper.h: New header.
	* gcc.target/i386/pr92658-avx512f.c: Add
	-mprefer-vector-width=512 to avoid impact of different default
	mtune which gcc is built with.
	* gcc.target/i386/avx512bw-pr95488-1.c: Ditto.
	* gcc.target/i386/pr92645-4.c: Add -mno-avx512f to avoid
	impact of different default march which gcc is built with.
2020-09-03 10:25:50 +08:00
GCC Administrator
6a8f4e47c9 Daily bump. 2020-09-03 00:16:26 +00:00
Iain Buclaw
f0a3bab43f d: __vectors unsupported in hardware should be rejected at compile-time.
gcc/d/ChangeLog:

	PR d/96869
	* d-builtins.cc (build_frontend_type): Don't expose intrinsics that
	use unsupported vector types.
	* d-target.cc (Target::isVectorTypeSupported): Restrict to supporting
	only if TARGET_VECTOR_MODE_SUPPORTED_P is true.  Don't allow complex
	or boolean vector types.

gcc/testsuite/ChangeLog:

	PR d/96869
	* gdc.dg/simd.d: Removed.
	* gdc.dg/cast1.d: New test.
	* gdc.dg/gdc213.d: Compile with target vect_sizes_16B_8B.
	* gdc.dg/gdc284.d: Likewise.
	* gdc.dg/gdc67.d: Likewise.
	* gdc.dg/pr96869.d: New test.
	* gdc.dg/simd1.d: New test.
	* gdc.dg/simd10447.d: New test.
	* gdc.dg/simd12776.d: New test.
	* gdc.dg/simd13841.d: New test.
	* gdc.dg/simd13927.d: New test.
	* gdc.dg/simd15123.d: New test.
	* gdc.dg/simd15144.d: New test.
	* gdc.dg/simd16087.d: New test.
	* gdc.dg/simd16697.d: New test.
	* gdc.dg/simd17237.d: New test.
	* gdc.dg/simd17695.d: New test.
	* gdc.dg/simd17720a.d: New test.
	* gdc.dg/simd17720b.d: New test.
	* gdc.dg/simd19224.d: New test.
	* gdc.dg/simd19627.d: New test.
	* gdc.dg/simd19628.d: New test.
	* gdc.dg/simd19629.d: New test.
	* gdc.dg/simd19630.d: New test.
	* gdc.dg/simd2a.d: New test.
	* gdc.dg/simd2b.d: New test.
	* gdc.dg/simd2c.d: New test.
	* gdc.dg/simd2d.d: New test.
	* gdc.dg/simd2e.d: New test.
	* gdc.dg/simd2f.d: New test.
	* gdc.dg/simd2g.d: New test.
	* gdc.dg/simd2h.d: New test.
	* gdc.dg/simd2i.d: New test.
	* gdc.dg/simd2j.d: New test.
	* gdc.dg/simd7951.d: New test.
	* gdc.dg/torture/array2.d: New test.
	* gdc.dg/torture/array3.d: New test.
	* gdc.dg/torture/simd16488a.d: New test.
	* gdc.dg/torture/simd16488b.d: New test.
	* gdc.dg/torture/simd16703.d: New test.
	* gdc.dg/torture/simd19223.d: New test.
	* gdc.dg/torture/simd19607.d: New test.
	* gdc.dg/torture/simd3.d: New test.
	* gdc.dg/torture/simd4.d: New test.
	* gdc.dg/torture/simd7411.d: New test.
	* gdc.dg/torture/simd7413a.d: New test.
	* gdc.dg/torture/simd7413b.d: New test.
	* gdc.dg/torture/simd7414.d: New test.
	* gdc.dg/torture/simd9200.d: New test.
	* gdc.dg/torture/simd9304.d: New test.
	* gdc.dg/torture/simd9449.d: New test.
	* gdc.dg/torture/simd9910.d: New test.
2020-09-02 22:59:35 +02:00
Iain Buclaw
c285126cc0 d: Only test with default permutation flags for runnable tests.
Unless a test explicitly requests them, all compilable tests, as well as
fail_compilation tests, will be run without any extra flags.

The C++ tests are now checked against the shared D runtime library.

gcc/testsuite/ChangeLog:

	* lib/gdc-utils.exp (gdc-convert-test): Handle LINK directive.
	Set PERMUTE_ARGS as DEFAULT_DFLAGS only for runnable tests.
	(gdc-do-test): Set default action of compilable tests to compile.
	Test SHARED_OPTION on runnable_cxx tests.
2020-09-02 22:59:34 +02:00
Iain Buclaw
72ddef620b d: Move all runnable tests in gdc.dg to gdc.dg/torture
Tests that are not executed do not need to be compiled as torture tests;
they are present only to test for a specific bug or ICE.

gcc/testsuite/ChangeLog:

	* gdc.dg/dg.exp: Remove torture options.
	* gdc.dg/gdc115.d: Move test to gdc.dg/torture.
	* gdc.dg/gdc131.d: Likewise.
	* gdc.dg/gdc141.d: Likewise.
	* gdc.dg/gdc17.d: Likewise.
	* gdc.dg/gdc171.d: Likewise.
	* gdc.dg/gdc179.d: Likewise.
	* gdc.dg/gdc186.d: Likewise.
	* gdc.dg/gdc187.d: Likewise.
	* gdc.dg/gdc191.d: Likewise.
	* gdc.dg/gdc198.d: Likewise.
	* gdc.dg/gdc200.d: Likewise.
	* gdc.dg/gdc210.d: Likewise.
	* gdc.dg/gdc240.d: Likewise.
	* gdc.dg/gdc242b.d: Likewise.
	* gdc.dg/gdc248.d: Likewise.
	* gdc.dg/gdc250.d: Likewise.
	* gdc.dg/gdc273.d: Likewise.
	* gdc.dg/gdc283.d: Likewise.
	* gdc.dg/gdc285.d: Likewise.
	* gdc.dg/gdc286.d: Likewise.
	* gdc.dg/gdc309.d: Likewise.
	* gdc.dg/gdc35.d: Likewise.
	* gdc.dg/gdc36.d: Likewise.
	* gdc.dg/gdc51.d: Likewise.
	* gdc.dg/gdc57.d: Likewise.
	* gdc.dg/gdc66.d: Likewise.
	* gdc.dg/imports/gdc36.d: Likewise.
	* gdc.dg/init1.d: Likewise.
	* gdc.dg/pr92309.d: Likewise.
	* gdc.dg/pr94424.d: Likewise.
	* gdc.dg/pr94777b.d: Likewise.
	* gdc.dg/pr96152.d: Likewise.
	* gdc.dg/pr96153.d: Likewise.
	* gdc.dg/pr96156.d: Likewise.
	* gdc.dg/pr96157a.d: Likewise.
	* gdc.dg/torture/torture.exp: New file.
2020-09-02 22:59:34 +02:00
Jonathan Wakely
f049cda373 c++: Stop defining true, false and bool as macros in <stdbool.h>
Since r216679 these macros have only been defined in C++98 mode, rather
than all modes. That is permitted as a GNU extension because that header
doesn't exist in the C++ standard until C++11, so we can make it do
whatever we want for C++98. But as discussed in the PR c++/60304
comments, these macros shouldn't ever be defined for C++.

This patch removes the macro definitions for C++98 too.

The new test already passed for C++98 (and the conversion is ill-formed
in C++11 and later) so this new test is arguably unnecessary.

gcc/ChangeLog:

	PR c++/60304
	* ginclude/stdbool.h (bool, false, true): Never define for C++.

gcc/testsuite/ChangeLog:

	PR c++/60304
	* g++.dg/warn/Wconversion-null-5.C: New test.
2020-09-02 18:51:28 +01:00
Jonathan Wakely
ce90d203ce testsuite: Add missing <exception> header to testcase
This test no longer compiles because <new> stopped including
<exception>, so std::set_terminate is not defined.

gcc/testsuite/ChangeLog:

	* g++.old-deja/g++.abi/cxa_vec.C: Include <exception> for
	std::set_terminate.
2020-09-02 18:41:20 +01:00
Jonathan Wakely
c71644776f libstdc++: Fix test to use correct function
This was copied from a test for std::lcm but I forgot to change one of
the calls to use the experimental version of the function.

libstdc++-v3/ChangeLog:

	PR libstdc++/92978
	* testsuite/experimental/numeric/92978.cc: Use experimental::lcm
	not std::lcm.
2020-09-02 17:22:47 +01:00
Jozef Lawrynowicz
0edc2c1a24 MSP430: Fix -mlarge documentation to indicate size_t is a 20-bit type
gcc/ChangeLog:

	* doc/invoke.texi (MSP430 options): Fix -mlarge description to
	indicate size_t is a 20-bit type.
2020-09-02 16:50:59 +01:00
Jonathan Wakely
2f983fa690 libstdc++: Fix three-way comparison for std::array [PR 96851]
The spaceship operator for std::array uses memcmp when the
__is_byte<value_type> trait is true, but memcmp isn't usable in
constexpr contexts. Also, memcmp should only be used for unsigned byte
types, because it gives the wrong answer for signed chars with negative
values.

We can simply check std::is_constant_evaluated() so that we don't use
memcmp during constant evaluation.

To fix the problem of using memcmp for inappropriate types, this patch
adds new __is_memcmp_ordered and __is_memcmp_ordered_with traits. These
say whether using memcmp will give the right answer for ordering
operations such as lexicographical_compare and three-way comparisons.
The new traits can be used in several places, and can also be used to
implement my suggestion in PR 93059 comment 37 to use memcmp for
unsigned integers larger than one byte on big endian targets.
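
The signed-char problem described above can be reproduced with a minimal
sketch. The helper names order_by_value and order_by_memcmp are hypothetical
and are not part of libstdc++; they only contrast the two orderings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: order two buffers element by element, using the
// ordering of signed char itself. Returns -1, 0 or 1.
inline int order_by_value(const signed char* a, const signed char* b,
                          std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] < b[i]) return -1;
        if (b[i] < a[i]) return 1;
    }
    return 0;
}

// Hypothetical helper: order the same buffers with memcmp, which
// interprets every byte as unsigned char. Returns -1, 0 or 1.
inline int order_by_memcmp(const signed char* a, const signed char* b,
                           std::size_t n)
{
    const int r = std::memcmp(a, b, n);
    return (r > 0) - (r < 0);
}
```

For the buffers {0, -1} and {0, 1} the two helpers disagree: by value the
first buffer orders before the second, while memcmp sees the byte 0xFF and
orders it after.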

libstdc++-v3/ChangeLog:

	PR libstdc++/96851
	* include/bits/cpp_type_traits.h (__is_memcmp_ordered):
	New trait that says if memcmp can be used for ordering.
	(__is_memcmp_ordered_with): Likewise, for two types.
	* include/bits/deque.tcc (__lex_cmp_dit): Use new traits
	instead of __is_byte and __numeric_traits.
	(__lexicographical_compare_aux1): Likewise.
	* include/bits/ranges_algo.h (__lexicographical_compare_fn):
	Likewise.
	* include/bits/stl_algobase.h (__lexicographical_compare_aux1)
	(__is_byte_iter): Likewise.
	* include/std/array (operator<=>): Likewise. Only use memcmp
	when std::is_constant_evaluated() is false.
	* testsuite/23_containers/array/comparison_operators/96851.cc:
	New test.
	* testsuite/23_containers/array/tuple_interface/get_neg.cc:
	Adjust dg-error line numbers.
2020-09-02 15:32:11 +01:00
Jozef Lawrynowicz
d45a6c7099 MSP430: Skip gcc.dg/pr55940.c in the small memory model
In the MSP430 small memory model, there is a 16-bit address space and
pointer arithmetic wraps around the address space, so any calculated
address is always within this range.

In this test, pointer arithmetic wraps when 0x1000 is added to the
address of a variable, causing the resulting address to be unexpectedly
less than 0x2000, which breaks the test.
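
The wraparound can be modelled with 16-bit unsigned arithmetic. This is only
an illustration: add_offset_16 is a hypothetical helper, and the starting
address 0xF800 used below is an assumed value, since the message does not say
where the variable is placed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of small-memory-model pointer arithmetic: the sum is
// reduced modulo 2^16, so the result always fits the 16-bit address space.
inline std::uint16_t add_offset_16(std::uint16_t addr, std::uint16_t offset)
{
    return static_cast<std::uint16_t>(addr + offset);
}
```

With the assumed address, add_offset_16(0xF800, 0x1000) yields 0x0800, which
is below 0x2000 even though the starting address is not.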

gcc/testsuite/ChangeLog:

	* gcc.dg/pr55940.c: Skip for msp430 unless -mlarge is specified.
2020-09-02 14:18:09 +01:00
Jonathan Wakely
6bdbf0f37b libstdc++: Break header cycle between <new> and <exception>
The <new> and <exception> headers each include each other, which makes
building them as header-units "exciting". The <new> header only needs
the definition of std::exception (in order to derive from it) which is
already in its own header, so just include that.

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator.h: Include <bits/exception_defines.h>
	for definitions of __try, __catch and __throw_exception_again.
	(counted_iterator::operator++(int)): Use __throw_exception_again
	instead of throw.
	* libsupc++/new: Include <bits/exception.h> not <exception>.
	* libsupc++/new_opvnt.cc: Include <bits/exception_defines.h>.
	* testsuite/18_support/destroying_delete.cc: Include
	<type_traits> for std::is_same_v definition.
	* testsuite/20_util/variant/index_type.cc: Qualify size_t.
2020-09-02 13:56:32 +01:00
Jakub Jelinek
b567d3bd30 fortran: Fix o'...' boz to integer/real conversions [PR96859]
The standard says that excess digits from boz are truncated.
For hexadecimal or binary, the routines copy just the number of digits
that will be needed, but for octal we copy a number of digits that
contains one extra bit (for 8-bit, 32-bit or 128-bit, i.e. kind 1, 4 and 16)
or two extra bits (for 16-bit or 64-bit, i.e. kind 2 and 8).
The clearing of the first bit is done correctly by subtracting 4 from the
first digit if it is 4-7 (i.e. taking it modulo 4).
The clearing of the first two bits is done by changing 4 or 6 to 0
and 5 or 7 to 1, which is incorrect, because we really want to change the
first digit to 0 if it was even, or to 1 if it was odd, so digits
2 and 3 are mishandled by keeping them as is, rather than changing 2 to 0
and 3 to 1.
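
The intended truncation can be sketched as a mask on the leading octal digit.
This is not the code in check.c: truncate_octal is a hypothetical helper, and
padding short inputs with zeros is an assumption made to keep the sketch
self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: reduce an octal digit string to the low `bits` bits.
inline std::string truncate_octal(std::string digits, unsigned bits)
{
    // Number of octal digits needed to cover `bits` bits, rounded up.
    const std::size_t ndigits = (bits + 2) / 3;
    // Extra high bits carried by the leading digit: 1 for 8/32/128-bit
    // targets, 2 for 16/64-bit targets.
    const unsigned extra = static_cast<unsigned>(ndigits * 3 - bits);

    // Assumption: inputs shorter than ndigits are left-padded with zeros.
    if (digits.size() < ndigits)
        digits.insert(0, ndigits - digits.size(), '0');

    std::string out = digits.substr(digits.size() - ndigits);
    assert(out[0] >= '0' && out[0] <= '7');

    // Keep only the low (3 - extra) bits of the leading digit:
    // extra == 1 is digit modulo 4, extra == 2 is digit modulo 2.
    const unsigned keep_mask = (1u << (3 - extra)) - 1u;
    out[0] = static_cast<char>('0' + ((out[0] - '0') & keep_mask));
    return out;
}
```

For a 16-bit target, a leading 2 becomes 0 and a leading 3 becomes 1, which
is the case the fix adds; for an 8-bit target a leading 7 becomes 3.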

2020-09-02  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/96859
	* check.c (gfc_boz2real, gfc_boz2int): When clearing first two bits,
	change also '2' to '0' and '3' to '1' rather than just handling '4'
	through '7'.

	* gfortran.dg/pr96859.f90: New test.
2020-09-02 12:18:46 +02:00
Roger Sayle
6640a5b9e7 hppa: Improve hppa_rtx_costs for shifts by constants.
This patch provides more accurate rtx_costs estimates for shifts by
integer constants (which are cheaper than by a register amount).

2020-09-02  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/pa/pa.c (hppa_rtx_costs) [ASHIFT, ASHIFTRT, LSHIFTRT]:
	Provide accurate costs for shifts by integer constants.
2020-09-02 09:30:50 +01:00
Jose E. Marchesi
7047a8bab6 bpf: use the default asm_named_section target hook
This patch stops the BPF backend from providing its own implementation
of the asm_named_section hook; the default handler works perfectly
well.

2020-09-02  Jose E. Marchesi  <jose.marchesi@oracle.com>

	gcc/
	* config/bpf/bpf.c (bpf_asm_named_section): Delete.
	(TARGET_ASM_NAMED_SECTION): Likewise.
2020-09-02 09:12:51 +02:00