This adds a set of calls to name lookup that are needed by modules,
generally for installing imported bindings or walking the current TU's
bindings. One note about template instantiations, though. When we're
about to instantiate a template we have to know about all the
maybe-partial specializations that exist. These can be in any
imported module -- not necessarily the module defining the template.
Thus we key such foreign templates to the innermost namespace and
identifier of the containing entity -- that's the only thing we have
a handle on. That's why we note and load pending specializations here.
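As a rough illustration of the keying scheme described above, here is a
minimal sketch. PendingKey, PendingTable, note and load are hypothetical
names invented for this example; they are not the signatures of
note_pending_specializations or load_pending_specializations, and the
real tables hold trees rather than strings.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key: (innermost namespace, identifier of the containing entity).
using PendingKey = std::pair<std::string, std::string>;

// Hypothetical table mapping a key to the modules whose specializations
// have been noted but not yet loaded.
struct PendingTable {
  std::map<PendingKey, std::vector<std::string>> pending;

  // Record that `module` carries a specialization for the keyed template.
  void note(const std::string& ns, const std::string& id,
            const std::string& module) {
    pending[PendingKey(ns, id)].push_back(module);
  }

  // Just before instantiation: return everything pending for the key and
  // forget it, so a second call finds nothing left to load.
  std::vector<std::string> load(const std::string& ns, const std::string& id) {
    std::vector<std::string> found;
    auto it = pending.find(PendingKey(ns, id));
    if (it != pending.end()) {
      found.swap(it->second);
      pending.erase(it);
    }
    return found;
  }
};
```

The point of the sketch is only that the lookup key does not mention the
defining module, so specializations from any import can be found.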
gcc/cp/
* module.cc (lazy_specializations_p): Stub.
* name-lookup.h (append_imported_binding_slot)
(mergeable_namespace_slots, lookup_class_binding)
(walk_module_binding, import_module_binding, set_module_binding)
(note_pending_specializations, load_pending_specializations)
(add_module_decl, add_imported_namespace): Declare.
(get_cxx_dialect_name): Declare.
(enum WMB_flags): New.
* name-lookup.c (append_imported_binding_slot)
(mergeable_namespace_slots, lookup_class_binding)
(walk_module_binding, import_module_binding, set_module_binding)
(note_pending_specializations, load_pending_specializations)
(add_module_decl, add_imported_namespace): New.
(get_cxx_dialect_name): Make extern.
This fixes a missed SFINAE when subtracting pointers to an incomplete
type.
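A minimal sketch of the kind of code affected, assuming a detection
trait written with decltype rather than the concepts used in the new
test. Incomplete and can_subtract are names made up for this example.
With the subtraction treated as a substitution failure, the primary
template is chosen for the incomplete type instead of a hard error being
issued.

```cpp
#include <type_traits>
#include <utility>

struct Incomplete;  // declared but never defined in this translation unit

// Primary template: chosen when the subtraction below fails to substitute.
template <typename T, typename = void>
struct can_subtract : std::false_type {};

// Chosen only when `T - T` is a valid expression.
template <typename T>
struct can_subtract<T, decltype(void(std::declval<T>() - std::declval<T>()))>
    : std::true_type {};

static_assert(can_subtract<int*>::value, "int* can be subtracted");
static_assert(!can_subtract<Incomplete*>::value,
              "pointer difference needs a complete type");
```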
gcc/cp/ChangeLog:
PR c++/78173
* typeck.c (pointer_diff): Use complete_type_or_maybe_complain
instead of complete_type_or_else.
gcc/testsuite/ChangeLog:
PR c++/78173
* g++.dg/cpp2a/concepts-pr78173.C: New test.
2020-11-26 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/arm/lob2.c: Use '-march=armv8.1-m.main+fp'.
* gcc.target/arm/lob3.c: Skip with '-mfloat-abi=hard'.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Use '-march=armv8.1-m.main+fp'.
As the testcase shows, for 32-bit word size we can end up with op1
up to 0xffffffff (0x100000000 % 0xffffffff == 1 and so we use bit == 32
for that), but the CONST_INT we got from caller is for DImode in that case
and not valid for SImode operations.
The following patch canonicalizes the two spots where the constant needs
canonicalization.
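The arithmetic can be sketched as follows. The function names are
invented for this illustration and the code is not taken from optabs.c;
it only shows why 0xffffffff qualifies (2^32 is congruent to 1) and what
sign-extending a constant to a 32-bit width does to that value.

```cpp
#include <cstdint>

// x % 0xffffffff via the word sum: since 2^32 == 1 (mod 0xffffffff),
// hi * 2^32 + lo is congruent to hi + lo.
std::uint32_t mod_by_word_sum(std::uint64_t x) {
  const std::uint64_t op1 = 0xffffffffu;
  std::uint64_t sum = (x >> 32) + (x & 0xffffffffu);  // fits easily in 64 bits
  return static_cast<std::uint32_t>(sum % op1);
}

// Sign-extend the low `bits` bits of `value` (1 <= bits <= 64). A constant
// that reads as 0xffffffff at 64 bits becomes -1 when made canonical for
// a 32-bit width.
std::int64_t canonicalize_for_width(std::uint64_t value, unsigned bits) {
  const std::uint64_t sign = std::uint64_t(1) << (bits - 1);
  const std::uint64_t mask = sign | (sign - 1);
  const std::uint64_t low = value & mask;
  return static_cast<std::int64_t>((low ^ sign) - sign);
}
```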
2020-12-10 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98229
* optabs.c (expand_doubleword_mod): Canonicalize op1 and
1 - INTVAL (op1) as word_mode constants when used in
word_mode arithmetics.
* gcc.c-torture/compile/pr98229.c: New test.
With following backedges and the SLP discovery cache not being
permute aware we have to put some discovery limits in place again.
That's also the opportunity to ditch the separate limit on the
number of permutes we try, so the patch limits the overall work
done (as in vect_build_slp_tree cache misses) to what we compute
as max_tree_size which is based on the number of scalar stmts in
the vectorized region.
Note the limit is global and there's no attempt to divide the
allowed work evenly amongst opportunities, so one degenerate
opportunity can eat it all up. That's probably only relevant for BB
vectorization where the limit is based on up to the size of the
whole function.
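The budgeting scheme can be pictured with a toy memoized search. This is
a sketch under assumptions: Discovery, build and the n-1/n-2 recursion
are invented for the illustration and have nothing to do with the real
SLP tree shape. Only the accounting mirrors the description: cache hits
are free, each cache miss costs one unit, and discovery fails once the
shared limit is used up.

```cpp
#include <unordered_map>

// Toy discovery: node n "succeeds" if n <= 1 or both n-1 and n-2 succeed.
struct Discovery {
  std::unordered_map<int, bool> cache;  // results of earlier discovery
  unsigned limit;                       // shared, global work budget

  bool build(int n) {
    auto it = cache.find(n);
    if (it != cache.end())
      return it->second;  // cache hit: costs nothing
    if (limit == 0)
      return false;       // budget exhausted: fail discovery
    --limit;              // cache miss: charge one unit of work
    bool ok = n <= 1 || (build(n - 1) && build(n - 2));
    cache[n] = ok;
    return ok;
  }
};
```

Because the budget is a single counter, an early expensive query leaves
less for every later one, which is the caveat noted above.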
2020-12-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/98235
* tree-vect-slp.c (vect_build_slp_tree): Exchange npermutes
for limit. Decrement that for each cache miss and fail
discovery when it reaches zero.
(vect_build_slp_tree_2): Remove npermutes handling and
simply pass down limit.
(vect_build_slp_instance): Pass down limit.
(vect_analyze_slp_instance): Likewise.
(vect_analyze_slp): Base the SLP discovery limit on
max_tree_size and pass it down.
* gcc.dg/torture/pr98235.c: New testcase.
Some targets decide to promote certain scalar variables to wider mode,
so their DECL_RTL is a SUBREG with SUBREG_PROMOTED_VAR_P.
When storing to such vars, store_expr takes care of sign or zero extending,
but if we store e.g. through MEM_REF into them, no sign or zero extension
happens and that leads to wrong-code e.g. on the following testcase on
aarch64-linux.
The following patch uses store_expr if we overwrite all the bits and it is
not reversed storage order, i.e. something that store_expr handles normally,
and otherwise (if the most significant bit is (or for pdp11 might be, but
pdp11 doesn't promote) being modified), the code extends manually.
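To make the invariant concrete, here is a small model of a signed 8-bit
variable promoted to a 32-bit register. The helper names are
hypothetical and the code models only the bit patterns, not anything in
expr.c: a store that touches just the low byte leaves stale upper bits,
while the extending store keeps the register equal to the sign-extended
value.

```cpp
#include <cstdint>

// Narrow store: only the low 8 bits are replaced, the upper bits keep
// whatever the register held before.
std::uint32_t store_low_byte_only(std::uint32_t reg, std::uint8_t byte) {
  return (reg & ~std::uint32_t(0xff)) | byte;
}

// Extending store: the register ends up holding the sign-extended value.
std::uint32_t store_and_extend(std::uint8_t byte) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int8_t>(byte)));
}

// The promoted-variable invariant: the register equals the sign extension
// of its own low byte.
bool is_properly_extended(std::uint32_t reg) {
  return reg == store_and_extend(static_cast<std::uint8_t>(reg));
}
```

Storing 0x80 (that is, -128) into a register that held 0 gives 0x00000080
with the narrow store, which reads back as 128; the extending store gives
0xffffff80.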
2020-12-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98190
* expr.c (expand_assignment): If to_rtx is a promoted SUBREG,
ensure sign or zero extension either through use of store_expr
or by extending manually.
* gcc.dg/pr98190.c: New test.
gcc/ChangeLog
2020-12-10 Andrea Corallo <andrea.corallo@arm.com>
PR rtl-optimization/97092
* ira-color.c (update_costs_from_allocno): Do not carry over mode
between subsequent iterations.
gcc/testsuite/ChangeLog
2020-12-10 Andrea Corallo <andrea.corallo@arm.com>
* gcc.target/aarch64/sve/pr97092.c: New test.
The pattern recognizer fends off against recognizing conversions
from VECT_SCALAR_BOOLEAN_TYPE_P to precision one types but what
it really needs to fend off is conversions between
VECT_SCALAR_BOOLEAN_TYPE_P types - the Ada FE uses an 8 bit
boolean type that satisfies this predicate.
2020-12-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/95582
* tree-vect-patterns.c (vect_recog_bool_pattern): Check
for VECT_SCALAR_BOOLEAN_TYPE_P, not just precision one.
When compiling:
void foo (void);
void bar (float a, float b) { if (__builtin_expect (a != b, 1)) foo (); }
void baz (float a, float b) { if (__builtin_expect (a == b, 1)) foo (); }
void qux (float a, float b) { if (__builtin_expect (a != b, 0)) foo (); }
void corge (float a, float b) { if (__builtin_expect (a == b, 0)) foo (); }
on x86_64, we get (unimportant cruft removed):
bar: ucomiss %xmm1, %xmm0
jp .L4
je .L1
.L4: jmp foo
.L1: ret
baz: ucomiss %xmm1, %xmm0
jp .L6
jne .L6
jmp foo
.L6: ret
qux: ucomiss %xmm1, %xmm0
jp .L13
jne .L13
ret
.L13: jmp foo
corge: ucomiss %xmm1, %xmm0
jnp .L18
.L14: ret
.L18: jne .L14
jmp foo
(note for bar and qux that changed with a patch I've posted earlier today).
This is all reasonable except for the last function. There the overall
jump to the tail call is predicted unlikely (10%), so it is good that
jmp foo isn't on the straight-line path, but NaNs are (or should be)
considered very unlikely in programs, so IMHO the right code (and the
one emitted with the following patch) is:
corge: ucomiss %xmm1, %xmm0
jp .L14
je .L18
.L14: ret
.L18: jmp foo
Let's discuss the probabilities in the above testcase:
for !and_them it looks all correct, so for
bar we split
if (a != b) goto t; // prob 90%
goto f;
into:
if (a unord b) goto t; // first_prob = prob * cprob = 90% * 1% = 0.9%
if (a ltgt b) goto t; // adjusted prob = (prob - first_prob) / (1 - first_prob) = (90% - 0.9%) / (1 - 0.9%) = 89.909%
and for qux we split
if (a != b) goto t; // prob 10%
goto f;
into:
if (a unord b) goto t; // first_prob = prob * cprob = 10% * 1% = 0.1%
if (a ltgt b) goto t; // adjusted prob = (prob - first_prob) / (1 - first_prob) = (10% - 0.1%) / (1 - 0.1%) = 9.910%
Now, the and_them cases should be probability wise exactly the same
if we swap the f and t labels, because baz
if (a == b) goto t; // prob 90%
goto f;
is equivalent to:
if (a != b) goto f; // prob 10%
goto t;
which is in qux. This means we could expand baz as:
if (a unord b) goto f; // 0.1%
if (a ltgt b) goto f; // 9.910%
goto t;
We don't expand it exactly that way, though; instead (as the comment
says) we expand it as:
if (a ord b) ; else goto f; // first_prob as probability of ;
if (a uneq b) goto t; // adjusted prob
goto f;
So, first_prob.invert () should be 0.1% and adjusted prob should be
1 - 9.910%.
Thus, the right thing is 4 inverts:
prob = prob.invert (); // baz is equivalent to qux with swap(t, f) and thus inverted original prob
first_prob = prob.split (cprob.invert ()).invert ();
// cprob.invert because by doing if (cond) ; else goto f; we effectively invert the condition
// the second invert because first_prob is probability of ; rather than goto f
prob = prob.invert (); // lastly because adjusted prob we want is
// probability of goto t;, while the one from corresponding !and_them case
// would be if (...) goto f; goto t;
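The numbers above can be replayed with plain doubles. This is a sketch,
not the profile_probability class: split_prob and split_and_them are
names made up here, and it assumes (as the cprob.invert () comment
suggests) that cprob for the first comparison in the and_them case is
99%, so its inverse is the 1% used for unord.

```cpp
#include <cmath>

struct Split {
  double first;     // probability attached to the first jump
  double adjusted;  // probability attached to the second jump
};

// !and_them case: first = prob * cprob,
// adjusted = (prob - first) / (1 - first).
Split split_prob(double prob, double cprob) {
  double first = prob * cprob;
  return {first, (prob - first) / (1.0 - first)};
}

// and_them case, following the four inverts listed above.
Split split_and_them(double prob, double cprob) {
  double inverted = 1.0 - prob;                // invert 1: swap t and f
  Split s = split_prob(inverted, 1.0 - cprob); // invert 2: cprob.invert ()
  return {1.0 - s.first,                       // invert 3: probability of ;
          1.0 - s.adjusted};                   // invert 4: probability of goto t
}
```

For baz, split_and_them (0.9, 0.99) yields first = 99.9% and adjusted of
about 90.09%, and their product is the original 90% chance of reaching t.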
2020-12-11 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98212
* dojump.c (do_compare_rtx_and_jump): Change computation of
first_prob for and_them. Add comment explaining and_them case.
* gcc.dg/predict-8.c: Adjust expected probability.
There's no need to explicitly check for the maximum value, because the
function we call handles it correctly anyway.
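A sketch of why the extra check is redundant, using stand-in functions
rather than the libstdc++ internals (countl_zero_sketch and
countl_one_sketch are hypothetical names for unsigned integer types).
Counting leading ones is counting leading zeros of the complement, and a
leading-zero count that already returns the full width for zero covers
the all-ones input without a special case.

```cpp
#include <cstdint>
#include <limits>

// Count leading zero bits of an unsigned value; returns the full width
// of T when x is zero.
template <typename T>
int countl_zero_sketch(T x) {
  const int digits = std::numeric_limits<T>::digits;
  int n = 0;
  for (T bit = T(1) << (digits - 1); bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// No branch for the maximum value: ~max is 0, and the zero case is
// already handled by the function above.
template <typename T>
int countl_one_sketch(T x) {
  return countl_zero_sketch<T>(static_cast<T>(~x));
}
```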
libstdc++-v3/ChangeLog:
PR libstdc++/98226
* include/std/bit (__countl_one, __countr_one): Remove redundant
branches.
Calculate block exit info upfront, and then any SSA_NAME which is never
used in an outgoing range calculation is a pure global and can bypass the
on-entry cache.
PR tree-optimization/98174
* gimple-range-cache.cc (ranger_cache::ssa_range_in_bb): Only push
poor values to be examined if it isn't a pure global.
(ranger_cache::block_range): Don't process pure globals.
(ranger_cache::fill_block_cache): Adjust has_edge_range call.
* gimple-range-gori.cc (gori_map::all_outgoing): New bitmap.
(gori_map::gori_map): Allocate all_outgoing.
(gori_map::is_export_p): No specified BB returns global context.
(gori_map::calculate_gori): Accumulate each block into global.
(gori_compute::gori_compute): Preprocess each block for exports.
(gori_compute::has_edge_range_p): No edge returns global context.
* gimple-range-gori.h (has_edge_range_p): Provide default parameter.
It's a rather curious malfunction of the 'Mod attribute applied to the
variable of a loop whose upper bound is dynamic.
gcc/ada/ChangeLog:
PR ada/98230
* exp_attr.adb (Expand_N_Attribute_Reference, case Mod): Use base
type of argument to obtain static bound and required size.
gcc/testsuite/ChangeLog:
* gnat.dg/modular6.adb: New test.
A common pattern before C++17 is the generator function, used to avoid
having to specify the type of a container element by using a function call
to get type deduction; for example, std::make_pair. C++17 added class template
argument deduction, making generator functions unnecessary for many uses,
but GCC won't be written in C++17 for years yet.
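A minimal sketch of the pattern, assuming a simple RAII class. The names
type_identity and make_temp_override come from the ChangeLog below, but
the bodies here are written for illustration and may differ from what
cp-tree.h contains; the temp_override class shown is an assumption.

```cpp
// Assumed RAII helper: sets a variable and restores the old value when
// the object goes out of scope.
template <typename T>
class temp_override {
  T& var_;
  T saved_;
public:
  temp_override(T& var, T value) : var_(var), saved_(var) { var_ = value; }
  ~temp_override() { var_ = saved_; }
};

// Wrapping T in type_identity keeps the second parameter out of template
// argument deduction, so T is deduced from the variable alone.
template <typename T>
struct type_identity { typedef T type; };

// Generator function: the caller never has to spell temp_override<T>.
template <typename T>
temp_override<T> make_temp_override(T& var,
                                    typename type_identity<T>::type value) {
  return {var, value};
}
```

Usage would look like `auto ov = make_temp_override (flag, 1);` for a
bool flag: T is deduced as bool from the variable, and the 1 is simply
converted.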
gcc/cp/ChangeLog:
* cp-tree.h (struct type_identity): New.
(make_temp_override): New.
* decl.c (grokdeclarator): Use it.
* except.c (maybe_noexcept_warning): Use it.
* parser.c (cp_parser_enum_specifier): Use it.
(cp_parser_parameter_declaration_clause): Use it.
(cp_parser_gnu_attributes_opt): Use it.
(cp_parser_std_attribute): Use it.
Pre-r11-557 we issued a bogus
error: parameter may not have variably modified type 'double [x]'
but now we compile this, as we should.
gcc/testsuite/ChangeLog:
PR c++/91506
* g++.dg/init/array60.C: New test.
This extends using-decls to modules. In modules you can export a
using decl, but the exported decl must have external linkage already.
One thing you can do is export something from the GMF.
The novel thing is that now 'export using foo::bar;' *in namespace
bar* can mean something significant (rather than be an obscure nop).
gcc/cp/
* name-lookup.c (do_nonmember_using_decl): Add INSERT_P parm.
Deal with exporting using decls.
(finish_nonmember_using_decl): Examine BINDING_VECTOR.
This augments the name lookup with knowledge about the BINDING_VECTOR.
That holds per-module namespace bindings, and we need to collect the
bindings in visible imports when we do lookup. We also need to do
some checking when we're pushing a new decl to check we're not
overriding an existing visible binding in some way.
To deal with the Global Module and Module Partitions, we reserve 1 or
2 slots in the BINDING_VECTOR to record those entities that may
legitimately appear in more than one module.
As mentioned before, the BINDING_VECTOR is created lazily, when
imported bindings appear. The current TU's decls then appear on slot
zero.
gcc/cp/
* cp-tree.h (visible_instantiation_path): Renamed.
* module.cc (get_originating_module_decl, lazy_load_binding)
(lazy_load_members, visible_instantiation_path): Stubs.
* name-lookup.c (STAT_TYPE_VISIBLE_P, STAT_VISIBLE): New.
(search_imported_binding_slot, init_global_partition)
(get_fixed_binding_slot): New.
(name_lookup::process_module_binding): New.
(name_lookup::search_namespace_only): Search BINDING_VECTOR.
(name_lookup::adl_namespace_fns): Likewise.
(name_lookup::search_adl): Search visible instantiation path.
(maybe_lazily_declare): Maybe lazy load members.
(implicitly_export_namespace): New.
(maybe_record_mergeable_decl): New.
(check_module_override): New.
(do_pushdecl): Deal with BINDING_VECTOR, check override.
(add_mergeable_namespace_entity): New.
(get_namespace_binding): Deal with BINDING_VECTOR.
(do_namespace_alias): Call set_originating_module.
(lookup_elaborated_type_1): Deal with BINDING_VECTOR.
(do_pushtag): Call set_originating_module.
(reuse_namespace): New.
(make_namespace_finish): Add FROM_IMPORT parm.
(push_namespace): Deal with BINDING_VECTOR & namespace reuse.
(maybe_save_operator_binding): Save when module CMI in play.
* name-lookup.h (add_mergeable_namespace_entity): Declare.
This augments the spelling suggestion code to understand about visible
imported modules. Simply consider each visible binding in the
binding_vector, until we find one that has something of interest.
gcc/cp/
* name-lookup.c: Include bitmap.h.
(enum binding_slots): New.
(maybe_add_fuzzy_binding): Return bool true if found.
(consider_binding_level): Add module support.
* module.cc (get_import_bitmap): Stub.
I was about to add this test with dg-ice but it turned out it had
already been fixed by the recent r11-3361!
gcc/testsuite/ChangeLog:
PR c++/68451
* g++.dg/cpp0x/friend6.C: New test.
Here are some refactorings to the name-lookup machinery, primarily
breaking out worker functions that the modules patch will also use,
and fixing a couple of comments on the way.
gcc/cp/
* name-lookup.c (pop_local_binding): Check for IDENTIFIER_ANON_P.
(update_binding): Level may be null, don't add namespaces to
level.
(newbinding_bookkeeping): New, broken out of ...
(do_pushdecl): ... here, call it. Don't push anonymous decls.
(pushdecl, add_using_namespace): Correct comments.
(do_push_nested_namespace): Remove assert.
(make_namespace, make_namespace_finish): New, broken out of ...
(push_namespace): ... here. Call them. Add namespace to level
here.
This handles the discriminated record types of Ada: the PLACEHOLDER_EXPR is
the "template" expression for the discriminant in the type definition. Now
for some components, typically arrays whose upper bound is the discriminant,
the compiler creates a local subtype for the component, so the code needs to
be able to deal with this nested type.
gcc/ChangeLog:
* dwarf2out.c (loc_list_from_tree_1) <PLACEHOLDER_EXPR>: Deal with
a nested context type.
With modules, we need the ability to name 'foos' in different modules.
The idiom for that is a trailing '@modulename' suffix. This adds that
to the error printing routines. I also augment the tree dumping
machinery to show module-specific metadata.
gcc/cp/
* error.c (dump_module_suffix): New.
(dump_aggr_type, dump_simple_decl, dump_function_name): Call it.
* ptree.c (cxx_print_decl): Print module information.
* module.cc (module_name, get_importing_module): Stubs.
Name-lookup is the most changed piece of the front end for modules.
Here are some preparatory cleanups and API extensions.
gcc/cp/
* name-lookup.h (set_class_bindings): Return vector, take signed
'extra' parm.
* name-lookup.c (maybe_lazily_declare): Break out ...
(get_class_binding): .. of here, call it.
(find_member_slot): Adjust get_class_bindings call.
(set_class_bindings): Allow -ve extra. Return the vector.
(set_identifier_type_value_with_scope): Remove checking assert.
(lookup_using_decl): Set decl's context.
(do_pushtag): Adjust set_identifier_type_value_with_scope handling.
This removes gimple_debug_begin_stmts without block info which remain
after a gimple block originating from an inline function is unused.
The line numbers from these stmts are from the inline function,
but since the inline function is completely optimized away,
there will be no DW_TAG_inlined_subroutine so the debugger has
no callstack available at this point, and therefore those
line table entries are not helpful to the user.
2020-12-10 Bernd Edlinger <bernd.edlinger@hotmail.de>
* cfgexpand.c (expand_gimple_basic_block): Remove special handling
of debug_inline_entries without block info.
* tree-inline.c (remap_gimple_stmt): Drop debug_nonbind_markers when
the call statement has no block info.
(copy_debug_stmt): Remove debug_nonbind_markers when inlining
and the block info is mapped to NULL.
* tree-ssa-live.c (clear_unused_block_pointer): Remove
debug_nonbind_markers originating from removed inline functions.
This removes an odd special-case of VECTOR_BOOLEAN_TYPE_P typed
conversions from vectorizable_assignment that was obsoleted by
making all integer mode VECTOR_BOOLEAN_TYPE_P types have 1-bit
precision bool components with 605c2a393d.
2020-12-10 Richard Biener <rguenther@suse.de>
* tree-vect-stmts.c (vectorizable_assignment): Remove special
allowance of VECTOR_BOOLEAN_TYPE_P conversions.
This patch enables MVE vandq instructions for auto-vectorization. MVE
vandq insns in mve.md are modified to use 'and' instead of unspec
expression to support and<mode>3. The and<mode>3 expander is added to
vec-common.md.
2020-12-03 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
(VANQ): Remove.
(VDQ): Add TARGET_HAVE_MVE condition where relevant.
* config/arm/mve.md (mve_vandq_u<mode>): New entry for vand
instruction using expression 'and'.
(mve_vandq_s<mode>): New expander.
(mve_vaddq_n_f<mode>): Use 'and' code instead of unspec.
* config/arm/neon.md (and<mode>3): Rename into and<mode>3_neon.
* config/arm/predicates.md (imm_for_neon_inv_logic_operand):
Enable for MVE.
* config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F): Remove.
* config/arm/vec-common.md (and<mode>3): New expander.
gcc/testsuite/
* gcc.target/arm/simd/mve-vand.c: New test.
PR98069 is about a case in which split_constant_offset miscategorises
an expression of the form:
int foo;
…
POINTER_PLUS_EXPR<base, (sizetype)(INT_MIN - foo) * size>
as:
base: base
offset: (sizetype) (-foo) * size
init: INT_MIN * size
“-foo” overflows when “foo” is INT_MIN, whereas the original expression
didn't overflow in that case.
As discussed in the PR trail, we could simply ignore the fact that
int overflow is undefined and treat it as a wrapping type, but that
is likely to pessimise quite a few cases.
This patch instead reworks split_constant_offset so that:
- it treats integer operations as having an implicit cast to sizetype
- for integer operations, the returned VAR has type sizetype
In other words, the problem becomes to express:
(sizetype) (OP0 CODE OP1)
as:
VAR:sizetype + (sizetype) OFF:ssizetype
The top-level integer split_constant_offset will (usually) be a sizetype
POINTER_PLUS operand, so the extra cast to sizetype disappears. But adding
the cast allows the conversion handling to defer a lot of the difficult
cases to the recursive split_constant_offset call, which can detect
overflow on individual operations.
The net effect is to analyse the access above as:
base: base
offset: -(sizetype) foo * size
init: INT_MIN * size
See the comments in the patch for more details.
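The effect can be checked with ordinary unsigned arithmetic. In this
sketch std::size_t stands in for sizetype and the function names are
invented; it is not code from tree-data-ref.c. The negation happens
after the cast, in a wrapping type, so foo == INT_MIN no longer involves
a signed overflow.

```cpp
#include <climits>
#include <cstddef>

// The original expression: (sizetype) (INT_MIN - foo) * size.
// Only meaningful when INT_MIN - foo does not overflow, i.e. foo <= 0.
std::size_t original_offset(int foo, std::size_t size) {
  return static_cast<std::size_t>(INT_MIN - foo) * size;
}

// The new split: offset = -(sizetype) foo * size, init = INT_MIN * size.
// All arithmetic is unsigned, so nothing here can overflow.
std::size_t split_offset(int foo, std::size_t size) {
  std::size_t offset = (std::size_t(0) - static_cast<std::size_t>(foo)) * size;
  std::size_t init = static_cast<std::size_t>(INT_MIN) * size;
  return offset + init;
}
```

For every foo where the original is defined, including INT_MIN, the two
functions agree.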
gcc/
PR tree-optimization/98069
* tree-data-ref.c (compute_distributive_range): New function.
(nop_conversion_for_offset_p): Likewise.
(split_constant_offset): In the internal overload, treat integer
expressions as having an implicit cast to sizetype and express
them accordingly. Pass back the range of the original (uncast)
expression in a new range parameter.
(split_constant_offset_1): Likewise. Rework the handling of
conversions to account for the implicit sizetype casts.
This addresses PR97929. The cases for WIDEN_PLUS and WIDEN_MINUS were
missing in vect_get_smallest_scalar_type.
gcc/ChangeLog:
PR tree-optimization/97929
* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Add
WIDEN_PLUS/WIDEN_MINUS case.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr97929.c: New test.
Add 'w+'/'w-' as WIDEN_PLUS/WIDEN_MINUS respectively.
Add VEC_WIDEN_PLUS/MINUS_HI/LO<...> for
VEC_WIDEN_PLUS/MINUS_HI/LO
gcc/ChangeLog:
* tree-pretty-print.c (dump_generic_node): Add case for
VEC_WIDEN_(PLUS/MINUS)_(HI/LO)_EXPR and WIDEN_(PLUS/MINUS)_EXPR.
Pattern recog incompletely handles some bool cases, but the result
should be a failure to vectorize, not a miscompile. Unfortunately
vectorizable_assignment lets invalid conversions (that
vectorizable_conversion rejects) slip through. The following
rectifies that.
2020-12-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/98211
* tree-vect-stmts.c (vectorizable_assignment): Disallow
invalid conversions to bool vector types.
* gcc.dg/pr98211.c: New testcase.
I made a cut&pasto in my previous patch for tree.c, causing platforms
that have CLEAR_INSN_CACHE defined, and none of the internal
__clear_cache expansion overriders, to issue calls to symbols named
__builtin___clear_cache rather than __clear_cache, on languages other
than those in the C family. Oops.
This patch removes __builtin_ from the string used as the libname for
__builtin___clear_cache.
for gcc/ChangeLog
* tree.c (build_common_builtin_nodes): Drop __builtin_ from
__clear_cache libname.
The x86 backend doesn't have EQ or NE floating point comparisons,
so it splits x != y into x unord y || x <> y. The problem with that is
that unord comparison doesn't trap on qNaN operands but LTGT does.
The end effect is that it doesn't trap on qNaN operands, because x unord y
will be true for those and so LTGT will not be performed, but as the backend
is currently unable to merge signalling and non-signalling comparisons (and
after all, with this exact exception it shouldn't unless the first one is
signalling and the second one is non-signalling) it means we end up with:
ucomiss %xmm1, %xmm0
jp .L4
comiss %xmm1, %xmm0
jne .L4
ret
.p2align 4,,10
.p2align 3
.L4:
xorl %eax, %eax
jmp foo
where the comiss is the signalling comparison, but we already know that
the right flags bits are already computed by the ucomiss insn.
The following patch, if the target supports UNEQ comparisons, splits NE
as x unord y || !(x uneq y) instead, which in the end means we end up with
just:
ucomiss %xmm1, %xmm0
jp .L4
jne .L4
ret
.p2align 4,,10
.p2align 3
.L4:
jmp foo
because UNEQ is like UNORDERED non-signalling.
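The logical equivalence behind the new split can be written out at the
source level. This only models the truth values, not the trapping
behaviour the patch is about, and uneq and ne_split are names chosen for
the illustration.

```cpp
#include <cmath>

// UNEQ: true when the operands are unordered or compare equal.
bool uneq(float x, float y) {
  return std::isunordered(x, y) || x == y;
}

// The new split of x != y: x unord y || !(x uneq y).
bool ne_split(float x, float y) {
  return std::isunordered(x, y) || !uneq(x, y);
}
```

For ordered operands the first test is false and the second reduces to
!(x == y); for a NaN operand the first test alone gives true, matching
x != y.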
2020-12-10 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/98212
* dojump.c (do_compare_rtx_and_jump): When splitting NE and backend
can do UNEQ, prefer splitting x != y into x unord y || !(x uneq y)
instead of into x unord y || x ltgt y.
* gcc.target/i386/pr98212.c: New test.
If the backend doesn't have floating point EQ or NE comparison, dojump.c
splits it into ORDERED && UNEQ or UNORDERED || LTGT. If both comparison
operands are the same, we know the result of the second comparison though,
a == b is equivalent to a ord b and a != b is equivalent to a unord b,
and thus can just use ORDERED or UNORDERED.
On the testcase, this changes f1:
- ucomiss %xmm0, %xmm0
- movl $1, %eax
- jp .L3
- jne .L3
- ret
- .p2align 4,,10
- .p2align 3
-.L3:
xorl %eax, %eax
+ ucomiss %xmm0, %xmm0
+ setnp %al
and f3:
- ucomisd %xmm0, %xmm0
- movl $1, %eax
- jp .L8
- jne .L8
- ret
- .p2align 4,,10
- .p2align 3
-.L8:
xorl %eax, %eax
+ ucomisd %xmm0, %xmm0
+ setnp %al
while keeping the same code for f2 and f4.
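In source terms the simplification rests on the following identities,
sketched here with hypothetical helper names; the functions in the
actual testcase may look different.

```cpp
#include <cmath>

// a == a holds exactly when a is ordered with itself, i.e. not a NaN.
bool self_eq(float a) {
  return !std::isunordered(a, a);
}

// a != a holds exactly when a is unordered with itself, i.e. a NaN.
bool self_ne(float a) {
  return std::isunordered(a, a);
}
```

Each helper needs only the single ordered/unordered test, which is why
the jp/jne pair collapses to one setnp in the f1 and f3 output above.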
2020-12-10 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/98169
* dojump.c (do_compare_rtx_and_jump): Don't split self-EQ/NE
comparisons, just use ORDERED or UNORDERED.
* gcc.target/i386/pr98169.c: New test.
If the loop body doesn't ever continue, we don't have a bb to insert the
updates. Fixed by not adding them at all in that case.
2020-12-10 Jakub Jelinek <jakub@redhat.com>
PR middle-end/98205
* omp-expand.c (expand_omp_for_generic): Fix up broken_loop handling.
* c-c++-common/gomp/doacross-4.c: New test.
This adjusts the SLP build to allow a pattern root stmt to be
built from scalars. I've noticed this in PR98211 where we fail
to promote a SLP subtree to a simple splat operation and instead
emit a series of uniform vector operations. The bb-slp-div-1.c
testcase is now vectorized on x86_64 but only the store so I
adjusted it to expect the load to be vectorized.
2020-12-10 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_get_and_check_slp_defs): Do
not mark the defs to occur in a pattern if it is the
pattern root and record the original stmt defs in that
case.
* gcc.dg/vect/bb-slp-div-1.c: Expect the load to be
vectorized.
When building GCC for RISC-V with the --with-multilib-generator option,
it may not be possible to call arch-canonicalize as an executable when
building on Windows. Instead directly invoke the expected python
interpreter for this step.
gcc/ChangeLog:
* config/riscv/multilib-generator (arch_canonicalize): Invoke
python interpreter when calling arch-canonicalize script.
gcc/:
* godump.c (go_format_type): Don't consider whether a type has
been seen when determining whether to output a type by name.
Consider only the use_type_name parameter.
(go_output_typedef): When outputting a typedef, format the
declaration's original type, which contains the name of the
underlying type rather than the name of the typedef.
gcc/testsuite:
* gcc.misc-tests/godump-1.c: Add test case.
For Ada with LTO, boolean_{false,true}_node can be 1-bit precision boolean,
while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end up
with wide_int mismatches.
For non-VR_RANGE, this patch just uses VARYING min/max manually.
The min + 1 != max check will then do the rest.
2020-12-09 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/98188
* tree-ssa-phiopt.c (two_value_replacement): Don't special case
BOOLEAN_TYPEs for ranges, instead if get_range_info doesn't return
VR_RANGE, set min/max to wi::min/max_value.
New +pauth (Pointer Authentication from Armv8.3-A) feature option for
-march command line option.
Please note that the majority of PAUTH instructions are implemented behind the
HINT instruction. PAUTH stays an Armv8.3-A feature but now can be assigned to
other architectures or CPUs.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def
(AARCH64_OPT_EXTENSION): New +pauth option in -march for AArch64.
* config/aarch64/aarch64.h (AARCH64_FL_PAUTH): New pauth extension bitmask.
(AARCH64_ISA_PAUTH): New ISA bitmask for PAUTH.
(AARCH64_FL_FOR_ARCH8_3): Add PAUTH to Armv8.3-A.
(TARGET_PAUTH): New target mask to isolate PAUTH instructions.
* config/aarch64/aarch64.md (do_return): Condition set to TARGET_PAUTH.
* doc/invoke.texi: Update docs for +flagm and +pauth.