This makes us always use a single-bit boolean component type
for integer mode mask VECTOR_BOOLEAN_TYPE_P to match the RTL and target
representation. This avoids the need for magic translation and
the inconsistencies from the translation requirement now that
we expose temporaries of those types on the GIMPLE level.
2020-10-23 Richard Biener <rguenther@suse.de>
PR middle-end/97521
* expr.c (const_scalar_mask_from_tree): Remove.
(expand_expr_real_1): Always VIEW_CONVERT integer mode
vector constants to an integer type.
* tree.c (build_truth_vector_type_for_mode): Use a single-bit
boolean component type for non-vector-mode mask_mode.
* gcc.target/i386/pr97521.c: New testcase.
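As a rough illustration of what a single-bit component in an integer mode mask means, the sketch below packs per-lane booleans into one integer, one bit per lane. The function name pack_mask and the choice of lane 0 as bit 0 are assumptions made for the example; this is not code from expr.c or tree.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pack up to 8 lane predicates into one integer,
// one bit per lane.  Lane 0 maps to bit 0 (an assumption of this sketch).
std::uint8_t pack_mask (const bool *lanes, std::size_t n)
{
  assert (n <= 8);
  std::uint8_t mask = 0;
  for (std::size_t i = 0; i < n; ++i)
    if (lanes[i])
      mask |= static_cast<std::uint8_t> (1u << i);
  return mask;
}
```

With this layout a mask constant is just an integer whose bits are the lane values, which is why no separate translation step is needed between the two views.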
Expand strncmp to "repz cmpsb" only with -minline-all-stringops since
"repz cmpsb" can be much slower than a strncmp function implemented with
vector instructions, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
gcc/
PR target/95458
* config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem):
Return false for -mno-inline-all-stringops.
gcc/testsuite/
PR target/95458
* gcc.target/i386/pr95458-1.c: New test.
* gcc.target/i386/pr95458-2.c: Likewise.
We used to expand memcmp to "repz cmpsb" via cmpstrnsi. It was changed
by
commit 9b0f6f5e51
Author: Nick Clifton <nickc@redhat.com>
Date: Fri Aug 12 16:26:11 2011 +0000
builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
* builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
pattern.
* doc/md.texi (cmpstrn): Note that the comparison stops if both
fetched bytes are zero.
(cmpstr): Likewise.
(cmpmem): Note that the comparison does not stop if both of the
fetched bytes are zero.
Duplicate the cmpstrn pattern for cmpmem. The only difference is that
the length argument of cmpmem is guaranteed to be less than or equal to
the lengths of the two memory areas. Since "repz cmpsb" can be much slower
than a memcmp function implemented with vector instructions, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
expand cmpmem to "repz cmpsb" only for -minline-all-stringops.
gcc/
PR target/95151
* config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem): New
function.
* config/i386/i386-protos.h (ix86_expand_cmpstrn_or_cmpmem): New
prototype.
* config/i386/i386.md (cmpmemsi): New pattern.
gcc/testsuite/
PR target/95151
* gcc.target/i386/pr95151-1.c: New test.
* gcc.target/i386/pr95151-2.c: Likewise.
* gcc.target/i386/pr95151-3.c: Likewise.
* gcc.target/i386/pr95151-4.c: Likewise.
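The difference between the cmpstrn and cmpmem semantics noted in the md.texi change above can be seen at the library level with strncmp and memcmp. The sketch below uses only the standard C library; the wrapper names equal_as_strings and equal_as_memory are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// String-style comparison: stops once both fetched bytes are zero,
// so anything after a shared NUL is never examined.
bool equal_as_strings (const char *a, const char *b, std::size_t n)
{
  assert (a && b);
  return std::strncmp (a, b, n) == 0;
}

// Memory-style comparison: always examines all n bytes.
bool equal_as_memory (const char *a, const char *b, std::size_t n)
{
  assert (a && b);
  return std::memcmp (a, b, n) == 0;
}
```

For two 4-byte buffers { 'a', 'b', 0, 'x' } and { 'a', 'b', 0, 'y' }, the string-style comparison reports equality while the memory-style comparison does not, which is why memcmp cannot be expanded through a pattern that stops at zero bytes.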
This avoids overflow in the allocation size computations in
sbitmap_vector_alloc when the result exceeds 2GB.
2020-10-26 Richard Biener <rguenther@suse.de>
* sbitmap.c (sbitmap_vector_alloc): Use size_t for byte
quantities to avoid overflow.
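The kind of overflow described here can be sketched as follows. This is an illustration only, not the sbitmap.c code: the function names are hypothetical, unsigned 32-bit arithmetic is used so that the wraparound is well defined, and the size_t variant assumes a host where size_t is 64 bits wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical: total bytes for n_vecs bitmaps of bytes_per_vec bytes each,
// computed in 32-bit arithmetic.  The product silently wraps once it no
// longer fits in 32 bits.
std::uint32_t total_bytes_32 (std::uint32_t n_vecs, std::uint32_t bytes_per_vec)
{
  return n_vecs * bytes_per_vec;
}

// The same computation carried out in size_t, with a guard that the
// product is representable.
std::size_t total_bytes_size_t (std::size_t n_vecs, std::size_t bytes_per_vec)
{
  assert (bytes_per_vec == 0 || n_vecs <= SIZE_MAX / bytes_per_vec);
  return n_vecs * bytes_per_vec;
}
```

For example, 70000 vectors of 70000 bytes need 4900000000 bytes, which exceeds both 2GB and the 32-bit range, so the 32-bit result is wrong while the size_t result is exact on a 64-bit host.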
This makes sure to reset out-of-loop debug uses before vectorizer
loop peeling as we cannot make sure to retain the use-def dominance
relationship when there are no LC SSA nodes.
2020-10-26 Richard Biener <rguenther@suse.de>
PR tree-optimization/97539
* tree-vect-loop-manip.c (vect_do_peeling): Reset out-of-loop
debug uses before peeling.
* gcc.dg/pr97539.c: New testcase.
The default duplicate and insert methods of summaries produce an empty
summary that is not useful for anything and makes it easy to introduce
bugs.
This patch makes the default hooks abort, and summaries that do not
need duplication/insertion disable the corresponding hooks. I also
implemented the missing insertion hook for ipa-sra, which forced me to
move the analysis out of the anonymous namespace.
2020-10-23 Jan Hubicka <hubicka@ucw.cz>
* cgraph.h (struct cgraph_node): Make ipa_transforms_to_apply vl_ptr.
* ipa-inline-analysis.c (initialize_growth_caches): Disable insertion
and duplication hooks.
* ipa-inline-transform.c (clone_inlined_nodes): Clear
ipa_transforms_to_apply.
(save_inline_function_body): Disable insertion hook for
ipa_saved_clone_sources.
* ipa-prop.c (ipcp_transformation_initialize): Disable insertion hook.
* ipa-prop.h (ipa_node_params_t): Disable insertion hook.
* ipa-reference.c (propagate): Disable insertion hook.
* ipa-sra.c (ipa_sra_summarize_function): Move out of anonymous
namespace.
(ipa_sra_function_summaries::insert): New virtual function.
* passes.c (execute_one_pass): Do not add transforms to inline clones.
* symbol-summary.h (function_summary_base): Make insert and duplicate
hooks fail instead of silently producing empty summaries; add way to
disable duplication hooks.
(call_summary_base): Likewise.
* tree-nested.c (nested_function_info::get_create): Disable insertion
hooks.
(maybe_record_nested_function): Likewise.
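The policy described above (default hooks fail loudly, and summaries that do not need a hook disable it explicitly) can be sketched as below. All names here (summary_sketch, disable_insertion_hook, node_inserted, counting_summary) are hypothetical and do not reproduce the symbol-summary.h interface.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a per-function summary holder.
class summary_sketch
{
public:
  virtual ~summary_sketch () {}

  // Opt out for summaries that never need to react to new nodes.
  void disable_insertion_hook () { m_insertion_enabled = false; }

  // Called when a new node appears.  Returns true if the hook ran.
  bool node_inserted (int node_id)
  {
    assert (node_id >= 0);
    if (!m_insertion_enabled)
      return false;
    insert (node_id);
    return true;
  }

protected:
  // Default: abort instead of silently creating an empty summary.
  virtual void insert (int) { std::abort (); }

private:
  bool m_insertion_enabled = true;
};

// A summary that provides its own insertion hook.
class counting_summary : public summary_sketch
{
public:
  int inserted = 0;

protected:
  void insert (int) override { ++inserted; }
};
```

A summary that neither overrides insert nor disables the hook aborts on the first new node, so a forgotten hook shows up immediately rather than as an empty summary later on.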
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_base.h
(_Sp_counted_base::_M_add_ref_lock_nothrow()): Add noexcept to
definitions to match declaration.
(__shared_count(const __weak_count&, nothrow_t)): Add noexcept
to declaration to match definition.
gcc/ada/
* exp_aggr.adb (Build_Array_Aggr_Code): If the aggregate
includes an Others_Choice in an association that is an
Iterated_Component_Association, generate a proper loop for it.
gcc/ada/
* sem_attr.adb (Check_Image_Type): Remove "|", so the compiler
will not crash.
* errout.ads: Improve comment. This has nothing to do with
-gnatQ.
gcc/ada/
* contracts.adb (Causes_Contract_Freezing): Extend condition to
match the one in Analyze_Subprogram_Body_Helper. This routine is
used both as an assertion at the very start of
Freeze_Previous_Contracts and to detect previous declaration for
which Freeze_Previous_Contracts has been executed.
gcc/ada/
* inline.adb (Establish_Actual_Mapping_For_Inlined_Call): Add
guard for a call to Set_Last_Assignment with the same condition
as the assertion in that routine and explain why this guard
fails in GNATprove mode.
gcc/ada/
* libgnat/a-tifiio.adb: Change the range of supported Small
values.
(E0, E1, E2): Adjust factors.
(Exact): Return false if the Small does not fit in 64 bits.
gcc/ada/
* libgnat/g-socket.adb (Wait_On_Socket): Boolean parameter
For_Read changed to Event parameter of type
GNAT.Sockets.Poll.Wait_Event_Set. Implementation is simplified
and based on call to GNAT.Sockets.Poll.Wait now.
gcc/ada/
* libgnat/s-dwalin.adb (Symbolic_Traceback): Always emit the hex
address at the beginning of an entry if suppression is not
requested. Consistently output a "???" for the subprogram name
when it is unknown.
gcc/ada/
* sem_aggr.adb (Resolve_Delta_Array_Aggregate): For an
association that is an iterated component association, attach
the copy of the expression to the tree prior to analysis, in
order to preserve its context. This is needed when verifying
static semantic rules that depend on context, for example that a
use of 'Old appears only within a postcondition.
gcc/ada/
* sem_aggr.adb (Resolve_Extension_Aggregate): When testing for
an aggregate that is illegal due to having an ancestor type that
has unknown discriminants, add an "or else" condition testing
whether the aggregate type has unknown discriminants and that
Partial_View_Has_Unknown_Discr is also set on the ancestor type.
Extend the comment, including adding ??? about a possible
simpler test.
gcc/ada/
* exp_spark.adb (Expand_SPARK_Delta_Or_Update): Add missing call
to Enter_Name, just like it is called for
iterated_component_association in Expand_SPARK_N_Aggregate.
gcc/ada/
* exp_spark.adb (Expand_SPARK_Delta_Or_Update): Reuse local
constant Expr and the Choice_List routine.
(Expand_SPARK_N_Aggregate): Reuse local constant Expr.
gcc/ada/
* freeze.adb (Freeze_Type_Refs): When an entity in an expression
function is a type, freeze the entity and not just its type,
which would be incomplete when the type is derived and/or
tagged.
Add overloads that accept a flags argument so we can print
debug_bb_n (5, TDF_DETAILS) in gdb. The debug_bb_slim
variant is then just a forwarder.
gcc/ChangeLog:
2020-10-26 Xionghu Luo <luoxhu@linux.ibm.com>
* cfg.c (debug_bb): New overloaded function.
(debug_bb_n): New overloaded function.
* cfg.h (debug_bb): New declaration.
(debug_bb_n): New declaration.
* print-rtl.c (debug_bb_slim): Call debug_bb with flags.
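The overload-plus-forwarder shape described above can be sketched in isolation as follows. The flag names and the describe_block functions are hypothetical stand-ins that return a string instead of dumping a basic block; they are not the cfg.c implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical dump flags, standing in for GCC's TDF_* values.
enum sketch_flags : unsigned
{
  SK_NONE = 0,
  SK_DETAILS = 1u << 0,
  SK_SLIM = 1u << 1
};

// Overload taking explicit flags: does the actual work.
std::string describe_block (int index, unsigned flags)
{
  assert (index >= 0);
  std::string out = "bb " + std::to_string (index);
  if (flags & SK_DETAILS)
    out += " [details]";
  if (flags & SK_SLIM)
    out += " [slim]";
  return out;
}

// The existing one-argument entry point keeps working by forwarding.
std::string describe_block (int index)
{
  return describe_block (index, SK_NONE);
}

// A "slim" variant becomes a thin forwarder as well.
std::string describe_block_slim (int index)
{
  return describe_block (index, SK_SLIM);
}
```

Keeping the one-argument overload means existing debugger habits still work, while the two-argument form lets the caller choose the level of detail at the call site.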
The GNATRTL_128BIT_PAIRS/OBJS need to be added for 64bit
multilibs on powerpc-darwin, and for powerpc64-darwin.
gcc/ada/ChangeLog:
* Makefile.rtl: Add GNATRTL_128BIT_PAIRS/OBJS for 64bit
PowerPC Darwin cases.
A wrong decl for findloc caused segfaults at runtime on
Darwin for ARM; however, this is only a symptom of a larger
disease: the declarations for our library functions are often
inconsistent. This patch solves that problem specifically for
the functions for which we do not pass optional
arguments, i.e. findloc and (min|max)loc.
It works by saving the symbols of the specific functions in
gfc_intrinsic_namespace and by generating the formal argument
lists from the actual argument lists. Because symbols are
re-used, so are the backend decls.
gcc/fortran/ChangeLog:
PR fortran/97454
* gfortran.h (gfc_symbol): Add pass_as_value flag.
(gfc_copy_formal_args_intr): Add optional argument
copy_type.
(gfc_get_intrinsic_function_symbol): Add prototype.
(gfc_find_intrinsic_symbol): Add prototype.
* intrinsic.c (gfc_get_intrinsic_function_symbol): New function.
(gfc_find_intrinsic_symbol): New function.
* symbol.c (gfc_copy_formal_args_intr): Add argument. Handle case
where the type needs to be copied from the actual argument.
* trans-intrinsic.c (remove_empty_actual_arguments): New function.
(specific_intrinsic_symbol): New function.
(gfc_conv_intrinsic_funcall): Use it.
(strip_kind_from_actual): Adjust so that the expression pointer
is set to NULL.
(gfc_conv_intrinsic_minmaxloc): Likewise.
(gfc_conv_intrinsic_minmaxval): Adjust removal of dim.
* trans-types.c (gfc_sym_type): If sym->pass_as_value is set, do
not pass by reference.
Rename HAVE_AS_WORKING_DWARF_4_FLAG to HAVE_AS_WORKING_DWARF_N_FLAG.
Don't set HAVE_AS_WORKING_DWARF_N_FLAG if --gdwarf-5/--gdwarf-4 generate
an extra assembly input file in debug info from compiler generated
.debug_line or fail with the APP marker:
https://sourceware.org/bugzilla/show_bug.cgi?id=25878
https://sourceware.org/bugzilla/show_bug.cgi?id=26740
https://sourceware.org/bugzilla/show_bug.cgi?id=26778
Also replace success with dwarf4_success in the 32-bit --gdwarf-4 check.
PR bootstrap/97451
* configure.ac (HAVE_AS_WORKING_DWARF_4_FLAG): Renamed to ...
(HAVE_AS_WORKING_DWARF_N_FLAG): This. Don't define if there is
an extra assembly input file in debug info. Replace success
with dwarf4_success in the 32-bit --gdwarf-4 check.
* dwarf2out.c (asm_outputs_debug_line_str): Check
HAVE_AS_WORKING_DWARF_N_FLAG instead of
HAVE_AS_WORKING_DWARF_4_FLAG.
* gcc.c (ASM_DEBUG_SPEC): Likewise.
(ASM_DEBUG_OPTION_SPEC): Likewise.
* config.in: Regenerated.
* configure: Likewise.
The code added in r10-6437 caused us to create a CONSTRUCTOR when we're
{}-initializing an aggregate. Then we pass this new CONSTRUCTOR down to
cxx_eval_constant_expression which, if the CONSTRUCTOR isn't TREE_CONSTANT
or reduced_constant_expression_p, calls cxx_eval_bare_aggregate. In
this case the CONSTRUCTOR wasn't reduced_constant_expression_p because
for r_c_e_p a CONST_DECL isn't good enough so it returns false. So we
go to cxx_eval_bare_aggregate where we crash, because ctx->ctor wasn't
set up properly. So my fix is to do so. Since we're value-initializing,
I'm not setting CONSTRUCTOR_NO_CLEARING. To avoid keeping a garbage
constructor around, I call free_constructor in case the evaluation did
not use it.
gcc/cp/ChangeLog:
PR c++/96241
* constexpr.c (cxx_eval_array_reference): Set up ctx->ctor if we
are initializing an aggregate. Call free_constructor on the new
CONSTRUCTOR if it isn't returned from cxx_eval_constant_expression.
gcc/testsuite/ChangeLog:
PR c++/96241
* g++.dg/cpp0x/constexpr-96241.C: New test.
* g++.dg/cpp1y/constexpr-96241.C: New test.
An undefined range was leaking through to the end of this function,
which leads us to use an uninitialized wide_int.
gcc/ChangeLog:
PR tree-optimization/97538
* calls.c (get_size_range): Handle undefined ranges.
gcc/testsuite/ChangeLog:
* g++.dg/pr97538.C: New test.
This fixes the following failure:
ld: cgraph.o: in function `cgraph_edge::verify_corresponds_to_fndecl(tree_node*)':
gcc/cgraph.c:3067: undefined reference to `cgraph_node::former_thunk_p()'
ld: cgraph.o: in function `clone_of_p':
gcc/ChangeLog:
* cgraph.c (cgraph_node::former_thunk_p): Move out of CHECKING_P
macro.
gcc.target/powerpc/fold-vec-st-pixel.c and other testcases fail on
power10, generating
addi 9,5,12
rldicr 9,9,0,59
stxv 34,0(9)
rather than
addi 5,5,12
stvx 2,0,5
for an altivec lvx/stvx style address.
The problem starts with fwprop creating
(insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
(const_int 12 [0xc]))
(const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128])
(reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
which is finally thrown out as invalid by lra. lra of course does that
by reloading the entire address.
fwprop creates the invalid address due to rs6000_legitimate_address_p
trimming off the outer AND of altivec style addresses before applying
other predicates. address_is_prefixed then allows the inner address.
Now at the time the AND stripping was added (git commit 850e8d3d56),
rs6000_legitimate_address_p looked a lot simpler. This patch allows
through just those addresses that were legitimate in those simpler
days.
* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Limit
AND addressing to just lvx/stvx style addresses.
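For reference, the (and (plus reg 12) -16) form in the RTL above corresponds to an lvx/stvx style access, where the low four bits of the computed address are ignored. A minimal sketch of that arithmetic, with a hypothetical function name:

```cpp
#include <cassert>
#include <cstdint>

// Effective address of an lvx/stvx style access: (base + offset) & -16,
// i.e. the sum rounded down to a 16-byte boundary.
std::uint64_t altivec_effective_address (std::uint64_t base, std::int64_t offset)
{
  std::uint64_t ea
    = (base + static_cast<std::uint64_t> (offset)) & ~std::uint64_t (15);
  assert (ea % 16 == 0);
  return ea;
}
```

With base 0x1008 and offset 12 the sum is 0x1014 and the access goes to 0x1010. In the good code the stvx instruction performs this masking itself, whereas the bad sequence spends an extra rldicr to clear the low bits explicitly before a stxv.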