gcc/ada/
* exp_util.adb (Expand_Sliding_Conversion): Only perform
expansion when Expander_Active is True. Add a comment about this
and refine existing comment regarding string literals.
The following plugs another hole where we cache a failed SLP build
attempt with an all-success 'matches'. It also adds checking that
we don't do that.
2021-06-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/101121
* tree-vect-slp.c (vect_build_slp_tree_2): Do not fail fatally
when we just lack a stmt with the desired op when doing permutation.
(vect_build_slp_tree): When caching a failed SLP build attempt
assert that at least one lane is marked as not matching.
* gfortran.dg/pr101121.f: New testcase.
AVX512 supports bitwise operations with mask registers, but the
throughput of those instructions is much lower than that of the
corresponding GPR versions, so additionally disparage slightly
the mask register alternative for bitwise operations in
LRA.
Also, when the allocno cost of GENERAL_REGS is the same as that of MASK_REGS,
allocate MASK_REGS first since it has already been disparaged.
gcc/ChangeLog:
PR target/101142
* config/i386/i386.md (*anddi_1): Disparage slightly the mask
register alternative.
(*and<mode>_1): Ditto.
(*andqi_1): Ditto.
(*andn<mode>_1): Ditto.
(*<code><mode>_1): Ditto.
(*<code>qi_1): Ditto.
(*one_cmpl<mode>2_1): Ditto.
(*one_cmplsi2_1_zext): Ditto.
(*one_cmplqi2_1): Ditto.
* config/i386/i386.c (x86_order_regs_for_local_alloc): Change
the order of mask registers to be before general registers.
gcc/testsuite/ChangeLog:
PR target/101142
* gcc.target/i386/spill_to_mask-1.c: Adjust testcase.
* gcc.target/i386/spill_to_mask-2.c: Adjust testcase.
* gcc.target/i386/spill_to_mask-3.c: Adjust testcase.
* gcc.target/i386/spill_to_mask-4.c: Adjust testcase.
The following patch attempts to resolve PR target/11877 (without
triggering PR target/23102). On x86_64, writing an SImode or DImode zero
to memory uses an instruction encoding that is larger than first
clearing a register (using xor) then writing that to memory. Hence,
after reload, the peephole2 pass can determine if there's a suitable
free register, and if so, use that to shrink the code size with -Os.
To improve code size, and avoid inserting a large number of xor
instructions (PR target/23102), this patch makes use of peephole2's
efficient pattern matching to use a single temporary for a run of
consecutive writes. In theory, one could do better still with a
new target-specific pass, gated on -Os, to shrink these instructions
(like stv), but that's probably overkill for the little remaining
space savings.
Evaluating this patch on the CSiBE benchmark (v2.1.1) results in a
0.26% code size improvement (3715273 bytes down to 3705477) on x86_64
with -Os [saving 1 byte every 400]. 549 of 894 tests improve, two
tests grow larger. Analysis of these 2 pathological cases reveals
that although peephole2's match_scratch prefers to use a call-clobbered
register (to avoid requiring a new stack frame), very rarely this
interacts with GCC's shrink wrapping optimization, which may previously
have avoided saving/restoring a call clobbered register, such as %eax,
in the calling function.
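As a rough illustration of the kind of source this targets, the sketch below clears two adjacent fields. The struct and function names are hypothetical and not taken from the new test case. Compiling it for x86_64 with -Os -S before and after the patch is one way to check whether a single zeroed scratch register is reused for both stores; the exact instructions chosen depend on register availability.

```cpp
// Hypothetical example: two consecutive DImode zero stores on x86_64.
struct Pair {
  long a;
  long b;
};

// With -Os, the patch is intended to let peephole2 clear one free
// register and store it twice, rather than encoding an immediate
// zero in each store.  The generated code is not guaranteed here.
void clear_pair(Pair* p) {
  p->a = 0;
  p->b = 0;
}
```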
2021-06-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/11877
* config/i386/i386.md: New define_peephole2s to shrink writing
1, 2 or 4 consecutive zeros to memory when optimizing for size.
gcc/testsuite/ChangeLog
PR target/11877
* gcc.target/i386/pr11877.c: New test case.
This implements the new views::split from P2210R2 "Superior String
Splitting".
libstdc++-v3/ChangeLog:
* include/std/ranges (__non_propagating_cache::operator bool):
Define for split_view::begin().
(split_view): Define as per P2210.
(views::__detail::__can_split_view): Define.
(views::_Split, views::split): Define.
* testsuite/std/ranges/adaptors/100577.cc (test01, test02):
Test views::split.
* testsuite/std/ranges/adaptors/split.cc: New test.
* testsuite/std/ranges/p2325.cc (test08a): New test.
* testsuite/std/ranges/p2367.cc (test01): Test views::split.
This mostly mechanical patch renames split to lazy_split throughout.
libstdc++-v3/ChangeLog:
* include/std/ranges: Rename views::split to views::lazy_split,
split_view to lazy_split_view, etc. throughout.
* testsuite/std/ranges/*: Likewise.
This implements the part of P2210R2 "Superior String Splitting" that
resolves LWG 3478.
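The behavior change can be pictured with a hand-written splitter. This is only a sketch of the expected results, not the library implementation: split_sketch is a hypothetical helper that works eagerly on std::string, whereas the real view is lazy and generic.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical eager splitter mirroring the post-LWG 3478 results:
// a trailing delimiter produces a final empty part.
std::vector<std::string> split_sketch(const std::string& text, char delim) {
  std::vector<std::string> parts;
  if (text.empty())
    return parts;  // empty input yields no parts in this sketch
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = text.find(delim, start);
    if (pos == std::string::npos) {
      // Last part; it is empty when the text ends with the delimiter.
      parts.push_back(text.substr(start));
      break;
    }
    parts.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
  return parts;
}
```

Under this sketch, splitting "x," on ',' gives two parts, "x" and an empty string, which matches the adjusted expectation in the 100479.cc test below.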
libstdc++-v3/ChangeLog:
* include/std/ranges (split_view::_OuterIter::__at_end):
Check _M_trailing_empty.
(split_view::_OuterIter::_M_trailing_empty): Define this
data member.
(split_view::_OuterIter::operator++): Set _M_trailing_empty
appropriately.
(split_view::_OuterIter::operator==): Compare
_M_trailing_empty.
* testsuite/std/ranges/adaptors/100479.cc (test03): Expect two
split parts instead of one.
* testsuite/std/ranges/adaptors/split.cc (test11): New test.
libstdc++-v3/ChangeLog:
* include/std/ranges (transform_view::_Iterator::_S_iter_concept):
Consider _Base instead of _Vp as per LWG 3555.
(elements_view::_Iterator::_S_iter_concept): Likewise.
libstdc++-v3/ChangeLog:
* include/std/ranges (split_view::_OuterIter::value_type::begin):
Remove the non-const overload, and remove the copyable constraint
on the const overload as per LWG 3553.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h
(__detail::__common_iter_use_postfix_proxy): Add
move_constructible constraint as per LWG 3546.
(common_iterator::__postfix_proxy): Adjust initializer of
_M_keep as per LWG 3546.
This rewrites ranges::minmax and ranges::minmax_element so that they
perform at most 3*N/2 comparisons, as required by the standard.
In passing, this also fixes PR100387 by avoiding a premature std::move
in ranges::minmax and in std::shift_right.
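One common way to stay within the 3*N/2 bound is to process elements in pairs: compare the two elements with each other, then compare the smaller against the running minimum and the larger against the running maximum. The sketch below shows that idea on a vector of int and counts comparisons. minmax_sketch and MinMaxResult are hypothetical names, and this is not claimed to be the code in ranges_algo.h.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct MinMaxResult {
  int min;
  int max;
  std::size_t comparisons;  // number of element comparisons performed
};

// Pairwise min/max: three comparisons per pair of elements.
MinMaxResult minmax_sketch(const std::vector<int>& v) {
  assert(!v.empty());
  MinMaxResult r{v[0], v[0], 0};
  std::size_t i = 1;
  for (; i + 1 < v.size(); i += 2) {
    int lo = v[i];
    int hi = v[i + 1];
    ++r.comparisons;
    if (hi < lo)
      std::swap(lo, hi);
    ++r.comparisons;
    if (lo < r.min)
      r.min = lo;
    ++r.comparisons;
    if (!(hi < r.max))
      r.max = hi;
  }
  if (i < v.size()) {
    // One leftover element: at most two more comparisons.
    ++r.comparisons;
    if (v[i] < r.min) {
      r.min = v[i];
    } else {
      ++r.comparisons;
      if (!(v[i] < r.max))
        r.max = v[i];
    }
  }
  return r;
}
```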
PR libstdc++/100387
libstdc++-v3/ChangeLog:
* include/bits/ranges_algo.h (__minmax_fn::operator()): Rewrite
to limit comparison complexity to 3*N/2.
(__minmax_element_fn::operator()): Likewise.
(shift_right): Avoid premature std::move of __result.
* testsuite/25_algorithms/minmax/constrained.cc (test04, test05):
New tests.
* testsuite/25_algorithms/minmax_element/constrained.cc (test02):
Likewise.
Update the count of matches for the fusion combine patterns after
the recent changes to them. At Segher's request, used \m and \M
in the match patterns. Also I have grouped together all alternatives of
each fusion insn, which should hopefully make this test a little less
fragile.
gcc/testsuite/ChangeLog
* gcc.target/powerpc/fusion-p10-2logical.c: Update pattern
match counts.
* gcc.target/powerpc/fusion-p10-addadd.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-ldcmpi.c: Update pattern match
counts.
* gcc.target/powerpc/fusion-p10-logadd.c: Update pattern match
counts.
gcc/
* config/h8300/h8300.c (h8300_select_cc_mode): Handle SYMBOL_REF.
* config/h8300/logical.md (<code><mode>3 logical expander): Generate
more efficient code when the source can be trivially simplified.
With poor values gone, pick up range restrictions from statements
by folding them with global cache values.
* gimple-range-cache.cc (ranger_cache::range_of_def): Calculate
a range if global is not available.
(ranger_cache::entry_range): Fall back to range_of_def.
* gimple-range-cache.h (range_of_def): Adjust prototype.
Remove the old "poor value" approach which made callbacks into ranger
from the cache. Use only the best available value for all propagation.
PR tree-optimization/101014
* gimple-range-cache.cc (ranger_cache::ranger_cache): Remove poor
value list.
(ranger_cache::~ranger_cache): Ditto.
(ranger_cache::enable_new_values): Delete.
(ranger_cache::push_poor_value): Delete.
(ranger_cache::range_of_def): Remove poor value processing.
(ranger_cache::entry_range): Ditto.
(ranger_cache::fill_block_cache): Ditto.
* gimple-range-cache.h (class ranger_cache): Remove poor value members.
* gimple-range.cc (gimple_ranger::range_of_expr): Remove call.
* gimple-range.h (class gimple_ranger): Adjust.
gcc/fortran/ChangeLog:
PR fortran/100283
PR fortran/101123
* trans-intrinsic.c (gfc_conv_intrinsic_minmax): Unconditionally
convert result of min/max to result type.
gcc/testsuite/ChangeLog:
PR fortran/100283
PR fortran/101123
* gfortran.dg/min0_max0_1.f90: New test.
* gfortran.dg/min0_max0_2.f90: New test.
gcc/analyzer/ChangeLog:
* store.cc (binding_cluster::get_any_binding): Make symbolic reads
from a cluster with concrete bindings return unknown.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/symbolic-7.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* region-model-manager.cc
(region_model_manager::get_or_create_int_cst): New.
(region_model_manager::maybe_undo_optimize_bit_field_compare): Use
it to simplify away a local tree.
* region-model.cc (region_model::on_setjmp): Likewise.
(region_model::on_longjmp): Likewise.
* region-model.h (region_model_manager::get_or_create_int_cst):
New decl.
* store.cc (binding_cluster::zero_fill_region): Use it to simplify
away a local tree.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
I have followup work where a custom event's description would be better
handled via a vfunc rather than a precanned string, hence this
refactoring to make it easy to add custom_event subclasses.
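The shape of the refactoring can be sketched as follows. These are simplified stand-ins in a hypothetical sketch namespace: the real analyzer classes carry more state and use GCC's own types rather than std::string, and counted_event is an invented subclass shown only to illustrate a description computed on demand.

```cpp
#include <string>
#include <utility>

namespace sketch {

// Abstract base: each subclass decides how to describe itself.
class custom_event {
public:
  virtual ~custom_event() = default;
  virtual std::string get_desc() const = 0;
};

// Keeps the old behavior: the description is a string fixed up front.
class precanned_custom_event : public custom_event {
public:
  explicit precanned_custom_event(std::string desc)
      : m_desc(std::move(desc)) {}
  std::string get_desc() const override { return m_desc; }

private:
  std::string m_desc;
};

// Hypothetical subclass whose description is built when requested.
class counted_event : public custom_event {
public:
  explicit counted_event(int count) : m_count(count) {}
  std::string get_desc() const override {
    return "seen " + std::to_string(m_count) + " times";
  }

private:
  int m_count;
};

}  // namespace sketch
```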
gcc/analyzer/ChangeLog:
* checker-path.cc (class custom_event): Make abstract to allow for
custom vfuncs, splitting existing implementation into...
(class precanned_custom_event): New subclass.
(custom_event::get_desc): Move to...
(precanned_custom_event::get_desc): ...subclass.
* checker-path.h (class custom_event): Make abstract to allow for
custom vfuncs, splitting existing implementation into...
(class precanned_custom_event): New subclass.
* diagnostic-manager.cc (diagnostic_manager::add_events_for_eedge):
Use precanned_custom_event.
* engine.cc
(stale_jmp_buf::maybe_add_custom_events_for_superedge): Likewise.
* sm-signal.cc (signal_delivery_edge_info_t::add_events_to_path):
Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The standard does not require the iterator's value type to be
convertible to the result type, it only requires that the result of
dereferencing the iterator can be passed to the binary function.
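A small sketch of such a case: the elements are std::string, the result type is std::size_t, and a std::string is not convertible to std::size_t, yet every call of the binary function is well-formed. reduce_sketch and LengthSum are hypothetical names; the hand-written left fold only stands in for std::reduce, which may group and reorder operands, so this is not the library code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical functor that accepts every mix of std::size_t and
// std::string operands and always returns std::size_t.
struct LengthSum {
  std::size_t operator()(std::size_t a, std::size_t b) const { return a + b; }
  std::size_t operator()(std::size_t a, const std::string& b) const {
    return a + b.size();
  }
  std::size_t operator()(const std::string& a, std::size_t b) const {
    return a.size() + b;
  }
  std::size_t operator()(const std::string& a, const std::string& b) const {
    return a.size() + b.size();
  }
};

// Simple left fold standing in for reduce(Iter, Iter, T, BinaryOp).
// It only needs op(init, *first) to be callable and convertible to T.
template <typename Iter, typename T, typename BinaryOp>
T reduce_sketch(Iter first, Iter last, T init, BinaryOp op) {
  for (; first != last; ++first)
    init = op(init, *first);
  return init;
}

std::size_t total_length(const std::vector<std::string>& words) {
  return reduce_sketch(words.begin(), words.end(), std::size_t{0},
                       LengthSum{});
}
```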
libstdc++-v3/ChangeLog:
PR libstdc++/95833
* include/std/numeric (reduce(Iter, Iter, T, BinaryOp)): Replace
incorrect static_assert with ones matching the 'Mandates'
conditions in the standard.
* testsuite/26_numerics/reduce/95833.cc: New test.
When a +cdecp[0-7] extension is passed in the -march string on the command line,
multilib linking fails, as described in PR100856. This patch fixes the issue by
generating a separate canonical string, removing from the march string the options
that are not required for multilib linking, and assigning the new string to the
mlibarch option. This mlibarch string is used for multilib comparison.
gcc/ChangeLog:
2021-06-10 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/100856
* common/config/arm/arm-common.c (arm_canon_arch_option_1): New function
derived from arm_canon_arch.
(arm_canon_arch_option): Call it.
(arm_canon_arch_multilib_option): New function.
* config/arm/arm-cpus.in (IGNORE_FOR_MULTILIB): New fgroup.
* config/arm/arm.h (arm_canon_arch_multilib_option): New prototype.
(CANON_ARCH_MULTILIB_SPEC_FUNCTION): New macro.
(MULTILIB_ARCH_CANONICAL_SPECS): New macro.
(DRIVER_SELF_SPECS): Add MULTILIB_ARCH_CANONICAL_SPECS.
* config/arm/arm.opt (mlibarch): New option.
* config/arm/t-rmprofile (MULTILIB_MATCHES): For armv8*-m, replace use
of march on RHS with mlibarch.
gcc/testsuite/ChangeLog:
2021-06-10 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/100856
* gcc.target/arm/acle/pr100856.c: New test.
* gcc.target/arm/multilib.exp: Add tests for cde options.
This fixes the lookup of a pattern stmt def operand.
2021-06-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/101112
* tree-vect-slp.c (vect_slp_linearize_chain): Fix condition
to lookup a pattern stmt def.
When compiled with -m32 -O2 -D_GLIBCXX_USE_CXX11_ABI=0 we get a warning
for 21_strings/basic_string/cons/char/1.cc:
bits/char_traits.h:409:56: warning: ‘void* __builtin_memcpy(void*, const void*, unsigned int)’ reading 1073741821 bytes from a region of size 19 [-Wstringop-overread]
The warning is legitimate, even if that line cannot be reached because
we throw std::length_error before getting there. Since the invalid
length is deliberate (and mentioned in a comment) just suppress the
warning, so that the test can verify we get the exception.
Also remove an unused typedef that produces another warning.
libstdc++-v3/ChangeLog:
* testsuite/21_strings/basic_string/cons/char/1.cc: Use
diagnostic pragma to suppress -Wstringop-overread warning.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
> The following patch does create them, but treats all such bitfields as if
> they were in a structure where the particular bitfield is the only field.
While the patch passed bootstrap/regtest on the trunk, when trying to
backport it to 11 branch the bootstrap failed with
atree.ads:3844:34: size for "Node_Record" too small
errors. Turns out the error is not about size being too small, but actually
about size being non-constant, and comes from:
/* In a FIELD_DECL of a RECORD_TYPE, this is a pointer to the storage
representative FIELD_DECL. */
#define DECL_BIT_FIELD_REPRESENTATIVE(NODE) \
(FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
/* For a FIELD_DECL in a QUAL_UNION_TYPE, records the expression, which
if nonzero, indicates that the field occupies the type. */
#define DECL_QUALIFIER(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.qualifier)
so by setting up DECL_BIT_FIELD_REPRESENTATIVE in QUAL_UNION_TYPE we
actually set or modify DECL_QUALIFIER and then construct size as COND_EXPRs
with those bit field representatives (e.g. with array type) as conditions
which doesn't fold into a constant.
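The aliasing can be reproduced in miniature. In the sketch below, field_decl_sketch and both macros are hypothetical stand-ins for the tree.h definitions quoted above; the only point is that two accessors naming the same member overwrite each other.

```cpp
#include <cassert>

// Hypothetical stand-in for the FIELD_DECL payload: one slot, two meanings.
struct field_decl_sketch {
  const void* qualifier;
};

// Both accessors expand to the same member, as in the quoted macros.
#define BIT_FIELD_REPRESENTATIVE_SKETCH(NODE) ((NODE)->qualifier)
#define QUALIFIER_SKETCH(NODE) ((NODE)->qualifier)

// Returns true when storing a "representative" is visible through the
// "qualifier" accessor, i.e. when the two meanings collide.
bool setting_representative_changes_qualifier() {
  static const int representative = 0;
  field_decl_sketch field = {nullptr};
  assert(QUALIFIER_SKETCH(&field) == nullptr);
  BIT_FIELD_REPRESENTATIVE_SKETCH(&field) = &representative;
  return QUALIFIER_SKETCH(&field) == &representative;
}
```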
The following patch fixes it by not creating DECL_BIT_FIELD_REPRESENTATIVEs
for QUAL_UNION_TYPE as there is nowhere to store them.
Shall we change tree.h to document that DECL_BIT_FIELD_REPRESENTATIVE
is valid also on UNION_TYPE?
I see:
tree-ssa-alias.c- if (TREE_CODE (type1) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field1))
tree-ssa-alias.c: field1 = DECL_BIT_FIELD_REPRESENTATIVE (field1);
tree-ssa-alias.c- if (TREE_CODE (type2) == RECORD_TYPE
tree-ssa-alias.c: && DECL_BIT_FIELD_REPRESENTATIVE (field2))
tree-ssa-alias.c: field2 = DECL_BIT_FIELD_REPRESENTATIVE (field2);
Shall we change that to || == UNION_TYPE or do we assume all fields
are overlapping in a UNION_TYPE already?
At other spots (asan, ubsan, expr.c) it is unclear what will happen
if they see a QUAL_UNION_TYPE with a DECL_QUALIFIER (or does the Ada FE
lower that somehow)?
2021-06-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/101062
* stor-layout.c (finish_bitfield_layout): Don't add bitfield
representatives in QUAL_UNION_TYPE.
gcc/ada/
* sem_ch3.adb (Constrain_Array): Add error checking for
fixed-lower-bound and constrained index ranges applied
inappropriately on subtypes of unconstrained and
fixed-lower-bound array types.
(Constrain_Index): Correct and refine comment related to
fixed-lower-bound index ranges.
gcc/ada/
* gen_il-gen.adb: Improve comments.
* snames.ads-tmpl (Convention_Id): Remove "-- Plenty of space
for expansion", because that's irrelevant now that we are no
longer laying out node fields by hand.
gcc/ada/
* sem_util.adb (Denotes_Same_Object): Explicitly test for node
kinds being the same; deal with renamings one-by-one; adjust
numbers in references to the Ada RM.
gcc/ada/
* sprint.adb (Write_Source_Line): Check for EOF in
Line_Terminator loop. Note that when a source file is read in,
an EOF character is added to the end.
gcc/ada/
* sem_aux.adb (Package_Specification): Add assertions to confirm
the kind of the parameter and returned node.
* sem_ch12.adb (Remove_Parent): Reorder conditions; this change
appears to be semantically neutral, but is enough to avoid the
problematic call to Package_Specification.
* sem_util.adb (Is_Incomplete_Or_Private_Type): Replace loop
with a call to Package_Specification.
gcc/ada/
* exp_ch4.adb (Expand_N_Quantified_Expression): Ensure the type
of the name of a "for of" loop is frozen.
* exp_disp.adb (Check_Premature_Freezing): Complete condition to
take into account a private type completed by another private
type now that the freezing rules are better implemented.
* freeze.adb (Freeze_Entity.Freeze_Profile): Do not perform an
early freeze on types if not in the proper scope. Special case
expression functions that require access to the dispatch table.
(Should_Freeze_Type): New.
* sem_ch13.adb (Resolve_Aspect_Expressions): Prevent assert
failure in case of an invalid tree (previous errors detected).
* sem_res.adb (Resolve): Remove kludge related to entities
causing incorrect premature freezing.
* sem_util.adb (Ensure_Minimum_Decoration): Add protection
against non base types.
gcc/ada/
* sem_ch3.adb (Constrain_Index): Set the High_Bound of a
fixed-lower-bound subtype's range to T (the subtype of the FLB
index being constrained) rather than Base_Type (T).
gcc/ada/
* ada_get_targ.adb, aspects.ads, checks.adb, cstand.adb,
einfo.ads, exp_attr.adb, freeze.adb, get_targ.adb,
libgnat/a-textio.ads, libgnat/g-memdum.ads,
libgnat/s-scaval__128.adb, libgnat/s-scaval.adb, make.adb,
osint.ads, par-prag.adb, sem_ch13.adb, sem_prag.adb,
sem_prag.ads, set_targ.adb, set_targ.ads, snames.ads-tmpl,
targparm.ads, types.ads: Remove AAMP-specific code.
* switch.ads: Minor reformatting.
* gen_il-fields.ads, gen_il-gen.adb,
gen_il-gen-gen_entities.adb, gen_il-types.ads, einfo-utils.adb,
einfo-utils.ads: Package Types now contains "type Float_Rep_Kind
is (IEEE_Binary);", which used to also have an enumeral AAMP.
Gen_IL can't handle fields of this type, which would be zero
sized. Therefore, we move the Float_Rep field into Einfo.Utils
as a synthesized attribute. (We do not delete the field
altogether, in case we want new floating-point representations
in the future.)
* doc/gnat_rm/implementation_defined_pragmas.rst,
doc/gnat_rm/implementation_defined_aspects.rst,
doc/gnat_ugn/building_executable_programs_with_gnat.rst,
doc/gnat_ugn/the_gnat_compilation_model.rst: Remove
AAMP-specific documentation.
* gnat_rm.texi, gnat_ugn.texi: Regenerate.
gcc/ada/
* exp_util.adb (Expand_Sliding_Conversion): Move test of
Is_Fixed_Lower_Bound_Subtype to an assertion. Exclude string
literals from sliding expansion.