This rewrites our range adaptor implementation for more comprehensible
error messages, improved SFINAE behavior and conformance to P2281.
The diagnostic improvements mostly come from using appropriately named
functors instead of lambdas in the generic implementation of partial
application and composition of range adaptors, and in the definition of
each of the standard range adaptors. This makes their pretty printed
types much shorter and more self-descriptive.
The improved SFINAE behavior comes from constraining the range adaptors'
member functions appropriately. This improvement fixes PR99433, and is
also necessary in order to implement the wording changes of P2281.
Finally, P2281 clarified that partial application and composition of
range adaptors behaves like a perfect forwarding call wrapper. This
patch implements this, except that we don't bother adding overloads for
forwarding captured state entities as non-const lvalues, since it seems
sufficient to handle the const lvalue and non-const rvalue cases for now,
given the current set of standard range adaptors. But such overloads
can be easily added if they turn out to be needed.
libstdc++-v3/ChangeLog:
PR libstdc++/99433
* include/std/ranges (__adaptor::__maybe_refwrap): Remove.
(__adaptor::__adaptor_invocable): New concept.
(__adaptor::__adaptor_partial_app_viable): New concept.
(__adaptor::_RangeAdaptorClosure): Rewrite, turning it into a
non-template base class.
(__adaptor::_RangeAdaptor): Rewrite, turning it into a CRTP base
class template.
(__adaptor::_Partial): New class template that represents
partial application of a range adaptor non-closure.
(__adaptor::__pipe_invocable): New concept.
(__adaptor::_Pipe): New class template.
(__detail::__can_ref_view): New concept.
(__detail::__can_subrange): New concept.
(all): Replace the lambda here with ...
(_All): ... this functor. Add appropriate constraints.
(__detail::__can_filter_view): New concept.
(filter, _Filter): As in all/_All.
(__detail::__can_transform): New concept.
(transform, _Transform): As in all/_All.
(__detail::__can_take_view): New concept.
(take, _Take): As in all/_All.
(__detail::__can_take_while_view): New concept.
(take_while, _TakeWhile): As in all/_All.
(__detail::__can_drop_view): New concept.
(drop, _Drop): As in all/_All.
(__detail::__can_drop_while_view): New concept.
(drop_while, _DropWhile): As in all/_All.
(__detail::__can_join_view): New concept.
(join, _Join): As in all/_All.
(__detail::__can_split_view): New concept.
(split, _Split): As in all/_All. Rename template parameter
_Fp to _Pattern.
(__detail::__already_common): New concept.
(__detail::__can_common_view): New concept.
(common, _Common): As in all/_All.
(__detail::__can_reverse_view): New concept.
(reverse, _Reverse): As in all/_All.
(__detail::__can_elements_view): New concept.
(elements, _Elements): As in all/_All.
(keys, values): Adjust.
* testsuite/std/ranges/adaptors/99433.cc: New test.
* testsuite/std/ranges/adaptors/all.cc: No longer expect that
adding empty range adaptor closure objects to a pipeline doesn't
increase the size of the pipeline.
(test05): New test.
* testsuite/std/ranges/adaptors/common.cc (test03): New test.
* testsuite/std/ranges/adaptors/drop.cc (test09): New test.
* testsuite/std/ranges/adaptors/drop_while.cc (test04): New test.
* testsuite/std/ranges/adaptors/elements.cc (test04): New test.
* testsuite/std/ranges/adaptors/filter.cc (test06): New test.
* testsuite/std/ranges/adaptors/join.cc (test09): New test.
* testsuite/std/ranges/adaptors/p2281.cc: New test.
* testsuite/std/ranges/adaptors/reverse.cc (test07): New test.
* testsuite/std/ranges/adaptors/split.cc (test01, test04):
Adjust.
(test09): New test.
* testsuite/std/ranges/adaptors/split_neg.cc (test01): Adjust
expected error message.
(test02): Likewise. Extend test.
* testsuite/std/ranges/adaptors/take.cc (test06): New test.
* testsuite/std/ranges/adaptors/take_while.cc (test05): New test.
* testsuite/std/ranges/adaptors/transform.cc (test07, test08):
New test.
pr99102.c needs to override the default options to exercise the original
problem, but that means that it also needs to respecify the dump flags.
gcc/testsuite/
* gcc.dg/vect/pr99102.c: Add -fdump-tree-vect-details.
The “previous definition of 'x'” notes now include the type
of the original definition before “was here”. There's not really
any need to hard-code that much of the message in the ACLE tests,
so this patch just removes the “was here” from the match string.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general-c/func_redef_1.c: Remove
"was here" from error message.
* gcc.target/aarch64/sve/acle/general-c/func_redef_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/func_redef_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/func_redef_6.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_4.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_6.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_8.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_9.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_10.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/type_redef_13.c: Likewise.
Some sve/mul_2.c tests were failing because we'd (reasonably)
decided to use shifts and adds instead of MULs for some simple
negative constants. We'd already needed to avoid that when
picking positive constants, so this patch does the same thing
for the negative ones.
gcc/testsuite/
* gcc.target/aarch64/sve/mul_2.c: Adjust negative constants to avoid
conversion to shifts and adds.
This allows the docs to be generated on hosts without the necessary
files present for multilib support.
maintainer-scripts/ChangeLog:
* generate_libstdcxx_web_docs: Add --disable-multilib to
configure command.
Prior to this patch, program_state::detect_leaks worked by finding all
live svalues in the old state and in the new state, and calling
on_svalue_leak for each svalue that has changed from being live to
not being live.
PR analyzer/99042 and PR analyzer/99774 both describe false leak
diagnostics from -fanalyzer (a false FILE * leak in git, and a false
malloc leak in qemu, respectively).
In both cases the root cause of the false leak diagnostic relates to
svalues no longer being explicitly bound in the store due to regions
being conservatively clobbered, due to an unknown function being
called, or due to a write through a pointer that could alias the
region, respectively.
We have a transition from an svalue being explicitly live to not
being explicitly live - but only because the store is being
conservative, clobbering the binding. The leak detection is looking
for transitions from "definitely live" to "not definitely live",
when it should be looking for transitions from "definitely live"
to "definitely not live".
This patch introduces a new class to temporarily capture information
about svalues that were explicitly live, but for which a region bound
to them got clobbered for conservative reasons. This new
"uncertainty_t" class is passed around to capture the data long enough
for use in program_state::detect_leaks, where it is used to only
complain about svalues that were definitely live and are now both
not definitely live *or* possibly-live i.e. definitely not-live.
The class also captures for which svalues we can't meaningfully track
sm-state anymore, and resets the svalues back to the "start" state.
Together, these changes fix the false leak reports.
gcc/analyzer/ChangeLog:
PR analyzer/99042
PR analyzer/99774
* engine.cc
(impl_region_model_context::impl_region_model_context): Add
uncertainty param and use it to initialize m_uncertainty.
(impl_region_model_context::get_uncertainty): New.
(impl_sm_context::get_fndecl_for_call): Add NULL for new
uncertainty param when constructing impl_region_model_context.
(impl_sm_context::get_state): Likewise.
(impl_sm_context::set_next_state): Likewise.
(impl_sm_context::warn): Likewise.
(exploded_node::on_stmt): Add uncertainty param
and use it when constructing impl_region_model_context.
(exploded_node::on_edge): Add uncertainty param and pass
to on_edge call.
(exploded_node::detect_leaks): Create uncertainty_t and pass to
impl_region_model_context.
(exploded_graph::get_or_create_node): Create uncertainty_t and
pass to prune_for_point.
(maybe_process_run_of_before_supernode_enodes): Create
uncertainty_t and pass to impl_region_model_context.
(exploded_graph::process_node): Create uncertainty_t instances and
pass around as needed.
* exploded-graph.h
(impl_region_model_context::impl_region_model_context): Add
uncertainty param.
(impl_region_model_context::get_uncertainty): New decl.
(impl_region_model_context::m_uncertainty): New field.
(exploded_node::on_stmt): Add uncertainty param.
(exploded_node::on_edge): Likewise.
* program-state.cc (sm_state_map::on_liveness_change): Get
uncertainty from context and use it to unset sm-state from
svalues as appropriate.
(program_state::on_edge): Add uncertainty param and use it when
constructing impl_region_model_context. Fix indentation.
(program_state::prune_for_point): Add uncertainty param and use it
when constructing impl_region_model_context.
(program_state::detect_leaks): Get any uncertainty from ctxt and
use it to get maybe-live svalues for dest_state, rather than
definitely-live ones; use this when determining which svalues
have leaked.
(selftest::test_program_state_merging): Create uncertainty_t and
pass to impl_region_model_context.
* program-state.h (program_state::on_edge): Add uncertainty param.
(program_state::prune_for_point): Likewise.
* region-model-impl-calls.cc (call_details::get_uncertainty): New.
(region_model::impl_call_memcpy): Pass uncertainty to
mark_region_as_unknown call.
(region_model::impl_call_memset): Likewise.
(region_model::impl_call_strcpy): Likewise.
* region-model-reachability.cc (reachable_regions::handle_sval):
Also add sval to m_mutable_svals.
* region-model.cc (region_model::on_assignment): Pass any
uncertainty from ctxt to the store::set_value call.
(region_model::handle_unrecognized_call): Get any uncertainty from
ctxt and use it to record mutable svalues at the unknown call.
(region_model::get_reachable_svalues): Add uncertainty param and
use it to mark any maybe-bound svalues as being reachable.
(region_model::set_value): Pass any uncertainty from ctxt to the
store::set_value call.
(region_model::mark_region_as_unknown): Add uncertainty param and
pass it on to the store::mark_region_as_unknown call.
(region_model::update_for_call_summary): Add uncertainty param and
pass it on to the region_model::mark_region_as_unknown call.
* region-model.h (call_details::get_uncertainty): New decl.
(region_model::get_reachable_svalues): Add uncertainty param.
(region_model::mark_region_as_unknown): Add uncertainty param.
(region_model_context::get_uncertainty): New vfunc.
(noop_region_model_context::get_uncertainty): New vfunc
implementation.
* store.cc (dump_svalue_set): New.
(uncertainty_t::dump_to_pp): New.
(uncertainty_t::dump): New.
(binding_cluster::clobber_region): Pass NULL for uncertainty to
remove_overlapping_bindings.
(binding_cluster::mark_region_as_unknown): Add uncertainty param
and pass it to remove_overlapping_bindings.
(binding_cluster::remove_overlapping_bindings): Add uncertainty param.
Use it to record any svalues that were in clobbered bindings.
(store::set_value): Add uncertainty param. Pass it to
binding_cluster::mark_region_as_unknown when handling symbolic
regions.
(store::mark_region_as_unknown): Add uncertainty param and pass it
to binding_cluster::mark_region_as_unknown.
(store::remove_overlapping_bindings): Add uncertainty param and
pass it to binding_cluster::remove_overlapping_bindings.
* store.h (binding_cluster::mark_region_as_unknown): Add
uncertainty param.
(binding_cluster::remove_overlapping_bindings): Likewise.
(store::set_value): Likewise.
(store::mark_region_as_unknown): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/99042
PR analyzer/99774
* gcc.dg/analyzer/pr99042.c: New test.
* gcc.dg/analyzer/pr99774-1.c: New test.
* gcc.dg/analyzer/pr99774-2.c: New test.
D attribute support has been updated to have a baseline parity with the
LLVM D compiler's own `ldc.attributes'.
The handler that extracts GCC attributes from a list of UDAs has been
improved to take care of some mistakes that could have been warnings.
UDAs attached to field variables are also now processed for any GCC
attributes attached to them.
The following new attributes have been added to the D front-end:
- @attribute("alloc_size")
- @attribute("used")
- @attribute("optimize")
- @attribute("restrict")
- @attribute("cold")
- @attribute("noplt")
- @attribute("target_clones")
- @attribute("no_icf")
- @attribute("noipa")
- @attribute("symver")
With convenience aliases in a new `gcc.attributes' module to match
the same naming convention as `ldc.attributes':
- @allocSize()
- @assumeUsed
- @fastmath
- @naked
- @restrict
- @cold
- @noplt
- @optStrategy()
- @polly
- @section()
- @target()
- @weak
The old gcc.attribute module has been deprecated, along with the removal
of the following attribute handlers:
- @attribute("alias"): Has been superseded by `pragma(mangle)'.
- @attribute("forceinline"): Renamed to always_inline.
gcc/d/ChangeLog:
* d-attribs.cc: Include fold-const.h and opts.h.
(attr_noreturn_exclusions): Add alloc_size.
(attr_const_pure_exclusions): Likewise.
(attr_inline_exclusions): Add target_clones.
(attr_noinline_exclusions): Rename forceinline to always_inline.
(attr_target_exclusions): New array.
(attr_target_clones_exclusions): New array.
(attr_alloc_exclusions): New array.
(attr_cold_hot_exclusions): New array.
(d_langhook_common_attribute_table): Add new D attribute handlers.
(build_attributes): Update to look for gcc.attributes. Issue warning
if not given a struct literal. Handle void initialized arguments.
(handle_always_inline_attribute): Remove function.
(d_handle_noinline_attribute): Don't extract TYPE_LANG_FRONTEND.
(d_handle_forceinline_attribute): Rename to...
(d_handle_always_inline_attribute): ...this. Remove special handling.
(d_handle_flatten_attribute): Don't extract TYPE_LANG_FRONTEND.
(d_handle_target_attribute): Likewise. Warn about empty arguments.
(d_handle_target_clones_attribute): New function.
(optimize_args): New static variable.
(parse_optimize_options): New function.
(d_handle_optimize_attribute): New function.
(d_handle_noclone_attribute): Don't extract TYPE_LANG_FRONTEND.
(d_handle_alias_attribute): Remove function.
(d_handle_noicf_attribute): New function.
(d_handle_noipa_attribute): New function.
(d_handle_section_attribute): Call the handle_generic_attribute target
hook after performing target independent processing.
(d_handle_symver_attribute): New function.
(d_handle_noplt_attribute): New function.
(positional_argument): New function.
(d_handle_alloc_size_attribute): New function.
(d_handle_cold_attribute): New function.
(d_handle_restrict_attribute): New function.
(d_handle_used_attribute): New function.
* decl.cc (gcc_attribute_p): Update to look for gcc.attributes.
(get_symbol_decl): Update decl source location of old prototypes to
the new declaration being merged.
* types.cc (layout_aggregate_members): Apply user defined attributes
on fields.
libphobos/ChangeLog:
* libdruntime/Makefile.am (DRUNTIME_DSOURCES): Add
gcc/attributes.d.
* libdruntime/Makefile.in: Regenerate.
* libdruntime/gcc/attribute.d: Deprecate module, publicly import
gcc.attributes.
* libdruntime/gcc/deh.d: Update imports.
* libdruntime/gcc/attributes.d: New file.
gcc/testsuite/ChangeLog:
* gdc.dg/gdc108.d: Update test.
* gdc.dg/gdc142.d: Likewise.
* gdc.dg/pr90136a.d: Likewise.
* gdc.dg/pr90136b.d: Likewise.
* gdc.dg/pr90136c.d: Likewise.
* gdc.dg/pr95173.d: Likewise.
* gdc.dg/attr_allocsize1.d: New test.
* gdc.dg/attr_allocsize2.d: New test.
* gdc.dg/attr_alwaysinline1.d: New test.
* gdc.dg/attr_cold1.d: New test.
* gdc.dg/attr_exclusions1.d: New test.
* gdc.dg/attr_exclusions2.d: New test.
* gdc.dg/attr_flatten1.d: New test.
* gdc.dg/attr_module.d: New test.
* gdc.dg/attr_noclone1.d: New test.
* gdc.dg/attr_noicf1.d: New test.
* gdc.dg/attr_noinline1.d: New test.
* gdc.dg/attr_noipa1.d: New test.
* gdc.dg/attr_noplt1.d: New test.
* gdc.dg/attr_optimize1.d: New test.
* gdc.dg/attr_optimize2.d: New test.
* gdc.dg/attr_optimize3.d: New test.
* gdc.dg/attr_optimize4.d: New test.
* gdc.dg/attr_restrict1.d: New test.
* gdc.dg/attr_section1.d: New test.
* gdc.dg/attr_symver1.d: New test.
* gdc.dg/attr_target1.d: New test.
* gdc.dg/attr_targetclones1.d: New test.
* gdc.dg/attr_used1.d: New test.
* gdc.dg/attr_used2.d: New test.
* gdc.dg/attr_weak1.d: New test.
* gdc.dg/imports/attributes.d: New test.
We were telling users they needed more template<> to specialize a member
template in a testcase with no member templates. Only produce that message
if we actually see a member template, and also always print the candidates.
gcc/cp/ChangeLog:
PR c++/94529
* pt.c (determine_specialization): Improve diagnostic.
gcc/testsuite/ChangeLog:
PR c++/94529
* g++.dg/template/mem-spec2.C: New test.
In explicit17.C, we weren't detecting an unexpanded parameter pack in
explicit(bool), so we crashed on a TEMPLATE_PARM_INDEX in constexpr.
I noticed the same is true for noexcept(), but only since my patch to
implement delayed parsing of noexcept. Previously, we would detect the
unexpanded pack in push_template_decl but now the noexcept expression
has not yet been parsed, so we need to do it a bit later.
gcc/cp/ChangeLog:
PR c++/99844
* decl.c (build_explicit_specifier): Call
check_for_bare_parameter_packs.
* except.c (build_noexcept_spec): Likewise.
gcc/testsuite/ChangeLog:
PR c++/99844
* g++.dg/cpp2a/explicit16.C: Use c++20.
* g++.dg/cpp0x/noexcept66.C: New test.
* g++.dg/cpp2a/explicit17.C: New test.
Tim Song pointed out that using __underlying_type is ill-formed for
incomplete enumeration types, and is_scoped_enum doesn't require a
complete type. This changes the trait to check for conversion to int
instead of to the underlying type.
In order to give the correct result when the trait is used in the
enumerator-list of an incomplete type the partial specialization for
enums has an additional check that fails for incomplete types. This
assumes that an incompelte enumeration type must be an unscoped
enumeration, and so the primary template (with a std::false_type base
characteristic) can be used. This isn't necessarily true, but it is not
currently possible to refer to a scoped enumeration type before its type
is complete (PR c++/89025).
It should be possible to use requires(remove_cv_t<_Tp> __t) in the
partial specialization's assignablility check, but that currently gives
an ICE (PR c++/99968) so there is an extra partial specialization of
is_scoped_enum<const _Tp> to handle const types.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_scoped_enum<T>): Constrain partial
specialization to not match incomplete enum types. Use a
requires-expression instead of instantiating is_convertible.
(is_scoped_enum<const T>): Add as workaround for PR c++/99968.
* testsuite/20_util/is_scoped_enum/value.cc: Check with
incomplete types and opaque-enum-declarations.
This patch fixes various issues with vec_duplicate in the MVE patterns.
Currently there are two patterns named *mve_mov<mode>. The second of
these is really a vector duplicate rather than a move, so I've renamed
it accordingly.
As it stands, there are several issues with this pattern:
1. The MVE_types iterator has an entry for TImode, but
vec_duplicate:TI is invalid.
2. The mode of the operand to vec_duplicate is SImode, but it should
vary according to the vector mode iterator.
3. The second alternative of this pattern is bogus: it allows matching
symbol_refs (the cause of the PR) and const_ints (which means that it
matches (vec_duplicate (const_int ...)) which is non-canonical: such
rtxes should be const_vectors instead and handled by the main vector
move pattern).
This patch fixes all of these issues, and removes the redundant
*mve_vec_duplicate<mode> pattern.
gcc/ChangeLog:
PR target/99647
* config/arm/iterators.md (MVE_vecs): New.
(V_elem): Also handle V2DF.
* config/arm/mve.md (*mve_mov<mode>): Rename to ...
(*mve_vdup<mode>): ... this. Remove second alternative since
vec_duplicate of const_int is not canonical RTL, and we don't
want to match symbol_refs.
(*mve_vec_duplicate<mode>): Delete (pattern is redundant).
gcc/testsuite/ChangeLog:
PR target/99647
* gcc.c-torture/compile/pr99647.c: New test.
print_rtl will dump the rtx_insn from current until LAST. But it is only
useful to see the particular insn that called by print_rtx_insn_vec,
Let's call print_rtl_single to display that insn in the gcse and store-motion
pass dump.
2021-04-07 Xionghu Luo <luoxhu@linux.ibm.com>
gcc/ChangeLog:
* fold-const.c (fold_single_bit_test): Fix typo.
* print-rtl.c (print_rtx_insn_vec): Call print_rtl_single
instead.
Different code paths were correctly choosing to look up D directly, since C
is the current instantiation, but here we decided to try to make it a
typename type, leading to confusion. Fixed by using dependent_scope_p as we
do elsewhere.
gcc/cp/ChangeLog:
PR c++/41723
* parser.c (cp_parser_class_name): Check dependent_scope_p.
gcc/testsuite/ChangeLog:
PR c++/41723
* g++.dg/template/friend71.C: New test.
Here we were mistakenly treating the injected-class-name as a partial
specialization.
gcc/cp/ChangeLog:
PR c++/52625
* pt.c (maybe_process_partial_specialization): Check
DECL_SELF_REFERENCE_P.
gcc/testsuite/ChangeLog:
PR c++/52625
* g++.dg/template/friend70.C: New test.
The problem here was that the lookup for 'impl' when parsing the template
only found the using-declaration, not the member function declaration.
This happened because when trying to add the member function declaration,
push_class_level_binding_1 saw that the current binding was a USING_DECL and
the new value is an overload, and decided to just return success.
That 'return true' dates back to r69921. In
https://gcc.gnu.org/pipermail/gcc-patches/2003-July/110632.html Nathan
mentions that we only push dependent USING_DECLs, which is no longer the
case; now that we retain more USING_DECLs, handling this case like the other
overloaded function cases seems like the obvious solution.
gcc/cp/ChangeLog:
PR c++/92918
* name-lookup.c (push_class_level_binding_1): Do overload a new
function with a previous using-declaration.
gcc/testsuite/ChangeLog:
PR c++/92918
* g++.dg/lookup/using66.C: New test.
It turns out that, on targets that use testglue, many gcc.dg/vect
scan-dump tests became UNRESOLVED after the change to the dump
file naming scheme.
The problem is that, when creating an executable, we normally name
the dump file after both the executable and the source file name.
However, as an exception, we name it after only the source file
name if:
(a) there is only one source file name and
(b) the source file and the executable have the same basename
Both (a) and (b) are normally true when building executables from
gcc.dg/vect. But (a) is not true when linking against testglue.
The harness was therefore looking for a dump file based only on the
source file name while the compiler was producing a dump file that
contained both names.
We get around this for dg-additional-sources using:
# This option restores naming of aux and dump output files
# after input files when multiple input files are named,
# instead of getting them combined with the output name.
lappend options "additional_flags=-dumpbase \"\""
This patch does the same thing for executables that are linked
against testglue. This removes over 2400 UNRESOLVEDs from an
armeb-eabi test run, but in so doing introduces FAILs for some
tests that were previously skipped.
gcc/testsuite/
* lib/gcc.exp (gcc_target_compile): Add -dumpbase ""
when building an executable with testglue.
Calling the non-const data() member on a COW string makes it "leaked",
possibly resulting in reallocating the string to ensure a unique owner.
The path::_M_split_cmpts() member parses its _M_pathname string using
string_view objects and then calls _M_pathname.data() to find the offset
of each string_view from the start of the string. However because
_M_pathname is non-const that will cause a COW string to reallocate if
it happens to be shared with another string object. This results in the
offsets calculated for each component being wrong (i.e. undefined)
because the string views no longer refer to substrings of the
_M_pathname member. The fix is to use the parse.offset(c) member which
gets the offset safely.
The bug only happens for the path(string_type&&) constructor and only
for COW strings. When constructed from an lvalue string the string's
contents are copied rather than just incrementing the refcount, so
there's no reallocation when calling the non-const data() member. The
testsuite changes check the lvalue case anyway, because we should
probably change the deep copying to just be a refcount increment (by
adding a path(const string_type&) constructor or an overload for
__effective_range(const string_type&), for COW strings only).
libstdc++-v3/ChangeLog:
PR libstdc++/99805
* src/c++17/fs_path.cc (path::_M_split_cmpts): Do not call
non-const member on _M_pathname, to avoid copy-on-write.
* testsuite/27_io/filesystem/path/decompose/parent_path.cc:
Check construction from strings that might be shared.
Many of the gcc.target/sve/slp-perm*.c tests started failing
after the introduction of separate SLP permute nodes.
This patch adds variable-length support using a similar
technique to vect_transform_slp_perm_load.
As there, the idea is to detect when every permute mask vector
is the same and can be generated using a regular stepped sequence.
We can easily handle those cases for variable-length, but still
need to restrict the general case to constant-length.
Again copying vect_transform_slp_perm_load, the idea is to distinguish
the two cases regardless of whether the length is variable or not,
partly to increase testing coverage and partly because it avoids
generating redundant trees.
Doing this means that we can also use SLP for the two-vector
permute in pr88834.c, which we couldn't before VEC_PERM_EXPR
nodes were introduced. The patch therefore makes pr88834.c
check that we don't regress back to not using SLP and adds
pr88834_ld3.c to check for the original problem in the PR.
gcc/
PR tree-optimization/97513
* tree-vect-slp.c (vect_add_slp_permutation): New function,
split out from...
(vectorizable_slp_permutation): ...here. Detect cases in which
all VEC_PERM_EXPRs are guaranteed to have the same stepped
permute vector and only generate one permute vector for that case.
Extend that case to handle variable-length vectors.
gcc/testsuite/
* gcc.target/aarch64/sve/pr88834.c: Expect the vectorizer to use SLP.
* gcc.target/aarch64/sve/pr88834_ld3.c: New test.
As noted in the PR, we were no longer using ST3 for the testcase and
instead stored each lane individually. This is because we'd split
the store group during SLP and couldn't recover when SLP failed.
However, we can also get better code with ST3 and ST4 even if SLP would
have succeeded, such as for vect-complex-5.c. I'm not sure exactly
where the cut-off point is, but it seems reasonable to allow the split
if either of the new groups would operate on full vectors *within*
rather than across scalar loop iterations.
E.g. on a Cortex-A57, pr99873_3.c performs better using ST4 while
pr99873_2.c performs better with SLP.
Another factor is that SLP can handle smaller iteration counts than
IFN_STORE_LANES can, but we don't have the infrastructure to choose
reliably based on that.
gcc/
PR tree-optimization/99873
* tree-vect-slp.c (vect_slp_prefer_store_lanes_p): New function.
(vect_build_slp_instance): Don't split store groups that could
use IFN_STORE_LANES.
gcc/testsuite/
* gcc.dg/vect/slp-21.c: Only expect 2 of the loops to use SLP
if IFN_STORE_LANES is available.
* gcc.dg/vect/vect-complex-5.c: Expect no loops to use SLP if
IFN_STORE_LANES is available.
* gcc.target/aarch64/pr99873_1.c: New test.
* gcc.target/aarch64/pr99873_2.c: Likewise.
* gcc.target/aarch64/pr99873_3.c: Likewise.
* gcc.target/aarch64/sve/pr99873_1.c: Likewise.
* gcc.target/aarch64/sve/pr99873_2.c: Likewise.
* gcc.target/aarch64/sve/pr99873_3.c: Likewise.
Last year, I have added in r11-2944-g0106300f6c3f7bae5eb1c46dbd45aa07c94e1b15
(aka PR54201 fix) code to find bitwise duplicates in constant pool and output
them as aliases instead of duplicating the data.
Unfortunately this broke mingw32 -m32.
On most targets, ASM_GENERATE_INTERNAL_LABEL with "LC" emits something like
*.LC123 and the targets don't add user label prefixes, so the aliases
that we print should be something like
.set .LC5, .LC6
or
.set .LC5, .LC6 + 8
and I wasn't sure if ASM_OUTPUT_DEF can handle the * and therefore I have
stripped it.
But, on mingw32 -m32, ASM_GENERATE_INTERNAL_LABEL with "LC" emits
*LC123 and the target has user label prefixes, which means what I wrote
results in
LC6:
...
.set _LC5, _LC6
which results in unresolved symbols. I went through the ASM_OUTPUT_DEF
definitions of all targets and all of them use assemble_name twice under
the hood (with various differences on what they print before, in between or
after those names). And assemble_name handles the name encoding properly,
so if we pass it ASM_OUTPUT_DEF (..., "*.LC123", "*.LC456+16") it will
emit .LC123 and .LC456+16 and if we pass it "*LC789", it will emit
LC789.
2021-04-07 Jakub Jelinek <jakub@redhat.com>
PR target/99872
* varasm.c (output_constant_pool_contents): Don't strip name encoding
from XSTR (desc->sym, 0) or from label before passing those to
ASM_OUTPUT_DEF.
This fixes bogus classification of a copy as memcpy. We cannot use
plain dependence analysis to decide between memcpy and memmove when
it computes no dependence. Instead we have to try harder later which
the patch does for the gcc.dg/tree-ssa/ldist-24.c testcase by resorting
to tree-affine to compute the difference between src and dest and
compare against the copy size.
2021-04-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/99954
* tree-loop-distribution.c: Include tree-affine.h.
(generate_memcpy_builtin): Try using tree-affine to prove
non-overlap.
(loop_distribution::classify_builtin_ldst): Always classify
as PKIND_MEMMOVE.
* gcc.dg/torture/pr99954.c: New testcase.
This fixes the order of the type attributes to preserve may_alias
for the vector type.
2021-04-07 Richard Biener <rguenther@suse.de>
PR testsuite/99955
* gcc.c-torture/execute/pr92618.c: Move may_alias attributes
last.
This avoids (again) the C++ pitfall of pushing a reference to
sth being reallocated.
2021-04-07 Richard Biener <rguenther@suse.de>
PR tree-optimization/99947
* tree-vect-loop.c (vectorizable_induction): Pre-allocate
steps vector to avoid pushing elements from the reallocated
vector.
* gcc.dg/torture/pr99947.c: New testcase.
This factors out a helper to dump VN reference operands, sth that
proves useful in debugging VN issues.
2021-04-07 Richard Biener <rguenther@suse.de>
* tree-ssa-sccvn.h (print_vn_reference_ops): Declare.
* tree-ssa-pre.c (print_pre_expr): Factor out VN reference operand
printing...
* tree-ssa-sccvn.c (print_vn_reference_ops): ... into this new
function.
(debug_vn_reference_ops): New.
Tree loop distribution uses RPO to build reduced dependence graph,
it's important that RPO preserves the original programing order.
Though it usually does so, when distributing loop nest, exit BB can
be placed before some loop BBs while after loop header. This patch
fixes the issue by calling rev_post_order_and_mark_dfs_back_seme.
gcc/ChangeLog:
PR tree-optimization/98736
* tree-loop-distribution.c
* (loop_distribution::bb_top_order_init):
Compute RPO with programing order preserved by calling function
rev_post_order_and_mark_dfs_back_seme.
gcc/testsuite/ChangeLog:
PR tree-optimization/98736
* gcc.c-torture/execute/pr98736.c: New test.
We were deferring access checks while parsing B<int>{}, didn't adjust that
when we went to instantiate the default member initializer for B::c,
deferred access checking for C::C, and then checked it after parsing
B<int>{}, back in the main() context which has no access. We need to do the
access checks in the class context of the DMI.
I tried fixing this in push_to/pop_from_top_level, but that caused several
regressions.
gcc/cp/ChangeLog:
PR c++/96673
* init.c (get_nsdmi): Don't defer access checking.
gcc/testsuite/ChangeLog:
PR c++/96673
* g++.dg/cpp1y/nsdmi-aggr13.C: New test.
C++17 makes constexpr static data members implicitly inline variables. In
C++14, a subsequent out-of-class declaration is the definition. We want to
continue emitting a symbol for such a declaration in C++17 mode, for ABI
compatibility with C++14 code that wants to refer to it.
Normally I'd distinguish in- and out-of-class declarations by looking at
DECL_IN_AGGR_P, but we never set DECL_IN_AGGR_P on inline variables. I
think that's wrong, but don't want to mess with it so close to release.
Conveniently, we already have a test for in-class declaration earlier in the
function.
gcc/cp/ChangeLog:
PR c++/99901
* decl.c (cp_finish_decl): mark_needed an implicitly inline
static data member with an out-of-class redeclaration.
gcc/testsuite/ChangeLog:
PR c++/99901
* g++.dg/cpp1z/inline-var9.C: New test.
Add [[nodiscard]] to functions that are effectively just a static_cast,
as per P2351. Also add it to std::addressof.
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward, move, move_if_noexcept)
(addressof): Add _GLIBCXX_NODISCARD.
* include/bits/ranges_cmp.h (identity::operator()): Add
nodiscard attribute.
* include/c_global/cstddef (to_integer): Likewise.
* include/std/bit (bit_cast): Likewise.
* include/std/utility (as_const, to_underlying): Likewise.
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward): Change static_assert message
to be unambiguous about what must be true.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error.
* testsuite/20_util/forward/f_neg.cc: Likewise.
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h: Use markdown for code font.
* include/bits/basic_string.h: Fix @param names.
* include/bits/max_size_type.h: Remove period after @file.
* include/bits/regex.h: Fix duplicate @retval names, and rename.
* include/ext/pb_ds/detail/priority_queue_base_dispatch.hpp: Add
group open to match existing group close.
* include/ext/pb_ds/priority_queue.hpp: Add blank line before group
open.
The PR is about incorrect use of partial_subreg_p for unordered modes.
I found 2 places of dangerous comparing unordered modes in LRA. The
patch removes dangerous use of paradoxical_subreg_p and
partial_subreg_p in split_reg and process_bb_lives. The both places
used them to solve PR77761 long time ago. But the problem was also
fixed by later patches too (if there is no hard reg explicitly, it
have VOIDmode and we use natural mode to split hard reg live,
otherwise we use the biggest explicitly used mode for hard reg
splitting). The PR also says about inaccurate update of reg notes in
LRA. It happens for reg notes which refer for multi-registers. The
patch also fixes this issue.
gcc/ChangeLog:
PR target/99781
* lra-constraints.c (split_reg): Don't check paradoxical_subreg_p.
* lra-lives.c (clear_sparseset_regnos, regnos_in_sparseset_p): New
functions.
(process_bb_lives): Don't update biggest mode of hard reg for
implicit in multi-register group. Use the new functions for
updating dead_set and unused_set by register notes.
gcc/testsuite/ChangeLog:
PR target/99781
* g++.target/aarch64/sve/pr99781.C: New.
Simply memcpy and memset inline strategies to avoid branches for
Skylake family CPUs:
1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
load and store for up to 16 * 16 (256) bytes when the data size is
fixed and known.
2. Inline only if data size is known to be <= 256.
a. Use "rep movsb/stosb" with simple code sequence if the data size
is a constant.
b. Use loop if data size is not a constant.
3. Use memcpy/memset libray function if data size is unknown or > 256.
On Cascadelake processor with -march=native -Ofast -flto,
1. Performance impacts of SPEC CPU 2017 rate are:
500.perlbench_r 0.17%
502.gcc_r -0.36%
505.mcf_r 0.00%
520.omnetpp_r 0.08%
523.xalancbmk_r -0.62%
525.x264_r 1.04%
531.deepsjeng_r 0.11%
541.leela_r -1.09%
548.exchange2_r -0.25%
557.xz_r 0.17%
Geomean -0.08%
503.bwaves_r 0.00%
507.cactuBSSN_r 0.69%
508.namd_r -0.07%
510.parest_r 1.12%
511.povray_r 1.82%
519.lbm_r 0.00%
521.wrf_r -1.32%
526.blender_r -0.47%
527.cam4_r 0.23%
538.imagick_r -1.72%
544.nab_r -0.56%
549.fotonik3d_r 0.12%
554.roms_r 0.43%
Geomean 0.02%
2. Significant impacts on eembc benchmarks are:
eembc/idctrn01 9.23%
eembc/nnet_test 29.26%
gcc/
* config/i386/x86-tune-costs.h (skylake_memcpy): Updated.
(skylake_memset): Likewise.
(skylake_cost): Change CLEAR_RATIO to 17.
* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
Replace m_CANNONLAKE, m_ICELAKE_CLIENT, m_ICELAKE_SERVER,
m_TIGERLAKE and m_SAPPHIRERAPIDS with m_SKYLAKE and m_CORE_AVX512.
gcc/testsuite/
* gcc.target/i386/memcpy-strategy-9.c: New test.
* gcc.target/i386/memcpy-strategy-10.c: Likewise.
* gcc.target/i386/memcpy-strategy-11.c: Likewise.
* gcc.target/i386/memset-strategy-7.c: Likewise.
* gcc.target/i386/memset-strategy-8.c: Likewise.
* gcc.target/i386/memset-strategy-9.c: Likewise.
This adds a relevancy check before trying to set the vector def of
a backedge in an unvectorized PHI.
2021-04-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/99880
* tree-vect-loop.c (maybe_set_vectorized_backedge_value): Only
set vectorized defs of relevant PHIs.
* gcc.dg/torture/pr99880.c: New testcase.
The va_arg scans are just too brittle. Let's not be that picky. We
have other tested builtins that are less brittle now anyway.
gcc/testsuite/
* g++.dg/modules/builtin-3_a.C: Remove dump scans.
* g++.dg/modules/builtin-3_b.C: Remove dump scans.