This avoids a useless walk of the prefix chain in instances.
gcc/ada/
* sem_ch8.adb (Analyze_Subprogram_Renaming): Move final test on
In_Instance to outer condition.
The QNX system specs for ARM and AARCH64 are identical. It makes more
sense to have it named for the base architecture.
gcc/ada/
* Makefile.rtl: Rename system-qnx-aarch64.ads to
system-qnx-arm.ads.
(AARCH64 QNX section): Modify to handle both arm and arch64.
* tracebak.c (__QNX__): Add new __ARMEL__ section.
* sigtramp-arm-qnx.c: New file.
* libgnat/system-qnx-aarch64.ads: Renamed to ...
* libgnat/system-qnx-arm.ads: this.
gcc/ChangeLog:
PR target/104610
* config/i386/i386-expand.cc (ix86_expand_branch): Use ptest
for QImode when code is EQ or NE.
* config/i386/i386.md (cbranchoi4): New expander.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr104610.c: New test.
When optimizing the DGEMM kernel in OpenBLAS to use MMA, the MMA code
uses all 8 accumulators, which overlap all vs0-vs31 vector registers.
Current trunk assigns one of the normal vector inputs to one of the MMA
instructions, which forces us to spill one of the accumulators to memory,
leading to poor performance. The solution here is to replace the "wa"
constraints for the vector input operands in the MMA instruction patterns
with "v,?wa" so that we prefer using the altivec registers vs32-vs63
over the vs0-vs31 registers.
2022-05-17 Peter Bergner <bergner@linux.ibm.com>
Segher Boessenkool <segher@kernel.crashing.org>
gcc/
PR target/105556
* config/rs6000/mma.md (mma_<vv>, mma_<avv>, mma_<pv>, mma_<apv>,
mma_<vvi4i4i8>, mma_<avvi4i4i8>, mma_<vvi4i4i2>, mma_<avvi4i4i2>,
mma_<vvi4i4>, mma_<avvi4i4>, mma_<pvi4i2>, mma_<apvi4i2>,
mma_<vvi4i4i4>, mma_<avvi4i4i4>): Replace "wa" constraints with "v,?wa".
Update other operands accordingly.
The problem here is that first check_initializer calls
build_aggr_init_full_exprs, which does overload resolution, but then in the
case of failed constexpr throws away the result and does it again in
build_functional_cast. But in the first overload resolution,
reshape_init_array_1 decided to reuse the inner CONSTRUCTORs because
tf_error is set, so we know we're committed. But the second pass gets
confused by the CONSTRUCTORs with non-init-list types.
Fixed by avoiding a second pass: instead, pass the call from build_aggr_init
to build_cplus_new, which will turn it into a TARGET_EXPR. I don't bother
to change the object argument because it will be replaced later in
simplify_aggr_init_expr.
PR c++/102307
gcc/cp/ChangeLog:
* decl.cc (check_initializer): Use build_cplus_new in case of
constexpr failure.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-array2.C: New test.
The C and C++ FEs differ in TYPE_VALUES for an enum type: an entry in
the list in the C++ FE has a CONST_DECL in the TREE_VALUE, but the C FE
has only the numerical value of the CONST_DECL there. This has caused
me some trouble in my PR105497 patch. Using a CONST_DECL is preferable
because a CONST_DECL can track more information (e.g., attributes), and
you can always get the value simply by looking at its DECL_INITIAL.
This turned out to be a trivial change. One place in godump.cc had to be
adjusted. I'm not changing the CONST_DECL check in c_do_switch_warnings
because I'll be changing it soon in my next patch. I didn't see any other
checks that this patch makes redundant.
gcc/c/ChangeLog:
* c-decl.cc (finish_enum): Store the CONST_DECL into TREE_VALUE, not
its value.
gcc/ChangeLog:
* godump.cc (go_output_typedef): Use the DECL_INITIAL of the TREE_VALUE.
For ABI_V4, we do not split complex args. This created a problem because
even though an arg would be passed in two VSX regs, we were only advancing the
function arg counter by one VSX register. Fixed with this patch.
PR target/99685
gcc/
* config/rs6000/rs6000-call.cc (rs6000_function_arg_advance_1): Bump
register count when not splitting IEEE 128-bit Complex.
gcc/ChangeLog:
* omp-low.cc (check_omp_nesting_restrictions): Skip warning for
target inside target if inner is reverse offload.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-device-ancestor-5.c: New test.
Currently pmr::set_default_resource and pmr::get_default_resource both
use sequentially consistent memory ordering. This is overkill. The
standard only requires that a call to set_default_resource synchronizes
with subsequent calls to set_default_resource and get_default_resource.
Using acquire-release for the setter and acquire for the getter is
sufficient to meet the requirement.
Reviewed-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:
* src/c++17/memory_resource.cc (set_default_resource): Use
memory_order_acq_rel.
(get_default_resource): Use memory_order_acquire.
Add attributes to the accessors for the global memory resource objects,
to allow the compiler to eliminate redundant calls to them. For example,
multiple calls to std::pmr::new_delete_resource() will always return the
same object, and so the compiler can replace them with a single call.
Ideally we would like adjacent calls to std::pmr::get_default_resource()
to be combined into a single call by the CSE pass. The 'pure' attribute
would permit that. However, the standard requires that calls to
std::pmr::set_default_resource() synchronize with subsequent calls to
std::pmr::get_default_resource(). With 'pure' the DCE pass might
eliminate seemingly redundant calls to std::pmr::get_default_resource().
That might be unsafe, because the caller might be relying on the
associated synchronization. We could use a hypothetical attribute that
allows CSE but not DCE, but we don't have one. So it can't be 'pure'.
Also add [[nodiscard]] to equality operators.
libstdc++-v3/ChangeLog:
* include/std/memory_resource (new_delete_resource): Add
nodiscard, returns_nonnull and const attributes.
(null_memory_resource): Likewise.
(set_default_resource, get_default_resource): Add returns_nonnull
attribute.
(memory_resource::is_equal): Add nodiscard attribute.
(operator==, operator!=): Likewise.
Add the const attribute to std::future_category() and
std::iostream_category(), to match the existing attributes on
std::generic_category() and std::system_category().
Also add [[nodiscard]] to those functions and to the comparison
operators for std::error_code and std::error_condition, and to
std::make_error_code and std::make_error_condition overloads.
libstdc++-v3/ChangeLog:
* include/bits/ios_base.h (io_category): Add const and nodiscard
attributes.
(make_error_code, make_error_condition): Add nodiscard.
* include/std/future (future_category): Add const and nodiscard.
(make_error_code, make_error_condition): Add nodiscard.
* include/std/system_error (generic_category system_category):
Add nodiscard. Replace _GLIBCXX_CONST with C++11 attribute.
(error_code::value, error_code::category, error_code::operator bool)
(error_condition::value, error_condition::category)
(error_condition::operator bool, make_error_code)
(make_error_condition, operator==, operator!=, operator<=>): Add
nodiscard.
Revert commit r13-472-gca32b29ec3e92dcf8dda5c2501d0baf9dd1cb09d partially;
namely for {gcn,nvptx}/mkoffload.cc, only.
The patch changed 'sizeof(...)/sizeof(...[0])' to the 'ARRAY_SIZE' macro,
which is in principle a good idea – except that in the two mkoffload.cc,
the change happened inside a string that is used to generate plain C code.
With offlading to nvptx or gcn, the mkoffload genenates then the C file
and compilation of the latter fails with
"warning: implicit declaration of function 'ARRAY_SIZE'" followed by
"error: initializer element is not constant"
gcc/
* config/gcn/mkoffload.cc (process_obj): Revert: Use ARRAY_SIZE.
* config/nvptx/mkoffload.cc (process): Likewise.
C++ Structured bindings have a mangling that has yet to be formally
documented. However, it's been around for a while and shows up for
module support.
include/
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_STRUCTURED_BINDING.
libiberty/
* cp-demangle.c (d_make_comp): Adjust.
(d_unqualified_name): Add 'DC' support.
(d_count_template_scopes): Adjust.
(d_print_comp_inner): Add structured binding.
* testsuite/demangle-expected: Add testcases.
When -fpatchable-function-entry= is enabled, certain C++ codes fails to
link because of generated references to discarded sections in
__patchable_function_entry section. This commit fixes this problem by
puting those references in a COMDAT section.
2022-05-06 Giuliano Belinassi <gbelinassi@suse.de>
gcc/ChangeLog
PR c++/105169
* targhooks.cc (default_print_patchable_function_entry_1): Handle COMDAT case.
* varasm.cc (switch_to_comdat_section): New
(handle_vtv_comdat_section): Call switch_to_comdat_section.
* varasm.h: Declare switch_to_comdat_section.
gcc/testsuite/ChangeLog
2022-05-06 Giuliano Belinassi <gbelinassi@suse.de>
PR c++/105169
* g++.dg/modules/pr105169.h: New file.
* g++.dg/modules/pr105169_a.C: New test.
* g++.dg/modules/pr105169_b.C: New file.
duplicate_loop_body_to_header_edge clears bb->aux which is not wanted
by a new use in loop unswitching. The clearing was introduced with
r0-69110-g6580ee7781f903 and it seems accidentially so.
2022-05-17 Richard Biener <rguenther@suse.de>
* cfgloopmanip.cc (duplicate_loop_body_to_header_edge): Do
not clear bb->aux of the copied blocks.
When registering a relation, we need to merge with any existing relation
before checking if it was an equivalence... otherwise it was not being
handled properly.
gcc/
PR tree-optimization/105458
* value-relation.cc (path_oracle::register_relation): Merge, then check
for equivalence.
gcc/testsuite/
* gcc.dg/pr105458.c: New.
OpenMP 5.2 added
"When called from within a target region the effect is unspecified."
restriction to omp_display_env, so it is ok not to support it in
target regions (worst case we could add an empty implementation
or one with __builtin_trap in there).
2022-05-17 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi (OpenMP 5.1): Remove "Not inside target regions"
comment for omp_display_env feature.
gcc/
* diagnostic.cc: Don't advise to call 'abort' instead of
'internal_error'.
* system.h: Advise to call 'internal_error' instead of 'abort' or
'fancy_abort'.
Suggested-by: Richard Biener <richard.guenther@gmail.com>
gcc/ChangeLog:
* graphite-sese-to-poly.cc (build_poly_sr_1): Fix a typo and
a reference to a variable which does not exist.
* graphite-isl-ast-to-gimple.cc (gsi_insert_earliest): Fix typo
in comment.
The SSA names for which this function gets used are always SCoP
parameters and hence "isl_id_for_parameter" is a better name. It also
explains the prefix "P_" for those names in the ISL representation.
gcc/ChangeLog:
* graphite-sese-to-poly.cc (isl_id_for_ssa_name): Rename to ...
(isl_id_for_parameter): ... this new function name.
(build_scop_context): Adjust function use.
This patch adds support for inoutset depend-kind in depend
clauses. It is very similar to the in depend-kind in that
a task with a dependency with that depend-kind is dependent
on all previously created sibling tasks with matching address
unless they have the same depend-kind.
In the in depend-kind case everything is dependent except
for in -> in dependency, for inoutset everything is
dependent except for inoutset -> inoutset dependency.
mutexinoutset is also similar (everything is dependent except
for mutexinoutset -> mutexinoutset dependency), but there is
also the additional restriction that only one task with
mutexinoutset for each address can be scheduled at once (i.e.
mutual exclusitivty). For now we support mutexinoutset
the same as inout/out, but the inoutset support is full.
In order not to bump the ABI for dependencies each time
(we've bumped it already once, the old ABI supports only
inout/out and in depend-kind, the new ABI supports
inout/out, mutexinoutset, in and depobj), this patch arranges
for inoutset to be at least for the time being always handled
as if it was specified through depobj even when it is not.
So it uses the new ABI for that and inoutset are represented
like depobj - pointer to a pair of pointers where the first one
will be the actual address of the object mentioned in depend
clause and second pointer will be (void *) GOMP_DEPEND_INOUTSET.
2022-05-17 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_depend_kind): Add
OMP_CLAUSE_DEPEND_INOUTSET.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
* gimplify.cc (gimplify_omp_depend): Likewise.
* omp-low.cc (lower_depend_clauses): Likewise.
gcc/c-family/
* c-omp.cc (c_finish_omp_depobj): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
gcc/c/
* c-parser.cc (c_parser_omp_clause_depend): Parse
inoutset depend-kind.
(c_parser_omp_depobj): Likewise.
gcc/cp/
* parser.cc (cp_parser_omp_clause_depend): Parse
inoutset depend-kind.
(cp_parser_omp_depobj): Likewise.
* cxx-pretty-print.cc (cxx_pretty_printer::statement): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
gcc/testsuite/
* c-c++-common/gomp/all-memory-1.c (boo): Add test with
inoutset depend-kind.
* c-c++-common/gomp/all-memory-2.c (boo): Likewise.
* c-c++-common/gomp/depobj-1.c (f1): Likewise.
(f2): Adjusted expected diagnostics.
* g++.dg/gomp/depobj-1.C (f4): Adjust expected diagnostics.
include/
* gomp-constants.h (GOMP_DEPEND_INOUTSET): Define.
libgomp/
* libgomp.h (struct gomp_task_depend_entry): Change is_in type
from bool to unsigned char.
* task.c (gomp_task_handle_depend): Handle GOMP_DEPEND_INOUTSET.
Ignore dependencies where
task->depend[i].is_in && task->depend[i].is_in == ent->is_in
rather than just task->depend[i].is_in && ent->is_in. Remember
whether GOMP_DEPEND_IN loop is needed and guard the loop with that
conditional.
(gomp_task_maybe_wait_for_dependencies): Handle GOMP_DEPEND_INOUTSET.
Ignore dependencies where elem.is_in && elem.is_in == ent->is_in
rather than just elem.is_in && ent->is_in.
* testsuite/libgomp.c-c++-common/depend-1.c (test): Add task with
inoutset depend-kind.
* testsuite/libgomp.c-c++-common/depend-2.c (test): Likewise.
* testsuite/libgomp.c-c++-common/depend-3.c (test): Likewise.
* testsuite/libgomp.c-c++-common/depend-inoutset-1.c: New test.
Most tests for the contents of header synopses need to be supressed for
the versioned namespace build, because redeclaring the entities in std
fails when they were originally declared in std::__8.
I added these tests recently without the suppression, so they fail.
libstdc++-v3/ChangeLog:
* testsuite/20_util/expected/synopsis.cc: Skip for versioned
namespace.
* testsuite/27_io/headers/iosfwd/synopsis.cc: Likewise.
The src/c++11/compatibility*-c++0x.cc files define symbols that need to
be exported for ancient versions of libstdc++.so.6 due to changes
between C++0x and the final C++11 standard. Those symbols are not needed
in the libstdc++.so.8 library, and we can skip building them entirely.
This also fixes the build failure I introduced last week when making the
versioned namespace config not use the _V2 namespace for compat symbols.
libstdc++-v3/ChangeLog:
* src/Makefile.am [ENABLE_SYMVERS_GNU_NAMESPACE] (cxx11_sources):
Do not build the compatibility*-c++0x.cc objects.
* src/Makefile.in: Regenerate.
* src/c++11/compatibility-c++0x.cc [_GLIBCXX_INLINE_VERSION]:
Refuse to build for the versioned namespace.
* src/c++11/compatibility-chrono.cc: Likewise.
* src/c++11/compatibility-condvar.cc: Likewise.
* src/c++11/compatibility-thread-c++0x.cc: Likewise.
* src/c++11/chrono.cc (system_clock, steady_clock):
Use macros to define in inline namespace _V2, matching the
declarations in <system_error>.
* src/c++11/system_error.cc (system_category, generic_category):
Likewise.
The recent r13-458 change to introduce vec_cmpeqv1tiv1ti and
add TARGET_SSE2 support to vec_cmpeqv2div2di works nicely for
equality comparisons, but as the testcase shows doesn't work
for inequality comparisons.
For EQ if we perform comparison with twice as many half-sized elemenets,
the result should be ~0 when both halves are ~0 only (both halves need
to be equal for the whole to be equal), otherwise 0, so AND is the
correct operation for it.
But for NE, the result should be ~0 when either of the halves is ~0
(if either half is not equal, the whole is not equal) and so the right
operation for NE is IOR, not AND.
2022-05-17 Jakub Jelinek <jakub@redhat.com>
PR target/105613
* config/i386/sse.md (vec_cmpeqv2div2di, vec_cmpeqv1tiv1ti): Use
andv4si3 only for EQ, for NE use iorv4si3 instead.
* gcc.c-torture/execute/pr105613.c: New test.
The PR97330 fix caused some missed sinking of loads out of loops
the following patch re-instantiates.
2022-05-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/105618
* tree-ssa-sink.cc (statement_sink_location): For virtual
PHI uses ignore those defining the used virtual operand.
* gcc.dg/tree-ssa/ssa-sink-19.c: New testcase.
When flagging names of volatile objects occurring in actual parameters
it is safer to guard against identifiers without entity. This is
redundant (because earlier in the resolution of actual parameters we
already guard against actuals with Any_Type), but perhaps such
identifiers will become allowed in constructs like:
Subprogram_Call
(Actual =>
(declare
X : Boolean := ...
with Annotate (GNATprove, ...)));
^^^^^^^^^
which include an identifier that does not denote any entity.
Code cleanup related to handling of volatile components; behaviour is
unaffected.
gcc/ada/
* sem_res.adb (Flag_Effectively_Volatile_Objects): Restore
redundant guard.
The compiler failed to detect an error where the first prefix of an
expanded name given as the renamed subprogram in a subprogram renaming
declaration denotes a unit with the same name as the name given for the
subprogram renaming. Such a unit must be hidden by the renaming itself.
An error check is added to catch this case.
gcc/ada/
* sem_ch8.adb (Analyze_Subprogram_Renaming): Add error check for
the case of a renamed subprogram given by an expanded name whose
outermost prefix names a unit that is hidden by the name of the
renaming.
(Ult_Expanded_Prefix): New local expression function to return
the ultimate prefix of an expanded name.
A previous commit implemented a new kernel registration scheme, using
the binder to generate registration code rather than inserting
registration code in packages. Now that this new approach has had time
to be thoroughly tested, it is time to remove the old approach.
gcc/ada/
* gnat_cuda.ads: Update package-level comments.
(Build_And_Insert_CUDA_Initialization): Remove function.
* gnat_cuda.adb (Build_And_Insert_CUDA_Initialization): Remove
function.
(Expand_CUDA_Package): Remove call to
Build_And_Insert_CUDA_Initialization.
Improve the warning message and silence warning when size > 32, this is
likely intentional and does not warrant a warning.
gcc/ada/
* freeze.adb (Freeze_Enumeration_Type): Fix comment, enhance
message and silence warning for size > 32.
For local subprograms without contracts inside generics, allow their
inlining for proof in GNATprove mode. This requires forbidding the
inlining of subprograms which contain references to object renamings,
which would be replaced in the SPARK expansion and violate assumptions
of the inlining code.
gcc/ada/
* exp_spark.adb (Expand_SPARK_Potential_Renaming): Deal with no
entity case.
* inline.ads (Check_Object_Renaming_In_GNATprove_Mode): New
procedure.
* inline.adb (Check_Object_Renaming_In_GNATprove_Mode): New
procedure.
(Can_Be_Inlined_In_GNATprove_Mode): Remove case forbidding
inlining for subprograms inside generics.
* sem_ch12.adb (Copy_Generic_Node): Preserve global entities
when inlining in GNATprove mode.
* sem_ch6.adb (Analyse_Subprogram_Body_Helper): Remove body to
inline if renaming is detected in GNATprove mode.
When an allocator is for an access type that has a
Designated_Storage_Model aspect, and the designated type is an
unconstrained record type with discriminants, and the subtype associated
with the allocator is constrained, a dereference of the new access value
can be passed to the designated type's initialization procedure. The
post-front-end phase of the compiler needs to be able to create a
temporary object in the host memory space to pass to the init proc,
which requires creating such an object, but the subtype needed for the
allocation isn't readily available at the point of the dereference. To
make the subtype easily accessible, we set the Actual_Designated_Subtype
of such a dereference to the subtype of the allocated object.
gcc/ada/
* exp_ch4.adb (Expand_N_Allocator): For an allocator with an
unconstrained discriminated designated type, and whose
allocation subtype is constrained, set the
Actual_Designated_Subtype of the dereference passed to the init
proc of the designated type to be the allocation subtype.
* sinfo.ads: Add documentation of new setting of
Actual_Designated_Subtype on a dereference used as an actual
parameter of call to an init proc associated with an allocator.
Also add missing syntax and documentation for the GNAT language
extension that allows an expression as a default for a concrete
generic formal function.
This patch cleans up some code that is left over from the front-end SJLJ
exception handling mechanism, which has been removed.
This is in preparation for fixing a finalization-related bug.
Most importantly:
The documentation is changed: a Handled_Sequence_Of_Statements node
CAN contain both Exception_Handlers and an At_End_Proc.
The assertion contradicting that is removed from
Expand_At_End_Handler.
The From_At_End field is removed.
gcc/ada/
* sinfo.ads: Remove From_At_End. Update comments.
* gen_il-fields.ads, gen_il-gen-gen_nodes.adb, sem_ch11.adb:
Remove From_At_End.
* exp_ch11.adb (Expand_At_End_Handler): Remove assertion.
* fe.h (Exception_Mechanism, Exception_Mechanism_Type, Has_DIC,
Has_Invariants, Is_List_Member, List_Containing): Remove
declarations that are not used in gigi.
* opt.ads (Exception_Mechanism): This is not used in gigi.
* exp_util.ads: Minor comment fix.
When the evaluation of the subtype_indication for the
iterator_specification of a quantified_expression leads to the insertion
of a type declaration, this should be done with Insert_Action instead of
Insert_Before.
gcc/ada/
* sem_ch5.adb (Analyze_Iterator_Specification): Use
Insert_Action when possibly inside an expression.
Fix the Forced sign flag that is incorrectly ignored for scientific
notation and shortest representation.
gcc/ada/
* libgnat/g-forstr.adb (Is_Number): Add scientific notation and
shortest representation.
The original node is not guaranteed to also be an
N_Full_Type_Declaration, so the code needs to look into the node itself.
gcc/ada/
* exp_ch3.adb (Expand_N_Full_Type_Declaration): Look into N.
This patch disallows N_Protected_Body from being passed to
Requires_Cleanup_Actions. Protected bodies never need cleanup, and are
never passed to Requires_Cleanup_Actions, which is a good thing, because
it would blow up on Handled_Statement_Sequence, which doesn't exist for
N_Protected_Body.
gcc/ada/
* exp_util.adb (Requires_Cleanup_Actions): Remove
N_Protected_Body from the case statement, so that case will be
covered by "raise Program_Error".
There are several debugging procedures called Output.w, and some
output-redirection features. This patch modifies Output.w so their
output is not redirected; it always goes to standard error. Otherwise,
debugging output can get mixed in with some "real" output (perhaps to a
file), which causes confusion and in some cases failure to build.
gcc/ada/
* output.adb (Pop_Output, Set_Output): Unconditionally flush
output when switching from one output destination to another.
Otherwise buffering can cause garbled output.
(w): Push/pop the current settings, and temporarily
Set_Standard_Error during these procedures.