gcc/ada/
* exp_prag.adb (Get_Launch_Kernel_Arg_Type): Renamed to
Get_Nth_Arg_Type and made more generic.
(Build_Dim3_Declaration): Now builds a CUDA.Internal.Dim3
instead of a CUDA.Vector_Types.Dim3.
(Build_Shared_Memory_Declaration): Now infers needed type from
Launch_Kernel instead of using a hard-coded type.
(Expand_Pragma_CUDA_Execute): Build additional temporaries to
store Grids and Blocks.
* rtsfind.ads: Move Launch_Kernel from public to internal
package.
gcc/ada/
* sem_ch4.adb (Complete_Object_Operation): Only mark entities
referenced if we are compiling the extended main unit.
* sem_attr.adb (Analyze_Attribute [Attribute_Tag]): Record a
reference on the type and its scope.
gcc/ada/
* freeze.adb (Is_Uninitialized_Aggregate): Recognize an array
aggregate with box initialization, scalar components, and no
component default values.
(Freeze_Entity, Check_Address_Clause): Call it, and simplify
freeze code for entity by removing useless assignment.
gcc/ada/
* sem_ch3.adb (Check_Abstract_Overriding): Subprogram renamings
cannot be overridden.
(Derive_Subprogram): Enable setting attribute
Requires_Overriding on functions with controlling access results
of record extensions with a null extension part require
overriding (AI95-00391/06).
gcc/ada/
* sem_aggr.adb (Resolve_Delta_Array_Aggregate): Push scope of
the implicit loop before entering name of the index parameter,
not after; enter name no matter if the identifier has been
decorated before.
gcc/ada/
* sem_ch4.adb (Analyze_Call): In the case where the call is not
overloaded, check for a call to an abstract nondispatching
operation and flag an error.
gcc/ada/
* libgnat/a-nbnbin.adb (From_String): Take advantage of
Long_Long_Long_Integer.
* libgnat/s-genbig.ads, libgnat/s-genbig.adb (To_Bignum): New
function taking a Long_Long_Long_Integer.
gcc/ada/
* sem_util.adb (Accessibility_Call_Helper): In the selected
component case, test if a prefix is a function call and whether
the subprogram call is not being used in its entirety and use
the Innermost_Master_Scope_Depth in that case.
(Innermost_Master_Scope_Depth): Test against the node_par
instead of its identifier to avoid misattributing unnamed blocks
as not being from source.
(Function_Call_Level): Add calculation for whether a subprogram
call is initializing an object in its entirety.
(Subprogram_Call_Level): Renamed to Function_Call_Level.
gcc/ada/
* sem_prag.adb (Analyze_External_Property_In_Decl_Part): Set the
output parameter Expr_Val to the (implicit) pragma argument even
when returning early.
Jeff has noticed and I've confirmed that config/i386/i386.h header which is
installed on x86 in plugin/include/ directory newly in GCC 11 has
which breaks all plugins that include tm.h etc. because that header is not
shipped.
The following patch seems to fix that. Unfortunately it isn't just a matter
of TM_H += t-i386 change, because the header has full path and therefore
needs to be installed in its full path.
Additionally, I've noticed that the b-header-vars generation is completely
broken, it will just throw many of the dependencies away, because it
incorrectly removed everything from first ... remaining till the last /,
while what it clearly wants to do is remove each ... till last / in the same
header path (i.e. instead of .* should have used [^ ]* and g modifier).
I've also noticed that some other headers mentioned in #include of other
headers aren't included (gomp-constants.h as dependency of omp-general.h
and various dependencies of expr.h (where omp-general.h and expr.h were
previously installed)).
2020-10-23 Jakub Jelinek <jakub@redhat.com>
* Makefile.in (PLUGIN_HEADERS): Add gomp-constants.h and $(EXPR_H).
(s-header-vars): Accept not just spaces but also tabs between *_H name
and =. Handle common/config/ headers similarly to config. Don't
throw away everything from first ... to last / on the remaining
string, instead skip just ... to corresponding last / without
intervening spaces and tabs.
(install-plugin): Treat common/config headers like config headers.
* config/i386/t-i386 (TM_H): Add
$(srcdir)/common/config/i386/i386-cpuinfo.h.
As mentioned in the PR, since 2005 we reject if array elements are smaller
than their alignment (i.e. overaligned elements), because such arrays don't
make much sense, only their first element is guaranteed to be aligned as
user requested, but the next element can't be.
The following testcases show something we've been silent about but is
equally bad, the 2005 case is just the most common special case of that
the array element size is not divisible by the alignment. In those arrays
too only the first element is guaranteed to be properly aligned and the
second one can't be.
This patch rejects those cases too, but keeps the existing wording for the
old common case.
Unfortunately, the patch breaks bootstrap, because libbid uses this mess
(forms arrays with 24 byte long elements with 16 byte element alignment).
I don't really see justification for that, so I've decreased the alignment
to 8 bytes instead.
2020-10-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/97164
gcc/
* stor-layout.c (layout_type): Also reject arrays where element size
is constant, but not a multiple of element alignment.
gcc/testsuite/
* c-c++-common/pr97164.c: New test.
* gcc.c-torture/execute/pr36093.c: Move ...
* gcc.dg/pr36093.c: ... here. Add dg-do compile and dg-error
directives.
* gcc.c-torture/execute/pr43783.c: Move ...
* gcc.dg/pr43783.c: ... here. Add dg-do compile, dg-options and
dg-error directives.
libgcc/config/libbid/
* bid_functions.h (UINT192): Decrease alignment to 8 bytes.
This fixes the following Ada failure on 64-bit PowerPC:
-FAIL: gnat.dg/unroll4.adb scan-rtl-dump-times loop2_unroll "optimized:
loop unrolled 7 times" 2
The IVOPTS pass detects a doloop pattern and consequently discombobulates
the loop sufficiently as to make it hard for the RTL unrolling pass to
compute the best number of iterations.
gcc/ChangeLog:
* tree-ssa-loop-ivopts.c (analyze_and_mark_doloop_use): Bail out if
the loop is subject to a pragma Unroll with no specific count.
This patch enables MVE vsub instructions for auto-vectorization.
The sub<mode>3 in vec-common.md is modified to use new mode macros
to include MVE extension for vectorization. MVE vsub insns in mve.md are
modified to use 'minus' instead of unspec expression to support
sub<mode>3. Use VDQ instead fo VALL to cover all supported modes. The
redundant sub<mode>3 insns in neon.md are then removed.
gcc/ChangeLog:
2020-10-23 Dennis Zhang <dennis.zhang@arm.com>
* config/arm/mve.md (mve_vsubq<mode>): New entry for vsub instruction
using expression 'minus'.
(mve_vsubq_f<mode>): Use minus instead of VSUBQ_F unspec.
* config/arm/neon.md (sub<mode>3, sub<mode>3_fp16): Removed.
(neon_vsub<mode>): Use gen_sub<mode>3 instead of gen_sub<mode>3_fp16.
* config/arm/vec-common.md (sub<mode>3): Use the new mode macros
ARM_HAVE_<MODE>_ARITH. Use iterator VDQ instead of VALL.
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/mve-vsub_1.c: New test.
gcc/ChangeLog:
PR lto/97524
* lto-wrapper.c (make_exists): New function.
(run_gcc): Use it to check that make is present and working
for parallel execution.
Remove one redundant LOOP_VINFO_FULLY_MASKED_P condition check
which will be checked in vect_use_loop_mask_for_alignment_p.
gcc/ChangeLog:
* tree-vect-loop.c (vect_transform_loop): Remove the redundant
LOOP_VINFO_FULLY_MASKED_P check.
The recent changes to reduce includes in <memory_resource> went a bit
too far, and it's possible for std::forward_as_tuple to not be defined
when used.
While doing this, I noticed the problematic calls to forward_as_tuple
were not qualified, so performed unwanted ADL.
libstdc++-v3/ChangeLog:
* include/experimental/memory_resource: Include <tuple>.
(polymorphic_allocator::construct): Qualify forward_as_tuple.
* include/std/memory_resource: Likewise.
FAIL: gcc.target/powerpc/vec-splati-runnable.c 1 blank line(s) in output
FAIL: gcc.target/powerpc/vec-splati-runnable.c (test for excess errors)
Excess errors:
rs6000_emit_xxspltidp_v2df called ...
and running the test fails. As the comment says
/* Although the instruction says the results are not defined, it does seem
to work, at least on Mambo. But no guarentees! */
So the simulator works but not real hardware.
gcc/
* config/rs6000/rs6000.c (rs6000_emit_xxspltidp_v2df): Delete
debug printf. Remove trailing ".\n" from inform message.
Break long line.
gcc/testsuite/
* gcc.target/powerpc/vec-splati-runnable.c: Don't abort on
undefined output.
This test fails in C++20 mode because std::is_clock is false for the
test clock, because it doesn't define a duration member.
libstdc++-v3/ChangeLog:
* testsuite/30_threads/condition_variable/members/68519.cc:
Define recent_epoch_float_clock::duration to meet the Cpp17Clock
requirements.
Updated to only use range_compatible_p in range assert sanity checks,
not for actual type cmpatibility.
* gimple-range-gori.cc (is_gimple_logical_p): Use types_compatible_p
for logical compatibility.
(logical_stmt_cache::cacheable_p): Ditto.
The <condition_variable> header is not small, so <shared_mutex> should
not include it unless it actually needs std::condition_variable, which
is only the case when we don't have pthread_rwlock_t and the POSIX
Timers option.
The <shared_mutex> header would be even smaller if we had a header for
std::condition_variable (separate from std::condition_variable_any).
That's already planned for a future change.
And <memory_resource> would be even smaller if it was possible to get
std::shared_mutex without std::shared_timed_mutex (which depends on
<chrono>). For that to be effective, the synchronized_pool_resource
would have to create its own simpler version of std::shared_lock without
the timed waiting functions. I have no plans to do that.
libstdc++-v3/ChangeLog:
* include/std/shared_mutex: Only include <condition_variable>
when pthread_rwlock_t and POSIX timers are not available.
(__cpp_lib_shared_mutex, __cpp_lib_shared_timed_mutex): Change
value to be type 'long'.
* include/std/version (__cpp_lib_shared_mutex)
(__cpp_lib_shared_timed_mutex): Likewise.
By moving std::make_obj_using_allocator and the related "utility
functions for uses-allocator construction" to a new header, we can avoid
including the whole of <memory> in <scoped_allocator> and
<memory_resource>.
In order to simplify the implementation of those utility functions they
now use concepts unconditionally. They are no longer defined if
__cpp_concepts is not defined. To simplify the code that uses those
functions I've introduced a __cpp_lib_make_obj_using_allocator feature
test macro (not specified in the standard, which might be an oversight).
That allows the code in <memory_resource> and <scoped_allocator> to
check the feature test macro to decide whether to use the new utilities,
or fall back to the C++17 code.
At the same time, this reshuffles some of the headers included by
<memory> so that they are (mostly?) self-contained. It should no longer
be necessary to include other headers before <bits/shared_ptr.h> when
other parts of the library want to use std::shared_ptr without including
the whole of <memory>.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/shared_ptr.h: Include <iosfwd>.
* include/bits/shared_ptr_base.h: Include required headers here
directly, instead of in <memory>.
* include/bits/uses_allocator_args.h: New file. Move utility
functions for uses-allocator construction from <memory> to here.
Only define the utility functions when concepts are available.
(__cpp_lib_make_obj_using_allocator): Define non-standard
feature test macro.
* include/std/condition_variable: Remove unused headers.
* include/std/future: Likewise.
* include/std/memory: Remove headers that are not needed
directly, and are now inclkuded where they're needed. Include
new <bits/uses_allocator_args.h> header.
* include/std/memory_resource: Include only the necessary
headers. Use new feature test macro to detect support for the
utility functions.
* include/std/scoped_allocator: Likewise.
* include/std/version (__cpp_lib_make_obj_using_allocator):
Define.
When libstdc++ is enabled, the current high level configuration
bits should apply the same to all versions of VxWorks. Adjust the
config triplets matching rules accordingly.
2010-10-21 Olivier Hainque <hainque@adacore.com>
libstdc++-v3/
* crossconfig.m4: Turn vxworks matcher into vxworks*.
* configure.host: Likewise.
* configure: Regenerate.
this patch removes the pass to materialize all clones and instead this
is now done on demand. The motivation is to reduce lifetime of function
bodies in ltrans that should noticeably reduce memory use for highly
parallel compilations of large programs (like Martin does) or with
partitioning reduced/disabled. For cc1 with one partition the memory use
seems to go down from 4gb to cca 1.5gb (seeing from top, so this is not
particularly accurate).
gcc/ChangeLog:
2020-10-22 Jan Hubicka <hubicka@ucw.cz>
* cgraph.c (cgraph_node::get_untransformed_body): Perform lazy
clone materialization.
* cgraph.h (cgraph_node::materialize_clone): Declare.
(symbol_table::materialize_all_clones): Remove.
* cgraphclones.c (cgraph_materialize_clone): Turn to ...
(cgraph_node::materialize_clone): .. this one; move here
dumping from symbol_table::materialize_all_clones.
(symbol_table::materialize_all_clones): Remove.
* cgraphunit.c (mark_functions_to_output): Clear stmt references.
(cgraph_node::expand): Initialize bitmaps early;
do not call execute_all_ipa_transforms if there are no transforms.
* ipa-inline-transform.c (save_inline_function_body): Fix formating.
(inline_transform): Materialize all clones before function is modified.
* ipa-param-manipulation.c (ipa_param_adjustments::modify_call):
Materialize clone if needed.
* ipa.c (class pass_materialize_all_clones): Remove.
(make_pass_materialize_all_clones): Remove.
* passes.c (execute_all_ipa_transforms): Materialize all clones.
* passes.def: Remove pass_materialize_all_clones.
* tree-pass.h (make_pass_materialize_all_clones): Remove.
* tree-ssa-structalias.c (ipa_pta_execute): Clear refs.
Hi,
This adds support for the VSX load/store rightmost element operations.
This includes the instructions lxvrbx, lxvrhx, lxvrwx, lxvrdx,
stxvrbx, stxvrhx, stxvrwx, stxvrdx; And the builtins
vec_xl_sext() /* vector load sign extend */
vec_xl_zext() /* vector load zero extend */
vec_xst_trunc() /* vector store truncate */.
Testcase results show that the instructions added with this patch show
up at low/no optimization (-O0), with a number of those being replaced
with other load and store instructions at higher optimization levels.
For consistency I've left the tests at -O0.
[v2] Refreshed per review comments. Comments cleaned up, indentation
corrected.
gcc/ChangeLog:
* config/rs6000/altivec.h (vec_xl_zext, vec_xl_sext, vec_xst_trunc):
New defines.
* config/rs6000/rs6000-builtin.def (BU_P10V_OVERLOAD_X): New builtin
macro.
(BU_P10V_AV_X): New builtin macro.
(se_lxvrhbx, se_lxrbhx, se_lxvrwx, se_lxvrdx): Define internal names
for load and sign extend vector element.
(ze_lxvrbx, ze_lxvrhx, ze_lxvrwx, ze_lxvrdx): Define internal names
for load and zero extend vector element.
(tr_stxvrbx, tr_stxvrhx, tr_stxvrwx, tr_stxvrdx): Define internal names
for truncate and store vector element.
(se_lxvrx, ze_lxvrx, tr_stxvrx): Define internal names for overloaded
load/store rightmost element.
* config/rs6000/rs6000-call.c (altivec_builtin_types): Define the
internal monomorphs P10_BUILTIN_SE_LXVRBX, P10_BUILTIN_SE_LXVRHX,
P10_BUILTIN_SE_LXVRWX, P10_BUILTIN_SE_LXVRDX,
P10_BUILTIN_ZE_LXVRBX, P10_BUILTIN_ZE_LXVRHX, P10_BUILTIN_ZE_LXVRWX,
P10_BUILTIN_ZE_LXVRDX,
P10_BUILTIN_TR_STXVRBX, P10_BUILTIN_TR_STXVRHX, P10_BUILTIN_TR_STXVRWX,
P10_BUILTIN_TR_STXVRDX,
(altivec_expand_lxvr_builtin): New expansion for load element builtins.
(altivec_expand_stv_builtin): Update to for truncate and store builtins.
(altivec_expand_builtin): Add clases for load/store rightmost builtins.
(altivec_init_builtins): Add def_builtin entries for
__builtin_altivec_se_lxvrbx, __builtin_altivec_se_lxvrhx,
__builtin_altivec_se_lxvrwx, __builtin_altivec_se_lxvrdx,
__builtin_altivec_ze_lxvrbx, __builtin_altivec_ze_lxvrhx,
__builtin_altivec_ze_lxvrwx, __builtin_altivec_ze_lxvrdx,
__builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx,
__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx,
__builtin_vec_se_lxvrx, __builtin_vec_ze_lxvrx, __builtin_vec_tr_stxvrx.
* config/rs6000/vsx.md (vsx_lxvr<wd>x, vsx_stxvr<wd>x, vsx_stxvr<wd>x):
New define_insn entries.
* doc/extend.texi: Add documentation for vsx_xl_sext, vsx_xl_zext,
and vec_xst_trunc.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-load-element-extend-char.c: New test.
* gcc.target/powerpc/vsx-load-element-extend-int.c: New test.
* gcc.target/powerpc/vsx-load-element-extend-longlong.c: New test.
* gcc.target/powerpc/vsx-load-element-extend-short.c: New test.
* gcc.target/powerpc/vsx-store-element-truncate-char.c: New test.
* gcc.target/powerpc/vsx-store-element-truncate-int.c: New test.
* gcc.target/powerpc/vsx-store-element-truncate-longlong.c: New test.
* gcc.target/powerpc/vsx-store-element-truncate-short.c: New test.
Hi
This is a sub-set of the 128-bit sign extension support patch series
that will be fully implemented in a subsequent patch from Carl.
This is a necessary pre-requisite for the vector-load/store rightmost
element patch that follows in this thread.
[v2] Refreshed and touched up per review comments.
- updated set_attr entries. removed superfluous set_attr entries.
- moved define_insn and define_expand entries to vsx.md.
gcc/ChangeLog:
* config/rs6000/vsx.md (enum unspec): Add
UNSPEC_EXTENDDITI2 and UNSPEC_MTVSRD_DITI_W1 entries.
(mtvsrdd_diti_w1, extendditi2_vector): New define_insns.
(extendditi2): New define_expand.
gcc/ada/
* einfo.ads (Has_Limited_View): New synthesized attribute.
* einfo.adb (Has_Limited_View): New synthesized attribute.
(Set_Limited_View): Complete assertion.
* sem_ch10.ads (Is_Visible_Through_Renamings): Make this routine
public to invoke it from Find_Expanded_Name and avoid reporting
spurious errors on renamings of limited-with packages.
(Load_Needed_Body): Moved to have this spec alphabetically
ordered.
* sem_ch10.adb (Is_Visible_Through_Renamings): Moved to library
level.
(Is_Limited_Withed_Unit): New subprogram.
* sem_ch3.adb (Access_Type_Declaration): Adding protection to
avoid reading attribute Entity() when not available.
* sem_ch8.adb (Analyze_Package_Renaming): Report error on
renamed package not visible through context clauses.
(Find_Expanded_Name): Report error on renamed package not
visible through context clauses; handle special case where the
prefix is a renaming of a (now visible) shadow package.
gcc/ada/
* aspects.ads: Introduce the subtype Nonoverridable_Aspect_Id,
whose Static_Predicate reflects the list of nonoverridable
aspects given in Ada RM 13.1.1(18.7).
* sem_util.ads, sem_util.adb: Add two new visible subprograms,
Check_Inherited_Nonoverridable_Aspects and Is_Confirming. The
former is used to check the consistency of inherited
nonoverridable aspects from multiple sources. The latter
indicates whether two aspect specifications for a nonoverridable
aspect are confirming. Because of compatibility concerns in
compiling QGen, Is_Confirming always returns True if
Relaxed_RM_Semantics (i.e., -gnatd.M) is specified.
* sem_ch3.adb (Derived_Type_Declaration): Call new
Check_Inherited_Nonoverridable_Aspects procedure if interface
list is non-empty.
* sem_ch9.adb (Check_Interfaces): Call new
Check_Inherited_Nonoverridable_Aspects procedure if interface
list is non-empty.
* sem_ch13.adb (Analyze_Aspect_Specifications): When an explicit
aspect specification overrides an inherited nonoverridable
aspect, check that the explicit specification is confirming.
gcc/ada/
* sem_util.adb (Is_Container_Aggregate): A new local predicates
which indicates whether a given expression is a container
aggregate. The implementation of this function is incomplete; in
the unusual case of a record aggregate (i.e., not a container
aggregate) of a type whose Aggregate aspect is specified, the
function will incorrectly return True.
(Immediate_Context_Implies_Is_Potentially_Unevaluated): Improve
handling of aggregate components.
(Is_Repeatedly_Evaluated): Test for container aggregate
components along with existing test for array aggregate
components.