* module.c (gfc_match_use): Fix name-conflict check for use-associating
the same symbol again in a submodule.
* gfortran.dg/use_rename_10.f90: New.
* gfortran.dg/use_rename_11.f90: New.
2020-04-14 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94270
* interface.c (gfc_get_formal_from_actual_arglist): Always
set artificial attribute for symbols.
* trans-decl.c (generate_local_decl): Do not warn if the
symbol is artificial.
2020-04-14 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94270
* gfortran.dg/warn_unused_dummy_argument_6.f90: New test.
While reviewing [basic.scope.param] I noticed we don't show the location
of the previous declaration when giving an error about "A parameter name
shall not be redeclared in the outermost block of the function definition".
PR c++/94588
* name-lookup.c (check_local_shadow): Add an inform call.
* g++.dg/diagnostic/redeclaration-1.C: Add dg-message.
gcc/ChangeLog:
* doc/extend.texi (-Wall): Mention -Wformat-overflow and
-Wformat-truncation. Move -Wzero-length-bounds last.
(-Wrestrict): Document positive form of option enabled by -Wall.
We are hitting a recursive loop when printing the signature of a function
containing a decltype([]{}) type. The loop is
dump_function_decl -> dump_substitution
-> dump_template_bindings
-> dump_type
-> dump_aggr_type
-> dump_scope -> dump_function_decl
and we loop because dump_template_bindings wants to print the resolved type of
decltype([]{}) (i.e. just a lambda type), so it calls dump_aggr_type, which
wants to print the function scope of the lambda type. But the function scope of
the lambda type is the function which we're in the middle of printing.
This patch breaks the loop by passing TFF_NO_FUNCTION_ARGUMENTS to
dump_function_decl from dump_scope, so that we avoid recursing into
dump_substitution and ultimately looping.
This also means we no longer emit the "[with ...]" clause when printing a
function template scope, and we instead just emit its template argument list in
a more natural way, e.g. instead of
foo(int, char) [with T=bool]::x
we would now print
foo<bool>::x
which seems like an improvement on its own.
The full signature of the function 'spam' in the below testcase is now
void spam(decltype (<lambda>)*) [with T = int; decltype (<lambda>) = spam<int>::<lambda()>]
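The loop and its fix can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in error.c: ToyFunction, NO_FUNCTION_ARGUMENTS (standing in for TFF_NO_FUNCTION_ARGUMENTS), the "L" binding name and the simplified output format are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag bit standing in for TFF_NO_FUNCTION_ARGUMENTS.
constexpr unsigned NO_FUNCTION_ARGUMENTS = 1u << 0;

// Toy model of a function template instantiation; not a GCC tree node.
struct ToyFunction
{
  std::string name;               // e.g. "spam"
  std::string template_arg;       // e.g. "int"
  bool has_local_lambda_binding;  // a binding resolves to a lambda local to this function
};

std::string dump_function (const ToyFunction &fn, unsigned flags);

// Print FN as the enclosing scope of a local entity.
std::string dump_scope (const ToyFunction &fn)
{
  // Passing the flag keeps dump_function away from the "[with ...]" clause.
  return dump_function (fn, NO_FUNCTION_ARGUMENTS) + "::";
}

std::string dump_function (const ToyFunction &fn, unsigned flags)
{
  // Scope form: only the name and template argument list, so no recursion.
  if (flags & NO_FUNCTION_ARGUMENTS)
    return fn.name + "<" + fn.template_arg + ">";

  std::string out = fn.name + "(...) [with T = " + fn.template_arg;
  // Printing the lambda type needs its scope, which is FN itself.
  if (fn.has_local_lambda_binding)
    out += "; L = " + dump_scope (fn) + "<lambda()>";
  return out + "]";
}

// Usage check for the toy model above.
void check_toy_dump ()
{
  const ToyFunction spam{"spam", "int", true};
  assert (dump_scope (spam) == "spam<int>::");
  assert (dump_function (spam, 0)
          == "spam(...) [with T = int; L = spam<int>::<lambda()>]");
}
```

Without the flag, dump_scope would re-enter the full form of dump_function and never terminate for a function whose bindings mention one of its own lambdas.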
gcc/cp/ChangeLog:
PR c++/94521
* error.c (dump_scope): Pass TFF_NO_FUNCTION_ARGUMENTS to
dump_function_decl when printing a function template instantiation as a
scope.
gcc/testsuite/ChangeLog:
PR c++/94521
* g++.dg/cpp2a/lambda-uneval12.C: New test.
In this PR we're incorrectly rejecting a self-modifying constexpr initializer as
a consequence of the fix for PR78572.
However, it looks like the fix for PR78572 is obsoleted by the fix for
PR89336: the testcase from the former PR successfully compiles even with its fix
reverted.
But then further testing showed that the analogous testcase of PR78572 where the
array has an aggregate element type is still problematic (i.e. we ICE) even with
the fix for PR78572 applied. The reason is that in cxx_eval_bare_aggregate we
attach a constructor_elt of aggregate type always to the end of the new
CONSTRUCTOR, but that's not necessarily correct if the CONSTRUCTOR is
self-modifying. We should instead be using get_or_insert_ctor_field to insert
the constructor_elt in the right place.
So this patch reverts the PR78572 fix and makes the appropriate changes to
cxx_eval_bare_aggregate. This fixes PR94470, and we now are also able to fully
reduce the initializers of 'arr' and 'arr2' in the new test array57.C to
constant initializers.
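A minimal sketch of why appending is wrong for a self-modifying initializer, assuming a toy CONSTRUCTOR modelled as a vector of (index, value) pairs sorted by index. ToyCtor and get_or_insert_field are hypothetical names for illustration and do not mirror the real get_or_insert_ctor_field signature.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a CONSTRUCTOR: (field index, value) pairs sorted by index.
using ToyCtor = std::vector<std::pair<int, int>>;

// Return the value slot for INDEX, inserting a zero-valued element at its
// sorted position if none exists yet.
int &get_or_insert_field (ToyCtor &ctor, int index)
{
  auto pos = std::lower_bound (ctor.begin (), ctor.end (), index,
                               [] (const std::pair<int, int> &elt, int idx)
                               { return elt.first < idx; });
  if (pos == ctor.end () || pos->first != index)
    pos = ctor.insert (pos, std::make_pair (index, 0));
  return pos->second;
}

// Model an initializer for element 0 that stores to element 1 as a side
// effect before element 0 itself is recorded.
ToyCtor build_self_modifying ()
{
  ToyCtor ctor;
  get_or_insert_field (ctor, 1) = 5;  // side effect of evaluating element 0
  get_or_insert_field (ctor, 0) = 1;  // must land before index 1, not at the end
  return ctor;
}

// Usage check: the elements end up in index order.
void check_toy_ctor ()
{
  const ToyCtor ctor = build_self_modifying ();
  assert (ctor.size () == 2);
  assert (ctor[0].first == 0 && ctor[0].second == 1);
  assert (ctor[1].first == 1 && ctor[1].second == 5);
}
```

Appending element 0 unconditionally would have produced the order 1, 0 here, which is the kind of misplaced constructor_elt described above.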
gcc/cp/ChangeLog:
PR c++/94470
* constexpr.c (get_or_insert_ctor_field): Set default value of parameter
'pos_hint' to -1.
(cxx_eval_bare_aggregate): Use get_or_insert_ctor_field instead of
assuming the next index belongs at the end of the new CONSTRUCTOR.
(cxx_eval_store_expression): Revert PR c++/78572 fix.
gcc/testsuite/ChangeLog:
PR c++/94470
* g++.dg/cpp1y/constexpr-nsdmi8.C: New test.
* g++.dg/cpp1y/constexpr-nsdmi9.C: New test.
* g++.dg/init/array57.C: New test.
From Xcode 11.4 on 10.14/15, use of 10.6 and 10.7 is deprecated.
The tools issue diagnostics if -mmacosx-version-min= < 10.8.
Adjust the testcase to avoid that usage on 10.14 and 10.15 for now.
gcc/testsuite/ChangeLog:
2020-04-13 Iain Sandoe <iain@sandoe.co.uk>
* gcc.dg/darwin-version-1.c: Use -mmacosx-version-min= 10.8
for system versions 10.14 and 10.15.
The idea is to not attempt another resolution of a pointer if an error has
occurred previously.
2020-04-13 Linus Koenig <link@sig-st.de>
PR fortran/94192
* resolve.c (resolve_fl_var_and_proc): Set flag "error" to 1 if
pointer is found to not have an assumed rank or a deferred shape.
* simplify.c (simplify_bound): If an error has been issued for a
given pointer, one should not attempt to find its bounds.
2020-04-13 Linus Koenig <link@sig-st.de>
PR fortran/94192
* gfortran.dg/bound_resolve_after_error_1.f90: New test.
My fix for 94147 was confusing no-linkage with internal linkage, at
the language level. That's wrong. (the std is confusing here, because
it describes linkage of names (which is wrong), and lambdas have no
names)
Lambdas with extra-scope have linkage. However, at the
implementation-level that linkage is at least as restricted as the
linkage of the extra-scope decl.
Further, when instantiating a variable initialized by a lambda, we
must determine the visibility of the variable itself, before
instantiating its initializer. If the template arguments are internal
(or no-linkage), the variable will have internal linkage, regardless
of the linkage of the template it is instantiated from. We need to
know that before instantiating the lambda, so we can restrict its
linkage correctly.
* decl2.c (determine_visibility): A lambda's visibility is
affected by its extra scope.
* pt.c (instantiate_decl): Determine var's visibility before
instantiating its initializer.
* tree.c (no_linkage_check): Revert code looking at visibility of
lambda's extra scope.
gcc/cp/
* g++.dg/cpp0x/lambda/pr94426-[12].C: New.
* g++.dg/abi/lambda-vis.C: Drop a warning.
* g++.dg/cpp0x/lambda/lambda-mangle.C: Lambda visibility on
variable changes.
* g++.dg/opt/dump1.C: Drop warnings of no import.
We must restore the frame pointer in word_mode for eh_return epilogues
since the upper 32 bits of RBP register can have any values.
Tested on Linux/x32 and Linux/x86-64.
PR target/94556
* config/i386/i386.c (ix86_expand_epilogue): Restore the frame
pointer in word_mode for eh_return epilogues.
Some insns, which operate on SImode operands, output an assembler template
that comprises multiple instructions using HImode operands. To access
the high word of an SImode operand, an operand selector '%H' is used to
offset the operand value by a constant amount.
When one of these HImode operands is a memory reference to a post_inc,
the address does not need to be offset, since the preceding instruction
has already offset the address to the correct value.
This fixes an ICE in change_address_1, at emit-rtl.c:2318 for
gcc.c-torture/execute/pr20527-1.c in the "-mlarge -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects" configuration.
This test generated the following insn, and the attempt to output the
high part of the post_inc address caused the ICE.
(set (reg:SI 6 R6)
(minus:SI (reg:SI 6 R6)
(mem:SI (post_inc:PSI (reg:PSI 10 R10)) {subsi3}
gcc/ChangeLog:
2020-04-13 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* config/msp430/msp430.c (msp430_print_operand): Don't add offsets to
memory references in %B, %C and %D operand selectors when the inner
operand is a post increment address.
The %C and %D operand modifiers are supposed to access the 3rd and 4th
words of a 64-bit value, so for memory references they need to offset
the given address by 4 and 6 bytes respectively.
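The offset rule for these selectors can be sketched as below. This is a toy model, not the code in msp430_print_operand: word_index and selector_offset are hypothetical names, and the 2-byte offset for %B is an assumption inferred from the 16-bit word size rather than something stated in this message.

```cpp
// Zero-based index of the 16-bit word picked by a selector, or -1 if the
// selector is not one of B, C, D.
constexpr int word_index (char selector)
{
  return selector == 'B' ? 1
       : selector == 'C' ? 2
       : selector == 'D' ? 3
       : -1;
}

// Byte offset to add to a memory reference for SELECTOR. A post-increment
// address gets no offset, because the preceding instruction has already
// advanced the address to the right word.
constexpr int selector_offset (char selector, bool is_post_inc)
{
  return word_index (selector) < 0 ? -1
       : is_post_inc ? 0
       : word_index (selector) * 2;
}

static_assert (selector_offset ('C', false) == 4, "%C is the 3rd word");
static_assert (selector_offset ('D', false) == 6, "%D is the 4th word");
static_assert (selector_offset ('D', true) == 0, "post_inc needs no offset");
```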
gcc/ChangeLog:
2020-04-13 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* config/msp430/msp430.c (msp430_print_operand): Offset a %C memory
reference by 4 bytes, and %D memory reference by 6 bytes.
gcc/testsuite/ChangeLog:
2020-04-13 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* gcc.target/msp430/operand-modifiers.c: New test.
Removes the implementation of __traits(argTypes), which only supported
x86_64 targets. The only use of this trait was an unused va_arg()
function, which has been removed as well.
Reviewed-on: https://github.com/dlang/dmd/pull/11022
gcc/d/ChangeLog:
2020-04-13 Iain Buclaw <ibuclaw@gdcproject.org>
* Make-lang.in (D_FRONTEND_OBJS): Remove d/argtypes.o.
* d-target.cc (Target::toArgTypes): New function.
libphobos/ChangeLog:
2020-04-13 Iain Buclaw <ibuclaw@gdcproject.org>
* libdruntime/core/stdc/stdarg.d: Remove run-time va_list template.
Darwin mandates an indirection for variables in the common
section. Since the change to -fno-common, variables in some
of the thunk tests are now in the .data section where they
may be accessed directly. Remove the indirections from the
scan-assembler matches.
gcc/testsuite/ChangeLog:
2020-04-12 Iain Sandoe <iain@sandoe.co.uk>
* gcc.target/i386/indirect-thunk-1.c: Adjust for fno-common
change, removing indirections for vars in .data.
* gcc.target/i386/indirect-thunk-2.c: Likewise.
* gcc.target/i386/indirect-thunk-3.c: Likewise.
* gcc.target/i386/indirect-thunk-4.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-1.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-2.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-3.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-4.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-5.c: Likewise.
* gcc.target/i386/indirect-thunk-attr-6.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-1.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-2.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-3.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-4.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-1.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-2.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-3.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-4.c: Likewise.
V4SI, V8HI and V16QI modes of reduc_<code>_scal_<mode> expander
expand with SSE2 instructions (PSRLDQ and PCMPGTx) so use
TARGET_SSE2 as the relevant mode iterator condition.
PR target/94494
* config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE2
condition for V4SI, V8HI and V16QI modes.
testsuite/ChangeLog:
PR target/94494
* gcc.target/i386/pr94494.c: New test.
The test FAILs on powerpc64-linux with -m32 due to psabi warnings.
Furthermore, the test really needs -msse2 to reproduce on x86 -m32 at -O2.
2020-04-11 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94482
* gcc.dg/torture/pr94482.c: Add -Wno-psabi -w. Don't add -msse
and sse_runtime effective target on x86, instead only add -msse2
if target is sse2_runtime.
Sometimes the cselib_record_sp_cfa_base_equiv makes it into the var-tracking
used VALUEs and needs to be preserved.
2020-04-11 Jakub Jelinek <jakub@redhat.com>
PR debug/94495
PR target/94551
* cselib.c (cselib_record_sp_cfa_base_equiv): Set PRESERVED_VALUE_P on
val->val_rtx.
The expansions for await expressions were specific to particular
cases; this revises them to be more generic.
a: Revise co_await statement walkers.
We want to process the co_awaits one statement at a time.
We also want to be able to determine the insertion points for
new bind scopes needed to cater for temporaries that are
captured by reference and have lifetimes that need extension
to the end of the full expression. Likewise, the handling of
captured references in the evaluation of conditions might
result in the need to make a frame copy.
This reorganises the statement walking code to make it easier to
extend for these purposes.
b: Factor reference-captured temp code.
We want to be able to use the code that writes a new bind expr
with vars (and their initializers) from several places, so split
that out of the maybe_promote_captured_temps() function into a
new replace_statement_captures (). Update some comments.
c: Generalize await statement expansion.
This revises the expansion to avoid the need to expand conditionally
on the tree type. It resolves PR 94528.
gcc/cp/ChangeLog:
2020-04-10 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94538
* coroutines.cc (co_await_expander): Remove.
(expand_one_await_expression): New.
(process_one_statement): New.
(await_statement_expander): New.
(build_actor_fn): Revise to use per-statement expander.
(struct susp_frame_data): Reorder and comment.
(register_awaits): Factor code.
(replace_statement_captures): New, factored from...
(maybe_promote_captured_temps): ...here.
(await_statement_walker): Revise to process per statement.
(morph_fn_to_coro): Use revised susp_frame_data layout.
gcc/testsuite/ChangeLog:
2020-04-10 Iain Sandoe <iain@sandoe.co.uk>
PR c++/94538
* g++.dg/coroutines/pr94528.C: New test.
In C++20 this is well-formed:
using T = int[2];
T t(1, 2);
which means that std::is_constructible_v<int[2], int, int> should be true.
But constructible_expr immediately returned the error_mark_node when it
saw a list with more than one element. To give accurate results in
C++20, we have to try initializing the aggregate from a parenthesized list of
values.
To not repeat the same mistake as in c++/93790, if there's only one
element, I'm trying {} only when () didn't succeed. is_constructible5.C
verifies this.
In paren-init24.C std::is_nothrow_constructible_v doesn't work due to
error: invalid 'static_cast' from type 'int' to type 'int [1]'
and
error: functional cast to array type 'int [2]'
This needs to be fixed in libstdc++.
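A small sketch of the language feature involved, guarded by the standard feature-test macro __cpp_aggregate_paren_init so that it also builds before C++20. The function name sum_of_pair is hypothetical, and the trait itself is deliberately left out of the code because its result depends on the compiler and library version in use.

```cpp
using T = int[2];

// Initialize an array from a parenthesized list where the compiler
// supports it, and from a braced list otherwise.
constexpr int sum_of_pair ()
{
#if defined(__cpp_aggregate_paren_init)
  T t(1, 2);     // C++20 parenthesized aggregate initialization
#else
  T t = {1, 2};  // pre-C++20 fallback
#endif
  return t[0] + t[1];
}

static_assert (sum_of_pair () == 3, "both elements were initialized");
```

With a compiler containing this fix and -std=c++20, std::is_constructible_v<int[2], int, int> is expected to agree with the first branch.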
PR c++/94149
* method.c (constructible_expr): In C++20, try using parenthesized
initialization of aggregates to determine the result of
__is_constructible.
* g++.dg/cpp2a/paren-init24.C: New test.
* g++.dg/cpp2a/paren-init25.C: New test.
* g++.dg/ext/is_constructible5.C: New test.
gcc/testsuite/ChangeLog:
2020-04-10 Fritz Reese <foreese@gcc.gnu.org>
* gfortran.dg/asynchronous_5.f03: Add -fdump-tree-original and fix
patterns for scan-tree-dump.
... which as of PR89433 commit b48f44bf77 causes
an ICE. Not sure if this is actually supposed to be valid or invalid code.
Until the interactions between OpenACC and OpenMP 'target' get defined
properly, make this a compile-time error.
gcc/
PR middle-end/89433
PR middle-end/93465
* omp-general.c (oacc_verify_routine_clauses): Diagnose if
"#pragma omp declare target" has also been applied.
gcc/testsuite/
PR middle-end/89433
PR middle-end/93465
* c-c++-common/goacc-gomp/pr93465-1.c: New file.
As a prerequisite for PR94304, it becomes easier to manage selectively
compiling sublibraries when there's only one library to link to.
So a druntime convenience library is built to be part of phobos. However, a
separate druntime library is still built and installed, to allow linking
only to the core runtime explicitly, rather than pulling in the entire
standard library with it.
The gdc driver no longer generates an '-lgdruntime' option, and the
inclusion of the libdruntime library path has been removed from the
testsuite.
gcc/d/ChangeLog:
* d-spec.cc (LIBDRUNTIME): Remove.
(LIBDRUNTIME_PROFILE): Remove.
(lang_specific_driver): Don't link in libgdruntime.
gcc/testsuite/ChangeLog:
* lib/gdc.exp (gdc_link_flags): Remove libdruntime library path.
libphobos/ChangeLog:
* d_rules.am (libdgruntime_la_LINK): Move to libdruntime/Makefile.am.
(libgphobos_la_LINK): Move to src/Makefile.am
* libdruntime/Makefile.am: Add libgdruntime_convenience library.
* libdruntime/Makefile.in: Regenerate.
* src/Makefile.am (libgphobos_la_LIBADD): Add libgdruntime_convenience
library.
(libgphobos_la_DEPENDENCIES): Likewise.
* src/Makefile.in: Regenerate.
* testsuite/lib/libphobos.exp: Remove libdruntime library paths.
* testsuite/testsuite_flags.in: Likewise.
A composite literal key may not have a global definition, so
Gogo::define_global_names may not see it. In order to correctly
handle the case in which a predeclared identifier is used as a
composite literal key, do an explicit check of the global namespace.
Test case is https://golang.org/cl/227783.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/227784
2020-04-06 Fritz Reese <foreese@gcc.gnu.org>
This patch reorganizes I/O checking code. Checks which were done in the
matching phase which do not affect the match result are moved to the
resolution phase. Checks which were duplicated in both the matching phase
and resolution phase have been reduced to one check in the resolution phase.
Another section of code which used a global async_io_dt flag to check for
and assign the asynchronous attribute to variables used in asynchronous I/O
has been simplified.
Furthermore, this patch improves error reporting and expands test coverage
of I/O tags:
- "TAG must be an initialization expression" reported by io.c
(check_io_constraints) is replaced with an error from expr.c
(gfc_reduce_init_expr) indicating _why_ the expression is not a valid
initialization expression.
- Several distinct error messages regarding the check for scalar
+ character + default kind have been unified to one message reported by
resolve_tag or check_*_constraints.
gcc/fortran/ChangeLog:
2020-04-09 Fritz Reese <foreese@gcc.gnu.org>
PR fortran/87923
* gfortran.h (gfc_resolve_open, gfc_resolve_close): Add
locus parameter.
(gfc_resolve_dt): Add code parameter.
* io.c (async_io_dt, check_char_variable, is_char_type): Removed.
(resolve_tag_format): Add locus to error message regarding
zero-sized array in FORMAT tag.
(check_open_constraints, check_close_constraints): New functions
called at resolution time.
(gfc_match_open, gfc_match_close, match_io): Move checks which don't
affect the match result to new functions check_open_constraints,
check_close_constraints, check_io_constraints.
(gfc_resolve_open, gfc_resolve_close): Call new functions
check_open_constraints, check_close_constraints after all tags have
been independently resolved. Remove duplicate constraints which are
already verified by resolve_tag. Explicitly pass locus to all error
reports.
(compare_to_allowed_values): Add locus parameter and provide
explicit locus to all error reports.
(match_open_element, match_close_element, match_file_element,
match_dt_element, match_inquire_element): Remove redundant special
cases for ASYNCHRONOUS and IOMSG tags.
(gfc_resolve_dt): Remove redundant special case for format
expression. Call check_io_constraints, forwarding an I/O list as
the io_code parameter if present.
(check_io_constraints): Change return type to bool. Pass explicit
locus to error reports. Move generic checks after tag-specific
checks, since errors are no longer buffered. Move simplification of
format string to match_io. Remove redundant checks which are
verified by resolve_tag. Remove usage of async_io_dt flag and
explicitly mark symbols used in asynchronous I/O with the
asynchronous attribute.
* resolve.c (resolve_transfer, resolve_fl_namelist): Remove checks
for async_io_dt flag. This is now done in io.c
(check_io_constraints).
(gfc_resolve_code): Pass code locus to gfc_resolve_open,
gfc_resolve_close, gfc_resolve_dt.
gcc/testsuite/ChangeLog:
2020-04-09 Fritz Reese <foreese@gcc.gnu.org>
PR fortran/87923
* gfortran.dg/f2003_io_8.f03: Fix expected error messages.
* gfortran.dg/io_constraints_8.f90: Likewise.
* gfortran.dg/iomsg_2.f90: Likewise.
* gfortran.dg/pr66725.f90: Likewise.
* gfortran.dg/pr88205.f90: Likewise.
* gfortran.dg/write_check4.f90: Likewise.
* gfortran.dg/asynchronous_5.f03: New test.
* gfortran.dg/io_constraints_15.f90: Likewise.
* gfortran.dg/io_constraints_16.f90: Likewise.
* gfortran.dg/io_constraints_17.f90: Likewise.
* gfortran.dg/io_constraints_18.f90: Likewise.
* gfortran.dg/io_tags_1.f90: Likewise.
* gfortran.dg/io_tags_10.f90: Likewise.
* gfortran.dg/io_tags_2.f90: Likewise.
* gfortran.dg/io_tags_3.f90: Likewise.
* gfortran.dg/io_tags_4.f90: Likewise.
* gfortran.dg/io_tags_5.f90: Likewise.
* gfortran.dg/io_tags_6.f90: Likewise.
* gfortran.dg/io_tags_7.f90: Likewise.
* gfortran.dg/io_tags_8.f90: Likewise.
* gfortran.dg/io_tags_9.f90: Likewise.
* gfortran.dg/write_check5.f90: Likewise.
Here due to my recent change to store_init_value we were expanding the
initializer of aw knowing that we were initializing aw. When
cxx_eval_call_expression finished the constructor, it wanted to look up the
value of aw to set TREE_READONLY on it, but we haven't set DECL_INITIAL yet,
so decl_constant_value tried to instantiate the initializer again, leading to
infinite recursion. Fixed by optimizing the case of asking for the value
of ctx->object, which is ctx->value. It also would have worked to look in
the values hash table, so let's move that up before decl_constant_value as
well.
gcc/cp/ChangeLog
2020-04-09 Jason Merrill <jason@redhat.com>
PR c++/94523
* constexpr.c (cxx_eval_constant_expression) [VAR_DECL]: Look at
ctx->object and ctx->global->values first.
LWG 3324 changed the [cmp.alg] types to use std::compare_three_way
instead of the <=> operator, but we were still using the old
specification. In order to make the existing tests pass the N::X type
needs to be equality comparable, so that three_way_comparable is
satisfied and compare_three_way can be used.
As part of this change I noticed that the compare_three_way call
operator was unconditionally noexcept, which is incorrect.
* libsupc++/compare (compare_three_way): Fix noexcept-specifier.
(strong_order, weak_order, partial_order): Replace uses of <=> with
compare_three_way function object (LWG 3324).
* testsuite/18_support/comparisons/algorithms/partial_order.cc: Add
equality operator so that X satisfies three_way_comparable.
* testsuite/18_support/comparisons/algorithms/strong_order.cc:
Likewise.
* testsuite/18_support/comparisons/algorithms/weak_order.cc: Likewise.
Some more C++20 changes from P1614R2, "The Mothership has Landed".
This includes the proposed resolution for LWG 3426 to fix the three-way
comparison with nullptr_t.
The existing tests for unique_ptr comparisons don't actually check the
results, only that the expressions compile and are convertible to bool.
This also adds a test for the results of those comparisons for C++11 and
up.
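A minimal sketch of what checking the results (rather than only the types) can look like for comparisons against nullptr. It is not the new compare.cc test, and the function name is hypothetical. Only comparisons whose results are fully specified are used.

```cpp
#include <cassert>
#include <memory>

// Check the values of unique_ptr comparisons, not just that they compile.
void check_unique_ptr_null_comparisons ()
{
  std::unique_ptr<int> empty;
  std::unique_ptr<int> owner (new int (42));

  // An empty unique_ptr compares equal to nullptr in every form.
  assert (empty == nullptr && nullptr == empty);
  assert (!(empty != nullptr) && !(nullptr != empty));
  assert (!(empty < nullptr) && !(nullptr < empty));
  assert (empty <= nullptr && empty >= nullptr);

  // A non-empty unique_ptr does not.
  assert (owner != nullptr && !(owner == nullptr));
  assert (owner != empty);
}
```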
* include/bits/unique_ptr.h (operator<=>): Define for C++20.
* testsuite/20_util/default_delete/48631_neg.cc: Adjust dg-error line.
* testsuite/20_util/default_delete/void_neg.cc: Likewise.
* testsuite/20_util/unique_ptr/comparison/compare.cc: New test.
* testsuite/20_util/unique_ptr/comparison/compare_c++20.cc: New test.
This fixes an ICE in rtl_verify_fallthru, at cfgrtl.c:2970, for
gcc.c-torture/execute/20071210-1.c with -mcpu=msp430 at -O2
and above.
The epilogue_helper insn was treated as a regular insn which will
fallthru, so when a barrier is emitted after it, RTL verification failed
in rtl_verify_fallthru.
gcc/ChangeLog:
2020-04-09 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* config/msp430/msp430.c (msp430_expand_epilogue): Use emit_jump_insn
to emit the epilogue_helper insn.
* config/msp430/msp430.md (epilogue_helper): Add a return insn to the
RTL pattern.
On the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94495#c5
testcase GCC emits worse debug info after the PR92264 cselib.c
changes.
The difference at -O2 -g -dA in the assembly is (when ignoring debug info):
# DEBUG g => [argp]
# DEBUG k => [argp+0x20]
# DEBUG j => [argp+0x18]
# DEBUG a => di
# DEBUG b => si
# DEBUG c => dx
# DEBUG d => cx
# DEBUG h => [argp+0x8]
# DEBUG e => r8
# DEBUG i => [argp+0x10]
# DEBUG f => r9
...
.LVL4:
+ # DEBUG h => [sp+0x10]
+ # DEBUG i => [sp+0x18]
+ # DEBUG j => [sp+0x20]
+ # DEBUG k => [sp+0x28]
# DEBUG c => entry_value
# SUCC: EXIT [always] count:1073741824 (estimated locally)
ret
.LVL5:
+ # DEBUG k => [argp+0x20]
# DEBUG a => bx
# DEBUG b => si
# DEBUG c => dx
# DEBUG d => cx
# DEBUG e => r8
# DEBUG f => r9
+ # DEBUG h => [argp+0x8]
+ # DEBUG i => [argp+0x10]
+ # DEBUG j => [argp+0x18]
This means that before the changes, h, i, j, k could be all expressed
in DW_AT_location directly with DW_OP_fbreg <some_offset>, but now we need
to use a location list, where in the first part of the function and last
part of the function (everything except the ret instruction) we use that
DW_OP_fbreg <some_offset>, but for the single ret instruction we instead
say those values live in something pointed by stack pointer + offset.
It is true, but only because stack pointer + offset is equal to DW_OP_fbreg
<some_offset> at that point.
The var-tracking pass has for !frame_pointer_needed functions code to
canonicalize stack pointer uses in the insns before it hands it over
to cselib to cfa_base_rtx + offset depending on the stack depth at each
point. The problem is that on the last epilogue pop insn (the one right
before ret) the canonicalization is sp = argp - 8 and add_stores records
a MO_VAL_SET operation for that argp - 8 value (which is the
SP_DERIVED_VALUE_P VALUE the cselib changes canonicalize sp based accesses
on) and thus var-tracking from that point onwards tracks that that VALUE
(2:2) now lives in sp. At the end of function it of course needs to forget
it again (or it would need on any changes to sp). But when processing
that uop, we note that the VALUE has changed and anything based on it
changed too, so emit changes for everything. Before that var-tracking
itself doesn't track it in any register, so uses cselib and cselib knows
through the permanent equivs how to compute it using argp (i.e. what
will be DW_OP_fbreg).
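The equivalence between the two forms in the dump above is plain arithmetic. The sketch below assumes the canonicalization sp = argp - 8 quoted for the last epilogue pop; to_argp_offset is a hypothetical helper, not a var-tracking function.

```cpp
// Rewrite an sp-relative offset as an argp-relative one, given the
// current value of sp - argp (here -8, just before the ret).
constexpr long to_argp_offset (long sp_offset, long sp_minus_argp)
{
  return sp_offset + sp_minus_argp;
}

// The four locations from the dump: [sp+N] names the same slot as [argp+N-8].
static_assert (to_argp_offset (0x10, -8) == 0x08, "h");
static_assert (to_argp_offset (0x18, -8) == 0x10, "i");
static_assert (to_argp_offset (0x20, -8) == 0x18, "j");
static_assert (to_argp_offset (0x28, -8) == 0x20, "k");
```

So the extra location list entries for the ret instruction describe the same stack slots as the argp-based ones and add no information.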
The following fix has two parts. One is that it detects if cselib can compute
a certain VALUE using the cfa_base_rtx and for such VALUEs doesn't add
the MO_VAL_SET operation, as it is better to express them using cfa_base_rtx
rather than temporarily through something else. The other is to make sure
we reuse in !frame_pointer_needed the single SP_DERIVED_VALUE_P VALUE in
other extended basic blocks too (and other VALUEs too). This can be done
because we have computed the stack depths at the start of each basic block
in vt_stack_adjustments and while cselib_reset_table is called at the end
of each extended bb, which throws away all hard registers (but the magic
cfa_base_rtx) and so can hint cselib.c at the start of the ebb what VALUE
the sp hard reg has. That means fewer VALUEs during var-tracking and more
importantly that they will all have the cfa_base_rtx + offset equivalency.
I have performed 4 bootstraps+regtests (x86_64-linux and i686-linux,
each with this patch (that is the new cselib + var-tracking variant) and
once with that patch reverted as well as all other cselib.c changes from
this month; once that bootstrapped, I've reapplied the cselib.c changes and
this patch and rebuilt cc1plus, so that the content is comparable, but built
with the pre-Apr 2 cselib.c+var-tracking.c (that is the old cselib one)).
Below are readelf -WS cc1plus | grep debug_ filtered to only have debug
sections whose size actually changed, followed by dwlocstat results on
cc1plus. This shows that there was about 3% shrink in those .debug*
sections for 32-bit and 1% shrink for 64-bit, with minor variable coverage
changes one or the other way that are IMHO insignificant.
32-bit old cselib
[33] .debug_info PROGBITS 00000000 29139c0 710e5fa 00 0 0 1
[34] .debug_abbrev PROGBITS 00000000 9a21fba 21ad6d 00 0 0 1
[35] .debug_line PROGBITS 00000000 9c3cd27 1a05e56 00 0 0 1
[36] .debug_str PROGBITS 00000000 b642b7d 7cad09 01 MS 0 0 1
[37] .debug_loc PROGBITS 00000000 be0d886 5792627 00 0 0 1
[38] .debug_ranges PROGBITS 00000000 1159fead e57218 00 0 0 1
sum 263075589B
32-bit new cselib + var-tracking
[33] .debug_info PROGBITS 00000000 29129c0 71065d1 00 0 0 1
[34] .debug_abbrev PROGBITS 00000000 9a18f91 21af28 00 0 0 1
[35] .debug_line PROGBITS 00000000 9c33eb9 195dffc 00 0 0 1
[36] .debug_str PROGBITS 00000000 b591eb5 7cace0 01 MS 0 0 1
[37] .debug_loc PROGBITS 00000000 bd5cb95 50185bf 00 0 0 1
[38] .debug_ranges PROGBITS 00000000 10d75154 e57068 00 0 0 1
sum 254515196B (8560393B smaller)
64-bit old cselib
[33] .debug_info PROGBITS 0000000000000000 25e64b0 84d7cc9 00 0 0 1
[34] .debug_abbrev PROGBITS 0000000000000000 aabe179 225e2d 00 0 0 1
[35] .debug_line PROGBITS 0000000000000000 ace3fa6 19a3505 00 0 0 1
[37] .debug_loc PROGBITS 0000000000000000 ce6e960 89707bc 00 0 0 1
[38] .debug_ranges PROGBITS 0000000000000000 157df11c 1c59a70 00 0 0 1
sum 342274599B
64-bit new cselib + var-tracking
[33] .debug_info PROGBITS 0000000000000000 25e64b0 84d8e86 00 0 0 1
[34] .debug_abbrev PROGBITS 0000000000000000 aabf336 225e8d 00 0 0 1
[35] .debug_line PROGBITS 0000000000000000 ace51c3 199ded5 00 0 0 1
[37] .debug_loc PROGBITS 0000000000000000 ce6a54d 85f62da 00 0 0 1
[38] .debug_ranges PROGBITS 0000000000000000 15460827 1c59a20 00 0 0 1
sum 338610402B (3664197B smaller)
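The quoted savings can be rechecked from the sums above. The helper names below are hypothetical, and the shrink is computed in hundredths of a percent, rounded down, to stay in integer arithmetic.

```cpp
// Bytes saved between the old and new section size sums.
constexpr long long saved_bytes (long long old_sum, long long new_sum)
{
  return old_sum - new_sum;
}

// Relative shrink in hundredths of a percent, rounded down.
constexpr long long shrink_hundredths_percent (long long old_sum,
                                               long long new_sum)
{
  return (old_sum - new_sum) * 10000 / old_sum;
}

// 32-bit: 8560393 bytes, about 3.25%.
static_assert (saved_bytes (263075589LL, 254515196LL) == 8560393LL, "32-bit");
static_assert (shrink_hundredths_percent (263075589LL, 254515196LL) == 325, "");

// 64-bit: 3664197 bytes, about 1.07%.
static_assert (saved_bytes (342274599LL, 338610402LL) == 3664197LL, "64-bit");
static_assert (shrink_hundredths_percent (342274599LL, 338610402LL) == 107, "");
```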
32-bit old cselib
cov% samples cumul
0..10 1231599/48% 1231599/48%
11..20 31017/1% 1262616/49%
21..30 36495/1% 1299111/51%
31..40 35846/1% 1334957/52%
41..50 47179/1% 1382136/54%
51..60 41203/1% 1423339/56%
61..70 65504/2% 1488843/58%
71..80 59656/2% 1548499/61%
81..90 104399/4% 1652898/65%
91..100 882231/34% 2535129/100%
32-bit new cselib + var-tracking
cov% samples cumul
0..10 1230542/48% 1230542/48%
11..20 30385/1% 1260927/49%
21..30 36393/1% 1297320/51%
31..40 36053/1% 1333373/52%
41..50 47670/1% 1381043/54%
51..60 41599/1% 1422642/56%
61..70 65902/2% 1488544/58%
71..80 59911/2% 1548455/61%
81..90 104607/4% 1653062/65%
91..100 882067/34% 2535129/100%
64-bit old cselib
cov% samples cumul
0..10 1233211/48% 1233211/48%
11..20 31120/1% 1264331/49%
21..30 39230/1% 1303561/51%
31..40 38887/1% 1342448/52%
41..50 47519/1% 1389967/54%
51..60 45264/1% 1435231/56%
61..70 69431/2% 1504662/59%
71..80 62114/2% 1566776/61%
81..90 104587/4% 1671363/65%
91..100 876085/34% 2547448/100%
64-bit new cselib + var-tracking
cov% samples cumul
0..10 1233471/48% 1233471/48%
11..20 31093/1% 1264564/49%
21..30 39217/1% 1303781/51%
31..40 38851/1% 1342632/52%
41..50 47488/1% 1390120/54%
51..60 45224/1% 1435344/56%
61..70 69409/2% 1504753/59%
71..80 62140/2% 1566893/61%
81..90 104616/4% 1671509/65%
91..100 875939/34% 2547448/100%
2020-04-09 Jakub Jelinek <jakub@redhat.com>
PR debug/94495
* cselib.h (cselib_record_sp_cfa_base_equiv,
cselib_sp_derived_value_p): Declare.
* cselib.c (cselib_record_sp_cfa_base_equiv,
cselib_sp_derived_value_p): New functions.
* var-tracking.c (add_stores): Don't record MO_VAL_SET for
cselib_sp_derived_value_p values.
(vt_initialize): Call cselib_record_sp_cfa_base_equiv at the
start of extended basic blocks other than the first one
for !frame_pointer_needed functions.
This patch implements the "arm_sve_vector_bits" attribute, which can be
used to create fixed-length versions of an SVE type while maintaining
their "SVEness". For example, when __ARM_FEATURE_SVE_BITS==256:
typedef svint32_t vec __attribute__((arm_sve_vector_bits(256)));
creates a 256-bit version of svint32_t.
The attribute itself is quite simple. However, it means that we now
need to implement the full PCS rules for scalable types, whereas
previously we only needed to handle scalable types that were built
directly into the compiler. See:
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst
for more information about these rules.
2020-04-09 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/sourcebuild.texi (aarch64_sve_hw, aarch64_sve128_hw)
(aarch64_sve256_hw, aarch64_sve512_hw, aarch64_sve1024_hw)
(aarch64_sve2048_hw): Document.
* config/aarch64/aarch64-protos.h
(aarch64_sve::handle_arm_sve_vector_bits_attribute): Declare.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_SVE_VECTOR_OPERATIONS when SVE is enabled.
* config/aarch64/aarch64-sve-builtins.cc (matches_type_p): New
function.
(find_type_suffix_for_scalar_type): Use it instead of comparing
TYPE_MAIN_VARIANTs.
(function_resolver::infer_vector_or_tuple_type): Likewise.
(function_resolver::require_vector_type): Likewise.
(handle_arm_sve_vector_bits_attribute): New function.
* config/aarch64/aarch64.c (pure_scalable_type_info): New class.
(aarch64_attribute_table): Add arm_sve_vector_bits.
(aarch64_return_in_memory_1):
(pure_scalable_type_info::piece::get_rtx): New function.
(pure_scalable_type_info::num_zr): Likewise.
(pure_scalable_type_info::num_pr): Likewise.
(pure_scalable_type_info::get_rtx): Likewise.
(pure_scalable_type_info::analyze): Likewise.
(pure_scalable_type_info::analyze_registers): Likewise.
(pure_scalable_type_info::analyze_array): Likewise.
(pure_scalable_type_info::analyze_record): Likewise.
(pure_scalable_type_info::add_piece): Likewise.
(aarch64_some_values_include_pst_objects_p): Likewise.
(aarch64_returns_value_in_sve_regs_p): Use pure_scalable_type_info
to analyze whether the type is returned in SVE registers.
(aarch64_takes_arguments_in_sve_regs_p): Likewise whether the type
is passed in SVE registers.
(aarch64_pass_by_reference_1): New function, extracted from...
(aarch64_pass_by_reference): ...here. Use pure_scalable_type_info
to analyze whether the type is a pure scalable type and, if so,
whether it should be passed by reference.
(aarch64_return_in_msb): Return false for pure scalable types.
(aarch64_function_value_1): Fold back into...
(aarch64_function_value): ...this function. Use
pure_scalable_type_info to analyze whether the type is a pure
scalable type and, if so, which registers it should use. Handle
types that include pure scalable types but are not themselves
pure scalable types.
(aarch64_return_in_memory_1): New function, split out from...
(aarch64_return_in_memory): ...here. Use pure_scalable_type_info
to analyze whether the type is a pure scalable type and, if so,
whether it should be returned by reference.
(aarch64_layout_arg): Remove orig_mode argument. Use
pure_scalable_type_info to analyze whether the type is a pure
scalable type and, if so, which registers it should use. Handle
types that include pure scalable types but are not themselves
pure scalable types.
(aarch64_function_arg): Update call accordingly.
(aarch64_function_arg_advance): Likewise.
(aarch64_pad_reg_upward): On big-endian targets, return false for
pure scalable types that are smaller than 16 bytes.
(aarch64_member_type_forces_blk): New function.
(aapcs_vfp_sub_candidate): Exit early for built-in SVE types.
(aarch64_short_vector_p): Return false for VECTOR_TYPEs that
correspond to built-in SVE types. Do not rely on a vector mode
if the type includes a pure scalable type. When returning true,
assert that the mode is not an SVE mode.
(aarch64_vfp_is_call_or_return_candidate): Do not check for SVE
built-in types here. When returning true, assert that the type
does not have an SVE mode.
(aarch64_can_change_mode_class): Don't allow anything to change
between a predicate mode and a non-predicate mode. Also don't
allow changes between SVE vector modes and other modes that
might be bigger than 128 bits.
(aarch64_invalid_binary_op): Reject binary operations that mix
SVE and GNU vector types.
(TARGET_MEMBER_TYPE_FORCES_BLK): Define.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/attributes_1.c: New test.
* gcc.target/aarch64/sve/acle/general/attributes_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_4.c: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_6.c: Likewise.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct.h: New file.
* gcc.target/aarch64/sve/pcs/struct_1_128.c: New test.
* gcc.target/aarch64/sve/pcs/struct_1_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_1_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_1_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_1_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_2_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_2_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_2_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_2_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_2_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_3_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_3_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/struct_3_512.c: Likewise.
* lib/target-supports.exp (check_effective_target_aarch64_sve128_hw)
(check_effective_target_aarch64_sve512_hw)
(check_effective_target_aarch64_sve1024_hw)
(check_effective_target_aarch64_sve2048_hw): New procedures.