PR analyzer/99196 describes a false positive from -fanalyzer due to
the analyzer not "knowing" that calls to GNU libc's error(3) with a
nonzero status terminate the process and thus don't return.
This patch fixes the false positive by special-casing "error" and
"error_at_line".
gcc/analyzer/ChangeLog:
PR analyzer/99196
* engine.cc (exploded_node::on_stmt): Provide terminate_path
flag as a way for on_call_pre to terminate the current analysis
path.
* region-model-impl-calls.cc (call_details::num_args): New.
(region_model::impl_call_error): New.
* region-model.cc (region_model::on_call_pre): Add param
"out_terminate_path". Handle "error" and "error_at_line".
* region-model.h (call_details::num_args): New decl.
(region_model::on_call_pre): Add param "out_terminate_path".
(region_model::impl_call_error): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/99196
* gcc.dg/analyzer/error-1.c: New test.
* gcc.dg/analyzer/error-2.c: New test.
* gcc.dg/analyzer/error-3.c: New test.
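For illustration only, the kind of code affected by the error(3) special-casing above looks roughly like this. It is a minimal sketch, not taken from the PR or the new tests, and `first_char` is a hypothetical name. It assumes GNU libc's <error.h> is available.

```c
#include <error.h>   /* GNU libc extension providing error ().  */
#include <stddef.h>

/* Hypothetical example: return the first character of S.
   When S is NULL, error () is called with a nonzero status, so the
   process exits and the dereference below is never reached.  */
int
first_char (const char *s)
{
  if (s == NULL)
    error (1, 0, "unexpected null string");
  return s[0];
}
```

An analysis that treats the call to error () as returning would consider `s[0]` reachable with a null `s`. Terminating the analysis path at the call avoids that.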
This patch introduces an internal tune flag to break up VL-based scalar ops
into a GP-reg scalar op with the VL read kept separate. This can be preferable on some CPUs.
I went for a tune param rather than extending the rtx costs as our RTX costs tables aren't set up to track
this intricacy.
I've confirmed that on the simple loop:
void vadd (int *dst, int *op1, int *op2, int count)
{
  for (int i = 0; i < count; ++i)
    dst[i] = op1[i] + op2[i];
}
we now split the incw into a cntw outside the loop and the add inside.
+ cntw x5
...
loop:
- incw x4
+ add x4, x4, x5
gcc/ChangeLog:
* config/aarch64/aarch64-tuning-flags.def (cse_sve_vl_constants):
Define.
* config/aarch64/aarch64.md (add<mode>3): Force CONST_POLY_INT immediates
into a register when the above is enabled.
* config/aarch64/aarch64.c (neoversev1_tunings):
AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.
(aarch64_rtx_costs): Use AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS.
gcc/testsuite/
* gcc.target/aarch64/sve/cse_sve_vl_constants_1.c: New test.
This patch implements conversions between _Float128 and the 3 Decimal floating
types. It does this by extending the dfp-bit conversions to add a new
binary floating point type (KF), and doing the conversions in the same manner
as the other binary/decimal conversions.
For conversions from _Float128 to Decimal, this patch uses a function
(__sprintfkf) instead of the sprintf function to convert long double values to
strings. The __sprintfkf function determines if GLIBC 2.32 or newer is used
and calls the IEEE 128-bit version of sprintf (__sprintfieee128). If the GLIBC
is earlier than 2.32, the code will convert _Float128 to __ibm128 and then use
the normal sprintf to convert this value.
For conversions from Decimal to _Float128, this patch uses a function
(__strtokf) instead of strtold to convert the strings from the Decimal
conversion to long double. The __strtokf function determines if GLIBC 2.32 or
newer is used, and if it is, calls the IEEE 128-bit version (__strtoieee128).
If the GLIBC is earlier than 2.32, the code will call strtold and convert the
__ibm128 value to _Float128.
These functions will primarily be used if/when the default PowerPC long double
type is changed to IEEE 128-bit, but they could also be used if the user
explicitly converts _Float128 to/from a Decimal type.
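The string-based conversion scheme described above can be sketched in portable C. This is only an illustration under simplifying assumptions: it uses `double` with the standard snprintf and strtod in place of _Float128, __sprintfkf and __strtokf, it does no glibc version check, and `roundtrip_via_string` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Convert X to a decimal string, then parse the string back into a
   binary floating point value.  17 significant digits are enough to
   round-trip an IEEE 754 double.  */
double
roundtrip_via_string (double x)
{
  char buf[64];
  snprintf (buf, sizeof buf, "%.17g", x);
  return strtod (buf, NULL);
}
```

The decimal string is the common intermediate form. The real conversions differ in which printing and parsing routine they select for the 128-bit type.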
libgcc/
2021-02-22 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/_dd_to_kf.c: New file.
* config/rs6000/_kf_to_dd.c: New file.
* config/rs6000/_kf_to_sd.c: New file.
* config/rs6000/_kf_to_td.c: New file.
* config/rs6000/_sd_to_kf.c: New file.
* config/rs6000/_sprintfkf.c: New file.
* config/rs6000/_sprintfkf.h: New file.
* config/rs6000/_strtokf.h: New file.
* config/rs6000/_strtokf.c: New file.
* config/rs6000/_td_to_kf.c: New file.
* config/rs6000/quad-float128.h: Add new declarations.
* config/rs6000/t-float128 (fp128_dec_funcs): New macro.
(fp128_decstr_funcs): New macro.
(ibm128_dec_funcs): New macro.
(fp128_ppc_funcs): Add the new conversions.
(fp128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(fp128_decstr_objs): Force __float128 <-> string conversions to be
compiled with -mabi=ibmlongdouble.
(ibm128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(FP128_CFLAGS_DECIMAL): New macro.
(IBM128_CFLAGS_DECIMAL): New macro.
* dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
* dfp-bit.h (BFP_KIND): Add new binary floating point kind for
IEEE 128-bit floating point.
(DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
(BFP_SPRINTF): New macro.
See gcc/config/newlib-stdint.h, where targets that have
LONG_TYPE_SIZE == 32, get __INT32_TYPE__ defined to "long
int".
All these tests have "typedef __INT32_TYPE__ int32_t;".
Thus the tests fail for 32-bit newlib targets because the related
warning messages are matched against "aka int", while the
message emitted for these targets has "aka long int".
Tested cris-elf and x86_64-linux, committed as obvious.
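A small sketch of why only the diagnostics differ: the typedef below mirrors the one used in the tests, under the hypothetical name `my_int32`. It assumes a compiler that predefines __INT32_TYPE__, as GCC does.

```c
#include <limits.h>

/* __INT32_TYPE__ is predefined by the compiler.  Which type it names is
   target-dependent: "int" on many targets, "long int" on others.  */
typedef __INT32_TYPE__ my_int32;

/* Either way the type is 32 bits wide.  Only its spelling in
   diagnostics ("aka int" vs. "aka long int") differs.  */
int
my_int32_bits (void)
{
  return (int) (sizeof (my_int32) * CHAR_BIT);
}
```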
gcc/testsuite:
* g++.dg/warn/Warray-bounds-10.C, g++.dg/warn/Warray-bounds-11.C,
g++.dg/warn/Warray-bounds-12.C, g++.dg/warn/Warray-bounds-13.C:
Handle __INT32_TYPE__ being "long int".
Also, tweak the scan-assembler regexps to include a tab,
lest they may spuriously match file-paths in the emitted
assembly code, should some be added at some point. And, add
"mul", "move" and (non-addi-)"add" to insns that shouldn't
appear.
gcc/testsuite:
* gcc.target/cris/biap.c: Add a Y+=X*2 to the Y+=X*4.
Ever since the canonicalization clean-up of (mult X (1 << N)) into
(ashift X N) outside addresses, the CRIS addi patterns have been
unmatched. No big deal.
Unfortunately, nobody thought of adjusting reloaded addresses, so
transforming mult into a shift has to be kludged for when reload
decides that it has to move an address like (plus (mult reg0 4) reg1)
into a register, as happens building libgfortran. (No, simplify_rtx
et al don't automatically DTRT.) Something less kludgy would make
sense if it wasn't for the current late development stage and reload
being deprecated. I don't know whether this issue is absent for LRA,
though.
I added a test-case for the reload issue even though it was exposed by
a libgfortran build, so that the issue is also covered by C/C++ builds.
It goes in the CRIS test-suite rather than being a generic test, to
avoid bad feelings from anyone preferring short test-times to
redundant coverage.
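At the source level, the equivalence behind the mult-to-ashift canonicalization can be sketched as follows. The function names are hypothetical, and unsigned arithmetic is used so that both forms are well defined for all inputs.

```c
/* Scaled-index arithmetic written with a multiplication ...  */
unsigned int
scaled_add_mult (unsigned int y, unsigned int x)
{
  return y + x * 4;
}

/* ... and with the equivalent shift, since 4 == 1 << 2.  */
unsigned int
scaled_add_shift (unsigned int y, unsigned int x)
{
  return y + (x << 2);
}
```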
gcc:
* config/cris/cris.c (cris_print_operand) <'T'>: Change
valid operand from an addi mult-value to shift-value.
* config/cris/cris.md (*addi): Change expression of scaled
operand from mult to ashift.
* config/cris/cris.md (*addi_reload): New insn_and_split.
gcc/testsuite:
* gcc.target/cris/torture/sync-reload-mul-1.c: New test.
The fix for 98741 introduced two issues. (a) Recursive header units
ICEd because we tried to read the preprocessor state after having
failed to read the config. (b) We could have duplicate imports of
named modules in our pending queue, and that would lead to duplicate
requests for pathnames, which coupled with the use of a null-pathname
to indicate we'd asked could lead to desynchronization with the module
mapper. Fixed by adding a 'visited' flag to module state, so we ask
exactly once.
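The idea of the 'visited' flag can be sketched in plain C. This is not the module.cc code: `struct module` and `count_requests` are hypothetical stand-ins that only count how many pathname requests would be issued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a module's state.  */
struct module
{
  const char *name;
  bool visited;   /* Set once a pathname has been requested.  */
};

/* Walk the pending queue and count the pathname requests that would
   be issued.  A module queued more than once is asked about once.  */
size_t
count_requests (struct module **pending, size_t n)
{
  size_t requests = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (pending[i]->visited)
        continue;
      pending[i]->visited = true;
      requests++;
    }
  return requests;
}
```

With a dedicated flag, "already asked" no longer has to be encoded as a null pathname.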
PR c++/99174
gcc/cp
* module.cc (struct module_state): Add visited_p flag.
(name_pending_imports): Use it to avoid duplicate requests.
(preprocess_module): Don't read preprocessor state if we failed to
load a module's config.
gcc/testsuite/
* g++.dg/modules/pr99174-1_a.C: New.
* g++.dg/modules/pr99174-1_b.C: New.
* g++.dg/modules/pr99174-1_c.C: New.
* g++.dg/modules/pr99174.H: New.
gcc/ChangeLog:
PR target/85074
* config/pa/pa.c (TARGET_ASM_CAN_OUTPUT_MI_THUNK): Define as
hook_bool_const_tree_hwi_hwi_const_tree_true.
(pa_asm_output_mi_thunk): Add support for nonzero vcall_offset.
A member function can be defined in a different header-file than the
one defining the class. In such situations we must unmark the decl as
imported. When the entity is a template we failed to unmark the
template_decl.
Perhaps the duplication of these flags on the template_decl from the
underlying decl is an error. I sat on the fence about it for a long
time during development, but I don't think now is the time to change
that (barring catastrophic bugs).
PR c++/99153
gcc/cp/
* decl.c (duplicate_decls): Move DECL_MODULE_IMPORT_P propagation
to common-path.
* module.cc (set_defining_module): Add assert.
gcc/testsuite/
* g++.dg/modules/pr99153_a.H: New.
* g++.dg/modules/pr99153_b.H: New.
gcc/fortran/ChangeLog:
PR fortran/99171
* trans-openmp.c (gfc_omp_is_optional_argument): Regard optional
dummy procs as nonoptional as no special treatment is needed.
libgomp/ChangeLog:
PR fortran/99171
* testsuite/libgomp.fortran/dummy-procs-1.f90: New test.
This adds another dump of the SLP subgraph we're throwing at costing.
2021-02-22 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_bb_vectorization_profitable_p): Dump
costed subgraph.
This adds a missing accumulation to ret.
2021-02-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/99165
* gimple-ssa-store-merging.c (pass_store_merging::process_store):
Accumulate changed to ret.
* g++.dg/pr99165.C: New testcase.
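The class of bug fixed here, a per-item "changed" result that is not folded into the overall return value, can be sketched as follows. This is not the store-merging code: `process_one` and `process_all` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-item step: clamp *V to be non-negative and report
   whether anything was changed.  */
static bool
process_one (int *v)
{
  if (*v < 0)
    {
      *v = 0;
      return true;
    }
  return false;
}

/* Process all items and report whether any of them changed.  */
bool
process_all (int *vals, size_t n)
{
  bool ret = false;
  for (size_t i = 0; i < n; i++)
    {
      bool changed = process_one (&vals[i]);
      ret |= changed;   /* Without this, the change would go unreported.  */
    }
  return ret;
}
```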
gcc/fortran/ChangeLog:
* trans-expr.c (gfc_conv_procedure_call): Do not add clobber to
allocatable intent(out) argument.
gcc/testsuite/ChangeLog:
* gfortran.dg/intent_optimize_3.f90: New test.
Move custom macros to acinclude.m4 so we can autogenerate aclocal.m4
with aclocal. This matches every other project in the tree.
libiberty/ChangeLog:
* Makefile.in (ACLOCAL, ACLOCAL_AMFLAGS, $(srcdir)/aclocal.m4): Define.
(configure_deps): Rename to ...
(aclocal_deps): ... this. Replace aclocal.m4 with acinclude.m4.
($(srcdir)/configure): Replace $(configure_deps) with
$(srcdir)/aclocal.m4.
* aclocal.m4: Move libiberty macros to acinclude.m4, then regenerate.
* acinclude.m4: New file.
* configure: Regenerate.
The attr-retain-?.c tests assume ELF file syntax / semantics. Some of the
tests skip AIX because of other requirements, and some explicitly skip
Darwin. This patch adds AIX to the explicit skip list.
gcc/testsuite/ChangeLog:
* c-c++-common/attr-retain-5.c: Skip on AIX.
* c-c++-common/attr-retain-6.c: Same.
* c-c++-common/attr-retain-7.c: Same.
* c-c++-common/attr-retain-8.c: Same.
* c-c++-common/attr-retain-9.c: Same.
When switching the s390 backend to store long doubles in vector
registers, the patterns for long double <-> DFP conversions were
forgotten. This did not cause observable problems so far, because
libdfp calls are emitted instead of pfpo. However, when building
libdfp itself, this leads to infinite recursion.
gcc/ChangeLog:
PR target/99134
* config/s390/vector.md (trunctf<DFP_ALL:mode>2_vr): New
pattern.
(trunctf<DFP_ALL:mode>2): Likewise.
(trunctdtf2_vr): Likewise.
(trunctdtf2): Likewise.
(extend<DFP_ALL:mode>tf2_vr): Likewise.
(extend<DFP_ALL:mode>tf2): Likewise.
(extendtftd2_vr): Likewise.
(extendtftd2): Likewise.
gcc/testsuite/ChangeLog:
PR target/99134
* gcc.target/s390/vector/long-double-from-decimal128.c: New test.
* gcc.target/s390/vector/long-double-from-decimal32.c: New test.
* gcc.target/s390/vector/long-double-from-decimal64.c: New test.
* gcc.target/s390/vector/long-double-to-decimal128.c: New test.
* gcc.target/s390/vector/long-double-to-decimal32.c: New test.
* gcc.target/s390/vector/long-double-to-decimal64.c: New test.
One of the very strong invariants in modules is that module numbers
are allocated such that (other than the current TU), all imports have
lesser module numbers, and also that the binding vector is only
appended to with increasing module numbers. This broke down when
module-directives became a thing and the preprocessing became entirely
decoupled from parsing. We'd load header units and their macros (but
not symbols of course) during preprocessing. Then we'd load named
modules during parsing. This could lead to the situation where a
header unit appearing after a named import had a lower module number
than the import. Consequently, if they both bound the same
identifier, the binding vector would be misordered and bad things
would happen.
This patch restores a pending import queue I previously had, but in
simpler form (hurrah). During preprocessing we queue all
module-directives and when we meet one for a header unit we do the
minimal loading for all of the queue, so they get appropriate
numbering. Then we load the preprocessor state for the header unit.
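The ordering invariant on the binding vector can be sketched in C as below. This is a simplified model with hypothetical names (`struct binding_vec`, `append_binding`). The patch asserts the ordering, whereas this sketch returns a status so that the violation can be observed.

```c
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical model of a binding vector: the module numbers of the
   slots, in the order they were appended.  */
struct binding_vec
{
  unsigned module[MAX_SLOTS];
  size_t len;
};

/* Append a slot for MODULE_NO.  Return 1 on success, or 0 if the vector
   is full or the module number is not greater than the last one.  */
int
append_binding (struct binding_vec *v, unsigned module_no)
{
  if (v->len == MAX_SLOTS)
    return 0;
  if (v->len > 0 && v->module[v->len - 1] >= module_no)
    return 0;
  v->module[v->len++] = module_no;
  return 1;
}
```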
PR c++/98741
gcc/cp/
* module.cc (pending_imports): New.
(declare_module): Adjust test condition.
(name_pending_imports): New.
(preprocess_module): Reimplement using pending_imports.
(preprocessed_module): Move name-getting to name_pending_imports.
* name-lookup.c (append_imported_binding_slot): Assert module
ordering is increasing.
gcc/testsuite/
* g++.dg/modules/pr98741_a.H: New.
* g++.dg/modules/pr98741_b.H: New.
* g++.dg/modules/pr98741_c.C: New.
* g++.dg/modules/pr98741_d.C: New.
gcc/fortran/ChangeLog:
PR fortran/98686
* match.c (gfc_match_namelist): If BT_UNKNOWN, check for
IMPLICIT NONE and issue an error, otherwise set the type
to its IMPLICIT type so that any subsequent use of objects
will confirm their types.
gcc/testsuite/ChangeLog:
PR fortran/98686
* gfortran.dg/namelist_4.f90: Modify.
* gfortran.dg/namelist_98.f90: New test.
When successfully reading a module CMI, the user gets no indication of
where that CMI was located. I originally didn't consider this a
problem -- the read was successful after all. But it can make it
difficult to interact with build systems, particularly when caching
can be involved. Grovelling over internal dump files is not really
useful to the user. Hence this option, which is similar to the
-flang-info-include-translate variants, and allows the user to ask for
all, or specific module read notification.
gcc/c-family/
* c.opt (flang-info-module-read, flang-info-module-read=): New.
gcc/
* doc/invoke.texi (flang-info-module-read): Document.
gcc/cp/
* module.cc (note_cmis): New.
(struct module_state): Add inform_read_p bit.
(module_state::do_import): Inform of CMI location, if enabled.
(init_modules): Canonicalize note_cmis entries.
(handle_module_option): Handle -flang-info-module-read=FOO.
gcc/testsuite/
* g++.dg/modules/pr99166_a.X: New.
* g++.dg/modules/pr99166_b.C: New.
* g++.dg/modules/pr99166_c.C: New.
* g++.dg/modules/pr99166_d.C: New.
Check failed if identical = false was requested or for -fcoarray=single
if an array ref was for a coindexed scalar.
gcc/fortran/ChangeLog:
PR fortran/99010
* dependency.c (gfc_dep_resolver): Fix coarray handling.
gcc/testsuite/ChangeLog:
PR fortran/99010
* gfortran.dg/coarray/array_temporary-1.f90: New test.
This avoids declaring a function with VLA arguments or return values
as inlineable. IPA CP still ICEs, so the testcase has that disabled.
2021-02-19 Richard Biener <rguenther@suse.de>
PR middle-end/99122
* tree-inline.c (inline_forbidden_p): Do not inline functions
with VLA arguments or return value.
* gcc.dg/pr99122-3.c: New testcase.
The vla15.C testcase ICEs with
-mcpu=cortex-m1 -mpure-code -fstack-protector -mthumb
as what force_const_mem returns (a SYMBOL_REF) is not a valid
memory address.
Previously the code was moving the address of the force_const_mem
into a register rather than the content of that MEM, so that instruction
must have been supported and loading from a MEM with a single REG base ought
to be valid too.
2021-02-19 Jakub Jelinek <jakub@redhat.com>
PR target/98998
* config/arm/arm.md (*stack_protect_combined_set_insn,
*stack_protect_combined_test_insn): If force_const_mem result
is not a valid general operand, force its address into the destination
register first.
* gcc.target/arm/pure-code/pr98998.c: New test.
The verifiers require that DECL_NONLOCAL or EH_LANDING_PAD_NR
labels are always the first label if there is more than one label.
When merging blocks, we don't honor that though.
On the following testcase, we try to merge blocks:
<bb 13> [count: 0]:
<L2>:
S::~S (&s);
and
<bb 15> [count: 0]:
<L0>:
resx 1
where <L2> is a landing pad and <L0> is a FORCED_LABEL. And the code puts
the FORCED_LABEL before the landing pad label, violating the verification
requirements.
The following patch fixes it by moving the FORCED_LABEL after the
DECL_NONLOCAL or EH_LANDING_PAD_NR label if it is the first label.
2021-02-19 Jakub Jelinek <jakub@redhat.com>
PR ipa/99034
* tree-cfg.c (gimple_merge_blocks): If bb a starts with eh landing
pad or non-local label, put FORCED_LABELs from bb b after that label
rather than before it.
* g++.dg/opt/pr99034.C: New test.
My recent change to the preprocessor conditions in __thread_relax() was
supposed to also change the __gthread_yield() call to __thread_yield(),
which has the right preprocessor checks. Instead I just removed the
check for _GLIBCXX_USE_SCHED_YIELD which means the __gthread_yield()
call will be ill-formed for non-gthreads targets, and targets without
sched_yield(). This fixes it properly.
libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h (__thread_relax()): Call
__thread_yield() not __gthread_yield().
Prevents generation of a vec_duplicate with illegal predicate in
<ASHIFT:optab><mode>3.
gcc/ChangeLog:
2021-02-19 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR target/98657
* config/aarch64/aarch64-sve.md (<ASHIFT:optab><mode>3): Use
expand_vector_broadcast to emit the vec_duplicate operand.
gcc/testsuite/ChangeLog:
2021-02-19 Andre Vieira <andre.simoesdiasvieira@arm.com>
PR target/98657
* gcc.target/aarch64/sve/pr98657.c: New test.
It occurred to me that other types of conversions use rvaluedness_matches_p,
but those uses don't affect overload resolution, so we shouldn't look at the
flag for them. Fixing that made decltype64.C compile successfully, because
the non-template candidate was a perfect match, so we now wouldn't consider
the broken template. Changing the argument to const& makes it no longer a
perfect match (because of the added const), so we again get the infinite
recursion.
This illustrates the limited nature of this optimization/recursion break; it
works for most copy/move constructors because the constructor we're looking
for is almost always a perfect match. If it happens to help improve compile
time for other calls, that's just a bonus.
gcc/cp/ChangeLog:
PR c++/96926
* call.c (perfect_conversion_p): Limit rvalueness
test to reference bindings.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/decltype64.C: Change argument to const&.
When compiling this testcase, trying to resolve the initialization for the
tuple member ends up recursively considering the same set of tuple
constructor overloads, and since two of them separately depend on
is_constructible, the one we try second fails to instantiate
is_constructible because we're still in the middle of instantiating it the
first time.
Fixed by implementing an optimization that someone suggested we were already
doing: if we see a non-template candidate that is a perfect match for all
arguments, we can skip considering template candidates at all. It would be
enough to do this only when LOOKUP_DEFAULTED, but it shouldn't hurt in other
cases.
gcc/cp/ChangeLog:
PR c++/96926
* call.c (perfect_conversion_p): New.
(perfect_candidate_p): New.
(add_candidates): Ignore templates after a perfect non-template.
gcc/testsuite/ChangeLog:
PR c++/96926
* g++.dg/cpp0x/overload4.C: New test.
An insn chosen for rematerialization can contain a clobbered hard
register. We cannot move such an insn across another insn that sets the
same hard register. The patch adds a check for this.
gcc/ChangeLog:
PR rtl-optimization/96264
* lra-remat.c (reg_overlap_for_remat_p): Check also output insn
hard regs.
gcc/testsuite/ChangeLog:
PR rtl-optimization/96264
* gcc.target/powerpc/pr96264.c: New.
When building the Linux kernel, ld in binutils 2.36 with GCC 11 generates
thousands of
ld: warning: orphan section `.data.event_initcall_finish' from `init/main.o' being placed in section `.data.event_initcall_finish'
ld: warning: orphan section `.data.event_initcall_start' from `init/main.o' being placed in section `.data.event_initcall_start'
ld: warning: orphan section `.data.event_initcall_level' from `init/main.o' being placed in section `.data.event_initcall_level'
Since these sections are marked with SHF_GNU_RETAIN, they are placed in
separate sections. They become orphan sections since they aren't expected
in the Linux kernel linker script. But orphan sections normally don't work
well with the Linux kernel linker script and the resulting kernel crashed.
Add the "retain" attribute to place symbols in separate SHF_GNU_RETAIN
sections. Issue a warning if the configured assembler/linker doesn't
support SHF_GNU_RETAIN.
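A minimal usage sketch of the new attribute follows, assuming an ELF target. The variable, section and macro names are hypothetical. The __has_attribute guard lets the code still build with compilers that do not know "retain".

```c
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Request SHF_GNU_RETAIN placement only where the attribute exists.  */
#if __has_attribute(retain)
# define EXAMPLE_RETAIN __attribute__ ((used, retain))
#else
# define EXAMPLE_RETAIN __attribute__ ((used))
#endif

/* Hypothetical variable in a named section.  With "used" alone it is
   kept by the compiler; "retain" additionally asks that the linker
   not garbage-collect its section.  */
static int example_level EXAMPLE_RETAIN
  __attribute__ ((section (".data.example_level"))) = 3;

int
get_example_level (void)
{
  return example_level;
}
```

Code that only wants the symbol kept by the compiler can keep using "used" on its own, which after this change no longer implies a separate SHF_GNU_RETAIN section.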
gcc/
PR target/99113
* varasm.c (get_section): Replace SUPPORTS_SHF_GNU_RETAIN with
looking up the retain attribute.
(resolve_unique_section): Likewise.
(get_variable_section): Likewise.
(switch_to_section): Likewise. Warn when a symbol without the
retain attribute and a symbol with the retain attribute are
placed in the section with the same name, instead of the used
attribute.
* doc/extend.texi: Document the "retain" attribute.
gcc/c-family/
PR target/99113
* c-attribs.c (c_common_attribute_table): Add the "retain"
attribute.
(handle_retain_attribute): New function.
gcc/testsuite/
PR target/99113
* c-c++-common/attr-retain-1.c: New test.
* c-c++-common/attr-retain-2.c: Likewise.
* c-c++-common/attr-retain-3.c: Likewise.
* c-c++-common/attr-retain-4.c: Likewise.
* c-c++-common/attr-retain-5.c: Likewise.
* c-c++-common/attr-retain-6.c: Likewise.
* c-c++-common/attr-retain-7.c: Likewise.
* c-c++-common/attr-retain-8.c: Likewise.
* c-c++-common/attr-retain-9.c: Likewise.
* c-c++-common/pr99113.c: Likewise.
* gcc.c-torture/compile/attr-retain-1.c: Likewise.
* gcc.c-torture/compile/attr-retain-2.c: Likewise.
* c-c++-common/attr-used.c: Don't expect SHF_GNU_RETAIN section.
* c-c++-common/attr-used-2.c: Likewise.
* c-c++-common/attr-used-3.c: Likewise.
* c-c++-common/attr-used-4.c: Likewise.
* c-c++-common/attr-used-9.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-1.c: Likewise.
* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
* c-c++-common/attr-used-5.c: Don't expect warning for the used
attribute nor SHF_GNU_RETAIN section.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.