1. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY in SSE FP to FP splitters.
2. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY in SSE INT to FP splitters.
3. Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY when handling the
avx_partial_xmm_update attribute. Don't convert an AVX partial XMM register
update if there is no partial SSE register dependency for SSE conversion.
gcc/
* config/i386/i386-features.c (remove_partial_avx_dependency):
Also check TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY and
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY before generating
vxorps.
* config/i386/i386.h (TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
New.
(TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
* config/i386/i386.md (SSE FP to FP splitters): Replace
TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY.
(SSE INT to FP splitter): Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY
with TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY.
* config/i386/x86-tune.def
(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): New.
(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
gcc/testsuite/
* gcc.target/i386/avx-covert-1.c: New file.
* gcc.target/i386/avx-fp-covert-1.c: Likewise.
* gcc.target/i386/avx-int-covert-1.c: Likewise.
* gcc.target/i386/sse-covert-1.c: Likewise.
* gcc.target/i386/sse-fp-covert-1.c: Likewise.
* gcc.target/i386/sse-int-covert-1.c: Likewise.
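As a rough illustration of the two kinds of conversion the new tuning flags distinguish, the following C functions are a minimal sketch. The function names are hypothetical and this is not the content of the new test files. With scalar SSE code generation such conversions are typically emitted as cvtsi2ss and cvtsd2ss, which write only the low element of the destination XMM register and so depend on its previous contents.

```c
/* Hypothetical helper names, used only for illustration. */

/* INT to FP conversion (typically cvtsi2ss with scalar SSE code).  */
float
int_to_float (int i)
{
  return (float) i;
}

/* FP to FP conversion (typically cvtsd2ss with scalar SSE code).  */
float
double_to_float (double d)
{
  return (float) d;
}
```

Comparing the assembly produced for these functions under different -mtune settings is one way to see whether a dependency-breaking instruction is emitted before the conversion.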
Simplify memcpy and memset inline strategies to avoid branches for
-mtune=tremont:
1. Create Tremont cost model from generic cost model.
2. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
load and store for up to 16 * 16 (256) bytes when the data size is
fixed and known.
3. Inline only if data size is known to be <= 256.
a. Use "rep movsb/stosb" with simple code sequence if the data size
is a constant.
b. Use loop if data size is not a constant.
4. Use memcpy/memset library function if data size is unknown or > 256.
* config/i386/i386-options.c (processor_cost_table): Use
tremont_cost for Tremont.
* config/i386/x86-tune-costs.h (tremont_memcpy): New.
(tremont_memset): Likewise.
(tremont_cost): Likewise.
* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
Enable for Tremont.
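Steps 3 and 4 of the Tremont strategy above can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function name and its parameters are hypothetical and do not correspond to GCC internals, and the by-pieces expansion from step 2 is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not taken from GCC.  */
enum inline_strategy
{
  USE_REP_PREFIX,   /* "rep movsb/stosb" with a simple code sequence */
  USE_LOOP,         /* inline loop */
  USE_LIBCALL       /* call the memcpy/memset library function */
};

/* BOUND_KNOWN: an upper bound on the data size is known.
   IS_CONSTANT: the data size is a compile-time constant.
   MAX_SIZE: the upper bound, meaningful only if BOUND_KNOWN.  */
enum inline_strategy
choose_strategy (bool bound_known, bool is_constant, size_t max_size)
{
  /* Step 4: unknown size, or larger than 256 bytes.  */
  if (!bound_known || max_size > 256)
    return USE_LIBCALL;

  /* Step 3: size known to be <= 256 bytes.  */
  return is_constant ? USE_REP_PREFIX : USE_LOOP;
}
```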
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already substantial
checking for volatile references in can_combine_p, but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.
gcc/testsuite/
* gcc.target/sparc/20210917-1.c: New test.
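For illustration, the kind of source affected is one where a single volatile load feeds more than one use. The function below is a hypothetical sketch, not the testcase added by this change; the point is that the generated code must access *reg exactly once, however the uses are combined.

```c
#include <stdint.h>

/* Hypothetical example: one volatile read whose value is used twice.  */
uint32_t
read_once_use_twice (volatile uint32_t *reg)
{
  uint32_t v = *reg;              /* the only volatile access */
  return (v & 0xffu) + (v >> 8);  /* two uses of the loaded value */
}
```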
When the build configuration changes and Makefiles are recreated, the
src/debug/Makefile and src/debug/*/Makefile files are not recreated,
because they're not managed in the usual way by automake. This can lead
to build failures or surprising inconsistencies between the main and
debug versions of the library when doing incremental builds.
This change causes them to be regenerated if any of the corresponding
non-debug makefiles is newer.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/Makefile.am (stamp-debug): Add all Makefiles as
prerequisites.
* src/Makefile.in: Regenerate.
Compiling these tests still times out too often when running the
testsuite with more parallel jobs than there are available cores.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/ext/pb_ds/regression/tree_map_rand.cc: Increase
timeout factor to 3.
* testsuite/ext/pb_ds/regression/tree_set_rand.cc: Likewise.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* doc/xml/manual/using.xml: Generalize to apply to more than
just -std=c++11.
* doc/html/manual/using_macros.html: Regenerate.
This was just a copy and paste error.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/fs_path.h (advance): Remove non-deducible
template parameter.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/c++98/Makefile.am: Use CXXCOMPILE not LTCXXCOMPILE.
* src/c++98/Makefile.in: Regenerate.
When the value is guaranteed to fit in the SSO buffer we know the
string won't allocate, so the function can be noexcept. For 32-bit
integers, we know they need no more than 10 bytes (or 11 with a minus
sign) and the SSO buffer is 15 bytes.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(to_string): Add noexcept if the type width is 32 bits or less.
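The size argument can be checked with a few lines of C. This is a sketch only: decimal_length is a hypothetical helper, not part of libstdc++, and it relies on the C99 guarantee that snprintf with a null buffer and zero size returns the number of characters that would have been written.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: number of characters needed to print VALUE in
   decimal, not counting the terminating NUL.  */
int
decimal_length (long long value)
{
  return snprintf (NULL, 0, "%lld", value);
}
```

Calling it with INT32_MIN, INT32_MAX and UINT32_MAX gives the longest decimal representations of 32-bit integers, all of which fit in a 15-byte buffer.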
Remove UB in atomic_ref/wait_notify test.
Signed-off-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101761
* testsuite/29_atomics/atomic_ref/wait_notify.cc (test): Use
va and vb as arguments to wait/notify, remove unused bb local.
Although this patch looks quite large, the changes are fairly minimal.
Most of it is duplicating the large function that does the overload
resolution using the automatically generated data structures instead of
the old hand-generated ones. This doesn't make the patch terribly easy to
review, unfortunately. Just be aware that generally we aren't changing
the logic and functionality of overload handling.
2021-09-16 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
(altivec_resolve_new_overloaded_builtin): New forward decl.
(rs6000_new_builtin_type_compatible): New function.
(altivec_resolve_overloaded_builtin): Call
altivec_resolve_new_overloaded_builtin.
(altivec_build_new_resolved_builtin): New function.
(altivec_resolve_new_overloaded_builtin): Likewise.
* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported):
Likewise.
* config/rs6000/rs6000-gen-builtins.c (write_decls): Remove _p from
name of rs6000_new_builtin_is_supported.
This fixes some issues with constrained variable templates:
- Constraints aren't checked when explicitly specializing a variable
template.
- Constraints aren't attached to a static data member template at
parse time.
- Constraints don't get propagated when (partially) instantiating a
static data member template, so we need to make sure to look up
constraints using the most general template during satisfaction.
PR c++/98486
gcc/cp/ChangeLog:
* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.
gcc/fortran/ChangeLog:
PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.
gcc/testsuite/ChangeLog:
PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.
The error message makes the cause obvious: -funconfigured-libstdc++-v3 is
used on the g++ command line. So we just add the dependency.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
ChangeLog:
PR bootstrap/67102
* Makefile.def: Have configure-target-libffi depend on
all-target-libstdc++-v3.
* Makefile.in: Regenerate.
After a recent change only a boolean value is returned.
2021-09-16 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386-protos.h (ix86_decompose_address):
Change return type to bool.
* config/i386/i386.c (ix86_decompose_address): Ditto.
This mimics what the main Makefile.in does: compile the generator
files under build (with Makefile.in's 'build/%.o' rule for compilation).
It also adds $(RUN_GEN) to optionally run it with valgrind and
the $(build_exeext) suffix.
Before, the .o files were compiled with $(COMPILE), causing a link
error with $(LINKER_FOR_BUILD) for build != host.
gcc/
PR target/102353
* config/rs6000/t-rs6000 (build/rs6000-gen-builtins.o, build/rbtree.o):
Add 'build/' to target, use build/%.o rule.
(build/rs6000-gen-builtins$(build_exeext)): Add 'build/' and
'$(build_exeext)' to target and 'build/' for the *.o files.
(rs6000-builtins.c): Update for those changes; run rs6000-gen-builtins
with $(RUN_GEN).
To verify other changes in the patch series, I have been searching for
"Invalid sum of caller counts" string in symtab dump but found that
there are false warnings about functions which have their body removed
because they are now unreachable. Those are of course invalid and so
this patch avoids checking such cgraph_nodes.
gcc/ChangeLog:
2021-08-20 Martin Jambor <mjambor@suse.cz>
* cgraph.c (cgraph_node::dump): Do not check caller count sums if
the body has been removed. Remove trailing whitespace.
There is no need to make a MODIFY_EXPR for any of the condition
vars that we synthesize.
Expansion of co_return can be carried out independently of any
co_awaits that might be contained which simplifies this.
Where we are rewriting statements to handle await expression
logic, there is no need to carry out any analysis - we just need
to detect the presence of any co_await.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:
* coroutines.cc (await_statement_walker): Code cleanups.
This avoids using native_interpret_type when we cannot do it with
the original type of the variable; instead, use an integer type
for the initialization and side-step the size limitation of
native_interpret_int.
2021-09-16 Richard Biener <rguenther@suse.de>
PR middle-end/102360
* internal-fn.c (expand_DEFERRED_INIT): Make pattern-init
of non-memory more robust.
* g++.dg/pr102360.C: New testcase.
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.
gcc/ChangeLog:
* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.
This is needed to prevent the Store -> (Non-store or load) -> Store
sequence.
gcc/ChangeLog:
* config/sparc/sparc.md (stack_protect_set32): Add NOP to prevent
sensitive sequence for B2BST errata workaround.
A call to the function might have a load instruction in the delay slot
and a load followed by an atomic function could cause a deadlock.
gcc/ChangeLog:
* config/sparc/sparc.c (sparc_do_work_around_errata): Do not begin
functions with atomic instruction in the UT700 errata workaround.
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
gcc/ChangeLog:
* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.
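For reference, the two kinds of empty assembly statement mentioned above look like this in C source. The function is a hypothetical sketch using GNU C inline assembly; it only shows the statements that such a scan would need to skip, including several in a row, and is not taken from the patch.

```c
/* Hypothetical example containing consecutive empty asm statements.  */
int
with_empty_asm (int x)
{
  x += 1;
  __asm__ ("");                              /* empty asm, no memory clobber */
  __asm__ __volatile__ ("" : : : "memory");  /* empty asm used as a compiler
                                                memory barrier */
  __asm__ ("");                              /* another empty asm in a row */
  return x * 2;
}
```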
Check the attribute of an instruction to determine if it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.
gcc/ChangeLog:
* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.
g++.dg/eh/arm-vfp-unwind.C uses an asm statement relying on
double-precision FPU support. This patch extends it to support
single-precision, useful for targets without double-precision.
2021-09-16 Richard Earnshaw <rearnsha@arm.com>
gcc/testsuite/
* g++.dg/eh/arm-vfp-unwind.C: Support single-precision.