Even if the operand of -> has dependent type, if it's a pointer we know
that the result will be the target type of that pointer. This should avoid
some unnecessary TYPEOF_EXPR when looking up a name after ->.
gcc/cp/ChangeLog:
* typeck2.c (build_x_arrow): Do set TREE_TYPE when operand is
a dependent pointer.
Segher asked that I update the comments to include the d-form vector stores
(even though they wouldn't be generated by this test).
2021-08-25 Michael Meissner <meissner@linux.ibm.com>
gcc/testsuite/
* gcc.target/powerpc/float128-call.c: Update comments.
I built a compiler on a little endian power8 system where the default long
double was IEEE 128-bit instead of IBM 128-bit. I discovered that on
power8, we would generate a lxvd2x and xxpermdi to deal with the endianness
instead of the Altivec lxv.
In addition, I noticed the constant that was being loaded (1.0q) could be
loaded by the lxvkq instruction.
I rewrote the test to handle all forms of vector load and store that can
be generated.
2021-08-27 Michael Meissner <meissner@linux.ibm.com>
gcc/testsuite/
* gcc.target/powerpc/float128-call.c: Fix test for IEEE 128-bit
long double and power10.
Some newer assemblers emit section start temp symbols for mod init and term
sections if there is no suitable symbol present already.
The temp symbols are linker visible and therefore appear in the symbol tables.
Since the temp symbol number can vary when debug is enabled, this causes
compare-debug failures. The solution is to provide a stable linker-visible
symbol.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.c (finalize_ctors): Add a section-start linker-
visible symbol.
(finalize_dtors): Likewise.
* config/darwin.h (MIN_LD64_INIT_TERM_START_LABELS): New.
2021-08-27 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include.
(rs6000_init_builtins): Call rs6000_init_generated_builtins. Skip the
old initialization logic when new builtins are enabled.
* config/rs6000/rs6000-gen-builtins.c (write_decls): Rename
rs6000_autoinit_builtins to rs6000_init_generated_builtins.
(write_init_file): Likewise.
We recently had a report of build failure against a Darwin branch on
the latest OS release. This was because (temporarily) the symlink
from libm.dylib => libSystem.dylib had been removed/omitted.
libm is not needed on Darwin, and should not be added unconditionally
even if that is (mostly) harmless since it is a symlink to libc.
There could be cases where the addition was not completely harmless
because the presentation of the symlink would cause the symbols exposed
in libSystem to be considered ahead of ones presented in convenience
libraries.
libgfortran/ChangeLog:
* Makefile.am: Use configured libm availability.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use libtool macro to find libm availability.
* libgfortran.spec.in: Use configured libm availability.
Although the cctools assembler is based on GNU GAS, it derives from a
very old version (1.38) that does not support many of the features
that the target-supports tests expect.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Exclude cctools assembler based on
GAS 1.38.
In r12-3048-ge0b6d0b39c6, the GAS version parameter was removed from
the gcc_GAS_CHECK_FEATURE macro. It seems that overlapping commit/test
cycles resulted in several AMDGCN and one Darwin commit with the now
extra parameter still present.
This causes wrong configure code to be generated when autoreconf is
used in the gcc directory.
Fixed by removing the extraneous parm from the AMDGCN and Darwin cases.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* configure.ac (darwin2[[0-9]]* | darwin19*): Alter use of
gcc_GAS_CHECK_FEATURE to remove an extraneous parameter.
(amdgcn-* | gcn-*): Likewise.
Without the 'template', this function template compares 'traverse' to 'f',
and then compares the result to 'a'. Evidently it hasn't been instantiated
yet.
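A minimal sketch of the parsing issue, assuming a member function template
called through an object of dependent type. The names Visitor, twice and run
are hypothetical and are not taken from symbol-summary.h.

```cpp
#include <cassert>

struct Visitor {
  // Member function template with a function pointer as its template argument.
  template <int (*F)(int)>
  int traverse(int a) const { return F(a); }
};

int twice(int v) { return 2 * v; }

template <typename T>
int run(const T& obj, int a) {
  // 'obj' has a dependent type, so the 'template' keyword is needed here.
  // Without it, 'obj.traverse < twice > (a)' is parsed as two comparisons:
  // (obj.traverse < twice) > (a).
  return obj.template traverse<twice>(a);
}
```

Calling run(Visitor{}, 21) instantiates the template and forwards to twice.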
gcc/ChangeLog:
* symbol-summary.h: Added missing template keyword.
This fixes DCE to be able to elide dead control flow in an
infinite loop without an exit edge. This special situation is
handled well by the code finding an edge to preserve since there's
no chance it will find the exit edge and make the loop finite.
2021-08-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/45178
* tree-ssa-dce.c (find_obviously_necessary_stmts): For
infinite loops without exit do not mark control dependent
edges of the latch necessary.
* gcc.dg/tree-ssa/ssa-dce-3.c: Adjust testcase.
This patch adds support for the vectorizer to vectorize the scalar
versions of some built-in functions on Power10.
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add
support for built-in functions MISC_BUILTIN_DIVWE, MISC_BUILTIN_DIVWEU,
MISC_BUILTIN_DIVDE, MISC_BUILTIN_DIVDEU, P10_BUILTIN_CFUGED,
P10_BUILTIN_CNTLZDM, P10_BUILTIN_CNTTZDM, P10_BUILTIN_PDEPD and
P10_BUILTIN_PEXTD on Power10.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/dive-vectorize-1.c: New test.
* gcc.target/powerpc/dive-vectorize-1.h: New test.
* gcc.target/powerpc/dive-vectorize-2.c: New test.
* gcc.target/powerpc/dive-vectorize-2.h: New test.
* gcc.target/powerpc/dive-vectorize-run-1.c: New test.
* gcc.target/powerpc/dive-vectorize-run-2.c: New test.
* gcc.target/powerpc/p10-bifs-vectorize-1.c: New test.
* gcc.target/powerpc/p10-bifs-vectorize-1.h: New test.
* gcc.target/powerpc/p10-bifs-vectorize-run-1.c: New test.
This patch is to make prototypes of some Power10 built-in
functions consistent with what's in the documentation, as
well as the vector version. Otherwise, useless conversions
can be generated in gimple IR, and the vectorized versions
will have inconsistent types.
gcc/ChangeLog:
* config/rs6000/rs6000-call.c (builtin_function_type): Add unsigned
signedness for some Power10 bifs.
Further fixes to structure alignment when the structure is packed
and contains a double. This patch checks for the packed attribute
at the top level.
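For illustration only, this is the shape of declaration the fix concerns: a
struct that carries the packed attribute at the top level and contains a
double. PackedRecord is a hypothetical name, the GNU attribute syntax is
assumed, and the sketch does not reproduce the target-specific alignment
problem from the PR.

```cpp
#include <cstddef>

// The packed attribute applies to the whole struct, not to a single field.
struct __attribute__((packed)) PackedRecord {
  char tag;
  double value;
};

// With the whole struct packed, no padding is inserted before 'value'
// and the struct itself only requires byte alignment.
static_assert(offsetof(PackedRecord, value) == 1, "value follows tag directly");
static_assert(sizeof(PackedRecord) == sizeof(char) + sizeof(double),
              "no padding inside a packed struct");
static_assert(alignof(PackedRecord) == 1, "packed struct is byte-aligned");
```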
gcc/ChangeLog:
PR target/102068
* config/rs6000/rs6000.c (rs6000_adjust_field_align): Use
computed alignment if the entire struct has attribute packed.
This makes the std::function constructor use perfect forwarding, to
avoid an unnecessary move-construction of the target. This means we need
to rewrite the _Function_base::_Base_manager::_M_init_functor function
to use a forwarding reference, and so can reuse it for the clone
operation.
Also simplify the SFINAE constraints on the constructor, by combining
the !is_same_v<remove_cvref_t<F>, function> constraint into the
_Callable trait.
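The effect of the forwarding reference can be sketched with two hypothetical
wrappers (ByValue and Forwarding, which are not libstdc++ types) and a target
that counts its move constructions. This assumes C++17 and only models the
constructor signature, not the rest of std::function.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Target object that counts how often it is move-constructed.
struct Counted {
  static inline int moves = 0;
  Counted() = default;
  Counted(const Counted&) = default;
  Counted(Counted&&) noexcept { ++moves; }
  void operator()() const {}
};

// By-value parameter: one move into the parameter, one more into the member.
struct ByValue {
  Counted target;
  explicit ByValue(Counted f) : target(std::move(f)) {}
};

// Forwarding reference: the argument is moved straight into the member.
struct Forwarding {
  Counted target;
  template <typename F,
            typename = std::enable_if_t<!std::is_same_v<
                std::remove_cv_t<std::remove_reference_t<F>>, Forwarding>>>
  explicit Forwarding(F&& f) : target(std::forward<F>(f)) {}
};

// Returns the number of move constructions needed to store an rvalue target.
template <typename Wrapper>
int moves_to_store() {
  Counted::moves = 0;
  Counted c;
  Wrapper w(std::move(c));
  (void)w;
  return Counted::moves;
}
```

In this model the by-value constructor costs two moves and the forwarding
constructor costs one.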
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/std_function.h (_Function_base::_Base_manager):
Replace _M_init_functor with a function template using a
forwarding reference, and a pair of _M_create function
templates. Reuse _M_create for the clone operation.
(function::_Decay_t): New alias template.
(function::_Callable): Simplify by using _Decay_t.
(function::function(F)): Change parameter to forwarding
reference, as per LWG 2774. Add noexcept-specifier. Simplify
constraints.
(function::operator=(F&&)): Add noexcept-specifier.
* testsuite/20_util/function/cons/lwg2774.cc: New test.
* testsuite/20_util/function/cons/noexcept.cc: New test.
Add static assertions to std::function, so that more user-friendly
diagnostics are given when trying to store a non-copyable target object.
These preconditions were added as "Mandates:" by LWG 2774, but I'm
committing them separately from implementing that, to allow just this
change to be backported more easily.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* include/bits/std_function.h (function::function(F)): Add
static assertions to check constructibility requirements.
While performing some tests of IEEE 128 float for PPC64LE, Michael
Meissner noticed that __gcc_qsub is substantially slower than
__gcc_qadd. __gcc_qsub calls __gcc_qadd with the second operand
negated. Because the functions normally are invoked through the
libgcc shared object, the extra PLT overhead has a large impact
on the overall time of the function. This patch moves the body of
__gcc_qadd into a static inline function that both __gcc_qadd
and __gcc_qsub invoke.
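The shape of the refactoring can be sketched as follows. This is a simplified
model that uses plain double instead of the IBM double-double representation,
and the names add_internal, demo_add and demo_sub are hypothetical.

```cpp
#include <cassert>

// Shared logic lives in a static inline helper, so each exported entry point
// gets its own inlined copy instead of calling the other entry point.
static inline double add_internal(double a, double b) { return a + b; }

// Exported addition: a thin wrapper around the helper.
extern "C" double demo_add(double a, double b) { return add_internal(a, b); }

// Exported subtraction: negates the second operand and uses the same helper,
// avoiding a second call through the PLT.
extern "C" double demo_sub(double a, double b) { return add_internal(a, -b); }
```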
libgcc/ChangeLog:
* config/rs6000/ibm-ldouble.c (ldouble_qadd_internal): Rename from
__gcc_qadd.
(__gcc_qadd): Call ldouble_qadd_internal.
(__gcc_qsub): Call ldouble_qadd_internal with second long double
argument negated.
There is no point in checking RTXes before calling force_reg;
force_reg checks for a REG RTX by itself.
2021-08-26 Uroš Bizjak <ubizjak@gmail.com>
gcc/
* config/i386/i386.md (*btr<mode>_1): Call force_reg unconditionally.
(conditional moves with memory inputs splitters): Ditto.
* config/i386/sse.md (one_cmpl<mode>2): Simplify.
This patch is the next in the series to improve bit bounds in tree-ssa's
bit CCP pass, this time: bounds for shifts and rotates by unknown amounts.
This allows us to optimize expressions such as ((x&15)<<(y&24))&64.
In this case, the expression (y&24) contains only two unknown bits,
and can therefore have only four possible values: 0, 8, 16 and 24.
From this (x&15)<<(y&24) has the nonzero bits 0x0f0f0f0f, and from
that ((x&15)<<(y&24))&64 must always be zero.
One clever use of computer science in this patch is the use of XOR
to efficiently enumerate bit patterns in Gray code order. As the
order in which we generate values is not significant, it's faster
and more convenient to enumerate values by flipping one bit at a
time, rather than in numerical order [which would require carry
bits and additional logic].
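The enumeration can be sketched as below, assuming 32-bit values and a shift
mask below 32. The function names are hypothetical, and the flip order is
computed on the fly rather than read from a table as the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Split a mask into its individual set bits, e.g. 24 -> {8, 16}.
std::vector<uint32_t> individual_bits(uint32_t mask) {
  std::vector<uint32_t> bits;
  while (mask != 0) {
    uint32_t lowest = mask & (~mask + 1);  // isolate the lowest set bit
    bits.push_back(lowest);
    mask ^= lowest;
  }
  return bits;
}

// Index of the lowest set bit of a non-zero value.
unsigned lowest_set_index(uint32_t v) {
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// OR together (value_bits << s) for every shift count s that can be formed
// from the unknown bits in shift_mask, visiting the counts in Gray code order.
uint32_t nonzero_bits_after_shift(uint32_t value_bits, uint32_t shift_mask) {
  std::vector<uint32_t> bits = individual_bits(shift_mask);
  uint32_t shift = 0;
  uint32_t result = value_bits << shift;
  uint32_t combos = 1u << bits.size();
  for (uint32_t i = 1; i < combos; ++i) {
    shift ^= bits[lowest_set_index(i)];  // flip exactly one unknown bit
    result |= value_bits << shift;
  }
  return result;
}
```

For value bits 15 and shift mask 24 the loop visits the shift counts 0, 8, 24
and 16, which covers all four possibilities with one XOR per step.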
There's a pre-existing ??? comment in tree-ssa-ccp.c that we should
eventually be able to optimize (x<<(y|8))&255, but this patch takes the
conservatively paranoid approach of only optimizing cases where the
shift/rotate is guaranteed to be less than the target precision, and
therefore avoids changing any cases that potentially might invoke
undefined behavior. This patch does optimize (x<<((y&31)|8))&255.
2021-08-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* tree-ssa-ccp.c (get_individual_bits): Helper function to
extract the individual bits from a widest_int constant (mask).
(gray_code_bit_flips): New read-only table for efficiently
enumerating permutations/combinations of bits.
(bit_value_binop) [LROTATE_EXPR, RROTATE_EXPR]: Handle rotates
by unknown counts that are guaranteed less than the target
precision and four or fewer unknown bits by enumeration.
[LSHIFT_EXPR, RSHIFT_EXPR]: Likewise, also handle shifts by
enumeration under the same conditions. Handle remaining
shifts as a mask based upon the minimum possible shift value.
gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/ssa-ccp-41.c: New test case.
As suggested by Richard Biener in the comments of PR middle-end/102029,
the new test "INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type) ..." is
redundant, and just "INTEGRAL_TYPE_P (type)" is the preferred form.
2021-08-26 Roger Sayle <roger@nextmovesoftware.com>
Richard Biener <rguenther@suse.de>
gcc/ChangeLog
* match.pd (shift transformations): Remove a redundant
!POINTER_TYPE_P check.
We want to replace all REGs equal to FROM.
2021-08-26 Uroš Bizjak <ubizjak@gmail.com>
gcc/
PR target/102057
* config/i386/i386.md (cmove reg-reg move elimination peephole2s):
Set all_regs to true in the call to replace_rtx.
This patch makes insertion into the modref access tree smarter when the
--param modref-max-bases and modref-max-refs limits are hit. Instead of
giving up, we either give up on the base alias set (make it equal to ref)
or turn the alias set to 0. This lets us track useful info on quite large
functions, such as ggc_free.
gcc/ChangeLog:
* ipa-modref-tree.c (test_insert_search_collapse): Update test.
* ipa-modref-tree.h (modref_base_node::insert): Be smarter when
hitting --param modref-max-refs limit.
(modref_tree::insert_base): Be smarter when hitting
--param modref-max-bases limit. Add new parameter REF.
(modref_tree::insert): Update.
(modref_tree::merge): Update.
* ipa-modref.c (read_modref_records): Update.
gcc/ChangeLog:
* ipa-modref-tree.h (modref_ref_node::verify): New member
function.
(modref_ref_node::insert): Use it.
(modref_ref_node::try_merge_with): Fix off-by-one error.
Add more preprocessor conditions to check for constants being defined
before using them, so that the Networking TS headers can be compiled on
a wider range of platforms.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/100285
* configure.ac: Check for O_NONBLOCK.
* configure: Regenerate.
* include/experimental/internet: Include <ws2tcpip.h> for
Windows. Use preprocessor conditions around more constants.
* include/experimental/socket: Use preprocessor conditions
around more constants.
* testsuite/experimental/net/internet/resolver/base.cc: Only use
constants when the corresponding C macro is defined.
* testsuite/experimental/net/socket/basic_socket.cc: Likewise.
* testsuite/experimental/net/socket/socket_base.cc: Likewise.
Make preprocessor checks more fine-grained.
This patch adds 3 more selections to target-supports.exp to see if we can
specify to use a particular long double format (IEEE 128-bit, IBM extended
double, 64-bit), and the library support will track the changes for the long
double. This is needed because two of the tests in the test suite use long
double, and they are actually testing IBM extended double.
This patch also makes the two tests that require long double to use the
IBM double-double encoding request that format explicitly. This
requires GLIBC 2.32 or greater in order to do the switch.
I have run tests on a little endian power9 system with 3 compilers. There were
no regressions with these patches, and the two tests in the following patches
now work if the default long double is not IBM 128-bit:
* One compiler used the default IBM 128-bit format;
* One compiler used the IEEE 128-bit format; (and)
* One compiler used 64-bit long doubles.
I have also tested compilers on a big endian power8 system with a compiler
defaulting to power8 code generation and another with the default cpu
set. There were no regressions.
2021-08-25 Michael Meissner <meissner@linux.ibm.com>
gcc/testsuite/
PR target/94630
* gcc.target/powerpc/pr70117.c: Specify that we need the long double
type to be IBM 128-bit. Remove the code to use __ibm128.
* c-c++-common/dfp/convert-bfp-11.c: Specify that we need the long
double type to be IBM 128-bit. Run the test at -O2 optimization.
* lib/target-supports.exp (add_options_for_long_double_ibm128): New
function.
(check_effective_target_long_double_ibm128): New function.
(add_options_for_long_double_ieee128): New function.
(check_effective_target_long_double_ieee128): New function.
(add_options_for_long_double_64bit): New function.
(check_effective_target_long_double_64bit): New function.
The Windows CRT headers define structs with members called f, x, y etc
so don't check those. There are also lots of unnecessary function
parameters in mingw headers using non-reserved names, e.g.
<time.h> uses p and z as parameters of mingw_gettimeofday
<inttypes.h> uses j as a parameter of imaxabs
<pthread.h> uses l, o and func as parameter names
Those should be fixed in the headers instead.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* testsuite/17_intro/names.cc: Adjust for Windows.
While laying some groundwork for constexpr std::vector, I noticed some
bugs in the std::uninitialized_xxx algorithms. The conditions being
checked for optimizing trivial cases were not quite right, as shown in
the examples in the PR.
This consolidates the checks into a single macro. The macro has
appropriate definitions for C++98 or for later standards, to avoid a #if
everywhere the checks are used. For C++11 and later the check makes a
call to a new function doing a static_assert to ensure we don't use
assignment in cases where construction would have been invalid.
Extracting that check to a separate function will be useful for
constexpr std::vector, as that can't use std::uninitialized_copy
directly (because it isn't constexpr).
The consolidated checks mean that some slight variations in static
assert message are gone, as there is only one place that does the assert
now. That required adjusting some tests. As part of that the redundant
89164_c++17.cc test was merged into 89164.cc which is compiled as C++17
by default now, but can also use other -std options if the
C++17-specific error is made conditional with a target selector.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/102064
* include/bits/stl_uninitialized.h (_GLIBCXX_USE_ASSIGN_FOR_INIT):
Define macro to check conditions for optimizing trivial cases.
(__check_constructible): New function to do static assert.
(uninitialized_copy, uninitialized_fill, uninitialized_fill_n):
Use new macro.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
Adjust dg-error pattern.
* testsuite/23_containers/vector/cons/89164.cc: Likewise. Add
C++17-specific checks from 89164_c++17.cc.
* testsuite/23_containers/vector/cons/89164_c++17.cc: Removed.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/102064.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/102064.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill/102064.cc:
New test.
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/102064.cc:
New test.
This function claims to remove a single character at index p, but it
actually removes p+1 characters beginning at p. So r.erase(0) removes
the first character, but r.erase(1) removes the second and third, and
r.erase(2) removes the third, fourth and fifth. This is not a useful
API.
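The behaviour described above can be modelled with std::string. This is a
sketch of the described semantics (erase p+1 characters beginning at p), not
the rope code itself, and erase_like_rope is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Model of the removed overload: erases p+1 characters starting at index p.
std::string erase_like_rope(std::string s, std::size_t p) {
  s.erase(p, p + 1);
  return s;
}
```

Applied to "abcdef", index 0 removes one character, index 1 removes two and
index 2 removes three.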
The overload is present in the SGI STL <stl_rope.h> header that we
imported, but it isn't documented in the API reference. The erase
overloads that are documented are:
erase(const iterator& p)
erase(const iterator& f, const iterator& l)
erase(size_type i, size_type n);
Having an erase(size_type p) overload that erases a single character (as
the comment says it does) might be useful, but would be inconsistent
with std::basic_string::erase(size_type p = 0, size_type n = npos),
which erases from p to the end of the string when called with a single
argument.
Since the function isn't part of the documented API, doesn't do what it
claims to do (or anything useful) and "fixing" it would leave it
inconsistent with basic_string, I'm just removing that overload.
libstdc++-v3/ChangeLog:
PR libstdc++/102048
* include/ext/rope (rope::erase(size_type)): Remove broken
function.
The problem here is that the C++ front end has code to avoid adding a
break statement (to the IR) if the previous block does not fall through.
However, the code that checks whether the block may fall through does not
handle a CLEANUP_STMT; it assumes one always falls through. This adds
handling for a CLEANUP_STMT that is !CLEANUP_EH_ONLY (the try/finally case).
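A sketch of the kind of code affected, assuming the issue shows up when a
switch case contains a local object with a non-trivial destructor and then
returns. Guard and classify are hypothetical names, and this is not the
testcase added by the patch.

```cpp
#include <cassert>

// Type with a non-trivial destructor, so its scope needs a cleanup.
struct Guard {
  static inline int destroyed = 0;
  ~Guard() { ++destroyed; }
};

int classify(int x) {
  switch (x) {
    case 1: {
      Guard g;    // the cleanup for 'g' wraps the rest of this block
      return 10;  // the block never falls through to the end of the case
    }
    default:
      return 20;
  }
  // No return here: every path through the switch already returns.
}
```

Every path returns a value, so a warning about reaching the end of a non-void
function would be a false positive for code of this shape.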
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/cp/ChangeLog:
PR c++/66590
* cp-objcp-common.c (cxx_block_may_fallthru): Handle
CLEANUP_STMT for the case which will be try/finally.
gcc/testsuite/ChangeLog:
PR c++/66590
* g++.dg/warn/Wreturn-5.C: New test.